The datasets are expected in the following file structure:
│modular-model-adapt/
├──data/
│ ├── cybersecurity
│ │ ├── cybersecurity_source.csv
│ │ ├── cybersecurity_target.csv
│ ├── disaster
│ │ ├── disaster_source.csv
│ │ ├── disaster_target.csv
│ ├── review
│ │ ├── review_source.csv
│ │ ├── review_target.csv
│ ├── socialmedia
│ │ ├── socialmedia_source.csv
│ │ ├── socialmedia_target.csv
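Before running the scripts, it can help to confirm that the files sit where the tree above expects them. The following is a minimal sketch, not part of the repository: the helper names `expected_files` and `missing_files` are hypothetical, and it assumes the script is run from the repository root so that `data` resolves correctly.

```python
from pathlib import Path

# Dataset names taken from the directory tree above.
DATASETS = ["cybersecurity", "disaster", "review", "socialmedia"]


def expected_files(data_root):
    """Return the source/target CSV paths implied by the layout above."""
    root = Path(data_root)
    paths = []
    for name in DATASETS:
        for split in ("source", "target"):
            paths.append(root / name / "{}_{}.csv".format(name, split))
    return paths


def missing_files(data_root):
    """Return the expected CSV paths that do not exist on disk."""
    return [p for p in expected_files(data_root) if not p.is_file()]


if __name__ == "__main__":
    # Assumption: "data" is relative to the repository root.
    for path in missing_files("data"):
        print("missing:", path)
```

An empty result from `missing_files("data")` means all eight CSV files are in place.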
- cs_source.csv: You can download it from: here
- cs_target.csv: You can download it from: here
- disaster_source.csv: You can download it from: here
- disaster_target.csv: Please refer to the emergency.csv file.
- hotel_review.csv: You can download it from: here
- review_source.csv and review_target.csv: You can download them from: here
- socialmedia_source.csv and socialmedia_target.csv: You can download RS_2019-03.zst and RS_2019-04.zst from: here
All code was developed and tested on an Nvidia RTX A4000 (48 SMs, 16 GB) in the following environment.
- Ubuntu 18.04
- python 3.6.9
- gensim 3.8.3
- keras 2.6.0
- numpy 1.19.5
- pandas 1.1.5
- tensorflow 2.6.2
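To compare a local installation against the versions listed above, a small check along these lines may be useful. This is a sketch under assumptions, not part of the repository: the helper names are hypothetical, and it relies on each package exposing a `__version__` attribute under the import name shown.

```python
import importlib

# Versions copied from the environment list above.
EXPECTED_VERSIONS = {
    "gensim": "3.8.3",
    "keras": "2.6.0",
    "numpy": "1.19.5",
    "pandas": "1.1.5",
    "tensorflow": "2.6.2",
}


def installed_version(module_name):
    """Return a module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def version_mismatches(expected=EXPECTED_VERSIONS, lookup=installed_version):
    """Map each package whose version differs to an (expected, found) pair."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Calling `version_mismatches()` with no arguments returns an empty dict when all five packages match the list. A missing package appears with `None` as its found version.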
To pre-train the model, run the following script from the command line:
sh run_pretrain_offline.sh
To adapt the model online, run the following script from the command line:
sh run_update_online.sh
The following options can be passed to main.py:
- -dataset: Name of the dataset. (Supported names are cybersecurity, disaster, review)
- -model: Neural architecture of the OnlineClassifier. (Supported models are CNN, LSTM, Transformer)
- -ood_trigger: The number of batches to trigger OOD. Default is 5.
- -adjust_weight: Relative importance between learning efficiency and accuracy. Default is 0.5.
- -epochs: Number of epochs for training the model. Default is 20.
- -event_size: Size of streaming batches.
- -batch_size: Size of the batch used to train the model.
- -keyword_size: Size of the keyword set used to calculate the frequency indicator.
- -embedding_size: Size of the embedding layer.
- -output_path: Path for the output results.
- -token_path: Path for saving and loading the tokenizer.
- -model_path: Path for saving and loading the machine learning-based OnlineClassifier.
- -ml_path: Path for saving and loading the machine learning-based AccPredictor.
- -pretrain: Run the offline model pre-training.
- -update: Run the online model update.
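The options above could be declared with `argparse` roughly as follows. This is an illustrative sketch, not the actual parser in main.py: the function name `build_parser` is hypothetical, the argument types are assumptions, and `-pretrain` and `-update` are assumed to be boolean switches. Only the three documented defaults are set.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented options (not the repository's code)."""
    p = argparse.ArgumentParser(description="Sketch of the main.py options")
    p.add_argument("-dataset", choices=["cybersecurity", "disaster", "review"])
    p.add_argument("-model", choices=["CNN", "LSTM", "Transformer"])
    # Defaults below are the ones stated in the option list.
    p.add_argument("-ood_trigger", type=int, default=5)
    p.add_argument("-adjust_weight", type=float, default=0.5)
    p.add_argument("-epochs", type=int, default=20)
    # No defaults are documented for these options, so none are set here.
    for name in ("event_size", "batch_size", "keyword_size", "embedding_size"):
        p.add_argument("-" + name, type=int)
    for name in ("output_path", "token_path", "model_path", "ml_path"):
        p.add_argument("-" + name)
    # Assumption: -pretrain and -update take no value.
    p.add_argument("-pretrain", action="store_true")
    p.add_argument("-update", action="store_true")
    return p


if __name__ == "__main__":
    # Parse an example argument list instead of sys.argv.
    args = build_parser().parse_args(["-dataset", "disaster", "-model", "CNN", "-pretrain"])
    print(args.dataset, args.model, args.epochs, args.pretrain)
```

Restricting `-dataset` and `-model` with `choices` makes an unsupported name fail at parse time rather than later during training.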