This fork of jiant1 supports estimating online codelength on the edge probing tasks. If you are unfamiliar with jiant, we suggest setting up the jiant environment by following this tutorial.
We strongly recommend running a standard edge probing experiment at least once, following the tutorial here, before running the online coding experiments. The tutorial covers setting the environment variables, downloading and preprocessing the data, setting config flags, etc.
Run the following commands to compute the portion codelengths for the dependency labeling task with RoBERTa-base. You can also customize the data split by modifying online_code_data_split; see the sketch after the commands.
python main.py --config_file jiant/config/edgeprobe/edgeprobe_miniberta.conf --overrides 'exp_name=dep_roberta-base_1, target_tasks=edges-dep-ud-ewt, transformers_output_mode=mix, input_module=roberta-base, tokenizer=roberta-base, target_train_val_interval=1000, batch_size=32, target_train_max_vals=100, lr=0.0001, online_code_preshuffle_seed=1234, online_code_data_split="0,0.002,0.004", patience=10'
python main.py --config_file jiant/config/edgeprobe/edgeprobe_miniberta.conf --overrides 'exp_name=dep_roberta-base_2, target_tasks=edges-dep-ud-ewt, transformers_output_mode=mix, input_module=roberta-base, tokenizer=roberta-base, target_train_val_interval=1000, batch_size=32, target_train_max_vals=100, lr=0.0001, online_code_preshuffle_seed=1234, online_code_data_split="0,0.004,0.008", patience=10'
python main.py --config_file jiant/config/edgeprobe/edgeprobe_miniberta.conf --overrides 'exp_name=dep_roberta-base_3, target_tasks=edges-dep-ud-ewt, transformers_output_mode=mix, input_module=roberta-base, tokenizer=roberta-base, target_train_val_interval=1000, batch_size=32, target_train_max_vals=100, lr=0.0001, online_code_preshuffle_seed=1234, online_code_data_split="0,0.008,0.016", patience=10'
python main.py --config_file jiant/config/edgeprobe/edgeprobe_miniberta.conf --overrides 'exp_name=dep_roberta-base_4, target_tasks=edges-dep-ud-ewt, transformers_output_mode=mix, input_module=roberta-base, tokenizer=roberta-base, target_train_val_interval=1000, batch_size=32, target_train_max_vals=100, lr=0.0001, online_code_preshuffle_seed=1234, online_code_data_split="0,0.016,0.032", patience=10'
python main.py --config_file jiant/config/edgeprobe/edgeprobe_miniberta.conf --overrides 'exp_name=dep_roberta-base_5, target_tasks=edges-dep-ud-ewt, transformers_output_mode=mix, input_module=roberta-base, tokenizer=roberta-base, target_train_val_interval=1000, batch_size=32, target_train_max_vals=100, lr=0.0001, online_code_preshuffle_seed=1234, online_code_data_split="0,0.032,0.0625", patience=10'
python main.py --config_file jiant/config/edgeprobe/edgeprobe_miniberta.conf --overrides 'exp_name=dep_roberta-base_6, target_tasks=edges-dep-ud-ewt, transformers_output_mode=mix, input_module=roberta-base, tokenizer=roberta-base, target_train_val_interval=1000, batch_size=32, target_train_max_vals=100, lr=0.0001, online_code_preshuffle_seed=1234, online_code_data_split="0,0.0625,0.125", patience=10'
python main.py --config_file jiant/config/edgeprobe/edgeprobe_miniberta.conf --overrides 'exp_name=dep_roberta-base_7, target_tasks=edges-dep-ud-ewt, transformers_output_mode=mix, input_module=roberta-base, tokenizer=roberta-base, target_train_val_interval=1000, batch_size=32, target_train_max_vals=100, lr=0.0001, online_code_preshuffle_seed=1234, online_code_data_split="0,0.125,0.25", patience=10'
python main.py --config_file jiant/config/edgeprobe/edgeprobe_miniberta.conf --overrides 'exp_name=dep_roberta-base_8, target_tasks=edges-dep-ud-ewt, transformers_output_mode=mix, input_module=roberta-base, tokenizer=roberta-base, target_train_val_interval=1000, batch_size=32, target_train_max_vals=100, lr=0.0001, online_code_preshuffle_seed=1234, online_code_data_split="0,0.25,0.5", patience=10'
python main.py --config_file jiant/config/edgeprobe/edgeprobe_miniberta.conf --overrides 'exp_name=dep_roberta-base_9, target_tasks=edges-dep-ud-ewt, transformers_output_mode=mix, input_module=roberta-base, tokenizer=roberta-base, target_train_val_interval=1000, batch_size=32, target_train_max_vals=100, lr=0.0001, online_code_preshuffle_seed=1234, online_code_data_split="0,0.5,1", patience=10'
python main.py --config_file jiant/config/edgeprobe/edgeprobe_miniberta.conf --overrides 'exp_name=dep_roberta-base_10, target_tasks=edges-dep-ud-ewt, transformers_output_mode=mix, input_module=roberta-base, tokenizer=roberta-base, target_train_val_interval=1000, batch_size=32, target_train_max_vals=100, lr=0.0001, online_code_preshuffle_seed=1234, online_code_data_split="0,1,2", patience=10'
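Each run trains the probe on one cumulative portion of the data and evaluates it on the next block, which is what online_code_data_split controls. Assuming the three comma-separated values are (start, train end, eval end) fractions of the pre-shuffled training set (an interpretation inferred from the doubling pattern above, not a confirmed spec), the following Python sketch shows how the ten runs partition the data:

# Rough sketch of how the ten runs above appear to partition the training data.
# ASSUMPTION: each "start,train_end,eval_end" triple means the probe is trained on
# examples in [start, train_end) and its codelength is measured on [train_end, eval_end)
# of the pre-shuffled training set; this is inferred from the fractions used above,
# not read out of the jiant source.
splits = [
    (0, 0.002, 0.004), (0, 0.004, 0.008), (0, 0.008, 0.016), (0, 0.016, 0.032),
    (0, 0.032, 0.0625), (0, 0.0625, 0.125), (0, 0.125, 0.25), (0, 0.25, 0.5),
    (0, 0.5, 1), (0, 1, 2),
]
num_train_examples = 12543  # hypothetical size of the edges-dep-ud-ewt training set

for i, (start, train_end, eval_end) in enumerate(splits, 1):
    n_train = int(num_train_examples * min(train_end, 1)) - int(num_train_examples * start)
    n_eval = int(num_train_examples * min(eval_end, 1)) - int(num_train_examples * min(train_end, 1))
    print(f"run {i:2d}: train on {n_train:5d} examples, measure codelength on the next {n_eval:5d}")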
After you finish all the experiments, copy compute_online_codelength.py to $JIANT_PROJECT_PREFIX and run the script to get the final results:
cp compute_online_codelength.py $JIANT_PROJECT_PREFIX
cd $JIANT_PROJECT_PREFIX
python compute_online_codelength.py
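compute_online_codelength.py aggregates the per-run results saved under $JIANT_PROJECT_PREFIX. For intuition, the sketch below shows the standard online (prequential) codelength arithmetic from the MDL probing literature; it is only an illustration with placeholder numbers, not the actual logic or output format of the script.

import math

# Minimal sketch of the standard online (prequential) codelength computation,
# NOT a drop-in replacement for compute_online_codelength.py. The losses and
# counts below are placeholders; the real script reads the per-run results from
# the experiment directories under $JIANT_PROJECT_PREFIX.
num_labels = 49         # hypothetical number of dependency labels
total_examples = 12543  # hypothetical total number of probing targets
first_block = 25        # examples in the first portion, transmitted with a uniform code
# one entry per run above: total cross-entropy (in nats) on that run's evaluation block
block_losses_nats = [60.0, 110.0, 200.0, 350.0, 600.0, 1000.0, 1700.0, 2800.0, 4500.0, 8000.0]

uniform_first = first_block * math.log2(num_labels)            # cost of the first block
online_bits = uniform_first + sum(l / math.log(2) for l in block_losses_nats)
uniform_total = total_examples * math.log2(num_labels)          # baseline: uniform code for all examples

print(f"online codelength: {online_bits / 1024:.2f} kbits")
print(f"compression vs. uniform code: {uniform_total / online_bits:.2f}x")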