Commit b749ecc

General Updates (#13)
1 parent a37ea8f commit b749ecc

13 files changed, +495 -575 lines changed

.pre-commit-config.yaml

+39 -40
@@ -1,48 +1,47 @@
 # See https://pre-commit.com for more information
 # See https://pre-commit.com/hooks.html for more hooks
 repos:
-# Linting
-- repo: https://github.com/pre-commit/pre-commit-hooks
-  rev: v4.4.0
-  hooks:
-  - id: check-ast
-  - id: trailing-whitespace
-  - id: end-of-file-fixer
-    exclude_types: [jupyter]
-  - id: check-toml
-  - id: check-added-large-files
-- repo: https://github.com/psf/black
-  rev: 23.9.1
-  hooks:
-  - id: black
-  - id: black-jupyter
-# Python static analysis
-- repo: https://github.com/charliermarsh/ruff-pre-commit
-  # Ruff version.
-  rev: 'v0.0.288'
-  hooks:
-  - id: ruff
-# Shell static analysis
-- repo: https://github.com/koalaman/shellcheck-precommit
-  rev: v0.9.0
-  hooks:
-  - id: shellcheck
-    # precommit invokes shellcheck once per file. shellcheck complains if file
-    # includes another file not given on the command line. Ignore this, since
-    # they'll just get checked in a separate shellcheck invocation.
-    args: ["-e", "SC1091"]
-# Misc
-- repo: https://github.com/codespell-project/codespell
-  rev: v2.2.5
-  hooks:
-  - id: codespell
-    args: ["--skip=*.lock,*.pyc,tests/testdata/*,*.ipynb,*.csv","--ignore-words-list=codebook"]
-# Hooks that run in local environment (not isolated venv) as they need
-# same dependencies as our package.
--   repo: https://github.com/pre-commit/mirrors-mypy
+  # Linting
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.4.0
+    hooks:
+      - id: check-ast
+      - id: trailing-whitespace
+      - id: end-of-file-fixer
+        exclude_types: [jupyter]
+      - id: check-toml
+      - id: check-added-large-files
+  # Python static analysis
+  - repo: https://github.com/charliermarsh/ruff-pre-commit
+    # Ruff version.
+    rev: "v0.0.288"
+    hooks:
+      - id: ruff
+  # Shell static analysis
+  - repo: https://github.com/koalaman/shellcheck-precommit
+    rev: v0.9.0
+    hooks:
+      - id: shellcheck
+        # precommit invokes shellcheck once per file. shellcheck complains if file
+        # includes another file not given on the command line. Ignore this, since
+        # they'll just get checked in a separate shellcheck invocation.
+        args: ["-e", "SC1091"]
+  # Misc
+  - repo: https://github.com/codespell-project/codespell
+    rev: v2.2.5
+    hooks:
+      - id: codespell
+        args:
+          [
+            "--skip=*.lock,*.pyc,tests/testdata/*,*.ipynb,*.csv",
+            "--ignore-words-list=codebook",
+          ]
+  # Hooks that run in local environment (not isolated venv) as they need
+  # same dependencies as our package.
+  - repo: https://github.com/pre-commit/mirrors-mypy
     rev: v1.5.1
     hooks:
-    -   id: mypy
+      - id: mypy
         args: [--follow-imports=skip]
 
 exclude: (mod_model_classes.py|tl_mods.py|run_clm.py)
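
A note on the unchanged top-level `exclude` line: pre-commit documents `files` and `exclude` patterns as Python regular expressions matched with `re.search`. The following is a minimal sketch of how this pattern would behave under that assumption. The `is_excluded` helper and the example paths are hypothetical illustrations, not pre-commit's own code or files known to exist in the repository.

```python
import re

# Pattern copied verbatim from the `exclude:` line of the config.
EXCLUDE = re.compile(r"(mod_model_classes.py|tl_mods.py|run_clm.py)")


def is_excluded(path: str) -> bool:
    """Return True if the path would be skipped, assuming re.search semantics."""
    return EXCLUDE.search(path) is not None


# Hypothetical paths, used only for illustration.
paths = [
    "codebook_features/mod_model_classes.py",
    "codebook_features/run_clm.py",
    "codebook_features/models.py",
    "docs/tl_modsXpy",  # the unescaped "." matches any single character
]
results = {p: is_excluded(p) for p in paths}
for p, skipped in results.items():
    print(f"{p}: {'excluded' if skipped else 'checked'}")
```

Because the pattern is unanchored and its dots are unescaped, it matches anywhere in a path and is slightly broader than the three literal file names. Writing `\.` for each dot would tighten it, if that matters for the project.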

README.md

+3 -4
@@ -55,12 +55,12 @@ python -m codebook_features.train_codebook model_args.model_name_or_path=ronenel
 
 Once a codebook model has been trained and saved on disk, we can use the interpretability webapp to visualize the codebook. First, we need to generate the relevant cache files for the codebook model that is required for the webapp. This can be done by running the script `codebook_features/code_search_cache.py`:
 ```
-python -m codebook_features.code_search_cache --model_name <path to codebook model> --pretrained_path --dataset_name <dataset name> --dataset_config_name <dataset config name> --output_base_dir <path to output directory>
+python -m codebook_features.code_search_cache --orig_model_name <orig name/path of model> --pretrained_path <path to codebook model> --dataset_name <dataset name> --dataset_config_name <dataset config name> --output_base_dir <path to output directory>
 ```
 
-Once the cache files have been generated, we can run the webapp using the following command:
+Once the cache files have been generated, we can run the webapp using the following command with the base output directory used in the above command:
 ```
-python -m streamlit run codebook_features/webapp/Code_Browser.py -- --cache_dir <path to cache directory>
+python -m streamlit run codebook_features/webapp/Code_Browser.py -- --cache_dir <path to the base cache directory>
 ```
 
 ### Code Intervention
@@ -97,7 +97,6 @@ The `codebook_features/train_fsm_model.py` script provides an algorithmic sequen
 The `codebook_features/train_fsm_model.py` script can be used to train a codebook model on the TokFSM dataset. The syntax for the arguments and training procedure is similar to the `train_codebook.py` script. The default arguments for the training script is available in `codebook_features/config/fsm_main.yaml`.
 
 
-
 For tutorials on how to use the library, please see the [Codebook Features Tutorials](https://github.com/taufeeque9/codebook-features/tree/main/tutorials).
 
 </details>
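
The updated README ties the two commands together: the directory passed as `--output_base_dir` when generating the cache is the one to pass as `--cache_dir` to the webapp. A minimal sketch of that pairing follows, building both argument lists from one shared value. The helper names and every concrete value (model name, paths, dataset) are hypothetical placeholders; only the module names and flags come from the README text in this diff. The commands are printed rather than executed.

```python
import shlex
import sys


def build_cache_command(orig_model_name, pretrained_path, dataset_name,
                        dataset_config_name, output_base_dir):
    """Argument list for the cache-generation step shown in the README."""
    return [
        sys.executable, "-m", "codebook_features.code_search_cache",
        "--orig_model_name", orig_model_name,
        "--pretrained_path", pretrained_path,
        "--dataset_name", dataset_name,
        "--dataset_config_name", dataset_config_name,
        "--output_base_dir", output_base_dir,
    ]


def build_webapp_command(cache_dir):
    """Argument list for launching the Streamlit webapp shown in the README."""
    return [
        sys.executable, "-m", "streamlit", "run",
        "codebook_features/webapp/Code_Browser.py",
        "--", "--cache_dir", cache_dir,
    ]


# Hypothetical placeholder values; substitute your own.
base_dir = "cache"
cache_cmd = build_cache_command(
    orig_model_name="<orig name/path of model>",
    pretrained_path="<path to codebook model>",
    dataset_name="<dataset name>",
    dataset_config_name="<dataset config name>",
    output_base_dir=base_dir,
)
webapp_cmd = build_webapp_command(cache_dir=base_dir)

# Print shell-quoted commands for inspection instead of running them.
print(shlex.join(cache_cmd))
print(shlex.join(webapp_cmd))
```

Once the placeholders are filled in, each list could be passed to `subprocess.run(cmd, check=True)`. Keeping `base_dir` in one variable avoids the two paths drifting apart.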
