setup

usaito · usaito · commit 4b3ada612d8b · 2020-04-26T22:45:14.000+09:00
diff --git a/.gitignore b/.gitignore
@@ -127,3 +127,12 @@ dmypy.json
 
 # Pyre type checker
 .pyre/
+
+# others
+.DS_Store
+data/
+*.html
+*.ipynb
+*.csv
+*.npy
+*.json
diff --git a/LICENSE b/LICENSE
@@ -1,3 +1,5 @@
+Copyright 2020 held by Yuta Saito.  All rights reserved.
+
                                  Apache License
                            Version 2.0, January 2004
                         http://www.apache.org/licenses/
diff --git a/README.md b/README.md
@@ -1,2 +1,71 @@
-# asymmetric-tri-rec-real
-(SIGIR2020) “Asymmetric Tri-training for Debiasing Missing-Not-At-Random Explicit Feedback’’
+## Asymmetric Tri-training for Debiasing Missing-Not-At-Random Explicit Feedback
+
+---
+
+### About
+
+This repository accompanies the real-world experiments conducted in the paper "[**Asymmetric Tri-training for Debiasing Missing-Not-At-Random Explicit Feedback**]()," by [Yuta Saito](https://usaito.github.io/), which has been accepted at [_SIGIR2020_](https://sigir.org/sigir2020/) as a full paper.
+
+If you find this code useful in your research then please cite:
+```
+@inproceedings{saito2020asymmetric,
+  title={Asymmetric tri-training for debiasing missing-not-at-random explicit feedback},
+  author={Saito, Yuta},
+  booktitle={Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval},
+  year={2020}
+}
+```
+
+### Dependencies
+
+- numpy==1.17.2
+- pandas==0.25.1
+- scikit-learn==0.22.1
+- tensorflow==1.15.2
+- optuna==0.17.0
+- plotly==3.10.0
+- pyyaml==5.1.2
+
+### Running the code
+
+To run the simulation with real-world datasets,
+
+1. download the Coat dataset from [https://www.cs.cornell.edu/~schnabts/mnar/](https://www.cs.cornell.edu/~schnabts/mnar/) and put the `train.ascii` and `test.ascii` files into the `./data/coat/` directory.
+2. download the Yahoo! R3 dataset from [https://webscope.sandbox.yahoo.com/catalog.php?datatype=r](https://webscope.sandbox.yahoo.com/catalog.php?datatype=r) and put the `train.txt` and `test.txt` files into the `./data/yahoo/` directory.
+
+Then, run the following commands in the `./src/` directory:
+
+- for the MF-IPS models **without** *asymmetric tri-training*
+```bash
+for data in coat
+do
+  for model in uniform user item both nb nb_true
+  do
+    python main.py -d $data -m $model
+  done
+done
+```
+
+- for the MF-IPS models **with** *asymmetric tri-training* (our proposal)
+```bash
+for data in coat
+do
+  for model in uniform-at user-at item-at both-at nb-at nb_true-at
+  do
+    python main.py -d $data -m $model
+  done
+done
+```
+where (uniform, user, item, both, nb, nb_true) correspond to (*uniform propensity*, *user propensity*, *item propensity*, *user-item propensity*, *NB (uniform)*, *NB (true)*), respectively.
+
+These commands run the real-world dataset experiments reported in Section 6 of the paper.
+The tuned hyperparameters for all models can be found in `./hyper_params.yaml`.
+
+Once the simulations have finished running, the summarized results can be obtained by running the following command in the `./src/` directory:
+
+```bash
+python summarize_results -d coat yahoo
+```
+
+This creates the `./paper_results` directory.
+
diff --git a/requirements.txt b/requirements.txt
@@ -0,0 +1,8 @@
+numpy==1.17.2
+pandas==0.25.1
+scikit-learn==0.22.1
+tensorflow==1.15.2
+optuna==0.17.0
+plotly==3.10.0
+mlflow==1.7.0
+pyyaml==5.1.2
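
Review note: the dependency list added to `README.md` and the new `requirements.txt` pin the same versions, except that `mlflow==1.7.0` appears only in `requirements.txt`. The following is a minimal sketch of how the two lists could be compared automatically; it is not part of this commit. The helper names `parse_pins` and `diff_pins` are hypothetical, and the two strings are copied by hand from the hunks above rather than read from the repository.

```python
# Pinned versions copied from the README.md hunk ("Dependencies" section).
README_DEPS = """
- numpy==1.17.2
- pandas==0.25.1
- scikit-learn==0.22.1
- tensorflow==1.15.2
- optuna==0.17.0
- plotly==3.10.0
- pyyaml==5.1.2
"""

# Pinned versions copied from the requirements.txt hunk.
REQUIREMENTS_TXT = """
numpy==1.17.2
pandas==0.25.1
scikit-learn==0.22.1
tensorflow==1.15.2
optuna==0.17.0
plotly==3.10.0
mlflow==1.7.0
pyyaml==5.1.2
"""


def parse_pins(text):
    """Return {package: version} for every 'name==version' line in text.

    Hypothetical helper: accepts both plain requirement lines and
    Markdown bullets ('- name==version').
    """
    pins = {}
    for raw in text.splitlines():
        line = raw.strip().lstrip("- ").strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name.strip()] = version.strip()
    return pins


def diff_pins(a, b):
    """Return (only in a, only in b, pinned to different versions)."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    mismatched = sorted(n for n in set(a) & set(b) if a[n] != b[n])
    return only_a, only_b, mismatched


only_readme, only_requirements, mismatched = diff_pins(
    parse_pins(README_DEPS), parse_pins(REQUIREMENTS_TXT)
)
print("only in README:", only_readme)
print("only in requirements.txt:", only_requirements)
print("version mismatches:", mismatched)
```

For the lists in this commit, the only difference reported is `mlflow` under "only in requirements.txt". Whether `mlflow` should also be listed in the README is for the author to decide.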
