Commit 4b3ada6

setup
1 parent ff17ad2 commit 4b3ada6

4 files changed, +90 -2 lines changed

.gitignore

+9
@@ -127,3 +127,12 @@ dmypy.json
 
 # Pyre type checker
 .pyre/
+
+# others
+.DS_Store
+data/
+*.html
+*.ipynb
+*.csv
+*.npy
+*.json

LICENSE

+2
@@ -1,3 +1,5 @@
+Copyright 2020 held by Yuta Saito. All rights reserved.
+
 Apache License
 Version 2.0, January 2004
 http://www.apache.org/licenses/

README.md

+71 -2
@@ -1,2 +1,71 @@
-# asymmetric-tri-rec-real
-(SIGIR2020) “Asymmetric Tri-training for Debiasing Missing-Not-At-Random Explicit Feedback’’
+## Asymmetric Tri-training for Debiasing Missing-Not-At-Random Explicit Feedback
+
+---
+
+### About
+
+This repository accompanies the real-world experiments conducted in the paper "[**Asymmetric Tri-training for Debiasing Missing-Not-At-Random Explicit Feedback**]()," by [Yuta Saito](https://usaito.github.io/), which has been accepted at [_SIGIR2020_](https://sigir.org/sigir2020/) as a full paper.
+
+If you find this code useful in your research then please cite:
+```
+@inproceedings{saito2020asymmetric,
+title={Asymmetric tri-training for debiasing missing-not-at-random explicit feedback},
+author={Saito, Yuta},
+booktitle={Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval},
+year={2020}
+}
+```
+
+### Dependencies
+
+- numpy==1.17.2
+- pandas==0.25.1
+- scikit-learn==0.22.1
+- tensorflow==1.15.2
+- optuna==0.17.0
+- plotly==3.10.0
+- pyyaml==5.1.2
+
+### Running the code
+
+To run the simulation with real-world datasets,
+
+1. download the Coat dataset from [https://www.cs.cornell.edu/~schnabts/mnar/](https://www.cs.cornell.edu/~schnabts/mnar/) and put `train.ascii` and `test.ascii` files into `./data/coat/` directory.
+2. download the Yahoo! R3 dataset from [https://webscope.sandbox.yahoo.com/catalog.php?datatype=r](https://webscope.sandbox.yahoo.com/catalog.php?datatype=r) and put `train.txt` and `test.txt` files into `./data/yahoo/` directory.
+
+Then, run the following commands in the `./src/` directory:
+
+- for the MF-IPS models **without** *asymmetric tri-training*
+```bash
+for data in coat
+do
+for model in uniform user item both nb nb_true
+do
+python main.py -d $data -m $model
+done
+done
+```
+
+- for the MF-IPS models **with** *asymmetric tri-training* (our proposal)
+```bash
+for data in coat
+do
+for model in uniform-at user-at item-at both-at nb-at nb_true-at
+do
+python main.py -d $data -m $model
+done
+done
+```
+where (uniform, user, item, both, nb, nb_true) correspond to (*uniform propensity*, *user propensity*, *item propensity*, *user-item propensity*, *NB (uniform)*, *NB (true)*), respectively.
+
+These commands will run simulations with real-world datasets conducted in Section 6.
+The tuned hyperparameters for all models can be found in `./hyper_params.yaml`.
+
+Once the simulations have finished running, the summarized results can be obtained by running the following command in the `./src/` directory:
+
+```bash
+python summarize_results -d coat yahoo
+```
+
+This creates the `./paper_results` directory.
+
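Reviewer note (not part of this commit): the two shell loops in the README differ only in the `-at` suffix appended to each model name. A minimal sketch of the same run grid in Python follows. The helper name `build_commands` is hypothetical, and the only thing assumed about `main.py` is the `-d` / `-m` invocation shown in the README.

```python
from itertools import product

# Propensity estimator names listed in the README.
BASE_MODELS = ["uniform", "user", "item", "both", "nb", "nb_true"]


def build_commands(datasets, asymmetric_tri_training=False):
    """Return one argv list per (dataset, model) pair.

    Hypothetical helper: it only mirrors the README loops and assumes
    main.py accepts the -d and -m flags shown there.
    """
    # The README selects asymmetric tri-training via an "-at" model suffix.
    suffix = "-at" if asymmetric_tri_training else ""
    return [
        ["python", "main.py", "-d", data, "-m", model + suffix]
        for data, model in product(datasets, BASE_MODELS)
    ]


if __name__ == "__main__":
    # Print the baseline runs first, then the asymmetric tri-training runs.
    for cmd in build_commands(["coat"]) + build_commands(["coat"], True):
        print(" ".join(cmd))
```

Each returned list could be passed to `subprocess.run(cmd, cwd="src", check=True)` to launch the run from the `./src/` directory, as the README instructs.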

requirements.txt

+8
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
numpy==1.17.2
2+
pandas==0.25.1
3+
scikit-learn==0.22.1
4+
tensorflow==1.15.2
5+
optuna==0.17.0
6+
plotly==3.10.0
7+
mlflow==1.7.0
8+
pyyaml==5.1.2
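
Reviewer note (not part of this commit): `requirements.txt` pins `mlflow==1.7.0`, which the README's Dependencies list omits. A small sketch of how the two lists could be compared is shown below. The helpers `parse_pins` and `missing_pins` are hypothetical, and the inputs are copied by hand from the diff above.

```python
def parse_pins(lines):
    """Map package name -> pinned version for lines like 'name==version'.

    Accepts both plain requirement lines and markdown bullets ('- name==ver').
    """
    pins = {}
    for line in lines:
        entry = line.strip().lstrip("- ").strip()
        if not entry or entry.startswith("#"):
            continue
        name, sep, version = entry.partition("==")
        if sep:  # skip lines without an exact pin
            pins[name.strip().lower()] = version.strip()
    return pins


def missing_pins(reference, other):
    """Names pinned in `reference` that do not appear in `other`."""
    return sorted(set(reference) - set(other))


# Copied from the README and requirements.txt hunks in this commit.
README_DEPS = [
    "- numpy==1.17.2", "- pandas==0.25.1", "- scikit-learn==0.22.1",
    "- tensorflow==1.15.2", "- optuna==0.17.0", "- plotly==3.10.0",
    "- pyyaml==5.1.2",
]
REQUIREMENTS = [
    "numpy==1.17.2", "pandas==0.25.1", "scikit-learn==0.22.1",
    "tensorflow==1.15.2", "optuna==0.17.0", "plotly==3.10.0",
    "mlflow==1.7.0", "pyyaml==5.1.2",
]

readme_pins = parse_pins(README_DEPS)
requirement_pins = parse_pins(REQUIREMENTS)
print(missing_pins(requirement_pins, readme_pins))  # → ['mlflow']
```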
