@@ -11,25 +11,26 @@ Benchmarking instructions and scripts for model training coming later.
 ## INT8 Inference Instructions


-1 . Download large <> dataset income dataset from <>:
-
-   To be updated post dataset approval
-
+1. Download the large Kaggle Display Advertising Challenge Dataset from
+   http://labs.criteo.com/2014/02/kaggle-display-advertising-challenge-dataset/
+
 2. Pre-process the downloaded dataset to tfrecords using [preprocess_csv_tfrecords.py](/models/recommendation/tensorflow/wide_deep_large_ds/dataset/preprocess_csv_tfrecords.py)
-
 ```
-$ python3.6 preprocess_csv_tfrecords.py --csv-datafile eval.csv
+$ python3.6 preprocess_csv_tfrecords.py --csv-datafile eval.csv
+```
+3. Download the pre-trained model.
+```
+$ wget https://storage.googleapis.com/intel-optimized-tensorflow/models/wide_deep_int8_pretrained_model.pb
 ```
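For context on what `preprocess_csv_tfrecords.py` consumes: the Criteo Display Advertising Challenge data carries one label, 13 integer features, and 26 categorical features per row. The sketch below is a hypothetical sanity check, not part of this repo's scripts, and it assumes `eval.csv` is comma-separated:

```python
# Hypothetical sanity check for one row of the Criteo eval CSV:
# 1 label + 13 integer features + 26 categorical features = 40 fields.
SAMPLE_ROW = "0," + ",".join(["1"] * 13) + "," + ",".join(["68fd1e64"] * 26)

def check_row(row, delimiter=","):
    """Split a row and verify it has the expected 40 fields."""
    fields = row.split(delimiter)
    if len(fields) != 1 + 13 + 26:
        raise ValueError(f"expected 40 fields, got {len(fields)}")
    label, ints, cats = fields[0], fields[1:14], fields[14:]
    return label, ints, cats

label, ints, cats = check_row(SAMPLE_ROW)
print(label, len(ints), len(cats))  # 0 13 26
```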
-
-3. Clone the [intelai/models](https://github.com/intelai/models) repo.
+4. Clone the [intelai/models](https://github.com/intelai/models) repo.

 This repo has the launch script for running benchmarks, which we will
 use in the next step.

 ```
 $ git clone https://github.com/IntelAI/models.git
 ```
-4 . How to run benchmarks
+5. How to run benchmarks

 * Running benchmarks in latency mode, set `--batch-size 1`
 ```
@@ -44,7 +45,7 @@ Benchmarking instructions and scripts for model training coming later.
 --batch-size 1 \
 --socket-id 0 \
 --docker-image tensorflow/tensorflow:latest-mkl \
---in-graph /root/user/wide_deep_files/int8_wide_deep_final.pb \
+--in-graph /root/user/wide_deep_files/wide_deep_int8_pretrained_model.pb \
 --data-location /root/user/wide_deep_files/preprocessed_eval.tfrecords
 ```
 * Running benchmarks in throughput mode, set `--batch-size 1024`
@@ -60,7 +61,7 @@ Benchmarking instructions and scripts for model training coming later.
 --batch-size 1024 \
 --socket-id 0 \
 --docker-image tensorflow/tensorflow:latest-mkl \
---in-graph /root/user/wide_deep_files/int8_wide_deep_final.pb \
+--in-graph /root/user/wide_deep_files/wide_deep_int8_pretrained_model.pb \
 --data-location /root/user/wide_deep_files/preprocessed_eval.tfrecords
 ```
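The latency and throughput invocations differ only in `--batch-size`. As a sketch, the flags visible in this excerpt can be assembled programmatically; note that `launch_benchmark.py` takes additional leading flags (model name, precision, mode, and so on) that fall outside this diff, so this hypothetical helper covers only the flags shown above:

```python
import shlex

def build_benchmark_cmd(batch_size, in_graph, data_location):
    # Hypothetical helper: includes only the flags visible in the
    # excerpt above; the command's earlier flags are not shown here.
    return [
        "python", "launch_benchmark.py",
        "--batch-size", str(batch_size),
        "--socket-id", "0",
        "--docker-image", "tensorflow/tensorflow:latest-mkl",
        "--in-graph", in_graph,
        "--data-location", data_location,
    ]

cmd = build_benchmark_cmd(
    1024,
    "/root/user/wide_deep_files/wide_deep_int8_pretrained_model.pb",
    "/root/user/wide_deep_files/preprocessed_eval.tfrecords",
)
print(shlex.join(cmd))  # a single shell-escaped command line
```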
 6. The log file is saved to the value of `--output-dir`.
@@ -69,7 +70,6 @@ Benchmarking instructions and scripts for model training coming later.
 something like this:

 ```
-
 --------------------------------------------------
 Total test records : 2000000
 No of correct predicitons : 1549508
@@ -80,31 +80,33 @@ Benchmarking instructions and scripts for model training coming later.
 Latency (millisecond/batch) : 0.000988
 Throughput is (records/sec) : 1151892.25
 --------------------------------------------------
-numactl --cpunodebind=0 --membind=0 python /workspace/intelai_models/int8/inference.py --input-graph=/in_graph/int8_wide_deep_final.pb --inter-op-parallelism-threads=28 --intra-op-parallelism-threads=1 --omp-num-threads=1 --batch-size=1024 --kmp-blocktime=0 --datafile-path=/dataset
+numactl --cpunodebind=0 --membind=0 python /workspace/intelai_models/int8/inference.py --input-graph=/in_graph/wide_deep_int8_pretrained_model.pb --inter-op-parallelism-threads=28 --intra-op-parallelism-threads=1 --omp-num-threads=1 --batch-size=1024 --kmp-blocktime=0 --datafile-path=/dataset
 Ran inference with batch size 1024
 Log location outside container: {--output-dir value}/benchmark_wide_deep_large_ds_inference_int8_20190225_061815.log
 ```
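The headline accuracy can be recomputed from the counters in the sample log above; a small sketch (the embedded lines reproduce the log verbatim, including its "predicitons" spelling):

```python
# Recompute accuracy from the counters printed in the sample log.
LOG_SNIPPET = """\
Total test records : 2000000
No of correct predicitons : 1549508
"""

stats = {}
for line in LOG_SNIPPET.splitlines():
    key, _, value = line.partition(":")
    stats[key.strip()] = int(value)

accuracy = stats["No of correct predicitons"] / stats["Total test records"]
print(f"accuracy = {accuracy:.6f}")  # accuracy = 0.774754
```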

 ## FP32 Inference Instructions

-1. Download large <> dataset income dataset from <>:
-
-   To be updated post dataset approval
-
-2. Pre-process the downloaded dataset to tfrecords using [preprocess_csv_tfrecords.py](../../../../models/recommendation/tensorflow/wide_deep_large_ds/dataset/preprocess_csv_tfrecords.py)
-
-```
-$ python3.6 preprocess_csv_tfrecords.py --csv-datafile eval.csv
-```
-3. Clone the [intelai/models](https://github.com/intelai/models) repo.
+1. Download the large Kaggle Display Advertising Challenge Dataset from
+   http://labs.criteo.com/2014/02/kaggle-display-advertising-challenge-dataset/
+
+2. Pre-process the downloaded dataset to tfrecords using [preprocess_csv_tfrecords.py](/models/recommendation/tensorflow/wide_deep_large_ds/dataset/preprocess_csv_tfrecords.py)
+```
+$ python3.6 preprocess_csv_tfrecords.py --csv-datafile eval.csv
+```
+3. Download the pre-trained model.
+```
+$ wget https://storage.googleapis.com/intel-optimized-tensorflow/models/wide_deep_fp32_pretrained_model.pb
+```
+4. Clone the [intelai/models](https://github.com/intelai/models) repo.

 This repo has the launch script for running benchmarks, which we will
 use in the next step.

 ```
 $ git clone https://github.com/IntelAI/models.git
 ```
-4 . How to run benchmarks
+5. How to run benchmarks

 * Running benchmarks in latency mode, set `--batch-size 1`
 ```
@@ -119,7 +121,7 @@ Benchmarking instructions and scripts for model training coming later.
 --batch-size 1 \
 --socket-id 0 \
 --docker-image tensorflow/tensorflow:latest-mkl \
---in-graph /root/user/wide_deep_files/fp32_wide_deep_final.pb \
+--in-graph /root/user/wide_deep_files/wide_deep_fp32_pretrained_model.pb \
 --data-location /root/user/wide_deep_files/preprocessed_eval.tfrecords
 ```
 * Running benchmarks in throughput mode, set `--batch-size 1024`
@@ -135,7 +137,7 @@ Benchmarking instructions and scripts for model training coming later.
 --batch-size 1024 \
 --socket-id 0 \
 --docker-image tensorflow/tensorflow:latest-mkl \
---in-graph /root/user/wide_deep_files/fp32_wide_deep_final.pb \
+--in-graph /root/user/wide_deep_files/wide_deep_fp32_pretrained_model.pb \
 --data-location /root/user/wide_deep_files/preprocessed_eval.tfrecords
 ```
 6. The log file is saved to the value of `--output-dir`.
@@ -155,7 +157,7 @@ Benchmarking instructions and scripts for model training coming later.
 Latency (millisecond/batch) : 0.001749
 Throughput is (records/sec) : 571802.228
 --------------------------------------------------
-numactl --cpunodebind=0 --membind=0 python /workspace/intelai_models/int8/inference.py --input-graph=/in_graph/fp32_wide_deep_final.pb --inter-op-parallelism-threads=28 --intra-op-parallelism-threads=1 --omp-num-threads=1 --batch-size=1024 --kmp-blocktime=0 --datafile-path=/dataset
+numactl --cpunodebind=0 --membind=0 python /workspace/intelai_models/int8/inference.py --input-graph=/in_graph/wide_deep_fp32_pretrained_model.pb --inter-op-parallelism-threads=28 --intra-op-parallelism-threads=1 --omp-num-threads=1 --batch-size=1024 --kmp-blocktime=0 --datafile-path=/dataset
 Ran inference with batch size 1024
 Log location outside container: {--output-dir value}/benchmark_wide_deep_large_ds_inference_fp32_20190225_062206.log

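Relating the FP32 sample numbers above: at batch size 1024, a throughput of 571802.228 records/sec corresponds to roughly 558.4 batches/sec, and 1000/571802.228 ≈ 0.001749, which matches the reported latency line, suggesting that figure is per record rather than per batch. A quick check of the arithmetic:

```python
# Derived from the FP32 sample log above (batch size 1024 assumed).
throughput = 571802.228      # records/sec, from the sample log
batch_size = 1024

batches_per_sec = throughput / batch_size
ms_per_record = 1000.0 / throughput

print(f"{batches_per_sec:.1f} batches/sec")  # 558.4 batches/sec
print(f"{ms_per_record:.6f} ms/record")      # 0.001749 ms/record
```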