# Inception ResNet V2

This document has instructions for how to run Inception ResNet V2 for the
following modes/precisions:
* [Int8 inference](#int8-inference-instructions)
* [FP32 inference](#fp32-inference-instructions)

## Int8 Inference Instructions

1. Clone this [intelai/models](https://github.com/IntelAI/models)
repository:

```
$ git clone https://github.com/IntelAI/models.git
```

This repository includes launch scripts for running benchmarks and an
optimized version of the Inception ResNet V2 model code.

2. A link to download the pre-trained model is coming soon.

3. Build a docker image using master of the official
[TensorFlow](https://github.com/tensorflow/tensorflow) repository with
`--config=mkl`. See the instructions for
[how to build from source](https://software.intel.com/en-us/articles/intel-optimization-for-tensorflow-installation-guide#inpage-nav-5).

4. If you would like to run Inception ResNet V2 inference and test for
accuracy, you will need the full ImageNet dataset. Benchmarking for latency
or throughput does not require the ImageNet dataset.

Register and download the
[ImageNet dataset](http://image-net.org/download-images).

Once you have the raw ImageNet dataset downloaded, it needs to be converted
to the TFRecord format. This is done using the
[build_imagenet_data.py](https://github.com/tensorflow/models/blob/master/research/inception/inception/data/build_imagenet_data.py)
script. There are instructions in the header of the script explaining
its usage.

After the script has completed, you should have a directory with the
sharded dataset that looks something like this:

```
$ ll /home/myuser/datasets/ImageNet_TFRecords
-rw-r--r--. 1 user 143009929 Jun 20 14:53 train-00000-of-01024
-rw-r--r--. 1 user 144699468 Jun 20 14:53 train-00001-of-01024
-rw-r--r--. 1 user 138428833 Jun 20 14:53 train-00002-of-01024
...
-rw-r--r--. 1 user 143137777 Jun 20 15:08 train-01022-of-01024
-rw-r--r--. 1 user 143315487 Jun 20 15:08 train-01023-of-01024
-rw-r--r--. 1 user  52223858 Jun 20 15:08 validation-00000-of-00128
-rw-r--r--. 1 user  51019711 Jun 20 15:08 validation-00001-of-00128
-rw-r--r--. 1 user  51520046 Jun 20 15:08 validation-00002-of-00128
...
-rw-r--r--. 1 user  52508270 Jun 20 15:09 validation-00126-of-00128
-rw-r--r--. 1 user  55292089 Jun 20 15:09 validation-00127-of-00128
```
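
The shard file names follow the fixed `prefix-xxxxx-of-yyyyy` pattern shown in the listing (1024 training shards and 128 validation shards). As a quick sanity check before benchmarking, you can generate the expected names and compare them against the directory contents. This small helper is illustrative only and not part of the repository's scripts:

```python
import os

def expected_shards(prefix, num_shards):
    """Expected TFRecord shard names, e.g. train-00000-of-01024."""
    return ["%s-%05d-of-%05d" % (prefix, i, num_shards) for i in range(num_shards)]

def missing_shards(dataset_dir, prefix, num_shards):
    """Shard files from the expected set that are absent from dataset_dir."""
    present = set(os.listdir(dataset_dir)) if os.path.isdir(dataset_dir) else set()
    return [name for name in expected_shards(prefix, num_shards)
            if name not in present]
```

If both `missing_shards(path, "train", 1024)` and `missing_shards(path, "validation", 128)` come back empty, the conversion most likely completed.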

5. Next, navigate to the `benchmarks` directory in your local clone of
the [intelai/models](https://github.com/IntelAI/models) repo from step 1.
The `launch_benchmark.py` script in the `benchmarks` directory is
used for starting a benchmarking run in an optimized TensorFlow docker
container. It has arguments to specify which model, framework, mode,
precision, and docker image to use, along with your path to the ImageNet
TF Records that you generated in step 4.

Substitute in your own `--data-location` (from step 4, for accuracy
only), `--in-graph` pre-trained model file path (from step 2),
and the name/tag for your docker image (from step 3).

Inception ResNet V2 can be run for accuracy, latency benchmarking, or throughput
benchmarking. Use one of the examples below, depending on your use case.

For accuracy (using your `--data-location`, `--accuracy-only` and
`--batch-size 100`):

```
python launch_benchmark.py \
    --model-name inception_resnet_v2 \
    --precision int8 \
    --mode inference \
    --framework tensorflow \
    --accuracy-only \
    --batch-size 100 \
    --docker-image tf_int8_docker_image \
    --in-graph /home/myuser/inception_resnet_v2_int8_pretrained_model.pb \
    --data-location /home/myuser/datasets/ImageNet_TFRecords
```

For latency (using `--benchmark-only`, `--socket-id 0` and `--batch-size 1`):

```
python launch_benchmark.py \
    --model-name inception_resnet_v2 \
    --precision int8 \
    --mode inference \
    --framework tensorflow \
    --benchmark-only \
    --batch-size 1 \
    --socket-id 0 \
    --docker-image tf_int8_docker_image \
    --in-graph /home/myuser/inception_resnet_v2_int8_pretrained_model.pb
```

For throughput (using `--benchmark-only`, `--socket-id 0` and `--batch-size 128`):

```
python launch_benchmark.py \
    --model-name inception_resnet_v2 \
    --precision int8 \
    --mode inference \
    --framework tensorflow \
    --benchmark-only \
    --batch-size 128 \
    --socket-id 0 \
    --docker-image tf_int8_docker_image \
    --in-graph /home/myuser/inception_resnet_v2_int8_pretrained_model.pb
```

Note that the `--verbose` flag can be added to any of the above commands
to get additional debug output.
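
The three commands differ only in a handful of flags: `--accuracy-only` plus `--data-location` for the accuracy run, versus `--benchmark-only` with `--socket-id 0` and a different batch size for the benchmarking runs. A small Python sketch that assembles the argument list makes those differences explicit; the helper name and its defaults are hypothetical, not part of `launch_benchmark.py`:

```python
def launch_args(precision, batch_size, accuracy=False, data_location=None,
                socket_id=None,
                docker_image="tf_int8_docker_image",
                in_graph="/home/myuser/inception_resnet_v2_int8_pretrained_model.pb"):
    """Assemble the launch_benchmark.py argument list for one run."""
    args = ["--model-name", "inception_resnet_v2",
            "--precision", precision,
            "--mode", "inference",
            "--framework", "tensorflow",
            "--accuracy-only" if accuracy else "--benchmark-only",
            "--batch-size", str(batch_size),
            "--docker-image", docker_image,
            "--in-graph", in_graph]
    if socket_id is not None:
        args += ["--socket-id", str(socket_id)]
    if data_location is not None:
        args += ["--data-location", data_location]
    return args

# accuracy run (batch 100) vs. latency run (batch 1, pinned to socket 0)
acc = launch_args("int8", 100, accuracy=True,
                  data_location="/home/myuser/datasets/ImageNet_TFRecords")
lat = launch_args("int8", 1, socket_id=0)
```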

6. The log file is saved to the
`models/benchmarks/common/tensorflow/logs` directory. Below are
examples of what the tail of your log file should look like for the
different configs.

Example log tail when running for accuracy:

```
Processed 49800 images. (Top1 accuracy, Top5 accuracy) = (0.8015, 0.9523)
Processed 49900 images. (Top1 accuracy, Top5 accuracy) = (0.8016, 0.9524)
Processed 50000 images. (Top1 accuracy, Top5 accuracy) = (0.8015, 0.9524)
lscpu_path_cmd = command -v lscpu
lscpu located here: /usr/bin/lscpu
Ran inference with batch size 100
Log location outside container: /home/myuser/intelai/models/benchmarks/common/tensorflow/logs/benchmark_inception_resnet_v2_inference_int8_20190104_193854.log
```
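
In this output, an image counts as correct at Top-k when its true label is among the model's k highest-scoring classes, and the reported pairs are running averages over the images processed so far. The following is a minimal illustrative sketch of that calculation (the benchmark computes this internally; these helpers are not part of the repo):

```python
def topk_correct(scores, true_label, k):
    """True if true_label is among the k highest-scoring class indices."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return true_label in ranked[:k]

def running_accuracy(samples, k):
    """Fraction of (scores, true_label) pairs that are correct at Top-k."""
    hits = sum(topk_correct(scores, label, k) for scores, label in samples)
    return hits / len(samples)
```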

Example log tail when benchmarking for latency:
```
Iteration 39: 0.052 sec
Iteration 40: 0.052 sec
Average time: 0.052 sec
Batch size = 1
Latency: 52.347 ms
Throughput: 19.103 images/sec
lscpu_path_cmd = command -v lscpu
lscpu located here: /usr/bin/lscpu
Ran inference with batch size 1
Log location outside container: /home/myuser/intelai/models/benchmarks/common/tensorflow/logs/benchmark_inception_resnet_v2_inference_int8_20190104_194938.log
```

Example log tail when benchmarking for throughput:
```
Iteration 39: 0.993 sec
Iteration 40: 1.023 sec
Average time: 0.996 sec
Batch size = 128
Throughput: 128.458 images/sec
lscpu_path_cmd = command -v lscpu
lscpu located here: /usr/bin/lscpu
Ran inference with batch size 128
Log location outside container: /home/myuser/intelai/models/benchmarks/common/tensorflow/logs/benchmark_inception_resnet_v2_inference_int8_20190104_195504.log
```
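
The derived lines in these log tails appear to follow directly from the average iteration time and the batch size: latency is the average seconds per batch expressed in milliseconds, and throughput is batch size divided by average time. A quick sketch of that arithmetic (the `0.052347` figure below is inferred from the latency run's summary lines and is more precise than the rounded `0.052 sec` iteration times):

```python
def log_metrics(avg_time_sec, batch_size):
    """Derive the log's summary lines from the average per-batch time."""
    latency_ms = avg_time_sec * 1000.0      # ms per batch
    throughput = batch_size / avg_time_sec  # images per second
    return latency_ms, throughput

# the latency run above: batch size 1, ~0.052347 s per iteration
latency_ms, throughput = log_metrics(0.052347, 1)
```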

## FP32 Inference Instructions

1. Clone this [intelai/models](https://github.com/IntelAI/models)