Commit 20a2fbb

WafaaT, kanvi-nervana, akhilgoe, syedshahbaaz, gera-aldama
authored
Sync with r2.12 (#1416)
* Enabling float16 training for ResNet50v1_5 (#1079) * Enabling float16 training for BERT large / SQuADv1.1 (#1018) * Enable FP16 support for Distilbert Inference (Tensorflow) (#1075) * Fix OOM caused by incorrect thread setting. (#1084) * Ejan/model zoo quickstart (#1082) * Fix syntax for resnet50v1.5 inference * Fix syntax for bert_large accuracy * Import GPU Max and Flex Series workloads from develop-gpu (#1080) * Add GPU DLRM FP16 inference * Change to install ATS drivers from local repo * Add GPU PYT bert large FP16 Inference * fix _FusedMatmlul issue in GPU * Updated PyTorch to use the common compiler partial and added ARG for the env var file since that changes per compiler * Add package for ResNet 50 v1.5 int8 Inference pytorch gpu * Update specs & build files for alpha2 rc1 whls * Add ResNet50 v1.5 bf16 Training PYT GPU * Add wrapper package for TF GPU tool container * Update TF GPU training packages to use alpha2-rc1 * Update IPEX tools container and resnet50v1.5 models for alpha2 rc1 * Update PYT Bert LG and DLRM FP16 inference alpha2-rc1 * Update tf-gpu branch for ww15 dpcpp compiler * Set ITEX_ENABLE_ONEDNN_LAYOUT_OPT=0 for bert training * Add section to validate base container, fix dlrm printed statement * Update the docs for alpha2-rc2 models * fix ipex tool container readme * Fix dlrm print using CPU statement to be XPU * add 1t env vars * Use add instead of addn * Update bert large docs to be specific about which pretrained model to use * Sync with develop Signed-off-by: Abolfazl Shahbazi <[email protected]> * Update the main benchmarks README for gpu models * Set ITEX_ENABLE_ONEDNN_LAYOUT_OPT=0 in ResNet50v1.5 bf16 training quickstart scripts * Revert "tmp fix res50v1_5 int8" This reverts commit 3c120e0bee3a576ee1548d9258b611a889897ee6 * Updates to match batch sizes in docs and updated pb links * Updating compilar binary * Update PYT GPU packages for IPEX alpha2 rc6 * Update GPU specs to make the docs section a list and update TF training docs for 
DevCloud * Doc updates for ResNet50v1.5 and BERT large training for GPU * tf-gpu doc updates * Fix the BKC and environment for resnet50v1.5 INT8, bert-larget and resenet50v1.5 BF16 training * Update GPU PYT packages to have 2 READMEs * Remove duplicate license from package * AI Kit Model Package README * Clean up PYT model pkgs and update baremetal docs * Add ITEX ATS-M whl updates (#696) * made changes for ITEX ATS-M * indentation changes * Update Resnet50v1.5 (#684) * Update Resnet50v1.5 * Adjust format and restore file * ATS-M TF changes (#699) * add benchmark mode for tensorflow ssd * add resnet50 benchmark mode * add rn50 * modify rn50 files * fixing tengfei PR * fixing incorrect folder changes * added licences header * fixed year Co-authored-by: Tengfei, Han <[email protected]> * merge TF base container based on new RC1 whl packages (#700) * ssd-mobilenet tf gpu spec * build based on latest RC1 whl packages * changed horovod version * Add PyTorch SSD-Mobilenet inference for GPU (#685) * Add SSD-Mobilenet * modified some files * modify readme.sh and add link in reference.sh * add dummy data mode * modify some description * modify description about enviroment * Added rc1 update (#702) * Added oneccl whl (#704) * do not use oneccl from basekit (#705) * Add YOLOv4 (#687) * Add YOLOv4 * update README and inference.sh * modify readme and inference.sh * add dummy data mode * test lowecase * test again * modify script and description about dummy, add dummy img * add miss file * updated readme (#706) * Updated RN50 PyTorch Inference spec file (#707) - Updated names in the spec file for RN50 based on scripts in quickstart folder - Updated scripts names in run.sh * correct ssd-mobilenet and yolov4 (#709) * fix bug where only default images would be used * correct scripts * ssd-mobilenet support int8 only * pretrained waight file link is not a direct link, so remove if from script and nee user dowmload it * Modified some descriptions Co-authored-by: Feng Yuan <[email 
protected]> * Added Pytorch RC3 whls (#730) Co-authored-by: Tengfei, Han <[email protected]> * updating TPPs (#728) * 2.8 tpps * remove old files * Added Resnet50_Pytorch for ATS-M (#729) * Added Resnet50_Pytorch for ATS-M * Added documentation and wrapper README * Made changes as per reviews Co-authored-by: Tengfei, Han <[email protected]> * ATS-M support for SSD-Mobilenet and Resnet50V1-5 (#724) * modifying scripts for gpu ssd-mobilenet * changed docker image name * modify changes to test functionality * made changes for obj_det build * changed np version * made version changes * revert changes * add 3.9 dev version and remove 1.17.4 np version to latest * change path of coco py files in models to int8 folder * update new .pb model file * export vars and change/remove DATASET_DIR * made -f to -d change in checkir DIR path * add batch inference for ssd-mobilenet * use dummy data for online and batch inference * add untracked file * change to new models * change warmup and steps * change warmup and steps * add docs for ATS-M ssd-mobilenet * add docs section for ATS-M w/ links * add docs section for ATS-M w/ correct links * modify baremetal.md * modify spec file to add model package * generate model-builder doc * make alignment changes * update GPU name and TF version in README.md and add oneapi dir path var * unify docs of ssd-mbnet and rn50 * make rn50 doc changes and add oneapi as path var * generate model-builder readmes * correct typo * correct typo * add INT8 check,remove other precisions * add ONEAPI_DIR to array * formatting lines * delete baremetal for ATS-M * remove typo and baremetal.md * cleanup and modify readmes * create oneapi_dir for base build * remove hrvd for rc2 test * remove hvd from base build and ITEX BKC env * Delete -tf-gpu-ssd-mobilenet-inference.temp.Dockerfile * initial review changes * add prvileged mode for cpu freq scaling * remove aikit.md * correct readme typos * correct comments * minor readme changes * check dataset path only for 
accuracy * check dataset_dir only for accuracy * add aikit back * add gpu name and refine base readme * change docker.md on dummy data * add aikit for both models * add aikit for both resnet * add privileged mode Co-authored-by: Ramakrishna, Srikanth <[email protected]> Co-authored-by: Mahathi Vatsal <[email protected]> * Update readme (#733) * updated readmes * updated readmes again * Added ssd-mobilenet pytorch for ATS-M (#734) * Added ssd-mobilenet pytorch for ATS-M * Made changes as per reviews * Added YOLOv4 for ATS-M (#735) * Added YOLOv4 for ATS-M * Made changes as per reviews * Made changes in model.py to run yolov4. - Modified build.sh for ipex-tool-container. - Modified run.sh in yolov4 to mount PRETRAINED_MODELS * update docs * Removed HVD and torch ccl whls (#741) * Removed HVD and torch ccl whls * Removed sythentic_data scripts ffrom rn50 spec file * Removed scripts from run.sh * Update rn50, ssd-mobilenet and yolo (#748) * Update rn50,yolo and ssd-mobile * delete emulation * update model Co-authored-by: chaohan <[email protected]> * Mahathi/ipex mkl update (#753) * Added mkl/compiler packages * Added tbb in spec file * Removed oneapi path in build.sh * Modified old files Co-authored-by: Srikanth Ramakrishna <[email protected]> * dpcpp,mkl,tbb inside container ATS-M (#756) * test dpcpp,mkl in base * make partial changes * add tbb files to partial * fix typo in ttb addition * remove two export vars * remove oneapi dir check and mount * add end of line * re-add end of line Co-authored-by: Mahathi <[email protected]> * Removed oneapi from run.sh in workloads (#758) Co-authored-by: Srikanth Ramakrishna <[email protected]> * doc-level changes for ATS-M TF base and WL containers (#754) * test dpcpp,mkl in base * make partial changes * add tbb files to partial * fix typo in ttb addition * remove two export vars * remove oneapi dir check and mount * change name of gpu * change gpu name * add driver download link and remove custom paths * provide driver 
download link * refine typos in wl and base docs * remove onapi volume mount * remove model req and path for ITEX Co-authored-by: Mahathi <[email protected]> * Modified all README's (#757) * Modified all README's * Modified README's * update readmes Co-authored-by: Srikanth Ramakrishna <[email protected]> * Fixed typo in IPEX dockerfile (#760) * Fix styler and unit tests for develop-gpu (#777) * Fix styler and unit tests for develop-gpu Signed-off-by: Abolfazl Shahbazi <[email protected]> * Fix unittests too Signed-off-by: Abolfazl Shahbazi <[email protected]> Signed-off-by: Abolfazl Shahbazi <[email protected]> * Sync with develop branch (#774) * Update args.rank and args.world_size for maskrcnn (#338) * Pytorch updates for SPR 2022 ww01 and resolve AIDEVOPS-703 (#330) * Updates to resolve AIDEVOPS-703 * Removing empty .dockerignore * Removing extra line * Update TF inference language modeling (BERT Large) docs for instructions to run on Windows (#342) * update tf inference language modeling for windows instructions * modify the BS of maskrcnn throughput (#356) * Fix quick start scripts links in object detection docs (#358) * Enable running models on certain num of cores (#343) * Enable running on certain num of cores * Removed hard-coded number * Checking if HT is on/off * Fixed tests and platform util for perf notebook (#361) * Update dataset to 3 RNN-T training datasets (#357) * Update dataset to 3 RNN-T training datasets In this commit, train-clean-360 and train-other-500 are added in model. These datasets need 500GB disk space to preprocess. It will take ~4 hours to run the entire 3 datasets for one epoch in BF16. You can terminate the training process by adding `num_steps` in models/language_modeling/pytorch/rnnt/training/cpu/train.sh. 
* Set NUM_STEPS outside of bash script * Add note that FP32 runs 100 steps * workaround to fix distributed training issue (#365) * update the BS of maskrcnn throughput (#366) * Fix maskrcnn output scirpt for ipex distributed training (#360) * Update 3D UNet MLPerf doc to run FP32 inference on windows (#367) * update 3dunet mlperf doc to run fp32 inf on windows * Fix doc links for the Windows supported models list (#368) * update links * Transformer ML-Perf SPR WW04 (#359) * Changed the attention part so that it can utilize the existing fusion of batchmatmul+mul+addv2, and also use static varibles to reduce redundant compution * fixed a minor bug for a static variable * Changed the model so that the reshape can be moved out of dense layer so that we can fuse the ops in the dense layers * Changed the depth of attention to a static variable * fix bert pre train distributed bug (#369) * Weizhuoz/fix bert ddp (#374) * tee Bert ddp to a specific log file * Add tee on phase1 * Fix maskrcnn distributed training calculation * Enable jemalloc for BERT throughput mode (#375) * update bs and use ipex Lamb (#382) * fix distribute training for DLRM and use launcher (#383) * Add a separate doc for windows env setup (#371) * add a separate doc for windows support on baremetal * use msys bash to run start.sh for windows * update supported models docs for model dependencies on Windows * fix distribute training for DLRM and use launcher (#386) * Update ImageNet Dataset preprocessing instructions (#385) * update imagenet dataset preprocessing scripts and doc * Ttitswor/snyk cli support (#340) * tables version out of date whl would not build properly on sf-client. * Updating intel-tensorflow version does not exist * Updating tensorflow-addons Version does not exist. * Updating horovod whl no longer builds successfully on Python 3.9+ * remove empty requirements.txt file sf-client will fail, no need for empty req file. 
* Updating Pandas version out date, whl no longer builds successfully on Python 3.9+ * Update pandas Version not longer builds whl successfully on python 3.9+. * Update numpy Version whl fails to build successfully on python 3.9+ * Updating horovod Version fails to build whl successfully on Python 3.9+. * Update SimpleITK Version does not install correctly on python version 3.9+. * Updating numpy numpy==1.16.3 does not build whl successfully on Python 3.9+. * Updating scipy scipy==1.2.0 fails to build whl successfully on Python 3.9+ * Updating h5py h5py==2.10.0 fails to build whl successfully on Python 3.9+. * Updating numpy numpy>=1.16.3 fails to build whl successfully on Python 3.9+. * Update h5py h5py==2.10.0 fails to build whl successfully on Python 3.9+. * Remove upload to GCS (#387) * Remove upload to GCS Signed-off-by: Abolfazl Shahbazi <[email protected]> * remove gcs option from the shell script Signed-off-by: Abolfazl Shahbazi <[email protected]> * Add support for CentOS 7 and Debian 10, 11 (#391) * Add support for CentOS 7 and Debian 10, 11 Signed-off-by: Abolfazl Shahbazi <[email protected]> * Replace 'dnf' with 'yum' for CentOS 7 compatibility Signed-off-by: Abolfazl Shahbazi <[email protected]> * remove commented line Signed-off-by: Abolfazl Shahbazi <[email protected]> * Add the Yum repo fix for 'CentOS 8' Signed-off-by: Abolfazl Shahbazi <[email protected]> * Adding support for RedHat 7 and 8 (#394) Signed-off-by: Abolfazl Shahbazi <[email protected]> * Update COCO validation dataset instructions for bare metal and docker (#390) * update coco dataset instructions for baremetal and docker * update coco script and instructions to remove output dir env var * Add numactl partial to wide and deep (#396) Signed-off-by: Abolfazl Shahbazi <[email protected]> * Making Platform and OS check more portable (#393) * Making Platform and OS check more portable Signed-off-by: Abolfazl Shahbazi <[email protected]> * Fix a minor syntax error Signed-off-by: Abolfazl 
Shahbazi <[email protected]> * Improve OS version checking (#401) Signed-off-by: Abolfazl Shahbazi <[email protected]> * Minor syntax updates for py38 or newer (#400) * Minor syntax updates for py38 or newer Signed-off-by: Abolfazl Shahbazi <[email protected]> * More Python3.8 compliant literal comparison fixes Signed-off-by: Abolfazl Shahbazi <[email protected]> * Update training.sh (#403) change "socked_id" to "node_id" for ipex launcher * Fix tcmalloc path to set LD_PRELOAD (#388) * Fix tcmalloc.so path * Formatting * Removing debug messages * Unit test update * Test updates * Add tcmalloc to the int8 dockerfiles * Removing files we don't need * Finalize Red Hat and CentOS 7, 8 support (#398) * Minor fix for Red Hat support Signed-off-by: Abolfazl Shahbazi <[email protected]> * Improve OS version checking Signed-off-by: Abolfazl Shahbazi <[email protected]> * Introduce devtoolset-7 for CentOS and Red Hat 7 Signed-off-by: Abolfazl Shahbazi <[email protected]> * minor regex fix Signed-off-by: Abolfazl Shahbazi <[email protected]> * yum install consistency Signed-off-by: Abolfazl Shahbazi <[email protected]> * Stock TensorFlow v2.5/v2.6/v2.7 support for performance analysis notebook -(sync with develop branch Jan 26) (#377) * add back some missing patches * add TF_ENABLE_ONEDNN_OPTS support for stock TF 2.5 and above * transformer patch fix * Update README.md * online mode support * Adding support for SLES 15 (#399) * Adding support for SLES 15.03 Signed-off-by: Abolfazl Shahbazi <[email protected]> * Improve SLES version check regex Signed-off-by: Abolfazl Shahbazi <[email protected]> * Fix a minor typo in OS name Signed-off-by: Abolfazl Shahbazi <[email protected]> * Fix BERT data instructions (#402) * add bert data instructions in a separate doc * update bert large dataset instructions * Weizhuoz/fix ipex ww05 (#404) * fix DLRM throughput output error * Modify socket_id to node_id for ipex launcher * fix data preprocessing script link for bert base and bert 
LT(#407) * Add kmp_blocktime arg for ResNet101 int8 (#410) * [RNN-T training] Update download_dataset.sh (#412) Align with MLPerf: Remove --speed 0.9 1.1 * Add a snippet to download COCO2014 dataset files (#411) * Fix failing unit tests (test_bare_metal and bert_fp32_inference) (#409) * Fix unit tests * benchmarks/ * Rename var so that it's not confused with the actual number of platform cores * Add socket id 0 test * Fix the link for the income census dataset download script (#413) * BERT: Enable weight sharing and remove data layer for benchmarking (#406) * Fix unit and style tests for BERT (#415) Signed-off-by: Abolfazl Shahbazi <[email protected]> * Add Jupyter notebooks for fine tuning BERT from TF Hub (#408) * Add WIP notebooks * Add question and answering notebook * Update classifier to clean up and document and add a second dataset * updated notebook and model map with more BERT models * Add README and update the ipynb name * Remove unused notebook * Update to remove the section that displays data with the predictions * Add utils file * utils comments and README update * Updated files * Clean up displaying predictions to use a pandas df * Updates after notebook clean up and add export to the q&a notebook * Retested and updates * README updates and comments/formatting in utils scripts * Add note about expecting that tensorflow has already been installed * Add notebooks to the main TL README * Add missing new lines * Add pip install ipywidgets==7.6.5 after testing on bare metal * Rename BERT Question Answering notebook * Notebook updates based on review feedback * Remove inadvertant changes * Removing empty line * PYT transfer learning notebook for object detection (#397) * Initial commit of notebook and utils * Added a README * Removed non-functioning datasets & models * Doc edit * Fixed bugs, improved explanations, suppressed warnings * Adds notebook for generic image classification (#364) * Adds image classification notebook for user datasets * Adds Image 
Classification transfer learning notebook * Fixed links and text * Minor doc updates * Updated for review feedback * Moved training-specific vars to TL section * Newline and license header * fix python seed (#417) * Fix DIEN no requirements.txt file found (#422) * bug fix in ssd-resnet34 (#423) * update the BS of maskrcnn throughput (#425) * Add a doc for transformer language mlperf dataset (#419) * Add a doc for wide and deep large dataset instructions (#420) * add a doc for inference dataset instructions, and updating the models docs * Doc updates for the Transfer Learning notebooks (#430) * Add the TF models dataset links to the main models table (#429) * Fix dlrm without ipex-interaction (#434) * Fix link for PyTorch RoBERTa base inference (#436) * Enable inference for PyTorch TransNetV2 (#426) * Enable inference for PyTorch TransNetV2 * enable bf16 inference for PyTorch TransNetV2 * update README * use dummy data * Add the option to use a custom dataset in the BERT binary text classification notebook using TF Hub (#435) * Add the option to use a custom dataset in the BERT binary text classification notebook using TF Hub * update bert_utils to add the download_and_extract_zip function * Updates based on review feedback * add WER for RNN-T (#440) * Update recommendation inference docs for Windows instructions (#437) * add windows instructions for dien and wide&deep inference * fix accuracy issue in 4.10 transformers in patches (#441) Co-authored-by: Jiayi Sun <[email protected]> * update pytorch maskrcnn for PT change (#442) * use multi-instances(one node for each instance) for throughput run (#443) * A new Jupyter notebook for lpot quantization tutorial and related perf analysis (#115) * draft for lpot quantization and perf analysis jupyter notebook * Update Louie/lpot perf analysis by review comments (#298) * update with formal name of model zoo, correct wrong words, add license in python file * rm empty line Co-authored-by: Neo Zhang Jianyu <[email 
protected]> Co-authored-by: Abolfazl Shahbazi <[email protected]> * use multi-instances for maskrcnn training (#445) * Update language translation docs for windows support (#444) * update bert and transfromer lt official docs for windows support * fix a wrong link for 3dunet readme * Update run_bert_pretrain_phase2.sh (#449) * Update run_bert_pretrain_phase1.sh (#450) * enable resnet50 training for multi sockets (#448) * Update DLRM training to train on 2S. (#451) * Launcher command shell in Windows to achieve better AI workload performance for certain Intel client hardware (#395) *Launcher command shell in Windows to achieve better AI workload performance for certain Intel client hardware * update the list of supported models on windows (#455) * Update BraTS2018 data preprocessing instructions for 3D-UNet (#452) * Fix for keras experimental for bert. (#433) * Add a PyTorch NLP fine tuning notebook using the IMDb dataset for sentiment analysis (#453) * Add the pytorch IMDB fine tuning notebook * Update markdown * Add README * Renaming notebook and main doc update * Fix link * fix path in readme * Update requirements.txt * Add datasets to requirements * Add transformers to requirements * add sklearn to requirements * Updates based on review feedback - fixing 'extends pytorch * Update the README to specify 3.9 * Use 'NeoZhangJianyu' ID from GitHub (#456) Signed-off-by: Abolfazl Shahbazi <[email protected]> * Leslie/add runtime extension support (#457) * add runtime extension for ssd-rn34 accuracy inference * support iteration larger than dataloader * change the weight sharing script name (#461) * fine tune for dataset env var configuration (#463) * Add rn50 inference runtime extension support for throughput/accuracy (#462) * add rn50 throughput mode runtime extension support * add rn50 accuracy mode runtime extension support * Update the PyTorch Text Classification fine tuning notebook to allow using a custom dataset (#467) * Update the PyTorch Text Classification 
fine tuning notebook to use a custom dataset * update description at the top of the notebook to mention the custom dataset option * Add citation for the SMS text collection dataset * Update the PyTorch text classification README to note the custom dataset option * Rename the notebook and update the main TL ReadMe * Clearing notebook output * Fix syntax * Fix Transformer Language mlperf to add arg --kmp-blocktime (#469) * fix transformer mlperf to parse --kmp-blocktime, in case set on the system * Windows support for Transformer Language MLPerf inference (#471) * fix python format and update docs for instructions * Minor clean up (#459) Signed-off-by: Abolfazl Shahbazi <[email protected]> * Updated the transformer_mlperf inference profiling option, and some minor changes in the README (#472) * Modify the output tag for IPEX DDP (#475) * remove manual conversion of models to datatype (#478) * feed sample input while prepacking for training (#479) Co-authored-by: Wang, Chuanqi <[email protected]> * Minor flake8 fix (#481) Signed-off-by: Abolfazl Shahbazi <[email protected]> * update the Pytorch URL for develop branch (#485) * Update versions and URLs for release v2.7 (#484) * Update versions and URLs for release v2.7 Signed-off-by: Abolfazl Shahbazi <[email protected]> * Regenerate docs and dockerfiles Signed-off-by: Abolfazl Shahbazi <[email protected]> * Update the main IMZ README.md to list models per use case (#466) * add usecases tables in the main model readme and benchmarks readme * revert bf16 changes (#488) * Add partials and spec yml for the end2end DLSA pipeline (#460) * Add partials and specs for the end2end DLSA pipeline * Add missing end line * Update name to include ipex * update specs to have use the public image as a base on one and SPR for the other * Dockerfile updates for the updated DLSA repo * Update pip install list * Rename to public * Removing partials that aren't used anymore * Fixes for 'kmp-blocktime' env var (#493) * Fixes for 
'kmp-blocktime' env var Signed-off-by: Abolfazl Shahbazi <[email protected]> * update per review feedback Signed-off-by: Abolfazl Shahbazi <[email protected]> * Add 'kmp-blocktime' for mlperf-gnmt (#494) * Add 'kmp-blocktime' for mlperf-gnmt Signed-off-by: Abolfazl Shahbazi <[email protected]> * Remove duplicate parameter definition Signed-off-by: Abolfazl Shahbazi <[email protected]> * add sample_input for resnet50 training (#495) * remove the case when fragment_size not equal args.batch_size (#500) * Changed the transformer_mlperf fp32 model so that we can fuse the ops… (#389) * Changed the transformer_mlperf fp32 model so that we can fuse the ops in the model, and also minor changes for python3 * Changed the transformer_mlperf int8 model so that we can fuse the ops in the model, and also minor changes for python3 * SPR updates for WW12, 2022 (#492) * SPR updates for WW12, 2022 Signed-off-by: Abolfazl Shahbazi <[email protected]> * Update for PyTorch SPR WW2022-12 Signed-off-by: Abolfazl Shahbazi <[email protected]> * Update pytorch base for SPR too Signed-off-by: Abolfazl Shahbazi <[email protected]> * Stick with specific 'keras-nightly' version Signed-off-by: Abolfazl Shahbazi <[email protected]> * Updates per code review Signed-off-by: Abolfazl Shahbazi <[email protected]> * update maskrcnn training_multinode.sh (#502) * Fixed a bug in the transformer_mlperf model threads setting (#482) * Fixed a bug in the transformer_mlperf model threads setting * Fix failing tests Signed-off-by: Abolfazl Shahbazi <[email protected]> Co-authored-by: Abolfazl Shahbazi <[email protected]> * Added the default threads setting for transformer_mlperf inference in… (#504) * Added the default threads setting for transformer_mlperf inference in case there is no command line input * Fix unit tests Signed-off-by: Abolfazl Shahbazi <[email protected]> Co-authored-by: Abolfazl Shahbazi <[email protected]> * PyTorch Image Classification TL notebook (#490) * Adds new TL notebook with 
documentation * Added newline * Added to main TL README * Small fixes * Updated for review feedback * Added more models and a download limit arg * Removed py3.9 requirement and changed default model * Adds Kitti torchvision dataset to TL notebook (#512) * Adds Kitti torchvision dataset to TL notebook * Fixed citations formatting * update maskrcnn model (#515) * minor update. (#465) * Create unit-test github action workflow (#518) * Create unit-test github action workflow Tested here: https://github.com/sriester/frameworks.ai.models.intel-models/runs/6089350443?check_suite_focus=true Runs tox py.test on push. * Containerize job * Update unit-test.yml Changed docker credentials to imzbot * Update to Horovod commit 11c1389 to fix TF v2.9 + Horovod install failure (#519) Signed-off-by: Abolfazl Shahbazi <[email protected]> * update distilbert model to 4.18 transformers and enable int8 path (#521) * rnnt: use launcher to set output file path and name (#524) * Update BareMetalSetup.md (#526) Always use the latest torchvision * Reduce memory usage for dlrm acc test (#527) * updatedistilbert with text_classification (#529) * add patch for distilbert (#530) * Update the model-builder dockerfile to use ubuntu 20.04 (#532) * Add script for coco training dataset processing (#525) * and update tensorflow ssd-resnet34 training dataset instructions * update patch (#533) Co-authored-by: Wang, Chuanqi <[email protected]> * [RNN-T training] Enable FP32 gemm using oneDNN (#531) * Update the Readme guide for distilbert (#534) * Update the Readme guide for distilbert * Fix accuracy grep bug, and grep accuracy for distilbert Co-authored-by: Weizhuo Zhang <[email protected]> * Update end2end public dockerfile to look for IPEX in the conda directory (#535) * Notebook to script conversion example (#516) * Add notebook script conversion example * Fixed doc * Replaces custom preprocessor with built-in one * Changed tag to remove_for_custom_dataset … * Change --num-inter-threads to 1 for 
bert-large int8 (#1091) * modify ResNet50 training (#1092) * Cherry-pick commits for GPU Flex updates from the develop-gpu branch (#1090) * Corrected typos in README (#1074) * IPEX FLEX 555 docker validation (#1060) * update IPEX flex series for new driver version * add dummy,batchsize options * clarify readme details,change docker image name * update download links * add bs and num_iter env vars * ITEX FLEX 555 docker validation (#1059) * update flex workloads for the new driver version * add batch size as env * Modified Readme for baremetal for ITEX workloads --------- Co-authored-by: Mahathi Vatsal <[email protected]> * clean old devcatalog instructions (#1086) * change precision * remove old instructions --------- Co-authored-by: Srikanth Ramakrishna <[email protected]> Co-authored-by: Mahathi Vatsal <[email protected]> * fix bert large fp32 training for cpu not to use keras_policy datatype (#1093) * add try except to avoid mkdir fail the case (#1095) Co-authored-by: xiaoman-liu <[email protected]> * develop branch: fix return_dict config (#1104) * remove extra dockerfile (#1108) * fix data buffer for DLRM while epoch > 1 (#1109) Co-authored-by: chunyuan-w <[email protected]> * exit while epoch reach args.nepoches (#1110) Co-authored-by: chunyuan-w <[email protected]> * revert changes to fix perf drop (#1101) * fix buffer_num==0 issue (#1114) Co-authored-by: chunyuan-w <[email protected]> * Update readme for dgpu workloads and IPEX cpu version (#1116) * update main readme for dgpu workloads * upgrade ipex and torch versions for cpu * pick files for multi-card release ipex (#1119) * pick files for multi-card release ipex * pick files for release ITEX multi-card (#1111) * pick files for release * Update flex_multi_card_batch_inference.sh * Update flex_multi_card_online_inference.sh * remove edits from unrelated files,rename file * TensorFlow Linux CI Temporary Change of 3d_unet_mlperf Model (#1121) * Update requirements.txt * fixed license issue in dataset api 
(#1126) * rnnt: fix _joint_step_batch for stock pt path (#1123) * rnnt: fix stock pt path and refactor code (#1124) * rnnt: refactor ipex fp32 & bf16 * refactor with or without ipex path; convert embed dtype for stock pt path; * port changes from validation for more scenario support * simplify embed dtype conversion and jit * remove torch.compile for now since seems incorrect * fix graph_mode * remove redundant space * Update requirements.txt for rfcn model (#1128) * Decouple TF ResNet50v1.5 GPU/CPU model scripts (#1133) * decouple training scripts for cpu and gpu * decouple inference scripts * update unittests * fix pythonpath * fixed vulnerabilities for snyk scan (#1134) * fixed vulnerabilities for snyk scan * Enable Vision transformers inference on CPU (#1102) * enable hf vit model * enable vit inference * Update README.md * fix patch (#1137) * Update requirements.txt (#1135) * fix dlrm ddp training local variable Batch referenced before assignment" (#1139) * Fix some quickstart scripts for optional ARGS (#1138) * fix some quickstart scripts for image recognition * Updated with latest ITEX and TF version (#1131) * Updated readmes to refer to the latest ITEX instructions * Add support for MVTEC-AD dataset in dataset API (#1099) * add mvtec dataset download support * add preprocessing support * update for not to remove the raw data file after extraction * use wget to reduce download time * display the wget logs * update requirements.txt * add one more data file for dureader * update broken links in devcatalog (#1140) * update broken links and update filename * Updated with latest IPEX and torch (#1130) * Added IPEX latest documentation * Corrected batch size for bfloat16 (#1098) * Corrected batch size for bfloat16 * Ejan/test quickstart (#1145) * Update latency calculation in quick start script * Fix distilbert script * Add fix for 3dunet quickstart --------- Co-authored-by: shahbaazsyed <[email protected]> * Ejan/test mobilenet v1 (#1150) * Take out bfloat16 env 
settings * Expect correct vision images path (#1143) * Expect correct vision images path * remove wget * add wget to setup.sh * add sudo * Revert "add sudo" This reverts commit 77ef66abac66e203b423948f76322b153aa63743. * update preprocessing scripts for brca --------- Co-authored-by: WafaaT <[email protected]> * Enabling MobileNetv2 (#1088) * Model Enabling for MobileNetv2 * Updated unit tests * release clean up for PVC containers (#1146) * release clean up for PVC containers * revert license dates * Ejan/dien quickstart (#1153) * Fix dien quickstart * Data Connector integration to Model Zoo (#1136) * Add data connector Co-authored-by: Felipe Leza Alvarez <[email protected]> Co-authored-by: Leza Alvarez, Felipe <[email protected]> Co-authored-by: Miguel Pineda <[email protected]> Co-authored-by: Gerardo Dominguez <[email protected]> Co-authored-by: aagalleg <[email protected]> * Revert changes in ssd-mobilenet int8 cpu scripts (#1155) * revert changes in ssd-mobilenet int8 cpu scripts * update unittests * SPR Ubuntu READMEs modified (#1154) * correct names of devcatalog files (#1160) * reverting the changes for BERT fp16 inference with keras MP (#1141) * Resolve Snyk critical vulnerability (#1162) * Updated branch for SDLe scans (#1148) * Changed declaration of output * Update FLEX_DEVCATALOG.md (#1165) * fix for 'hashlib.md5' bandit scans (#1157) Signed-off-by: Abolfazl Shahbazi <[email protected]> * Enable GPT-J/bloom inference for fp32/bf16/bf32/int8/calibration (#1168) * Enable GPT-J/bloom inference for int8/fp32/bf16/bf32 * Refine bloom-176b inference (#1170) * Enable Stable Diffusion inference for fp32/bf16/fp16/int8/calibration (#1172) * add stable diffusion * modify scripts * modify inference_realtime.sh * enable int8 * modify scripts and README * add calibration script * Numpy parameter constraint (#1171) Co-authored-by: dhermosi <[email protected]> Co-authored-by: Wafaa Taie <[email protected]> * Fix in DIEN model,for changes in Tensorflow framework 
(#1125) * Fix for array_ops change * Remove some check for latest tf * Change seq test to list test * Enabling MMoE training with bfloat16 and fp16 precisions (#1149) * enabling MMoE training with bfloat16 and fp16 precisions * fixing coverage tests * changing arg model-dir to output-dir * modifying tf_mmoe_args.json * correcting --output-dir type on model_init * coding style fixes * adding coverage tests for bfloat16 and fp16 training * Changes to fix horovod issue (#1163) * add license for stable-diffusion scripts (#1173) * Dlrm v2 (#1151) * copy dlrm v2 from mlperf repo * remove gpu/distribute/mlperf-log/fused-optimizer related code * enable ipex.optimize with fp32/bf16/fp16, INT8 blocked by trace issue * add data_process folder * add log for performance and add script to enable torchrec dlrm inference/training fp32/fp16/bf16 * enable int8 * enable int8 * add intel license * add Max devcatalog and change flex link (#1179) * add Max devcatalog and change flex link * add precision list * Update CODEOWNERS * Update requirements.txt (#1178) Co-authored-by: justkw <[email protected]> * Add scripts to create model zoo bits for aikit (#1177) * add scripts to create model zoo bits for aikit * changes for code review * Liangan1/remove configure file (#1188) * Add README.md * Remove configure.json * modify maskrcnn training script (#1184) * Liangan1/update bloom model (#1193) * Add README.md * Remove configure.json * Update model to bloom-1b-4-zh * Revert "Remove configure.json" This reverts commit a17d0ae219acf2a0bc6d8e0ef73cbadec35a2522. 
* Update READEME * Update code owners list (#1186) * only remove torch hub when rn101 (#1200) * only remove torch hub when rn101 * rm -resnext_wsl_model_names due to duplication with hub_model_names * download weight only with pretrained==true --------- Co-authored-by: XiaobingZhang <[email protected]> * Maskrcnn solver steps (#1189) * split solver steps to avoid use () in command line * rm inf change due to no SOVLER.STEPS in inf * split solver steps to avoid use () in command line * rm inf change due to no SOVLER.STEPS in inf * Use gcc for horovod installation (#1202) * test workaround * try flag * try gcc * Enable weight sharing for INT8 and BF16 distilbert (#1097) * Enable weight sharing for INT8 distilbert * enable bf32 for resnet50 training (#1210) * remove video groups (#1197) Co-authored-by: Jitendra Patil <[email protected]> * source env vars and remove video grps (#1203) Co-authored-by: Jitendra Patil <[email protected]> * remove video grps pvc (#1213) Co-authored-by: Jitendra Patil <[email protected]> * add environment variables GLOBAL_BATCH_SIZE and LOCAL_BATCH_SIZE for resnet50 and maskrcnn distributed training (#1212) Co-authored-by: liangan1 <[email protected]> * Update run_multi_instance_throughput.sh (#1211) * add workflow to automate mz drop to aikit (#1217) * Gda/scans (#1194) * Updated scans * Test scans on github.head_ref branch Fixed merge conflicts Fixed unit test requirements * update model zoo drop workflow (#1218) * update to use runners and run in container * update mz drop workflow (#1220) * changes to clone oneapi tools repo (#1222) * fix create bits script (#1223) * Fixed version for snyk scan (#1191) * Fixed version for snyk scan * test mz drop worflow (#1224) * fix create bits script * update create mz aikit bits script (#1226) * comment out drop to artifactory code (#1227) * update pytorch gpu yolov4 scripts (#1225) * update pytorch yolov4 scripts * update readme --------- Co-authored-by: Srikanth Ramakrishna <[email protected]> * 
Gda/checkmarx (#1221) * Test checkmarx scan * Test PR tests * Test Snyk and Checkmarx scans * Test bandit scan * update driver links,paths and remove dataset coco.names (#1230) * update driver links in devcatalog * update code to include coco files mount --------- Co-authored-by: hanchao <[email protected]> * Create CI/CD pipeline orchestrator workflow (#1229) * Create CI/CD pipeline orchestrator workflow * Rename top layer workflow job names * Update preprocessing scripts for BRCA dataset (#1231) * update scripts for brca * Cicd orchestrator (#1236) * Create CI/CD pipeline orchestrator workflow * Rename top layer workflow job names * Add scheduled execution for CI/CD pipeline * Fix jobs uses paths * Cicd orchestrator (#1237) * Create CI/CD pipeline orchestrator workflow * Rename top layer workflow job names * Add scheduled execution for CI/CD pipeline * Fix jobs uses paths * Add starting job to gather all following jobs in a common root * Send fixed paths to remote * Cicd orchestrator (#1238) * Create CI/CD pipeline orchestrator workflow * Rename top layer workflow job names * Add scheduled execution for CI/CD pipeline * Fix jobs uses paths * Add starting job to gather all following jobs in a common root * Send fixed paths to remote * Change workflow file extension * modify stable_diffusion accuracy script (#1235) * Provide explicit inputs to workflow call (#1242) * enable int8-bf16 mixed datatype for stable diffusion (#1241) * use --memory-allocator jemalloc replace default_allocator (#1247) * Enable int8-fp32 for BertLarge (#1120) * Update script for the HF model (#1251) * Update script for the HF model * update model link * Fixing batch size param in transformer training (#1244) * Dataset_librarian code updates (#1187) --------- Signed-off-by: gera-aldama <[email protected]> Signed-off-by: Felipe Leza Alvarez <[email protected]> Co-authored-by: Miguel Pineda <[email protected]> Co-authored-by: Gerardo Dominguez <[email protected]> Co-authored-by: gera-aldama 
<[email protected]> Co-authored-by: Felipe Leza Alvarez <[email protected]> Co-authored-by: aagalleg <[email protected]> Co-authored-by: Leza Alvarez, Felipe <[email protected]> Co-authored-by: ma-pineda <[email protected]> * Dien fix for training failure (#1252) * Fixing SSD-RN34 Training accuracy/convergence with NHWC format (#1249) * Fixing SSD-RN34 accuracy with NHWC format * Fixing RFCN inference model script to use the right numactl parameters (#1253) * Fixing RFCN model script * Fixing unit test * Add inputs to manual CI/CD workflow execution (#1254) * Add inputs to manual CI/CD workflow execution * Add schedule execution commet * Change default value for is_lkg_drop flag to true * enable fp16 for resnet50 training (#1258) * Enable GNN models in Model Zoo based on PyG (#1106) * Enable graph classification of inference in model_zoo * Enable training * Add quickstart for inference and training * Rebase with PyG master --------- Co-authored-by: jiayisunx <[email protected]> * Disable failing scanners * Remove dependency on commented out scanners * Fix precision value for TF workload test execution * Enable MZ drops in CI/CD pipeline * Add default value for test step * Fixing conda recipe syntax error (#1261) * Feature/aikitpv 828/dataset librarian code refactor (#1273) * Fixing conda recipe syntax error * Updating conda recipe * Maskrcnn print (#1262) * add printing maskrcnn model info into debug level * comment print info * Weizhuoz/fix numactl emr (#1263) * Fix RN50 and RN101 THP, use launcher --throughput-mode * distilbert use all numa, and reduce latency steps to 250 * fix latency numactl for SNC=2 EMR * Adding llm models inference generation for llama/gptj/bloom and lora training for llama (#1259) * draft adding llama training and inference * llm model common enabling * Update README.md * model refine * Update finetune.py * Update prompter.py --------- Co-authored-by: liangan1 <[email protected]> Co-authored-by: leslie-fang-intel <[email protected]> * 
output fps info for DLRM-v2 (#1277) * Add temporary Docker Hub personal credentials * Move Docker Hub credentials to the image section * Feature/aikitpv 828/dataset librarian code refactor (#1280) * Fixing conda recipe syntax error * Updating conda recipe * Updating python requirements version * Updating python requirements version * Updatin python version from requirements * Updatin pypi package SHA --------- Co-authored-by: Wafaa Taie <[email protected]> * change torchccl_version (#1287) * add preprocess coco (#1299) * add yolov4 env changes (#1300) * Added split ratio as sub arg in brca preprocess (#1284) * Added split ratio as sub arg in brca preprocess * Modified dataset_api readme * modify rn50 training script (#1310) * modify maskrcnn training script (#1312) * Revert "buried Jenkinsfile test (#1309)" (#1317) This reverts commit 092ad07677de5c15e5dcdbbb5d46c79b84db186f. * Fix C++17 build issue of RNN-T training (#1314) * Fix inceptionv3 latency regression (#1307) * Fix frozen graph for flex series * enable accuracy test for dlrm-v2 (#1323) * Unify the patch for Transformers model and upgrade Transformers to 4.28.1 (#1302) * upgrade transofmers to 4.28.1 and unify all of the transformers models' patch * pr refine * refine patch * fix patch * fix int8 acc * Dlrmv2 (#1328) * fix auc compute for log * enable jit/prof * change inference batch size from 16 to 32K * enable bf32 (#1329) * update DLRM int8 config with correct calibration set (#1330) * Update mz-workload-tests.yml Updated internal container image to use for workload tests * Data Connector HF for public repo (#1331) --------- Signed-off-by: Felipe Leza Alvarez <[email protected]> Co-authored-by: Felipe Leza Alvarez <[email protected]> Co-authored-by: aagalleg <[email protected]> Co-authored-by: Gerardo Dominguez <[email protected]> Co-authored-by: Leza Alvarez, Felipe <[email protected]> Co-authored-by: Miguel Pineda <[email protected]> Co-authored-by: ma-pineda <[email protected]> Co-authored-by: 
gera-aldama <[email protected]> * tests for m1 and m3 (#1301) * add pytorch m1 and m3 in one container * streamline validation pre-process * add new scripts for yolov4 * combine fps for m3 * review devcatalogs Co-authored-by: Jitendra Patil <[email protected]> * Clean up files for release (#1341) * update devcatalog landing page * remove build and multicard devcatalog pyt * remove build and multi-card devcatalogs * restore deleted file * Checkmarx fix for not updated input names (#1343) Fixed wrong names in workflow call input names * Fixing Lint and Unit tests (#1344) Signed-off-by: Abolfazl Shahbazi <[email protected]> * PVC bert-large inference modification (#1332) * change paths and avoid re-download * fix for accuracy (#1345) Co-authored-by: mahathis <[email protected]> * Update Scanner_Snyk.yml with new name in one-ci-cd repo * Update Scanner_Snyk.yml with refs input instead of ref * remove deprecated API replace_lstm_with_ipex_lstm (#1347) * add Throughput keyword (#1295) * add Throughput keyword * use tqdm.format_dict to record thp info * disable pbar before print thp, to avoid final wrong output * add enter before print thp --------- Co-authored-by: jiayisunx <[email protected]> * Update Scans.yml with latest Snyk scanner golden workflow changes * Update mz-workload-tests.yml with git dependency install (#1351) * modify maskrcnn script (#1352) * Upgrade transformers version to fix CVEs (#1355) * Use DockerHub account on workfload tests Moved back to Python 3.8 public Docker image from Docker Hub * adding logs for indicating start and end of an iteration (#1357) * adding logs for indicating start and end of an iteration (#1356) * adding logs for indicating start and stop of an iteration (#1333) * Fixed version to avoid snyk vulnerability (#1363) * minor changes in the readme (#1365) * drop last and add readme (#1368) * Enable local batch size param for dlrm distribtued training (#1369) * Fixed Batch size param for transformer fp32 (#1367) * enable local 
batch size for distributed training (#1370) * Use local batch size for maskrcnn distributed training (#1371) * Enable local_batch_size for RNN-T distributed training (#1374) * update README for resnet50 and maskrcnn (#1373) * fix sq api (#1379) * add config argument (#1384) * Fix typo of DLRM script (#1375) Co-authored-by: jiayisunx <[email protected]> * Add AVX check logic to workload tests workflow * Disable schedule CI/CD execution * add command to install tfgnn from start.sh (#1382) * fix dlrm batch size issue (#1386) * Add more logs for vit training (#1354) * Remove saving the model * Add no. steps as configurable in quickstart script * Add start/stop logs * move throughput before evaluate * removed accuracy script in gha scripts for debug (#1388) * removed accuracy script for debug * removed accuracy script for debug * Transformers patch for keras nightly (#1387) * Minor changes to capture correct data * Added new patch * Better reporting * Add requirements file * Licence and samples (#1350) --------- Signed-off-by: Felipe Leza Alvarez <[email protected]> Co-authored-by: Leza Alvarez, Felipe <[email protected]> Co-authored-by: Gerardo Dominguez <[email protected]> Co-authored-by: Miguel Pineda <[email protected]> Co-authored-by: ma-pineda <[email protected]> Co-authored-by: aagalleg <[email protected]> Co-authored-by: gera-aldama <[email protected]> * set find_unused_parameters to true for DDP training (#1389) * Update README.md (#1390) * fix ssd rn34 ddp issue (#1391) * TF DistilBERT - Update model with new benchmark scripts (#1327) * Add separate benchmark script for Distilbert to use same input repeatedly. 
* Add start/stop logs for each iteration * Update unit tests for distilbert * Update distilbert script to select weight sharing option (#1392) * Sd benchmark use dummy dataset (#1398) * do not load data for benchmark test * change argument name * Ejan/3dunet accuracy fix (#1383) * Add SimpleITK version * Change simpleitk version * Change tables version * Clean-up models (#1404) * clean up models * fix unit tests * upgrade mlflows to fix CVEs (#1403) * fix not found links (#1405) * revert changes in CODEOWNERS file * remove torchrec_dlrm --------- Signed-off-by: Abolfazl Shahbazi <[email protected]> Signed-off-by: gera-aldama <[email protected]> Signed-off-by: Felipe Leza Alvarez <[email protected]> Signed-off-by: Felipe Leza Alvarez <[email protected]> Co-authored-by: Kanvi Khanna <[email protected]> Co-authored-by: akhilgoe <[email protected]> Co-authored-by: Syed Shahbaaz Ahmed <[email protected]> Co-authored-by: gera-aldama <[email protected]> Co-authored-by: Lu Teng <[email protected]> Co-authored-by: ellie-jan <[email protected]> Co-authored-by: ellie.jan <[email protected]> Co-authored-by: Shahbazi, Abolfazl <[email protected]> Co-authored-by: Jones, Dina S <[email protected]> Co-authored-by: Patil, Jitendra <[email protected]> Co-authored-by: Zhu, Wei2 <[email protected]> Co-authored-by: Sheng, Yang <[email protected]> Co-authored-by: Wang, Chuanqi <[email protected]> Co-authored-by: Liu, River <[email protected]> Co-authored-by: Robison, Clayne B <[email protected]> Co-authored-by: ltsai1 <[email protected]> Co-authored-by: Yimei Sun <[email protected]> Co-authored-by: Melanie H Buehler <[email protected]> Co-authored-by: Mahmoud Abuzaina <[email protected]> Co-authored-by: Rajendrakumar Chinnaiyan <[email protected]> Co-authored-by: Yerneni, Venkata P <[email protected]> Co-authored-by: Thakkar, Om <[email protected]> Co-authored-by: Ojha, Shweta <[email protected]> Co-authored-by: Cui, Xiaoming <[email protected]> Co-authored-by: Varghese, Jojimon <[email 
protected]> Co-authored-by: mdfaijul <[email protected]> Co-authored-by: Shiddibhavi, Sharada <[email protected]> Co-authored-by: Shah, Sharvil <[email protected]> Co-authored-by: Ketineni, Rama <[email protected]> Co-authored-by: nedsouza <[email protected]> Co-authored-by: Vincent Zhang <[email protected]> Co-authored-by: mahathis <[email protected]> Co-authored-by: sramakintel <[email protected]> Co-authored-by: Chao1Han <[email protected]> Co-authored-by: Tengfei, Han <[email protected]> Co-authored-by: Feng Yuan <[email protected]> Co-authored-by: Mahathi Vatsal <[email protected]> Co-authored-by: chaohan <[email protected]> Co-authored-by: jiayisunx <[email protected]> Co-authored-by: Vlad Silverman <[email protected]> Co-authored-by: YanbingJiang <[email protected]> Co-authored-by: leslie-fang-intel <[email protected]> Co-authored-by: WeizhuoZhang-intel <[email protected]> Co-authored-by: liangan1 <[email protected]> Co-authored-by: zhuhaozhe <[email protected]> Co-authored-by: Tyler Titsworth <[email protected]> Co-authored-by: Jing Xu <[email protected]> Co-authored-by: blzheng <[email protected]> Co-authored-by: jianan-gu <[email protected]> Co-authored-by: Neo Zhang Jianyu <[email protected]> Co-authored-by: XiaobingZhang <[email protected]> Co-authored-by: Srini511 <[email protected]> Co-authored-by: Sean-Michael Riesterer <[email protected]> Co-authored-by: Chunyuan WU <[email protected]> Co-authored-by: xiaofeij <[email protected]> Co-authored-by: Rahul Nair <[email protected]> Co-authored-by: Veena2207 <[email protected]> Co-authored-by: xiangdong <[email protected]> Co-authored-by: Huang, Zhiwei <[email protected]> Co-authored-by: Sharvil Shah <[email protected]> Co-authored-by: wyang2 <[email protected]> Co-authored-by: zofia <[email protected]> Co-authored-by: Cui, Yifeng <[email protected]> Co-authored-by: LuFengqing <[email protected]> Co-authored-by: Li, Guizi <[email protected]> Co-authored-by: Wang, Yanzhang <[email protected]> 
Co-authored-by: FengXiongIntel <[email protected]> Co-authored-by: xiaoman-liu <[email protected]> Co-authored-by: gaurides <[email protected]> Co-authored-by: ke1ding <[email protected]> Co-authored-by: Tyler Titsworth <[email protected]> Co-authored-by: ratnampa <[email protected]> Co-authored-by: Jesus Herrera Ledon <[email protected]> Co-authored-by: ke1ding <[email protected]> Co-authored-by: dhermosi <[email protected]> Co-authored-by: justkw <[email protected]> Co-authored-by: lerealno <[email protected]> Co-authored-by: hanchao <[email protected]> Co-authored-by: DiweiSun <[email protected]> Co-authored-by: Miguel Pineda <[email protected]> Co-authored-by: Gerardo Dominguez <[email protected]> Co-authored-by: Felipe Leza Alvarez <[email protected]> Co-authored-by: aagalleg <[email protected]> Co-authored-by: Leza Alvarez, Felipe <[email protected]> Co-authored-by: ma-pineda <[email protected]> Co-authored-by: Mustafa <[email protected]> Co-authored-by: Real Novo, Luis <[email protected]> Co-authored-by: sachinmuradi <[email protected]> Co-authored-by: Ashiq Imran <[email protected]> Co-authored-by: Cao E <[email protected]>
1 parent 31bf4d4 commit 20a2fbb

224 files changed

+13377
-2825
lines changed

.bandit.yml

Lines changed: 401 additions & 0 deletions
Large diffs are not rendered by default.

Makefile

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ all: venv lint unit_test
 $(ACTIVATE):
 	@echo "Updating virtualenv dependencies in: $(VIRTUALENV_DIR)..."
 	@test -d $(VIRTUALENV_DIR) || $(VIRTUALENV_EXE) $(VIRTUALENV_DIR)
-	@. $(ACTIVATE) && python -m pip install -r requirements-test.txt
+	@. $(ACTIVATE) && python -m pip install -r requirements.txt
 	@touch $(ACTIVATE)

 venv: $(ACTIVATE)

README.md

Lines changed: 1 addition & 0 deletions
@@ -45,6 +45,7 @@ For best performance on Intel® Data Center GPU Flex and Max Series, please chec
 | [Inception V4](https://arxiv.org/pdf/1602.07261.pdf) | TensorFlow | Inference | [Int8 FP32](/benchmarks/image_recognition/tensorflow/inceptionv4/inference/README.md) | [ImageNet 2012](https://github.com/IntelAI/models/tree/master/datasets/imagenet/README.md) |
 | [MobileNet V1*](https://arxiv.org/pdf/1704.04861.pdf) | TensorFlow | Inference | [Int8 FP32 BFloat16](/benchmarks/image_recognition/tensorflow/mobilenet_v1/inference/README.md) | [ImageNet 2012](https://github.com/IntelAI/models/tree/master/datasets/imagenet/README.md) |
 | [MobileNet V1*](https://arxiv.org/pdf/1704.04861.pdf) [Sapphire Rapids](https://www.intel.com/content/www/us/en/newsroom/opinion/updates-next-gen-data-center-platform-sapphire-rapids.html#gs.blowcx) | TensorFlow | Inference | [Int8 FP32 BFloat16 BFloat32](/quickstart/image_recognition/tensorflow/mobilenet_v1/inference/cpu/README_SPR_baremetal.md) | [ImageNet 2012](https://github.com/IntelAI/models/tree/master/datasets/imagenet/README.md) |
+| [MobileNet V2](https://arxiv.org/pdf/1801.04381.pdf) | Tensorflow | Inference | [FP32 BFloat16 Int8](/benchmarks/image_recognition/tensorflow/mobilenet_v2/inference/README.md) | [ImageNet 2012](https://github.com/IntelAI/models/tree/master/datasets/imagenet/README.md)
 | [ResNet 101](https://arxiv.org/pdf/1512.03385.pdf) | TensorFlow | Inference | [Int8 FP32](/benchmarks/image_recognition/tensorflow/resnet101/inference/README.md) | [ImageNet 2012](https://github.com/IntelAI/models/tree/master/datasets/imagenet/README.md) |
 | [ResNet 50](https://arxiv.org/pdf/1512.03385.pdf) | TensorFlow | Inference | [Int8 FP32](/benchmarks/image_recognition/tensorflow/resnet50/inference/README.md) | [ImageNet 2012](https://github.com/IntelAI/models/tree/master/datasets/imagenet/README.md) |
 | [ResNet 50v1.5](https://github.com/tensorflow/models/tree/v2.11.0/official/legacy/image_classification/resnet) | TensorFlow | Inference | [Int8 FP32 BFloat16 FP16](/benchmarks/image_recognition/tensorflow/resnet50v1_5/inference/README.md) | [ImageNet 2012](https://github.com/IntelAI/models/tree/master/datasets/imagenet/README.md) |

benchmarks/README.md

Lines changed: 1 addition & 0 deletions
@@ -23,6 +23,7 @@ For information on running more advanced use cases using the workload containers
 | Image Recognition | [Inception V3](https://arxiv.org/pdf/1512.00567.pdf) | Inference | Model Containers: [Int8](https://software.intel.com/content/www/us/en/develop/articles/containers/inceptionv3-int8-inference-tensorflow-container.html) [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/inceptionv3-fp32-inference-tensorflow-container.html) <br> Model Packages: [Int8](https://software.intel.com/content/www/us/en/develop/articles/containers/inceptionv3-int8-inference-tensorflow-model.html) [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/inceptionv3-fp32-inference-tensorflow-model.html) | [Int8 FP32](image_recognition/tensorflow/inceptionv3/inference/README.md) | [ImageNet 2012](https://github.com/IntelAI/models/tree/master/datasets/imagenet/README.md) |
 | Image Recognition | [Inception V4](https://arxiv.org/pdf/1602.07261.pdf) | Inference | Model Containers: [Int8](https://software.intel.com/content/www/us/en/develop/articles/containers/inceptionv4-int8-inference-tensorflow-container.html) [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/inceptionv4-fp32-inference-tensorflow-container.html) <br> Model Packages: [Int8](https://software.intel.com/content/www/us/en/develop/articles/containers/inceptionv4-int8-inference-tensorflow-model.html) [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/inceptionv4-fp32-inference-tensorflow-model.html) | [Int8 FP32](image_recognition/tensorflow/inceptionv4/inference/README.md) | [ImageNet 2012](https://github.com/IntelAI/models/tree/master/datasets/imagenet/README.md) |
 | Image Recognition | [MobileNet V1*](https://arxiv.org/pdf/1704.04861.pdf) | Inference | Model Containers: [Int8](https://software.intel.com/content/www/us/en/develop/articles/containers/mobilenetv1-int8-inference-tensorflow-container.html) [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/mobilenetv1-fp32-inference-tensorflow-container.html) <br> Model Packages: [Int8](https://software.intel.com/content/www/us/en/develop/articles/containers/mobilenetv1-int8-inference-tensorflow-model.html) [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/mobilenetv1-fp32-inference-tensorflow-model.html) | [Int8 FP32 BFloat16](image_recognition/tensorflow/mobilenet_v1/inference/README.md) | [ImageNet 2012](https://github.com/IntelAI/models/tree/master/datasets/imagenet/README.md) |
+| Image Recognition | [MobileNet V2](https://arxiv.org/pdf/1801.04381.pdf) | Inference | | [Int8 FP32 BFloat16](image_recognition/tensorflow/mobilenet_v2/inference/README.md) | [ImageNet 2012](https://github.com/IntelAI/models/tree/master/datasets/imagenet/README.md) |
 | Image Recognition | [ResNet 101](https://arxiv.org/pdf/1512.03385.pdf) | Inference | Model Containers: [Int8](https://software.intel.com/content/www/us/en/develop/articles/containers/resnet101-int8-inference-tensorflow-container.html) [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/resnet101-fp32-inference-tensorflow-container.html) <br> Model Packages: [Int8](https://software.intel.com/content/www/us/en/develop/articles/containers/resnet101-int8-inference-tensorflow-model.html) [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/resnet101-fp32-inference-tensorflow-model.html) | [Int8 FP32](image_recognition/tensorflow/resnet101/inference/README.md) | [ImageNet 2012](https://github.com/IntelAI/models/tree/master/datasets/imagenet/README.md) |
 | Image Recognition | [ResNet 50](https://arxiv.org/pdf/1512.03385.pdf) | Inference | Model Containers: [Int8](https://software.intel.com/content/www/us/en/develop/articles/containers/resnet50-int8-inference-tensorflow-container.html) [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/resnet50-fp32-inference-tensorflow-container.html) <br> Model Packages: [Int8](https://software.intel.com/content/www/us/en/develop/articles/containers/resnet50-int8-inference-tensorflow-model.html) [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/resnet50-fp32-inference-tensorflow-model.html) | [Int8 FP32](image_recognition/tensorflow/resnet50/inference/README.md) | [ImageNet 2012](https://github.com/IntelAI/models/tree/master/datasets/imagenet/README.md) |
 | Image Recognition | [ResNet 50v1.5](https://github.com/tensorflow/models/tree/v2.11.0/official/legacy/image_classification/resnet) | Inference | Model Containers: [Int8](https://software.intel.com/content/www/us/en/develop/articles/containers/resnet50v1-5-int8-inference-tensorflow-container.html) [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/resnet50v1-5-fp32-inference-tensorflow-container.html) [BFloat16](https://software.intel.com/content/www/us/en/develop/articles/containers/resnet50v1-5-bfloat16-inference-tensorflow-container.html) <br> Model Packages: [Int8](https://software.intel.com/content/www/us/en/develop/articles/containers/resnet50v1-5-int8-inference-tensorflow-model.html) [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/resnet50v1-5-fp32-inference-tensorflow-model.html) [BFloat16](https://software.intel.com/content/www/us/en/develop/articles/containers/resnet50v1-5-bfloat16-inference-tensorflow-model.html) | [Int8 FP32 BFloat16](image_recognition/tensorflow/resnet50v1_5/inference/README.md) | [ImageNet 2012](https://github.com/IntelAI/models/tree/master/datasets/imagenet/README.md) |

benchmarks/common/platform_util.py

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@
 CPU_SOCKETS_STR_ = "Socket(s)"
 CORES_PER_SOCKET_STR_ = "Core(s) per socket"
 THREADS_PER_CORE_STR_ = "Thread(s) per core"
-LOGICAL_CPUS_STR_ = "CPU(s)"
+LOGICAL_CPUS_STR_ = "CPU(s):"
 NUMA_NODE_CPU_RANGE_STR_ = "NUMA node{} CPU(s):"
 ONLINE_CPUS_LIST = "On-line CPU(s) list:"
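One plausible reading of this one-character change: `lscpu` prints several fields that begin with `CPU(s)`, so a prefix match on the old constant can pick up more than the logical CPU count. The sketch below illustrates that effect only. The sample text and the `find_values` helper are hypothetical and are not the parsing code in `platform_util.py`; real `lscpu` output varies by util-linux version.

```python
# Hypothetical lscpu-style sample (an assumption for illustration only).
SAMPLE = """\
CPU(s):                 8
On-line CPU(s) list:    0-7
Thread(s) per core:     2
CPU(s) scaling MHz:     100%
NUMA node0 CPU(s):      0-7
"""


def find_values(text, key):
    """Return the value part of every line that starts with `key`.

    Illustrative helper, not taken from platform_util.py.
    """
    values = []
    for line in text.splitlines():
        if line.startswith(key):
            # Drop the key, then any leading colon/space padding.
            values.append(line[len(key):].lstrip(": ").strip())
    return values


# Old constant: also matches the "CPU(s) scaling MHz:" line.
loose = find_values(SAMPLE, "CPU(s)")
# New constant: matches only the logical CPU count line.
strict = find_values(SAMPLE, "CPU(s):")
```

With the colon included, only the `CPU(s):` line matches, which is the behaviour the new constant would give under a prefix-style lookup.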

benchmarks/common/tensorflow/start.sh

Lines changed: 49 additions & 32 deletions
@@ -114,6 +114,17 @@ if [[ ${NOINSTALL} != "True" ]]; then
   # set env var before installs so that user interaction is not required
   export DEBIAN_FRONTEND=noninteractive
   # install common dependencies
+
+  # Handle horovod uniformly for all OSs.
+  # If a diffferent version need to be used for a specific OS
+  # change that variable alone locally in the large if stmts (below)
+  if [[ ${MPI_NUM_PROCESSES} != "None" && $MODE == "training" ]]; then
+    export HOROVOD_WITHOUT_PYTORCH=1
+    export HOROVOD_WITHOUT_MXNET=1
+    export HOROVOD_WITH_TENSORFLOW=1
+    export HOROVOD_VERSION=39c8f7c
+  fi
+
   if [[ ${OS_PLATFORM} == *"CentOS"* ]] || [[ ${OS_PLATFORM} == *"Red Hat"* ]]; then
     yum update -y
     yum install -y gcc gcc-c++ cmake python3-tkinter libXext libSM
@@ -150,18 +161,12 @@ if [[ ${NOINSTALL} != "True" ]]; then
       fi
     fi

-    if [[ ${MPI_NUM_PROCESSES} != "None" ]]; then
+    if [[ ${MPI_NUM_PROCESSES} != "None" && $MODE == "training" ]]; then
       # Installing OpenMPI
       yum install -y openmpi openmpi-devel openssh openssh-server
       yum clean all
       export PATH="/usr/lib64/openmpi/bin:${PATH}"

-      # Install Horovod
-      export HOROVOD_WITHOUT_PYTORCH=1
-      export HOROVOD_WITHOUT_MXNET=1
-      export HOROVOD_WITH_TENSORFLOW=1
-      export HOROVOD_VERSION=b1d0ce8
-
       # Install GCC 7 from devtoolset-7
       if [[ ${OS_VERSION} =~ "7".* ]]; then
         if [[ ${OS_PLATFORM} == *"CentOS"* ]]; then
@@ -179,7 +184,7 @@ if [[ ${NOINSTALL} != "True" ]]; then
       # a working commit replace next set of commands with something like:
       yum install -y git make
       yum clean all
-      python3 -m pip install --no-cache-dir git+https://github.com/horovod/horovod.git@${HOROVOD_VERSION}
+      CC=gcc CXX=g++ python3 -m pip install --no-cache-dir git+https://github.com/horovod/horovod.git@${HOROVOD_VERSION}
       horovodrun --check-build
     fi
   elif [[ ${OS_PLATFORM} == *"SLES"* ]] || [[ ${OS_PLATFORM} == *"SUSE"* ]]; then
@@ -192,23 +197,17 @@ if [[ ${NOINSTALL} != "True" ]]; then
       zypper clean all
     fi

-    if [[ ${MPI_NUM_PROCESSES} != "None" ]]; then
+    if [[ ${MPI_NUM_PROCESSES} != "None" && $MODE == "training" ]]; then
       ## Installing OpenMPI
       zypper install -y openmpi3 openmpi3-devel openssh openssh-server
       zypper clean all
       export PATH="/usr/lib64/mpi/gcc/openmpi3/bin:${PATH}"

-      ## Install Horovod
-      export HOROVOD_WITHOUT_PYTORCH=1
-      export HOROVOD_WITHOUT_MXNET=1
-      export HOROVOD_WITH_TENSORFLOW=1
-      export HOROVOD_VERSION=35b27e9
-
       # In case installing released versions of Horovod fail,and there is
       # a working commit replace next set of commands with something like:
       zypper install -y git make
       zypper clean all
-      python3 -m pip install --no-cache-dir git+https://github.com/horovod/horovod.git@${HOROVOD_VERSION}
+      CC=gcc CXX=g++ python3 -m pip install --no-cache-dir git+https://github.com/horovod/horovod.git@${HOROVOD_VERSION}
       horovodrun --check-build
     fi
   elif [[ ${OS_PLATFORM} == *"Ubuntu"* ]] || [[ ${OS_PLATFORM} == *"Debian"* ]]; then
@@ -224,30 +223,25 @@ if [[ ${NOINSTALL} != "True" ]]; then
         apt-get install google-perftools -y
       fi

-      if [[ ${MPI_NUM_PROCESSES} != "None" ]]; then
+      if [[ ${MPI_NUM_PROCESSES} != "None" && $MODE == "training" ]]; then
         # Installing OpenMPI
         apt-get install openmpi-bin openmpi-common openssh-client openssh-server libopenmpi-dev -y

-        # Install Horovod
-        export HOROVOD_WITHOUT_PYTORCH=1
-        export HOROVOD_WITHOUT_MXNET=1
-        export HOROVOD_WITH_TENSORFLOW=1
-        export HOROVOD_WITH_MPI=1
-        export HOROVOD_VERSION=35b27e9
-
         apt-get update
         # In case installing released versions of Horovod fail,and there is
         # a working commit replace next set of commands with something like:
         apt-get install -y --no-install-recommends --fix-missing cmake git
         # TODO: Once this PR https://github.com/horovod/horovod/pull/3864 is merged, we can install horovod as before.
-        # python3 -m pip install --no-cache-dir git+https://github.com/horovod/horovod.git@${HOROVOD_VERSION}
-        git clone https://github.com/horovod/horovod.git
-        cd horovod
-        git reset --hard ${HOROVOD_VERSION}
-        git submodule update --init --recursive
-        git fetch origin pull/3864/head:ashahba/issue-3861-fix
-        git checkout ashahba/issue-3861-fix
-        python3 -m pip install --no-cache-dir -v -e .
+        CC=gcc CXX=g++ python3 -m pip install --no-cache-dir git+https://github.com/horovod/horovod.git@${HOROVOD_VERSION}
+
+        # Will keep this as reference for any future usecase
+        #git clone https://github.com/horovod/horovod.git
+        #cd horovod
+        #git reset --hard ${HOROVOD_VERSION}
+        #git submodule update --init --recursive
+        #git fetch origin pull/3864/head:ashahba/issue-3861-fix
+        #git checkout ashahba/issue-3861-fix
+        #python3 -m pip install --no-cache-dir -v -e .

         horovodrun --check-build
       fi
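Note on the `CC=gcc CXX=g++` prefix added in the two hunks above: an assignment written in front of a command is placed in that command's environment only, so the compiler choice reaches the Horovod build without being exported for the rest of the script. The sketch below shows this scoping with a stand-in child process; the `report` string is a hypothetical placeholder for the `pip install` step, not something the script contains.

```shell
# Stand-in for the build step: a child process that prints the compilers it sees.
report='echo "CC=${CC:-unset} CXX=${CXX:-unset}"'

# Start from a clean slate so the demonstration does not depend on the caller.
unset CC CXX

# Prefix assignments apply to this one child process only.
with_prefix=$(CC=gcc CXX=g++ sh -c "$report")

# A later command started without the prefix does not inherit them.
without_prefix=$(sh -c "$report")

echo "with prefix:    $with_prefix"
echo "without prefix: $without_prefix"
```

The first line reports `gcc` and `g++`, while the second reports both variables as unset, which is why the prefix has to be repeated on each `pip install` line rather than set once.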
@@ -921,6 +915,27 @@ function mobilenet_v1() {
   fi
 }

+# mobilenet_v2 model
+function mobilenet_v2() {
+  if [ ${PRECISION} == "fp32" ] || [ ${PRECISION} == "bfloat16" ]; then
+    CMD="${CMD} $(add_arg "--input_height" ${INPUT_HEIGHT}) $(add_arg "--input_width" ${INPUT_WIDTH}) \
+    $(add_arg "--warmup_steps" ${WARMUP_STEPS}) $(add_arg "--steps" ${STEPS}) \
+    $(add_arg "--input_layer" ${INPUT_LAYER}) $(add_arg "--output_layer" ${OUTPUT_LAYER})"
+
+    PYTHONPATH=${PYTHONPATH} CMD=${CMD} run_model
+  elif [ ${PRECISION} == "int8" ]; then
+    CMD="${CMD} $(add_arg "--input_height" ${INPUT_HEIGHT}) $(add_arg "--input_width" ${INPUT_WIDTH}) \
+    $(add_arg "--warmup_steps" ${WARMUP_STEPS}) $(add_arg "--steps" ${STEPS}) \
+    $(add_arg "--input_layer" ${INPUT_LAYER}) $(add_arg "--output_layer" ${OUTPUT_LAYER}) \
+    $(add_calibration_arg)"
+
+    PYTHONPATH=${PYTHONPATH} CMD=${CMD} run_model
+  else
+    echo "PRECISION=${PRECISION} is not supported for ${MODEL_NAME}"
+    exit 1
+  fi
+}
+
 # MTCC model
 function mtcc() {
   if [ ${PRECISION} == "fp32" ]; then
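Note on `mobilenet_v2` above: the function builds `CMD` by appending the result of `add_arg` for each optional setting. The definition of `add_arg` is outside this diff, so the version below is only an assumed stand-in that emits ` --flag=value` when the value is non-empty and nothing otherwise. The base command and the variable values are placeholders as well.

```shell
# Assumed stand-in for add_arg; the real helper is defined elsewhere in the
# script and may differ.
#   $1 = flag name, $2 = value (may be empty or missing)
add_arg() {
  if [ -n "$2" ]; then
    echo " $1=$2"
  fi
}

CMD="python run.py"   # placeholder base command
INPUT_HEIGHT=224      # set, so the flag should be appended
INPUT_WIDTH=""        # empty, so the flag should be dropped

CMD="${CMD}$(add_arg "--input_height" "${INPUT_HEIGHT}")$(add_arg "--input_width" "${INPUT_WIDTH}")"
echo "${CMD}"
```

With a helper of this shape, unset options simply vanish from the command line, which would explain why the function can pass every variable unconditionally. The sketch quotes the values passed to `add_arg`; the diff leaves them unquoted, which behaves the same for empty or single-word values.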
@@ -1659,6 +1674,8 @@ elif [ ${MODEL_NAME} == "maskrcnn" ]; then
   maskrcnn
 elif [ ${MODEL_NAME} == "mobilenet_v1" ]; then
   mobilenet_v1
+elif [ ${MODEL_NAME} == "mobilenet_v2" ]; then
+  mobilenet_v2
 elif [ ${MODEL_NAME} == "resnet101" ]; then
   resnet101_inceptionv3
 elif [ ${MODEL_NAME} == "resnet50" ]; then
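Note on the dispatch hunk above: each supported model gets one `elif` branch that calls the function of the same name, so registering `mobilenet_v2` takes two lines. For comparison only, the same name-to-function dispatch could be sketched with a `case` statement. This is not how the script is written; `dispatch_model` and the stub functions are hypothetical.

```shell
# Stub model functions; the real ones build CMD and call run_model.
mobilenet_v1() { echo "running mobilenet_v1"; }
mobilenet_v2() { echo "running mobilenet_v2"; }

# Hypothetical dispatcher: map a model name to its function, and fail with a
# non-zero status for names that have no branch.
dispatch_model() {
  case "$1" in
    mobilenet_v1) mobilenet_v1 ;;
    mobilenet_v2) mobilenet_v2 ;;
    *) echo "Unsupported model: $1"; return 1 ;;
  esac
}

dispatch_model "mobilenet_v2"
```

Either form fails in the same way when a model function is added without a matching branch: the name falls through to the unsupported case.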
Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+#
+# -*- coding: utf-8 -*-
+#
+# Copyright (c) 2023 Intel Corporation
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+#
