diff --git a/README.md b/README.md
index 34b658758..43da6db81 100644
--- a/README.md
+++ b/README.md
@@ -1,111 +1,119 @@
-# Detectron
+# Detectron Transfer Learning with the PASCAL VOC 2007 Dataset
-Detectron is Facebook AI Research's software system that implements state-of-the-art object detection algorithms, including [Mask R-CNN](https://arxiv.org/abs/1703.06870). It is written in Python and powered by the [Caffe2](https://github.com/caffe2/caffe2) deep learning framework.
+**Detectron implements several object detection algorithms. All of the released models were trained on the COCO 2014 dataset, which has 80 categories. This guide fine-tunes Faster R-CNN with FPN on the PASCAL VOC 2007 dataset, which has only 20 categories. The same approach can be used to fine-tune your own model on a new dataset.**
-At FAIR, Detectron has enabled numerous research projects, including: [Feature Pyramid Networks for Object Detection](https://arxiv.org/abs/1612.03144), [Mask R-CNN](https://arxiv.org/abs/1703.06870), [Detecting and Recognizing Human-Object Interactions](https://arxiv.org/abs/1704.07333), [Focal Loss for Dense Object Detection](https://arxiv.org/abs/1708.02002), [Non-local Neural Networks](https://arxiv.org/abs/1711.07971), [Learning to Segment Every Thing](https://arxiv.org/abs/1711.10370), and [Data Distillation: Towards Omni-Supervised Learning](https://arxiv.org/abs/1712.04440).
-
-
-

-
Example Mask R-CNN output.
-
-
-## Introduction
-
-The goal of Detectron is to provide a high-quality, high-performance
-codebase for object detection *research*. It is designed to be flexible in order
-to support rapid implementation and evaluation of novel research. Detectron
-includes implementations of the following object detection algorithms:
-
-- [Mask R-CNN](https://arxiv.org/abs/1703.06870) -- *Marr Prize at ICCV 2017*
-- [RetinaNet](https://arxiv.org/abs/1708.02002) -- *Best Student Paper Award at ICCV 2017*
-- [Faster R-CNN](https://arxiv.org/abs/1506.01497)
-- [RPN](https://arxiv.org/abs/1506.01497)
-- [Fast R-CNN](https://arxiv.org/abs/1504.08083)
-- [R-FCN](https://arxiv.org/abs/1605.06409)
-
-using the following backbone network architectures:
-
-- [ResNeXt{50,101,152}](https://arxiv.org/abs/1611.05431)
-- [ResNet{50,101,152}](https://arxiv.org/abs/1512.03385)
-- [Feature Pyramid Networks](https://arxiv.org/abs/1612.03144) (with ResNet/ResNeXt)
-- [VGG16](https://arxiv.org/abs/1409.1556)
-
-Additional backbone architectures may be easily implemented. For more details about these models, please see [References](#references) below.
-
-## License
-
-Detectron is released under the [Apache 2.0 license](https://github.com/facebookresearch/detectron/blob/master/LICENSE). See the [NOTICE](https://github.com/facebookresearch/detectron/blob/master/NOTICE) file for additional details.
-
-## Citing Detectron
-
-If you use Detectron in your research or wish to refer to the baseline results published in the [Model Zoo](MODEL_ZOO.md), please use the following BibTeX entry.
+## 1. Set up Caffe2 and Detectron and run the Detectron demo successfully
+I will refer to the Detectron directory as $DETECTRON.
+## 2. Download the pre-trained models
+The code downloads the models automatically, but my internet connection is slow, so I prefer to fetch them ahead of time.
+Because I'm using ResNet-50 as the backbone, I need both the ImageNet-pretrained R-50 weights and the e2e_faster_rcnn_R-50-FPN model:
```
-@misc{Detectron2018,
- author = {Ross Girshick and Ilija Radosavovic and Georgia Gkioxari and
- Piotr Doll\'{a}r and Kaiming He},
- title = {Detectron},
- howpublished = {\url{https://github.com/facebookresearch/detectron}},
- year = {2018}
-}
+mkdir -p /tmp/detectron/detectron-download-cache/ImageNetPretrained/MSRA
+wget -O /tmp/detectron/detectron-download-cache/ImageNetPretrained/MSRA/R-50.pkl \
+    https://s3-us-west-2.amazonaws.com/detectron/ImageNetPretrained/MSRA/R-50.pkl
+mkdir -p "/tmp/detectron/detectron-download-cache/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn"
+wget -O "/tmp/detectron/detectron-download-cache/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl" \
+    https://s3-us-west-2.amazonaws.com/detectron/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl
```
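+Note that wget needs `-O` to write to a given path, and the second file must land exactly where the TRAIN.WEIGHTS path in the configuration below points (the original command fetched the Mask R-CNN model by mistake; the Faster R-CNN URL above is the one used in experiments/demo.txt). If you want to be sure the downloads are intact before training, you can load the pickles and peek at the blob names. This is a minimal sketch, assuming the usual Detectron checkpoint layout: trained checkpoints nest their arrays under a 'blobs' key, while the ImageNet-pretrained files are a flat dict of blob name -> ndarray.
+```
+import cPickle as pickle
+
+path = '/tmp/detectron/detectron-download-cache/ImageNetPretrained/MSRA/R-50.pkl'
+with open(path, 'rb') as f:
+    data = pickle.load(f)
+
+# trained model_final.pkl files wrap the weights in a 'blobs' dict
+blobs = data['blobs'] if 'blobs' in data else data
+print('%d blobs; first keys: %s' % (len(blobs), sorted(blobs.keys())[:5]))
+```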
-## Model Zoo and Baselines
+## 3. Prepare the configuration file
+### a. Copy the sample configuration file from $DETECTRON/configs/getting_started
+```
+cd $DETECTRON
+mkdir experiments && cd experiments
+cp ../configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml
+```
+### b. Change these settings in the configuration file
+```
+MODEL:
+ TYPE: generalized_rcnn
+ CONV_BODY: FPN.add_fpn_ResNet50_conv5_body
+ NUM_CLASSES: 21
+ FASTER_RCNN: True
+```
+PASCAL VOC 2007 has only 20 classes plus one background class, so NUM_CLASSES is set to 21.
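+As a quick check of that arithmetic, the 20 VOC categories are the ones listed in the AP report of step 9; index 0 is always reserved for the background class:
+```
+VOC_CLASSES = [
+    'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat',
+    'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person',
+    'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']
+
+# index 0 is '__background__', so NUM_CLASSES = 20 + 1
+print(len(VOC_CLASSES) + 1)  # 21
+```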
-We provide a large set of baseline results and trained models available for download in the [Detectron Model Zoo](MODEL_ZOO.md).
+```
+TRAIN:
+ SNAPSHOT_ITERS: 5000
+ WEIGHTS: /tmp/detectron/detectron-download-cache/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl
+ DATASETS: ('voc_2007_train',)
+```
+Change the WEIGHTS value to the path where you placed the downloaded model in step 2.
-## Installation
+## 4. Download PASCAL VOC 2007 and the COCO-format annotations
+Refer to the [data readme file](https://github.com/facebookresearch/Detectron/blob/master/lib/datasets/data/README.md) to prepare the PASCAL dataset.
-Please find installation instructions for Caffe2 and Detectron in [`INSTALL.md`](INSTALL.md).
+The built-in PASCAL dataset support has a bug: the entries in $DETECTRON/lib/datasets/dataset_catalog.py should be changed as follows:
+```
+ 'voc_2007_train': {
+ IM_DIR:
+ _DATA_DIR + '/VOC2007/JPEGImages',
+ ANN_FN:
+ _DATA_DIR + '/VOC2007/annotations/pascal_train2007.json',
+ DEVKIT_DIR:
+ _DATA_DIR + '/VOC2007/VOCdevkit2007'
+ },
+ 'voc_2007_test': {
+ IM_DIR:
+ _DATA_DIR + '/VOC2007/JPEGImages',
+ ANN_FN:
+ _DATA_DIR + '/VOC2007/annotations/pascal_test2007.json',
+ DEVKIT_DIR:
+ _DATA_DIR + '/VOC2007/VOCdevkit2007'
+ },
-## Quick Start: Using Detectron
+```
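+Before training, it is worth sanity-checking the converted annotations. The sketch below assumes the default data layout from the catalog entries above ($DETECTRON/lib/datasets/data); COCO-format JSON always carries 'images', 'annotations', and 'categories' keys, and for VOC there should be exactly 20 categories:
+```
+import json
+
+ann_fn = 'lib/datasets/data/VOC2007/annotations/pascal_train2007.json'
+with open(ann_fn) as f:
+    d = json.load(f)
+
+print('%d images, %d boxes' % (len(d['images']), len(d['annotations'])))
+for c in d['categories']:  # expect 20 entries, e.g. {'id': 15, 'name': 'person'}
+    print('%s %s' % (c['id'], c['name']))
+```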
-After installation, please see [`GETTING_STARTED.md`](GETTING_STARTED.md) for brief tutorials covering inference and training with Detectron.
+## 5. Rename cls_score and bbox_pred to prevent an error when train_net.py loads the weights
+In lib/modeling/fast_rcnn_heads.py, change every cls_score to cls_score_voc and every bbox_pred to bbox_pred_voc. Since lib/core/test.py fetches the bbox_pred blob at test time, it must be updated to the new name as well (see the diff to lib/core/test.py below).
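+Why this is needed: the COCO checkpoint's cls_score and bbox_pred blobs are shaped for 81 classes (81 and 81 * 4 outputs), which clashes with the new 21-class heads. The weight loader matches blobs by name, so renaming the output blobs makes it skip those two layers and leave them freshly initialized. A simplified sketch of that matching logic (not Detectron's actual loader, and with hypothetical shapes):
+```
+import numpy as np
+
+def load_matching_weights(model_blobs, checkpoint_blobs):
+    # Copy checkpoint arrays wherever the names match; renamed blobs
+    # (cls_score_voc_w, bbox_pred_voc_w, ...) are absent from the COCO
+    # checkpoint, so they keep their fresh Gaussian initialization.
+    for name, blob in model_blobs.items():
+        if name in checkpoint_blobs:
+            blob[...] = checkpoint_blobs[name]
+        else:
+            print('%s not in checkpoint, keeping its initialization' % name)
+
+# hypothetical example: the backbone blob is loaded, the 21-class head is skipped
+model = {'conv1_w': np.zeros((64, 3, 7, 7)),
+         'cls_score_voc_w': 0.01 * np.random.randn(21, 1024)}
+ckpt = {'conv1_w': np.ones((64, 3, 7, 7)),
+        'cls_score_w': np.random.randn(81, 1024)}
+load_matching_weights(model, ckpt)
+```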
-## Getting Help
+## 6. Run the training command
+```
+python2 tools/train_net.py --cfg experiments/e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml OUTPUT_DIR experiments/output
+```
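+train_net.py periodically prints json_stats lines with the iteration, loss, and learning rate. Assuming you saved the console output to a file (e.g. with `| tee train.log`), a small sketch for pulling out the loss curve:
+```
+import json
+
+losses = []
+with open('train.log') as f:
+    for line in f:
+        # lines look like: json_stats: {"iter": 20, "loss": 1.23, ...}
+        if 'json_stats: ' in line:
+            stats = json.loads(line.split('json_stats: ', 1)[1])
+            losses.append((stats['iter'], stats['loss']))
+
+for it, loss in losses[-5:]:
+    print('iter %s: loss %s' % (it, loss))
+```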
-To start, please check the [troubleshooting](INSTALL.md#troubleshooting) section of our installation instructions as well as our [FAQ](FAQ.md). If you couldn't find help there, try searching our GitHub issues. We intend the issues page to be a forum in which the community collectively troubleshoots problems.
+## 7. Copy the final trained model
+With MAX_ITER set to 50000, the last snapshot is model_iter49999.pkl.
+```
+mkdir -p /tmp/detectron-download-cache/voc2007/
+cp experiments/output/train/voc_2007_train/generalized_rcnn/model_iter49999.pkl /tmp/detectron-download-cache/voc2007/model_final.pkl
-If bugs are found, **we appreciate pull requests** (including adding Q&A's to `FAQ.md` and improving our installation instructions and troubleshooting documents). Please see [CONTRIBUTING.md](CONTRIBUTING.md) for more information about contributing to Detectron.
+```
+## 8. Run inference on some PASCAL 2007 test images
+```
+python2 tools/infer_simple.py --cfg experiments/e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml \
+ --output-dir /tmp/detectron-visualizations --wts /tmp/detectron-download-cache/voc2007/model_final.pkl \
+ demo2
+```
+Unfortunately, I found that every person is labeled as bird. This could be caused by incorrectly converted JSON annotations, but the per-class APs reported in step 9 (person AP = 0.7931) suggest the labels themselves are fine; a more likely cause is the visualization, sketched below.
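+tools/infer_simple.py labels boxes with COCO class names (it builds a dummy COCO dataset purely for visualization), so a model trained on VOC gets the wrong names, and VOC's class index 15 (person) happens to be bird in the COCO list. Both lists are truncated at index 15 here for brevity:
+```
+COCO = ['__background__', 'person', 'bicycle', 'car', 'motorcycle',
+        'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
+        'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird']
+VOC = ['__background__', 'aeroplane', 'bicycle', 'bird', 'boat', 'bottle',
+       'bus', 'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse',
+       'motorbike', 'person']
+print('%s -> %s' % (VOC[15], COCO[15]))  # person -> bird
+```
+If that is the cause, the trained model is fine and only the visualization needs VOC class names.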
-## References
+## 9. Run test_net.py on the PASCAL 2007 test set
+```
+python2 tools/test_net.py \
+ --cfg experiments/e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml \
+ TEST.WEIGHTS /tmp/detectron-download-cache/voc2007/model_final.pkl \
+ NUM_GPUS 1
+```
+The test reports the per-class AP and the mean AP:
+```
+INFO voc_dataset_evaluator.py: 127: AP for aeroplane = 0.8095
+INFO voc_dataset_evaluator.py: 127: AP for bicycle = 0.8042
+INFO voc_dataset_evaluator.py: 127: AP for bird = 0.7086
+INFO voc_dataset_evaluator.py: 127: AP for boat = 0.6418
+INFO voc_dataset_evaluator.py: 127: AP for bottle = 0.6861
+INFO voc_dataset_evaluator.py: 127: AP for bus = 0.8822
+INFO voc_dataset_evaluator.py: 127: AP for car = 0.8794
+INFO voc_dataset_evaluator.py: 127: AP for cat = 0.8621
+INFO voc_dataset_evaluator.py: 127: AP for chair = 0.5876
+INFO voc_dataset_evaluator.py: 127: AP for cow = 0.7799
+INFO voc_dataset_evaluator.py: 127: AP for diningtable = 0.7404
+INFO voc_dataset_evaluator.py: 127: AP for dog = 0.8497
+INFO voc_dataset_evaluator.py: 127: AP for horse = 0.8855
+INFO voc_dataset_evaluator.py: 127: AP for motorbike = 0.7912
+INFO voc_dataset_evaluator.py: 127: AP for person = 0.7931
+INFO voc_dataset_evaluator.py: 127: AP for pottedplant = 0.5142
+INFO voc_dataset_evaluator.py: 127: AP for sheep = 0.7950
+INFO voc_dataset_evaluator.py: 127: AP for sofa = 0.7457
+INFO voc_dataset_evaluator.py: 127: AP for train = 0.7956
+INFO voc_dataset_evaluator.py: 127: AP for tvmonitor = 0.6960
+INFO voc_dataset_evaluator.py: 130: Mean AP = 0.7624
-- [Data Distillation: Towards Omni-Supervised Learning](https://arxiv.org/abs/1712.04440).
- Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari, and Kaiming He.
- Tech report, arXiv, Dec. 2017.
-- [Learning to Segment Every Thing](https://arxiv.org/abs/1711.10370).
- Ronghang Hu, Piotr Dollár, Kaiming He, Trevor Darrell, and Ross Girshick.
- Tech report, arXiv, Nov. 2017.
-- [Non-Local Neural Networks](https://arxiv.org/abs/1711.07971).
- Xiaolong Wang, Ross Girshick, Abhinav Gupta, and Kaiming He.
- Tech report, arXiv, Nov. 2017.
-- [Mask R-CNN](https://arxiv.org/abs/1703.06870).
- Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick.
- IEEE International Conference on Computer Vision (ICCV), 2017.
-- [Focal Loss for Dense Object Detection](https://arxiv.org/abs/1708.02002).
- Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár.
- IEEE International Conference on Computer Vision (ICCV), 2017.
-- [Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour](https://arxiv.org/abs/1706.02677).
- Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He.
- Tech report, arXiv, June 2017.
-- [Detecting and Recognizing Human-Object Interactions](https://arxiv.org/abs/1704.07333).
- Georgia Gkioxari, Ross Girshick, Piotr Dollár, and Kaiming He.
- Tech report, arXiv, Apr. 2017.
-- [Feature Pyramid Networks for Object Detection](https://arxiv.org/abs/1612.03144).
- Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie.
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
-- [Aggregated Residual Transformations for Deep Neural Networks](https://arxiv.org/abs/1611.05431).
- Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He.
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
-- [R-FCN: Object Detection via Region-based Fully Convolutional Networks](http://arxiv.org/abs/1605.06409).
- Jifeng Dai, Yi Li, Kaiming He, and Jian Sun.
- Conference on Neural Information Processing Systems (NIPS), 2016.
-- [Deep Residual Learning for Image Recognition](http://arxiv.org/abs/1512.03385).
- Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun.
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
-- [Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks](http://arxiv.org/abs/1506.01497)
- Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun.
- Conference on Neural Information Processing Systems (NIPS), 2015.
-- [Fast R-CNN](http://arxiv.org/abs/1504.08083).
- Ross Girshick.
- IEEE International Conference on Computer Vision (ICCV), 2015.
+```
+
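+As a final sanity check, the reported mean AP is just the arithmetic mean of the 20 per-class APs above:
+```
+aps = [0.8095, 0.8042, 0.7086, 0.6418, 0.6861, 0.8822, 0.8794, 0.8621,
+       0.5876, 0.7799, 0.7404, 0.8497, 0.8855, 0.7912, 0.7931, 0.5142,
+       0.7950, 0.7457, 0.7956, 0.6960]
+print(sum(aps) / len(aps))  # 0.76239, matching the reported 0.7624
+```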
diff --git a/demo2/000012.jpg b/demo2/000012.jpg
new file mode 100644
index 000000000..b829107b8
Binary files /dev/null and b/demo2/000012.jpg differ
diff --git a/demo2/000014.jpg b/demo2/000014.jpg
new file mode 100644
index 000000000..5d8583e6a
Binary files /dev/null and b/demo2/000014.jpg differ
diff --git a/demo2/000017.jpg b/demo2/000017.jpg
new file mode 100644
index 000000000..19dd833f8
Binary files /dev/null and b/demo2/000017.jpg differ
diff --git a/demo2/000018.jpg b/demo2/000018.jpg
new file mode 100644
index 000000000..e76c295f2
Binary files /dev/null and b/demo2/000018.jpg differ
diff --git a/demo2/000019.jpg b/demo2/000019.jpg
new file mode 100644
index 000000000..f4f2e76ab
Binary files /dev/null and b/demo2/000019.jpg differ
diff --git a/demo2/000021.jpg b/demo2/000021.jpg
new file mode 100644
index 000000000..bf32a0002
Binary files /dev/null and b/demo2/000021.jpg differ
diff --git a/demo2/000022.jpg b/demo2/000022.jpg
new file mode 100644
index 000000000..c83174d74
Binary files /dev/null and b/demo2/000022.jpg differ
diff --git a/demo2/000023.jpg b/demo2/000023.jpg
new file mode 100644
index 000000000..ff2886993
Binary files /dev/null and b/demo2/000023.jpg differ
diff --git a/demo2/000025.jpg b/demo2/000025.jpg
new file mode 100644
index 000000000..b1b0359c9
Binary files /dev/null and b/demo2/000025.jpg differ
diff --git a/demo2/000028.jpg b/demo2/000028.jpg
new file mode 100644
index 000000000..df4a907e5
Binary files /dev/null and b/demo2/000028.jpg differ
diff --git a/demo2/000030.jpg b/demo2/000030.jpg
new file mode 100644
index 000000000..ea141fc52
Binary files /dev/null and b/demo2/000030.jpg differ
diff --git a/demo2/000031.jpg b/demo2/000031.jpg
new file mode 100644
index 000000000..74ea3a85c
Binary files /dev/null and b/demo2/000031.jpg differ
diff --git a/demo2/000032.jpg b/demo2/000032.jpg
new file mode 100644
index 000000000..b111b5a0b
Binary files /dev/null and b/demo2/000032.jpg differ
diff --git a/demo2/000034.jpg b/demo2/000034.jpg
new file mode 100644
index 000000000..aacf7d075
Binary files /dev/null and b/demo2/000034.jpg differ
diff --git a/demo2/000035.jpg b/demo2/000035.jpg
new file mode 100644
index 000000000..fa23a0b27
Binary files /dev/null and b/demo2/000035.jpg differ
diff --git a/demo2/000036.jpg b/demo2/000036.jpg
new file mode 100644
index 000000000..911d58131
Binary files /dev/null and b/demo2/000036.jpg differ
diff --git a/experiments/_init_paths.py b/experiments/_init_paths.py
new file mode 100644
index 000000000..11f3d85a2
--- /dev/null
+++ b/experiments/_init_paths.py
@@ -0,0 +1,20 @@
+#!/usr/bin/env python2
+# -*- coding: utf-8 -*-
+"""
+Created on Tue Feb 6 08:39:36 2018
+
+@author: roy
+"""
+
+"""Insert /home/roy/projects/caffe2/build to PYTHONPATH"""
+
+import sys
+
+pt = '/home/roy/projects/caffe2/build'
+
+
+def add_path(path):
+ if path not in sys.path:
+ sys.path.insert(0, path)
+
+add_path(pt)
\ No newline at end of file
diff --git a/experiments/demo.txt b/experiments/demo.txt
new file mode 100644
index 000000000..5c8c91e13
--- /dev/null
+++ b/experiments/demo.txt
@@ -0,0 +1,9 @@
+python2 tools/infer_simple.py --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml --output-dir /tmp/detectron-visualizations --image-ext jpg --wts https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl demo
+
+python2 tools/infer_simple.py \
+ --cfg configs/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml \
+ --output-dir /tmp/detectron-visualizations \
+ --image-ext jpg \
+ --wts https://s3-us-west-2.amazonaws.com/detectron/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl demo
+
+python2 tools/train_net.py --cfg experiments/e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml OUTPUT_DIR experiments/output
diff --git a/experiments/e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml b/experiments/e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml
new file mode 100644
index 000000000..994c86a75
--- /dev/null
+++ b/experiments/e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml
@@ -0,0 +1,55 @@
+MODEL:
+ TYPE: generalized_rcnn
+ CONV_BODY: FPN.add_fpn_ResNet50_conv5_body
+ NUM_CLASSES: 21
+ FASTER_RCNN: True
+NUM_GPUS: 1
+SOLVER:
+ WEIGHT_DECAY: 0.0001
+ LR_POLICY: steps_with_decay
+ BASE_LR: 0.0025
+ GAMMA: 0.1
+ MAX_ITER: 50000
+ STEPS: [0, 30000, 40000]
+ # Equivalent schedules with...
+ # 1 GPU:
+ # BASE_LR: 0.0025
+ # MAX_ITER: 60000
+ # STEPS: [0, 30000, 40000]
+ # 2 GPUs:
+ # BASE_LR: 0.005
+ # MAX_ITER: 30000
+ # STEPS: [0, 15000, 20000]
+ # 4 GPUs:
+ # BASE_LR: 0.01
+ # MAX_ITER: 15000
+ # STEPS: [0, 7500, 10000]
+ # 8 GPUs:
+ # BASE_LR: 0.02
+ # MAX_ITER: 7500
+ # STEPS: [0, 3750, 5000]
+FPN:
+ FPN_ON: True
+ MULTILEVEL_ROIS: True
+ MULTILEVEL_RPN: True
+FAST_RCNN:
+ ROI_BOX_HEAD: fast_rcnn_heads.add_roi_2mlp_head
+ ROI_XFORM_METHOD: RoIAlign
+ ROI_XFORM_RESOLUTION: 7
+ ROI_XFORM_SAMPLING_RATIO: 2
+TRAIN:
+ SNAPSHOT_ITERS: 5000
+ WEIGHTS: /tmp/detectron/detectron-download-cache/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl
+ DATASETS: ('voc_2007_train',)
+ SCALES: (500,)
+ MAX_SIZE: 833
+ BATCH_SIZE_PER_IM: 256
+ RPN_PRE_NMS_TOP_N: 2000 # Per FPN level
+TEST:
+ DATASETS: ('voc_2007_test',)
+ SCALES: (500,)
+ MAX_SIZE: 833
+ NMS: 0.5
+ RPN_PRE_NMS_TOP_N: 1000 # Per FPN level
+ RPN_POST_NMS_TOP_N: 1000
+OUTPUT_DIR: .
diff --git a/experiments/net2json.ipynb b/experiments/net2json.ipynb
new file mode 100644
index 000000000..69da3d4aa
--- /dev/null
+++ b/experiments/net2json.ipynb
@@ -0,0 +1,156 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from __future__ import division\n",
+ "from __future__ import print_function\n",
+ "import _init_paths\n",
+ "import json\n",
+ "\n",
+ "from caffe2.python.core import Net\n",
+ "from caffe2.python import workspace\n",
+ "from caffe2.proto.caffe2_pb2 import NetDef\n",
+ "import caffe2.proto.caffe2_pb2 as pb2\n",
+ "\n",
+ "from google.protobuf import json_format\n",
+ "import pprint"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "\n",
+ "net = Net('fist_net')\n",
+ "X = net.GaussianFill([], ['X'], mean=0.0, std=1.0, shape=[2,3], run_once=0)\n",
+ "\n",
+ "net_json = json_format.MessageToJson(net.Proto())\n",
+ "\n",
+ "#json.loads convert a json instance to Python object\n",
+ "#json.dumps convert a Python object to JSON formatted str\n",
+ "with open('testnet.txt', 'w') as fid:\n",
+ " fid.write(json.dumps(json.loads(net_json), indent=4))\n",
+ " \n",
+ "jstr = None \n",
+ "with open('testnet.txt', 'r') as fid:\n",
+ " jstr = json.dumps(json.load(fid))\n",
+ " \n",
+ "net2 = NetDef()\n",
+ "json_format.Parse(jstr, net2)\n",
+ "\n",
+ "print(str(net2))\n",
+ " \n",
+ "'''\n",
+ "init_net = NetDef()\n",
+ "with open('param_init_net.pb', 'r') as fid:\n",
+ " init_net.ParseFromString(fid.read())\n",
+ " \n",
+ "\n",
+ "train_net = NetDef()\n",
+ "with open('net.pb', 'r') as fid:\n",
+ " train_net.ParseFromString(fid.read())\n",
+ " \n",
+ "print(init_net.name)\n",
+ "print(train_net.name)\n",
+ "'''"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "jstr = None \n",
+ "with open('net.json', 'r') as fid:\n",
+ " jstr = json.dumps(json.load(fid))\n",
+ "\n",
+ "net = NetDef()\n",
+ "json_format.Parse(jstr, net)\n",
+ "\n",
+ "with open('param_init_net.json', 'r') as fid:\n",
+ " jstr = json.dumps(json.load(fid))\n",
+ " \n",
+ "init_net = NetDef()\n",
+ "json_format.Parse(jstr, init_net)\n",
+ "\n",
+ "print(str(net))\n",
+ "print('=======================')\n",
+ "print(str(init_net))\n",
+ " "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "jstr = None \n",
+ "with open('/home/roy/working/Detectron/experiments/output/train/voc_2007_train/generalized_rcnn/net.json', 'r') as fid:\n",
+ " jstr = json.load(fid)\n",
+ "\n",
+ "\n",
+ "ext_list = jstr['externalInput']\n",
+ "for it in ext_list:\n",
+ " #print(it)\n",
+ " if it.startswith('gpu_0/roi_blobs_queue'):\n",
+ " #print(it)\n",
+ " ext_list.remove(it)\n",
+ " \n",
+ "#print(str(ext_list))\n",
+ "\n",
+ "op_list = jstr['op']\n",
+ "for it_op in op_list:\n",
+ " if it_op['input'][0].startswith('gpu_0/roi_blobs_queue'):\n",
+ " op_list.remove(it_op)\n",
+ "\n",
+ "print(json.dumps(jstr, indent=4))\n",
+ "\n",
+ "\n",
+ " \n",
+ "#net = NetDef()\n",
+ "#json_format.Parse(jstr, net)\n",
+ "\n",
+ "\n",
+ "#print(str(net))\n",
+ "#print('=======================')\n",
+ " "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 2",
+ "language": "python",
+ "name": "python2"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 2
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython2",
+ "version": "2.7.12"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/experiments/test.yaml b/experiments/test.yaml
new file mode 100644
index 000000000..55aff2295
--- /dev/null
+++ b/experiments/test.yaml
@@ -0,0 +1,57 @@
+MODEL:
+ TYPE: generalized_rcnn
+ CONV_BODY: FPN.add_fpn_ResNet50_conv5_body
+ NUM_CLASSES: 81
+ FASTER_RCNN: True
+NUM_GPUS: 1
+SOLVER:
+ WEIGHT_DECAY: 0.0001
+ LR_POLICY: steps_with_decay
+ BASE_LR: 0.0025
+ GAMMA: 0.1
+ MAX_ITER: 5000
+ STEPS: [0, 1000, 3000]
+ # Equivalent schedules with...
+ # 1 GPU:
+ # BASE_LR: 0.0025
+ # MAX_ITER: 60000
+ # STEPS: [0, 30000, 40000]
+ # 2 GPUs:
+ # BASE_LR: 0.005
+ # MAX_ITER: 30000
+ # STEPS: [0, 15000, 20000]
+ # 4 GPUs:
+ # BASE_LR: 0.01
+ # MAX_ITER: 15000
+ # STEPS: [0, 7500, 10000]
+ # 8 GPUs:
+ # BASE_LR: 0.02
+ # MAX_ITER: 7500
+ # STEPS: [0, 3750, 5000]
+FPN:
+ FPN_ON: True
+ MULTILEVEL_ROIS: True
+ MULTILEVEL_RPN: True
+FAST_RCNN:
+ ROI_BOX_HEAD: fast_rcnn_heads.add_roi_2mlp_head
+ ROI_XFORM_METHOD: RoIAlign
+ ROI_XFORM_RESOLUTION: 7
+ ROI_XFORM_SAMPLING_RATIO: 2
+TRAIN:
+ SNAPSHOT_ITERS: 1000
+ #WEIGHTS: https://s3-us-west-2.amazonaws.com/detectron/ImageNetPretrained/MSRA/R-50.pkl
+ WEIGHTS: https://s3-us-west-2.amazonaws.com/detectron/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl
+ DATASETS: ('voc_2007_train',)
+ SCALES: (500,)
+ MAX_SIZE: 833
+ BATCH_SIZE_PER_IM: 256
+ RPN_PRE_NMS_TOP_N: 2000 # Per FPN level
+TEST:
+ WEIGHTS: /home/roy/working/Detectron/experiments/output/train/voc_2007_train/generalized_rcnn/model_iter999.pkl
+ DATASETS: ('voc_2007_test',)
+ SCALES: (500,)
+ MAX_SIZE: 833
+ NMS: 0.5
+ RPN_PRE_NMS_TOP_N: 1000 # Per FPN level
+ RPN_POST_NMS_TOP_N: 1000
+OUTPUT_DIR: .
diff --git a/lib/core/config.py b/lib/core/config.py
index 08c7068f8..1e5ebe28e 100644
--- a/lib/core/config.py
+++ b/lib/core/config.py
@@ -483,7 +483,8 @@
# Use 'prof_dag' to get profiling statistics
__C.MODEL.EXECUTION_TYPE = b'dag'
-
+__C.MODEL.PROTOTXT = b''
+__C.MODEL.INIT_PROTOTXT = b''
# ---------------------------------------------------------------------------- #
# RetinaNet options
# ---------------------------------------------------------------------------- #
diff --git a/lib/core/test.py b/lib/core/test.py
index bf788afad..dc8c5e050 100644
--- a/lib/core/test.py
+++ b/lib/core/test.py
@@ -168,7 +168,7 @@ def im_detect_bbox(model, im, boxes=None):
if cfg.TEST.BBOX_REG:
# Apply bounding-box regression deltas
- box_deltas = workspace.FetchBlob(core.ScopedName('bbox_pred')).squeeze()
+ box_deltas = workspace.FetchBlob(core.ScopedName('bbox_pred_voc')).squeeze()
# In case there is 1 proposal
box_deltas = box_deltas.reshape([-1, box_deltas.shape[-1]])
if cfg.MODEL.CLS_AGNOSTIC_BBOX_REG:
diff --git a/lib/datasets/dataset_catalog.py b/lib/datasets/dataset_catalog.py
index 9a0f35dbb..66e0d997b 100644
--- a/lib/datasets/dataset_catalog.py
+++ b/lib/datasets/dataset_catalog.py
@@ -162,11 +162,11 @@
ANN_FN:
_DATA_DIR + '/coco/annotations/image_info_test-dev2015.json'
},
- 'voc_2007_trainval': {
+ 'voc_2007_train': {
IM_DIR:
_DATA_DIR + '/VOC2007/JPEGImages',
ANN_FN:
- _DATA_DIR + '/VOC2007/annotations/voc_2007_trainval.json',
+ _DATA_DIR + '/VOC2007/annotations/pascal_train2007.json',
DEVKIT_DIR:
_DATA_DIR + '/VOC2007/VOCdevkit2007'
},
@@ -174,7 +174,7 @@
IM_DIR:
_DATA_DIR + '/VOC2007/JPEGImages',
ANN_FN:
- _DATA_DIR + '/VOC2007/annotations/voc_2007_test.json',
+ _DATA_DIR + '/VOC2007/annotations/pascal_test2007.json',
DEVKIT_DIR:
_DATA_DIR + '/VOC2007/VOCdevkit2007'
},
diff --git a/lib/modeling/fast_rcnn_heads.py b/lib/modeling/fast_rcnn_heads.py
index e2cadc0c5..a338ef96c 100644
--- a/lib/modeling/fast_rcnn_heads.py
+++ b/lib/modeling/fast_rcnn_heads.py
@@ -46,7 +46,7 @@ def add_fast_rcnn_outputs(model, blob_in, dim):
"""Add RoI classification and bounding box regression output ops."""
model.FC(
blob_in,
- 'cls_score',
+ 'cls_score_voc',
dim,
model.num_classes,
weight_init=gauss_fill(0.01),
@@ -55,10 +55,10 @@ def add_fast_rcnn_outputs(model, blob_in, dim):
if not model.train: # == if test
# Only add softmax when testing; during training the softmax is combined
# with the label cross entropy loss for numerical stability
- model.Softmax('cls_score', 'cls_prob', engine='CUDNN')
+ model.Softmax('cls_score_voc', 'cls_prob', engine='CUDNN')
model.FC(
blob_in,
- 'bbox_pred',
+ 'bbox_pred_voc',
dim,
model.num_classes * 4,
weight_init=gauss_fill(0.001),
@@ -69,12 +69,12 @@ def add_fast_rcnn_outputs(model, blob_in, dim):
def add_fast_rcnn_losses(model):
"""Add losses for RoI classification and bounding box regression."""
cls_prob, loss_cls = model.net.SoftmaxWithLoss(
- ['cls_score', 'labels_int32'], ['cls_prob', 'loss_cls'],
+ ['cls_score_voc', 'labels_int32'], ['cls_prob', 'loss_cls'],
scale=model.GetLossScale()
)
loss_bbox = model.net.SmoothL1Loss(
[
- 'bbox_pred', 'bbox_targets', 'bbox_inside_weights',
+ 'bbox_pred_voc', 'bbox_targets', 'bbox_inside_weights',
'bbox_outside_weights'
],
'loss_bbox',
diff --git a/old-README.md b/old-README.md
new file mode 100644
index 000000000..34b658758
--- /dev/null
+++ b/old-README.md
@@ -0,0 +1,111 @@
+# Detectron
+
+Detectron is Facebook AI Research's software system that implements state-of-the-art object detection algorithms, including [Mask R-CNN](https://arxiv.org/abs/1703.06870). It is written in Python and powered by the [Caffe2](https://github.com/caffe2/caffe2) deep learning framework.
+
+At FAIR, Detectron has enabled numerous research projects, including: [Feature Pyramid Networks for Object Detection](https://arxiv.org/abs/1612.03144), [Mask R-CNN](https://arxiv.org/abs/1703.06870), [Detecting and Recognizing Human-Object Interactions](https://arxiv.org/abs/1704.07333), [Focal Loss for Dense Object Detection](https://arxiv.org/abs/1708.02002), [Non-local Neural Networks](https://arxiv.org/abs/1711.07971), [Learning to Segment Every Thing](https://arxiv.org/abs/1711.10370), and [Data Distillation: Towards Omni-Supervised Learning](https://arxiv.org/abs/1712.04440).
+
+
+

+
Example Mask R-CNN output.
+
+
+## Introduction
+
+The goal of Detectron is to provide a high-quality, high-performance
+codebase for object detection *research*. It is designed to be flexible in order
+to support rapid implementation and evaluation of novel research. Detectron
+includes implementations of the following object detection algorithms:
+
+- [Mask R-CNN](https://arxiv.org/abs/1703.06870) -- *Marr Prize at ICCV 2017*
+- [RetinaNet](https://arxiv.org/abs/1708.02002) -- *Best Student Paper Award at ICCV 2017*
+- [Faster R-CNN](https://arxiv.org/abs/1506.01497)
+- [RPN](https://arxiv.org/abs/1506.01497)
+- [Fast R-CNN](https://arxiv.org/abs/1504.08083)
+- [R-FCN](https://arxiv.org/abs/1605.06409)
+
+using the following backbone network architectures:
+
+- [ResNeXt{50,101,152}](https://arxiv.org/abs/1611.05431)
+- [ResNet{50,101,152}](https://arxiv.org/abs/1512.03385)
+- [Feature Pyramid Networks](https://arxiv.org/abs/1612.03144) (with ResNet/ResNeXt)
+- [VGG16](https://arxiv.org/abs/1409.1556)
+
+Additional backbone architectures may be easily implemented. For more details about these models, please see [References](#references) below.
+
+## License
+
+Detectron is released under the [Apache 2.0 license](https://github.com/facebookresearch/detectron/blob/master/LICENSE). See the [NOTICE](https://github.com/facebookresearch/detectron/blob/master/NOTICE) file for additional details.
+
+## Citing Detectron
+
+If you use Detectron in your research or wish to refer to the baseline results published in the [Model Zoo](MODEL_ZOO.md), please use the following BibTeX entry.
+
+```
+@misc{Detectron2018,
+ author = {Ross Girshick and Ilija Radosavovic and Georgia Gkioxari and
+ Piotr Doll\'{a}r and Kaiming He},
+ title = {Detectron},
+ howpublished = {\url{https://github.com/facebookresearch/detectron}},
+ year = {2018}
+}
+```
+
+## Model Zoo and Baselines
+
+We provide a large set of baseline results and trained models available for download in the [Detectron Model Zoo](MODEL_ZOO.md).
+
+## Installation
+
+Please find installation instructions for Caffe2 and Detectron in [`INSTALL.md`](INSTALL.md).
+
+## Quick Start: Using Detectron
+
+After installation, please see [`GETTING_STARTED.md`](GETTING_STARTED.md) for brief tutorials covering inference and training with Detectron.
+
+## Getting Help
+
+To start, please check the [troubleshooting](INSTALL.md#troubleshooting) section of our installation instructions as well as our [FAQ](FAQ.md). If you couldn't find help there, try searching our GitHub issues. We intend the issues page to be a forum in which the community collectively troubleshoots problems.
+
+If bugs are found, **we appreciate pull requests** (including adding Q&A's to `FAQ.md` and improving our installation instructions and troubleshooting documents). Please see [CONTRIBUTING.md](CONTRIBUTING.md) for more information about contributing to Detectron.
+
+## References
+
+- [Data Distillation: Towards Omni-Supervised Learning](https://arxiv.org/abs/1712.04440).
+ Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari, and Kaiming He.
+ Tech report, arXiv, Dec. 2017.
+- [Learning to Segment Every Thing](https://arxiv.org/abs/1711.10370).
+ Ronghang Hu, Piotr Dollár, Kaiming He, Trevor Darrell, and Ross Girshick.
+ Tech report, arXiv, Nov. 2017.
+- [Non-Local Neural Networks](https://arxiv.org/abs/1711.07971).
+ Xiaolong Wang, Ross Girshick, Abhinav Gupta, and Kaiming He.
+ Tech report, arXiv, Nov. 2017.
+- [Mask R-CNN](https://arxiv.org/abs/1703.06870).
+ Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick.
+ IEEE International Conference on Computer Vision (ICCV), 2017.
+- [Focal Loss for Dense Object Detection](https://arxiv.org/abs/1708.02002).
+ Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár.
+ IEEE International Conference on Computer Vision (ICCV), 2017.
+- [Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour](https://arxiv.org/abs/1706.02677).
+ Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He.
+ Tech report, arXiv, June 2017.
+- [Detecting and Recognizing Human-Object Interactions](https://arxiv.org/abs/1704.07333).
+ Georgia Gkioxari, Ross Girshick, Piotr Dollár, and Kaiming He.
+ Tech report, arXiv, Apr. 2017.
+- [Feature Pyramid Networks for Object Detection](https://arxiv.org/abs/1612.03144).
+ Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie.
+ IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
+- [Aggregated Residual Transformations for Deep Neural Networks](https://arxiv.org/abs/1611.05431).
+ Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He.
+ IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
+- [R-FCN: Object Detection via Region-based Fully Convolutional Networks](http://arxiv.org/abs/1605.06409).
+ Jifeng Dai, Yi Li, Kaiming He, and Jian Sun.
+ Conference on Neural Information Processing Systems (NIPS), 2016.
+- [Deep Residual Learning for Image Recognition](http://arxiv.org/abs/1512.03385).
+ Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun.
+ IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
+- [Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks](http://arxiv.org/abs/1506.01497)
+ Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun.
+ Conference on Neural Information Processing Systems (NIPS), 2015.
+- [Fast R-CNN](http://arxiv.org/abs/1504.08083).
+ Ross Girshick.
+ IEEE International Conference on Computer Vision (ICCV), 2015.