Issue/keras label whitelist (#95)
* keras-pretrained analyser only provides results for labels provided in whitelist

* rename examples

* allow multiple mtriage processes to run at once

* correct examples

* update keras_pretrained with thresh

* fmt

* correct labels, name mtriage containers correct

* update readme

* update

* 💄
breezykermo authored Jul 22, 2019
1 parent b99b697 commit fdd51bf
Showing 53 changed files with 307 additions and 226 deletions.
87 changes: 31 additions & 56 deletions README.md
Original file line number Diff line number Diff line change
@@ -2,79 +2,54 @@

[![Build Status](https://travis-ci.com/forensic-architecture/mtriage.svg?branch=master)](https://travis-ci.com/forensic-architecture/mtriage)

### NB: currently unstable, in active development, and should not be used in production
##### select, download, and analyse media

##### scrape and analyse media on the web
mtriage is a command-line application that can be used to scrape and analyse
media. mtriage is developed by [Forensic Architecture](https://forensic-architecture.org), and is intended for use
by open source research agencies, journalists, and activists.

mtriage is a command-line application that scrapes and analyses public domain media. mtriage is developed by [Forensic Architecture](https://forensic-architecture.org), and is intended for use by open source research agencies, journalists, and activists.

mtriage is a framework that orchestrates two different kinds of components:
mtriage consists of two types of components:

* **selectors**: to search for and download media from various platforms.
* **analysers**: to derive data from media that has been retrieved by
a selector.
* **analysers**: to derive data from media that has been retrieved by a selector.

The following components are currently supported. If you are interested in
helping us to develop additional selectors and analysers, please consider
joining [the conversation on Discord](https://discord.gg/FJ4XsCg).

### selectors
* youtube - search and download via the [v3 API](https://developers.google.com/youtube/v3/).
* local - use media that already exists on your filesystem.

### analysers
* frames - extract one frame for each second from a video.
* ocr - analyse an image using [Google Cloud Platform](https://cloud.google.com/vision/docs/ocr).
<!-- * pytorch - run inference with a [PyTorch](https://pytorch.org/) model on an image. -->
* frames - extract frames from videos as images.
* yolov3 - detect and classify objects in images using [YoloV3](https://pjreddie.com/darknet/yolo/) trained on [ImageNet](http://www.image-net.org/) classes.
* keras_pretrained - classify objects in images using [Resnet50 trained on
ImageNet](https://resources.wolframcloud.com/NeuralNetRepository/resources/ResNet-50-Trained-on-ImageNet-Competition-Data).
* ocr - analyse an image using optical character recognition from [Google Cloud Platform](https://cloud.google.com/vision/docs/ocr).


## development
## setup
mtriage is currently in active development, and is not yet packaged in any way.
It uses [Docker](https://www.docker.com/products/docker-desktop) to manage
dependencies, and is written in Python.
It uses [Docker](https://www.docker.com/products/docker-desktop) to manage dependencies, which you will need to download to ensure mtriage works as expected.

### dependencies
- Docker Desktop (Mac installation [here](https://docs.docker.com/v17.12/docker-for-mac/install/), Ubuntu installation [here](https://docs.docker.com/v17.12/install/linux/docker-ce/ubuntu/)).
- [docker](https://docs.docker.com/install/) (python library, v3.5.0)

Follow the instructions relevant to your operating system to install Docker CE,
and then install the python dependency with:

```bash
python -m pip install -r requirements.txt
```

(Note that mtriage was developed using Python 3, but you should be able to run it with 2.x as well.)

### configuration setup
Selectors and analysers often rely on private credentials such as API keys. mtriage deals with these in two ways:
Follow the instructions relevant to your operating system to install Docker CE.

* **`.env` file at the top level**: contains API keys and other environment variables, which are made available when
mtriage is running.
* **`credentials` folder**: in some cases, components require JSON configs, such as for GCP service accounts. mtriage
currently deals with this by adding a path to the credentials file in `.env`, and adding the credential file itself
in the `credentials` folder.
You also need to ensure that a version of [Python](https://www.python.org/downloads/) is installed on your computer.
Most modern operating systems have a version installed by default.

The specific configuration steps depend on which components you intend to use. For every component you wish to use, run
through its setup:
### additional setup
Depending on which components you intend to use, additional setup may be
required. Be sure to read the documentation for each component you wish to use.

##### selectors
* [youtube](docs/config/youtube.md)
##### analysers
* frames
* ocr

### running
You can run mtriage in a Docker container with:
## Run
Once you have Docker and Python installed, you can run mtriage using one of the
examples provided. From this folder:
```bash
python run.py develop
./mtriage run examples/_demo/youtube.yaml
```

Selectors and analysers are currently specified as runtime arguments to the
entrypoint script, "src/run.py". In "scripts" you can find a series of example
bash scripts that construct appropriate arguments and execute them.

A more robust interface for passing options is a work in progress.


### building locally
You can build the mtriage image locally via run.py as well:
```bash
python run.py build
```

The first time you run mtriage, it downloads the necessary Docker images to
your system, which may take several minutes. Subsequent runs will be much
faster.
File renamed without changes.
@@ -1,4 +1,4 @@
folder: "temp/tear_gas_turkey"
folder: "temp/demo"
phase: "analyse"
module: "frames"
config:
Expand Down
@@ -1,7 +1,11 @@
folder: "temp/military_vehicles_ambaz"
folder: "temp/demo"
phase: "analyse"
module: "keras_pretrained"
config:
elements_in:
- "youtube/frames"
model: "ResNet50"
labels:
- "tank"
- "rifle"
- "military uniform"
File renamed without changes.
2 changes: 1 addition & 1 deletion examples/an-meta.yaml → examples/_demo/meta.yaml
@@ -1,4 +1,4 @@
folder: "temp/demo_output"
folder: "temp/demo"
phase: "analyse"
module: "meta"
config:
Expand Down
2 changes: 1 addition & 1 deletion examples/an-ocr.yaml → examples/_demo/ocr.yaml
@@ -1,4 +1,4 @@
folder: "temp/demo_output"
folder: "temp/demo"
phase: "analyse"
module: "ocr"
config:
Expand Down
2 changes: 1 addition & 1 deletion examples/resnet_demo.yaml → examples/_demo/resnet.yaml
@@ -1,4 +1,4 @@
folder: "temp/demo_output"
folder: "temp/demo"
phase: "analyse"
module: "keras_pretrained"
config:
Expand Down
File renamed without changes.
File renamed without changes.
8 changes: 8 additions & 0 deletions examples/ambaz/1.select.yaml
@@ -0,0 +1,8 @@
folder: temp/ambaz
phase: select
module: youtube
config:
search_term: ambazonia
uploaded_before: 2019-06-01T00:00:00Z
uploaded_after: 2018-06-01T00:00:00Z
daily: false
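
Every example file in this commit shares the same top-level shape: `folder`, `phase`, `module`, and a nested `config`. The following is a minimal sketch of a sanity check for that shape, assuming the YAML has already been parsed into a dict. The names `REQUIRED_KEYS`, `KNOWN_PHASES`, and `check_config` are hypothetical and not part of mtriage, and the set of phases is inferred only from the examples shown here.

```python
# Hypothetical helper: not mtriage code, just a check on the shape of the examples.
REQUIRED_KEYS = ("folder", "phase", "module", "config")
KNOWN_PHASES = ("select", "analyse")  # assumption: the only phases seen in this commit


def check_config(cfg):
    """Return a list of problems found in a parsed example config."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in cfg]
    phase = cfg.get("phase")
    if phase is not None and phase not in KNOWN_PHASES:
        problems.append(f"unknown phase: {phase}")
    return problems


# Mirrors examples/ambaz/1.select.yaml as a plain dict.
select_cfg = {
    "folder": "temp/ambaz",
    "phase": "select",
    "module": "youtube",
    "config": {
        "search_term": "ambazonia",
        "uploaded_before": "2019-06-01T00:00:00Z",
        "uploaded_after": "2018-06-01T00:00:00Z",
        "daily": False,
    },
}
ok_problems = check_config(select_cfg)

# A deliberately broken config: misspelled phase and no `config` block.
bad_problems = check_config({"folder": "temp/ambaz", "phase": "analyze", "module": "frames"})
print(ok_problems, bad_problems)
```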
10 changes: 10 additions & 0 deletions examples/ambaz/2.frames.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
folder: temp/ambaz
phase: analyse
module: frames
config:
elements_in:
- youtube
method: opencv
change_threshold: 1e-5
sequential: false
fps: 1
28 changes: 28 additions & 0 deletions examples/ambaz/3.resnet.yaml
@@ -0,0 +1,28 @@
folder: temp/ambaz
phase: analyse
module: keras_pretrained
config:
elements_in:
- youtube/frames
model: ResNet50
labels:
- tank
- rifle
- military_uniform
- assault_rifle
- assault rifle
- assault_gun
- revolver
- bulletproof_vest
- bomb
- bomblet
- chemical_bomb
- pipe_bomb
- grenade
- munition
- gunnery
- case_shot
- canister
- tank_shell
- artillery_shell
- shrapnel
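
The commit message says the keras-pretrained analyser now reports results only for whitelisted labels, with a threshold. The following is a minimal sketch of that kind of filtering, not the analyser's actual implementation. The function name `filter_predictions`, the `threshold` parameter, and the `(label, score)` prediction format are all assumptions. Treating spaces and underscores as equivalent is also an assumption, prompted by the list above containing both `assault_rifle` and `assault rifle`.

```python
def filter_predictions(predictions, whitelist, threshold=0.0):
    """Keep only (label, score) pairs whose label is whitelisted and whose
    score is at least `threshold`. Hypothetical helper, not mtriage code."""

    def normalise(label):
        # Assumption: "assault rifle" and "assault_rifle" name the same class.
        return label.strip().lower().replace(" ", "_")

    allowed = {normalise(label) for label in whitelist}
    return [
        (label, score)
        for label, score in predictions
        if normalise(label) in allowed and score >= threshold
    ]


# Hypothetical classifier output for one frame: (label, confidence) pairs.
frame_predictions = [("tank", 0.81), ("golden_retriever", 0.64), ("rifle", 0.12)]
kept = filter_predictions(
    frame_predictions,
    whitelist=["tank", "rifle", "military uniform"],
    threshold=0.5,
)
print(kept)
```

Here `golden_retriever` is dropped because it is not whitelisted, and `rifle` is dropped because its score falls below the threshold.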
6 changes: 6 additions & 0 deletions examples/ambaz/4.rank.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
folder: temp/ambaz
phase: analyse
module: ranking
config:
elements_in:
- youtube/keras_pretrained
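
The numeric prefixes on the files under `examples/ambaz` (`1.select.yaml` through `4.rank.yaml`) suggest they are meant to be run in order. The sketch below builds one command per step, assuming the `./mtriage run <file>` form from the README accepts each file in turn. The helper `build_commands` is hypothetical and not part of mtriage.

```python
from pathlib import PurePosixPath


def build_commands(yaml_paths):
    """Return one `./mtriage run` command per config, ordered by numeric prefix."""

    def step_number(path):
        # "1.select.yaml" -> 1; the leading number is taken as the pipeline step.
        return int(PurePosixPath(path).name.split(".", 1)[0])

    return [["./mtriage", "run", path] for path in sorted(yaml_paths, key=step_number)]


commands = build_commands([
    "examples/ambaz/3.resnet.yaml",
    "examples/ambaz/1.select.yaml",
    "examples/ambaz/4.rank.yaml",
    "examples/ambaz/2.frames.yaml",
])
for cmd in commands:
    # Printed rather than executed; to run a step, pass `cmd` to
    # subprocess.run(cmd, check=True) from the repository root.
    print(" ".join(cmd))
```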
10 changes: 0 additions & 10 deletions examples/an-frames.yaml

This file was deleted.

7 changes: 0 additions & 7 deletions examples/an-keras-pretrained.yaml

This file was deleted.

8 changes: 0 additions & 8 deletions examples/chase_10e_mexico.yaml

This file was deleted.

8 changes: 0 additions & 8 deletions examples/chase_10e_turkey.yaml

This file was deleted.

8 changes: 0 additions & 8 deletions examples/chase_10e_venezuela.yaml

This file was deleted.

9 changes: 0 additions & 9 deletions examples/chase_generic.yaml

This file was deleted.

10 changes: 0 additions & 10 deletions examples/frames_mv_ambaz.yaml

This file was deleted.

10 changes: 0 additions & 10 deletions examples/frames_mv_ukraine.yaml

This file was deleted.

10 changes: 0 additions & 10 deletions examples/frames_tg_mexico.yaml

This file was deleted.

10 changes: 0 additions & 10 deletions examples/frames_tg_venezuela.yaml

This file was deleted.

7 changes: 0 additions & 7 deletions examples/imgnet-ukraine.yaml

This file was deleted.

8 changes: 8 additions & 0 deletions examples/mexico/1.select.yaml
@@ -0,0 +1,8 @@
folder: temp/mexico
phase: select
module: youtube
config:
search_term: caravanamigrante
uploaded_before: 2019-12-05T00:00:00Z
uploaded_after: 2018-11-20T00:00:00Z
daily: false
10 changes: 10 additions & 0 deletions examples/mexico/2.frames.yaml
@@ -0,0 +1,10 @@
folder: temp/mexico
phase: analyse
module: frames
config:
elements_in:
- youtube
method: opencv
change_threshold: 1e-5
sequential: false
fps: 1
11 changes: 11 additions & 0 deletions examples/mexico/3.fastai1.yaml
@@ -0,0 +1,11 @@
folder: temp/mexico
phase: analyse
module: fastai_model
config:
elements_in:
- youtube/frames
path: data/generic-chaser.pkl
labels:
- other
- teargas
- chaser_validation_set
11 changes: 11 additions & 0 deletions examples/mexico/3.fastai2.yaml
@@ -0,0 +1,11 @@
folder: temp/mexico
phase: analyse
module: fastai_model
config:
elements_in:
- youtube/frames
path: data/canister_10e.pkl
labels:
- other
- teargas
- triplechaser_synth
6 changes: 6 additions & 0 deletions examples/mexico/4.rank.yaml
@@ -0,0 +1,6 @@
folder: temp/tear_gas_mexico
phase: analyse
module: ranking
config:
elements_in:
- youtube/fastai_model
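
Note that `4.rank.yaml` declares `folder: temp/tear_gas_mexico`, while the other files under `examples/mexico` declare `temp/mexico`. If each step reads what earlier steps wrote into the same folder (an assumption; the diff does not say), this mismatch may be unintended. The sketch below shows one way such inconsistencies could be surfaced. The helper `folders_by_step` is hypothetical, and the dict simply mirrors the `folder` values shown in this commit.

```python
def folders_by_step(configs):
    """Group step file names by the `folder` each one declares."""
    grouped = {}
    for name, cfg in configs.items():
        grouped.setdefault(cfg["folder"], []).append(name)
    return grouped


# `folder` values copied from the examples/mexico files in this commit.
mexico = {
    "1.select.yaml": {"folder": "temp/mexico"},
    "2.frames.yaml": {"folder": "temp/mexico"},
    "3.fastai1.yaml": {"folder": "temp/mexico"},
    "3.fastai2.yaml": {"folder": "temp/mexico"},
    "4.rank.yaml": {"folder": "temp/tear_gas_mexico"},
}

grouped = folders_by_step(mexico)
if len(grouped) > 1:
    # More than one folder in a single example directory is worth a second look.
    for folder, names in grouped.items():
        print(folder, names)
```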
8 changes: 0 additions & 8 deletions examples/military_vehicles_ambaz.yaml

This file was deleted.

8 changes: 0 additions & 8 deletions examples/military_vehicles_ukraine.yaml

This file was deleted.

6 changes: 0 additions & 6 deletions examples/ranking.yaml

This file was deleted.

8 changes: 0 additions & 8 deletions examples/tear_gas_mexico.yaml

This file was deleted.
