# Sign Language-to-Speech with DeepStack's Custom API

This project is an end-to-end working prototype that uses artificial intelligence to detect sign language meanings
in images/videos and generate an equivalent, realistic voice for the words communicated by the sign language.

## Steps to run the project
### 1. Install DeepStack using Docker. (Skip this if you already have DeepStack installed)
- Docker needs to be installed first. Mac OS and Windows users can install Docker from
[Docker's website](https://www.docker.com/products/docker-desktop).
- To install on a Linux OS, run the command below:

```
sudo apt-get update && sudo apt-get install docker.io
```
- Install DeepStack. *You might want to grab a coffee while waiting for this to finish :smirk:*
```
docker pull deepquestai/deepstack
```
- Test DeepStack.
```
docker run -e VISION-SCENE=True -v localstorage:/datastore -p 80:5000 deepquestai/deepstack
```
**NOTE:** This works for the CPU variant only. To explore the other installation options, check the
[official tutorial](https://docs.deepstack.cc/#installation-guide-for-cpu-version).

> **Error starting userland proxy: listen tcp4 0.0.0.0:80: bind: address already in use.**

If you come across this error, change `-p 80:5000` to another port, e.g., `-p 88:5000`
(I'm using **88**, too :blush:).
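Once the container is running, you can quickly confirm that DeepStack is reachable on the port you chose. This is a minimal sketch, assuming the server was started with `-p 88:5000` and that the `requests` package is available; adjust the port to match your own setup.

```
import requests

# Hypothetical check, not part of this repo: DeepStack assumed to be on port 88.
response = requests.get("http://localhost:88")
print(response.status_code)  # 200 means the server is up and reachable
```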
### 2. Clone the Project Repository and Install Dependencies
- To clone this repo, copy and run the first command below in your terminal, then change into the new directory with the second.
```
git clone https://github.com/SteveKola/Sign-Language-to-Speech-with-DeepStack-Custom-API.git
cd Sign-Language-to-Speech-with-DeepStack-Custom-API
```
- To avoid potential *dependency hell*, create a virtual environment and
activate it afterwards.
```
python3 -m venv env
source env/bin/activate
```
- Install the dependencies using `pip install -r requirements`.
- If you are on a Linux OS, TTS engines might not be pre-installed on your platform. Use the command below to install them; a quick test of the speech engine follows this list.
```
sudo apt-get update && sudo apt-get install espeak ffmpeg libespeak1
```
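If you want to confirm that the speech engine works after installing espeak, a quick test like the one below helps. This is only a sketch and assumes `pyttsx3` as the TTS wrapper (it uses espeak on Linux); the project's actual TTS library may differ.

```
import pyttsx3  # assumption: offline TTS wrapper backed by espeak on Linux

engine = pyttsx3.init()
engine.say("DeepStack sign-to-speech test")
engine.runAndWait()  # blocks until the sentence has been spoken aloud
```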
### 3. Spin up the DeepStack custom model's server.
- While still in the project's root directory, spin up the DeepStack custom model's server by running the command below; a quick way to check that the custom endpoint responds is sketched right after it.
```
sudo docker run -v ~/Downloads/ObjectDetection/models:/modelstore/detection -p 88:5000 deepquestai/deepstack
```
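With the custom model server running, DeepStack exposes the model at `/v1/vision/custom/<model-name>`, where `<model-name>` is the filename of the `.pt` model inside the mounted folder. The sketch below is only an illustration; the model name `best`, the port `88`, and the test image are assumptions, not values taken from this repo.

```
import requests

# Hypothetical check: model file assumed to be best.pt, server assumed on port 88.
with open("test_sign.jpg", "rb") as image_file:
    response = requests.post(
        "http://localhost:88/v1/vision/custom/best",
        files={"image": image_file},
    )

# Each prediction carries the sign's label, a confidence score, and box coordinates.
for prediction in response.json().get("predictions", []):
    print(prediction["label"], prediction["confidence"],
          prediction["x_min"], prediction["y_min"],
          prediction["x_max"], prediction["y_max"])
```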
### 4. Detect sign language meanings in image files and generate a realistic voice of the words.
- Run the image_detection script on the image:
```
python image_detection.py image_filename.file_extension
```
My default port number is 88. To specify the port on which the DeepStack server is running, run this instead:
```
python image_detection.py image_filename.file_extension --deepstack-port port_number
```
Running the above command returns two new files in your project's root directory (a rough sketch of these two steps follows the list):

1. a copy of the image with a bounding box around the detected sign and its meaning on top of the box,
2. an audio file of the detected sign's meaning.
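For intuition, those two outputs correspond to two small operations: drawing the returned bounding box and label on a copy of the image, and writing the detected word to an audio file. The sketch below is purely illustrative (it is not the repository's `image_detection.py`); it assumes OpenCV and `pyttsx3`, and uses a made-up prediction in DeepStack's response shape so it runs without the server.

```
import cv2        # assumption: OpenCV used for drawing on the image
import pyttsx3    # assumption: offline TTS backed by espeak

# Made-up prediction in DeepStack's response shape, for illustration only.
prediction = {"label": "hello", "x_min": 40, "y_min": 60, "x_max": 220, "y_max": 260}

image = cv2.imread("test_sign.jpg")
cv2.rectangle(image, (prediction["x_min"], prediction["y_min"]),
              (prediction["x_max"], prediction["y_max"]), (0, 255, 0), 2)
cv2.putText(image, prediction["label"],
            (prediction["x_min"], prediction["y_min"] - 10),
            cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
cv2.imwrite("test_sign_detected.jpg", image)  # output 1: annotated copy of the image

engine = pyttsx3.init()
engine.save_to_file(prediction["label"], "test_sign_speech.wav")  # output 2: audio of the sign's meaning
engine.runAndWait()
```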
### 5. Detect sign language meanings on a live video (via webcam).
- Run the livefeed detection script:
```
python livefeed_detection.py
```
My default port number is 88. To specify the port on which the DeepStack server is running, run this instead:
```
python livefeed_detection.py --deepstack-port port_number
```
This will spin up the webcam and automatically detect any sign language words in view of the camera,
while returning each sign's meaning and its speech equivalent immediately.
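Roughly, the live-feed script repeats the same detection call on webcam frames in a loop. Again, this is only a simplified sketch under the same assumptions (OpenCV, port 88, a custom model named `best`), not the repository's `livefeed_detection.py`.

```
import cv2
import requests

capture = cv2.VideoCapture(0)            # open the default webcam
while True:
    grabbed, frame = capture.read()
    if not grabbed:
        break

    # Encode the frame as JPEG and send it to the (assumed) custom endpoint.
    _, encoded = cv2.imencode(".jpg", frame)
    response = requests.post("http://localhost:88/v1/vision/custom/best",
                             files={"image": encoded.tobytes()})

    # Overlay each detected sign and its meaning on the frame.
    for prediction in response.json().get("predictions", []):
        cv2.rectangle(frame, (prediction["x_min"], prediction["y_min"]),
                      (prediction["x_max"], prediction["y_max"]), (0, 255, 0), 2)
        cv2.putText(frame, prediction["label"],
                    (prediction["x_min"], prediction["y_min"] - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)

    cv2.imshow("Sign detection", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
        break

capture.release()
cv2.destroyAllWindows()
```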
## Additional Notes
- **This project has been built and tested successfully on a Linux machine. Errors might arise on other operating systems
which have not been accounted for in this documentation.**
- The dataset used in training the model was created via my webcam using an automation script.
[scripts/creating_data.py](https://github.com/SteveKola/Sign-Language-to-Speech-with-DeepStack-Custom-API/blob/main/scripts/creating_data.py)
is the script used.
- My dataset can be found in [this repository](https://github.com/SteveKola/Sign-Language-to-Speech-with-DeepStack-Custom-API/tree/main/scripts).
The repo contains both the DeepStack model's data and the TensorFlow Object Detection API's data (I did that about a month before this).
- The dataset was annotated in YOLO format using [LabelImg](https://github.com/tzutalin/labelImg); a sample label line is shown after this list.
- The model was trained using a Colab GPU.
[scripts/model_training_deepstack.ipynb](https://github.com/SteveKola/Sign-Language-to-Speech-with-DeepStack-Custom-API/blob/main/scripts/model_training_deepstack.ipynb)
is the notebook used for that purpose.
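For reference, a YOLO-format label is a plain-text file with one line per bounding box: a class index followed by the box's center x, center y, width, and height, all normalized to the 0-1 range. A hypothetical line for a sign roughly centered in the frame could look like this:

```
0 0.48 0.52 0.31 0.40
```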
## Attributions
100+
- The [DeepStack custom models' official docs](https://docs.deepstack.cc/custom-models/) contains everything that'd be
101+
needed to replicate the whole building process. It is lean and concise.
102+
- A big **thank you** to [Patrick Ryan](https://github.com/youngsoul) for making it seem like
103+
the project is not too herculean in his [article](https://docs.deepstack.cc/custom-models/).
104+
- I got my first introduction to DeepStack's custom models with this
105+
[article](https://medium.com/deepquestai/detect-any-custom-object-with-deepstack-dd0a824a761e).
106+
Having built few with TensorFlow, I can't appreciate this enough.
