Add Dockerfile and README.md to help development #35

Open: wants to merge 6 commits into base: master
Changes from 2 commits
26 changes: 26 additions & 0 deletions docker/Dockerfile
@@ -0,0 +1,26 @@
FROM ubuntu

RUN apt-get update \
&& apt-get install -y \
apt-utils \
curl \
wget \
nano \
libsm6 \
libxrender1 \
libxext6 \
ghostscript \
python3-minimal \
python3-setuptools \
python3-pip \
&& ln -s /usr/bin/python3 /usr/bin/python \
&& ln -s /usr/bin/pip3 /usr/bin/pip

RUN pip install excalibur-py[dev]

EXPOSE 5000

ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8

WORKDIR /excalibur/
54 changes: 54 additions & 0 deletions docker/README.md
@@ -0,0 +1,54 @@
<p align="center">
<img src="https://raw.githubusercontent.com/camelot-dev/excalibur/master/docs/_static/excalibur-logo.png" width="200">
</p>

# Excalibur: Docker
This Docker configuration lets you run Excalibur without installing any dependencies on your machine!<br/>
OK, any except `docker`.

## Prerequisites

As stated, the only thing you need is `docker`.

Follow the instructions on [Install Docker](https://docs.docker.com/engine/installation/) for your environment if you haven't got `docker` already.
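
Before building, it can help to confirm that `docker` is actually on your `PATH`. The snippet below is a minimal sketch; `require_cmd` is a hypothetical helper name used only for illustration and is not part of Excalibur or Docker.

```shell
# Hypothetical helper (not part of Excalibur or Docker): succeed only if the
# named command can be found on PATH, otherwise print a hint to stderr.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "error: '$1' not found on PATH; install it first" >&2
  return 1
}

# Usage (run manually): require_cmd docker && docker --version
```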

## Usage

### Prepare the image

Switch to the `docker` directory and run `docker build -t excalibur .` (don't forget the final `.`) to build your Docker image. This may take some time but is only required once, or again whenever you change something in the `Dockerfile`.

After the process finishes, you will have an `excalibur` image that will be the base for your experiments. You can confirm this by checking the output of the `docker images` command.

### Run the container

From your project folder, run `docker run -it -p 5000:5000 -v $(pwd):/excalibur/ excalibur /bin/bash`.
This will start the container and open a Bash shell inside it.
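
If your project path contains spaces, the unquoted `$(pwd)` in the `docker run` command will be split into several arguments. A possible workaround is a small wrapper that quotes the path. This is only a sketch: `run_excalibur` and the `DRY_RUN` variable are hypothetical names introduced here, and the image name `excalibur` assumes you built the image with `-t excalibur`.

```shell
# Hypothetical wrapper (names are illustrative, not part of Excalibur).
# Mounts the given directory (default: current directory) at /excalibur/
# and publishes container port 5000 on host port 5000.
run_excalibur() {
  project_dir=${1:-$(pwd)}
  # Build the argument list; quoting keeps paths with spaces intact.
  set -- docker run -it -p 5000:5000 -v "$project_dir":/excalibur/ excalibur /bin/bash
  if [ "${DRY_RUN:-0}" = 1 ]; then
    # Only print the command instead of executing it.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage (run manually): run_excalibur            # mounts the current directory
#                       DRY_RUN=1 run_excalibur  # prints the command only
```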

At this point you need to initialize the metadata database using:

<pre>
$ excalibur initdb
</pre>

Once initialized, you need to enable connectivity from outside the container:

Use nano to open the config file ...

<pre>
$ nano /root/excalibur/excalibur.cfg
</pre>

... and modify the `[webserver]` section as follows:

<pre>
web_server_host = 0.0.0.0
</pre>

Review comments on the `web_server_host = 0.0.0.0` line:

Comment: For production use, this should probably include instructions on using a WSGI server or similar.

Author: Good point. The initial idea behind having the Docker container is for development purposes, but I agree that if we want to use the container for deployment, this may need to be addressed.

@zainozzaini (Feb 11, 2020): Is there any instruction for WSGI setup?
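
The `web_server_host` change can also be scripted instead of made by hand in nano. The following is a minimal sketch, assuming the config file is INI-style with a `[webserver]` section header on its own line and an existing `web_server_host = ...` entry inside it. `set_webserver_host` is a hypothetical helper name, not an Excalibur command.

```shell
# Hypothetical helper (not part of Excalibur): rewrite web_server_host inside
# the [webserver] section of an INI-style file, leaving other sections alone.
# Assumes the key already exists in that section.
set_webserver_host() {
  cfg=$1
  host=$2
  tmp=$(mktemp) || return 1
  if awk -v host="$host" '
    # Track whether the current line belongs to the [webserver] section.
    /^\[/ { in_section = ($0 == "[webserver]") }
    # Replace the key only while inside that section.
    in_section && /^web_server_host *=/ { print "web_server_host = " host; next }
    { print }
  ' "$cfg" > "$tmp"; then
    # Write back through the original file to keep its permissions.
    cat "$tmp" > "$cfg"
    status=$?
  else
    status=1
  fi
  rm -f "$tmp"
  return "$status"
}

# Usage (run manually inside the container):
#   set_webserver_host /root/excalibur/excalibur.cfg 0.0.0.0
```

Binding to `0.0.0.0` makes the server listen on all interfaces inside the container, which is what lets the published port `5000` reach it from the host.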

And then start the webserver using:

<pre>
$ excalibur webserver
</pre>

That's it! Now you can go to http://localhost:5000 and start extracting tabular data from your PDFs.