Add Dockerfile to simplify installation #93

Open: wants to merge 2 commits into base: master

13 changes: 13 additions & 0 deletions Dockerfile
@@ -0,0 +1,13 @@
FROM python:3.4-alpine
WORKDIR /app
COPY ./images /app/images
COPY ./libgrabsite /app/libgrabsite
COPY ./grab-site ./gs-dump-urls ./gs-server ./setup.py /app/
RUN apk add --update build-base libffi-dev && \
pip3 install ./ && \
apk del --purge build-base libffi-dev && \
rm -R /root/.cache
VOLUME ["/data"]
WORKDIR /data
EXPOSE 29000
CMD ["python", "/app/gs-server"]
22 changes: 22 additions & 0 deletions README.md
@@ -34,6 +34,7 @@ Note: grab-site currently **does not work with Python 3.5**; please use Python 3
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
Contributor commented: This comment is a lie, sorry. I've been updating this TOC manually and probably don't want the Tips for specific websites expanded.

Author replied: ok, removed

**Contents**

- [Install with Docker](#install-with-docker)
- [Install on Ubuntu 14.04 - 15.10](#install-on-ubuntu-1404---1510)
- [Install on Ubuntu 16.04](#install-on-ubuntu-1604)
- [Install on a non-Ubuntu distribution lacking Python 3.4.x](#install-on-a-non-ubuntu-distribution-lacking-python-34x)
@@ -55,6 +56,27 @@ Note: grab-site currently **does not work with Python 3.5**; please use Python 3

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

Install with Docker
---
After [installing Docker](https://docs.docker.com/engine/installation/), pull the pre-built image:

```bash
docker pull slang800/grab-site
```

Start the grab-site server. You can set the port, volume, and name to whatever you want:

```bash
docker run --detach -p 29000:29000 -v ~/grabs:/data --name warcfactory slang800/grab-site
```
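
As a rough illustration of which parts of that command are adjustable, the sketch below prints (without running) a `docker run` command for a chosen host port, host directory, and container name. The helper name `grab_site_run_cmd` and its defaults are hypothetical and not part of grab-site; the container-side port `29000` and path `/data` are taken from the Dockerfile in this pull request and are assumed to stay fixed.

```shell
# Hypothetical helper (not part of grab-site): print the `docker run`
# command for a given host port, host directory, and container name.
# It only prints the command; it does not start anything.
grab_site_run_cmd() {
    port="${1:-29000}"            # host port to publish
    data_dir="${2:-$HOME/grabs}"  # host directory mounted at /data
    name="${3:-warcfactory}"      # container name
    # The container side stays fixed: the image exposes 29000 and
    # declares /data as its volume and working directory.
    printf 'docker run --detach -p %s:29000 -v %s:/data --name %s slang800/grab-site\n' \
        "$port" "$data_dir" "$name"
}

# Example: publish on host port 8080 and store crawls in /srv/grabs.
grab_site_run_cmd 8080 /srv/grabs archive1
```

The printed line is meant for review or copy-paste. Host paths containing spaces would need quoting before the command is run.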

Run a new crawl:

```bash
docker exec warcfactory grab-site --no-offsite-links --1 http://xkcd.com/
```

The downloaded data, temp files, ignores list, and other configuration will be in a sub-directory of the mounted volume; in this case, `~/grabs/xkcd.com-2016-09-05-caf0a39c`.
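
Because each crawl gets its own sub-directory, it can help to find the newest one on the host. A minimal sketch follows, assuming the volume is mounted from `~/grabs` as in the example above and that directory names contain no newlines. The helper name `latest_grab_dir` is hypothetical and not part of grab-site.

```shell
# Hypothetical helper (not part of grab-site): print the most recently
# modified sub-directory of the host directory mounted at /data.
latest_grab_dir() {
    base="${1:-$HOME/grabs}"
    # The trailing slash in the glob matches directories only; -d lists
    # the directories themselves and -t sorts them newest first.
    # Prints nothing if there are no sub-directories.
    ls -td -- "$base"/*/ 2>/dev/null | head -n 1
}

# Example: show the contents of the newest crawl directory, if any.
newest="$(latest_grab_dir "$HOME/grabs")"
if [ -n "$newest" ]; then
    ls "$newest"
fi
```

The result is based on directory modification time, so it points at the most recently changed crawl, which may differ from the most recently started one.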

Install on Ubuntu 14.04 - 15.10
---