Merge pull request #3 from bgadrian/tweaks
renamed project (hotcache was taken); made it compatible with go get/…
bgadrian authored Oct 8, 2018
2 parents 842a0ba + 5f95c5b commit dfd988d
Showing 7 changed files with 699 additions and 19 deletions.
674 changes: 674 additions & 0 deletions LICENSE

Large diffs are not rendered by default.

14 changes: 7 additions & 7 deletions Makefile
@@ -1,18 +1,18 @@
# Makefile
-source := ./src/main.go
+source := main.go

pre:
mkdir -p ./build/
-env GO111MODULE=on go get -d ./src/
+env GO111MODULE=on go get -d ./

run: pre
go run $(source) --seed $(URL) --debug

build: pre
-go build -o ./build/hotcache $(source)
-@echo "See ./build/hotcache --help"
+go build -o ./build/warmcache $(source)
+@echo "See ./build/warmcache --help"

buildall: pre
-GOOS=darwin GOARCH=amd64 go build -o ./build/hotcache-mac $(source)
-GOOS=linux GOARCH=amd64 go build -o ./build/hotcache $(source)
-GOOS=windows GOARCH=amd64 go build -o ./build/hotcache.exe $(source)
+GOOS=darwin GOARCH=amd64 go build -o ./build/warmcache-mac $(source)
+GOOS=linux GOARCH=amd64 go build -o ./build/warmcache $(source)
+GOOS=windows GOARCH=amd64 go build -o ./build/warmcache.exe $(source)
26 changes: 16 additions & 10 deletions README.md
@@ -1,4 +1,4 @@
-# Hot Cache Crawler
+# WarmCache [Crawler]

Problem: **you have a lazy-init web cache system** (like a CDN, Prerender, or an internal Redis/Memcached), so the first visit to each page has high latency.

@@ -43,20 +43,26 @@ The crawler is caching-agnostic, because it has a simple logic and just visits t
## Install & build

##### A. Easy way:
-Download the prebuilt binaries and use them.
+Download the [prebuilt binaries](https://github.com/bgadrian/warmcache/releases) and use them.

+##### B. Go way
+```bash
+$ go install github.com/bgadrian/warmcache
+warmcache --help
+```
+
##### B. Hard way:
Clone the repo and build it yourself. Requires: Go 1.11+, make, bash.
> It uses Go modules, so you can clone the project into any folder you want; it does not need to be in GOPATH/src.
```bash
-$ git clone [email protected]:bgadrian/hot-cache-crawler.git
-$ cd hot-cache-crawler
+$ git clone [email protected]:bgadrian/warmcache.git
+$ cd warmcache
$ make build
```

## Usage
```bash
-$ ./build/hotcache -help
+$ ./build/warmcache -help
#output:
-h, --help display help information
--seed *The start page (seed) of the crawl, example: https://google.com
@@ -66,21 +72,21 @@ $ ./build/hotcache -help
--agent[=Mozilla/5.0 ...] User-agent for all requests
--debug Print all pages that are found
--query Add custom query params to all requests
---header[=X-hotcache:crawler] Add one or more HTTP request headers to all requests
+--header[=X-warmcache:crawler] Add one or more HTTP request headers to all requests

```
Crawl trigger for Prerender:
```bash
-$ ./build/hotcache --seed http://localhost/ --debug --query "_escaped_fragment_="
+$ ./build/warmcache --seed http://localhost/ --debug --query "_escaped_fragment_="
```
Simple crawl of 2 domains, with a maximum of 400 visited pages:
```bash
-$ ./build/hotcache --seed http://domain1 --seed https://domain2 --max 400
+$ ./build/warmcache --seed http://domain1 --seed https://domain2 --max 400
```
Custom delay time and user-agent:
```bash
-$ ./build/hotcache --seed https://domain1 --delay 250 --robot "mybot" --agent "Mozilla/5.0 (compatible; MyBot/1.0)"
+$ ./build/warmcache --seed https://domain1 --delay 250 --robot "mybot" --agent "Mozilla/5.0 (compatible; MyBot/1.0)"
```
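For orientation, the `--max` and `--delay` options in the examples above suggest a bounded, throttled crawl. The following is a minimal Go sketch of that idea only; it is not taken from the warmcache source. The names `crawl`, `fetchLinks`, and `maxPages`, and the in-memory link graph, are assumptions made for illustration, and the delay is passed as a `time.Duration` because the unit used by `--delay` is not stated here.

```go
package main

import (
	"fmt"
	"time"
)

// fetchLinks is a hypothetical stand-in for "download the page and
// return the links found on it".
type fetchLinks func(page string) []string

// crawl visits pages breadth-first starting from seeds, pauses for delay
// between visits, and stops once maxPages pages have been visited.
func crawl(seeds []string, maxPages int, delay time.Duration, fetch fetchLinks) []string {
	seen := make(map[string]bool)
	queue := make([]string, 0, len(seeds))
	for _, s := range seeds {
		if !seen[s] {
			seen[s] = true
			queue = append(queue, s)
		}
	}

	var visited []string
	for len(queue) > 0 && len(visited) < maxPages {
		page := queue[0]
		queue = queue[1:]
		visited = append(visited, page)

		// Enqueue every link that has not been seen before.
		for _, link := range fetch(page) {
			if !seen[link] {
				seen[link] = true
				queue = append(queue, link)
			}
		}

		// Throttle only if another visit will follow.
		if delay > 0 && len(queue) > 0 && len(visited) < maxPages {
			time.Sleep(delay)
		}
	}
	return visited
}

func main() {
	// In-memory link graph so the sketch runs without network access.
	site := map[string][]string{
		"/":  {"/a", "/b"},
		"/a": {"/", "/c"},
		"/b": {"/c"},
		"/c": {"/d"},
	}
	fetch := func(page string) []string { return site[page] }

	fmt.Println(crawl([]string{"/"}, 3, 10*time.Millisecond, fetch))
}
```

Keeping the fetch step behind a function value makes the visit limit easy to check in isolation; a real crawler would replace it with an HTTP request plus link extraction.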
## Test

@@ -90,7 +96,7 @@ $ docker run -p 80:80 kennethreitz/httpbin

#in other terminal:
$ make build
-./build/hotcache --seed http://localhost/anything --debug --query "test=1" --query "_escaped_fragment_=1" --header "Accept: application/json"
+$ ./build/warmcache --seed http://localhost/anything --debug --query "test=1" --query "_escaped_fragment_=1" --header "Accept: application/json"
```
You should see in the output the httpbin echo with all the parameters and custom headers.
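
To make the test command concrete, here is a hedged Go sketch of what repeated `--query` and `--header` values amount to at the HTTP level: each `key=value` pair is appended to the URL and each `Name: value` pair is set on the request. The helper `newRequest` is hypothetical and not part of warmcache; how the tool itself parses these flags is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// newRequest builds a GET request for page, appending every "key=value"
// entry in queries to the URL and setting every "Name: value" entry in
// headers. Hypothetical helper, written for illustration only.
func newRequest(page string, queries, headers []string) (*http.Request, error) {
	u, err := url.Parse(page)
	if err != nil {
		return nil, err
	}

	q := u.Query()
	for _, kv := range queries {
		key, value, _ := strings.Cut(kv, "=") // a bare key yields an empty value
		q.Add(key, value)
	}
	u.RawQuery = q.Encode()

	req, err := http.NewRequest(http.MethodGet, u.String(), nil)
	if err != nil {
		return nil, err
	}
	for _, h := range headers {
		name, value, ok := strings.Cut(h, ":")
		if !ok {
			return nil, fmt.Errorf("malformed header %q, want \"Name: value\"", h)
		}
		req.Header.Add(strings.TrimSpace(name), strings.TrimSpace(value))
	}
	return req, nil
}

func main() {
	req, err := newRequest(
		"http://localhost/anything",
		[]string{"test=1", "_escaped_fragment_=1"},
		[]string{"Accept: application/json"},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.URL.String())
	fmt.Println(req.Header.Get("Accept"))
}
```

Sending such a request to the httpbin `/anything` endpoint would echo the same parameters and headers back, which is what the test above checks by eye.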

2 changes: 1 addition & 1 deletion go.mod
@@ -1,4 +1,4 @@
-module github.com/bgadrian/hot-cache-crawler
+module github.com/bgadrian/warmcache

require (
github.com/Bowery/prompt v0.0.0-20180817134258-8a1d5376df1c // indirect
2 changes: 1 addition & 1 deletion src/main.go → main.go
@@ -1,7 +1,7 @@
package main

import (
-"github.com/bgadrian/hot-cache-crawler/src/scanner"
+"github.com/bgadrian/warmcache/scanner"
"github.com/mkideal/cli"
)

File renamed without changes.
File renamed without changes.
