Sandro edited this page Jul 26, 2013 · 2 revisions

This page describes how to run DBpedia Spotlight on your own server using a pre-packaged JAR. We assume that you are running these commands in a bash shell on Linux and have wget, curl and java installed.

Requirements

  • Java 1.6+
  • RAM of appropriate size for the spotter lexicon you need
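
Before starting, it may help to confirm that the tools used on this page are available on your PATH. The snippet below is a minimal sketch of such a check; `missing_tools` is a hypothetical helper name, not something shipped with DBpedia Spotlight.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of each given command that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Tools used by the commands on this page.
missing=$(missing_tools wget curl java unzip tar)
if [ -n "$missing" ]; then
  # Unquoted on purpose so the names print on one line.
  echo "Missing tools:" $missing
else
  echo "All required tools found."
  java -version   # prints the installed version; it should be 1.6 or later
fi
```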

Quickstart

The commands below download and start a pre-packaged, lightweight deployment to get you started.

Lucene:

wget http://spotlight.dbpedia.org/download/release-0.6/dbpedia-spotlight-quickstart-0.6.5.zip
unzip dbpedia-spotlight-quickstart-0.6.5.zip
cd dbpedia-spotlight-quickstart-0.6.5/
./run.sh

Older jars are downloadable from: https://github.com/dbpedia-spotlight/dbpedia-spotlight/downloads

Statistical:

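wget http://spotlight.sztaki.hu/downloads/en.tar.gz
wget http://spotlight.sztaki.hu/downloads/dbpedia-spotlight.jar
tar xvf en.tar.gz
# Arguments: the extracted model directory (adjust the path to where you unpacked en.tar.gz), then the URL to serve on
java -jar dbpedia-spotlight.jar /data/spotlight/en/model_en http://localhost:2222/rest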

Test your installation

In order to test your new installation, run:

curl http://localhost:2222/rest/annotate \
  -H "Accept: text/xml" \
  --data-urlencode "text=Brazilian state-run giant oil company Petrobras signed a three-year technology and research cooperation agreement with oil service provider Halliburton." \
  --data "confidence=0" \
  --data "support=0"
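
The server may not accept requests immediately after it is started, so a request sent too early can fail with a connection error. The following is a sketch of a polling helper that waits until the endpoint responds; `wait_for_spotlight`, its default retry count, and its default delay are assumptions made for illustration, not part of DBpedia Spotlight.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll URL until the server accepts connections.
# Usage: wait_for_spotlight URL [TRIES] [DELAY_SECONDS]
# Returns 0 once the server responds, 1 if it never does.
wait_for_spotlight() {
  local url="$1"
  local tries="${2:-30}"
  local delay="${3:-2}"
  local i=0
  while [ "$i" -lt "$tries" ]; do
    # Without --fail, curl exits 0 for any HTTP response, even an error
    # status, so success here only means the server is reachable.
    if curl --silent --output /dev/null --max-time 5 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_spotlight http://localhost:2222/rest/annotate 30 2 && echo "server is up"` retries for roughly a minute before giving up.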

You can now learn more about how to call your newly installed web service, including which parameters it accepts, in the web service documentation.

Upgrade your models

Lucene:

The files you downloaded above contain only a very small subset of the DBpedia resources. They are meant to demonstrate DBpedia Spotlight in a lightweight environment. See our Downloads page for alternatives that are more useful in real-world scenarios. One example follows.

First rename your small model files:

mv data/index data/index-small
mv data/spotter.dict data/spotter-small.dict

Now obtain new copies with larger models:

cd data
wget http://spotlight.dbpedia.org/download/release-0.5/context-index-compact.tgz
tar zxvf context-index-compact.tgz
mv index-withSF-withTypes-compressed index
wget http://spotlight.dbpedia.org/download/release-0.4/surface_forms-Wikipedia-TitRedDis.uriThresh75.tsv.spotterDictionary.gz
gunzip surface_forms-Wikipedia-TitRedDis.uriThresh75.tsv.spotterDictionary.gz
mv surface_forms-Wikipedia-TitRedDis.uriThresh75.tsv.spotterDictionary spotter.dict
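
After the renames, it can be worth confirming that the two paths used above, index and spotter.dict, are in place before restarting the server. The check below is a sketch only; `check_models` is a hypothetical helper, and it merely tests that the paths exist, not that their contents are valid.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if DIR contains an "index" entry and a
# non-empty "spotter.dict" file, the two names used in the steps above.
check_models() {
  local dir="${1:-data}"
  local status=0
  if [ ! -e "$dir/index" ]; then
    echo "missing: $dir/index" >&2
    status=1
  fi
  if [ ! -s "$dir/spotter.dict" ]; then
    echo "missing or empty: $dir/spotter.dict" >&2
    status=1
  fi
  return "$status"
}
```

Because the commands above leave you inside the data directory, you would call it as `check_models .` and then `cd ..` before running ./run.sh again.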

If you are using the largest spotter dictionary, you may need to increase the Java heap space, e.g. by adding -Xmx10G to your java command line.
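
As an illustration of where the heap option goes, the sketch below compares the machine's RAM with the requested heap size and then prints a java command with -Xmx placed before -jar. It reuses the jar name and paths from the Statistical quickstart purely as an example. How run.sh in the Lucene quickstart passes JVM options is not shown on this page, so check that script before editing it. `total_ram_gib` and `HEAP_GIB` are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: total physical RAM in whole GiB, read from
# /proc/meminfo (Linux only). Prints 0 if the file is unavailable.
total_ram_gib() {
  if [ -r /proc/meminfo ]; then
    # MemTotal is reported in kB; 1048576 kB = 1 GiB.
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1048576 }' /proc/meminfo
  else
    echo 0
  fi
}

HEAP_GIB=10   # matches the -Xmx10G example above
ram=$(total_ram_gib)
if [ "${ram:-0}" -lt "$HEAP_GIB" ]; then
  echo "Warning: ${ram:-0} GiB of RAM detected; -Xmx${HEAP_GIB}G may not fit." >&2
fi

# JVM options such as -Xmx must come before -jar.
# This only prints the command; run it yourself once the paths are correct.
echo "java -Xmx${HEAP_GIB}G -jar dbpedia-spotlight.jar /data/spotlight/en/model_en http://localhost:2222/rest"
```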

Statistical:

We offer only the complete model with this option. You can download the newest models from http://spotlight.sztaki.hu/downloads/.
