Web service
This page introduces the DBpedia Spotlight Web Service. The available service endpoints are listed below and described in more detail in the User's Manual.
Spotting: takes text as input and recognizes entities/concepts to annotate. Several spotting techniques are available, such as dictionary lookup and Named Entity Recognition (NER).
Supported types (POST/GET): XML, JSON, NIF
Disambiguation: takes spotted text as input, where entities/concepts have already been recognized and marked up as wiki markup or XML. Chooses an identifier for each recognized entity/concept given the context.
Supported types (POST/GET): XML, JSON, HTML, RDFa, NIF
Annotation: runs spotting and disambiguation. Takes text as input, recognizes entities/concepts to annotate and chooses an identifier for each recognized entity/concept given the context.
Supported types (POST/GET): XML, JSON, HTML, RDFa, NIF
Candidates: similar to annotation, but returns a ranked list of candidates instead of deciding on one. Each candidate in the list carries the properties described below:
- support: how prominent this entity is, i.e. the number of inlinks in Wikipedia;
- priorScore: normalized support;
- contextualScore: score from comparing the context representation of an entity with the text (e.g. cosine similarity with tf-icf weights);
- percentageOfSecondRank: measures by how much the winning entity has won, computed as contextualScore_2ndRank / contextualScore_1stRank; the lower this score, the further the first-ranked entity was "in the lead";
- finalScore: combination of all of the above.
Supported types (POST/GET): XML
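The percentageOfSecondRank ratio described above can be illustrated with a small sketch. The function name, the plain list of floats as input, and the choice to return None when the ratio is undefined are assumptions made for illustration; they are not taken from the service's implementation.

```python
def percentage_of_second_rank(contextual_scores):
    """Return contextualScore_2ndRank / contextualScore_1stRank.

    `contextual_scores` holds one contextual score per candidate
    (hypothetical input shape, not the service's response format).
    """
    ranked = sorted(contextual_scores, reverse=True)
    # Assumption of this sketch: the ratio is undefined when there is
    # no runner-up or when the top score is zero.
    if len(ranked) < 2 or ranked[0] == 0:
        return None
    return ranked[1] / ranked[0]


clear_win = percentage_of_second_rank([0.8, 0.2, 0.1])
close_call = percentage_of_second_rank([0.8, 0.76])
single = percentage_of_second_rank([0.5])
print(clear_win, close_call, single)
```

A value near 0 means the top candidate was far ahead of the runner-up, while a value near 1 means the two were almost tied.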
Example 1: Simple request
- text= "President Obama called Wednesday on Congress to extend a tax break for students included in last year's economic stimulus package, arguing that the policy provides more generous assistance."
- confidence = 0.2; support=20
- whitelist all types.
curl http://spotlight.dbpedia.org/rest/annotate \
  --data-urlencode "text=President Obama called Wednesday on Congress to extend a tax break for students included in last year's economic stimulus package, arguing that the policy provides more generous assistance." \
  --data "confidence=0.2" \
  --data "support=20"
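For readers scripting the same request, the following sketch shows how the three parameters from Example 1 end up as a form-encoded body. It only builds and decodes the body with Python's standard library and does not contact the service; the variable names are illustrative.

```python
from urllib.parse import urlencode, parse_qs

# Parameters from Example 1; the text is shortened here for readability.
params = {
    "text": "President Obama called Wednesday on Congress to extend a tax "
            "break for students included in last year's economic stimulus package.",
    "confidence": "0.2",
    "support": "20",
}

# urlencode percent-encodes each value, much like curl's --data-urlencode.
body = urlencode(params)

# Decoding the body again recovers the original values unchanged.
decoded = parse_qs(body)
print(body)
```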
Example 2: Using SPARQL for filtering
This example demonstrates how to keep the annotations constrained to only politicians related to Chicago.
- text= "President Obama called Wednesday on Congress to extend a tax break for students included in last year's economic stimulus package, arguing that the policy provides more generous assistance."
- confidence = 0.2; support=20
- whitelist sparql = SELECT DISTINCT ?politician WHERE { ?politician a <http://dbpedia.org/ontology/OfficeHolder> . ?politician ?related <http://dbpedia.org/resource/Chicago> }
curl http://spotlight.dbpedia.org/rest/annotate \
  --data-urlencode "text=President Obama called Wednesday on Congress to extend a tax break for students included in last year's economic stimulus package, arguing that the policy provides more generous assistance." \
  --data "confidence=0.2" \
  --data "support=20" \
  --data-urlencode "sparql=SELECT DISTINCT ?x WHERE { ?x a <http://dbpedia.org/ontology/OfficeHolder> . ?x ?related <http://dbpedia.org/resource/Chicago> . }"
Notice: Due to limited system resources, this demo uses only the first 2000 results returned for each query (the default for the public DBpedia SPARQL endpoint). For real-world use cases, you are welcome to download the software and data and install them on your own server.
Attention: Make sure to URL-encode your SPARQL query before adding it as the value of the sparql parameter (see, for example, java.net.URLEncoder.encode()).
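As an illustration of the encoding step, here is a minimal Python sketch that uses urllib.parse.quote_plus as a stand-in for java.net.URLEncoder.encode(). The query is the one from Example 2; the variable names are illustrative.

```python
from urllib.parse import quote_plus, unquote_plus

sparql = (
    "SELECT DISTINCT ?x WHERE { "
    "?x a <http://dbpedia.org/ontology/OfficeHolder> . "
    "?x ?related <http://dbpedia.org/resource/Chicago> . }"
)

# quote_plus escapes reserved characters such as ?, {, <, : and /,
# and turns spaces into '+', so the query is safe inside a query string.
encoded = quote_plus(sparql)
query_fragment = "sparql=" + encoded
print(query_fragment)
```

Without this step, characters such as `?`, `&` and `{` in the query could be misread as part of the request URL itself.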
You can request different types of output by setting the Accept request header.
For example, to request JSON output, add Accept: application/json to the request headers.
One example using cURL:
curl "http://spotlight.dbpedia.org/rest/annotate?text=President%20Michelle%20Obama%20called%20Thursday%20on%20Congress%20to%20extend%20a%20tax%20break%20for%20students%20included%20in%20last%20year%27s%20economic%20stimulus%20package,%20arguing%20that%20the%20policy%20provides%20more%20generous%20assistance.&confidence=0.2&support=20" -H "Accept:application/json"
The content types we currently support are:
- text/html
- application/xhtml+xml
- text/xml
- application/json
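The content negotiation above can be sketched in Python with the standard library. The helper name make_request and the decision to reject unlisted content types on the client side are assumptions for illustration; the request object is only built, not sent.

```python
from urllib.request import Request

# Content types listed on this page.
SUPPORTED = ("text/html", "application/xhtml+xml", "text/xml", "application/json")


def make_request(url, accept="application/json"):
    """Build (but do not send) a GET request asking for `accept` output."""
    if accept not in SUPPORTED:
        # Client-side check, an assumption of this sketch.
        raise ValueError("unsupported content type: " + accept)
    return Request(url, headers={"Accept": accept})


req = make_request(
    "http://spotlight.dbpedia.org/rest/annotate?text=Berlin&confidence=0.2&support=20"
)
print(req.get_method(), req.get_header("Accept"))
```

Passing the resulting object to urllib.request.urlopen would perform the actual call.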
If your input text is long, you may prefer using POST instead of GET.
curl -i -X POST \
  -H "Accept:application/json" \
  -H "content-type:application/x-www-form-urlencoded" \
  -d "disambiguator=Document&confidence=-1&support=-1&text=President%20Obama%20called%20Wednesday%20on%20Congress%20to%20extend%20a%20tax%20break%20for%20students%20included%20in%20last%20year%27s%20economic%20stimulus%20package" \
  http://spotlight.dbpedia.org/dev/rest/annotate/
Please note that you must use the content type application/x-www-form-urlencoded for POST requests.
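A rough Python equivalent of the POST request, using only the standard library, might look like the sketch below. It mirrors the headers and form fields of the cURL call with a shortened text; the request is constructed but not sent, and the variable names are illustrative.

```python
from urllib.parse import urlencode
from urllib.request import Request

fields = {
    "disambiguator": "Document",
    "confidence": "-1",
    "support": "-1",
    "text": "President Obama called Wednesday on Congress to extend a tax break",
}

# The body must be form-encoded bytes to match the declared content type.
data = urlencode(fields).encode("ascii")

post_req = Request(
    "http://spotlight.dbpedia.org/dev/rest/annotate/",
    data=data,
    headers={
        "Accept": "application/json",
        "Content-Type": "application/x-www-form-urlencoded",
    },
    method="POST",
)
print(post_req.get_method(), len(data), "bytes")
```

Because the parameters travel in the request body rather than in the URL, long input texts do not run into URL length limits.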