Commit 92a625c

Merge pull request #9 from seralf/master
added pre-built docker image for [DAF-126] POC
2 parents 474ed6c + 0efaa22 commit 92a625c

30 files changed

+1848
-84
lines changed

.gitignore

+7
@@ -1,2 +1,9 @@
11
*.class
22
*.log
3+
target/
4+
bin/
5+
.project
6+
.classpath
7+
.settings/
8+
.cache-main
9+
.cache-tests

README.md

+28-2
@@ -1,2 +1,28 @@
1-
# daf-semantics
2-
Daf Semantics repository
1+
2+
daf-semantics
3+
====================
4+
5+
The Daf Semantics repository collects several components designed to integrate ontologies and RDF data, and to provide "semantic" functionality to the [DAF](https://github.com/italia/daf) platform.
6+
7+
The [semantic_manager]() component exposes the central access point for a subset of the microservices' functionalities:
8+
<img src="./docs/semantic_manager-v4.png" alt="semantic_manager" width="60%" height="auto">
9+
10+
The planned components are:
11+
12+
+ [***semantic_frontend***](https://github.com/seralf/daf-semantics/tree/master/semantic_frontend):
13+
the front end for the OntoPA catalog [TODO]
14+
+ [***semantic_manager***](https://github.com/seralf/daf-semantics/tree/master/semantic_manager):
15+
the main interface between DAF and the daf-semantics microservices [WIP]
16+
+ [***ontonethub***](https://github.com/seralf/teamdigitale/ontonethub):
17+
a component providing indexing/search capabilities for the catalog [WIP]
18+
+ [***semantic_repository***](https://github.com/seralf/daf-semantics/tree/master/semantic_repository):
19+
an abstraction over different triplestores [WIP]
20+
+ [***semantic_validator***](https://github.com/seralf/daf-semantics/tree/master/semantic_validator):
21+
a component for validating an ontology against the DCAT-AP_IT standard [WIP]
22+
+ [***semantic_standardization***](https://github.com/seralf/daf-semantics/tree/master/semantic_standardization):
23+
a component exposing vocabulary data and hierarchies, useful for simple standardization [POC]
24+
+ [***semantic_spreadsheet***](https://github.com/seralf/daf-semantics/tree/master/semantic_spreadsheet):
25+
a repository collecting recipes for creating RDF data from spreadsheets, using Google Refine [WIP]
26+
+ [***semantic_mapping***](#):
27+
a component for mapping incoming data (typically CSV) to RDF, using W3C standards [TODO]
28+

docs/semantic_manager-v4.png

43.4 KB

docs/semantic_manager-v4.xml

+1
@@ -0,0 +1 @@
1+
<mxfile userAgent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36" version="7.5.0" editor="www.draw.io" type="github"><diagram id="277c755e-be6f-885f-7c0a-642f4f8962d8" name="Page-1">7Vtbd6I6FP41PtolRNA+9ubMOauzZk4765zpY4SImQJxhVjt/PqzgYRbIrUWtTf7UNiEEPa3v30DeugiWn/heDH/xnwS9uyBv+6hy55tW7Y1hH+p5DGXOKdWLgg49eWgUnBL/xApHEjpkvokqQ0UjIWCLupCj8Ux8URNhjlnq/qwGQvrV13ggGiCWw+HuvQ/6ot5Lh3bo1L+ldBgrq5suaf5kSn27gPOlrG8Xs9Gs+yXH46wmkveaDLHPltVROiqhy44YyLfitYXJEx1q9SWnzfZcLRYNyex2OYEWy5DPKpbJz5oQu4yLuYsYDEOr0rpeXZ7JJ1gAHtzEYWwacEmXJM//pLybOcu3TlxYDcRmIuzFBoQxSwmSjahYVic4jdGgKRy/DcR4lEaC14KBqJyhdeMLeQ6EsHZPblgIePZPaFh9iuOKEjTsbrCpA4TtuSe1Im0ZlhuQNQoCWCqrsp5UstfCIsI3D8M4CTEgj7UDQtL+wyKcSVGsCFhMkOGPiHbDbLx0RBzOoVsTcWvAqR0764JpqWD6eNkns1mvQSV8+zPiMqMxcI0cle0kH0suORiHnC4lJPekgjHgnog/YZjcP5cA3Q1p4LcLnB2RyuIinXQcLLI49SMrlMYGvAWYWBg0rqb/TSt2+m8IQ1i2PZAu7AqgAG4Vzl15qR/bTA8EC7IulXD8qiNpM+TQd1y5f6qEiKlaF6Jjkr2DEzY9HeqLnsQ4ikJVSjNJsH5HgXtrWkcgAgMD244yOcrUHQq9yIBfnUucxoy7/7nnMY1v/kihk4mCH5GW6m5AOVF1HZ+DyPHYEJjj3jetkweG0Kl1TWT5ak/GI1FaZ+OCo3KPlHD8PJ1yrNK2wM08GNl2CIdkGy+josa1xk3cqwnxg+HTsP08xWURCh0YuIG7Ep6mN2Xq7mvjVypEAJ8gKjbOScJ/YOn2YDUbqVaYLRz3nMuTc4n9SUUsugzeSCivp+RKaPxeZEbV60rz45N1uVscFBF2i8XV0udTY6rD4ZtIWmZLzSyUQ3K/mn9fDabJUT0mo6tFU8NwNE+soU6z22nxZHZmier+qmqO7P27s5emgaa3BE6VmIx1pj5HegYE/F1OT1oQtGiVY3THSQOyG04THtviYOm89NWMskQ/Uz2tJFHy7XfdBWlqtwaf45WR6nVmDPzG7JgCRUsndMN02A2hTTdDURxsx+DXYWaDsAuldpVEWGhXh2Bwhbppo8FTgCi1MSfgOAVa3zUSDSHBo2PDBofdaHx9o7dtg6tpXlQ8W93hcLfsg9DBh/WeUmyNX5IZ4zA8ZSF79lHjVTQOEIGoJ5FfITq/8VccXSuDI9GFVUAllS5ZlBd6uH9e17VUpLokUdTRAVMFZaidZA+xzqZhWzlzQGwk2gZCtr3mbeMMn0Zo1UL4TQSbYaqA3qNm40PR2fXWGeX2wW73CPEo/edYNudF6gb+mVO3WxQ0WOVU+SL0vplO7Q0rG57Gm+i9mrpxG5nGgfpXezWAR01Sg7niQ5oY7x6crBjB1Q3L72zMiM+4aACFh+5GmwxAy1I1Iy3g7gwbOJ0yNLwVE90f5zd/HOtaX+n1jMwMclJWc1ZQzJLp2r2oUXK0vMEAKZx8DOjbL+j0Os0VYx0FbsGFdsdqLh4iaQTr/r2s9fiVaHXkL3aemvkhkRMkM/09Yn01R4cLn1VKU8FpQfKxZIlTMfi9TSxug8U9focGerzfXW0bL0jcnM5Gf6t84TG/QgoxPWnl68I
mg7QQO4R0WjvlnzWc89P2pWP2Xs9N2jUc4Pt3n/QJhqOG/bX9LTdFYao2+acsr6K7TUe1725V+NsQ0vueO/G2XpPrvII7l/whuCCGf/YT+BMz4P2VWYZXir4jJ41NJwDPp1D7RXZZ/R8vq/r/rXtDUHPakRPa7ug9+REGs03hOFdome33wI8+2WX7E2xQ0bM7eqeqkkhQ0/geB8C6D2BCeORKVrmEp8+KNFfcUCSaj8TLlY5/J7jqyp/irz2YNFVJVrmdGfCwUzBdX4g3e8xtYHd8ru33BmWHxeiq/8B</diagram></mxfile>

semantic_manager/app/generated_controllers/semantic_manager.yaml.scala

+13-1
@@ -17,6 +17,15 @@ import scala.util._
1717

1818
import javax.inject._
1919

20+
import play.api.mvc.{Action,Controller}
21+
import play.api.data.validation.Constraint
22+
import play.api.i18n.MessagesApi
23+
import play.api.inject.{ApplicationLifecycle,ConfigurationProvider}
24+
import de.zalando.play.controllers._
25+
import PlayBodyParsing._
26+
import PlayValidations._
27+
import scala.util._
28+
import javax.inject._
2029
import play.api.libs.ws.WSClient
2130
import utilities.JSONHelper
2231
import scala.concurrent.ExecutionContext.Implicits._
@@ -40,6 +49,9 @@ import OntonetHubClient.models._
4049
import OntonetHubClient.models._
4150
import OntonetHubClient.models._
4251
import OntonetHubClient.models._
52+
import OntonetHubClient.models._
53+
import OntonetHubClient.models._
54+
import OntonetHubClient.models._
4355

4456
/**
4557
* This controller is re-generated after each change in the specification.
@@ -48,7 +60,7 @@ import OntonetHubClient.models._
4860

4961
package semantic_manager.yaml {
5062
// ----- Start of unmanaged code area for package Semantic_managerYaml
51-
63+
5264
// ----- End of unmanaged code area for package Semantic_managerYaml
5365
class Semantic_managerYaml @Inject() (
5466
// ----- Start of unmanaged code area for injections Semantic_managerYaml

semantic_repository/app/generated_controllers/semantic_repository.yaml.scala

+1-1
@@ -317,7 +317,7 @@ import java.net.URI
317317

318318
package semantic_repository.yaml {
319319
// ----- Start of unmanaged code area for package Semantic_repositoryYaml
320-
320+
321321
// ----- End of unmanaged code area for package Semantic_repositoryYaml
322322
class Semantic_repositoryYaml @Inject() (
323323
// ----- Start of unmanaged code area for injections Semantic_repositoryYaml

semantic_standardization/README.md

+152-6
@@ -12,10 +12,111 @@ Two endpoints are provided:
1212

1313
The idea is that each endpoint (and its configured queries) acts for a very specific domain, so the next versions could introduce new vocabularies and ontologies, but needs to create ad-hoc SPARQL queries for retrieving the informations needed.
1414

15+
## semantic annotation in DAF ingestion
1516

16-
## example: retrieving a vocabulary dataset
17+
The [DAF](https://github.com/italia/daf) `semantic_annotation` currently has the following structure: `{ontology}.{concept}.{property}`.
18+
During the ingestion of a dataset into the DAF platform, a `semantic_annotation` relates a column of the dataset to the most appropriate property of an existing concept from the controlled vocabularies.
19+
20+
**Note** that while the annotation is used to relate cells to vocabularies, it does not explicitly store a reference to the vocabularies used. A reference to a concept from an ontology is used instead.
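
As an illustration of the `{ontology}.{concept}.{property}` structure described above, here is a minimal Python sketch that splits a tag into its three parts. The names `SemanticAnnotation` and `parse_semantic_annotation` are hypothetical and not part of this repository; the sketch also assumes that none of the three parts contains a dot.

```python
from typing import NamedTuple


class SemanticAnnotation(NamedTuple):
    """Hypothetical container for the three parts of a semantic_annotation tag."""
    ontology: str
    concept: str
    prop: str


def parse_semantic_annotation(tag: str) -> SemanticAnnotation:
    """Split '{ontology}.{concept}.{property}' on dots.

    Assumption: no part contains a dot, so exactly three
    non-empty segments are expected.
    """
    parts = tag.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected ontology.concept.property, got: {tag!r}")
    return SemanticAnnotation(*parts)


annotation = parse_semantic_annotation(
    "POI-AP_IT.PointOfInterestCategory.POIcategoryIdentifier"
)
print(annotation.ontology, annotation.concept, annotation.prop)
```

Note that the ontology part (`POI-AP_IT`) contains an underscore and a hyphen, so the dot is the only reliable separator.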
21+
22+
23+
## examples
24+
25+
26+
### example: sequence of calls
27+
28+
1. retrieves the (vocabulary, ontology) reference from the `semantic_annotation` tag
29+
```
30+
curl -X GET 'http://localhost:9000/kb/v1/daf/annotation/lookup?semantic_annotation=POI-AP_IT.PointOfInterestCategory.POIcategoryIdentifier' -H "accept: application/json" -H "content-type: application/json"
31+
```
32+
33+
2. retrieves the hierarchies for a given property
34+
```
35+
curl -X GET 'http://localhost:9000/kb/v1/hierarchies/properties?vocabulary_name=POICategoryClassification&ontology_name=poiapit&lang=it' -H "accept: application/json" -H "content-type: application/json"
36+
```
37+
38+
3. retrieves the dataset values for a certain vocabulary
39+
```
40+
curl -X GET 'http://localhost:9000/kb/v1/vocabularies/POICategoryClassification?lang=it' -H "accept: application/json" -H "content-type: application/json"
41+
```
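
The three calls above can also be sketched in Python. The base URL, paths and query parameter names are taken from the curl examples; the function names (`lookup_url`, `hierarchies_url`, `vocabulary_url`, `fetch_json`) are hypothetical, and `fetch_json` only works against a running instance of the service.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:9000/kb/v1"  # as used in the curl examples


def lookup_url(semantic_annotation, base=BASE_URL):
    """Step 1: resolve a semantic_annotation tag to its (vocabulary, ontology) pair."""
    query = urlencode({"semantic_annotation": semantic_annotation})
    return f"{base}/daf/annotation/lookup?{query}"


def hierarchies_url(vocabulary_name, ontology_name, lang, base=BASE_URL):
    """Step 2: hierarchies for the properties of a vocabulary."""
    query = urlencode({
        "vocabulary_name": vocabulary_name,
        "ontology_name": ontology_name,
        "lang": lang,
    })
    return f"{base}/hierarchies/properties?{query}"


def vocabulary_url(vocabulary_name, lang, base=BASE_URL):
    """Step 3: dataset values for a vocabulary."""
    path = quote(vocabulary_name, safe="")
    return f"{base}/vocabularies/{path}?{urlencode({'lang': lang})}"


def fetch_json(url):
    """Hypothetical helper: GET a URL and decode the JSON body."""
    request = Request(url, headers={"accept": "application/json"})
    with urlopen(request) as response:
        return json.load(response)


print(lookup_url("POI-AP_IT.PointOfInterestCategory.POIcategoryIdentifier"))
print(hierarchies_url("POICategoryClassification", "poiapit", "it"))
print(vocabulary_url("POICategoryClassification", "it"))
```

Building the query string with `urlencode` avoids the shell-quoting pitfalls of `?` and `&` in hand-written URLs.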
42+
43+
----
44+
45+
### example: retrieving information from the semantic_annotation tag
46+
With this endpoint we can retrieve information about the vocabulary/ontology pair related to a given `semantic_annotation` tag:
47+
48+
```
49+
curl -X GET 'http://localhost:9000/kb/v1/daf/annotation/lookup?semantic_annotation={semantic_annotation}' \
50+
-H "accept: application/json" -H "content-type: application/json"
51+
```
52+
53+
for example, for the Point Of Interest vocabulary:
54+
55+
```
56+
curl -X GET 'http://localhost:9000/kb/v1/daf/annotation/lookup?semantic_annotation=POI-AP_IT.PointOfInterestCategory.POIcategoryIdentifier' \
57+
-H "accept: application/json" -H "content-type: application/json"
58+
```
59+
60+
This returns a data structure similar to the following for each tag:
61+
62+
```
63+
[
64+
{
65+
"vocabulary_id": "POICategoryClassification",
66+
"vocabulary": "http://dati.gov.it/onto/controlledvocabulary/POICategoryClassification",
67+
"ontology": "http://dati.gov.it/onto/poiapit",
68+
"semantic_annotation": "POI-AP_IT.PointOfInterestCategory.POIcategoryIdentifier",
69+
"property_id": "POIcategoryIdentifier",
70+
"concept_id": "PointOfInterestCategory",
71+
"ontology_prefix": "poiapit",
72+
"ontology_id": "POI-AP_IT",
73+
"concept": "http://dati.gov.it/onto/poiapit#PointOfInterestCategory",
74+
"property": "http://dati.gov.it/onto/poiapit#POIcategoryIdentifier"
75+
}
76+
]
77+
```
78+
79+
The idea is to expose as much information as possible, so that the annotation can eventually be related to ontologies and vocabularies.
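
A minimal sketch of how a client might use this response to prepare the follow-up calls. It assumes that `vocabulary_id` maps to the `vocabulary_name` parameter and `ontology_prefix` to the `ontology_name` parameter, which is consistent with the example values in this README but is not stated explicitly; the function name `next_call_params` is hypothetical.

```python
# Sample record copied from the lookup response shown above.
lookup_response = [
    {
        "vocabulary_id": "POICategoryClassification",
        "vocabulary": "http://dati.gov.it/onto/controlledvocabulary/POICategoryClassification",
        "ontology": "http://dati.gov.it/onto/poiapit",
        "semantic_annotation": "POI-AP_IT.PointOfInterestCategory.POIcategoryIdentifier",
        "property_id": "POIcategoryIdentifier",
        "concept_id": "PointOfInterestCategory",
        "ontology_prefix": "poiapit",
        "ontology_id": "POI-AP_IT",
        "concept": "http://dati.gov.it/onto/poiapit#PointOfInterestCategory",
        "property": "http://dati.gov.it/onto/poiapit#POIcategoryIdentifier",
    }
]


def next_call_params(records, lang="it"):
    """Derive query parameters for the hierarchies endpoint.

    Assumption: vocabulary_id -> vocabulary_name and
    ontology_prefix -> ontology_name.
    """
    return [
        {
            "vocabulary_name": record["vocabulary_id"],
            "ontology_name": record["ontology_prefix"],
            "lang": lang,
        }
        for record in records
    ]


print(next_call_params(lookup_response))
```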
80+
81+
82+
### example: retrieving a vocabulary dataset
1783

1884
We can obtain a de-normalized, tabular version of the vocabulary `Istat-Classificazione-08-Territorio` using the curl call:
85+
86+
```
87+
curl -X GET 'http://localhost:9000/kb/v1/hierarchies/properties?vocabulary_name={vocabulary_name}&ontology_name={ontology_prefix}&lang={lang}' \
88+
-H "accept: application/json" -H "content-type: application/json"
89+
```
90+
91+
A `SPARQL` query is used to create a proper tabular representation of the data.
92+
93+
#### example: PointOfInterest / POI-AP_IT
94+
95+
```
96+
curl -X GET 'http://localhost:9000/kb/v1/hierarchies/properties?vocabulary_name=POICategoryClassification&ontology_name=poiapit&lang=it' -H "accept: application/json" -H "content-type: application/json"
97+
```
98+
99+
this will return a data structure:
100+
101+
```
102+
[
103+
{
104+
"vocabulary": "POI-AP_IT",
105+
"path": "POI-AP_IT.PointOfInterestCategory.definition",
106+
"hierarchy_flat": "PointOfInterestCategory",
107+
"hierarchy": [
108+
{
109+
"class": "PointOfInterestCategory",
110+
"level": 0
111+
}
112+
]
113+
},
114+
...
115+
]
116+
```
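
As a rough sketch of consuming this structure, the following Python snippet indexes each entry by its `path` and lists the class names ordered by `level`. Only the single entry shown above is used; the function name `classes_by_path` is hypothetical, and the ordering by `level` is an assumption about how multi-level hierarchies would be read.

```python
# The one entry shown in the response above (the elided entries are omitted).
hierarchies = [
    {
        "vocabulary": "POI-AP_IT",
        "path": "POI-AP_IT.PointOfInterestCategory.definition",
        "hierarchy_flat": "PointOfInterestCategory",
        "hierarchy": [
            {"class": "PointOfInterestCategory", "level": 0},
        ],
    },
]


def classes_by_path(entries):
    """Map each property path to its class names, ordered by level."""
    return {
        entry["path"]: [
            step["class"]
            for step in sorted(entry["hierarchy"], key=lambda s: s["level"])
        ]
        for entry in entries
    }


print(classes_by_path(hierarchies))
```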
117+
118+
119+
#### example: Luoghi Istat / CLV-AP_IT
19120
```
20121
$ curl -X GET "http://localhost:9000/kb/v1/vocabularies/Istat-Classificazione-08-Territorio?lang=it" -H "accept: application/json" -H "content-type: application/json"
21122
```
@@ -39,12 +140,52 @@ this will return a result structure similar to the following one:
39140
]
40141
```
41142

42-
## example: retrieve th hierarchies for the properties used
143+
For technical reasons, the value `CLV-AP_IT_Region_name` is currently used in place of `CLV-AP_IT.Region.name`.
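
The relation between the dotted form and the underscore form can be sketched as follows. Going from dots to underscores is a plain replacement, but the reverse is ambiguous because ontology identifiers such as `CLV-AP_IT` already contain an underscore. This hypothetical sketch therefore takes a list of known ontology identifiers and assumes the concept name itself contains no underscore; neither function is part of the repository.

```python
def annotation_to_key(tag):
    """'CLV-AP_IT.Region.name' -> 'CLV-AP_IT_Region_name'."""
    return tag.replace(".", "_")


def key_to_annotation(key, known_ontologies):
    """Rebuild the dotted form from an underscore key.

    Assumptions: the ontology id is one of known_ontologies, and the
    concept name contains no underscore (the property name may).
    """
    for ontology in known_ontologies:
        prefix = ontology + "_"
        if key.startswith(prefix):
            concept, sep, prop = key[len(prefix):].partition("_")
            if not sep or not concept or not prop:
                break
            return f"{ontology}.{concept}.{prop}"
    raise ValueError(f"cannot split key: {key!r}")


known = ["CLV-AP_IT", "POI-AP_IT"]
print(annotation_to_key("CLV-AP_IT.Region.name"))
print(key_to_annotation("CLV-AP_IT_Region_name", known))
```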
144+
145+
### example: retrieve the hierarchies for the properties used
43146

44147
If we have the example vocabulary `Istat-Classificazione-08-Territorio`, which uses terms from the ontology `clvapit`, we can retrieve the local hierarchy associated to each property with the curl command:
45148

46149
```
47-
$ curl -X GET http://localhost:9000/kb/v1/hierarchies/properties?vocabulary_name=Istat-Classificazione-08-Territorio&ontology_name=clvapit&lang=it -H "accept: application/json" -H "content-type: application/json"
150+
$ curl -X GET 'http://localhost:9000/kb/v1/hierarchies/properties?vocabulary_name={vocabulary_name}&ontology_name={ontology_prefix}&lang={lang}' \
151+
-H "accept: application/json" -H "content-type: application/json"
152+
```
153+
154+
#### example: POI / POI-AP_IT
155+
156+
```
157+
curl -X GET 'http://localhost:9000/kb/v1/vocabularies/POICategoryClassification?lang=it' \
158+
-H "accept: application/json" -H "content-type: application/json"
159+
```
160+
161+
which will return results:
162+
163+
```
164+
[
165+
[
166+
{
167+
"key": "POI-AP_IT_PointOfInterestCategory_definition",
168+
"value": "Rientrano in questa categoria tutti i punti di interesse connessi all'intrattenimento come zoo, discoteche, pub, teatri, acquari, stadi, casino, parchi divertimenti, ecc."
169+
},
170+
{
171+
"key": "POI-AP_IT_PointOfInterestCategory_POICategoryName",
172+
"value": "Settore intrattenimento"
173+
},
174+
{
175+
"key": "POI-AP_IT_PointOfInterestCategory_POICategoryIdentifier",
176+
"value": "cat_1"
177+
}
178+
],
179+
...
180+
]
181+
```
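
Each inner list in this result is one row of the de-normalized table, expressed as key/value cells. A minimal Python sketch for turning such rows into dictionaries follows; the function name `rows_to_records` is hypothetical, and the sample reuses two of the cells shown above.

```python
# One row taken from the response above (the long definition cell is omitted).
rows = [
    [
        {"key": "POI-AP_IT_PointOfInterestCategory_POICategoryName",
         "value": "Settore intrattenimento"},
        {"key": "POI-AP_IT_PointOfInterestCategory_POICategoryIdentifier",
         "value": "cat_1"},
    ],
]


def rows_to_records(rows):
    """Turn each list of {"key", "value"} cells into a single dict."""
    return [{cell["key"]: cell["value"] for cell in row} for row in rows]


for record in rows_to_records(rows):
    print(record["POI-AP_IT_PointOfInterestCategory_POICategoryIdentifier"])
```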
182+
183+
184+
#### example: Luoghi Istat / CLV-AP_IT
185+
186+
```
187+
$ curl -X GET 'http://localhost:9000/kb/v1/hierarchies/properties?vocabulary_name=Istat-Classificazione-08-Territorio&ontology_name=clvapit&lang=it' \
188+
-H "accept: application/json" -H "content-type: application/json"
48189
```
49190

50191
which will return the results:
@@ -68,7 +209,7 @@ which will return the results:
68209
```
69210

70211

71-
## example configurations
212+
### example configurations
72213

73214
An example configuration for working with a vocabulary (VocabularyAPI):
74215

@@ -122,8 +263,13 @@ Eventually the idea of pre-loading ontologies and vocabularies from disk can be
122263

123264
----
124265

125-
TODO:
266+
## TODO
126267

127268
+ more documentation / comments
128269
+ more proper tests
129-
+ remove redundant classes for RDFRepository, importing external kb-core dependency, instead
270+
+ remove redundant classes for RDFRepository, importing external kb-core dependency, instead
271+
272+
273+
## known ISSUES
274+
275+
...
