Install&Run
The SDM-RDFizer can be run either by building a Docker container or by installing the RDFizer locally.
Building docker container.
Note: All documents in the same folder as the Dockerfile will be copied to the container.
docker build -t rdfizer .
To run the application, map your data volume to the /data folder of the container. The /data folder holds the data, mappings, and config files that will be used.
docker run -d -p 4000:4000 -v /path/to/yourdata:/data rdfizer
Send a GET request with the name of the configuration file to the SDM-RDFizer container.
Note: The full URL prefix localhost:4000/graph_creation/data/ is required for the SDM-RDFizer container to work properly.
curl localhost:4000/graph_creation/data/your-config-file.ini
Get the results from the container (if the output folder is inside the data folder, the results are already on your host).
docker cp CONTAINER_ID:/app/path/to/output .
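The GET request from the curl step can also be sent from a script. The following is a minimal sketch, assuming the container is published on localhost:4000 as in the docker run command above. The helper names build_request_url and trigger_rdfizer are hypothetical, and because the response format is not described here, the body is returned as raw text.

```python
from urllib.request import urlopen


def build_request_url(config_file, host="localhost", port=4000):
    """Return the graph-creation URL for a config file stored under /data."""
    # The /graph_creation/data/ prefix is required by the container.
    return f"http://{host}:{port}/graph_creation/data/{config_file}"


def trigger_rdfizer(config_file, host="localhost", port=4000):
    """Send the GET request and return the response body as text."""
    url = build_request_url(config_file, host, port)
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling trigger_rdfizer("your-config-file.ini") corresponds to the curl command above and needs the container to be running.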
docker build -t rdfizer .
docker run -d -p 4000:4000 -v /path/../SDM-RDFizer/example:/data rdfizer
curl http://localhost:4000/graph_creation/data/config.ini
ls /path/../SDM-RDFizer/example/output
pip install -r requirements.txt
python3 rdfizer/run_rdfizer.py /path/to/config/FILE
pip install rdfizer
python3 -m rdfizer -c /path/to/config/FILE
The SDM-RDFizer receives as input a configuration file that indicates the location of the RML triple maps and of the output RDF knowledge graph. This file also sets the values of additional variables required during the creation of the RDF knowledge graph.
A description of each parameter of the configuration file can be found here: https://github.com/SDM-TIB/SDM-RDFizer/wiki/The-Parameters-of-the-Configuration-file
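The ${default:main_directory} references used in the configuration files follow the section:option syntax of extended interpolation in Python's configparser module. The sketch below shows how such references resolve, assuming the file is read with configparser.ExtendedInterpolation. The SDM-RDFizer's own loading code is not shown here, and the sample text and the load_config helper are for illustration only.

```python
from configparser import ConfigParser, ExtendedInterpolation

# Inline sample in the same style as the configuration files in this document.
SAMPLE = """
[default]
main_directory: /path/to/datasets
[datasets]
number_of_datasets: 1
output_folder: ${default:main_directory}/graph
[dataset1]
name: OutputRDFkg
mapping: ${default:main_directory}/mappingDataset1.ttl
"""


def load_config(text):
    """Parse config text, resolving ${section:option} references on access."""
    parser = ConfigParser(interpolation=ExtendedInterpolation())
    parser.read_string(text)
    return parser


config = load_config(SAMPLE)
output_folder = config["datasets"]["output_folder"]
mapping = config["dataset1"]["mapping"]
print(output_folder)  # /path/to/datasets/graph
print(mapping)        # /path/to/datasets/mappingDataset1.ttl
```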
Example of a config file for accessing two CSV datasets and integrating them into a single RDF knowledge graph
This configuration file indicates that one RDF knowledge graph will be created from the execution of the RDF triple maps ${default:main_directory}/mappingDataset1.ttl and ${default:main_directory}/mappingDataset2.ttl. Duplicates are eliminated in the RDF knowledge graph.
[default]
main_directory: /path/to/datasets
[datasets]
number_of_datasets: 2
output_folder: ${default:main_directory}/graph
all_in_one_file: yes
remove_duplicate: yes
enrichment: yes
name: OutputRDFkg1
new_formulation: no
[dataset1]
name: OutputRDFkg-D1
mapping: ${default:main_directory}/mappingDataset1.ttl
[dataset2]
name: OutputRDFkg-D2
mapping: ${default:main_directory}/mappingDataset2.ttl
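In the example above, number_of_datasets matches the number of [datasetN] sections. A quick consistency check can catch a mismatch before a run. This is a sketch under the assumption that sections are numbered dataset1 to datasetN as in the example. The function missing_dataset_sections is a hypothetical helper, not part of the SDM-RDFizer.

```python
from configparser import ConfigParser, ExtendedInterpolation


def missing_dataset_sections(parser):
    """Return dataset sections announced by number_of_datasets but absent."""
    expected = parser.getint("datasets", "number_of_datasets")
    names = [f"dataset{i}" for i in range(1, expected + 1)]
    return [name for name in names if not parser.has_section(name)]


# Deliberately incomplete sample: two datasets announced, one defined.
parser = ConfigParser(interpolation=ExtendedInterpolation())
parser.read_string("""
[default]
main_directory: /path/to/datasets
[datasets]
number_of_datasets: 2
output_folder: ${default:main_directory}/graph
[dataset1]
name: OutputRDFkg-D1
mapping: ${default:main_directory}/mappingDataset1.ttl
""")

missing = missing_dataset_sections(parser)
print(missing)  # ['dataset2']
```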
This configuration file indicates that two RDF knowledge graphs will be created from the execution of the RDF triple maps ${default:main_directory}/mappingDataset1.ttl and ${default:main_directory}/mappingDataset2.ttl. Duplicates are eliminated in both RDF knowledge graphs.
[default]
main_directory: /path/to/datasets
[datasets]
number_of_datasets: 2
output_folder: ${default:main_directory}/graph
all_in_one_file: no
remove_duplicate: yes
enrichment: yes
name: output
new_formulation: no
[dataset1]
name: OutputRDFkg1
mapping: ${default:main_directory}/mappingDataset1.ttl
[dataset2]
name: OutputRDFkg2
mapping: ${default:main_directory}/mappingDataset2.ttl
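The effect of all_in_one_file on the number of resulting graphs can be sketched as follows. This is only an illustration of the behaviour described in the text, not the SDM-RDFizer's implementation. It assumes that the name in [datasets] labels the combined graph when all_in_one_file is yes, and that each [datasetN] name labels its own graph otherwise. expected_graph_names is a hypothetical helper.

```python
from configparser import ConfigParser, ExtendedInterpolation

CONFIG_TEXT = """
[default]
main_directory: /path/to/datasets
[datasets]
number_of_datasets: 2
output_folder: ${default:main_directory}/graph
all_in_one_file: no
name: output
[dataset1]
name: OutputRDFkg1
mapping: ${default:main_directory}/mappingDataset1.ttl
[dataset2]
name: OutputRDFkg2
mapping: ${default:main_directory}/mappingDataset2.ttl
"""


def expected_graph_names(parser):
    """List the graph names implied by all_in_one_file (assumed semantics)."""
    count = parser.getint("datasets", "number_of_datasets")
    # configparser's getboolean accepts yes/no, true/false, on/off and 1/0.
    if parser.getboolean("datasets", "all_in_one_file"):
        return [parser.get("datasets", "name")]
    return [parser.get(f"dataset{i}", "name") for i in range(1, count + 1)]


parser = ConfigParser(interpolation=ExtendedInterpolation())
parser.read_string(CONFIG_TEXT)
separate = expected_graph_names(parser)

# Flip the flag in memory to compare with the single-graph case.
parser.set("datasets", "all_in_one_file", "yes")
merged = expected_graph_names(parser)

print(separate)  # ['OutputRDFkg1', 'OutputRDFkg2']
print(merged)    # ['output']
```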
This configuration file indicates that two RDF knowledge graphs will be created from the execution of the RDF triple maps ${default:main_directory}/mappingDataset1.ttl and ${default:main_directory}/mappingDataset2.ttl. Duplicates are eliminated in both RDF knowledge graphs. The following variables need to be defined for accessing each dataset from the database management system.
[default]
main_directory: /path/to/datasets
[datasets]
number_of_datasets: 2
output_folder: ${default:main_directory}/graph
all_in_one_file: no
remove_duplicate: yes
enrichment: yes
dbType: mysql
name: output
new_formulation: no
[dataset1]
user: root
password: 06012009mj
host: localhost
port: 3306
name: OutputRDFkg1
mapping: ${default:main_directory}/mappingDataset1.ttl
[dataset2]
user: root
password: 06012009mj
host: localhost
port: 3306
db: databaseInMySQL
name: OutputRDFkg2
mapping: ${default:main_directory}/mappingDataset2.ttl
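Since each dataset section carries its own connection variables, it can help to verify that none is missing before starting a run. The sketch below reports, per [datasetN] section, which of the keys user, password, host, port and db are absent. The key list is taken from the example above, and treating all five as mandatory is an assumption. The sample mirrors the structure of the example with placeholder passwords, and missing_connection_keys is a hypothetical helper.

```python
from configparser import ConfigParser, ExtendedInterpolation

# Keys seen in the MySQL example; treating all of them as required is an assumption.
CONNECTION_KEYS = ("user", "password", "host", "port", "db")


def missing_connection_keys(parser):
    """Map each [datasetN] section to the connection keys it lacks."""
    report = {}
    for section in parser.sections():
        # Skip [default] and the global [datasets] section.
        if section.startswith("dataset") and section != "datasets":
            absent = [key for key in CONNECTION_KEYS if key not in parser[section]]
            if absent:
                report[section] = absent
    return report


parser = ConfigParser(interpolation=ExtendedInterpolation())
parser.read_string("""
[datasets]
number_of_datasets: 2
dbType: mysql
[dataset1]
user: root
password: change-me
host: localhost
port: 3306
[dataset2]
user: root
password: change-me
host: localhost
port: 3306
db: databaseInMySQL
""")

report = missing_connection_keys(parser)
print(report)  # {'dataset1': ['db']}
```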
This configuration file indicates that one RDF knowledge graph will be created from the execution of the RDF triple map ${default:main_directory}/mappingDataset1.ttl. Duplicates are eliminated from the RDF knowledge graph. The variable db indicates the database in Postgres that will be accessed.
[default]
main_directory: /path/to/datasets
[datasets]
number_of_datasets: 1
output_folder: ${default:main_directory}/graph
all_in_one_file: no
remove_duplicate: yes
enrichment: yes
dbType: postgres
name: output
new_formulation: no
[dataset1]
user: postgres
password: postgres
host: localhost
db: databaseInPostgres
name: OutputRDFkg
mapping: ${default:main_directory}/mappingDataset1.ttl
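To show how the connection variables of a dataset section fit together, the sketch below assembles them into a standard PostgreSQL connection URI. How the SDM-RDFizer itself opens the connection is not described here, so this is only an illustration. The helper postgres_uri is hypothetical, the sample values are placeholders, and falling back to port 5432 (the PostgreSQL default) when no port is given is an assumption.

```python
from configparser import ConfigParser, ExtendedInterpolation
from urllib.parse import quote


def postgres_uri(section, default_port=5432):
    """Build a postgresql:// URI from the connection keys of one section."""
    # Percent-encode credentials so characters such as '@' or ':' stay valid.
    user = quote(section["user"], safe="")
    password = quote(section["password"], safe="")
    # Assumption: an omitted port means the PostgreSQL default.
    port = section.get("port", fallback=str(default_port))
    return f"postgresql://{user}:{password}@{section['host']}:{port}/{section['db']}"


parser = ConfigParser(interpolation=ExtendedInterpolation())
parser.read_string("""
[dataset1]
user: postgres
password: p@ss:word
host: localhost
db: exampledb
""")

uri = postgres_uri(parser["dataset1"])
print(uri)  # postgresql://postgres:p%40ss%3Aword@localhost:5432/exampledb
```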