Install&Run

Installing and Running the SDM-RDFizer

The SDM-RDFizer can be run by building a Docker container or by installing it locally.

Accessing the SDM-RDFizer via Docker

Build the Docker container:

Note: All files in the same folder as the Dockerfile will be copied into the container.

docker build -t rdfizer .

To run the application, you need to map your data volume to the /data folder of the container. The /data folder will contain the data, mappings, and configuration files to be used.

docker run -d -p 4000:4000 -v /path/to/yourdata:/data rdfizer
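
For illustration, the mounted host folder could be laid out as follows (all file names are hypothetical); inside the container these files are visible under /data, so the paths in the configuration file typically point there:

/path/to/yourdata/
    dataset1.csv
    mapping.ttl
    config.ini
    output/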

Send a GET request with the configuration file to the SDM-RDFizer container.

Note: The entire localhost:4000/graph_creation/data/ prefix is required for the proper functioning of the SDM-RDFizer container.

curl localhost:4000/graph_creation/data/your-config-file.ini

Retrieve the results from the container (if the output folder is inside the data folder, the results are already on your host):

docker cp CONTAINER_ID:/app/path/to/output .
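
The CONTAINER_ID can be obtained, for example, by filtering the running containers by the image name used above:

docker ps --filter ancestor=rdfizer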

Example of executing the SDM-RDFizer in Docker

Note: All files in the same folder as the Dockerfile will be copied into the container.

docker build -t rdfizer .
docker run -d -p 4000:4000 -v /path/../SDM-RDFizer/example:/data rdfizer
curl http://localhost:4000/graph_creation/data/config.ini
ls /path/../SDM-RDFizer/example/output
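
A configuration file for this setup follows the structure described in the sections below; a minimal sketch, assuming a single dataset and paths expressed relative to the /data mount point (all file names are illustrative):

[default]
main_directory: /data

[datasets]
number_of_datasets: 1
output_folder: ${default:main_directory}/output
all_in_one_file: no
remove_duplicate: yes
enrichment: yes
name: output
new_formulation: no

[dataset1]
name: ExampleKG
mapping: ${default:main_directory}/mapping.ttl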

Running the SDM-RDFizer locally

pip install -r requirements.txt
python3 rdfizer/run_rdfizer.py /path/to/config/FILE
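
For example, when installing from source, the repository can be cloned and the configuration file passed to the run script (the config path below is illustrative):

git clone https://github.com/SDM-TIB/SDM-RDFizer.git
cd SDM-RDFizer
python3 -m pip install -r requirements.txt
python3 rdfizer/run_rdfizer.py /path/to/config/config.ini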

Running the SDM-RDFizer locally with the Library

pip install rdfizer
python3 -m rdfizer -c /path/to/config/FILE
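
For example, to keep the dependencies isolated, the library can be installed into a virtual environment (the environment name is arbitrary):

python3 -m venv rdfizer-env
source rdfizer-env/bin/activate
python3 -m pip install rdfizer
python3 -m rdfizer -c /path/to/config/config.ini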

Parameters to Run the SDM-RDFizer

The SDM-RDFizer receives as input a configuration file that indicates the location of the RML triple maps and of the output RDF knowledge graph. This file also indicates the values of additional variables required during the process of RDF knowledge graph creation.

The description of each parameter of the configuration file can be found here: https://github.com/SDM-TIB/SDM-RDFizer/wiki/The-Parameters-of-the-Configuration-file

Examples of configurations

Example of a config file for accessing two CSV datasets and integrating them into a single RDF knowledge graph

This configuration file indicates that one RDF knowledge graph will be created from the execution of the RML triple maps ${default:main_directory}/mappingDataset1.ttl and ${default:main_directory}/mappingDataset2.ttl. Duplicates are eliminated from the RDF knowledge graph.

[default]
main_directory: /path/to/datasets

[datasets]
number_of_datasets: 2
output_folder: ${default:main_directory}/graph
all_in_one_file: yes
remove_duplicate: yes
enrichment: yes
name: OutputRDFkg1
new_formulation: no

[dataset1]
name: OutputRDFkg-D1
mapping: ${default:main_directory}/mappingDataset1.ttl 

[dataset2]
name: OutputRDFkg-D2
mapping: ${default:main_directory}/mappingDataset2.ttl 
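
Assuming the configuration above is saved as /path/to/datasets/config.ini, it can be executed with either installation method, for example:

python3 -m rdfizer -c /path/to/datasets/config.ini

Since all_in_one_file is set to yes, a single merged file is expected in ${default:main_directory}/graph, named after the name given in the [datasets] section (OutputRDFkg1), assuming the default N-Triples serialization.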

Example of a config file for accessing two CSV datasets with separate RDF knowledge graph creation

This configuration file indicates that two RDF knowledge graphs will be created from the execution of the RML triple maps ${default:main_directory}/mappingDataset1.ttl and ${default:main_directory}/mappingDataset2.ttl. Duplicates are eliminated from both RDF knowledge graphs.

[default]
main_directory: /path/to/datasets

[datasets]
number_of_datasets: 2
output_folder: ${default:main_directory}/graph
all_in_one_file: no
remove_duplicate: yes
enrichment: yes
name: output
new_formulation: no

[dataset1]
name: OutputRDFkg1
mapping: ${default:main_directory}/mappingDataset1.ttl 

[dataset2]
name: OutputRDFkg2
mapping: ${default:main_directory}/mappingDataset2.ttl 
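
In contrast, since all_in_one_file is set to no here, two separate files are expected in ${default:main_directory}/graph, one per dataset and named after the name of each [dataset] section, assuming the default N-Triples serialization:

ls /path/to/datasets/graph
OutputRDFkg1.nt  OutputRDFkg2.nt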

Example of a config file for accessing two datasets in MySQL

This configuration file indicates that two RDF knowledge graphs will be created from the execution of the RML triple maps ${default:main_directory}/mappingDataset1.ttl and ${default:main_directory}/mappingDataset2.ttl. Duplicates are eliminated from both RDF knowledge graphs. The following variables need to be defined to access each dataset in the database management system.

[default]
main_directory: /path/to/datasets

[datasets]
number_of_datasets: 2
output_folder: ${default:main_directory}/graph
all_in_one_file: no
remove_duplicate: yes
enrichment: yes
dbType: mysql
name: output
new_formulation: no


[dataset1]
user: root
password: 06012009mj
host: localhost
port: 3306
db: databaseInMySQL
name: OutputRDFkg1
mapping: ${default:main_directory}/mappingDataset1.ttl

[dataset2]
user: root
password: 06012009mj
host: localhost
port: 3306
db: databaseInMySQL
name: OutputRDFkg2
mapping: ${default:main_directory}/mappingDataset2.ttl
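
Before running the SDM-RDFizer, the connection details can be checked against the database, for example with the MySQL command-line client (if installed), using the values from [dataset2]:

mysql -u root -p -h localhost -P 3306 databaseInMySQL -e "SHOW TABLES;"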

Example of a config file for accessing data in Postgres

This configuration file indicates that one RDF knowledge graph will be created from the execution of the RML triple map ${default:main_directory}/mappingDataset1.ttl. Duplicates are eliminated from the RDF knowledge graph. The variable db indicates the database in Postgres that will be accessed.

[default]
main_directory: /path/to/datasets

[datasets]
number_of_datasets: 1
output_folder: ${default:main_directory}/graph
all_in_one_file: no
remove_duplicate: yes
enrichment: yes
dbType: postgres
name: output
new_formulation: no

[dataset1]
user: postgres
password: postgres
host: localhost
db: databaseInPostgres
name: OutputRDFkg
mapping: ${default:main_directory}/mappingDataset1.ttl
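
Similarly, the connection details for Postgres can be verified with the psql client (if installed) before launching the SDM-RDFizer:

psql -U postgres -h localhost -d databaseInPostgres -c "\dt"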