Join Ordering of SPARQL Property Path Queries

This repository contains the source code, the configuration files, and the queries used in the experimental study presented in our paper Join Ordering of SPARQL Property Path Queries.

Setup

To get started quickly, run the following commands on a single machine. They install everything you need to reproduce our experimental results.

Clone and install the project

We use conda to manage the project dependencies. If conda is not installed on your system, you can download it from their website.

```
git clone https://github.com/JulienDavat/Join-Ordering-of-SPARQL-Property-Path-Queries.git xp-eswc2023
cd xp-eswc2023

conda env create -f environment.yaml
conda activate xp
```

Install HDT

This project uses a custom version of HDT that needs to be installed on your system.

```
git clone https://github.com/JulienDavat/hdt-bindings.git hdt
cd hdt

git clone git@github.com:rdfhdt/hdt-cpp.git
cd hdt-cpp
git checkout tags/v1.3.3 -b master
cd ..

python -m pip install .
```

Download HDT files

Random Walks are performed over HDT. Please download HDT files from this link into the data directory. If the data directory does not exist, please create it.
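
A minimal sketch of the directory step, assuming the commands are run from the repository root and that the downloaded files use the .hdt extension (both are assumptions, not stated above):

```shell
# Create the data directory if it does not exist yet (no-op otherwise).
DATA_DIR="data"
mkdir -p "$DATA_DIR"

# List any HDT files already present. The *.hdt pattern is an assumption.
ls "$DATA_DIR"/*.hdt 2>/dev/null || echo "No .hdt files found in $DATA_DIR yet."
```

Because mkdir -p succeeds when the directory already exists, the snippet is safe to re-run.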

Install Virtuoso v7.2.7

```
wget https://github.com/openlink/virtuoso-opensource/releases/download/v7.2.7/virtuoso-opensource-7.2.7.tar.gz
tar -zxvf virtuoso-opensource-7.2.7.tar.gz

cd virtuoso-opensource-7.2.7
./configure
make
make install
```
The configuration file used in our experiments is available in the config directory. Edit it to point to the location of Virtuoso on your system. The same location must also be set in the server.sh script. Finally, add Virtuoso's bin directory to your PATH variable.
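
The PATH step can be sketched as follows. The prefix /usr/local/virtuoso-opensource and the VIRTUOSO_HOME variable name are assumptions for illustration. Use the prefix you actually installed to.

```shell
# Assumption: Virtuoso was installed under the default prefix of ./configure.
# Set VIRTUOSO_HOME beforehand if you passed --prefix or installed it elsewhere.
VIRTUOSO_HOME="${VIRTUOSO_HOME:-/usr/local/virtuoso-opensource}"

# Prepend the bin directory so that Virtuoso's isql and virtuoso-t are found first.
export PATH="$VIRTUOSO_HOME/bin:$PATH"

# Report whether isql is reachable from the updated PATH.
command -v isql >/dev/null || echo "isql not found; check VIRTUOSO_HOME"
```

Prepending rather than appending matters on systems where another tool also named isql (for example the one shipped with unixODBC) is already on the PATH. To make the change permanent, the export line can be added to your shell startup file.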

If everything went well, you should be able to start Virtuoso with the following command:
```
bash server.sh start virtuoso
```
Virtuoso can be stopped using the same script:
```
bash server.sh stop virtuoso
```

Install BlazeGraph v2.1.6

```
wget https://github.com/blazegraph/database/releases/download/BLAZEGRAPH_2_1_6_RC/bigdata.jar
```
The configuration file used in our experiments is available in the config directory. Copy it into the same directory as the .jar file. The location of BlazeGraph must also be set in the server.sh script.

If everything went well, you should be able to start BlazeGraph with the following command:
```
bash server.sh start blazegraph
```
BlazeGraph can be stopped using the same script:
```
bash server.sh stop blazegraph
```

Download the WDBench dataset

The dataset can be downloaded from Figshare. If you run into any problems, please refer to the official WDBench GitHub repository.

Load data into Virtuoso

The WDBench dataset can be loaded into Virtuoso using the following commands. You just have to indicate the directory that contains the .nt file: ld_dir takes a directory, a file mask, and the target graph.
```
isql "EXEC=ld_dir('<directory containing the .nt file>', '*.nt', 'http://example.com/wdbench');"
isql "EXEC=rdf_loader_run();"
isql "EXEC=checkpoint;"
```
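
As a sketch of how the ld_dir arguments fit together: the first argument is a directory, the second a file mask, and the third the target graph. The path /path/to/wdbench.nt below is a hypothetical placeholder, and the variable names are illustrative. The snippet only prints the isql command and does not run it.

```shell
# Hypothetical location of the downloaded dataset; replace with your own path.
NT_FILE="/path/to/wdbench.nt"

# Split the path into the directory and file name expected by ld_dir.
NT_DIR="$(dirname "$NT_FILE")"
NT_NAME="$(basename "$NT_FILE")"

# Build the statement: ld_dir(<directory>, <file mask>, <graph IRI>).
LD_DIR_STMT="ld_dir('$NT_DIR', '$NT_NAME', 'http://example.com/wdbench');"

# Print the command so it can be reviewed before running it by hand.
echo "isql \"EXEC=$LD_DIR_STMT\""
```

Using the exact file name as the mask registers only that file, whereas '*.nt' registers every .nt file in the directory. Virtuoso generally also requires the directory to be listed under DirsAllowed in its configuration file.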

Load data into BlazeGraph

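The WDBench dataset can be loaded into BlazeGraph using the following command. You just have to indicate the location of the .nt file. If the .jar file you downloaded is named bigdata.jar, as in the download command above, use that file name in the -cp argument.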
```
java -cp blazegraph.jar com.bigdata.rdf.store.DataLoader -defaultGraph http://example.com/wdbench blazegraph.properties <your file here>
```

Quickstart

Experiments are powered by snakemake, a scientific workflow management system written in Python. To re-run our experiments, run the following commands:

```
# For Virtuoso
snakemake --configfile virtuoso.yaml -C 'runs=[1,2,3,4]' timeout=900000 -c1

# For BlazeGraph
snakemake --configfile blazegraph.yaml -C 'runs=[1,2,3,4]' timeout=900 -c1
```

Visualization

The data generated by the two snakemake commands is written to the output directory. To visualize it, open the provided Jupyter notebook with the following command:

```
jupyter notebook eswc2023.ipynb
```
