RubyPiche is a Ruby parser and a set of tools for Piche, a Turtle-like language. The name comes from the piche (Zaedyus pichiy), also known as the dwarf armadillo, an animal that looks like a turtle but is not one. Piches live in the south of Argentina and Chile. The Piche language is a compact way to encode triples. The following examples encode the same triples in Piche, Turtle and N-Triples:
Piche: a | b c & d f , g ; h i .
Turtle: a c f , g ; h i . a d f , g . b c f , g ; h i . b d f , g .
N-Triples: a c f . a c g . a d f . a d g . a h i . b c f . b c g . b d f . b d g . b h i .
The diagrams below, drawn for the examples above, show that N-Triples, Turtle without blank nodes, and Piche are all structured as digraphs of length 3. However, N-Triples instances are lists, Turtle instances are trees, and Piche instances are acyclic digraphs.
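The expansion from one Piche statement to plain triples can be sketched in Ruby as a Cartesian product. This is a minimal illustration, not RubyPiche's actual code: the `expand` method and its argument layout are assumptions made for this example.

```ruby
# Expand one Piche statement, given in already-split form, into plain triples.
# heads: terms that were separated by '|'
# tail:  list of [mterms, lterms] pairs that were separated by ';',
#        where mterms were joined by '&' and lterms by ','
def expand(heads, tail)
  heads.flat_map do |s|
    tail.flat_map do |mterms, lterms|
      # Every middle term combines with every last term.
      mterms.product(lterms).map { |p, o| [s, p, o] }
    end
  end
end

# a | b c & d f , g ; h i .
triples = expand(%w[a b], [[%w[c d], %w[f g]], [%w[h], %w[i]]])
triples.each { |t| puts "#{t.join(' ')} ." }
```

Run on the Piche example, this prints the ten N-Triples statements listed above, in the same order.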
Piche also incorporates modules, an idea proposed by Javier D. Fernández and Claudio Gutiérrez in Compact and Modular Representation of Large RDF Data Sets. Modules are introduced by the keywords @subj, @pred and @obj, which mean that the following triples begin with the subject, the predicate and the object respectively. For example:
-
@subj a b c ; d e .
is equivalent to a b c . a d e .
-
@pred a b c ; d e .
is equivalent to b a c . d a e .
-
@obj a b c ; d e .
is equivalent to b c a . d e a .
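The effect of a module keyword can be sketched as choosing where the head term lands in each triple. The sketch assumes the head always takes the position named by the keyword and the two remaining terms keep their order; `HEAD_POSITION` and `place` are hypothetical names, not part of RubyPiche.

```ruby
# Position of the head term in the resulting triple for each module keyword.
HEAD_POSITION = { '@subj' => 0, '@pred' => 1, '@obj' => 2 }.freeze

# Build one triple from the head term and the two terms of a tail pair.
# The tail terms fill the positions not taken by the head, left to right.
def place(mod, head, first, second)
  [first, second].insert(HEAD_POSITION.fetch(mod), head)
end

place('@subj', 'a', 'b', 'c') # => ["a", "b", "c"]
place('@pred', 'a', 'b', 'c') # => ["b", "a", "c"]
place('@obj',  'a', 'b', 'c') # => ["b", "c", "a"]
```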
The figure above shows a comparison of Piche with the other languages, Turtle and N-Triples:
The Piche syntax is:
statement ::= triples | ws* | module
module    ::= '@subj' | '@pred' | '@obj'
triples   ::= head ws+ tail '.'
head      ::= term | head '|' term
tail      ::= pairs | pairs ';' tail
pairs     ::= mterms ws+ lterms
mterms    ::= term | term '&' mterms
lterms    ::= term | term ',' lterms
Here ws is white space and terms are identifiers or literals. Piche has no blank nodes, because it is designed to work with a dictionary that associates the simple local Piche identifiers with any other identifier system. Thus, blank nodes are nodes without an external identifier in the dictionary.
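The triples rule of the grammar can be read as a small recursive-descent parser. The following is a sketch only, not the RubyPiche parser: it handles a single statement, treats every run of non-space, non-operator characters as a term (so literals and identifiers containing '.' are not supported), and all method names are assumptions.

```ruby
OPERATORS = %w[| & , ; .].freeze

# Split a statement into operator tokens and term tokens.
def tokenize(src)
  src.scan(/[|&,;.]|[^\s|&,;.]+/)
end

# Consume one term, rejecting operators and end of input.
def next_term(tokens)
  tok = tokens.shift
  if tok.nil? || OPERATORS.include?(tok)
    raise ArgumentError, "expected a term, got #{tok.inspect}"
  end
  tok
end

# Parse one or more terms joined by the separator sep.
def parse_list(tokens, sep)
  terms = [next_term(tokens)]
  while tokens.first == sep
    tokens.shift # drop the separator
    terms << next_term(tokens)
  end
  terms
end

# triples ::= head ws+ tail '.'
def parse_triples(src)
  tokens = tokenize(src)
  head = parse_list(tokens, '|')
  tail = []
  loop do
    mterms = parse_list(tokens, '&')
    lterms = parse_list(tokens, ',')
    tail << [mterms, lterms]
    break unless tokens.first == ';'
    tokens.shift # drop ';' and parse the next pairs
  end
  raise ArgumentError, "expected '.' at end of statement" unless tokens == ['.']
  [head, tail]
end

parse_triples('a | b c & d f , g ; h i .')
# => [["a", "b"], [[["c", "d"], ["f", "g"]], [["h"], ["i"]]]]
```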
The first Piche notation was designed for human readability. Other notations are designed for other goals.
- Array Notation
-
Is a notation that models Piche instances as arrays. For example, the notation for
a b c , d .
is [[:a], [[:b], [:c, :d]]]
in Ruby.
- Notation 32 and Notation 64
-
Are notations that encode triples as binary sequences in which each identifier is a 32-bit or 64-bit integer, respectively.
- Compressed Notation
-
Is a notation in which identifiers and operators are encoded with integer compression techniques, such as Elias coding.
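To illustrate the difference between the fixed-width and compressed notations, the sketch below packs integer identifiers at 32 or 64 bits with Ruby's `Array#pack` and, separately, encodes positive integers with the Elias gamma code, one member of the Elias family. The byte order (big-endian), the flat subject-predicate-object layout, the absence of operators, and all method names are assumptions; the actual Piche binary formats may differ.

```ruby
# Fixed-width sketch: every identifier becomes one unsigned integer.
# 'N' is a 32-bit big-endian integer, 'Q>' a 64-bit big-endian integer.
def pack_triples(triples, bits: 32)
  triples.flatten.pack(bits == 32 ? 'N*' : 'Q>*')
end

def unpack_triples(bytes, bits: 32)
  bytes.unpack(bits == 32 ? 'N*' : 'Q>*').each_slice(3).to_a
end

# Elias gamma code of a positive integer, as a string of '0' and '1':
# (length - 1) zeros followed by the binary digits of n.
def elias_gamma(n)
  raise ArgumentError, 'n must be >= 1' unless n >= 1
  binary = n.to_s(2)
  '0' * (binary.length - 1) + binary
end

# Decode a concatenation of Elias gamma codes back into integers.
def elias_gamma_decode(bits)
  values = []
  pos = 0
  while pos < bits.length
    zeros = 0
    zeros += 1 while bits[pos + zeros] == '0'
    values << bits[pos + zeros, zeros + 1].to_i(2)
    pos += 2 * zeros + 1
  end
  values
end

pack_triples([[1, 2, 3]]).bytesize           # => 12
pack_triples([[1, 2, 3]], bits: 64).bytesize # => 24
[1, 2, 3].map { |n| elias_gamma(n) }.join    # => "1010011"
```

Small identifiers take far fewer bits under the gamma code than under a fixed 32-bit width, which is the motivation for a compressed notation.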
RubyPiche includes, or will include, the following tools:
-
A lexical parser for Piche files.
-
An in-memory (RAM) graph structure that answers some basic queries.
-
A parser that reads a Piche file and generates a RAM graph for it.
-
A constructor that builds Piche files from Turtle or N-Triples files.
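As an illustration of what an in-memory graph answering basic queries could look like, here is a minimal sketch. The class `TinyGraph` and its methods are hypothetical and do not describe the structure RubyPiche actually provides.

```ruby
require 'set'

# Hypothetical in-memory triple store with wildcard pattern matching.
class TinyGraph
  def initialize
    @triples = Set.new # a set, so duplicate triples are stored once
  end

  def add(s, p, o)
    @triples << [s, p, o]
    self # allow chained calls
  end

  # Return the triples matching a pattern; nil acts as a wildcard.
  def match(s = nil, p = nil, o = nil)
    pattern = [s, p, o]
    @triples.select do |triple|
      pattern.zip(triple).all? { |want, have| want.nil? || want == have }
    end.to_a
  end
end

g = TinyGraph.new
g.add('a', 'c', 'f').add('a', 'h', 'i').add('b', 'c', 'f')
g.match('a')           # triples whose subject is a
g.match(nil, 'c', 'f') # triples with predicate c and object f
```

A linear scan is enough for a sketch; a real structure would likely index triples by subject, predicate or object.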
- Author
-
Daniel Hernández, [email protected]
- Version
-
0.0
- License
-
GPL V3
This software is provided “as is” and without any express or implied warranties, including, without limitation, the implied warranties of merchantability and fitness for a particular purpose.