26 changes: 13 additions & 13 deletions test/DefaultLabelTypes_3.xml
@@ -193,7 +193,7 @@ Free / paid for</Description>
<Description>Text represents natural language.

Examples:
-A news artcile
+A news article

Related:
</Description>
@@ -226,7 +226,7 @@ Online handwriting</Description>
</LabelType>
</LabelType>
<LabelType caption="Activity Domain" name="activityDomain">
-<Description>General domain, research field or specific processing strategy of a workflow activty.
+<Description>General domain, research field or specific processing strategy of a workflow activity.

Examples:
An activity for automated number plate recognition could be labelled with "OCR" domain.
@@ -333,7 +333,7 @@ OCR</Description>

Examples:
Stock exchange data in a newspaper,
-Filled in questionaire form
+Filled in questionnaire form

Related:
OCR
@@ -419,7 +419,7 @@ Text recognition (Visual Computing)</Description>
<Description>Natural language processing (NLP) is a field of computer science, artificial intelligence, and computational linguistics concerned with the interactions between computers and human (natural) languages. As such, NLP is related to the area of human–computer interaction. Many challenges in NLP involve: natural language understanding, enabling computers to derive meaning from human or natural language input; and others involve natural language generation.

Examples:
-Digitial assistents (e.g. in smartphones)
+Digital assistants (e.g. in smartphones)

Related:
OCR</Description>
@@ -457,7 +457,7 @@ Examples:
A simplified form of this is commonly taught to school-age children, in the identification of words as nouns, verbs, adjectives, adverbs, etc.

Related:
-Named entitiy recognition,
+Named entity recognition,
Tokenisation (as part of Data creation / transformation)</Description>
</LabelType>
<LabelType caption="Named entity recognition" name="namedEntities">
@@ -682,7 +682,7 @@ Part-of-speech tagging</Description>

Examples:
Part-of-speech tagging,
-Named entitiy tagging,
+Named entity tagging,
Page layout annotation (regions etc.)

Related:
@@ -795,7 +795,7 @@ Licence</Description>
<Description>Experimental, in development, prototype</Description>
</LabelType>
<LabelType caption="Industrial" name="industrial">
-<Description>Production-strengh method / system that is reliable, tested, and robust</Description>
+<Description>Production-strength method / system that is reliable, tested, and robust</Description>
</LabelType>
</LabelType>
<LabelType caption="Original Source" name="originalSource">
@@ -819,7 +819,7 @@ Whiteboard writing
Related:
Physical production method</Description>
<LabelType caption="Paper document" name="paper">
-<Description>The data was orignially produced on paper
+<Description>The data was originally produced on paper

Example:
Printed magazine
@@ -993,7 +993,7 @@ Source medium</Description>
</LabelType>
</LabelType>
<LabelType caption="Content of Interest" name="contentOfInterest">
-<Description>Source / target content. What is the intersting bit in the data at hand.</Description>
+<Description>Source / target content. What is the interesting bit in the data at hand.</Description>
<LabelType caption="Visual content" name="visual">
<LabelType caption="Text" name="text"/>
<LabelType caption="Graphical" name="graphical">
@@ -1096,7 +1096,7 @@ Book</Description>
<LabelType caption="Production-related" name="production-related">
<Description>Conditions introduced during the production of the medium / object</Description>
<LabelType caption="Document characteristics" name="document-characteristics">
-<Description>Document-related charactersitics</Description>
+<Description>Document-related characteristics</Description>
<LabelType caption="Pasted clippings" name="pasted-clippings">
<Description>Paper clippings pasted onto a page</Description>
</LabelType>
@@ -1110,7 +1110,7 @@ Book</Description>
<Description>The content of a page reaches very close to the page border or even touches it</Description>
</LabelType>
<LabelType caption="Low paper-to-content contrast" name="low-contrast">
-<Description>The contrast bwtween the paper and the page content is very low</Description>
+<Description>The contrast between the paper and the page content is very low</Description>
</LabelType>
<LabelType caption="Halftoning" name="halftoning">
<Description>Dot-based halftoning printing technique was used (to emulate more colours / grey tones)</Description>
@@ -1283,8 +1283,8 @@ Book</Description>
</LabelType>
<LabelType caption="Included other objects" name="included-objects">
<Description>Foreign objects visible</Description>
-<LabelType caption="Part of preceeding or succeeding object" name="preceeding-or-proceeding">
-<Description>Part of preceeding or succeeding object included (e.g. other page)</Description>
+<LabelType caption="Part of preceding or succeeding object" name="preceding-or-proceeding">
+<Description>Part of preceding or succeeding object included (e.g. other page)</Description>
</LabelType>
<LabelType caption="Medium structure" name="medium-structure">
<Description>Medium structure visible (e.g. book cover)</Description>
24 changes: 12 additions & 12 deletions xsd_schema/OCR-D_GT_schema.xsd
@@ -264,7 +264,7 @@
<xsd:documentation xml:lang="en">Text represents natural language.

Examples:
-A news artcile
+A news article

Related:
</xsd:documentation>
@@ -303,7 +303,7 @@
<xsd:enumeration value="content-encoding/mathematical/polygonal"/>
<xsd:enumeration value="activityDomain">
<xsd:annotation>
-<xsd:documentation xml:lang="en">General domain, research field or specific processing strategy of a workflow activty.
+<xsd:documentation xml:lang="en">General domain, research field or specific processing strategy of a workflow activity.

Examples:
An activity for automated number plate recognition could be labelled with "OCR" domain.
@@ -436,7 +436,7 @@

Examples:
Stock exchange data in a newspaper,
-Filled in questionaire form
+Filled in questionnaire form

Related:
OCR
@@ -537,7 +537,7 @@
<xsd:documentation xml:lang="en">Natural language processing (NLP) is a field of computer science, artificial intelligence, and computational linguistics concerned with the interactions between computers and human (natural) languages. As such, NLP is related to the area of human–computer interaction. Many challenges in NLP involve: natural language understanding, enabling computers to derive meaning from human or natural language input; and others involve natural language generation.

Examples:
-Digitial assistents (e.g. in smartphones)
+Digital assistants (e.g. in smartphones)

Related:
OCR</xsd:documentation>
@@ -584,7 +584,7 @@
A simplified form of this is commonly taught to school-age children, in the identification of words as nouns, verbs, adjectives, adverbs, etc.

Related:
-Named entitiy recognition,
+Named entity recognition,
Tokenisation (as part of Data creation / transformation)</xsd:documentation>
</xsd:annotation>
</xsd:enumeration>
@@ -851,7 +851,7 @@

Examples:
Part-of-speech tagging,
-Named entitiy tagging,
+Named entity tagging,
Page layout annotation (regions etc.)

Related:
@@ -989,7 +989,7 @@
</xsd:enumeration>
<xsd:enumeration value="maturity/industrial">
<xsd:annotation>
-<xsd:documentation xml:lang="en">Production-strengh method / system that is reliable, tested, and robust</xsd:documentation>
+<xsd:documentation xml:lang="en">Production-strength method / system that is reliable, tested, and robust</xsd:documentation>
</xsd:annotation>
</xsd:enumeration>
<xsd:enumeration value="originalSource">
@@ -1023,7 +1023,7 @@
</xsd:enumeration>
<xsd:enumeration value="originalSource/produced/physical/paper">
<xsd:annotation>
-<xsd:documentation xml:lang="en">The data was orignially produced on paper
+<xsd:documentation xml:lang="en">The data was originally produced on paper

Example:
Printed magazine
@@ -1238,7 +1238,7 @@
</xsd:enumeration>
<xsd:enumeration value="contentOfInterest">
<xsd:annotation>
-<xsd:documentation xml:lang="en">Source / target content. What is the intersting bit in the data at hand.</xsd:documentation>
+<xsd:documentation xml:lang="en">Source / target content. What is the interesting bit in the data at hand.</xsd:documentation>
</xsd:annotation>
</xsd:enumeration>
<xsd:enumeration value="contentOfInterest/visual"/>
@@ -1367,7 +1367,7 @@
</xsd:enumeration>
<xsd:enumeration value="condition/production-related/document-characteristics">
<xsd:annotation>
-<xsd:documentation xml:lang="en">Document-related charactersitics</xsd:documentation>
+<xsd:documentation xml:lang="en">Document-related characteristics</xsd:documentation>
</xsd:annotation>
</xsd:enumeration>
<xsd:enumeration value="condition/production-related/document-characteristics/pasted-clippings">
@@ -1677,9 +1677,9 @@
<xsd:documentation xml:lang="en">Foreign objects visible</xsd:documentation>
</xsd:annotation>
</xsd:enumeration>
-<xsd:enumeration value="condition/acquisition/content-or-background/included-objects/preceeding-or-proceeding">
+<xsd:enumeration value="condition/acquisition/content-or-background/included-objects/preceding-or-proceeding">
<xsd:annotation>
-<xsd:documentation xml:lang="en">Part of preceeding or succeeding object included (e.g. other page)</xsd:documentation>
+<xsd:documentation xml:lang="en">Part of preceding or succeeding object included (e.g. other page)</xsd:documentation>
</xsd:annotation>
</xsd:enumeration>
<xsd:enumeration value="condition/acquisition/content-or-background/included-objects/medium-structure">