Rewrite SALAMI parser to use raw data #9

@bmcfee

Forking from mir-evaluation/mir_eval#162: parsing the already-parsed SALAMI annotations could lead to errors. We should instead work from the raw version of the annotations.

I at one point had done this, but for the life of me can't find my implementation. As I recall, it was pretty nasty and should be rewritten anyway.

Basically, what one has to do is the following:

  1. Separate instrument labels (which have parentheses) from segment labels
  2. Induce segment intervals from the event boundary markers
  3. Partition segments by vocabulary for conversion
  4. If we're daring, also transfer the instrument annotations by matching parentheses.

Steps 1 and 2 should be easy. Step 3, I think, can be achieved fairly easily by a clever use of the JAMS namespace structure for each annotation, and a cunning use of pandas.
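For steps 1 and 2, something like the following might be a starting point. It is only a sketch: it assumes each raw line is a timestamp, a tab, and then comma-separated labels, and that any label carrying a parenthesis is an instrument marker. The function names `parse_raw_events` and `induce_intervals` are made up for illustration and are not from any existing parser.

```python
def parse_raw_events(text):
    """Split raw annotation text into (time, segment_labels, instrument_labels).

    Assumed line format (not verified against every file):
        <timestamp>\t<label>, <label>, ...
    """
    events = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        time_str, _, label_str = line.partition("\t")
        labels = [tok.strip() for tok in label_str.split(",") if tok.strip()]
        # Step 1: anything with a parenthesis is treated as an instrument marker.
        instruments = [tok for tok in labels if "(" in tok or ")" in tok]
        segments = [tok for tok in labels if tok not in instruments]
        events.append((float(time_str), segments, instruments))
    return events


def induce_intervals(events):
    """Step 2: pair each boundary with the next one as (start, end, labels).

    The final event only closes the last interval, so its own labels
    do not produce an interval.
    """
    events = sorted(events, key=lambda e: e[0])
    return [
        (start, end, segs)
        for (start, segs, _), (end, _, _) in zip(events, events[1:])
    ]
```

With a toy input such as `"0.0\tSilence\n1.5\tA, a, (guitar\n10.0\tb, guitar)\n20.0\tEnd\n"`, the second event parses to segment labels `["A", "a"]` and instrument labels `["(guitar"]`, and three intervals come out of the four boundaries.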

Step 4 is tricky, since you sometimes see open- and close-parens on the same event, and we'll need a namespace for the instruments.
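One possible way to handle the paren matching is sketched below. It rests on an assumption that should be checked against the annotation guidelines: a marker is taken to apply to the segment that starts at its event, so `(x` opens at that segment's start, `x)` closes at that segment's end (the next boundary), and `(x)` covers just that one segment. The function name `match_instruments` and the `(time, tokens)` input shape are hypothetical.

```python
def match_instruments(events):
    """Turn per-event instrument markers into (start, end, name) spans.

    `events` is a list of (time, [marker, ...]) pairs, e.g. (1.5, ["(guitar"]).
    Assumption: markers describe the segment beginning at that event.
    Unmatched closers are skipped, spans never closed are dropped, and
    markers on the final event are ignored (no following boundary).
    """
    events = sorted(events, key=lambda e: e[0])
    open_since = {}
    spans = []
    for (start, tokens), (end, _) in zip(events, events[1:]):
        for tok in tokens:
            name = tok.strip("() ")
            if tok.startswith("("):
                # Keep the earliest start if the same name is opened twice.
                open_since.setdefault(name, start)
            if tok.endswith(")") and name in open_since:
                # Handles both "x)" and the same-event "(x)" case.
                spans.append((open_since.pop(name), end, name))
    return sorted(spans)
```

Because the open check runs before the close check for each marker, a `(voice)` token on a single event yields a span from that boundary to the next one rather than being lost.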
