Skip to content

WIP: RWC POP melody parser#12

Open
justinsalamon wants to merge 6 commits into
masterfrom
rwcpop2melody
Open

WIP: RWC POP melody parser#12
justinsalamon wants to merge 6 commits into
masterfrom
rwcpop2melody

Conversation

@justinsalamon
Copy link
Copy Markdown
Contributor

@justinsalamon
Copy link
Copy Markdown
Contributor Author

Hey everyone, do you think we could have a folder inside parsers called "data", which contains a folder per parser (when needed) with any data files the parser requires? In my case the RWC Pop melody metadata lives online in a table, which I've converted to a csv file, so I'd like to keep this csv file in the repo.

@justinsalamon
Copy link
Copy Markdown
Contributor Author

On a related note, this table also includes the track duration (resolution in seconds only though). @bmcfee do you think it'd make sense to populate the duration field of the JAMS file based on this table?

I could do it from the audio, but RWC is distributed via CD's (at least it was when I got a copy), which means the exact duration of the audio I have might depend on whatever software I used to rip the CD's in the first place, and I have no way of recovering that information. I also don't have the original CD's at hand. So I was thinking the durations reported online might be my best in this case?

@bmcfee
Copy link
Copy Markdown
Member

bmcfee commented Feb 22, 2016

So I was thinking the durations reported online might be my best in this case?

Yeah, in this case, I think that's the best you can hope for.

RWC is distributed on cds, but I thought they were data (not audio) cds? I may be mistaken though.

@justinsalamon
Copy link
Copy Markdown
Contributor Author

RWC is distributed on cds, but I thought they were data (not audio) cds? I may be mistaken though.

I don't remember TBH, you may very well be right. Still, I don't remember by which process I obtained the WAV files that I have... so I'll stick to the times reported online for now.

Data copied directly from the table that can be found online at:
https://staff.aist.go.jp/m.goto/RWC-MDB/rwc-mdb-p.html
Implemented process_folder, create_jams, fill_annotation_metadata and
fill_file_metadata
@justinsalamon
Copy link
Copy Markdown
Contributor Author

I think this is good to go, anyone care to CR it before I merge the PR?

@justinsalamon
Copy link
Copy Markdown
Contributor Author

p.s. - I still need to get permission to include the converted jams files in the repo, working on that.

jam.file_metadata.title = metadata['Title'][n_track]

d_str = metadata['Length'][n_track]
jam.file_metadata.duration = (float(d_str.split(":")[0]) * 60 +
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is pretty dirty. Since we're already using pandas here, we can do something like this instead:

In [4]: pd.Timedelta('0:2:34').total_seconds()
Out[4]: 154.0

(only problem is that you'll have to pad in hours, but i generally prefer that to reinventing second-parsing.)

@bmcfee
Copy link
Copy Markdown
Member

bmcfee commented Feb 25, 2016

@justinsalamon C has been R'ed. Mostly minor style issues, though the tempfile cleanup is a real problem. Otherwise LGTM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants