Standardize vocabulary #7

Open
JG-QuarkNet opened this issue Apr 1, 2021 · 0 comments

JG-QuarkNet commented Apr 1, 2021

In 2019, the CMS data used by iSpy was reorganized from a "flat" set of 10,000 events into a set of "tranches" of data, each tailored to student groups of a different size.

The alterations Joel made to CIMA to accommodate this did not use a consistent vocabulary of labels and variable names for the new data organization scheme. In addition, the original program already contained several unclear or misleading variable names.

Labels and variable names should be standardized. This was Joel's working template for syntax at the time he made the changes:

datablock: 5,10,25,50,100
dataset: 5.1, 10.6, etc.
dataset id: dataset -> [1,190]
dataset number: 1, ..., 100
dataset index: 5.1-4, 10.6-55, etc.
unique id: (int)[(string)(dataset id) + (string,3)(dataset number)]
(replaces "flat" event_id: [1,10000])

He did not consistently apply these, though, and we've fallen into different usages since this change. Tom suggested "group" instead of "datablock", but "group" was already used by CIMA's previous data system and Joel wanted to avoid confusion. Ken has also started using "data file" for what's labeled here as "dataset," which seems fine.
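
To make the "unique id" rule in the template concrete, here is a minimal Python sketch. It rests on a few assumptions that are not confirmed by the template itself: that "(string,3)" means the dataset number is zero-padded to three characters, that the dataset id lies in [1,190], and that the dataset number lies in 1 to 100. The function names `make_unique_id` and `split_unique_id` are hypothetical and are not taken from CIMA.

```python
def make_unique_id(dataset_id: int, dataset_number: int) -> int:
    """Build a unique id by string concatenation, per the template.

    Assumption: "(string,3)" means the dataset number is zero-padded
    to three characters before it is appended to the dataset id.
    """
    # Assumed ranges, read off the template: dataset id in [1,190],
    # dataset number in 1..100.
    if not 1 <= dataset_id <= 190:
        raise ValueError(f"dataset id out of range: {dataset_id}")
    if not 1 <= dataset_number <= 100:
        raise ValueError(f"dataset number out of range: {dataset_number}")
    return int(str(dataset_id) + str(dataset_number).zfill(3))


def split_unique_id(unique_id: int) -> tuple[int, int]:
    """Recover (dataset id, dataset number) from a unique id.

    This works because the last three digits always hold the
    zero-padded dataset number.
    """
    return divmod(unique_id, 1000)


if __name__ == "__main__":
    uid = make_unique_id(12, 55)
    print(uid, split_unique_id(uid))
```

Under these assumptions the fixed three-digit suffix is what keeps the ids unambiguous: without the padding, dataset id 1 with number 11 and dataset id 11 with number 1 would both concatenate to "111".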
