Data format #2

Open
nealf opened this issue Jan 30, 2015 · 0 comments
nealf commented Jan 30, 2015

Here's what I'm proposing for our crime data format:
{
"_id": String (autogenerated),
"AgencyID": String,
"AgencyName": String,
"CaseNumber": String,
"CriminalOffense": String,
"DateReported": DateTime (YYYY-MM-DDTHH:MM:SS),
"Description": String,
"Location": String,
"OccurenceDate": DateTime (YYYY-MM-DDTHH:MM:SS),
"Disposition": String,
"Lat": float,
"Lng": float
}

  • _id can be autogenerated by CouchDB if needed, or we could use a unique ID from the source, like CrimeCodeID
  • Coordinates will need to be converted to lat/lng, or geocoded from the address when no coordinates are available
  • Location should have the city/state appended to the street address
  • CriminalOffense (CrimeCode) should probably be standardized to some extent
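
To make the notes above concrete, here is a rough sketch of mapping one scraped row onto the proposed fields. The input keys (`agency_id`, `reported`, `address`, and so on), the source date format, and the `to_record` helper are all hypothetical; each scraper would have its own. The strip/upper step on CriminalOffense is only a placeholder for whatever standardization we settle on, and the row is assumed to already carry lat/lng.

```python
from datetime import datetime

# Target format from the proposal: YYYY-MM-DDTHH:MM:SS
DATE_FORMAT = "%Y-%m-%dT%H:%M:%S"
# Assumed format of dates in the scraped source (hypothetical)
SOURCE_DATE_FORMAT = "%m/%d/%Y %H:%M"


def to_record(raw, city, state):
    """Map one scraped row (hypothetical keys) onto the proposed fields."""
    reported = datetime.strptime(raw["reported"], SOURCE_DATE_FORMAT)
    occurred = datetime.strptime(raw["occurred"], SOURCE_DATE_FORMAT)
    return {
        # "_id" is left out so CouchDB can autogenerate it
        "AgencyID": raw["agency_id"],
        "AgencyName": raw["agency_name"],
        "CaseNumber": raw["case_number"],
        # Placeholder standardization: trim whitespace and upper-case
        "CriminalOffense": raw["offense"].strip().upper(),
        "DateReported": reported.strftime(DATE_FORMAT),
        "Description": raw["description"],
        # City/state appended to the street address
        "Location": f'{raw["address"]}, {city}, {state}',
        "OccurenceDate": occurred.strftime(DATE_FORMAT),
        "Disposition": raw["disposition"],
        "Lat": float(raw["lat"]),
        "Lng": float(raw["lng"]),
    }
```

Rows that only have an address, or coordinates in another projection, would need a geocoding or conversion step before this mapping; that part is not shown here.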

Does anybody have thoughts on whether we should keep all of the original data we scrape and add the standardized fields separately?
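
One way the "keep the original, add standardized fields" option could look is sketched below. The `Original` key and the `with_original` helper are assumptions for illustration only, not part of the proposed format.

```python
def with_original(standardized, raw):
    """Return a document holding the standardized fields plus the untouched scraped row."""
    doc = dict(standardized)      # copy so the caller's dict is not modified
    doc["Original"] = dict(raw)   # hypothetical key for the raw scraped data
    return doc


# Example with made-up values
doc = with_original(
    {"CaseNumber": "15-0001", "CriminalOffense": "THEFT"},
    {"case_number": "15-0001", "offense": "  theft "},
)
```

Nesting the raw row under a single key would keep the standardized fields at the top level for querying, at the cost of larger documents.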

@nealf nealf self-assigned this Mar 20, 2015