de-duplication #5

Open
DustinKLo wants to merge 5 commits into master from handle_duplicates

Conversation

@DustinKLo
Contributor

added logic that compares the scihub acq with the existing acq by missiondatatakeid
added a function to convert a timestamp string to a datetime obj so we can compare the scihub acq to the existing acq
copied the deprecate_document function so that we can deprecate the older acq
fixed some python formatting
added .gitignore file
added some logging to my de-duplication logic
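
For readers who have not opened the diff, here is a minimal sketch of the comparison the description refers to. It is not the code from this PR: the timestamp layouts, the helper names `to_datetime` and `split_duplicate`, and the shape of the acquisition dicts are all assumptions made for illustration.

```python
from datetime import datetime

# Assumed timestamp layouts; the real format of 'ingestiondate' is not shown here.
_FORMATS = ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ")


def to_datetime(timestamp):
    """Convert a timestamp string to a datetime so two acquisitions can be compared."""
    for fmt in _FORMATS:
        try:
            return datetime.strptime(timestamp, fmt)
        except ValueError:
            continue
    raise ValueError("unrecognized timestamp: {!r}".format(timestamp))


def split_duplicate(existing, scihub):
    """Given two acquisitions sharing a missiondatatakeid, return (newer, older).

    Each argument is a dict with 'id' and 'ingestiondate' keys (hypothetical shape).
    On a tie the existing acquisition is treated as the one to keep.
    """
    if to_datetime(scihub["ingestiondate"]) > to_datetime(existing["ingestiondate"]):
        return scihub, existing
    return existing, scihub
```

The second element of the returned pair is the candidate for deprecation, for example by passing its ID to a function such as deprecate_document.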

@DustinKLo DustinKLo requested a review from NamrataM June 4, 2019 23:09
acq_id = item['_source']['metadata'].get('id')
ingestion_date = item['_source']['metadata'].get('ingestiondate')
mission_data_id_store[missiondatatakeid] = {
    'id': acq_id,
Collaborator

Looking through the usage of the mission_data_id_store, I don't see the "id" field being utilized anywhere. @DustinKLo Could you verify that? If it's not being used then please delete it from the dict.

Contributor Author

[Screenshot: Screen Shot 2019-06-05 at 9 39 19 AM]

@NamrataM yes, we save it in the metadata field. I use it so I can remove IDs from prods_missing if scihub's acquisition is older than ours.
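
A rough sketch of how the stored 'id' could be used this way, under assumptions the thread does not confirm: prods_missing is taken to be a list of acquisition IDs, the stored ingestion dates are taken to be datetime objects, and `existing_ingestion` and `prune_prods_missing` are hypothetical names introduced only for this example.

```python
from datetime import datetime


def prune_prods_missing(prods_missing, mission_data_id_store, existing_ingestion):
    """Return prods_missing without scihub IDs that are older than our own copy.

    prods_missing: list of acquisition IDs (assumed shape).
    mission_data_id_store: missiondatatakeid -> {'id': ..., 'ingestiondate': datetime}.
    existing_ingestion: missiondatatakeid -> datetime of the acquisition we already
        hold (hypothetical mapping, not taken from the PR).
    """
    stale = set()
    for take_id, entry in mission_data_id_store.items():
        ours = existing_ingestion.get(take_id)
        # Only drop the scihub ID when we hold a strictly newer acquisition.
        if ours is not None and entry["ingestiondate"] < ours:
            stale.add(entry["id"])
    return [acq_id for acq_id in prods_missing if acq_id not in stale]
```

Without the 'id' field in the store there would be nothing to match against prods_missing in the last line, which is the dependency the reply above describes.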

Collaborator

@DustinKLo Okay great, thanks for that. I need to revisit the logic of duplicate detection then. May need to add a few more constraints.


3 participants