Data update #28

Merged
merged 3 commits into from
Aug 27, 2018

Conversation

dmil
Contributor

@dmil dmil commented Aug 27, 2018

Data update from researchers at Clemson University
(including @patrick-lee-warren)

Major Changes:

- removes accounts that were accidentally included #16
- adds alt_external_id, tweet_id, and article_url
- adds fields that follow http(s)://t.co/ links to their first redirect if they exist in a tweet
- fixes some issues about how ids are displayed
- fixes double encoding issue
- drops some duplicate observations

closes #26

dmil added 3 commits August 27, 2018 18:00
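
One of the changes listed above is a double-encoding fix. The PR does not say which kind of double encoding was involved, so the following is only a sketch of one common case: UTF-8 bytes that were mistakenly decoded as Latin-1 and then saved again. The helper name `fix_double_encoding` is hypothetical and is not taken from this repository.

```python
def fix_double_encoding(text: str) -> str:
    """Undo one round of UTF-8 text that was mis-decoded as Latin-1.

    Hypothetical helper: the actual fix used in this PR is not shown.
    """
    try:
        # Recover the original bytes, then decode them as UTF-8.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text is not double encoded in this way, so leave it alone.
        return text
```

Strings that are already correct fall through the `except` branch or round-trip unchanged, so the helper can be applied to a whole column without damaging clean rows. Data that was mis-decoded as Windows-1252 instead of Latin-1 would need `"cp1252"` in place of `"latin-1"`.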
@EvanCarroll

You say major changes, but most of those changes are already committed. We can't see the data update in the browser either. Could you tell us what you did? Have you seen my PR here, #29?

What I did is pull down the data files from @patrick-lee-warren and self-host them within PostgreSQL. I also created a script to dump them (so we don't keep committing the whole repo). I also pruned the duplicates.
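
For readers following along, duplicate pruning of this kind can be sketched as below. This is an illustration in Python, not the PostgreSQL approach described in the comment, and it assumes that rows are dictionaries and that `tweet_id` identifies an observation. Both the function name `prune_duplicates` and the choice of key are assumptions.

```python
from typing import Dict, Iterable, Iterator


def prune_duplicates(rows: Iterable[Dict[str, str]],
                     key: str = "tweet_id") -> Iterator[Dict[str, str]]:
    """Yield each row once, keeping the first occurrence of every key value.

    Hypothetical helper: the key column is an assumption for illustration.
    """
    seen = set()
    for row in rows:
        value = row[key]
        if value in seen:
            continue  # a row with this key was already emitted
        seen.add(value)
        yield row
```

Because it is a generator, it can be wrapped around `csv.DictReader` to filter a large file without loading it fully into memory; only the set of seen keys is retained.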
