compile method returning 403 and KeyError: 'tweet_id' #27

Open · camila-cg opened this issue Jun 29, 2023 · 11 comments

@camila-cg

Hi everyone!

I can't compile the MuMiN dataset. I installed the library and used my Twitter bearer token to create the dataset object, but when I try to compile it I get two error messages. The first one says something is wrong with the Twitter token, but I have already tested the same token in other situations and it works. The second one says that 'tweet_id' couldn't be found. Can you help me?
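
For reference, this is roughly what I'm running (a minimal sketch following the mumin README; the bearer token is redacted and the exact arguments may differ slightly in my notebook):

```python
from mumin import MuminDataset

# Create the dataset object with the Twitter bearer token, then compile it.
# 'small' is the smallest of the available dataset sizes.
mumin_small = MuminDataset(twitter_bearer_token='xxxxx', size='small')
mumin_small.compile()
```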

FIRST ERROR MESSAGE (I've hidden my client_id as 'xxxxx'):

```
2023-06-29 20:36:38,903 [INFO] Loading dataset
2023-06-29 20:36:45,029 [INFO] Shrinking dataset
2023-06-29 20:36:46,178 [INFO] Rehydrating tweet nodes
Rehydrating: 0%| | 0/5261 [00:00<?, ?it/s]
2023-06-29 20:36:46,475 [ERROR] [403] {"client_id":"xxxxx","detail":"When authenticating requests to the Twitter API v2 endpoints, you must use keys and tokens from a Twitter developer App that is attached to a Project. You can create a project via the developer portal.","registration_url":"https://developer.twitter.com/en/docs/projects/overview","title":"Client Forbidden","required_enrollment":"Appropriate Level of API Access","reason":"client-not-enrolled","type":"https://api.twitter.com/2/problems/client-forbidden"}
```

SECOND ERROR MESSAGE:

```
KeyError Traceback (most recent call last)
Cell In[25], line 1
----> 1 mumin_small.compile()

File ~\AppData\Roaming\Python\Python311\site-packages\mumin\dataset.py:251, in MuminDataset.compile(self, overwrite)
248 self._shrink_dataset()
250 # Rehydrate the tweets
--> 251 self._rehydrate(node_type='tweet')
252 self._rehydrate(node_type='reply')
254 # Update the IDs of the data that was there pre-hydration

File ~\AppData\Roaming\Python\Python311\site-packages\mumin\dataset.py:553, in MuminDataset._rehydrate(self, node_type)
549 self.nodes['user'] = user_df
551 # Add prehydration tweet features back to the tweets
552 self.nodes[node_type] = (self.nodes[node_type]
--> 553 .merge(prehydration_df,
554 on='tweet_id',
555 how='outer')
556 .reset_index(drop=True))
558 # Extract and store images
559 # Note: This will store self.nodes['image'], but this is only
560 # to enable extraction of URLs later on. The
561 # self.nodes['image'] will be overwritten later on.
562 if (node_type == 'tweet' and self.include_tweet_images and
563 len(source_tweet_dfs['media'])):

File ~\AppData\Roaming\Python\Python311\site-packages\pandas\core\frame.py:10090, in DataFrame.merge(self, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
10071 @Substitution("")
10072 @Appender(_merge_doc, indents=2)
10073 def merge(
(...)
10086 validate: str | None = None,
10087 ) -> DataFrame:
10088 from pandas.core.reshape.merge import merge

10090 return merge(
10091 self,
10092 right,
10093 how=how,
10094 on=on,
10095 left_on=left_on,
10096 right_on=right_on,
10097 left_index=left_index,
10098 right_index=right_index,
10099 sort=sort,
10100 suffixes=suffixes,
10101 copy=copy,
10102 indicator=indicator,
10103 validate=validate,
10104 )

File ~\AppData\Roaming\Python\Python311\site-packages\pandas\core\reshape\merge.py:110, in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
93 @Substitution("\nleft : DataFrame or named Series")
94 @Appender(_merge_doc, indents=0)
95 def merge(
(...)
108 validate: str | None = None,
109 ) -> DataFrame:
--> 110 op = _MergeOperation(
111 left,
112 right,
113 how=how,
114 on=on,
115 left_on=left_on,
116 right_on=right_on,
117 left_index=left_index,
118 right_index=right_index,
119 sort=sort,
120 suffixes=suffixes,
121 indicator=indicator,
122 validate=validate,
123 )
124 return op.get_result(copy=copy)

File ~\AppData\Roaming\Python\Python311\site-packages\pandas\core\reshape\merge.py:703, in _MergeOperation.__init__(self, left, right, how, on, left_on, right_on, axis, left_index, right_index, sort, suffixes, indicator, validate)
696 self._cross = cross_col
698 # note this function has side effects
699 (
700 self.left_join_keys,
701 self.right_join_keys,
702 self.join_names,
--> 703 ) = self._get_merge_keys()
705 # validate the merge keys dtypes. We may need to coerce
706 # to avoid incompatible dtypes
707 self._maybe_coerce_merge_keys()

File ~\AppData\Roaming\Python\Python311\site-packages\pandas\core\reshape\merge.py:1179, in _MergeOperation._get_merge_keys(self)
1175 if lk is not None:
1176 # Then we're either Hashable or a wrong-length arraylike,
1177 # the latter of which will raise
1178 lk = cast(Hashable, lk)
-> 1179 left_keys.append(left._get_label_or_level_values(lk))
1180 join_names.append(lk)
1181 else:
1182 # work-around for merge_asof(left_index=True)

File ~\AppData\Roaming\Python\Python311\site-packages\pandas\core\generic.py:1850, in NDFrame._get_label_or_level_values(self, key, axis)
1844 values = (
1845 self.axes[axis]
1846 .get_level_values(key) # type: ignore[assignment]
1847 ._values
1848 )
1849 else:
-> 1850 raise KeyError(key)
1852 # Check for duplicates
1853 if values.ndim > 1:

KeyError: 'tweet_id'
```

@saattrupdan
Member

saattrupdan commented Jul 1, 2023

Hey! It looks like there's something wrong with your Twitter API token. Can you confirm that you are entering the Bearer Token, and that it works?
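
You can sanity-check the token outside of mumin with something like this (a rough sketch hitting the Twitter API v2 single-tweet lookup endpoint; tweet ID 20 is just a well-known public tweet):

```python
import requests

bearer_token = 'xxxxx'  # your bearer token here

# Look up a single public tweet via the v2 endpoint. A 403 with reason
# 'client-not-enrolled' means the app isn't attached to a v2 project.
response = requests.get(
    'https://api.twitter.com/2/tweets/20',
    headers={'Authorization': f'Bearer {bearer_token}'},
)
print(response.status_code)
print(response.json())
```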

@camila-cg
Author

Yes, I'm using the Twitter bearer token, it should work :/

@zzoliman

zzoliman commented Oct 5, 2023

I have the exact same errors :(

@GiovanniPioDelvecchio

GiovanniPioDelvecchio commented Oct 6, 2023

Me too, same error. My bearer token works, but it's the default free one. Do we need a premium or an enterprise one?

@saattrupdan
Member

Hey all. It's probably because of Twitter/X's new policy, under which free access to their API is very limited 🙁
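
If I had to guess at the KeyError too: when the rehydration calls all fail with 403s, no tweets come back, so the hydrated tweet dataframe presumably ends up without a tweet_id column, and the subsequent merge on that column blows up. A tiny sketch of the same pandas behaviour (toy dataframes, not mumin's actual internals):

```python
import pandas as pd

# Toy reproduction: if the left dataframe has no 'tweet_id' column
# (e.g. because the API returned nothing), merging on it raises the
# same KeyError as in the traceback above.
hydrated = pd.DataFrame()  # empty: no tweets were rehydrated
prehydration = pd.DataFrame({'tweet_id': [1, 2], 'label': ['a', 'b']})
hydrated.merge(prehydration, on='tweet_id', how='outer')
# KeyError: 'tweet_id'
```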

@lindeberg25

Same error. @saattrupdan, any plans to make the MuMiN dataset available for download?

@saattrupdan
Member

> Same error. @saattrupdan, any plans to make the MuMiN dataset available for download?

Sorry to hear that; I'm guessing it's because of Twitter's new API limitations. And I'd love to share the full dataset, but I'm afraid the terms and conditions of the API prevent me from doing so. Hopefully the academic API will go back to how it was in the future.

@DanniXu98

I ran into the same error. Has this problem been solved yet?

@aliqb

aliqb commented Jul 10, 2024

Isn't there any solution yet?

@PolinaSoloveychik

Hello, the same problem for me... Does anyone know if the "Basic" Twitter API plan ($200 per month) will resolve this issue?
What could be the solution for academic research?

@xinyiwu98

> Hello, the same problem for me... Does anyone know if the "Basic" Twitter API plan ($200 per month) will resolve this issue? What could be the solution for academic research?

I tried the "Basic" Twitter API plan, which was not enough. The most expensive plan seems to be needed :'(
