JSON_to_CSV_Converter #21

Open
bngksgl opened this issue Feb 27, 2016 · 5 comments

Comments

@bngksgl

bngksgl commented Feb 27, 2016

Hi, I am trying to convert JSON to CSV using the code you have provided, but I am running into an error. When I run $ python json_to_csv_converter.py yelp_academic_dataset.json on the command line, I get the following error:

Traceback (most recent call last):
  File "json_to_csv_converter.py", line 122, in <module>
    column_names = get_superset_of_column_names_from_file(json_file)
  File "json_to_csv_converter.py", line 25, in get_superset_of_column_names_from_file
    for line in fin:
  File "C:\Users\Bengi\Appdata\Local\Programs\Python\Python35-32\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 1102: character maps to <undefined>

Can you help me please?

@russ-white

@bngksgl, or anyone else: if you're seeing JSON decode errors, the input file may not be a valid JSON file. The Yelp dataset is delivered as a compressed .tar archive and is "double-zipped". Before running the converter script you should see 5 separate JSON files (business.json, review.json, etc.). On Windows, after unzipping yelp_dataset_challenge_academic_dataset.tar, I had to add the .tar extension again to the ~2GB output file and unzip that to get the individual files. Then converting the individual JSON files worked fine. I ran into a similar error because I hadn't realized the file wasn't completely unpacked.
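
For anyone who would rather check this from Python than by renaming files, here is a minimal sketch using only the standard library. It is not part of json_to_csv_converter.py. The helper names unpack_nested_tar and looks_like_json_lines are hypothetical, and it assumes the unpacked files hold one JSON object per line (which is how the traceback above shows the converter reading them) and are UTF-8 encoded. Only run it on archives you trust, since it extracts whatever the archive contains.

```python
import json
import tarfile
from pathlib import Path


def unpack_nested_tar(archive_path, out_dir):
    """Extract archive_path and keep extracting any file that is itself a
    tar archive, even if it lost its .tar extension along the way.
    Returns the sorted paths of the files that are not tar archives."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    pending = [Path(archive_path)]
    plain_files = []
    while pending:
        current = pending.pop()
        # is_tarfile inspects the file contents, not the file name
        if tarfile.is_tarfile(current):
            target = out_dir / (current.name + "_extracted")
            with tarfile.open(current) as tar:
                tar.extractall(target)
            pending.extend(p for p in target.rglob("*") if p.is_file())
        else:
            plain_files.append(current)
    return sorted(plain_files)


def looks_like_json_lines(path, sample=5):
    """Return True if the first few non-empty lines each parse as JSON.
    Assumes UTF-8 text with one JSON object per line."""
    try:
        with open(path, encoding="utf-8") as fin:
            checked = 0
            for line in fin:
                if not line.strip():
                    continue
                json.loads(line)
                checked += 1
                if checked >= sample:
                    break
        return checked > 0
    except (UnicodeDecodeError, json.JSONDecodeError):
        return False
```

A possible way to use it: call unpack_nested_tar("yelp_dataset_challenge_academic_dataset.tar", "unpacked"), then run looks_like_json_lines on each returned path before passing that file to the converter. A file that is still an archive should come back False.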


@RashmiGautam

I faced the same issue while downloading the dataset on a Mac. It is a double-zipped file.
Thank you @russ-white for the resolution!

@ydeng11

ydeng11 commented Oct 22, 2016

Thank you @russ-white for the explanation!

@RohitJain13

Yes, thanks from me as well.

@Batuu13

Batuu13 commented Dec 8, 2016

You saved me a lot of time, man! @russ-white
