Write instructions on how to download datasets and how to install Jupyter #7

Open
rodolfo-viana opened this issue Oct 6, 2018 · 6 comments

Comments

@rodolfo-viana

What is the problem?

As this repo is designed to assist both collaborators with experience in Python and newcomers, it is advisable to have a version of Serenata's datasets that does not need Docker.

How can this be addressed?

Create a script to download, clean and translate datasets without Docker. Perhaps we could use Serenata's old way of doing it -- the version in which Docker was not required.

Who could help with this issue?

I can develop this, but I believe it is already done if we get that old version of the script.

Labels

Enhancement.

@rodolfo-viana rodolfo-viana added the enhancement New feature or request label Oct 6, 2018
@rodolfo-viana rodolfo-viana self-assigned this Oct 6, 2018
@cuducos
Collaborator

cuducos commented Oct 6, 2018

I think this would be quite simple. Usually beginners struggle to get Jupyter running (we could recommend Miniconda or Anaconda to make this easier). The part about the datasets is straightforward, though:

  1. We need a requirements.txt with serenata-toolbox>=15.1.0
  2. Then $ pip install -r requirements.txt
  3. And finally $ serenata-toolbox will download the files to the data/ directory
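
For newcomers who would rather not type the commands by hand (on Windows, for example), the three steps above could be wrapped in a small Python script. This is only a sketch: the function names `setup_commands` and `main` are hypothetical, and it assumes that the `serenata-toolbox` command is on the PATH after installation and downloads into `data/`, as described in the steps above.

```python
import subprocess
import sys
from pathlib import Path

# Version pin taken from step 1 above.
REQUIREMENT = "serenata-toolbox>=15.1.0"


def setup_commands(project_dir="."):
    """Write requirements.txt if it is missing and return the commands to run."""
    project = Path(project_dir).resolve()
    requirements = project / "requirements.txt"
    if not requirements.exists():
        requirements.write_text(REQUIREMENT + "\n", encoding="utf-8")
    return [
        # Step 2: install with the same interpreter that runs this script.
        [sys.executable, "-m", "pip", "install", "-r", str(requirements)],
        # Step 3: assumed CLI entry point, as named in the steps above.
        ["serenata-toolbox"],
    ]


def main(project_dir=".", dry_run=True):
    """Print each command and, unless dry_run is set, execute it in project_dir."""
    project = Path(project_dir).resolve()
    for command in setup_commands(project):
        print("$", " ".join(command))
        if not dry_run:
            # check=True stops at the first command that fails.
            subprocess.run(command, check=True, cwd=project)
```

Calling `main()` from the repository root only prints the commands; `main(dry_run=False)` actually runs them. Using `sys.executable -m pip` rather than a bare `pip` keeps the install tied to the interpreter that runs the script, which avoids a common source of confusion when several Python versions are installed.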

@rodolfo-viana
Author

You're right, @cuducos. The part about datasets is straightforward. I will just write a .md file to explain to newcomers how it works.

About Anaconda or Miniconda: it is easier, for sure, but it comes with a lot of libraries we do not need. And there are some issues when using conda and then trying to upgrade Jupyter and other libraries with pip, for instance. (In my own experience, I found it troublesome.) For educational purposes, I believe the best thing is to write instructions on how to install Python, Jupyter and the libraries instead of taking the Anaconda shortcut.

I can write these instructions. :)

@rodolfo-viana rodolfo-viana changed the title Download datasets without Docker Write instructions on how to download datasets and how to install Jupyter Oct 6, 2018
@rodolfo-viana
Author

I wrote this: #8
But it is not complete, I guess. I mean, I go through the process from the start, but do not go further. That's because I believe we should keep in mind that we will probably get a lot of newcomers from different backgrounds.
So this is the first one, to get things running. Then I will write instructions on how to use git and share work with us. And then, how to download fresh datasets.
Is that OK with you guys? Suggestions are welcome. :)

@jtemporal
Collaborator

We could use this setup script as a basis for doing the downloads.

@rodolfo-viana
Author

I guess we could, especially for the latest files on the SdA server. It would help a lot, by the way.
But to extract up-to-date files from Câmara and other servers, I believe I must rewrite some lines of the original code. A couple of days ago I tried some of the serenata-toolbox scripts and ran into some issues running them on Windows. So I think I will add some try/except lines to the code to get it running smoothly on Windows and upload the new version here.
What do you guys think?

@cuducos
Collaborator

cuducos commented Oct 24, 2018

I think the way to handle it is to report these errors as issues in the serenata-toolbox repo, and then we work on them over there ;)
