DavisPL-Teaching/119-hw1

Homework 1: Data Processing and Performance

Due date: October 25, 2024 (11:59pm)

This homework has three parts.

  • The first part (released Monday, Oct 14) reviews some of the basics of data processing in the context of Pandas (input, cleaning, validation, and manipulation of DataFrames).

  • The second part (released Wednesday, Oct 16) is about measuring and comparing performance, and explores several design choices that can affect it.

  • Finally, the third part (released Friday, Oct 18) is a shorter part about programming in the shell, and its relevance to data processing.
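As a rough illustration of what "measuring and comparing performance" can mean in practice, the sketch below times two ways of computing the same sum using only the Python standard library. It is not taken from the assignment: the helper `average_runtime` and the two example functions are hypothetical names chosen for illustration, and Part 2 may ask for a different measurement approach.

```python
import time


def average_runtime(func, runs=5):
    """Call func() `runs` times and return the mean wall-clock seconds per call.

    Hypothetical helper for illustration; not part of the assignment code.
    """
    start = time.perf_counter()
    for _ in range(runs):
        func()
    return (time.perf_counter() - start) / runs


def sum_with_loop(n=100_000):
    """Add up 0..n-1 with an explicit Python loop."""
    total = 0
    for i in range(n):
        total += i
    return total


def sum_with_builtin(n=100_000):
    """Add up 0..n-1 with the built-in sum()."""
    return sum(range(n))


if __name__ == "__main__":
    # Both candidates compute the same value; only the running time differs.
    for candidate in (sum_with_loop, sum_with_builtin):
        seconds = average_runtime(candidate)
        print(f"{candidate.__name__}: {seconds:.6f} s per call")
```

Averaging over several runs smooths out some noise, but the numbers will still vary between machines and between runs, so compare the two results relative to each other instead of reading either one as an absolute figure.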

Getting Started

Clone this repository to your own machine (or open up a Codespace), then open up and complete part1.py. Parts 2 and 3 will be made available throughout the week.

If you get stuck, please ask a question on Piazza!

Submitting Your Code

We will use Gradescope to submit the code for this assignment. Instructions will be posted on Piazza. You can submit code on Gradescope in one of two ways: via a private GitHub repository that contains your work, or by uploading a .zip file.

Grading Notes

To receive credit for your work, please follow these guidelines.

  • Make sure that python3 part1.py, python3 part2.py, and python3 part3.py run successfully with no errors, and the same for pytest part1.py, pytest part2.py, and pytest part3.py. We cannot give credit to code that doesn't run!

  • Make sure that your output/part1-answers.txt, output/part2-answers.txt, and output/part3-answers.txt files are generated and up-to-date. These files should contain the answers to the questions in the assignment. Additionally, make sure your code is generating all plots in the output/ folder. Include all of these in your final code that you commit/upload.

  • If you are using GitHub to submit, make sure that you git commit and git push your latest code to your personal repository.

  • Don't rename any functions or methods or change the function signatures unless asked to do so.

  • As discussed in the syllabus, a small number of points on each homework (at most 10% of the grade) are reserved for style. This also covers whether your free response answers are present and thoughtful. Here are some things to consider: are your variable names chosen appropriately? Have you added comments with # or docstrings with """ where appropriate? Have you removed any obsolete, unused code blocks, functions, or variables?
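Before submitting, a quick self-check along these lines can help confirm that the answer files listed above exist and are not empty. This is only a sketch: the function name `missing_answer_files` is hypothetical and not part of the assignment, and the check assumes it is run from the repository root. It does not replace running python3 and pytest on each part, and it does not check the plots.

```python
from pathlib import Path

# File names taken from the grading notes above.
EXPECTED_ANSWER_FILES = [
    "output/part1-answers.txt",
    "output/part2-answers.txt",
    "output/part3-answers.txt",
]


def missing_answer_files(repo_root="."):
    """Return the expected answer files that are missing or empty.

    Hypothetical helper for illustration; `repo_root` is assumed to be
    the top-level folder of your copy of the repository.
    """
    root = Path(repo_root)
    missing = []
    for relative_path in EXPECTED_ANSWER_FILES:
        path = root / relative_path
        # Treat an empty file the same as a missing one.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(relative_path)
    return missing


if __name__ == "__main__":
    problems = missing_answer_files()
    if problems:
        print("Missing or empty:", ", ".join(problems))
    else:
        print("All answer files are present.")
```

The docstring and the short # comment also show the kind of documentation the style guideline refers to.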

Credits

Many thanks to Hassnain (the TA) and the data science course at LUMS (CS 334 taught by Dr. Mobin Javed) for the data and some of the exercises that were used in Part 1 of this homework assignment.
