  • This repo contains the Linux/Unix command-line code I used to clean, mine, analyze, and visualize the real New York City taxi dataset for August 2019.
  • cmds.log contains the main commands used for data cleaning and mining
  • plotcmnds.log contains the commands run in Gnuplot
  • a3.txt contains the top 10 pickup locations and the top 10 pickup and dropoff pair locations that yielded the highest average "total amount" in August 2019
  • a3t3.svg contains the correlation chart between average tip amounts and passenger counts
  • a3t4.svg contains the correlation chart between trip distances (miles) and average total earnings amounts ($)
  • awk.scr, awk.task2.scr, and awk.task3.scr are the awk scripts I wrote to compute average total earnings and tip amounts
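
As a rough illustration of the kind of per-location averaging described above, here is a minimal sketch. It is not the contents of awk.scr. The file name sample_trips.csv, the function name avg_by_pickup, the three-column layout, and the sample rows are all assumptions made up for this example; the real dataset has more columns in a different order.

```shell
# Hypothetical sample data: pickup ID, dropoff ID, total amount.
# The real August 2019 file has a different column layout.
cat > sample_trips.csv <<'EOF'
PULocationID,DOLocationID,total_amount
132,230,60.00
132,230,70.00
161,237,12.50
161,237,7.50
132,161,55.00
EOF

# Print the average total_amount per pickup location,
# highest average first, limited to the top 10.
avg_by_pickup() {
  LC_ALL=C awk -F',' '
    NR > 1 { sum[$1] += $3; n[$1]++ }   # skip the header, accumulate per pickup ID
    END    { for (k in sum) printf "%s %.2f\n", k, sum[k] / n[k] }
  ' "$1" | LC_ALL=C sort -k2,2nr | head -n 10
}

avg_by_pickup sample_trips.csv
```

Under these assumptions, keying the arrays on `$1 "," $2` instead of `$1` would give the same average per pickup and dropoff pair.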