Data Preprocess

overview

  • This repo was created to share resources and post notifications from our professor.
  • This repo will be updated weekly according to our progress.

Week 1

task

  • Clean multiple CSV files you have found online and combine them into one spreadsheet.
    • When you concatenate the instances from different CSV files, you need to unify all possible features/attributes the way you described.
  • Create scatter plots across multiple attributes for visualization.
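
The task above can be sketched roughly as follows. This is only a minimal illustration, assuming pandas (and matplotlib for the plot) are installed. The column names, the `COLUMN_ALIASES` mapping, the helper functions, and the two tiny in-memory samples are all hypothetical and made up for the example; replace them with the attributes of the CSV files you actually found.

```python
import io

import pandas as pd

# Hypothetical mapping from source-specific column names to one shared schema.
# Adjust it to the attribute names that appear in your own files.
COLUMN_ALIASES = {
    "Country/Region": "country",
    "Country": "country",
    "ObservationDate": "date",
    "Date": "date",
    "Confirmed": "confirmed",
    "Cases": "confirmed",
    "Deaths": "deaths",
}


def load_and_unify(source):
    """Read one CSV (path or file-like object) and rename its columns."""
    df = pd.read_csv(source)
    df.columns = [name.strip() for name in df.columns]
    return df.rename(columns=COLUMN_ALIASES)


def merge_sources(sources):
    """Concatenate several CSVs into one frame with a unified set of columns."""
    frames = [load_and_unify(source) for source in sources]
    # Columns missing from a file are filled with NaN for that file's rows.
    merged = pd.concat(frames, ignore_index=True, sort=False)
    merged["date"] = pd.to_datetime(merged["date"], errors="coerce")
    # Basic cleansing: drop rows without a key field, then exact duplicates.
    merged = merged.dropna(subset=["country", "date"]).drop_duplicates()
    return merged.reset_index(drop=True)


def plot_scatter(df, x, y, out_path="scatter.png"):
    """Save a scatter plot of two numeric attributes to an image file."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    ax = df.plot.scatter(x=x, y=y)
    ax.figure.savefig(out_path)
    plt.close(ax.figure)


# Made-up sample data standing in for two downloaded CSV files.
SAMPLE_A = "Country/Region,ObservationDate,Confirmed,Deaths\nA,2020-01-01,10,1\nB,2020-01-01,20,2\n"
SAMPLE_B = "Country,Date,Cases\nA,2020-01-02,15\nC,2020-01-02,5\n"

merged = merge_sources([io.StringIO(SAMPLE_A), io.StringIO(SAMPLE_B)])
print(merged)

# Write the merged spreadsheet and a plot (uncomment to produce files):
# merged.to_csv("merged.csv", index=False)
# plot_scatter(merged, "confirmed", "deaths")
```

Renaming the columns before concatenating lets `pd.concat` line up matching attributes. Any attribute that exists in only one file simply stays empty for the rows from the other files.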

resource

  • This is the link to the COVID-19 data.
  • The data has also been added to this repo.
  • You may find some ideas in Read CSV in Python and Data frame.
  • I wrote a demo for reading the data, and it seems to work.
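
For orientation, reading a CSV and checking it quickly might look like the sketch below. It is not the demo from this repo, just a minimal illustration assuming pandas is installed. The `summarize_csv` helper and the in-memory sample are hypothetical and made up for the example.

```python
import io

import pandas as pd


def summarize_csv(source):
    """Read a CSV (path or file-like object) and report its basic shape."""
    df = pd.read_csv(source)
    return {
        "rows": len(df),
        "columns": list(df.columns),
        # Number of empty cells per column, useful before cleansing.
        "missing": df.isna().sum().to_dict(),
    }


# Made-up sample; pass a real file path instead, e.g. summarize_csv("data.csv").
sample = io.StringIO("region,date,cases\nA,2020-01-01,10\nB,2020-01-01,\n")
summary = summarize_csv(sample)
print(summary)
```

Checking the missing-value counts first gives a rough idea of how much cleansing each file needs.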

note

  • The next meeting will be on Monday at 12 noon in H133.
  • Have a great weekend :)