If a notebook fails to load on GitHub, paste its URL into https://nbviewer.jupyter.org/ to view it there.
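I just finished my 100-day challenge on machine learning basics and feel that I need to review the material and get hands-on practice cleaning data. This challenge will (hopefully) also include some real-world data scraping and cleaning.
Having just (barely) made it through the second Machine Learning 100-Day Marathon, I want to use the IT邦幫忙 30-day Ironman contest to review the data cleaning topics in depth, and to scrape data I find interesting to practice on.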
Day01 Jupyter Notebook: basic installation and usage
Day02 What is EDA (Exploratory Data Analysis)?
Day03 Pandas DataFrame: basic data types, label encoding, and one-hot encoding
Day04 Outliers and some NumPy operations
Day05 Pandas skills: reading files in different formats
Day06 Pandas skills: data wrangling
Day07 Pandas skills: the Pandas cheat sheet, translated into Chinese
Day08 Basic data visualization with Pandas 1/2
Day09 Basic data visualization with Pandas 2/2
Day10 Data visualization tools: Matplotlib
Day11 Data visualization tools: Plotly
Day12 Data visualization tools: Seaborn
Day13 Converting continuous variables into discrete values
Day14 Feature engineering, kurtosis, and skewness
Day15 Numerical data 1/2: replacing N/A values and outliers
Day16 Numerical data 2/2: reducing skewness
Day17 Categorical data 1/2: mean encoding
Day18 Categorical data 2/2: count encoding and feature hashing
Day19 Time series features
Day20 Airbnb in Berlin 1/5: booking rate
Day21 Airbnb in Berlin 2/5: listings overview
Day22 Airbnb in Berlin 3/5: the ring zone
Day23 Airbnb in Berlin 4/5: ring zone listings analysis
Day24 Airbnb in Berlin 5/5: ring zone listings analysis summary
Day25 Beautiful Soup tryout: Stepstone job postings
Day27 BS4: scraping from YouTube 1/2
Day28 BS4: scraping from YouTube 2/2
Day29 Scraping from IMDb with Selenium 1/2
Day30 Scraping from IMDb with Selenium 2/2