NYU Shanghai CSCI-SHU 360 Machine Learning project by Alan, Ella, and Steven.
- Block.csv: Pre-training dataset that merges Dianping, POI, and housing price data at town level.
- 拟合结果带街道.csv (fitting results with towns): Towns clustered with K-means and Gaussian Mixture, used for visualizing the results.
- 上海街道 (Shanghai towns): Raw shapefiles of the town-level Shanghai map.
- 采集结果 (scraping results): Raw data scraped from Dianping.
- housing_f_t.xlsx: Raw housing price data linked to town level and fishnet level.
- POI_fishnet_steven.xlsx: POI values, calculated for each fishnet cell.
- poi_fishnet.xlsx: Similar to POI_fishnet_steven.xlsx.
- fishnet4900_shanghai.shp: Shapefile of fishnet.
- POI_Fishnet_for_training.xls: Early clustering training dataset.
- fishnet_centroid.xlsx: Fishnet centroid positions.
- fishnet_density.csv: Training attributes (Dianping, POI, housing) mapped to each fishnet cell.
- poi_pure.csv: Pure POI coordinates.
- poi.csv: POI coordinates and attribute values.
- housing_kriging.xlsx: Housing data per fishnet cell, with Kriging interpolation.
- normalized_fishnet_town.csv: Early attempt at normalizing fishnet density.
- norm_f_town_district.csv: Normalized data.
- POI_Fishnet_for_training.csv: Training dataset with 3 feature columns (POI_normalize, commercial_price, housing_price), plus an index and a ground-truth label.
k_index | POI_normalize | commercial_price | Ground_truth_label | housing_price |
---|---|---|---|---|
1846 | -0.456332229 | -0.232740344 | 2 | 2.2294739887415953 |
1847 | -0.446541948 | -0.03508537 | 2 | 1.1101016907979073 |
1848 | -0.285135448 | -0.020974646 | 2 | 1.157380772499592 |
1849 | 0.51669039 | -0.092973206 | 2 | 0.02686459639227951 |
1850 | 0.782230116 | -0.171101576 | 2 | 0.02686459639227951 |
1851 | -0.181002222 | 0.274272878 | 2 | 0.02686459639227951 |
1852 | -0.25389548 | 0.494809612 | 2 | 1.157380772499592 |
1853 | 0.084537503 | 0 | 2 | 1.8267865167539685 |
1854 | 0.027264229 | 0.50115112 | 2 | 0.16032218247970148 |
1855 | -0.358028706 | -0.225356818 | 2 | 0.16032218247970148 |
1856 | -0.37364869 | -0.567579648 | 2 | 0.4922316729195175 |
1857 | -0.045629029 | 0 | 2 | 1.2664371518217568 |
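As a rough illustration of the clustering step, the sketch below runs a dependency-free K-means (Lloyd's algorithm) on the three feature columns of the first six sample rows above. The function name `kmeans`, the choice of `k=2`, and the seed are assumptions made for this example only; the notebooks in this repo may use a library implementation and different settings, and Gaussian Mixture is not covered here.

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = []
    for _ in range(iters):
        # Assignment step: index of the nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep the old centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels

# (POI_normalize, commercial_price, housing_price) for k_index 1846..1851,
# copied from the sample table above.
features = [
    (-0.456332229, -0.232740344, 2.2294739887415953),
    (-0.446541948, -0.03508537, 1.1101016907979073),
    (-0.285135448, -0.020974646, 1.157380772499592),
    (0.51669039, -0.092973206, 0.02686459639227951),
    (0.782230116, -0.171101576, 0.02686459639227951),
    (-0.181002222, 0.274272878, 0.02686459639227951),
]

# k=2 is an arbitrary choice for illustration, not the project's setting.
centroids, labels = kmeans(features, k=2)
print(labels)
```

The result depends on the random initialization, so a different seed can produce a different partition of the same rows.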
- POI data, without normalization
- POI data, normalized at district level
- POI data, normalized at town level
- Housing data, without normalization
- Housing data, normalized at district level
- Housing data, normalized at town level
- dianping-scrapper.py: Dianping scraping code.
- alt.py: Same as dianping-scrapper.py.
- cookie.txt: Cookie generated during scraping.
- K_Means_街道.ipynb: Training code for town-level clustering.
- poi_dp_house_pop_fish.ipynb: Training code for fishnet-level clustering.
- poi_density_housing_normalize.ipynb: Data processing code for normalizing fishnet data.
- Density_POI.ipynb: Pure POI data visualization.
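
The town-level and district-level normalization mentioned in the list above could look roughly like the sketch below, which standardizes a value within each group. This assumes a z-score scheme; the notebook may use a different one (for example min-max scaling). The function name `normalize_by_group`, the column names `town` and `poi`, and the toy rows are all hypothetical and are not taken from the project data.

```python
from collections import defaultdict
from statistics import mean, pstdev

def normalize_by_group(rows, group_key, value_key):
    """Return copies of rows with a z-score of value_key computed within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    # Per-group mean and population standard deviation.
    stats = {g: (mean(vals), pstdev(vals)) for g, vals in groups.items()}
    out = []
    for row in rows:
        mu, sigma = stats[row[group_key]]
        # A group with no spread gets 0.0 instead of a division by zero.
        z = (row[value_key] - mu) / sigma if sigma > 0 else 0.0
        out.append({**row, value_key + "_norm": z})
    return out

# Hypothetical toy rows: two fishnet cells in each of two towns.
rows = [
    {"town": "A", "poi": 10.0},
    {"town": "A", "poi": 20.0},
    {"town": "B", "poi": 5.0},
    {"town": "B", "poi": 5.0},
]
normalized = normalize_by_group(rows, group_key="town", value_key="poi")
for row in normalized:
    print(row)
```

Passing a district column as `group_key` instead of a town column would give the district-level variant under the same assumption.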