IDAO CONTEST 2020

Jupyter notebook file descriptions:

  1. 2_IDAO2020.ipynb
    • Models :
      • Neural network with a 3 x 1 output vector
    • Note :
      • sat_id is dropped
      • x_sim, y_sim, z_sim are used to predict x, y, z together as a vector
    • Result
      • Training loss is very large, around 45841290.3181
  2. 2a_IDAO2020.ipynb
    • Models :
      • Same as before but without normalization
    • Result
      • Training loss is very large, around 45972115.7892
  3. 3_IDAO2020.ipynb
    • Models :
      • Linear regression from sklearn
      • Labels are predicted one at a time (x_sim, y_sim, z_sim are used to predict x, y, z one by one)
    • Note :
      • sat_id is not used as a predictor
    • Result
      • The prediction is very close to the simulation (see graph)
      • SMAPE output (the SMAPE from prediction to real should be smaller than from simulation to real):
        [figure: SMAPE output]
  4. 3a_IDAO2020.ipynb
    • Models
      • Same as above but without normalization
    • Result:
      • Same SMAPE value as above
        [figure: SMAPE output]
  5. 4a_IDAO2020.ipynb
    • Models
      • Linear regression, using x_sim to predict x, y_sim to predict y, and z_sim to predict z
    • Note
      • sat_id is not used as a predictor
    • Result
      • SMAPE from prediction to real is slightly smaller than before
        [figure: SMAPE output]
  6. 5_IDAO2020.ipynb
    • Models
      • Neural network with additional layers and more nodes in each layer
      • Predicts x, y, z as a 1x3 vector using x_sim, y_sim, z_sim
    • Note
      • sat_id is not used as a predictor
    • Result
      • Result is similar to the one above, but much better than the first strategy
        [figure: result]
  7. 5a_IDAO2020.ipynb
    • Models
      • Neural network that predicts each label from its simulated value and sat_id (x from sat_id and x_sim, y from sat_id and y_sim, and likewise for z)
      • Predicts x, y, z as a 1x3 vector using x_sim, y_sim, z_sim
    • Result
      • Result is better than the previous strategy
        [figure: result]
  8. 6_IDAO2020.ipynb
    • Models
      • XGBRegressor as the regressor
      • x_sim, y_sim, z_sim are used to predict x, y, z one by one
    • Note:
      • sat_id is not used as a predictor
    • Result
      • Result is quite bad compared to linear regression
        [figure: result]
  9. 7_IDAO2020..ipynb
    • Models :
      • Neural network that predicts each label separately
      • Predictors for each label are chosen by their correlation with the target (keeping the ones that stay highly correlated even when the data is sampled), which gives this dictionary:
       {'x':['x_sim', 'z_sim', 'Vy_sim'],
        'y': ['x_sim', 'y_sim', 'Vz_sim'],
        'z': ['sat_id', 'x_sim', 'z_sim', 'Vy_sim', 'Vz_sim'],
        'Vx': ['y_sim', 'Vx_sim', 'Vz_sim'],
        'Vy': ['z_sim', 'Vx_sim', 'Vy_sim', 'Vz_sim'],
        'Vz': ['y_sim', 'z_sim', 'Vy_sim', 'Vz_sim']}
      This means, for example, that x is predicted from x_sim, z_sim, and Vy_sim. The correlation values can be found in cek_correlation.ipynb.
  10. cek_correlation.ipynb
    • Desc:
      • Finds the correlation between targets and predictors. To make sure the correlation holds under different conditions, the dataframe was also sampled and the correlation rechecked.
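
The correlation-based choice of predictors described for 7_IDAO2020 and cek_correlation.ipynb could be sketched roughly as below. This is only an illustration, not the notebook's code: the helper name `select_predictors`, the 0.5 threshold, and the synthetic data are all assumptions made for the example, and only two of the columns are imitated.

```python
import numpy as np

def select_predictors(features, targets, threshold=0.5):
    """Map each target name to the feature names whose absolute Pearson
    correlation with that target is at least `threshold`.
    (Hypothetical helper; the threshold value is an assumption.)"""
    selected = {}
    for t_name, t_values in targets.items():
        chosen = []
        for f_name, f_values in features.items():
            r = np.corrcoef(f_values, t_values)[0, 1]
            if abs(r) >= threshold:
                chosen.append(f_name)
        selected[t_name] = chosen
    return selected

# Synthetic stand-in data, NOT the contest data.
rng = np.random.default_rng(0)
n = 1000
x_sim = rng.normal(size=n)
y_sim = rng.normal(size=n)
features = {"x_sim": x_sim, "y_sim": y_sim}
targets = {
    "x": x_sim + 0.1 * rng.normal(size=n),  # x follows x_sim closely
    "y": y_sim + 0.1 * rng.normal(size=n),  # y follows y_sim closely
}

predictors = select_predictors(features, targets)
print(predictors)

# Repeat on a random subsample to check that the selection is stable.
idx = rng.choice(n, size=200, replace=False)
sub_predictors = select_predictors(
    {k: v[idx] for k, v in features.items()},
    {k: v[idx] for k, v in targets.items()},
)
print(sub_predictors)
```

If the full data and the subsample give the same dictionary, the selection is at least not an artifact of one particular slice of rows.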

Note

  • RandomForestRegressor and SVR from sklearn were also tried, but the results were very poor
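
The notebook results are compared through SMAPE, where the SMAPE from prediction to real should come out smaller than the SMAPE from simulation to real. A minimal sketch of that check follows, under several assumptions: the SMAPE formula is one common variant (the contest's exact definition may differ by a constant factor), the data is synthetic, and `np.polyfit` is swapped in for sklearn's linear regression so the example depends on NumPy only.

```python
import numpy as np

def smape(real, pred):
    """Mean of 2*|pred - real| / (|real| + |pred|).
    Terms where both values are zero count as 0. (Assumed variant.)"""
    real = np.asarray(real, dtype=float)
    pred = np.asarray(pred, dtype=float)
    denom = np.abs(real) + np.abs(pred)
    ratio = np.divide(2.0 * np.abs(pred - real), denom,
                      out=np.zeros_like(denom), where=denom != 0)
    return float(ratio.mean())

# Synthetic stand-in data, NOT the contest data: the "simulation"
# has a systematic scale/offset error plus a little noise.
rng = np.random.default_rng(42)
n = 500
x_real = rng.uniform(1000.0, 2000.0, size=n)
x_sim = 1.05 * x_real + 30.0 + rng.normal(scale=5.0, size=n)

# Fit x ~ a * x_sim + b on the first half, evaluate on the second half.
train, test = slice(0, n // 2), slice(n // 2, n)
a, b = np.polyfit(x_sim[train], x_real[train], deg=1)
x_pred = a * x_sim[test] + b

smape_sim = smape(x_real[test], x_sim[test])   # simulation vs real
smape_pred = smape(x_real[test], x_pred)       # prediction vs real
print(f"simulation vs real: {smape_sim:.4f}")
print(f"prediction vs real: {smape_pred:.4f}")
```

On this toy data the fitted line removes the systematic error, so the prediction-to-real SMAPE drops below the simulation-to-real SMAPE, which is the relationship the results above look for.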

TO DO :

  • Add details for each Jupyter file to the description
  • Tuning parameters
    • Tuning the number of epochs:
    • For 50 epochs: 21% improvement (0.184 vs 0.143)
    • Try 100 epochs and add callbacks, then make it final :)
    • Reach the 0.15 target if possible (I got you)
  • Create submission
    • Score for tes.csv
    • Save the model
    • Do Track 2