DeepPVClassification

Convolutional neural networks for binary classification of solar panels in aerial images. Data from Kasmi et al. (2022).


Applied Statistics Project: Deep Learning for Individual Solar Installation Detection from Aerial Images

Second-year project at ENSAE in partnership with Réseau de Transport d'Électricité (RTE). Grade: 18 out of 20.

Key Idea

Estimating diffuse electricity production is increasingly important for forecasting the electricity production and consumption of French households. More and more households install solar panels without these installations being exhaustively recorded in administrative data [1], which makes it harder to predict national electricity consumption accurately and thus to optimize electricity distribution (RTE forecasts). Here we use two convolutional neural network models (LeNet5 with TensorFlow and ResNet18 [2] with PyTorch) to classify aerial images extracted from Google Earth Engine and labeled by Kasmi et al. (2022) [1].

Organization of the repository

  • The Stats_desc.ipynb file contains descriptive statistics of the Kasmi et al. dataset.
  • In the LeNET5.ipynb file, we tune the LeNet5 CNN to best predict the presence of solar panels in images.
  • In the ResNet18-pre_trained.ipynb file, we use a pre-trained ResNet18 architecture. The entire CNN is re-trained on the BDPV database.
  • In the ResNet18-transfer learning.ipynb file, only the fully-connected layers of the ResNet18 are re-trained on the BDPV database; all other layers are frozen.
  • In the ROC_PR_curve_ResNet18.ipynb file, we display the ROC and precision-recall curves of the two ResNet18 CNNs for comparison.
  • The delete_NA_img.ipynb file contains a function that deletes Google images associated with a "gray mask", the image equivalent of a missing value in numerical data.
  • The src folder contains some of the Python modules imported into the various notebooks: the LeNet5 architecture, the dataloader, the code for creating the confusion matrix, etc.
  • The weights folder contains the weights of our models.
  • rapport.pdf is the written project report.
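
As a reminder of what the ROC and precision-recall curves measure, here is a small pure-Python sketch that computes their points by sweeping the decision threshold over predicted scores. It is independent of whatever metrics or plotting library the notebook actually uses; the function names `roc_and_pr_points` and `auc` and the toy labels and scores are made up for illustration.

```python
def roc_and_pr_points(y_true, y_score):
    """Return (roc, pr) point lists obtained by sweeping the threshold.

    y_true holds 0/1 labels (1 = panel present) and must contain both classes.
    roc holds (false positive rate, true positive rate) pairs,
    pr holds (recall, precision) pairs.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    # Rank images from most to least confident "panel" prediction.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])

    roc, pr = [(0.0, 0.0)], []
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only after the last item of a group of tied scores.
        if i + 1 < len(ranked) and ranked[i + 1][0] == score:
            continue
        roc.append((fp / n_neg, tp / n_pos))
        pr.append((tp / n_pos, tp / (tp + fp)))
    return roc, pr


def auc(points):
    """Area under a curve given as (x, y) points sorted by x (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    # Toy data, not results from the project.
    labels = [1, 1, 0, 1, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
    roc, pr = roc_and_pr_points(labels, scores)
    print("ROC AUC:", auc(roc))
```

Comparing two classifiers then amounts to computing these points for each model on the same test set and overlaying the curves.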

Nota bene: The file ResNet18-pre_trained.ipynb should be executed on the INSEE cloud with the Jupyter-pytorch-gpu service and more than 15GB of persistence in the advanced service settings in order to download and unzip the dataset from Kasmi et al. (2022). You should also adjust the number of GPUs in the script based on the chosen machine configuration. Here is an example of a working session.

References

[1] A crowdsourced dataset of aerial images with annotated solar photovoltaic arrays and installation metadata, Kasmi et al., 2023
[2] Deep Residual Learning for Image Recognition, He et al., 2015