
Primary language: Python. License: MIT.

HorizonMachineLearning

Machine-learning on horizon-guided attributes for reservoir property prediction

General Concept

  • Seismic attributes are extracted along interpreted horizons:
    • A window (e.g. 20 ms) is used to extract the various attributes.
    • These attributes are then exported to an x, y, z flat file, e.g. 10 attributes per horizon.
  • Well petrophysical data, e.g. porosity, permeability, or net-to-gross:
    • These are upscaled to the equivalent of the 20 ms window and supplied as a CSV file.
  • A dataframe (a table) with all the attributes is generated:
    • Each attribute occupies a column, while the rows represent the individual locations, i.e. trace locations.
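The table-building step above can be sketched with pandas. The attribute names, coordinates, and values below are made up for illustration:

```python
import pandas as pd

# Hypothetical x, y, z flat files, one per extracted attribute:
# each row is a trace location (x, y) with the attribute value (z).
amplitude = pd.DataFrame({"x": [100, 200, 300], "y": [50, 50, 50],
                          "amplitude": [0.8, 1.1, 0.9]})
frequency = pd.DataFrame({"x": [100, 200, 300], "y": [50, 50, 50],
                          "frequency": [25.0, 30.0, 27.5]})

# Merge on trace location: one row per location, one column per attribute
table = amplitude.merge(frequency, on=["x", "y"])
print(list(table.columns))  # ['x', 'y', 'amplitude', 'frequency']
```

With 10 attributes the same merge is simply repeated (or done with a reduce over the list of files), giving one wide table ready for analysis.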

Workflow I - Data Munging

  • Merge the individual horizon files into one file
  • Scale the horizon data
  • Create a well file with all the attributes back-interpolated at the well locations,
    with the last column being the petrophysical attribute, e.g. permeability
  • The data are then ready for Machine Learning
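The scaling and back-interpolation steps can be sketched as follows, assuming scikit-learn's StandardScaler and SciPy's griddata (the coordinates, attribute values, and well location are placeholders):

```python
import numpy as np
from scipy.interpolate import griddata
from sklearn.preprocessing import StandardScaler

# Hypothetical horizon attribute sampled at four trace locations (x, y)
xy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
attr = np.array([10.0, 20.0, 30.0, 40.0])

# Scale the horizon data (zero mean, unit variance)
attr_scaled = StandardScaler().fit_transform(attr.reshape(-1, 1)).ravel()

# Back-interpolate the scaled attribute at a well location
wells = np.array([[0.5, 0.5]])
at_well = griddata(xy, attr_scaled, wells, method="linear")
print(at_well)  # interpolated value at the well
```

In the real workflow this is repeated for every attribute column, and the interpolated values are written out together with the upscaled petrophysical target.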

Workflow II - Data Analysis

  • Check data distributions and statistical ranges
  • Check for linearity among the various predictors and between the predictors and the target
    • Generate a matrix scatter plot
  • Check for feature importance using RFE (Recursive Feature Elimination)
  • Check for feature contribution using PCA (Principal Component Analysis)
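These checks can be sketched with scikit-learn; the synthetic dataset and attribute names below are stand-ins for the real attribute table:

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the attribute table: 10 predictors, 1 target
X, y = make_regression(n_samples=100, n_features=10, n_informative=4,
                       random_state=0)
df = pd.DataFrame(X, columns=[f"attr_{i}" for i in range(10)])

# Matrix scatter plot (uncomment when running interactively with matplotlib):
# pd.plotting.scatter_matrix(df)

# RFE: rank features by recursively eliminating the weakest
rfe = RFE(LinearRegression(), n_features_to_select=4).fit(X, y)
print("selected features:", rfe.support_)

# PCA: variance explained by each principal component
pca = PCA().fit(X)
print("explained variance ratios:", pca.explained_variance_ratio_)
```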

Workflow III - Model Fitting and Prediction

swattriblist.py offers many models to fit to your data; they are all based on the scikit-learn package.
CatBoost is installed and used instead of XGBoost.

Clustering

  • KMeans is first run over a range of cluster counts to identify the optimum number of clusters
  • Once the optimum number of clusters is found, KMeans is applied to the predictors
  • The resulting cluster labels are then one-hot encoded and added as predictors for further model fitting
  • t-SNE (t-distributed Stochastic Neighbor Embedding): attempts to project all of your attributes onto 2 components
  • UMAP (Uniform Manifold Approximation and Projection): a powerful dimensionality-reduction technique that projects the data
    onto 2 or 3 components
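The KMeans steps can be sketched with scikit-learn; the toy blobs and the elbow choice of k = 3 below are assumptions for illustration:

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Toy stand-in for the predictor table
X, _ = make_blobs(n_samples=200, centers=3, random_state=0)

# Step 1: scan candidate cluster counts and record inertia;
# the "elbow" in this curve suggests the optimum k
inertias = {k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
            for k in range(2, 7)}

# Step 2: fit KMeans with the chosen k, then one-hot encode the labels
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
onehot = pd.get_dummies(labels, prefix="cluster")

# Step 3: append the one-hot cluster columns as extra predictors
X_aug = np.hstack([X, onehot.to_numpy()])
print(X_aug.shape)  # 2 original features + 3 cluster columns
```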

Regression

Below are the various regression techniques that can be applied:

  • Linear Regression
  • SGDR: Stochastic Gradient Descent regression with Lasso, Ridge, and ElasticNet options
  • KNN: K-Nearest Neighbors regression
  • CatBoostRegression
  • NuSVR: Support Vector Machine regression
  • ANN: Artificial Neural Network regression using Keras
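Several of the listed regressors can be compared in a few lines of scikit-learn; the synthetic data are placeholders, and CatBoost and the Keras ANN are omitted here to keep the sketch dependency-free:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import NuSVR

# Synthetic stand-in for the attribute table: 5 predictors, 1 target
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.1, size=200)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "Linear": LinearRegression(),
    "SGDR (ElasticNet)": SGDRegressor(penalty="elasticnet", random_state=0),
    "KNN": KNeighborsRegressor(n_neighbors=5),
    "NuSVR": NuSVR(),
}
scores = {name: m.fit(X_train, y_train).score(X_test, y_test)
          for name, m in models.items()}
for name, r2 in scores.items():
    print(f"{name}: R2 = {r2:.3f}")
```

Comparing held-out R² like this is a quick way to see which family of models suits the data before tuning any one of them.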

Classification

Below are the various classification models that can be used:

  • LogisticRegression
  • GaussianNaiveBayes
  • CatBoostClassification
  • NuSVC: Support Vector Machine classification
  • QDA: Quadratic Discriminant Analysis
  • GMM: Gaussian Mixture Model
  • ANN: Artificial Neural Network classification using Keras
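As with regression, several of the listed classifiers can be compared with scikit-learn alone; the synthetic dataset is a placeholder, and CatBoost, GMM, and the Keras ANN are left out of this sketch:

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import NuSVC

# Synthetic stand-in for the attribute table with a class label target
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y,
                                                    random_state=0)

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "GaussianNaiveBayes": GaussianNB(),
    "QDA": QuadraticDiscriminantAnalysis(),
    "NuSVC": NuSVC(),
}
scores = {name: m.fit(X_train, y_train).score(X_test, y_test)
          for name, m in models.items()}
print(scores)  # held-out accuracy per model
```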

Semi-Supervised Learning

Imbalanced Classification

Most of our data is imbalanced. The following correction techniques apply to all classification models:

  • ROS: Random Oversampling
  • SMOTE: Synthetic Minority Oversampling Technique
  • ADASYN: Adaptive Synthetic Sampling
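Random oversampling is simple enough to sketch in plain NumPy; in practice the imbalanced-learn package provides RandomOverSampler, SMOTE, and ADASYN behind a common fit_resample interface. The class counts below are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical imbalanced dataset: 90 majority samples, 10 minority samples
X = rng.normal(size=(100, 3))
y = np.array([0] * 90 + [1] * 10)

# ROS: resample the minority class with replacement until classes balance
minority_idx = np.flatnonzero(y == 1)
extra = rng.choice(minority_idx, size=90 - 10, replace=True)
X_bal = np.vstack([X, X[extra]])
y_bal = np.concatenate([y, y[extra]])
print(np.bincount(y_bal))  # [90 90]
```

SMOTE and ADASYN go one step further and synthesize new minority samples by interpolating between minority neighbors, rather than duplicating existing rows.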