Machine Learning on Horizon-Guided Attributes for Reservoir Property Prediction
- Seismic attributes are extracted along interpreted horizons
- A window, e.g. 20 ms, is used to extract the various attributes.
- The attributes are then exported to x y z flat files. As a working example, assume 10 attributes are extracted.
- Well petrophysical data, e.g. porosity, permeability, or net-to-gross:
- These are upscaled to the equivalent of the 20 ms window and supplied as a CSV file.
- A dataframe (a table) with all the attributes is generated:
- Each attribute is listed in a column while the rows represent the individual locations, i.e. trace locations
- Format the horizon files into one file
- Scale the horizon data
- Create a well file with all the attributes back-interpolated at the well locations, with the last column being the petrophysical attribute, e.g. permeability
We are now ready to apply Machine Learning:
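The merge, scaling, and back-interpolation steps above can be sketched as follows. This is a minimal illustration, not the swattriblist.py implementation: the column names (`attr0`…, `PERM`), the synthetic trace and well coordinates, and the use of nearest-neighbor interpolation are all assumptions for the example.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import griddata
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the merged horizon file: one row per trace
# location, one column per extracted attribute (10 in this example).
n_traces, n_attrs = 500, 10
horizon = pd.DataFrame(rng.normal(size=(n_traces, n_attrs)),
                       columns=[f"attr{i}" for i in range(n_attrs)])
horizon.insert(0, "x", rng.uniform(0, 5000, n_traces))
horizon.insert(1, "y", rng.uniform(0, 5000, n_traces))

# Scale the attribute columns only, not the coordinates
attr_cols = [c for c in horizon.columns if c.startswith("attr")]
horizon[attr_cols] = StandardScaler().fit_transform(horizon[attr_cols])

# Hypothetical well locations with an upscaled petrophysical value
wells = pd.DataFrame({"x": rng.uniform(500, 4500, 8),
                      "y": rng.uniform(500, 4500, 8),
                      "PERM": rng.lognormal(2.0, 0.5, 8)})

# Back-interpolate every attribute at the well locations
pts = horizon[["x", "y"]].to_numpy()
for c in attr_cols:
    wells[c] = griddata(pts, horizon[c].to_numpy(),
                        wells[["x", "y"]].to_numpy(), method="nearest")

# Reorder so the last column is the target petrophysical attribute
wells = wells[["x", "y"] + attr_cols + ["PERM"]]
print(wells.shape)  # (8, 13): x, y, 10 attributes, PERM
```

In practice the horizon flat files and the well CSV would be read with `pd.read_csv` instead of being generated.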
- Check data distributions and statistical ranges
- Check for linearity between various predictors amongst themselves and with the target
- Generate a matrix scatter plot
- Check for feature importance using RFE (Recursive Feature Elimination)
- Check for feature contribution using PCA (Principal Component Analysis)
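The exploratory checks listed above can be sketched with standard scikit-learn and pandas calls. This is a generic illustration on synthetic data, not the swattriblist.py code; the column names and the choice of 4 features to keep in RFE are assumptions.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Stand-in for the well dataframe: 10 attributes plus one target
X, y = make_regression(n_samples=80, n_features=10, n_informative=4,
                       noise=10.0, random_state=0)
df = pd.DataFrame(X, columns=[f"attr{i}" for i in range(10)])
df["PERM"] = y

# Distributions and statistical ranges
print(df.describe().loc[["mean", "std", "min", "max"]])

# Linearity amongst predictors and with the target
corr = df.corr()
print(corr["PERM"].sort_values(ascending=False).head())

# Matrix scatter plot (needs a display or savefig in practice):
# pd.plotting.scatter_matrix(df, figsize=(12, 12))

# Feature importance via RFE, keeping e.g. the 4 best predictors
features = df.drop(columns="PERM")
rfe = RFE(LinearRegression(), n_features_to_select=4)
rfe.fit(features, df["PERM"])
print(list(features.columns[rfe.support_]))

# Feature contribution via PCA on the scaled predictors
Xs = StandardScaler().fit_transform(features)
pca = PCA().fit(Xs)
print(pca.explained_variance_ratio_.round(3))
```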
swattriblist.py has many models that can be fitted to your data; they are all based on the scikit-learn package.
CatBoost is installed and used instead of XGBoost.
- KMEANS is first tested to identify the optimum number of clusters
- Once the optimum number of clusters is found, KMEANS is applied to the predictors
- The resulting cluster labels are then one-hot encoded and added as predictors for further model fitting
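The three KMEANS steps above can be sketched as follows. Using the silhouette score to pick the number of clusters is one common choice (an elbow plot of inertia is another); the range of k tested and the synthetic data are assumptions for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import OneHotEncoder

# Synthetic predictor matrix with a known cluster structure
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# Step 1: test a range of k and keep the best silhouette score
scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)
best_k = max(scores, key=scores.get)
print("optimum number of clusters:", best_k)

# Step 2: apply KMeans with the optimum k to the predictors
labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X)

# Step 3: one-hot encode the cluster labels and append them
onehot = OneHotEncoder().fit_transform(labels.reshape(-1, 1)).toarray()
X_aug = np.hstack([X, onehot])
print(X_aug.shape)  # (300, n_original_features + best_k)
```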
- tSNE: t-distributed Stochastic Neighbor Embedding. Attempts to project all your attributes onto 2 components
- umap: Uniform Manifold Approximation and Projection. A powerful dimensionality-reduction technique that projects the data onto 2 or 3 components
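A minimal sketch of the two projections above, on synthetic data. t-SNE ships with scikit-learn; UMAP comes from the separate umap-learn package, so it is shown commented out here in case that package is not installed.

```python
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE

# 10 attributes with some cluster structure
X, _ = make_blobs(n_samples=200, n_features=10, centers=3, random_state=0)

# t-SNE: project all attributes down to 2 components
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(emb.shape)  # (200, 2)

# UMAP: 2 or 3 components (requires `pip install umap-learn`)
# import umap
# emb3 = umap.UMAP(n_components=3).fit_transform(X)
```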
Below are the various regression techniques that can be applied:
- Linear Regression
- SGDR: Stochastic Gradient Descent Regression with Lasso, Ridge, and ElasticNet options
- KNN: K-Nearest Neighbors
- CatBoost Regression
- NuSVR: Support Vector Machine Regression
- ANN Regression: Artificial Neural Network using Keras
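The scikit-learn regressors in the list can be sketched as below on synthetic data; this is not the swattriblist.py code. CatBoost and the Keras ANN are omitted since they are separate packages, and the scaling pipelines are an assumption (SGD and SVR are sensitive to feature scale).

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import NuSVR

X, y = make_regression(n_samples=300, n_features=10, noise=15.0,
                       random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "Linear Regression": LinearRegression(),
    "SGDR (ElasticNet)": make_pipeline(
        StandardScaler(),
        SGDRegressor(penalty="elasticnet", random_state=0)),
    "KNN": KNeighborsRegressor(n_neighbors=5),
    "NuSVR": make_pipeline(StandardScaler(), NuSVR()),
}
for name, model in models.items():
    # .score returns R^2 on the held-out split
    r2 = model.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: R2 = {r2:.3f}")
```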
Below are various classification models that can be used:
- LogisticRegression
- GaussianNaiveBayes
- CatBoostClassification
- NuSVC: Support Vector Machine Classification
- QDA: Quadratic Discriminant Analysis
- GMM: Gaussian Mixture Model
- ANN Classification Artificial Neural Network using Keras
Most of our data is imbalanced. The following correction techniques apply to all classification models:
- ROS: Random Oversampling
- SMOTE: Synthetic Minority Oversampling Technique
- ADASYN: Adaptive Synthetic Sampling
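A minimal sketch of random oversampling using only scikit-learn's `resample`, on a synthetic imbalanced dataset. SMOTE and ADASYN come from the separate imbalanced-learn package, so their calls are shown commented out; the 90/10 class split is an assumption for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.utils import resample

# Imbalanced two-class example: roughly 90% / 10%
X, y = make_classification(n_samples=300, n_features=8,
                           weights=[0.9, 0.1], random_state=0)

# ROS: resample the minority class (with replacement) up to the
# majority-class count, then stack the two back together
majority, minority = X[y == 0], X[y == 1]
minority_up = resample(minority, replace=True,
                       n_samples=len(majority), random_state=0)
X_bal = np.vstack([majority, minority_up])
y_bal = np.array([0] * len(majority) + [1] * len(minority_up))
print(np.bincount(y_bal))  # both classes now have equal counts

# SMOTE and ADASYN (require `pip install imbalanced-learn`):
# from imblearn.over_sampling import SMOTE, ADASYN
# X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
# X_res, y_res = ADASYN(random_state=0).fit_resample(X, y)
```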