Machine Learning on Horizon-Guided Attributes for Reservoir Property Prediction
- Seismic attributes are extracted along interpreted horizons
- A window, e.g. 20 ms, is used to extract the various attributes.
- The attributes are then exported to x y z flat files. As a working example, assume 10 attributes are extracted.
- Well petrophysical data, e.g. porosity, permeability, or net-to-gross:
- These are upscaled to the equivalent of the 20 ms window and supplied as a CSV file.
- A dataframe (a table) with all the attributes is generated:
- Each attribute is listed in a column while the rows represent the individual locations, i.e. trace locations
- Format the horizon files into one file
- Scale the horizon data
- Create a well file with all the attributes back-interpolated at the well locations, with the last column being the petrophysical attribute, e.g. permeability
We are now ready to apply Machine Learning:
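The merge, scaling, and back-interpolation steps above can be sketched as follows. This is a minimal illustration, not the swattriblist.py implementation: the column names (`attr0`…, `PERM`), the synthetic trace and well coordinates, and the use of nearest-neighbor interpolation are all assumptions for the example.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import griddata
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the merged horizon file: one row per trace
# location, one column per extracted attribute (10 in this example).
n_traces, n_attrs = 500, 10
horizon = pd.DataFrame(rng.normal(size=(n_traces, n_attrs)),
                       columns=[f"attr{i}" for i in range(n_attrs)])
horizon.insert(0, "x", rng.uniform(0, 5000, n_traces))
horizon.insert(1, "y", rng.uniform(0, 5000, n_traces))

# Scale the attribute columns only, not the coordinates
attr_cols = [c for c in horizon.columns if c.startswith("attr")]
horizon[attr_cols] = StandardScaler().fit_transform(horizon[attr_cols])

# Hypothetical well locations with an upscaled petrophysical value
wells = pd.DataFrame({"x": rng.uniform(500, 4500, 8),
                      "y": rng.uniform(500, 4500, 8),
                      "PERM": rng.lognormal(2.0, 0.5, 8)})

# Back-interpolate every attribute at the well locations
pts = horizon[["x", "y"]].to_numpy()
for c in attr_cols:
    wells[c] = griddata(pts, horizon[c].to_numpy(),
                        wells[["x", "y"]].to_numpy(), method="nearest")

# Reorder so the last column is the target petrophysical attribute
wells = wells[["x", "y"] + attr_cols + ["PERM"]]
print(wells.shape)  # (8, 13): x, y, 10 attributes, PERM
```

In practice the horizon flat files and the well CSV would be read with `pd.read_csv` instead of being generated.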
- Check data distributions and statistical ranges
- Check for linearity between various predictors amongst themselves and with the target
- Generate a matrix scatter plot
- Check for feature importance using RFE (Recursive Feature Elimination)
- Check for feature contribution using PCA (Principal Component Analysis)
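The exploratory checks listed above can be sketched with standard scikit-learn and pandas calls. This is a generic illustration on synthetic data, not the swattriblist.py code; the column names and the choice of 4 features to keep in RFE are assumptions.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Stand-in for the well dataframe: 10 attributes plus one target
X, y = make_regression(n_samples=80, n_features=10, n_informative=4,
                       noise=10.0, random_state=0)
df = pd.DataFrame(X, columns=[f"attr{i}" for i in range(10)])
df["PERM"] = y

# Distributions and statistical ranges
print(df.describe().loc[["mean", "std", "min", "max"]])

# Linearity amongst predictors and with the target
corr = df.corr()
print(corr["PERM"].sort_values(ascending=False).head())

# Matrix scatter plot (needs a display or savefig in practice):
# pd.plotting.scatter_matrix(df, figsize=(12, 12))

# Feature importance via RFE, keeping e.g. the 4 best predictors
features = df.drop(columns="PERM")
rfe = RFE(LinearRegression(), n_features_to_select=4)
rfe.fit(features, df["PERM"])
print(list(features.columns[rfe.support_]))

# Feature contribution via PCA on the scaled predictors
Xs = StandardScaler().fit_transform(features)
pca = PCA().fit(Xs)
print(pca.explained_variance_ratio_.round(3))
```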
swattriblist.py has many models that can be fitted to your data; they are all based on the scikit-learn package.
CatBoost is installed and used instead of XGBoost.
- KMEANS is first tested to identify the optimum number of clusters
- Once the optimum number of clusters is found, KMEANS is applied to the predictors
- The resulting cluster labels are then one-hot encoded and added as predictors for further model fitting
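The three KMEANS steps above can be sketched as follows. Using the silhouette score to pick the number of clusters is one common choice (an elbow plot of inertia is another); the range of k tested and the synthetic data are assumptions for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import OneHotEncoder

# Synthetic predictor matrix with a known cluster structure
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# Step 1: test a range of k and keep the best silhouette score
scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)
best_k = max(scores, key=scores.get)
print("optimum number of clusters:", best_k)

# Step 2: apply KMeans with the optimum k to the predictors
labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X)

# Step 3: one-hot encode the cluster labels and append them
onehot = OneHotEncoder().fit_transform(labels.reshape(-1, 1)).toarray()
X_aug = np.hstack([X, onehot])
print(X_aug.shape)  # (300, n_original_features + best_k)
```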
- tSNE: t-distributed Stochastic Neighbor Embedding. Attempts to project all your attributes onto 2 components
- umap: Uniform Manifold Approximation and Projection. A powerful dimensionality-reduction technique that projects the data onto 2 or 3 components
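A minimal sketch of the two projections above, on synthetic data. t-SNE ships with scikit-learn; UMAP comes from the separate umap-learn package, so it is shown commented out here in case that package is not installed.

```python
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE

# 10 attributes with some cluster structure
X, _ = make_blobs(n_samples=200, n_features=10, centers=3, random_state=0)

# t-SNE: project all attributes down to 2 components
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(emb.shape)  # (200, 2)

# UMAP: 2 or 3 components (requires `pip install umap-learn`)
# import umap
# emb3 = umap.UMAP(n_components=3).fit_transform(X)
```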
Below are the various regression techniques that can be applied:
- Linear Regression
- SGDR: Stochastic Gradient Descent Regression with Lasso, Ridge, and ElasticNet options
- KNN: K-Nearest Neighbors
- CatBoost Regression
- NuSVR: Support Vector Machine Regression
- ANN Regression: Artificial Neural Network using Keras
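The scikit-learn regressors in the list can be sketched as below on synthetic data; this is not the swattriblist.py code. CatBoost and the Keras ANN are omitted since they are separate packages, and the scaling pipelines are an assumption (SGD and SVR are sensitive to feature scale).

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import NuSVR

X, y = make_regression(n_samples=300, n_features=10, noise=15.0,
                       random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "Linear Regression": LinearRegression(),
    "SGDR (ElasticNet)": make_pipeline(
        StandardScaler(),
        SGDRegressor(penalty="elasticnet", random_state=0)),
    "KNN": KNeighborsRegressor(n_neighbors=5),
    "NuSVR": make_pipeline(StandardScaler(), NuSVR()),
}
for name, model in models.items():
    # .score returns R^2 on the held-out split
    r2 = model.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: R2 = {r2:.3f}")
```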
Below are various classification models that can be used:
- LogisticRegression
- GaussianNaiveBayes
- CatBoostClassification
- NuSVC: Support Vector Machine Classification
- QDA: Quadratic Discriminant Analysis
- GMM: Gaussian Mixture Model
- ANN Classification Artificial Neural Network using Keras
Most of our data is imbalanced. The following correction techniques apply to all classification models:
- ROS: Random Oversampling
- SMOTE: Synthetic Minority Oversampling Technique
- ADASYN: Adaptive Synthetic Sampling
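A minimal sketch of random oversampling using only scikit-learn's `resample`, on a synthetic imbalanced dataset. SMOTE and ADASYN come from the separate imbalanced-learn package, so their calls are shown commented out; the 90/10 class split is an assumption for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.utils import resample

# Imbalanced two-class example: roughly 90% / 10%
X, y = make_classification(n_samples=300, n_features=8,
                           weights=[0.9, 0.1], random_state=0)

# ROS: resample the minority class (with replacement) up to the
# majority-class count, then stack the two back together
majority, minority = X[y == 0], X[y == 1]
minority_up = resample(minority, replace=True,
                       n_samples=len(majority), random_state=0)
X_bal = np.vstack([majority, minority_up])
y_bal = np.array([0] * len(majority) + [1] * len(minority_up))
print(np.bincount(y_bal))  # both classes now have equal counts

# SMOTE and ADASYN (require `pip install imbalanced-learn`):
# from imblearn.over_sampling import SMOTE, ADASYN
# X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
# X_res, y_res = ADASYN(random_state=0).fit_resample(X, y)
```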