Transform seismic SEG-Y to logs using Machine Learning
This is a three-step workflow.
Using preparelogs.py, each well needs:
- the LAS file, or any flat file listing depth and log values
- the deviation survey file with MD X Y Z
- the time-depth listing to convert the log to time
Using logsrange.py, compute the average start and end times for your data set.
Using mlseistolog.py, submit the attribute SEG-Y files and the logs CSV file
generated by preparelogs.py. The result is a SEG-Y file whose values are the
predicted logs.
The approach is to slice the attribute SEG-Ys and the logs file, create a data set, then
submit it to CatBoostRegressor to compute a model per horizontal slice. The final product
is a predicted log value at every trace location. The output seismic is created one slice at a time.
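As a minimal sketch of the per-slice flow, on synthetic data and with an ordinary least-squares fit standing in for CatBoost so the example runs with NumPy alone:

```python
import numpy as np

rng = np.random.default_rng(0)
n_traces, n_attrs, n_wells = 200, 3, 12

# One horizontal slice: a value per trace from each attribute volume.
attrs = rng.normal(size=(n_traces, n_attrs))

# Pretend the first n_wells traces sit at well locations and the log
# responds (nearly) linearly to the attributes -- for the sketch only.
true_w = np.array([0.5, -1.2, 2.0])
log_at_wells = attrs[:n_wells] @ true_w + rng.normal(scale=0.01, size=n_wells)

# Standard-scale the attributes over the whole slice.
scaled = (attrs - attrs.mean(axis=0)) / attrs.std(axis=0)

# Fit a model on the well rows; an intercept column plus least squares
# stands in for CatBoost to keep the sketch dependency-free.
A = np.column_stack([np.ones(n_traces), scaled])
coef, *_ = np.linalg.lstsq(A[:n_wells], log_at_wells, rcond=None)

# Predict the log at every trace location for this slice.
predicted_slice = A @ coef
```

Repeating this loop over every time slice and writing each `predicted_slice` back out trace by trace yields the output volume.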
- Volume attributes can be extracted from the time seismic SEG-Y, e.g. instantaneous attributes, spectral attributes, or inversion attributes. These SEG-Y files are listed in a text file and submitted to mlseistolog.py.
- Any log can be used for prediction, e.g. gamma ray or density, but only a single log per file. Each log is exported as an LAS file, a deviation file, and a depth-time listing. For each well, all three are input to preparelogs.py along with the well name. A composite file is generated, or appended to if one already exists; each well is processed using its respective three files and unique name, and as more wells are processed they are appended to the same file in the same directory. It is recommended to write a batch file to process all the wells, so that another log, e.g. density, can easily be processed for all available wells. preparelogs.py can resample in time to any integer interval, e.g. 2, 3, or 4 ms; this should match the seismic sampling interval.
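A sketch of the depth-to-time conversion and resampling step, using NumPy interpolation on made-up depth/time values (the real preparelogs.py file formats and column layouts are not assumed here):

```python
import numpy as np

# Hypothetical inputs for one well: a depth-indexed log and a
# time-depth listing (TWT in ms vs measured depth in m).
depth = np.arange(1000.0, 1010.0, 0.5)          # m
gamma = 50.0 + 10.0 * np.sin(depth / 2.0)       # API, synthetic
td_depth = np.array([1000.0, 1005.0, 1010.0])   # m
td_time = np.array([800.0, 804.0, 808.0])       # ms TWT

# Convert each log sample's depth to two-way time...
sample_time = np.interp(depth, td_depth, td_time)

# ...then resample the log onto a regular time grid; dt should match
# the seismic sampling interval in ms (2 ms here).
dt = 2
t_grid = np.arange(np.ceil(sample_time[0]), sample_time[-1], dt)
gamma_t = np.interp(t_grid, sample_time, gamma)
```

The deviation survey enters the real workflow to place each sample at its true X Y position, which this simplified sketch omits.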
- Once the composite file is generated, run logsrange.py to check the average minimum and maximum times to use in mlseistolog.py. mlseistolog.py accepts a text file listing all the SEG-Ys to be used as attributes (predictors), the well list file previously generated by preparelogs.py, and the start and end slices to process.
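The average start/end computation can be illustrated on a toy composite file; the column names used here (`wellname`, `t2w`) are assumptions, not necessarily those written by preparelogs.py:

```python
import csv, io
from statistics import mean

# A tiny stand-in for the composite CSV written by preparelogs.py.
data = io.StringIO("""wellname,t2w,x,y,GR
W1,800,0,0,55
W1,802,0,0,60
W2,796,5,5,40
W2,806,5,5,45
""")

starts, ends = {}, {}
for row in csv.DictReader(data):
    t = float(row["t2w"])
    w = row["wellname"]
    starts[w] = min(starts.get(w, t), t)
    ends[w] = max(ends.get(w, t), t)

avg_start = mean(starts.values())  # average first sample time over wells
avg_end = mean(ends.values())      # average last sample time over wells
```

These two numbers suggest sensible start and end slices to pass to mlseistolog.py.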
- For each depth slice (two-way time), the program extracts a slice from each SEG-Y attribute volume and compiles them into one big table, the rows of which are individual traces and the columns the various attributes. If there are 10 attribute volumes, the table has 12 columns: the first 2 are the x and y coordinates, and the last 10 are the slices from each attribute.
- The final table is then standard-scaled.
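Building and scaling the slice table might look like the following sketch, with 10 synthetic attribute slices giving the 12 columns described above (whether the program also scales the coordinate columns is not assumed; here only the attributes are scaled):

```python
import numpy as np

rng = np.random.default_rng(1)
n_traces = 500

# x/y coordinates of each trace plus one slice from each of 10
# hypothetical attribute volumes: 2 + 10 = 12 columns.
xy = rng.uniform(0, 5000, size=(n_traces, 2))
slices = [rng.normal(loc=i, scale=1 + i, size=n_traces) for i in range(10)]
table = np.column_stack([xy] + slices)   # shape (500, 12)

# Standard-scale the attribute columns (zero mean, unit variance),
# leaving the coordinate columns untouched.
attrs = table[:, 2:]
table[:, 2:] = (attrs - attrs.mean(axis=0)) / attrs.std(axis=0)
```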
- For the same depth slice (two-way time), the well log samples are read from the well file and another table is generated with the well data, the rows of which are the wells and the columns wellname, t2w, x, y, and log value. This table is then augmented by inserting the scaled seismic attributes as columns after the y column and before the log value. This is the model-building data set.
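One plausible way to attach the scaled attributes to each well row is a nearest-trace lookup, sketched below on synthetic coordinates (the actual matching logic in mlseistolog.py may differ):

```python
import numpy as np

rng = np.random.default_rng(2)

# Scaled seismic table for one slice: trace coordinates plus attributes.
trace_xy = rng.uniform(0, 1000, size=(50, 2))
trace_attrs = rng.normal(size=(50, 3))

# Well rows for the same slice: wellname, t2w, x, y, log value.
wells = [("W1", 800.0, trace_xy[7, 0], trace_xy[7, 1], 62.0),
         ("W2", 800.0, trace_xy[31, 0], trace_xy[31, 1], 48.0)]

rows = []
for name, t2w, wx, wy, logval in wells:
    # Nearest trace to the well location by squared distance.
    d2 = (trace_xy[:, 0] - wx) ** 2 + (trace_xy[:, 1] - wy) ** 2
    i = int(np.argmin(d2))
    # Attributes inserted after y and before the log value.
    rows.append([name, t2w, wx, wy, *trace_attrs[i], logval])
```

Each resulting row is wellname, t2w, x, y, the scaled attributes, then the log value, i.e. one training sample.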
- Right now only the CatBoostRegressor model is used. This is comparable to XGBoost, with almost the same parameters, the most significant of which are the learning rate, the tree depth, and the number of iterations. Only those are exposed on the command line for the user to control.
- After the model is built, the seismic table generated above is used to predict the log values for the remaining traces.
- After all the traces are predicted, they are inserted into a SEG-Y file, which represents the result of the ML exercise. The resulting slices are also saved as CSV files for further analysis. A cross plot of actual vs. predicted values is generated at a user-controlled increment, along with a table of the slice as a CSV. These should be used to analyse the quality of the model prediction.
- A PDF file displaying the actual and predicted logs for all wells is also generated as a QC.
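The QC outputs can be reproduced in miniature with Matplotlib's PdfPages; this is only an illustration on synthetic values, not the program's actual plotting code:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend for batch QC output
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

rng = np.random.default_rng(4)
actual = rng.normal(50, 10, size=200)
predicted = actual + rng.normal(scale=3, size=200)  # synthetic "prediction"

with PdfPages("qc.pdf") as pdf:
    # Cross plot of actual vs predicted for one slice; more pages
    # (e.g. one per well) can be appended with further pdf.savefig calls.
    fig, ax = plt.subplots()
    ax.scatter(actual, predicted, s=8)
    lims = [actual.min(), actual.max()]
    ax.plot(lims, lims, "r--")  # 1:1 reference line
    ax.set_xlabel("Actual log")
    ax.set_ylabel("Predicted log")
    pdf.savefig(fig)
    plt.close(fig)
```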