/2020_Clustering

Clustering algorithm to improve the retrieval of offshore-onshore correlation functions (Viens and Iwata, 2020, JGR)


Python codes to improve the retrieval of offshore-onshore correlation functions with machine learning (Viens and Iwata, 2020, JGR)

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

  • Update - 21/07/2020: The paper is available here (The accepted paper is also available on EarthArXiv).
  • Update - 03/06/2020: 2nd release of the code.
  • 06/03/2020: 1st release of the code.

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Description:

We developed a method based on unsupervised learning to improve the retrieval of deconvolution functions (DFs) between offshore and onshore stations. We applied the method to retrieve DFs between offshore seismic stations deployed above the Nankai Trough, Japan, and the surrounding onshore stations.

  • This repo contains three folders:
    • Codes folder:

      • Reproduce_Fig_1_2.py to reproduce Figures 1 and 2 of the paper.
      • Reproduce_Fig_4_5.py to reproduce Figures 4 and 5 of the paper (data download required, see below).
      • Function_clustering_DFs.py, the main function to perform the clustering (used by the two codes above).
    • Data folder:

      • Empty folder. The data required to reproduce Figures 4 and 5 can be downloaded here and should be placed in the Data folder.
    • Figures folder:

      • Contains the 4 figures generated by the two codes.

Additional info:

  • We follow the method proposed by Viens et al. (2017, GJI) to compute the deconvolution functions. The Python function (e.g., deconvolution_stab) can be found here.

  • The raw Hi-net and DONET 1 data can be downloaded from the Hi-net website.
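
For readers unfamiliar with deconvolution functions, the sketch below shows a generic regularized spectral division between two records. It is not the authors' deconvolution_stab function, whose exact stabilization is not described here: the function name deconvolve_regularized, the eps parameter, and the choice of adding a fraction of the mean source power to the denominator are all assumptions made for illustration.

```python
import numpy as np


def deconvolve_regularized(receiver, source, eps=0.01):
    """Deconvolve `receiver` by `source` in the frequency domain.

    Hypothetical helper: the denominator is stabilized by adding a fraction
    `eps` of the mean source power, which is one common choice among several.
    Returns the time-domain result with zero lag at index len(source) // 2.
    """
    n = len(source)
    src_spec = np.fft.rfft(source, n)
    rec_spec = np.fft.rfft(receiver, n)
    power = np.abs(src_spec) ** 2
    # Regularized spectral division: R * conj(S) / (|S|^2 + eps * mean(|S|^2)).
    spec = rec_spec * np.conj(src_spec) / (power + eps * power.mean())
    df = np.fft.irfft(spec, n)
    # Move zero lag to the centre so negative lags sit on the left.
    return np.fft.fftshift(df)


# Synthetic check: the "receiver" is the "source" delayed by 25 samples.
rng = np.random.default_rng(0)
source = rng.standard_normal(1024)
receiver = np.roll(source, 25)
df = deconvolve_regularized(receiver, source)
lags = np.arange(df.size) - df.size // 2
peak_lag = int(lags[np.argmax(df)])
```

In this synthetic case the peak of df falls at the imposed delay. With real 30-min records, one such function would be computed per time window before any stacking or clustering.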

Example:

The Reproduce_Fig_4_5.py code produces two figures.

  • The first figure (Figure 4 of the paper) is a plot of the 16563 30-min deconvolution functions (DFs) between the KME18 and ARIH stations for the Z-Z component, computed from the data recorded between April 1, 2015 and March 31, 2016. The waveforms from the raw stack over the year (b) and the energy ratio stack (c) are also shown. Assuming a theoretical Rayleigh wave velocity of 3.0 km/s, the first physical signals should arrive after about 60 s given the 181 km inter-station distance. Therefore, the clear arrivals in the anti-causal part between −60 s and −140 s in (b) and (c) are likely Rayleigh waves propagating between the two stations. However, strong arrivals near zero time lag can also be observed in (b), and they overlap with the Rayleigh wave arrivals in the causal part of the DF. Such non-physical arrivals are probably generated by ambient seismic field sources located between the two stations, as the inter-station path is mainly under the ocean. Note that the DF obtained with the energy ratio stack method (c) has fewer spurious arrivals than the raw stack over the year, but is still strongly asymmetric.

  • The second figure (Figure 5) is the output of the clustering algorithm. For this station pair, the knee method on the BIC score (Figure (b)) determines that the optimal number of clusters is five. The projection of the data on the first two PCs is shown in Figure (a) together with the clustering results. In Figure (c), we show the stacked DFs from the five clusters. The algorithm automatically selects the DF from cluster 1, which has clear anti-causal and causal Rayleigh wave arrivals and almost no spurious arrivals.
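
The clustering steps described for Figure 5 (projection on the first two PCs, BIC score, knee method, stacking per cluster) can be sketched as follows. This is a minimal illustration on synthetic waveforms, not the code in Function_clustering_DFs.py. It assumes scikit-learn is available and that the BIC comes from Gaussian mixture models. The names knee_index and cluster_dfs are hypothetical, and the knee is taken here as the point farthest from the straight line joining the first and last BIC values, which may differ from the repository's implementation. The automatic selection of the best cluster is not reproduced.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture


def knee_index(scores):
    """Index of the point farthest from the chord joining the first and last scores."""
    scores = np.asarray(scores, dtype=float)
    x = np.arange(scores.size, dtype=float)
    # Normalise both axes to [0, 1] so the distance does not depend on units.
    xn = x / x[-1]
    span = scores.max() - scores.min()
    yn = (scores - scores.min()) / span if span > 0 else np.zeros_like(scores)
    dx, dy = xn[-1] - xn[0], yn[-1] - yn[0]
    dist = np.abs(dy * (xn - xn[0]) - dx * (yn - yn[0])) / np.hypot(dx, dy)
    return int(np.argmax(dist))


def cluster_dfs(dfs, n_pcs=2, max_clusters=10, seed=0):
    """Cluster waveforms (one per row) and stack them within each cluster."""
    # Project each waveform on the first principal components.
    pcs = PCA(n_components=n_pcs).fit_transform(dfs)
    counts = list(range(1, max_clusters + 1))
    # Fit one Gaussian mixture per candidate number of clusters.
    models = [GaussianMixture(n_components=k, random_state=seed).fit(pcs)
              for k in counts]
    bic = [m.bic(pcs) for m in models]
    best = knee_index(bic)
    labels = models[best].predict(pcs)
    # Linear stack (mean) of the waveforms assigned to each cluster.
    stacks = {c: dfs[labels == c].mean(axis=0) for c in np.unique(labels)}
    return counts[best], labels, stacks, bic


# Synthetic stand-in for 30-min DFs: three groups of noisy wavelets.
rng = np.random.default_rng(0)
t = np.linspace(-200, 200, 401)


def wavelet(t0):
    return np.exp(-((t - t0) / 10.0) ** 2) * np.cos(2 * np.pi * (t - t0) / 20.0)


dfs = np.vstack([wavelet(t0) + 0.1 * rng.standard_normal((100, t.size))
                 for t0 in (-60.0, 0.0, 60.0)])
n_clusters, labels, stacks, bic = cluster_dfs(dfs, max_clusters=8)
```

The returned stacks dictionary plays the role of the per-cluster stacked DFs shown in panel (c), and bic is the curve on which the knee is picked in panel (b).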