Matching REAL catalogue to USGS catalogue for largest events
Out of the 24 USGS events in the timeframe 1 Jan 2020 to 30 Jun 2021 (right?):
Assuming a 4-second time window (+ and - from the USGS origin time), 7 events match.
Assuming a 30-second time window, 8 events match.
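The window matching above can be sketched like this (a minimal sketch; function and variable names are mine, not from the actual pipeline):

```python
from datetime import datetime, timedelta

def match_events(usgs_times, real_times, window_s):
    """Pair each USGS origin time with the closest REAL origin
    within +/- window_s seconds, if any."""
    tol = timedelta(seconds=window_s)
    pairs = []
    for u in usgs_times:
        hits = [r for r in real_times if abs(r - u) <= tol]
        if hits:
            pairs.append((u, min(hits, key=lambda r: abs(r - u))))
    return pairs

# The borderline event from this issue: origins 29 s apart
usgs = [datetime(2020, 9, 24, 9, 34, 6)]
real = [datetime(2020, 9, 24, 9, 34, 35)]
print(len(match_events(usgs, real, 4)))   # 0: outside +/- 4 s
print(len(match_events(usgs, real, 30)))  # 1: inside +/- 30 s
```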
The 1 event in the balance (matched at 30 s but not at 4 s):
USGS: 2020-09-24 09:34:06, 96.68°E 5.2464°N, mb 4.1, depth 220 km, likely plate interface
REAL: 2020-09-24 09:34:35, 96.1407°E 5.4246°N, ML 4.5, depth 32 km
Likely different events. Could be triggered seismicity?
The REAL event seems to have its own event cluster despite not being in the USGS catalogue.
Both of these are offshore to the Northwest:
The station coverage here is not fantastic.
After generating hypothetical travel times (considering only events shallower than 80 km in the USGS catalogue), I use a P-arrival fudge window of 4 s and an S-arrival fudge window of 8 s. This is very generous, but:
- I'm not sure what the accuracy of the USGS catalogue origin time is
- There will be velocity model errors anyway (Muksin 2019 suggests that the near-surface P-wave velocity is much lower?)
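The fudge-window check itself is simple; here is a minimal sketch (my own version, with placeholder names, not the real code):

```python
from datetime import datetime, timedelta

FUDGE_S = {"P": 4.0, "S": 8.0}  # fudge windows in seconds, generous on purpose

def pick_matches(origin, travel_time_s, pick_time, phase):
    """True if an observed pick falls within the fudge window
    around the predicted arrival (origin time + travel time)."""
    predicted = origin + timedelta(seconds=travel_time_s)
    return abs((pick_time - predicted).total_seconds()) <= FUDGE_S[phase]

# Hypothetical example: predicted P at origin + 25 s
origin = datetime(2021, 4, 5, 12, 0, 0)
print(pick_matches(origin, 25.0, origin + timedelta(seconds=28), "P"))  # True: 3 s late
print(pick_matches(origin, 25.0, origin + timedelta(seconds=30), "P"))  # False: 5 s late
```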
I find that all the events up to May have corresponding phases as recorded by EQT (they made it past the customfilter), which is not surprising: these are the largest-magnitude events and you'd imagine it would be difficult to miss them. There are 16 of them.
Of these 16, 7 are found in the REAL catalogue.
Of these 16, 3 have only 1 or 2 detections, so they would not be accepted by the association.
But since this is the customfilter set (S SNR > 8, agreement == 20), if I look at the merged detections (which, by the way, drop EQT picks with only 1 phase), there are at least 53 detections (hence 106 phases), along with maybe 1 or 2 more events (could be aftershocks, or something else entirely).
Hence the recommendation: run the association on the larger data set, with the downside that it would probably take a long time.
I think I'll want to find a good grid spacing, which depends on the station-station distance, which I'll have to work out myself. I should also run tests on a few specific days to check the number of events REAL produces (which is also partly why I wrote the testbench).
Note: the maximum event-station distance is less than 200 km, so that will limit the search size for REAL grids.
- Find out what's the smallest grid spacing you're okay with
Speed
- Vary no. of processors, keeping thread count at 32
- Vary no. of vertical cells
- Vary no. of horizontal cells
No. of events
- Vary grid spacing
- Vary threshold (unlikely that this is limiting...)
On the bright side, most of the nodes are occupied right now so :) Merry Christmas to me I guess
Along the linear array, you more or less want a grid spacing of 0.5 km because that's around half of the station distance (~1.3 km, ish).
There are some very dense series but that probably doesn't matter too much for the initial association.
Unfortunately the grid search algorithm naively searches in a grid so it's very inefficient.
If you already know what the possible locations are, you could use pre-defined templates for the locations, so it's no longer a naive search.
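The half-station-spacing rule of thumb above can be computed directly. A sketch, using made-up coordinates roughly matching the ~1.3 km spacing (real station coordinates would go in `stations`):

```python
from itertools import combinations
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def suggested_spacing_km(stations):
    """Half the minimum inter-station distance, as a grid-spacing guide."""
    dmin = min(haversine_km(*a, *b) for a, b in combinations(stations, 2))
    return dmin / 2

# Hypothetical linear-array coordinates ~1.3 km apart
stations = [(5.20, 96.00), (5.20, 96.012), (5.20, 96.024)]
print(round(suggested_spacing_km(stations), 2))  # ~0.66 km
```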
Test period: 5 Apr to 12 Apr 2021 since that includes 2 events in the USGS catalogue.
Speed benchmark: 5 Apr
I should add an option to the REAL wrapper so that it automatically distributes the file list over some number of array jobs.
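A round-robin split would be enough for that (a sketch; `distribute` is a hypothetical helper, not an existing wrapper option):

```python
def distribute(files, n_jobs):
    """Split a file list into n_jobs near-equal chunks, one per array job."""
    chunks = [[] for _ in range(n_jobs)]
    for i, f in enumerate(files):
        chunks[i % n_jobs].append(f)  # round-robin assignment
    return chunks

# Example: the 5-12 Apr test period spread over 3 array jobs
days = [f"2021-04-{d:02d}" for d in range(5, 13)]
for job_id, chunk in enumerate(distribute(days, 3)):
    print(job_id, chunk)
```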
No. of cores benchmark, 5 Apr 21
1 day, 32 OMP threads
16 cores: 9m 40s
8 cores: 9m 51s
4 cores: 13m 50s
2 cores: 13m 48s
1 core: 13m 41s
Grid spacing of 0.1 deg (horizontal) and 5 km (vertical)
Range of 2 deg (horizontal) and 60 km (vertical)
Settled on ~11 min run time per day with a 1 deg range and 0.05 deg spacing.
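Back-of-envelope node counts for these two settings (my rough count, assuming the search simply visits every node of a horizontal × horizontal × vertical grid with the ranges as total extents; this is not REAL's actual accounting):

```python
def grid_nodes(range_deg, spacing_deg, range_km, spacing_km):
    """Rough node count: horizontal nodes squared times vertical levels,
    endpoints included."""
    nh = round(range_deg / spacing_deg) + 1
    nv = round(range_km / spacing_km) + 1
    return nh * nh * nv

print(grid_nodes(2.0, 0.1, 60.0, 5.0))   # coarse benchmark run: 5733
print(grid_nodes(1.0, 0.05, 60.0, 5.0))  # settled run: also 5733
```

Under this crude count the two settings visit the same number of nodes, which would be consistent with the similar per-day run times.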
Running on Jan to March, all EQT phases (no custom filter) with 16 workers.
Want to add an n_worker option to make it even faster.
To compile the number of events, check for collisions with the original catalogue.