Welcome!
obspyDMT (ObsPy Data Management Tool) is a command-line tool for retrieving, processing and managing massive seismic datasets fully automatically, in serial or in parallel. Complementary processing and management tools are provided in addition to the obspyDMT options.
The tool was developed mainly to automate the following tasks:
- Retrieval of waveforms (MSEED or SAC), response files and metadata from the IRIS and ORFEUS (via ArcLink) archives, in serial or in parallel, for single or large requests.
- Support for event-based and continuous requests.
- Extraction of event information via user-defined options (time span, magnitude, depth and event location) from IRIS and EMSC (European-Mediterranean Seismological Centre).
- Updating of existing archives (waveforms, response files and metadata).
- Processing of the data in serial or in parallel (e.g. tapering, removing the trend of the time series, filtering and instrument correction).
- Management of large seismic datasets.
- Plotting tools (event and/or station locations, ray coverage for event-station pairs, and epicentral-distance plots for all archived waveforms).
This tutorial has been divided into the following sections:
- How to cite obspyDMT
- Let's get started: install obspyDMT and check your local machine for the required dependencies.
- Quick tour: run a quick tour of obspyDMT.
- Option types: obspyDMT has two types of options: option-1 (with value) and option-2 (without value).
- event-info request: search for events and get information about them without downloading waveforms.
- event-based request: retrieve the waveforms, response files and metadata of all the requested stations for all the events found in the archive.
- continuous request: retrieve the waveforms, response files and metadata of all the requested stations for the specified time span.
- Geographical restriction: restrict the search to events that occurred in a specific geographical region and/or retrieve data from stations within a circular or rectangular bounding area.
- Instrument correction: correct to displacement, velocity or acceleration using the full response file or Poles And Zeros (PAZ).
- Update: continue an interrupted request or complete an existing archive.
- Plot: for an existing folder, plot all the events and/or all the stations, ray paths for event-station pairs and epicentral-distance/time for the waveforms, using GMT-5 or basemap tools.
- Folder structure: how obspyDMT organizes your retrieved and processed data in the file-based mode.
- Available options: all options currently available in obspyDMT.
If you use obspyDMT, please consider citing the code as:
Kasra Hosseini (2013), obspyDMT (Version 0.3.0) [software] [https://github.com/kasra-hosseini/obspyDMT]
Once a working Python and ObsPy environment is installed, there are two ways to install obspyDMT:
- Manually from the source code:
clone the obspyDMT git repository (or fork obspyDMT in GitHub and clone your fork):
$ git clone https://github.com/kasra-hosseini/obspyDMT.git /path/to/my/obspyDMT
$ cd /path/to/my/obspyDMT
$ python setup.py install
Alternatively:
$ git clone https://github.com/kasra-hosseini/obspyDMT.git /path/to/my/obspyDMT
$ cd /path/to/my/obspyDMT
$ pip install -v -e .
- Prepackaged Modules from Python Package Index (PyPI):
$ easy_install -N obspyDMT
If none of these works for you, the source code can be downloaded directly from either the PyPI or the GitHub website.
Finally, to check the dependencies required for running the code properly:
$ obspyDMT --check
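The idea behind such a dependency check can be approximated in plain Python. The sketch below is not obspyDMT's implementation: `find_missing` is a hypothetical helper that only tests whether modules can be located for import, and the module list in the example is an assumption made for illustration.

```python
import importlib.util


def find_missing(modules):
    """Return the top-level module names in `modules` that cannot be found."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed dependency list; the authoritative one is reported by `obspyDMT --check`.
    missing = find_missing(["obspy", "numpy"])
    print("missing modules:", missing or "none")
```

An empty result only means the modules are importable; it says nothing about their versions, which `--check` also reports.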
ATTENTION: once obspyDMT is installed on your machine, it can be run from any directory. However, if you want to use the source code instead:
$ cd /path/to/my/obspyDMT.py
$ ./obspyDMT.py --check
In all the following examples, we assume that obspyDMT is already installed.
To run a quick tour for obspyDMT:
$ obspyDMT --tour
A DMT-Tour-Data directory will be created in the current path, and the retrieved/processed data will be organized there (please refer to the Folder structure section for more information).
The retrieved raw counts can be plotted with:
$ obspyDMT --plot_epi 'DMT-Tour-Data'
To plot the corrected waveforms:
$ obspyDMT --plot_epi 'DMT-Tour-Data' --plot_type corrected
obspyDMT plots the ray coverage (the ray path between each event-station pair) with:
$ obspyDMT --plot_ray 'DMT-Tour-Data'
There are two types of options in obspyDMT: option-1 (with value) and option-2 (without value). For the first type, the user provides one or more values, which are stored and used by the program as input. Type-2 options take no value; adding one activates or deactivates a feature (e.g. '--check', described in the Let's get started section, makes the program check all the dependencies required for running the code properly).
The general form for entering input (i.e. changing the default values) is as follows:
$ obspyDMT --option-1 'value' --option-2
To show all the available options with short descriptions:
$ obspyDMT --help
In the help output, options listed as --option=OPTION are type-1 (with value) and options listed as --option are type-2 (without value).
ONE GOOD THING: the order of the options does not matter!
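When requests are scripted, the two option types map naturally onto a mapping (type-1) and a list (type-2). The following is a minimal sketch, assuming Python 3.8 or later for `shlex.join`; `build_command` is a hypothetical helper and not part of obspyDMT.

```python
import shlex


def build_command(valued=None, flags=()):
    """Assemble an obspyDMT command line as a shell-safe string.

    valued: mapping of type-1 option names to their values
    flags:  iterable of type-2 option names (no value)
    """
    args = ["obspyDMT"]
    for name, value in (valued or {}).items():
        args += ["--" + name, str(value)]
    args += ["--" + name for name in flags]
    # shlex.join quotes any argument containing shell-special characters.
    return shlex.join(args)


print(build_command({"min_mag": "7.0", "min_date": "2011-03-01"}, ["event_info"]))
```

Values containing wildcards, such as an identity code, come out single-quoted, so the shell does not expand them.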
In this type of request, obspyDMT will search for all the available events based on the options specified by the user, print the results and create an event catalogue.
The following lines show how to send an event-info request with obspyDMT and present some examples.
The general way to define an event-info request is:
$ obspyDMT --event_info --option-1 'value' --option-2
The --event_info flag forces the code to retrieve only the event information and create an event catalog. For details on option-1 and option-2, please refer to the Option types section.
Example 1: run with the default values:
$ obspyDMT --event_info
When the job starts, a folder will be created at the address specified for the --datapath flag (by default: obspyDMT-data in the current directory). To access the event information for this example, go to /path/specified/in/datapath/2013-01-27_2013-02-01_5.5_9.9/EVENT [the folder names will change based on your request] and check the EVENT-CATALOG text file (please refer to the Folder structure section for more information).
Example 2: adding flags to the above command changes the default values and adds or removes functionality. For example, the following command gets the information of all events with magnitude greater than Mw 7.0 that occurred after 2011-03-01 and before 2012-03-01:
$ obspyDMT --event_info --min_mag '7.0' --min_date '2011-03-01' --max_date '2012-03-01'
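According to the options table, --min_date and --max_date accept either Y-M-D or Y-M-D-H-M-S. A minimal sketch of reading both forms, where `parse_dmt_date` is a hypothetical helper and not obspyDMT's own parser:

```python
from datetime import datetime


def parse_dmt_date(text):
    """Parse 'Y-M-D' or 'Y-M-D-H-M-S' into a datetime (hypothetical helper)."""
    parts = [int(p) for p in text.split("-")]
    if len(parts) not in (3, 6):
        raise ValueError("expected Y-M-D or Y-M-D-H-M-S, got %r" % text)
    # Missing hour, minute and second default to zero in datetime().
    return datetime(*parts)


print(parse_dmt_date("2011-03-01"))
print(parse_dmt_date("2012-03-01-00-00-00"))
```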
In this type of request, the following steps will be done automatically:
- Search for all available events based on the options specified by the user.
- Check the availability of the requested stations for each event.
- Start to retrieve the waveforms and/or response files for each event and for all available stations. (default: waveforms, response files and metadata will be retrieved.)
- Apply instrument correction to all saved waveforms based on the specified options.
Retrieving and processing could be done in serial or in parallel.
The following lines show how to send an event-based request with obspyDMT and present short examples.
The general way to define an event-based request is:
$ obspyDMT --option-1 'value' --option-2
For details on option-1 and option-2, please refer to the Option types section.
Example 1: to test the code with the default values, run:
$ obspyDMT --test '20'
If you leave out the option --test '20', the default values could result in a huge number of requests. This option sets the code to send 20 requests to IRIS and ArcLink, which is suitable for testing.
When the job starts, a folder will be created at the address specified for the --datapath flag (by default: obspyDMT-data in the current directory). [refer to the Folder structure section]
Example 2: adding flags to the above command changes the default values and adds or removes functionality. For example, the following commands retrieve all the waveforms, response files and metadata of the BHZ channels in the TA network whose station names start with Z, for the great Tohoku-oki earthquake of magnitude Mw 9.0:
$ obspyDMT --min_mag '8.9' --min_date '2011-03-01' --identity 'TA.Z*.*.BHZ'
or, instead of using the identity option:
$ obspyDMT --min_mag '8.9' --min_date '2011-03-01' --net 'TA' --sta 'Z*' --cha 'BHZ'
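The two commands are equivalent because an identity code is simply net.sta.loc.cha joined by dots. The sketch below illustrates that correspondence with shell-style wildcard matching from the standard library; how obspyDMT and the data centres actually evaluate the wildcards is not shown here, and the station codes in the usage lines are made up.

```python
from fnmatch import fnmatchcase


def split_identity(identity):
    """Split 'net.sta.loc.cha' into its four codes."""
    net, sta, loc, cha = identity.split(".")
    return net, sta, loc, cha


def matches(identity, net, sta, loc, cha):
    """True if a channel's codes match the wildcard pattern in `identity`."""
    patterns = split_identity(identity)
    codes = (net, sta, loc, cha)
    return all(fnmatchcase(code, pat) for code, pat in zip(codes, patterns))


print(split_identity("TA.Z*.*.BHZ"))
print(matches("TA.Z*.*.BHZ", "TA", "Z41A", "", "BHZ"))  # made-up station code
```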
If you know which data provider holds the data you want, it is better to exclude the other one. For instance, since the TA network is served by IRIS, it makes more sense to exclude ArcLink in this example:
$ obspyDMT --min_mag '8.9' --min_date '2011-03-01' --identity 'TA.Z*.*.BHZ' --arc 'N'
Example 3: by default, obspyDMT saves the waveforms in SAC format. In this case, it fills in the station location (stla and stlo), station elevation (stel), station depth (stdp), event location (evla and evlo), event depth (evdp) and event magnitude (mag) in the SAC headers. If the desired format is MSEED instead (same event and station identity as in Example 2):
$ obspyDMT --min_mag '8.9' --min_date '2011-03-01' --identity 'TA.Z*.*.BHZ' --arc 'N' --mseed
Example 4: to download just the raw waveforms, without response files and instrument correction:
$ obspyDMT --min_mag '8.9' --min_date '2011-03-01' --identity 'TA.Z*.*.BHZ' --arc 'N' --mseed --response 'N' --ic_no
Example 5: the default values for the preset (the length of waveform, in seconds, kept before the origin time of the event) and the offset (the length of waveform, in seconds, kept after the origin time of the event) are 0 and 1800 seconds. You can change them by adding the following flags:
$ obspyDMT --preset time_before --offset time_after --option-1 value --option-2
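The effect of the two values can be sketched as a simple time-window computation. This is an illustration under the assumption that the window runs from (origin time - preset) to (origin time + offset); `request_window` is a hypothetical helper and the origin time is used purely as an example.

```python
from datetime import datetime, timedelta


def request_window(origin, preset=0.0, offset=1800.0):
    """Return (start, end) of the waveform window around an origin time."""
    start = origin - timedelta(seconds=preset)
    end = origin + timedelta(seconds=offset)
    return start, end


origin = datetime(2011, 3, 11, 5, 46, 24)  # illustrative origin time
start, end = request_window(origin, preset=60.0, offset=3600.0)
print(start.isoformat(), end.isoformat())
```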
In this type of request, the following steps will be done automatically:
- Get the time span from input and in case of large time spans, divide it into small intervals.
- Check the availability of the requested stations for each interval.
- Start to retrieve the waveforms and/or response files for each interval and for all the available stations. (default: waveforms, response files and metadata will be retrieved.)
- Apply instrument correction to all saved waveforms based on the specified options.
- Merge the retrieved waveforms of all time intervals to recover the original input time span and save the final product.
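The first step in this list, dividing a long time span into smaller intervals, can be sketched as follows. The default of 86400 seconds is taken from the --interval entry in the options table; `split_time_span` itself is a hypothetical helper, not obspyDMT's code.

```python
from datetime import datetime, timedelta


def split_time_span(start, end, interval=86400):
    """Split [start, end) into consecutive windows of at most `interval` seconds."""
    step = timedelta(seconds=interval)
    windows = []
    t = start
    while t < end:
        # The last window is shortened so that it never runs past `end`.
        windows.append((t, min(t + step, end)))
        t += step
    return windows


for w_start, w_end in split_time_span(datetime(2011, 1, 1), datetime(2011, 1, 3)):
    print(w_start, "->", w_end)
```

Merging is the inverse operation: the per-interval waveforms are joined again so that the final product covers the full requested span.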
The following lines show how to send a continuous request with obspyDMT and present short examples.
The general way to define a continuous request is:
$ obspyDMT --continuous --option-1 value --option-2
For details on option-1 and option-2, please refer to the Option types section.
Example 1: to test the code with the default values, run:
$ obspyDMT --continuous --test '20'
If you leave out the option --test '20', the default values could result in a huge number of requests. This option sets the code to send 20 requests to IRIS and ArcLink, which is suitable for testing.
When the job starts, a folder will be created at the address specified for the --datapath flag (by default: obspyDMT-data in the current directory). [refer to the Folder structure section]
Example 2: adding flags to the above command changes the default values and adds or removes functionality. For example, the following commands retrieve all the waveforms, response files and metadata of the BHZ channels in the TA network whose station names start with Z, for the specified time span:
$ obspyDMT --continuous --identity 'TA.Z*.*.BHZ' --min_date '2011-01-01' --max_date '2011-01-03'
or, instead of using the identity option:
$ obspyDMT --continuous --net 'TA' --sta 'Z*' --cha 'BHZ' --min_date '2011-01-01' --max_date '2011-01-03'
If you know which data provider holds the data you want, it is better to exclude the other one. For instance, since the TA network is served by IRIS, it makes more sense to exclude ArcLink in this example:
$ obspyDMT --continuous --identity 'TA.Z*.*.BHZ' --min_date '2011-01-01' --max_date '2011-01-03' --arc 'N'
Example 3: by default, obspyDMT saves the waveforms in SAC format. In this case, it fills in the station location (stla and stlo), station elevation (stel), station depth (stdp), event location (evla and evlo), event depth (evdp) and event magnitude (mag) in the SAC headers. If the desired format is MSEED instead (same time span and station identity as in Example 2):
$ obspyDMT --continuous --identity 'TA.Z*.*.BHZ' --min_date '2011-01-01' --max_date '2011-01-03' --arc 'N' --mseed
Example 4: to download just the raw waveforms, without response files and instrument correction:
$ obspyDMT --continuous --identity 'TA.Z*.*.BHZ' --min_date '2011-01-01' --max_date '2011-01-03' --arc 'N' --mseed --response 'N' --ic_no
If you are interested in events that occurred in a specific geographical region and/or in retrieving data from stations within a circular or rectangular bounding area, you are in the right section! Here, we have two examples:
Example 1: to extract the information of all events that occurred in 2010 in a rectangular area (lon1=44.38E lon2=63.41E lat1=24.21N lat2=40.01N) with magnitude greater than 3.0 and a maximum depth of 80 km (395 events should be found!):
$ obspyDMT --event_info --min_mag '3.0' --max_depth '-80.0' --min_date '2010-01-01' --max_date '2011-01-01' --event_rect '44.38/63.41/24.21/40.01'
Example 2: to get all the waveforms, response files and metadata of BHZ channels available in a specified rectangular bounding area (lon1=125.0W lon2=70.0W lat1=25N lat2=45N) for the great Tohoku-oki earthquake of magnitude Mw 9.0, the command line will be:
$ obspyDMT --min_mag '8.9' --min_date '2011-03-01' --cha 'BHZ' --station_rect '-125.0/-70.0/25.0/45.0'
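Both rectangle options use the same `<lonmin>/<lonmax>/<latmin>/<latmax>` syntax. A minimal sketch of how such a string can be read and tested against a coordinate is given below; `parse_rect` and `in_rect` are hypothetical helpers, the test point is arbitrary, and rectangles crossing the date line are deliberately not handled.

```python
def parse_rect(text):
    """Parse '<lonmin>/<lonmax>/<latmin>/<latmax>' into four floats."""
    lonmin, lonmax, latmin, latmax = (float(v) for v in text.split("/"))
    return lonmin, lonmax, latmin, latmax


def in_rect(lon, lat, rect):
    """True if (lon, lat) lies inside the rectangle (no date-line wrapping)."""
    lonmin, lonmax, latmin, latmax = rect
    return lonmin <= lon <= lonmax and latmin <= lat <= latmax


rect = parse_rect("44.38/63.41/24.21/40.01")
print(rect, in_rect(51.4, 35.7, rect))  # arbitrary point inside the box
```

Note that longitudes west of Greenwich are negative, as in '-125.0/-70.0/25.0/45.0'.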
When obspyDMT retrieves waveforms and their response files, by default it applies the instrument correction to the waveforms with displacement as the correction unit. To change the correction unit to velocity or acceleration:
$ obspyDMT --corr_unit 'VEL' --option-1 'value' --option-2
$ obspyDMT --corr_unit 'ACC' --option-1 'value' --option-2
where option-1 and option-2 are the flags defined by the user (see Option types section).
Please note that all the commands presented in this section can be applied to continuous requests as well, with slight changes (refer to the continuous request section).
Before the instrument correction, a bandpass filter is applied to the data with default values (0.008, 0.012, 3.0, 4.0). To apply a different bandpass filter:
$ obspyDMT --pre_filt '(f1,f2,f3,f4)' --option-1 value --option-2
where (f1,f2,f3,f4) are the four corner frequencies of a cosine taper that is one between f2 and f3 and tapers to zero for f1 < f < f2 and f3 < f < f4.
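The shape of such a taper can be sketched in a few lines. The function below is one common form of a four-corner cosine taper, written from the description above; it is an illustration and not necessarily the exact expression used internally by ObsPy or obspyDMT.

```python
import math


def cosine_taper(f, f1, f2, f3, f4):
    """Taper weight at frequency f for corner frequencies f1 < f2 <= f3 < f4."""
    if f <= f1 or f >= f4:
        return 0.0
    if f < f2:
        # Rising flank: 0 at f1, 1 at f2.
        return 0.5 * (1.0 - math.cos(math.pi * (f - f1) / (f2 - f1)))
    if f <= f3:
        # Flat pass band.
        return 1.0
    # Falling flank: 1 at f3, 0 at f4.
    return 0.5 * (1.0 + math.cos(math.pi * (f - f3) / (f4 - f3)))


default = (0.008, 0.012, 3.0, 4.0)
for freq in (0.005, 0.010, 1.0, 3.5, 5.0):
    print(freq, round(cosine_taper(freq, *default), 3))
```

With the default corners, frequencies between 0.012 Hz and 3 Hz pass unchanged, while everything below 0.008 Hz and above 4 Hz is removed before deconvolution.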
If you do not need the pre-filter:
$ obspyDMT --pre_filt 'None' --option-1 value --option-2
To apply instrument correction to an existing folder:
$ obspyDMT --ic_all 'address' --corr_unit unit
Here, address is the path where your uncorrected waveforms are stored and, as mentioned above, unit is the unit to which you want to correct the waveforms: DIS (default), VEL or ACC.
To make this clearer, let's look at an example with the following steps:
Step 1: to get all the waveforms, response files and metadata of the BHZ channels in the TA network whose station names start with Z, for the great Tohoku-oki earthquake of magnitude Mw 9.0, type:
$ obspyDMT --min_mag '8.9' --min_date '2011-03-01' --identity 'TA.Z*.*.BHZ' --arc 'N'
Step 2: to correct the raw waveforms for velocity:
$ obspyDMT --ic_all '/path/specified/in/datapath' --corr_unit 'VEL'
Finally, you can disable the instrument correction functionality with:
$ obspyDMT --ic_no --option-1 value --option-2
If you want to continue an interrupted request or complete your existing archive, you can use the updating options. The general ways to update an existing folder (located in address) for IRIS stations, ArcLink stations or both are:
$ obspyDMT --iris_update 'address' --option-1 value --option-2
$ obspyDMT --arc_update 'address' --option-1 value --option-2
$ obspyDMT --update_all 'address' --option-1 value --option-2
Please note that all the commands presented in this section can be applied to continuous requests as well, with slight changes (refer to the continuous request section).
Example 1: first, let's retrieve all the waveforms, response files and metadata of the BHZ channels in the TA network whose station names start with Z, for the great Tohoku-oki earthquake of magnitude Mw 9.0:
$ obspyDMT --min_mag '8.9' --min_date '2011-03-01' --identity 'TA.Z*.*.BHZ' --arc 'N'
Now, we update the saved folder for the BHE channels:
$ obspyDMT --update_all './obspyDMT-data' --identity 'TA.Z*.*.BHE'
For an existing folder, you can plot all the events and/or all the stations, the ray paths for event-station pairs and epicentral-distance/time for the waveforms.
The general syntax for plotting tools is:
$ obspyDMT --plot_option 'address'
where --plot_option can be --plot_ev for events, --plot_sta for stations, --plot_se for stations and events, --plot_ray for the ray path between each event-station pair and --plot_epi for epicentral-distance/time.
All the examples shown in this section are based on the folder created by the following request:
$ obspyDMT --min_mag '8.9' --min_date '2011-03-01' --identity 'TA.Z*.*.BHZ' --arc 'N'
Example 1: let's plot both stations and events available in the folder:
$ obspyDMT --plot_se './obspyDMT-data'
The default format is png; to save the figures as pdf instead:
$ obspyDMT --plot_se './obspyDMT-data' --plot_format 'pdf'
Example 2: in this example, we plot the ray paths for event-station pairs and save the result in $HOME/Desktop (double quotes are used so that the shell expands $HOME):
$ obspyDMT --plot_ray './obspyDMT-data' --plot_format 'pdf' --plot_save "$HOME/Desktop"
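The epicentral-distance plots order the waveforms by the angular distance between event and station, which the options table bounds between 0 and 180 (--min_epi and --max_epi). A minimal sketch of that quantity, assuming a spherical Earth and the haversine formula, is shown below; obspyDMT itself may compute the distance differently (for example on an ellipsoid), and `epicentral_distance_deg` is a hypothetical helper.

```python
import math


def epicentral_distance_deg(evla, evlo, stla, stlo):
    """Great-circle distance in degrees between event and station (spherical Earth)."""
    phi1, phi2 = math.radians(evla), math.radians(stla)
    dphi = phi2 - phi1
    dlam = math.radians(stlo - evlo)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    a = min(1.0, max(0.0, a))  # guard against rounding outside [0, 1]
    return math.degrees(2 * math.atan2(math.sqrt(a), math.sqrt(1 - a)))


# Two points on the equator, a quarter of the globe apart.
print(round(epicentral_distance_deg(0.0, 0.0, 0.0, 90.0), 6))
```

The argument names follow the SAC header fields (evla, evlo, stla, stlo) mentioned earlier in this tutorial.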
obspyDMT organizes the retrieved and processed data in a homogeneous way. When you run the code, you can specify a directory in which all the data will be organized:
$ obspyDMT --datapath '/path/to/my/desired/address'
obspyDMT will create the folder (/path/to/my/desired/address) and then create folders and files during retrieving and processing, as shown in the figure:
All the options currently available in obspyDMT could be seen by:
$ obspyDMT --help
The options specified by --option=OPTION are type-1 (with value) and --option are type-2 (without value).
Here, you could also find some of the options available in obspyDMT with a short description. Options marked by (*) or (**) are:
(*): option-1 (with value)
(**): option-2 (without value)
Please refer to Option types section for more info about type 1 and type 2
options | description | options | description | |
---|---|---|---|---|
--help | show all the available flags with a short description for each and exit (**) | --test | test the program for the desired number of requests, eg: --test 10 will test the program for 10 requests. [Default: N] (*) | |
--version | show the obspyDMT version and exit (**) | --iris_update | update the specified folder for IRIS, syntax: --iris_update address_of_the_target_folder. [Default: N] (*) | |
--check | check all the dependencies and their installed versions on the local machine and exit (**) | --arc_update | update the specified folder for ArcLink, syntax: --arc_update address_of_the_target_folder. [Default: N] (*) | |
--type | type of the input (command or file) to be read by obspyDMT. Please note that for --type 'file' an external file (INPUT.cfg) should exist in the same directory as obspyDMT.py [Default: command] (*) | --update_all | update the specified folder for both IRIS and ArcLink, syntax: --update_all address_of_the_target_folder. [Default: N] (*) | |
--reset | if the datapath is found, delete it before running obspyDMT. (**) | --iris_ic | apply instrument correction to the specified folder for the downloaded waveforms from IRIS, syntax: --iris_ic address_of_the_target_folder. [Default: N] (*) | |
--datapath | the path where obspyDMT will store the data [Default: ./obspyDMT-data] (*) | --arc_ic | apply instrument correction to the specified folder for the downloaded waveforms from ArcLink, syntax: --arc_ic address_of_the_target_folder. [Default: N] (*) | |
--min_date | start time, syntax: Y-M-D-H-M-S (eg: 2010-01-01-00-00-00) or just Y-M-D [Default: 10 days ago] (*) | --iris_ic_auto | apply instrument correction automatically after downloading the waveforms from IRIS. [Default: Y] (*) | |
--max_date | end time, syntax: Y-M-D-H-M-S (eg: 2011-01-01-00-00-00) or just Y-M-D [Default: 5 days ago] (*) | --arc_ic_auto | apply instrument correction automatically after downloading the waveforms from ArcLink. [Default: Y] (*) | |
--min_mag | minimum magnitude. [Default: 5.5] (*) | --ic_all | apply instrument correction to the specified folder for all the waveforms (IRIS and ArcLink), syntax: --ic_all address_of_the_target_folder. [Default: N] (*) | |
--max_mag | maximum magnitude. [Default: 9.9] (*) | --ic_no | do not apply instrument correction automatically. This is equivalent to: --iris_ic_auto N --arc_ic_auto N (**) | |
--min_depth | minimum depth. [Default: +10.0 (above the surface!)] (*) | --pre_filt | apply a bandpass filter to the data trace before deconvolution (None if you do not need pre_filter), syntax: (f1,f2,f3,f4), the four corner frequencies of a cosine taper that is one between f2 and f3 and tapers to zero for f1 < f < f2 and f3 < f < f4. [Default: (0.008, 0.012, 3.0, 4.0)] (*) | |
--max_depth | maximum depth. [Default: -6000.0] (*) | --corr_unit | correct the raw waveforms for DIS (m), VEL (m/s) or ACC (m/s^2). [Default: DIS] (*) | |
--event_rect | search for all the events within the defined rectangle, GMT syntax: <lonmin>/<lonmax>/<latmin>/<latmax> [Default: -180.0/+180.0/-90.0/+90.0] (*) | --zip_w | compress the raw-waveform files after applying instrument correction. (**) | |
--max_result | maximum number of events to be requested. [Default: 2500] (*) | --zip_r | compress the response files after applying instrument correction. (**) | |
--get_events | event-based request (please refer to the tutorial). [Default: Y] (*) | --iris_merge | merge the IRIS waveforms in the specified folder, syntax: --iris_merge address_of_the_target_folder. [Default: N] (*) | |
--continuous | continuous request (please refer to the tutorial). (**) | --arc_merge | merge the ArcLink waveforms in the specified folder, syntax: --arc_merge address_of_the_target_folder. [Default: N] (*) | |
--interval | time interval for dividing the continuous request. [Default: 86400 sec (1 day)] (*) | --iris_merge_auto | merge automatically after downloading the waveforms from IRIS. [Default: Y] (*) | |
--iris_bulk | using the IRIS bulkdataselect Web service. Since this method returns multiple channels of time series data for specified time ranges in one request, it speeds up the waveform retrieving approximately by a factor of two. [RECOMMENDED] (**) | --arc_merge_auto | merge automatically after downloading the waveforms from ArcLink. [Default: Y] (*) | |
--waveform | retrieve the waveform. [Default: Y] (*) | --merge_all | merge all waveforms (IRIS and ArcLink) in the specified folder, syntax: --merge_all address_of_the_target_folder. [Default: N] (*) | |
--response | retrieve the response file. [Default: Y] (*) | --merge_no | do not merge automatically. This is equivalent to: --iris_merge_auto N --arc_merge_auto N (**) | |
--iris | send request (waveform/response) to IRIS. [Default: Y] (*) | --merge_type | merge raw or corrected waveforms. [Default: raw] (*) | |
--arc | send request (waveform/response) to ArcLink. [Default: Y] (*) | --plot_iris | plot waveforms downloaded from IRIS. (*) | |
--SAC | SAC format for saving the waveforms. Station location (stla and stlo), station elevation (stel), station depth (stdp), event location (evla and evlo), event depth (evdp) and event magnitude (mag) will be stored in the SAC headers. [Default: MSEED] (**) | --plot_arc | plot waveforms downloaded from ArcLink. (*) | |
--time_iris | generate a data-time file for an IRIS request. This file shows the required time for each request and the stored data in the folder. (**) | --plot_all | plot all waveforms (IRIS and ArcLink). [Default: Y] (*) | |
--time_arc | generate a data-time file for an ArcLink request. This file shows the required time for each request and the stored data in the folder. (**) | --plot_type | plot raw or corrected waveforms. [Default: raw] (*) | |
--preset | time parameter in seconds which determines how close the time series data (waveform) will be cropped before the origin time of the event. [Default: 0.0 seconds.] (*) | --plot_ev | plot all the events found in the specified folder, syntax: --plot_ev address_of_the_target_folder. [Default: N] (*) | |
--offset | time parameter in seconds which determines how close the time series data (waveform) will be cropped after the origin time of the event. [Default: 1800.0 seconds.] (*) | --plot_sta | plot all the stations found in the specified folder, syntax: --plot_sta address_of_the_target_folder. [Default: N] (*) | |
--identity | identity code restriction, syntax: net.sta.loc.cha (eg: TA.*.*.BHZ to search for all BHZ channels in TA network). [Default: *.*.*.*] (*) | --plot_se | plot both all the stations and all the events found in the specified folder, syntax: --plot_se address_of_the_target_folder. [Default: N] (*) | |
--net | network code. [Default: '*'] (*) | --plot_ray | plot the ray coverage for all the station-event pairs found in the specified folder, syntax: --plot_ray address_of_the_target_folder. [Default: N] (*) | |
--sta | station code. [Default: '*'] (*) | --plot_epi | plot epicentral distance-time for all the waveforms found in the specified folder, syntax: --plot_epi address_of_the_target_folder. [Default: N] (*) | |
--loc | location code. [Default: '*'] (*) | --min_epi | plot epicentral distance-time (refer to --plot_epi) for all the waveforms with epicentral-distance >= min_epi. [Default: 0.0] (*) | |
--cha | channel code. [Default: '*'] (*) | --max_epi | plot epicentral distance-time (refer to --plot_epi) for all the waveforms with epicentral-distance <= max_epi. [Default: 180.0] (*) | |
--station_rect | search for all the stations within the defined rectangle, GMT syntax: <lonmin>/<lonmax>/<latmin>/<latmax>. May not be used together with circular bounding box station restrictions (station_circle) [Default: -180.0/+180.0/-90.0/+90.0] (*) | --plot_save | the path where obspyDMT will store the plots [Default: '.' (the same directory as obspyDMT.py)] (*) | |
--station_circle | search for all the stations within the defined circle, syntax: <lon>/<lat>/<rmin>/<rmax>. May not be used together with rectangular bounding box station restrictions (station_rect). (*) | --plot_format | format of the plots saved on the local machine [Default: png] (*) | |
--email | send an email to the specified email-address after completing the job, syntax: --email email_address. [Default: N] (*) | | | |