Reading a large number of external source files in SPECFEM3D Cartesian
padesh opened this issue · 10 comments
I am working on seismic interferometry modeling at an oil and gas reservoir scale for an offshore setup. The ambient field is due to gravity wave action at the seafloor. Thus, my ambient sources are distributed at the seabed. I cannot use the built-in interferometry modeling setup of Specfem3d cartesian due to:
1. With the built-in setup, you can only have noise sources at the free surface and not at any other depth (as in my case, where I need them at the seabed).
2. It is not implemented for acoustic yet, and in my case, I have an acoustic-elastic coupled setup.
I have created a custom workflow to achieve what I want to do following the Tromp 2010 paper:
1. Inject a point force source at the virtual source location (main receiver).
2. Record the wavefield at noise locations (place receivers at the seabed in my case).
3. Time reverse the recorded wavefield and inject it back.
4. Record it at the other stations to get a cross-correlation field.
For step 2, there are receivers at all the grid points on the seafloor (around 20,000), and these become sources in step 3. So I write them out as external source files and inject them using the external STF flag of the Par_file.
The problem I am facing is that reading this many external source files takes a long time (around 15-20 minutes, on both GPUs and CPUs), while the rest of the tasks (mesher, database generation, solver) take about 4 minutes.
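For reference, step 3 of the workflow (time reversal of one recorded trace) can be sketched in a few lines of Python. This is only an illustration: the function name `time_reverse_and_pad` is hypothetical, and padding with zeros at the end of the reversed trace is an assumption, not something SPECFEM3D prescribes.

```python
def time_reverse_and_pad(trace, nstep):
    """Reverse a recorded trace in time and zero-pad it to nstep samples.

    Assumption: padding is appended after the reversed samples; adapt this
    if your setup needs the zeros at the start instead.
    """
    if len(trace) > nstep:
        raise ValueError("trace has more samples than NSTEP")
    reversed_trace = list(reversed(trace))
    # fill the remaining samples with zeros so the length matches NSTEP
    return reversed_trace + [0.0] * (nstep - len(reversed_trace))
```

Each seabed receiver trace from step 2 would go through a function like this before being written out as an external source time function.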
Is there any way to speed up the file reading? Any suggestions would be greatly appreciated.
You mean the specfem3d code?
:) yes, good suggestion... however, in this case I just added such a quick binary read to test it out for your workflow.
what you need:
- checkout the latest devel branch version which includes PR #1675
- create your source time function files in binary format.
either you already know how to do it or you can look at a simple python script in the package:
EXAMPLES/applications/meshfem3D_examples/regular_element_mesh/create_external_source_time_function.py
which would give you the pythonic way of writing a binary file for a Fortran routine to read.
- double-check that your binary files have the correct length. no checking is done in the Fortran code, it just reads in a chunk of single-precision data, assuming that it has the correct NSTEP length.
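A rough sketch of such a length check in Python, using only the standard library. It assumes the file is a plain sequence of 4-byte single-precision values with no header or Fortran record markers. That layout is an assumption here, and the script in the package remains the reference for what the solver actually expects. The helper names are hypothetical.

```python
import os
from array import array

BYTES_PER_SAMPLE = 4  # single precision


def write_raw_float32(path, samples):
    """Write samples as raw single-precision values (assumed layout, native byte order)."""
    with open(path, "wb") as f:
        array("f", samples).tofile(f)


def samples_in_file(path):
    """Number of single-precision values in the file, derived from its size."""
    size = os.path.getsize(path)
    if size % BYTES_PER_SAMPLE != 0:
        raise ValueError(f"{path}: size {size} is not a multiple of {BYTES_PER_SAMPLE} bytes")
    return size // BYTES_PER_SAMPLE


def has_expected_length(path, nstep):
    """True if the file holds exactly nstep single-precision samples."""
    return samples_in_file(path) == nstep
```

Running `has_expected_length` over all source files before starting the solver catches truncated or over-long files early, since the Fortran side will not report them.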
in general, this binary read should be an order-of-magnitude faster, however the whole source location routine spends usually more time on locating than reading in the sources. so, I doubt it will become much faster. also, reading in the sources in parallel could in principle be done, but it might also lead to some file server bottlenecks when accessing too many files at the same time - so not sure how much faster that would become.
probably the best would be to implement the noise sources at arbitrary positions and have the whole noise wavefield stored as is done for the default case. that would be a bit trickier to implement though... but well, if you have time...
@danielpeter Thanks for adding a binary read. I will test it out. I agree that implementing noise sources at arbitrary positions would be the best, as one would not have to write and read files. I think, just as we write seismograms at locations from the STATIONS file, we can do the same for a NOISE locations file. Instead of writing them out, we can keep them in memory, invert and pad them, inject them back as a source, and then write out the results at STATIONS. That would be the only change we would need to make in the code, I think. Does the code need to be in fortran if I work on it?
yep, SPECFEM3D is mostly Fortran2003 standard. let me know how it goes...
Hi @danielpeter , the binary read is not working. I even used the write_binary_files function from EXAMPLES/applications/meshfem3D_examples/regular_element_mesh/create_external_source_time_function.py
The solver gives error:
Problem when reading external source time file:
number of time steps in the simulation = 14999
number of time steps read from the source time function = 151
Please make sure that the number of time steps in the external source file read is greater or equal to the number of time steps in the simulation
I even loaded the binary in python to double check if it was right and it was.
this happens when it tries to read in a binary STF file with the ascii routine. it means that your binary file doesn't end with: .bin
please check the names of your external source time function file(s), and make sure the endings for the binary filenames are appropriate, i.e., something like <my_stf_filename>.bin
@danielpeter
No, they are in .bin format.
Problem when reading external source time file: DATA/ormsby_12k_4ms_0.5Hz.bin
number of time steps in the simulation = 10000
number of time steps read from the source time function = 176
Please make sure that the number of time steps in the external source file read is greater or equal to the number of time steps in the simulation
and I am creating this file using the same function from the file
EXAMPLES/applications/meshfem3D_examples/regular_element_mesh/create_external_source_time_function.py
strange, i can't reproduce that. what compiler and flags did you use?
you could check also to change the name a bit, getting rid of the '.' before the ending, i.e., something like 'ormsby_12k_4ms_0_5Hz.bin' or for checking rename it to 'stf.bin' and make sure there are no additional characters after this name string. however, i'm just guessing at this point...
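To rule out naming problems across thousands of files, a small check like the following could be run on the list of source time function names. It is a sketch only: `suspicious_stf_names` is a hypothetical helper, and it tests nothing more than the plain string ending and surrounding whitespace. How the solver itself matches the ending is not shown in this thread.

```python
def suspicious_stf_names(names, suffix=".bin"):
    """Return the names that do not end exactly in `suffix`.

    A name with leading or trailing whitespace is flagged too, since extra
    characters after the name string are one of the guesses above.
    """
    flagged = []
    for name in names:
        if name != name.strip() or not name.endswith(suffix):
            flagged.append(name)
    return flagged
```

For example, `suspicious_stf_names(["stf.bin", "stf.bin ", "stf.txt"])` flags the last two entries and leaves `stf.bin` alone.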
Binary works. 20-30% reduction in read time for a large number of source files (~25000 with 15k time steps) compared to ASCII.