Request for advice on the SPECFEM3D Newmark time integration algorithm
Takunmi opened this issue · 10 comments
Hello everyone,
I have been using SPECFEM3D for ground motion simulations for some time and now plan to delve deeper into the code to study the algorithms. However, I am struggling to understand the Newmark time integration algorithm employed in SPECFEM3D. My specific concerns are detailed in the attached Word file.
Any comments and suggestions would be greatly appreciated. Thank you very much.
Problem description.docx
😃 thanks for the detailed problem description!
although, i wouldn't call it a "problem", but more a time scheme "analysis". what your description correctly shows is that the Newmark scheme for a particular choice of beta/gamma parameters (i.e., beta=0, gamma=1/2) becomes equivalent to the leapfrog scheme. the formulations are identical, and both explicit time schemes have the same conditional stability and second-order accuracy.
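for reference, writing out the standard Newmark update (in the notation of Hughes' book, with d = displacement, v = velocity, a = acceleration):

```math
d_{n+1} = d_n + \Delta t\, v_n + \tfrac{\Delta t^2}{2}\big[(1-2\beta)\,a_n + 2\beta\, a_{n+1}\big], \qquad
v_{n+1} = v_n + \Delta t\big[(1-\gamma)\,a_n + \gamma\, a_{n+1}\big]
```

with beta=0 and gamma=1/2 this reduces to

```math
d_{n+1} = d_n + \Delta t\, v_n + \tfrac{\Delta t^2}{2}\,a_n, \qquad
v_{n+1} = v_n + \tfrac{\Delta t}{2}\big(a_n + a_{n+1}\big)
```

which is exactly the synchronized (velocity-Verlet) form of the leapfrog scheme.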
why say SPECFEM uses the Newmark scheme rather than a leapfrog? Dimitri introduced the Newmark scheme together with the spectral-element method. as you showed in your description, Dimitri was inspired by Hughes' book "The Finite Element Method", which highlights the Newmark scheme in Chapter 9 for hyperbolic PDEs.
the Newmark scheme belongs to a predictor-corrector time scheme family, and is widely used in mechanics. the original paper from Newmark, "A method of computation for structural dynamics" from 1959, was published in the Journal of the Engineering Mechanics Division. unfortunately, in that paper you won't find any reference to the leapfrog scheme, although i guess it must have been known already at that time. the Newmark scheme is in principle more general than the leapfrog scheme, as it allows for different beta/gamma parameters.
for SPECFEM, i guess one of the main important points was to have displacement, velocity and acceleration all available at the same time step (n) for outputting, since in seismology it is quite common to look not only at displacement, but also at velocity and acceleration seismograms.
the leapfrog scheme in its default implementation stores velocity at an intermediate (n+1/2) time step, with displacement and velocity "leapfrogging" over each other. this wouldn't be convenient as output. however, it is also possible to re-formulate the leapfrog scheme to have velocity at integer time steps (n). that could be used as well, and its implementation would probably be similar in memory usage to the Newmark scheme - see the sketch below.
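as a toy 1D sketch of the two formulations (a harmonic oscillator, not the SPECFEM code):

```python
import numpy as np

def accel(d):
    # toy restoring force a = -omega^2 * d (harmonic oscillator)
    return -(2.0 * np.pi) ** 2 * d

def leapfrog_staggered(d, v_half, dt):
    # velocity lives at half steps: v_(n+1/2) = v_(n-1/2) + dt * a_n
    v_half = v_half + dt * accel(d)
    d = d + dt * v_half            # d_(n+1) = d_n + dt * v_(n+1/2)
    return d, v_half

def leapfrog_synchronized(d, v, a, dt):
    # same scheme rewritten with velocity at integer steps (velocity Verlet),
    # i.e. the explicit Newmark form with beta=0, gamma=1/2
    d = d + dt * v + 0.5 * dt**2 * a
    a_new = accel(d)
    v = v + 0.5 * dt * (a + a_new)
    return d, v, a_new
```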
also, you point out a formulation of Dias & Grote that is also based on a leapfrog scheme. in that formulation, velocity is completely replaced and the scheme only uses displacement (and acceleration). they assume that velocity is often ignored as output in simulations, so replacing velocity won't hurt there and makes it more convenient for the subsequent LTS analysis.
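in its generic two-step form (for a semi-discrete system M a = f - K d), such a displacement-only leapfrog reads:

```math
d_{n+1} = 2\,d_n - d_{n-1} + \Delta t^2\, M^{-1}\big(f_n - K\, d_n\big)
```

so velocity never appears in the recursion and can be reconstructed afterwards if needed.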
the choice whether to use this explicit Newmark or an equivalent leapfrog scheme is also a technical implementation question: how accurate and memory efficient is it, i.e., how many arrays do you need for the scheme, and at what time steps do these arrays need to be accessible. the Newmark scheme seems a good choice for seismology simulations.
regarding the current SPECFEM Newmark implementation, another question would be how the absorbing boundaries would perform if, instead of using velocity at the intermediate step (n+1/2) for the boundary term, they used velocity at (n) or (n+1). there are slight differences in the output when I change this, and a more detailed analysis would be needed to see which implementation would actually be the most accurate.
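for context, the absorbing boundary term in question is a velocity-dependent traction of Stacey/Clayton-Engquist type, schematically

```math
t = \rho\, c_p\, (v \cdot \hat{n})\,\hat{n} + \rho\, c_s \big(v - (v \cdot \hat{n})\,\hat{n}\big)
```

hence the question is at which time level the velocity v should be evaluated when this traction enters the acceleration update.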
My recollection from the literature on time integration schemes is that the formal equivalence between leapfrog and Newmark with beta=0 and gamma=1/2 is well established, though I don't have a reference off the top of my head.
Another advantage of Newmark, in addition to those mentioned by Daniel, is that initial conditions at time t=0 can be naturally applied to both displacements and velocities. In contrast, in leapfrog one has to pre-compute the velocity value at the first half-timestep (t=dt/2) based on the initial conditions d(0) and v(0). If this computation is not done carefully, one may introduce an error that is not second order.
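To make this concrete: a consistent second-order start is obtained from a Taylor expansion,

```math
v_{1/2} = v(0) + \tfrac{\Delta t}{2}\, a(0)
```

which approximates v(dt/2) to O(dt^2), whereas the naive choice v_{1/2} = v(0) introduces an O(dt) startup error that degrades the global order of the scheme.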
hi, @danielpeter
Thank you very much for your detailed response and for sharing your insights on the future prospects of the absorbing boundaries module in SPECFEM. Your explanation about the equivalence of the Newmark and leapfrog schemes, as well as the historical context, was particularly enlightening.😃
I would like to seek your advice on how to improve the computational efficiency of the SPECFEM program by modifying the time integration algorithms. Specifically, are there any strategies or techniques you would recommend to enhance efficiency while maintaining second-order accuracy? It seems that the explicit Newmark algorithm, with its current implementation, already achieves optimal computational efficiency. Your guidance on this matter would be greatly appreciated.
from my point of view, there is not much to suggest for improving second-order time schemes. the computational time in the spectral-element method is spent mostly in the evaluation of the stiffness term. the time scheme updates make only a very minor contribution to the overall computational cost of the method. and given that the formulations and stability conditions of leapfrog and Newmark are the same, I don't see much benefit in changing this.
I haven't given much thought to time schemes recently, but here are two suggestions for time schemes to look at:
- are there ways to circumvent the global stability condition? for that purpose we developed and implemented a local-time stepping (LTS) scheme for the Newmark time scheme in this SPECFEM3D_Cartesian version. this helps for meshes with local refinements on a small portion of the whole mesh (see the sketch below). given the current AI trend, maybe you could also try to find ways to use machine learning techniques to adapt the time step locally, which would make the time-to-solution faster for such examples. at least, this seems to be possible for implicit time schemes. thus, it would be interesting to evaluate this.
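to illustrate the LTS idea only (this is not the actual SPECFEM implementation): elements are binned into levels by their local stable time step, and each level is advanced with correspondingly smaller substeps:

```python
import numpy as np

def lts_levels(dt_local, dt_global):
    """Bin elements into local-time-stepping levels.

    dt_local: array of per-element stable time steps (from the local CFL limit)
    dt_global: the coarse time step used on the bulk of the mesh
    returns p such that dt_global / 2**p <= dt_local for each element
    """
    ratio = dt_global / np.asarray(dt_local)
    p = np.ceil(np.log2(np.maximum(ratio, 1.0))).astype(int)
    # an element at level p is advanced with 2**p substeps of size dt_global / 2**p
    return p
```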
- are there more efficient time schemes with higher accuracy than second order, to avoid numerical dispersion in very long simulations? the Cartesian version has a fourth-order Runge-Kutta scheme (LDDRK) for that purpose. however, there are also other fourth-order time schemes, in particular the symplectic Position-Extended Forest-Ruth-Like (PEFRL) scheme, that seem to combine higher accuracy with better computational efficiency (PEFRL only needs 4 stages per time step instead of 6 for the LDDRK; see the sketch below). adding the full implementation of PEFRL was something I still wanted to do for the Cartesian version, but it has been put on hold for a very long time. you might be able to finish this implementation.
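for reference, one PEFRL step looks as follows (coefficients from Omelyan, Mryglod & Folk, 2002; shown here as a toy 1D sketch, not the SPECFEM implementation):

```python
# PEFRL coefficients (Omelyan, Mryglod & Folk, Comput. Phys. Commun., 2002)
XI  =  0.1786178958448091
LAM = -0.2123418310626054
CHI = -0.06626458266981849

def pefrl_step(d, v, dt, accel):
    # one Position-Extended Forest-Ruth-Like step: 4 acceleration evaluations
    d = d + XI * dt * v
    v = v + (1.0 - 2.0 * LAM) * 0.5 * dt * accel(d)
    d = d + CHI * dt * v
    v = v + LAM * dt * accel(d)
    d = d + (1.0 - 2.0 * (CHI + XI)) * dt * v
    v = v + LAM * dt * accel(d)
    d = d + CHI * dt * v
    v = v + (1.0 - 2.0 * LAM) * 0.5 * dt * accel(d)
    d = d + XI * dt * v
    return d, v
```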
regarding the SPECFEM3D_Cartesian development roadmap, here is a list of current implementation topics:
hi @danielpeter
Thank you for your previous response regarding the time integration schemes on the SPECFEM3D_Cartesian development roadmap. Your insights were incredibly helpful, and after studying your explanation, I researched the PEFRL scheme and am preparing to implement it by adapting the existing LDDRK code.
Now, I have a new question that I’d like to discuss with you, this time concerning the comparison of computational accuracy and efficiency between second-order and fourth-order schemes.
I compared the computational accuracy and efficiency of the LDDRK scheme (fourth-order) and the Newmark scheme (second-order) in the context of earthquake engineering. The model consists of approximately 70,000 grid points, using a mesh that transitions from coarse to fine, with a point source for the earthquake event. When using SPECFEM3D_Cartesian, and changing only between Newmark mode and LDDRK mode, I observed the following:
In terms of efficiency: the Newmark scheme took about 30 minutes to run, while the LDDRK scheme took around 160 minutes. The time difference is substantial, and this gap would likely widen if the LTS mode were activated under Newmark.
In terms of accuracy: as shown in the figure, when reading the acceleration time history at a surface station, the results of both schemes are nearly identical. To quantify the difference, I calculated the relative error by subtracting the acceleration values at the same time points and dividing by the peak acceleration. The maximum error was only 0.6%.
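For clarity, this is the error measure described above, sketched in Python (the trace names are hypothetical; both traces are assumed to share the same time sampling):

```python
import numpy as np

def peak_normalized_error(a_ref, a_test):
    # pointwise difference normalized by the peak amplitude of the reference trace
    return np.max(np.abs(a_test - a_ref)) / np.max(np.abs(a_ref))
```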
Although no analytical solution was used for comparison, this comparison between the two schemes shows minimal difference in computational results within the context of earthquake engineering, whereas the runtime difference is significant.
You mentioned that fourth-order schemes have advantages in avoiding numerical dispersion in very long simulations compared to second-order ones. However, in the context of earthquake engineering problems, these advantages are not apparent. Therefore, is the only way to improve the computational efficiency of SPECFEM3D_Cartesian in such cases by enhancing the evaluation of the stiffness term or using machine learning or deep learning to adapt the time step locally?
I would greatly appreciate any insights you could offer on this issue.
just three remarks:
- regarding this sentence:
"..Therefore, is the only way to improve the computational efficiency of SPECFEM3D_Cartesian in such cases by enhancing the evaluation of the stiffness term or using machine learning or deep learning to adapt the time step locally?"
that would be too narrow in focus. there are many different ways to improve the computational efficiency of SPECFEM and other codes: some are more on the algorithmic side, like local time stepping; others are more on the hardware side, like using half-precision arithmetic.
for second-order vs. fourth-order time schemes, it's not necessarily the simulation duration but the number of propagated dominant wavelengths that matters. that number can always be pushed towards the high end in engineering and other fields by looking at high-frequency signals.
- in your comparison: the Newmark time scheme is only second-order accurate for displacement, not acceleration. you would thus be better off comparing displacements. and then, to be fair towards the LDDRK scheme, you would use larger time steps with the LDDRK scheme than with the Newmark scheme, as the stability regime differs between those two. trying to remember past results, i think we saw an increase in computational costs of ~2.6x (?) for LDDRK simulations, but again, with the benefit of being more accurate. so, your comparison with 30min vs. 160min looks a bit off.
- "Faster" in principle always seems "better" for numerical wave propagation. But what is the accuracy you're striving for? Higher-accuracy methods are only needed if your measurements do require it. Thus, the goal of optimizing algorithms goes hand in hand with the measure of misfit you're looking at (signals, measurement equipment, material sensitivity, etc.) - meaning, don't overoptimize things unless it's necessary :)
Hi @danielpeter
Thank you for your previous assistance in addressing my questions. I am just starting to learn about programming and time integration algorithms, and I have many areas where I still need clarification. I truly appreciate your patience in answering my fundamental inquiries.
I have now made initial modifications to the PEFRL algorithm, as you mentioned earlier, enabling it to run in both forward and elastic simulation modes. I have also conducted preliminary validations and comparisons to verify its correctness, but I have encountered some issues. I hope you can provide me with some guidance on this. Attached is a document summarizing my current status.
Status of PEFRL Algorithm Modifications and Existing Issues.docx
Thank you for your help!
thanks for the update - this sounds promising :)
regarding the two issues:
- attenuation: I have no other suggestions on this. the attenuation scheme to update the memory variables is a separate scheme (based on a variation of another lower-order Runge-Kutta scheme). maybe looking at the LDDRK scheme helps to find a way to deal with it for PEFRL.
- comparison of the 4th-order vs. the 2nd-order scheme: you will likely have to find a setup with an analytic solution and compare the numerical solutions against the analytic one. this will also involve filtering the traces to get the same frequency content. as you noticed, different time schemes affect the frequency content differently - so when comparing traces, this should involve a filter step first (see the sketch below). at the end, you probably want an analysis plot of computational costs (i.e., time-to-solution) versus accuracy. such a plot would also help in choosing a good setup, as there are several trade-offs between simulation duration, epicentral distances, computational costs, frequency content and accuracy.
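as a rough sketch of that filter-then-compare step (using scipy; the filter band and trace variables are placeholders):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass(trace, dt, fmin, fmax, order=4):
    # zero-phase Butterworth bandpass so both traces share the same frequency content
    sos = butter(order, [fmin, fmax], btype="bandpass", fs=1.0 / dt, output="sos")
    return sosfiltfilt(sos, trace)

def l2_misfit(ref, test):
    # relative L2 misfit between a reference (e.g., analytic) and a numerical trace
    return np.linalg.norm(test - ref) / np.linalg.norm(ref)
```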
- energy conservation: if I recall correctly, the Newmark scheme also conserves energy, same as the PEFRL scheme; only the Runge-Kutta scheme doesn't theoretically conserve energy. and yes, the `OUTPUT_ENERGY` feature could be a good idea to check this for purely elastic simulations - see the sketch below.
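a small sketch for checking the drift (assuming the energy output is a plain-text file with time in the first column and total energy in the last - please verify against the actual file the feature writes):

```python
import numpy as np

def energy_drift(filename):
    # relative drift of total energy over the run; should stay ~0 for a
    # conservative scheme in a purely elastic simulation
    data = np.loadtxt(filename)
    t, e_total = data[:, 0], data[:, -1]
    return t, (e_total - e_total[0]) / np.abs(e_total[0])
```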
hi @danielpeter
Compared to the previous LDDRK algorithm, the PEFRL algorithm demonstrates clear advantages in terms of efficiency and energy conservation. At this stage, I aim to further explore the relative strengths and weaknesses between the PEFRL and Newmark algorithms. I have conducted some experiments and obtained preliminary results. If you have time, I would be sincerely grateful if you could review them and provide your valuable guidance. Your expertise and profound knowledge have always been a great inspiration to me, and I truly appreciate the insights and support you have provided in the past.
Current Status of PEFRL Algorithm Calculation Results.docx
Thank you for your help!