Operations with Box1D spectral element significantly slower than with GaussianFlux1D

Description

Not entirely sure if this is a real issue or I'm misunderstanding some synphot internals, but I have noticed that Observation operations when using a bandpass based on Box1D are systematically slower than equivalent operations with a bandpass based on a Gaussian element.

Expected behavior

Both sets of operations to take approximately the same amounts of time.

Actual behavior

Steps to Reproduce

MWE example below. Specifying integration_type has no effect in either case.

from synphot import Observation, SourceSpectrum, SpectralElement, units
from synphot.models import BlackBodyNorm1D, GaussianFlux1D, Box1D
import numpy as np
s = SourceSpectrum(BlackBodyNorm1D, temperature=5000)
l = s.avgwave()
BW = l*0.1
FWHM = BW*np.sqrt(np.log(2)/np.pi)*2
box = SpectralElement(Box1D, x_0=l, width=BW)
gauss = SpectralElement(GaussianFlux1D, mean=l, fwhm=FWHM)

obs_box = Observation(s, box)
obs_gauss = Observation(s, gauss)

flux_box = obs_box.integrate()
flux_gauss = obs_gauss.integrate()
print(flux_box - flux_gauss) # -0.0006330283047517593 ph / (cm2 s)

Timing tests with ipython on two different systems:

%timeit obs_box = Observation(s, box)
#sys 1: 458 ms ± 43.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
#sys 2: 106 ms ± 368 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit obs_gauss = Observation(s, gauss)
#sys 1: 62.8 ms ± 2.57 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
#sys 2: 5.77 ms ± 16.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit flux_box = obs_box.integrate()
#sys 1: 265 ms ± 22.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
#sys 2: 27.1 ms ± 250 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit flux_gauss = obs_gauss.integrate()
#sys 1: 48 ms ± 5.31 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
#sys 2: 4.33 ms ± 42 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

System Details

sys1:
macOS-11.7-x86_64-i386-64bit
Python 3.10.8 (main, Oct 12 2022, 02:49:24) [Clang 13.0.0 (clang-1300.0.29.3)]
Numpy 1.23.4
astropy 5.1
Scipy 1.8.1
Matplotlib 3.5.3
Synphot 1.1.1

sys2:
Linux-4.15.0-176-generic-x86_64-with-glibc2.27
Python 3.10.2 (main, Feb 22 2022, 10:06:51) [GCC 7.5.0]
Numpy 1.22.2
astropy 5.0.1
Scipy 1.8.0
Matplotlib 3.5.1
Synphot 1.1.1

Hi, thanks for the issue!

Before we worry about performance, SpectralElement(GaussianFlux1D, mean=l, fwhm=FWHM) does not make sense to me. GaussianFlux1D is meant to model SourceSpectrum with flux units. For SpectralElement, you should use Gaussian1D (the latter does not recognize fwhm because that parameter never made it back to astropy.modeling.models.Gaussian1D, so you have to pass in stddev instead which I set to BW/2).

synphot_refactor/synphot/models.py

Line 483 in cb3ff4b

class Gaussian1D(BaseGaussian1D):

Even so, your performance concern is valid. I can reproduce what you see on my machine as well. I think it is because by default, for historical reasons, Box1D samples more points than Gaussian1D:

>>> box.waveset.shape
(154263,)
>>> gauss.waveset.shape  # Gaussian1D, not GaussianFlux1D
(100,)

Then in the Observation constructor, several operations involve sampling at those points, so this contribute to the performance you see:

synphot_refactor/synphot/observation.py

Line 94 in cb3ff4b

stat = band.check_overlap(spec)

synphot_refactor/synphot/observation.py

Line 144 in cb3ff4b

self._init_bins(binset)

To illustrate this point, not that I would recommend you do this, let's pass in box model directly from astropy and only sample at its edges:

>>> from synphot.spectrum import SpectralElement
>>> from astropy.modeling import models
>>> box2 = SpectralElement(models.Box1D, x_0=l, width=BW)
>>> minimal_box_waveset = box.model.sampleset(minimal=True)
>>> %timeit obs_box2 = Observation(s, box2, binset=minimal_box_waveset, force='extrap')  # You will see warning.
3.82 ms ± 20.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %timeit obs_gauss = Observation(s, gauss)
5.02 ms ± 52.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %timeit obs_box = Observation(s, box)
88.3 ms ± 474 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

I am not sure what can be done to improve this. I don't think I can change the number of sampled points of Box1D without some protest from stakeholders. Is the performance of Box1D actually a blocker for you?

You can also consider defining your own Box1D model but with less sampled points. You can look here how it is done:

synphot_refactor/synphot/models.py

Line 190 in cb3ff4b

class Box1D(_models.Box1D):

If I have sufficiently clarified the issue, please close the issue. Thanks again!

Thanks very much for the feedback and suggestions. I'll look into implementing my own Box wrapper class. I have sort of a niche use case where I need to generate several thousand different observation objects, so the performance hit is actually quite noticeable.

I'm wondering whether you'd consider a minor pull request that preserves the same Box1D functionality by default, but allows for users to modify the default waveset if desired? The reason I ask is that SpectralElement rejects any model not in _model_param_dict, which means that the only way to overload Box1D is to create a class with the exact same name, which feels suboptimal. I believe that a solution preserving default behavior is as simple as adding an init to Box1D of the form:

    def __init__(self, step=0.01, *args, **kwargs):

        super(Box1D, self).__init__(*args, **kwargs)
        self.step = float(step)

and then modifying the sampleset method as:

    def sampleset(self, step=None, minimal=False):
        """Return ``x`` array that samples the feature.

        Parameters
        ----------
        step : float
            Distance of first and last points w.r.t. bounding box.

        minimal : bool
            Only return the minimal points needed to define the box;
            i.e., box edges and a point outside on each side.

        """
        if step is None:
            step = self.step

@dsavransky , ah, sorry about that. Your proposal seems reasonable to me, but I'll have to run it by stakeholders. I would certainly welcome a PR to open up the discussion. Thank you!