sunpy/sunraster

Proposed Changed to naming convention of SJI Classes

DanRyanIrish opened this issue · 3 comments

Little thought has been given to a naming convention for SJI classes. Currently, there is only one such class, SJICube, which was named on day 1 without consideration. However, issue #72 proposes a new SJICubeSequence class so this is a good time to raise this issue.

Since SJICube was first named, a naming convention has developed around the spectrograph objects:

  • IRISSpectrogram holds a single raster scan or sit-and-stare and is sub-classed from NDCube.
  • IRISSpectrogramSequence holds a sequence of raster scans and is subclassed from NDCubeSequence.
  • IRISSpectrograph is a container object for holding IRISSpectrogram or IRISSpectrogramSequence objects from different spectral windows from the same OBS.

If we were to apply this convention to SJI, we would have:

  • IRISMap: holds a single cube of SJI data, described by a single WCS.
  • IRISMapSequence: holds multiple IRISMap objects.
  • IRISSJI: a container object for holding IRISMap and IRISMapSequence objects from different passbands of the same OBS.

Since Spectrogram and Map imply 2D data, and IRISSpectrogram and IRISMap/SJICube can hold 2- or 3-D data, a more wordy but more explicit alternative would be change both naming conventions to:

  • IRISSpectrogramCube
  • IRISSpectrogramCubeSequence
  • IRISMapCube
  • IRISMapCubeSequence
    Although Cube can be taken to mean 3-D, it also can be used for any number of dimensions, as in NDCube. So I think this naming convention would be a little more understandable for objects that can 2- or 3-D data. But is it more clear enough to warrant changing the Spectrograph API and make give the classes longer names? Name lengths are not as important as they used to be thanks for Python's tab complete, but it's a question worth asking.

@Cadair, could you give some reaction to this with your experience of the evolution of the SunPy naming conventions? Having slept on it, I am in favour of the last option, i.e. IRISSpectrogramCube/IRISSpectrogramCubeSequence/IRISMapCube/IRISMapCubeSequence convention. It is more wordy, but more explicit and far more intuitive. And tab complete means it will require only one or two extra key strokes.

To be more abstract about it, the naming convention can be thought of as
Descriptor,ObjectType where the descriptor can be generic (e.g. ND for NDCube), MeasurementType for generic measurement type objects (as in MapCube) or InstrumentMeasurementType for instrument specific objects (as in IRISMapCube). In this convention, if the object type is an object of objects, it can have two components, i.e.CubeSequence implies a Sequence of Cubes. I believe this is similar to the existing or aspired-to naming convention of SunPy.

After personal discussion between @Cadair and myself, we believe the third option (IRISSpectrogramCube/IRISSpectrogramCubeSequence/IRISMapCube/IRISMapCubeSequence) is the best way forward.

So @BaptistePellorceAstro, could you open a PR that changes the name of SJICube to IRISMapCube?

I'm a little late to the party, but I think that name lengths are important and 'tab complete' should not deter from that (I'm sure everyone will type at least once in places where there is no tab complete). I also haven't had time to properly reflect on all the IRISpy objects, and right now it feels really confusing to have so many classes.

I don't see any reason to add Cube to the class names. Why does Spectrogram imply 2D? And why should the names of the class give information about the dimensionality of the data? I think SJI makes more sense than Map (you can have spectroheliograms, which are maps and not from SJI...).

More generally, I don't understand why it is necessary to have so many classes. This seems very confusing to an end-user. Why isn't everything a xxSequence object? You can have a IRISSpectrogram as a special case of IRISSpectrogramSequence where the number of rasters is 1. It seems to me that this would also allow for easier loading of level 3 data into this object. And why is IRISSpectrograph needed? I don't see any methods or use cases to do operations across spectral windows, so all it seems to add is an additional slicing to access any functionality, which could just as well be done with a dictionary. The same is more or less true for the SJI classes. Can't think of a use case to combine multiple SJIs in one object (here the different exposures are never co-temporal and rarely co-spatial and, unlike the spectrograph, they are always in separate files).

And finally, I think the names of the reader programs are too long (e.g. read_iris_spectrograph_level2_fits). Again, these are commands that people will type all the time. Why is so much verbosity needed? I've never seen such long names in any other package. Why not just call it e.g. read_level2? By context (module name) it is clear that it's for IRIS and spectrograph, and level 2 data are FITS files (even if in the future they are not, this function can be changed to accommodate both). astropy.io.fits reader's functions are not called get_data_from_generic_fits() but getdata(). Same for scipy.io.loadmat(), etc.

I'm sorry if this sounds like I'm complaining all the time, but I think we should get this right if we want people to start using IRISpy. There is already so much reluctance in the field to get people to change from IDL, and if it's too burdensome it will make people even less likely to try it.