ukaea/neutronics_material_maker

Use case / refactor

CoronelBuendia opened this issue · 1 comments

OK so I've written up some material classes which wrap the existing classes and allow me to play with the material cards if I want to.

I need to sit down with you next week and work out how Serpent actually uses these cards, because it is a little annoying that just to change a color you need to rewrite a new material. I am fairly sure Serpent will only read the necessary datafiles once though, and keep pointers to the info in memory, so it shouldn't affect performance.. right?

In Python, I want to only make one material object in memory per material, and then write different material cards if need be.

On to your refactor! I guess all the pains of making a module were too much so you lumped it all into one file :) This is fine, but it is a big script, so it is worth compressing if possible!

Some ideas for refactoring (sorry if this is preachy -- just trying to help!):

  • Separate data and code:
    def find_natural_abundance(self,symbol,atomic_number):
        if symbol == 'H' and atomic_number == 1: return 0.999885
        if symbol == 'H' and atomic_number == 2: return 0.000115
        ....

You have a lot of if statements basically performing lookup functions, which results in long scripts and data and code intertwined. It's unlikely in this case, but if you needed to change the properties of an element, say, you'd have to go and look through several functions and amend the data in the script.

I think it would be really cool to have a pandas DataFrame holding the data in memory (it is not that big), and then the functions just looking up the info in the DataFrame. So far example, playing around with some of the functions to get the raw data, I can then do:

from pandas import DataFrame
df = DataFrame(data={'Symbol':sym[:112], 'Name':name, 'Atomic Number': list(range(1,113))}, index=None)

df
Out[95]: 
     Atomic Number           Name Symbol
0                1       Hydrogen      H
1                2         Helium     He
2                3        Lithium     Li
3                4      Beryllium     Be
4                5          Boron      B
5                6         Carbon      C
6                7       Nitrogen      N
7                8         Oxygen      O

We'd need a range of sub-rows for the isotopes of each element, but pandas can handle that quite well with multiple indexing. Then looking up data is just a question of using the right functions on the DataFrame.

I would then save the DataFrame to a csv, or excel, whatever human readable format, and have that be the repository for atomic data. This would be more coherent than nesting it into the script itself.

This could potentially drop ~500 lines

  • List comprehensions:
    e.g.
    def find_all_natural_element_symbols(self):
        return ['Sn', 'Xe', 'Cd', 'Te', 'Ba', 'Dy', 'Gd', 'Hg', 'Mo', 'Nd', 'Os', 'Ru', 'Sm', 'Yb', 'Ca', 'Er', 'Hf',
                'Kr', 'Pd', 'Pt', 'Se', 'Ge', 'Ni', 'Ti', 'W', 'Zn', 'Zr', 'Ce', 'Cr', 'Fe', 'Pb', 'S', 'Sr', 'Ar', 'C',
                'K', 'Mg', 'Ne', 'Si', 'U', 'Ag', 'B', 'Br', 'Cl', 'Cu', 'Eu', 'Ga', 'H', 'He', 'In', 'Ir', 'La', 'Li',
                'Lu', 'N', 'Rb', 'Re', 'Sb', 'Ta', 'Tl', 'V', 'Be', 'O', 'F', 'Na', 'Al', 'P', 'Sc', 'Mn', 'Co', 'As',
                'Y', 'Nb', 'Rh', 'I', 'Cs', 'Pr', 'Tb', 'Ho', 'Tm', 'Au', 'Bi', 'Th', 'Pa']

    def find_all_natural_elements(self):
        return [Element(symbol='Sn'),
                Element(symbol='Xe'),
                ...]

becomes

NAT_ELEMENT_SYMBOLS = ['Sn', 'Xe', 'Cd', 'Te', 'Ba', 'Dy', 'Gd', 'Hg', 'Mo', 'Nd', 'Os', 'Ru', 'Sm', 'Yb', 'Ca', 'Er', 'Hf', 'Kr', 'Pd', 'Pt', 'Se', 'Ge', 'Ni', 'Ti', 'W', 'Zn', 'Zr', 'Ce', 'Cr', 'Fe', 'Pb', 'S', 'Sr', 'Ar', 'C',
                'K', 'Mg', 'Ne', 'Si', 'U', 'Ag', 'B', 'Br', 'Cl', 'Cu', 'Eu', 'Ga', 'H', 'He', 'In', 'Ir', 'La', 'Li',
                'Lu', 'N', 'Rb', 'Re', 'Sb', 'Ta', 'Tl', 'V', 'Be', 'O', 'F', 'Na', 'Al', 'P', 'Sc', 'Mn', 'Co', 'As',
                'Y', 'Nb', 'Rh', 'I', 'Cs', 'Pr', 'Tb', 'Ho', 'Tm', 'Au', 'Bi', 'Th', 'Pa']

@staticmethod
 def find_all_natural_element_symbols():
        return NAT_ELEMENT_SYMBOLS
@staticmethod
def find_all_natural_elements():
        return [Element(e) for e in NAT_ELEMENT_SYMBOLS]

That's already 80 or so fewer lines :)

Combing the above and looking at something like:

def find_natural_isotopes_in_element_from_symbol(self,symbol):
        if symbol == 'Sn': return [Isotope(symbol=symbol , atomic_number = 112), 
                                   Isotope(symbol=symbol , atomic_number = 114), 
                                   Isotope(symbol=symbol , atomic_number = 115),
                                   Isotope(symbol=symbol , atomic_number = 116), 
                                  ....

This could be:

def find_natural_isotopes_in_element_from_symbol(self,symbol):
       return [Isotope(symbol=symbol, atmoic_number=i) for i in self.df.loc[self.df['Symbol']==symbol]['Atomic Number']

And that's another 150 lines :D

  • sys.exit() considered harmful:
    Ok to be honest, I don't really know what people think of this, but when running in an IDE at least, this throws less clear error messages:
print('material_card_name must be provided when making a serpent_material_card from a material')
sys.exit()

gives:

Material()
A list of elements present within the material must be specified
An exception has occurred, use %tb to see the full traceback.

SystemExit

/home/mc2056/anaconda3/lib/python3.6/site-packages/IPython/core/interactiveshell.py:2870: UserWarning: To exit: use 'exit', 'quit', or Ctrl-D.
  warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)

Basically the first thing you see is SystemExit in bright red :D
whereas,

    raise ValueError('A list of elements present within the material '
                               'must be specified')
Material()

Traceback (most recent call last):

  File "<ipython-input-61-c9a6cf1dec32>", line 1, in <module>
    Material()

  File "/home/mc2056/Code/neutronics_material_maker/neutronics_material_maker/nmm.py", line 1478, in __init__
    raise ValueError('A list of elements present within the material '

ValueError: A list of elements present within the material must be specified

The latter gives me an Error which relates to the problem, and a code line_number (without having to type %tb in the console). The former ok I guess if you are working from the command line, but in some IDEs / consoles, sys.exit() will actually close the console (at least in my experience).

In the branch I just pushed, you can see what I did in my little playpen, and I only changed a couple of the sys.exit()s because they were closing my consoles. Otherwise I haven't touched the main body or done any actual refactoring.

Happy to give the refactoring a bash if you think it is worth it. Talk Monday!

EDIT: Why did you drop the BaseObject by the way? Are the serpent material cards different between classes?

Thanks for all the suggestions and ideas, I have had a go with pandas and removed a few hundred lines of if statements, I am sure there are more opportunities to remove extra lines by using pandas more. I have also swapped the sys.exits with raise error. Catch up with you on Monday