File from tar -> rework from py2 to py3
File from tar inherits from file to create "file-like object". Manually sets the position and size of the fits file -> i.e. mimics a "normal" file and spoofs some kind of internal pointers
-> Problem:
- Python 3 removed the built-in "file" type, so the same approach does not work as-is
- as far as I have seen, it is not possible to change the mem, size and offset_data of a "normal" file
-> Possible solutions:
-> create custom class and implement necessary methods
-> see if there is an easier way of doing it, i.e. conversion from tars to fits
-> General work plan/ possible attempts:
- see how the file is used inside the instrument modules
- see what kind of data we can use
- worst case scenario -> for now only port HARPS and re-implement the loading data interface
No true private method in python -> possible chance of being able to override those methods
Look at internal docs and see if possible
Do I really want to go this way ?????
Probably important to find the root source of pfits -> output from file_from_tar does not work within the code with the flag "pfits" set to True.
When does the code enter this region?
-> By default pfits is True
-> apparently .gz files work -> not sure how/why they work (untested that they actually work)
pfits can take other values:
- known values so far: True; 2
- can it be False?
tarmode is always set to 5 -> all other values seem to be unused
Breaking down the problem of
s = type('tarobj', (file,), {})(s, mode='rb')
s.mem = extr.name
s.offset_data = extr.offset_data
s.size = extr.size
Possible motivation of this line:
-> Avoid opening and closing the .fits file every time. The tar file is opened and the "member" holding the proper data, i.e. the correct .fits file, is found. With that information in hand, the code needs to keep track of where, relative to the beginning of the tar file, the relevant data is stored.
Step by step of the problem:
#1: type('tarobj', (file,), {})
A new type with the features of a normal file is created, albeit with a new name. Important part: instances of a custom-made type accept new attributes without raising errors
#2: (s, mode='rb')
Open the tar file, in a reading-binary fashion; Easier to understand in the following way:
new_filetype = type('tarobj', (file,), {})
opened_file = new_filetype(s, mode = 'rb')
#3: Setting the attribute mem, offset_data and size
Since we have created a new type, we can 'declare' new attributes by just assigning to them. Thus, we store mem (the member name, not sure if used afterwards), offset_data (distance from the beginning of the opened tar file, in bytes as far as I can tell) and size (size of the data block)
#4: Those new attributes are then used to retrieve the header and .data of the desired fits file
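The steps above rely on three values taken from the tar member. As a rough Python 3 sketch, assuming that extr is a tarfile.TarInfo and that the tar is uncompressed, the same values can be looked up and used to read the member directly from the tar file. The helper names (locate_member, read_member_bytes, demo) and the file names are made up for illustration and are not part of the original code:

```python
import io
import os
import tarfile
import tempfile


def locate_member(tar_path, member_name):
    """Return (name, offset_data, size) for one member of an uncompressed tar."""
    # mode "r:" refuses compressed archives, where raw offsets would be meaningless
    with tarfile.open(tar_path, mode="r:") as tar:
        extr = tar.getmember(member_name)
        return extr.name, extr.offset_data, extr.size


def read_member_bytes(tar_path, offset_data, size):
    """Read the member's data straight from the tar file, without extracting it."""
    with open(tar_path, "rb") as fh:
        fh.seek(offset_data)   # jump to where the member's data starts
        return fh.read(size)   # read exactly the member's data block


def demo():
    """Build a tiny tar with one fake member and read it back by offset."""
    payload = b"SIMPLE  =                    T"  # placeholder bytes, not a real FITS file
    with tempfile.TemporaryDirectory() as tmp:
        tar_path = os.path.join(tmp, "demo.tar")
        info = tarfile.TarInfo(name="demo.fits")
        info.size = len(payload)
        with tarfile.open(tar_path, mode="w") as tar:
            tar.addfile(info, io.BytesIO(payload))
        name, offset_data, size = locate_member(tar_path, "demo.fits")
        data = read_member_bytes(tar_path, offset_data, size)
    return name, offset_data, size, data, payload
```

This only illustrates the bookkeeping; how pfits consumes the object is not covered here.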
Solution for Python3:
-
Create a class to hold all of the information, i.e. opened tar file and the position within it; Afterwards, change the relevant methods (header and data retrieval) to work nicely with that class?
-
Seems the most straightforward approach; otherwise, attempt to mock a "file" object, which I am not sure if/how I can do
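A minimal sketch of what such a holder class could look like, assuming an uncompressed tar and a reader that only needs read/seek/tell relative to the member. The class name TarMemberFile and the demo helper are hypothetical; this is not the class that was actually introduced, and the real header/data retrieval code may need different methods:

```python
import io
import os
import tarfile
import tempfile


class TarMemberFile:
    """Hypothetical holder: an opened tar file plus the position of one member in it."""

    def __init__(self, tar_path, member_name):
        with tarfile.open(tar_path, mode="r:") as tar:
            extr = tar.getmember(member_name)
        # Same three attributes that the Python 2 code attached to the file object
        self.mem = extr.name
        self.offset_data = extr.offset_data
        self.size = extr.size
        self._fh = open(tar_path, "rb")
        self._fh.seek(self.offset_data)

    def tell(self):
        # Position relative to the start of the member, not of the tar
        return self._fh.tell() - self.offset_data

    def seek(self, pos, whence=os.SEEK_SET):
        if whence == os.SEEK_SET:
            target = pos
        elif whence == os.SEEK_CUR:
            target = self.tell() + pos
        elif whence == os.SEEK_END:
            target = self.size + pos
        else:
            raise ValueError("invalid whence: %r" % (whence,))
        target = max(0, min(target, self.size))  # stay inside the member
        self._fh.seek(self.offset_data + target)
        return target

    def read(self, n=-1):
        remaining = self.size - self.tell()
        if n is None or n < 0 or n > remaining:
            n = remaining  # never read past the end of the member
        return self._fh.read(n)

    def close(self):
        self._fh.close()

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()


def demo():
    """Build a tar with two fake members and read pieces of the second one."""
    with tempfile.TemporaryDirectory() as tmp:
        tar_path = os.path.join(tmp, "demo.tar")
        with tarfile.open(tar_path, mode="w") as tar:
            for name, payload in [("a.fits", b"AAAA"), ("b.fits", b"BBBBBBBB")]:
                info = tarfile.TarInfo(name=name)
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
        with TarMemberFile(tar_path, "b.fits") as f:
            first = f.read(3)
            pos = f.tell()
            rest = f.read()
            at_end = f.read()
            f.seek(-2, os.SEEK_END)
            tail = f.read()
            return f.mem, f.size, first, pos, rest, at_end, tail
```

For comparison, the standard library's TarFile.extractfile() also returns a read-only file-like object for a member; whether that is enough for the pfits code path has not been checked here.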
Introduced class to hold relevant information;
Apparently working fine;
Issue solved
Performance similar to old interface