R-Lum/Luminescence

Risoe.BINfileData2RLum.Analysis: the x-axis in the data must be changed


Expected behaviour

a list of values from 0.16 to 40, by 0.16

Observed behaviour

a list of values from 0 to 40, by 0.1606426

Running mini example

library(Luminescence)

# 'bin': a Risoe.BINfileData object imported beforehand
rlum <- Risoe.BINfileData2RLum.Analysis(bin)

Hello,

For example, I have an OSL curve measured over 40 seconds with 250 data channels (the default setting on a Risø reader).

The RLum object created via Risoe.BINfileData2RLum.Analysis generates x-axis values for each OSL curve. This is great, but there is a problem.

The x-axis ranges from 0 to 40, by 0.1606426, which is 40/(250 - 1).

It does not make sense: the data are recorded and aggregated over a single data channel, and the corresponding x value is traditionally taken to be the end value of the channel.

So it should be: from 0.16 to 40, by 0.16, which is 40/250.

Alternative ways are possible:
0 to 39.84, by 0.16 (also 250 channels)
or 0.08 to 39.92, by 0.16

The first is more desirable (0.16 to 40), but the third (starting at 0.08) is also good. The second should be avoided, because 0 does not go well whenever you rely on a log scale.

The TL curves (simple ramp, no hold) show the same issue:
observed: from 0 to 240, by 1.004184
expected: from 1 to 240, by 1

I tried to track down the source in the code but lost the trail around Risoe.BINfileData2RLum.Data.Curve, so I do not know where the x-axis calculation actually happens. I wanted to help, but I do not have the time right now.

thanks!

Sebastien

Hi Sébastien,

This is an issue that already gave me a headache some time ago, and I guess whatever I write here will not be satisfying in your eyes since you are right: from the scientific point of view (and also from the data-processing perspective), the current solution is not perfect (but a reasonable compromise):

  1. Starting from 0 is indeed hardly justified if you consider that we work with channels and the PMT (or the controller later) sums up the counts after a preset time interval. So the first versions of 'Luminescence' did what is logical: they started with the first channel, in your case at 0.16. This was ok until I tried to understand the difference between published curve-deconvolution data and data I had analysed. Surprisingly, I realised: they start at 0 s. Wow. So I double-checked with the Analyst and, again, wow, all curves started at 0 s (which is still the case). After a chat with Geoff and the R Team back in summer 2015, I changed the implementation to 0 s as well. The reasoning is twofold: (A) users do not like diverging results from different tools (I know, a weak argument), and (B) our mathematical models do not account for discretisation effects; this does not matter for small channels, but it impacts the output when the channels become larger.

  2. The reason for the last channel is that the BIN/BINX-file format does not define the x-axis but leaves it to the user to decide what to do with it. The reader provides LOW, HIGH and NPOINTS (the byte naming according to the format definition). Consequently, from a programming point of view, I have to define, by definition, a sequence of numbers running from LOW to HIGH with length NPOINTS. Since LOW is set to 0 and HIGH to, in your example, 40, what you observe is what you get if you follow (HIGH - LOW)/(NPOINTS - 1) to include both LOW and HIGH. In other words, these values are not correct in the first place. The Analyst, however, stops at HIGH - channel_resolution, so in your case 39.92, which contradicts my argument above. Bottom line: this is a design flaw in the file format. It requires making an assumption about a recorded signal based on timestamp information from the LED (but not the detector). Hard to get it right this way.
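
To make the two conventions concrete, here is a minimal sketch in base R (the variable names simply mirror the byte fields; this is illustrative, not the package code):

LOW <- 0; HIGH <- 40; NPOINTS <- 250

# current behaviour: LOW and HIGH both included, NPOINTS values in total
x_current <- seq(LOW, HIGH, length.out = NPOINTS)   # 0.0000000 0.1606426 ... 40

# channel-end convention: step of (HIGH - LOW)/NPOINTS, starting one channel in
step <- (HIGH - LOW) / NPOINTS
x_channel_end <- seq(LOW + step, HIGH, by = step)   # 0.16 0.32 ... 40

Both sequences have length 250; only the anchoring of the first value differs.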

Perhaps we should follow the Analyst example? What do you think?

Besides, the code creating the x-axis is written in C++ and can be found here: https://github.com/R-Lum/Luminescence/blob/master/src/src_create_RLumDataCurve_matrix.cpp. Before, we had an R-only solution, but for importing large datasets (> 100,000 curves) the performance impact was too high.

The problem is not the physics but the maths. The equations we are using do not account for the discretisation effects. No problem for small channels, but it becomes significant for large channels, although this is more a philosophical than a real practical problem, at least for CW-OSL.

Dividing the first channel in half sounds tempting, but what is the justification for it? From the data-processing point of view, nothing was returned at this time; we do not know the count number at this time. It might be 100, it might be 100/2, it might be exp(x). I don't know; it depends on your signal. The best time would be the arrival time of the first photon, but we don't know this either.

So treating the data like the Analyst does means a shift towards zero; sounds ok to me, just a different understanding.

Anyway, I see your points, which are all justified as well. For the moment, I just don't know how to tackle them, for the reasons mentioned above. In the XSYG-file format, we decided to go for the 0.16 -> 40 solution; this is at least a clean solution regarding the data processing.

I will put it on the list; this needs to be done in a different context.

@SebastienHuot I double-checked what the Viewer and the Viewer Plus are doing.

  1. Viewer (BIN/BINX-files < v7): shows only channels, so nothing can go wrong here.
  2. Viewer Plus: starts with the end of the first channel and ends at the timestamp of the last channel; probably the most logical solution.

This means I will switch to that solution, which is also consistent with the XSYG-file format.

I just updated 'Luminescence' here on GitHub (branch: dev_0.9.x); the package released on CRAN does not contain the last two changes, sorry. I am not sure when I am going to release what will become version 0.9.4. A release usually takes half a day due to all the issues I have to consider along the way. Maybe at the end of August.

It appears that the new commit passed all CI tests, which means my changes have no obvious unwanted side effects. If you like, you can install the development version with the changes included via

devtools::install_github("R-Lum/Luminescence@dev_0.9.x")

I will keep this ticket open for a while, just to be sure.

Thanks for the discussion and your support!

Slight update on the x formula.

From LOW, HIGH and NPOINTS, the formula is (using the seq() nomenclature):

from: (HIGH - LOW)/NPOINTS + LOW
to: HIGH
by: (HIGH - LOW)/NPOINTS

which gives 0.16 -> 40.

@SebastienHuot Is it an update because something is wrong in the function? I double-checked, but it does what it should do: LOW = 0, HIGH = 40, NPOINTS = 250 results in

0.16  0.32  0.48  0.64 ... 40

The length of this object is 250. Did I overlook something?

A discrepancy appears if LOW > 0, which is typical of the Lexsyg's TL curves, for example:

LOW = 25
HIGH = 199
NPOINTS = 174

as inscribed in the BINX file. For the Lexsyg data, I agree it makes more sense to grab the x values directly from the XSYG file, as they are recorded along with the PMT data.

As far as I know, the Risø always has LOW = 0.

Ah, because you mean it should now start at 25, not at 26, right?

Does your R terminal show something like:

[src_create_RLumDataCurve_matrix()] BIN/BINX-file non-conform. TL curve may be wrong!

If so, I know what your problem is, but this is related to FI not respecting the BIN/BINX-file format.

[src_create_RLumDataCurve_matrix()] BIN/BINX-file non-conform. TL curve may be wrong!
For TL, from the Lexsyg: yes, all the time!

with

LOW = 25
HIGH = 199
NPOINTS = 174

from: (HIGH - LOW)/NPOINTS + LOW = (199 - 25)/174 + 25 = 26
to: HIGH = 199
by: (HIGH - LOW)/NPOINTS = (199 - 25)/174 = 1

So the answer is 26 °C, the end of the first data channel.
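
A quick check in base R (the same construction as above; illustrative only):

LOW <- 25; HIGH <- 199; NPOINTS <- 174
step <- (HIGH - LOW) / NPOINTS         # 1
x <- seq(LOW + step, HIGH, by = step)
head(x)                                # 26 27 28 29 30 31
length(x)                              # 174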

Ah ok, now it's clear. The reason behind it: if the BINX file claims to be of version >= 4 and the curve is a TL curve, the format definition requires that the byte positions TOLON, TOLOFF and TOLDELAY are > 0. If this is not the case, the function still tries to calculate x values according to the BINX-file format manual, but, as the message states, the curve might be wrong.

Can you send me one BINX file with this problem via email? Then I can have a look, just in case. But if it is as I assume, there is perhaps nothing I can change, because if I did, I would break TL curves imported from Risø BINX files which follow the format definition.

No, the bytes are used by the reader to implement various preheat possibilities, including the preheat plateau (features introduced with version 4). Anyway, thanks for your files; I will see what I can do. Maybe I have overlooked something.

I forgot to add: a TL curve has three parts:

  1. the first ramp, where we need the byte information from LOW, AN_TEMP and TOLDELAY
  2. the plateau: simply TOLON
  3. the end 'ramp', where we need AN_TEMP, HIGH and TOLOFF

Ok, I will modify my question: did you test the TL curves with the updated version of the package? Because, in your example, it should start at 26 if LOW is 25. Probably I should have mentioned that I did not use your equation. The implementation is the following (it's C++):

#include <Rcpp.h>
using namespace Rcpp;

NumericVector seq(int from, int to, double length_out) {

  // allocate the output vector with length_out elements
  NumericVector sequence(static_cast<int>(length_out));

  // channel width
  double by = (to - from) / length_out;

  // the first value is the end of the first channel
  sequence[0] = from + by;

  // each following value is one channel width further
  for (int i = 1; i < length_out; ++i) {
    sequence[i] = sequence[i - 1] + by;
  }
  return sequence;
}
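
For reference, the same construction in base R (a sketch; the helper name seq_channels is mine, not from the package):

# n values, starting one channel width after 'from' and ending exactly at 'to'
seq_channels <- function(from, to, n) {
  by <- (to - from) / n
  seq(from + by, by = by, length.out = n)
}

seq_channels(0, 40, 250)    # 0.16 0.32 ... 40.00
seq_channels(25, 199, 174)  # 26 27 ... 199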

Regarding the plateau plots: you are completely right; however, I follow the implementation by Risø to keep the data visualisation similar to their own software. Besides, you can always create your own plots the way you want.

@SebastienHuot Do you think I can close this issue and make it part of a new submission?

Ok, big thanks @SebastienHuot for supporting us here. I'll close this issue now, but if needed we can reopen it later.