nolanlab/spade

Question about default transform parameters

jiaco opened this issue · 8 comments

jiaco commented

First off, thanks for all your work on SPADE. It has been a very interesting package to learn.

I am just a little bit confused about how the default transform has been obtained.

flowCore::arcsinhTransform( a = 0, b = 2 );

We have tested various settings and are actually quite happy with this one, but I am unable to understand exactly what a = 0 is doing. The docs say that 'a' is a "positive double that corresponds to the base of the logarithm." Despite the fact that I can see from the equation "x<-asinh(a+b*x)+c" that setting a to zero will merely drop the 'a' from the formula, I cannot get my head around setting the base of a logarithm to zero.
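For reference, the formula quoted above can be written out directly. This is a minimal sketch in Python (standard library only), not flowCore's implementation; the function name `arcsinh_transform` and the default `c = 0` are assumptions made for illustration.

```python
import math

def arcsinh_transform(x, a, b, c=0.0):
    """Evaluate asinh(a + b*x) + c, the formula quoted from the docs.

    The name and the default c = 0 are assumptions for this sketch.
    """
    return math.asinh(a + b * x) + c

# With a = 0 (and c = 0) the transform reduces to asinh(b * x):
# zero maps to zero and the curve is symmetric about the origin.
values = [-10.0, 0.0, 1.0, 100.0]
transformed = [arcsinh_transform(v, a=0.0, b=2.0) for v in values]
```

Nothing here involves a logarithm base; `a` only enters as an additive term inside the `asinh`.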

Does my question make sense? I warn you, this is my first exposure to FACS data, but I have plenty of R/bioinformatics experience. Just never used these transform functions before and would like to try to explain how I justify the use of this transform and these particular parameters to the biologist.

Thanks again

Jason

Hi Jason,

I think that's either an error in the flowCore docs (copy/paste from logTransform), or they're attempting to make a poor comparison to log transforms because ArcSinh is log-like in some regions. a is simply the horizontal shift around 0, and (to me) doesn't do anything similar to changing the base of the log.
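Both the "log-like in some regions" remark and the "horizontal shift" reading follow from the standard identity for the inverse hyperbolic sine. This is textbook math, not something taken from the flowCore source:

```latex
\begin{aligned}
\operatorname{asinh}(z) &= \ln\!\left(z + \sqrt{z^{2} + 1}\right) \\
\operatorname{asinh}(z) &\approx \ln(2z) \quad (z \gg 1), \qquad
\operatorname{asinh}(z) \approx z \quad (|z| \ll 1) \\
\operatorname{asinh}(a + b x) &= \operatorname{asinh}\!\left(b\left(x + \tfrac{a}{b}\right)\right) \quad (b \neq 0)
\end{aligned}
```

So for large arguments the curve tracks a natural log, near zero it is linear, and a nonzero `a` only slides the curve along the x axis by `a/b`. No term plays the role of a logarithm base.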

ArcSinh:
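[image: https://cloud.githubusercontent.com/assets/469365/11042185/8fff49ee-86ca-11e5-9c17-abab74016594.png]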

Log (blue line that's just labeled "log" is base e):
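[image: https://cloud.githubusercontent.com/assets/469365/11042196/9f305ce6-86ca-11e5-9c62-dc736cbcd373.png]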

(BTW our default b is 0.2, not 2 -- that is the x/5 shown in the plot legends above.)
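To make the b = 0.2 versus x/5 correspondence concrete: with a = 0 and c = 0, `asinh(b * x)` with b = 0.2 is the same as `asinh(x / 5)`, so the "cofactor" in the legends is just 1/b. A small Python check, where the helper name `asinh_with_cofactor` and the sample value are hypothetical:

```python
import math

def asinh_with_cofactor(x, cofactor):
    """asinh(x / cofactor); same as asinh(b * x) with b = 1 / cofactor."""
    return math.asinh(x / cofactor)

x = 100.0  # arbitrary sample value, not real data

via_b = math.asinh(0.2 * x)                 # b = 0.2
via_cofactor = asinh_with_cofactor(x, 5.0)  # x / 5
via_b_2 = math.asinh(2.0 * x)               # b = 2, as written in the question
```

Here `via_b` and `via_cofactor` agree up to floating-point rounding, while `via_b_2` is a different (more compressed at the low end) scale.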

Hope that helps!

jiaco commented

Yes that helps a lot. Really glad I asked. Thanks for the response.

SamGG commented

Your understanding of the parameters is right. asinh is OK, but the default parameters were badly chosen. I use asinh frequently. Here is an example with the data of Moore et al 2012. Left: screenshot of their figure. Right: a simple asinh transform. Let me know if you find any difference that justifies the use of a complex bi-exponential function.
[image: 2015-11-09_225336]

@SamGG x/5 as the default was chosen for CyTOF data. x/150 is recommended for fluorescence. See Figure S2 from Bendall 2001:

image
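As a rough illustration of why the cofactor depends on the instrument: `asinh(x / cofactor)` is close to linear for |x| well below the cofactor and close to logarithmic well above it, so the cofactor sets where the scale switches over. The Python sketch below uses made-up intensity values, not data from the figure, and the function name is hypothetical.

```python
import math

def asinh_scaled(x, cofactor):
    """asinh(x / cofactor), with the cofactor setting the linear region."""
    return math.asinh(x / cofactor)

intensities = [0.0, 5.0, 150.0, 10000.0]  # made-up values for illustration

for cofactor in (5.0, 150.0):
    row = [round(asinh_scaled(x, cofactor), 3) for x in intensities]
    print(f"cofactor {cofactor:g}: {row}")

# At x == cofactor the transform equals asinh(1), about 0.881,
# whichever cofactor is used.
at_cofactor = asinh_scaled(150.0, 150.0)
```

A raw value of 5 sits at the linear/log crossover with a cofactor of 5 but is deep in the linear region with a cofactor of 150.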

SamGG commented

I agree. For classical flow cytometry, 150 is a good value, though it sometimes needs to be lowered to 50 or raised to 500. Pay attention to the lower bound of the data: I have occasionally noticed truncation or a floor effect.
For CyTOF, my colleagues use the Cytobank default value, and they are happy with it.
Bendall 2011, isn't it?

Oops, yes, 2011. Missing a 1.

I just want to chime in to say:

(1) Thanks to both Zach and Sam for this great interchange.

(2) It's always nice to see supplemental figures from 4 years ago still getting some mileage :)

(3) Sam, you're absolutely right that these hard-coded arcsinh cofactors are not always perfect -- they're really a compromise for the sake of simplicity and reproducibility. The defaults on Cytobank are 150 for fluorescence and 5 for CyTOF. For particularly sensitive or insensitive fluorescence parameters, the cofactors should be adjusted (on our LSR-II, APC-Cy7 often benefits from a different cofactor). The range of optimal arcsinh cofactors for CyTOF is narrower -- probably between 1 and 10. The trouble is, it's one more knob that users can fiddle with, which can get people into trouble, so most of the time we just leave it at the defaults. I don't know of a principled way to assign correct arcsinh cofactors on a channel-by-channel and experiment-by-experiment basis (in the same way we compensate fluorescence data, for example), but if there is one, I'd love to hear it!

ES
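One way to keep the defaults while still allowing the occasional per-channel adjustment is a lookup with overrides. The Python sketch below is only an illustration: the defaults (150 for fluorescence, 5 for CyTOF) are the ones mentioned above, but the function names, the FITC channel, the event values, and the override value of 300 are invented placeholders, not recommendations.

```python
import math

# Defaults as stated in the thread; everything else below is hypothetical.
DEFAULT_COFACTOR = {"fluorescence": 150.0, "cytof": 5.0}

def cofactor_for(channel, technology, overrides=None):
    """Return the per-channel override if present, else the technology default."""
    overrides = overrides or {}
    return overrides.get(channel, DEFAULT_COFACTOR[technology])

def transform_event(event, technology, overrides=None):
    """Apply asinh(x / cofactor) to every channel of a single event."""
    return {
        channel: math.asinh(value / cofactor_for(channel, technology, overrides))
        for channel, value in event.items()
    }

overrides = {"APC-Cy7": 300.0}                # placeholder value, not a recommendation
event = {"FITC": 1200.0, "APC-Cy7": 1200.0}   # made-up intensities
transformed = transform_event(event, "fluorescence", overrides)
```

Keeping overrides in one explicit mapping at least makes the "extra knob" visible and reproducible, though it does not answer the question of how to choose the values in a principled way.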

SamGG commented

I agree that too many knobs might distract the practitioner from the goal. But it's good to offer them to advanced users and to let them understand that there are many underlying decisions. Three answers come to mind (links below), but I haven't investigated them deeply. I don't use them because I prefer to adjust the asinh coefficient of the transformation and scale the result. While FCSTrans is certainly nice, using the default of the flowCore asinh transformation is not fair at all IMHO.

https://www.bioconductor.org/packages/release/bioc/html/flowTrans.html
https://www.bioconductor.org/packages/release/bioc/html/flowVS.html
http://onlinelibrary.wiley.com/doi/10.1002/cyto.a.22037/abstract

Thanks to all of you.