ropensci/unconf18

"Safety Profiler" for User package libraries

Opened this issue · 13 comments

We embark on unconf 2018 at an appropriate time: R 3.5.0 launches and lots of folks are feeling the pain of the x.y package upgrade/sidegrade process.

We could build a profiler/auditor, much like the emergent node audit, which would let folks know how far behind their installed packages are and what potential safety issues they may be facing as a result.

This could take a bit of work and would also require delving into packages that bundle C[++] libraries.

One possibility is to build this as a wrapper/plugin for goodpractice. At last year's unconference Gabor created a plug-in structure, and Hannah Frick has been working on extending and documenting it recently. We're likely going to have several plug-in checks specific to rOpenSci. A set of security-specific checks would be a great addition to those.

elinw commented

I think it's also good to just articulate what kinds of things this can focus on. For example: what kinds of user input are being taken, is that input being sanitized, and when does that matter and when does it not?
Somewhere the deidentification issue should come in too.
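
To make the sanitization question concrete, here is a minimal sketch of the difference between pasting a user-supplied value straight into a shell command and quoting it first. The `jags` command name and the `run_file` value are illustrative assumptions, and nothing is actually executed.

```r
# Hypothetical user-supplied value (an assumption for illustration only).
run_file <- "model.txt; rm -rf ~"

# Unsanitized: a shell would read this string as two separate commands.
unsafe_cmd <- paste("jags ", run_file, sep = "")

# Quoted: shQuote() wraps the value so a Bourne-style shell sees one argument.
safe_cmd <- paste("jags", shQuote(run_file, type = "sh"))

# Only print the strings; neither command is run here.
print(unsafe_cmd)
print(safe_cmd)
```

Whether quoting matters depends on where the value comes from: a hard-coded constant is low risk, while anything read from a file, URL, or function argument probably deserves a check.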

defender and middlechild packages started

This is exciting. I will definitely use defender.

@czeildi did a ++gd job and has built a great foundation to expand upon, too. def let us know what kinds of checks you'd like @juyeongkim

I'd love to run defender on the dependencies of the package I am checking. defender would look at the DESCRIPTION file, download the package sources from CRAN or GitHub, and run the system call check on them.

However, this might not actually be feasible if we want to check the dependencies of the dependencies. Maybe we can run a web service that automatically checks every package on CRAN and provides an API to get summarized results.
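
A rough sketch of the dependency walk described above, not defender's actual implementation: it reads the dependency fields from a DESCRIPTION file with `read.dcf()`, then follows dependencies of dependencies through a lookup table. The DESCRIPTION contents, the `dep_table` entries, and the helper names `direct_deps` and `all_deps` are all made up for illustration; a real version would need CRAN metadata instead of the hard-coded table.

```r
# Hypothetical DESCRIPTION contents, written to a temp file for illustration.
desc_path <- tempfile()
writeLines(c(
  "Package: examplepkg",
  "Version: 0.1.0",
  "Depends: R (>= 3.5.0), methods",
  "Imports: httr (>= 1.3.0), jsonlite,",
  "    xml2"
), desc_path)

# Pull package names out of the dependency fields of a DESCRIPTION file.
direct_deps <- function(path, fields = c("Depends", "Imports", "LinkingTo")) {
  dcf <- read.dcf(path, fields = fields)
  raw <- unlist(strsplit(dcf[!is.na(dcf)], ","))
  # Drop version constraints such as "(>= 1.3.0)" and surrounding whitespace.
  pkgs <- trimws(sub("\\(.*\\)", "", raw))
  setdiff(unique(pkgs[nzchar(pkgs)]), "R")
}

# Made-up dependency table standing in for real CRAN metadata (assumption).
dep_table <- list(
  methods = character(0), httr = c("curl", "jsonlite", "R6"),
  jsonlite = character(0), xml2 = character(0),
  curl = character(0), R6 = character(0)
)

# Breadth-first walk that collects dependencies of dependencies.
all_deps <- function(pkgs, deps_of) {
  seen <- character(0)
  queue <- pkgs
  while (length(queue) > 0) {
    pkg <- queue[1]
    queue <- queue[-1]
    if (pkg %in% seen) next
    seen <- c(seen, pkg)
    queue <- c(queue, setdiff(deps_of[[pkg]], seen))
  }
  seen
}

deps <- direct_deps(desc_path)
print(all_deps(deps, dep_table))
```

The walk itself is cheap. The cost that motivates a web service is downloading and scanning the sources of every package it returns.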

Wow. I can actually do that, too :-) Need to ponder some things for that and now also figure out how we want to standardize defender output to accommodate that. This is a great idea @juyeongkim !

I am now slightly terrified of the command I'm abt to run on my home CRAN mirror to see how many pkgs run system calls

😮 😱

So this is totally doable and we'll adapt defender to it.

I've got a lighter-weight checker running against home CRAN now.

In the first 100 files we've already got:

system(paste(".", pathtoms, "ms 10", formatC(numsim*numloc,digit=7), " -t tbs -r tbs tbs < const |", ".", pathtoms, "sample_stats > afr-const.txt", sep=""))
system(paste(".", pathtoms, "ms 10", formatC(numsim*numloc,digit=7), " -t tbs -r tbs tbs -eN tbs tbs < exp | ", ".", pathtoms, "sample_stats > afr-exp.txt", sep=""))
system(paste(".", pathtoms, "ms 10", formatC(numsim*numloc,digit=7), " -t tbs -r tbs tbs -eN tbs tbs -eN tbs tbs < bott | ", ".", pathtoms, "sample_stats > afr-bott.txt", sep=""))
system(paste("jags ",run.file,sep=""))
system("cmd /c ver",intern=TRUE,invisible=TRUE)
system("uname -rps",intern=TRUE)
system("rm Rlisting001.tmp")
system("rm Rlisting001.tmp")
system("rm Rlisting001.tmp")

So there def are system calls :-)
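
The thread does not say how the lighter-weight checker works, so the following is only a guess at one way to do it: parse each R file and report calls to shell-invoking functions using `utils::getParseData()`. The function name `find_system_calls`, the list of watched functions, and the example file are assumptions, not defender's API.

```r
# Report calls to shell-invoking functions in one R source file.
find_system_calls <- function(path,
                              fns = c("system", "system2", "shell", "shell.exec")) {
  empty <- data.frame(file = character(0), line = integer(0),
                      call = character(0), code = character(0),
                      stringsAsFactors = FALSE)
  # keep.source = TRUE is needed so that parse data is recorded.
  pd <- utils::getParseData(parse(path, keep.source = TRUE))
  if (is.null(pd)) return(empty)
  # Only real function calls match; names inside strings or comments do not.
  hits <- pd[pd$token == "SYMBOL_FUNCTION_CALL" & pd$text %in% fns, ]
  if (nrow(hits) == 0) return(empty)
  src <- readLines(path)
  data.frame(file = rep(basename(path), nrow(hits)),
             line = hits$line1,
             call = hits$text,
             code = trimws(src[hits$line1]),
             stringsAsFactors = FALSE)
}

# Hypothetical file to scan, written to a temp location for illustration.
example_file <- tempfile(fileext = ".R")
writeLines(c(
  "x <- 1 + 1",
  "system(\"uname -rps\", intern = TRUE)",
  "msg <- \"system() in a string is not a call\"",
  "system(\"rm Rlisting001.tmp\")"
), example_file)

hits <- find_system_calls(example_file)
print(hits[, c("line", "call", "code")])
```

Working from the parse tree rather than a plain text search avoids false positives from comments and string literals, at the cost of skipping files that fail to parse.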

This connection across people/projects makes me very very happy

@hrbrmstr That's great! Well, maybe not great for security... I'd love to chat about this idea and brainstorm with you and @czeildi. I think this will be a fun summer project. Maybe on slack or conference call?

Hi @stefaniebutland!!!