Building this document

To build this README, run build_readme.R. The talk data is stored in talks_table.csv.
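
For reference, here is a minimal sketch of what such a build script might look like; the actual build_readme.R, and the CSV's column names, may differ.

```r
# Hypothetical sketch of build_readme.R; column names are assumptions.
talks <- read.csv("talks_table.csv", stringsAsFactors = FALSE)

# One README block per talk: "Speaker (Affiliation), Title" + abstract + slides
sections <- sprintf(
  "%s (%s), %s\nAbstract\n\n%s\n\n\nSlides\n",
  talks$speaker, talks$affiliation, talks$title, talks$abstract
)

writeLines(c("Talks", "", "In alphabetical order.", "", sections), "README.md")
```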

Talks

In alphabetical order.

Abhik Seal (Abbvie), Democratizing Natural Language Processing with I2E and R Shiny
Abstract

The primary objective of the presentation is to share insights into democratizing a powerful natural language processing tool, Linguamatics I2E, with open source R and Shiny. The talk will focus on how we can leverage the I2E Python SDK to perform natural language processing and visualize text mining results with R and Shiny. We will present our R Shiny platform, pharmine, and the use cases we developed for mining biomedical data.


Slides

Aedin Culhane (Dana-Farber Cancer Institute), Multi-modal data integration
Abstract

No abstract available.


Slides

Andy Nicholls (GSK), Making Better Decisions
Abstract

In the early phases of clinical development, the future of a compound depends on more than just the result of a hypothesis test on a single endpoint in a single phase 2 study. We think a lot about how design choices affect immediate outcomes. GSK's Quantitative Decision Making (QDM) framework focusses on the question, "How do we design our study in order to increase the chances that it will deliver data that will allow us to decide whether the drug should continue in development, or stop?" The QDM framework has been developed in R and takes advantage of the Biostatistics HPC environment, running thousands of hypothetical scenarios in close to real time. The initiative is changing the way we plan and deliver clinical trials. Thanks to a Shiny front end, statisticians are able to walk clinical teams through key trial design decisions in order to estimate the Probability of Success, a key component in the QDM framework. This presentation will cover the core QDM concepts and present the key communication outputs created to support the process.
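
For intuition only, here is a toy sketch (not GSK's QDM framework): Probability of Success can be estimated by simulating many hypothetical trials under a prior on the treatment effect and counting how often a decision criterion is met.

```r
# Toy Probability of Success estimate; all assumptions here are illustrative.
# Draw a true effect from a prior, simulate one phase 2 trial per draw,
# and count how often the trial meets a simple go criterion.
set.seed(1)
n_sims <- 2000                        # number of hypothetical scenarios
n      <- 60                          # patients per arm
delta  <- rnorm(n_sims, 0.3, 0.2)     # prior on the true effect size
success <- vapply(delta, function(d) {
  trt <- rnorm(n, mean = d)           # treated arm outcomes
  ctl <- rnorm(n, mean = 0)           # control arm outcomes
  t.test(trt, ctl, alternative = "greater")$p.value < 0.025
}, logical(1))
mean(success)                         # estimated Probability of Success
```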


Slides

Becca Krouse (Rho), Building Open Source Tools for Safety Monitoring: Advancing Research Through Community Collaboration
Abstract

The Interactive Safety Graphics (ISG) workstream of the ASA-DIA Biopharm Safety Working Group is excited to introduce the safetyGraphics package: an interactive framework for evaluating clinical trial safety in R using a flexible data pipeline. Our group seeks to modernize clinical trial safety monitoring by building tools for data exploration and reporting in a highly collaborative open source environment. At present, our team includes clinical and technical representatives from the pharmaceutical industry, academia, and the FDA, and additional contributors are always welcome. The current release of the safetyGraphics R package includes graphics related to drug-induced liver injury. The R package is paired with an in-depth clinical workflow for monitoring liver function created by expert clinicians based on medical literature. safetyGraphics features interactive visualizations built using htmlwidgets, a Shiny application, and the ability to export a fully reproducible instance of the charts with associated source code. To ensure quality and accuracy, the package includes more than 300 unit tests, and it has been vetted through a beta testing process that included feedback from more than 20 clinicians and analysts. The Shiny application can easily be extended to include new charts or applied to other disease areas due to its modular design and generalized charting framework. Several companies have adapted the tool for their own use, leading to interesting discussions and paving the way for enhancements, which demonstrates the power of open source and community collaboration.
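
The Shiny application described above can be launched directly; a minimal usage sketch, per the package's documented entry point at the time of the talk:

```r
# install.packages("safetyGraphics")
library(safetyGraphics)
safetyGraphicsApp()  # launches the interactive safety charting app
```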


Slides

Carson Sievert (RStudio), Reproducible shiny apps with shinymeta
Abstract

Shiny makes it easy to take domain logic from an existing R script and wrap some reactive logic around it to produce an interactive webpage where others can quickly explore different variables, parameter values, models/algorithms, etc. Although the interactivity is great for many reasons, once an interesting result is found, it's more difficult to prove the correctness of the result since: (1) the result can only be (easily) reproduced via the Shiny app and (2) the relevant domain logic which produced the result is obscured by Shiny's reactive logic. The R package shinymeta provides tools for capturing and exporting domain logic for execution outside of a Shiny runtime (so that others can reproduce Shiny-based result(s) from a new R session).
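
A minimal sketch of the shinymeta pattern described above; the app structure and names are illustrative, not Carson's example.

```r
library(shiny)
library(shinymeta)

ui <- fluidPage(
  selectInput("var", "Variable", names(mtcars)),
  verbatimTextOutput("summary"),
  downloadButton("code", "Download reproducible R script")
)

server <- function(input, output) {
  # metaReactive() records domain logic so it can be exported later;
  # ..() splices the current input value into the generated code
  var_data <- metaReactive({
    mtcars[[..(input$var)]]
  })
  output$summary <- metaRender(renderPrint, {
    summary(..(var_data()))
  })
  output$code <- downloadHandler(
    filename = "reproduce.R",
    content = function(file) {
      # expandChain() walks the meta-reactive graph and emits standalone code
      writeLines(formatCode(expandChain(output$summary())), file)
    }
  )
}

shinyApp(ui, server)
```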


Slides

Chase Clark (University of Illinois), Your Missing Step in Reproducible R Programming: Continuous Deployment
Abstract

The past few years have shown vast improvements in workflows for reproducible and distributable research within the R ecosystem. At satRday Chicago, everyone in the audience said they used R Markdown; however, only one person raised their hand when asked if they could associate their reports with the code version that generated them. Since continuous integration is quickly becoming commonplace in the R community, continuous deployment (CD) is a logical and easy step to add to your workflow to enhance reproducibility. I will demo associating an R Markdown report with the code version that produced it and automating the build and release of both executable and cloud-based Shiny apps. Finally, an announcement of the electricShine package for creating Electron-based Shiny apps will highlight the power of using CD with production-level Shiny apps.
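
As a generic sketch of the first idea (not Chase's exact demo), a report can stamp itself with the commit that produced it, assuming it is rendered inside a git repository:

```r
# Inside an R Markdown chunk: record the exact code version that
# produced this report (assumes the project is a git repository).
commit <- system("git rev-parse --short HEAD", intern = TRUE)
cat("Report generated from commit:", commit)
```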


Slides

David Cooper (GlaxoSmithKline), Using Machine Learning and Interactive Graphics to Find New Cancer Targets
Abstract

GlaxoSmithKline is searching for new oncology drug targets. We have CRISPR knockout data for many cancer cell lines and many genes. For these same cell lines, we also have genomic data: somatic mutations, copy number variants, and gene expression. We use machine learning (random forests) to find predictive relationships between genomic features and cell line growth under knockout. Then we use GLASSES, a Shiny app, to share the results with biologists. GLASSES lets scientists interactively explore key relationships and discover novel cancer vulnerabilities.
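
A toy sketch of the modeling step on simulated data (not GSK's data or code):

```r
library(randomForest)

# Mock data: rows are cell lines, columns are genomic features, and the
# response is growth under CRISPR knockout of a gene of interest.
set.seed(42)
features <- data.frame(matrix(rnorm(200 * 10), nrow = 200))
names(features) <- paste0("gene_", 1:10)
features$growth <- 0.5 * features$gene_1 + rnorm(200)

fit <- randomForest(growth ~ ., data = features, importance = TRUE)
varImpPlot(fit)  # which features best predict knockout sensitivity
```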


Slides

Doug Kelkhoff (Roche/Genentech), Re-envisioning Clinical Content Delivery in the Open Source World
Abstract

Content delivery in preparation for filing a clinical study report requires robust tooling for quickly and reproducibly compiling analysis of study data. Traditionally, this reproducibility has stemmed from one-time, rigorous validation of a development environment and analytic workflow. More recently, this paradigm has shifted to match modern software development principles, transitioning toward continuous monitoring of software validation and quality.

I'll share our developing perspectives on validation and reproducibility, driven by a need to leverage open source tools. This vision leans on open source software such as R and its package ecosystems, publicly maintained containerized environments like the rocker project, and cross-industry risk assessment via the R Validation Hub. By treating analysis as a software process in the content pipeline transforming raw data into analytic results, we can take advantage of the continuous deployment workflows prevalent in the software development world to shorten our filing timelines while simultaneously delivering a more reproducible product to our health authority partners.


Slides

Elena Rantou (FDA), Using R for Generic Drug Evaluation and SABE R-package for Assessing Bioequivalence of Topical Dermatological Products
Abstract

Determination of bioequivalence (BE), a crucial part of the evaluation of generic drugs, may depend on clinical endpoint studies, pharmacokinetic (PK) studies of bioavailability, and in vitro tests, among others. Additionally, in reviewing Abbreviated New Drug Applications (ANDAs), FDA reviewers often analyze safety studies and perform various kinds of simulations. A growing, vibrant group of statisticians in the Office of Biostatistics, CDER/FDA has adopted R both for routine tasks and to address numerous scientific questions that are received in the form of internal consults. During the past 5 years, we have used R to run power simulations; generate the distribution of certain statistics of interest; assess the similarity of, and cluster, amino-acid sequences, as well as derive the distribution of the molecular weight of such sequences of a certain length; and determine the validity of data sets categorized for genotoxicity. The R package SABE was developed to accompany a new statistical test used to assess BE of topical dermatological products when data for evaluation come from the In-Vitro Permeation Test (IVPT) [1]. BE tests consider comparisons between a Test (usually generic) and a Reference (RLD) product under a replicate study design. A function that assesses BE of a Test and a Reference formulation uses a mixed scaled criterion for the PK metrics AUC (area under the curve) and Cmax (maximum concentration).
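
For intuition, here is a minimal sketch of a linearized scaled-average-bioequivalence criterion, a common formulation in the literature; the SABE package's actual interface may differ.

```r
# Linearized scaled ABE criterion (illustrative formulation, not the SABE
# package's API): BE is supported when the criterion's upper confidence
# bound is <= 0, i.e. the BE limits widen with reference variability.
sabe_criterion <- function(mu_T, mu_R, sigma_WR,
                           theta = (log(1.25) / 0.25)^2) {
  (mu_T - mu_R)^2 - theta * sigma_WR^2
}

# Example: log-scale means for Test and Reference, high reference variability
sabe_criterion(mu_T = log(100), mu_R = log(95), sigma_WR = 0.35)
```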


Slides

Ellis Hughes (Fred Hutch), Validation Framework for Assay Processing Pipelines
Abstract

In this talk I will discuss the steps that have been created for validating internally generated R packages at SCHARP (Statistical Center for HIV/AIDS Research and Prevention) and the lessons learned while creating packages as a team.

Housed within Fred Hutch, SCHARP is an instrumental partner in the research and clinical trials surrounding HIV prevention and vaccine development. Part of SCHARP's work involves analyzing experimental biomarkers and endpoints, which change as the experimental question, analysis methods, antigens measured, and assays evolve. Maintaining a validated code base that is rigid in its output format, but flexible enough to cater to a variety of inputs with minimal custom coding, has proven to be important for reproducibility and scalability.

SCHARP has developed several key steps in the creation, validation, and documentation of R packages that take advantage of R's packaging functionality. First, the programming team works with leadership to define specifications and lay out a roadmap of the package at the functional level. Next, statistical programmers work together and approach the task from a software development view. Once the code has been developed, the package is validated according to procedures that comply with 21 CFR part 11, and leverage software development life cycle (SDLC) methodology. Finally, the package is made available for use across the team on live data. These procedures set up a framework for validating assay processing packages that furthers the ability of Fred Hutch to provide world-class support for our clinical trials.
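
A minimal sketch of the kind of unit test used in such validation; the function under test is a hypothetical stand-in, not SCHARP code.

```r
library(testthat)

# Hypothetical function under test: geometric mean titer per participant
summarize_titers <- function(df) {
  aggregate(titer ~ id, df, function(x) exp(mean(log(x))))
}

test_that("geometric mean titer matches the specification", {
  input <- data.frame(id = c(1, 1, 2), titer = c(10, 40, 20))
  out <- summarize_titers(input)
  expect_equal(out$titer[out$id == 1], sqrt(10 * 40))
})
```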


Slides

Eric Nantz (Eli Lilly), Creating and reviving Shiny apps with {golem}
Abstract

Developing Shiny applications that meet design goals, deploy easily to multiple platforms, and contain easily maintainable components (all while adhering to best practices) is typically a difficult endeavor. Until recently, there has not been a tool addressing the optimal development workflow and construction of Shiny apps. The {golem} package by ThinkR offers an opinionated framework for creating a Shiny app as a package, with {usethis}-like functionality to add a diverse set of capabilities. In this presentation, I will share how {golem} enables a robust standard for Shiny development and how it magically brought a dormant application back to life.
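
A minimal sketch of the {golem} workflow; the app and module names are illustrative.

```r
library(golem)

# Scaffold a new Shiny app structured as an R package
create_golem("myapp", open = FALSE)

# From inside the new package directory:
setwd("myapp")
golem::add_module(name = "explore")   # adds R/mod_explore.R, a module skeleton
golem::use_recommended_tests()        # sets up testing infrastructure
golem::add_dockerfile()               # generates a Dockerfile for deployment
```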


Slides

Jeannine Fisher (Metrum), Leveraging multiple R tools to make effective pediatric dosing decisions
Abstract

R Shiny apps allow for dynamic, interactive, real-time integration of knowledge within a drug-development program to support decision making. Here, an R Shiny app was used to explore the pharmacokinetic and pharmacodynamic effects of different dosing regimens of the anti-IL-17 human mAb Cosentyx® (secukinumab) in pediatric patients. Secukinumab has been studied and approved to treat psoriasis in adult patients. Models which describe the dose-exposure-response relationships in adults (Lee et al., Clin Pharmacol Ther, 2019 and FDA, Medical Reviews BLA 125504, 2015) were used with the mrgsolve simulation package to explore these relationships in pediatric patients. The prior adult knowledge, used in conjunction with the computational infrastructure leveraged through R, the Shiny app, mrgsolve, and Rcpp, allows researchers to explore various dosing regimens in a difficult-to-study patient population. The tools and approaches described here have been routinely used to support regulatory interactions (e.g., PIP) involving pediatric dosing.
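
A generic mrgsolve sketch using its built-in model library (not the secukinumab models from the talk):

```r
library(mrgsolve)

# Load the 1-compartment PK model shipped with mrgsolve's model library
mod <- mread("pk1", modlib())

# Hypothetical repeat-dose regimen: 150 units every 28 days, 6 doses total
dosing <- ev(amt = 150, ii = 28, addl = 5)

out <- mrgsim(mod, events = dosing, end = 180)
plot(out)  # simulated concentration-time profile
```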


Slides

Keaven Anderson (Merck), Teaching an old dog new tricks: modernizing gsDesign
Abstract

The gsDesign package for group sequential design is widely used, with more than 30,000 downloads. The package was originally written in 2007, with substantial documentation and RUnit testing created before 2010. In about 2015, a Shiny interface was created to make the package more approachable. Recent efforts have focused on updating the package to use roxygen2, pkgdown, covr/covrpage, and testthat, as well as changing vignettes from Sweave to R Markdown. The learning curve for this modernization will be discussed, as well as usage in a regulated environment.
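
For flavor, a small example of the kind of design gsDesign produces:

```r
library(gsDesign)

# A 3-analysis group sequential design with non-binding futility bounds
# and Lan-DeMets O'Brien-Fleming-like efficacy spending
x <- gsDesign(k = 3, test.type = 4, alpha = 0.025, beta = 0.1, sfu = sfLDOF)

gsBoundSummary(x)  # tabulate the efficacy and futility boundaries
plot(x)            # boundary plot
```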


Slides

Kelly O'Briant (RStudio), Shiny in Production: Building bridges from data science to IT
Abstract

We know that adopting documentation, testing, and version control mechanisms is important for creating a culture of reproducibility in data science. But once you've embraced some basic development best practices, what comes next? What does it take to feel confident that our data products will make it to production? This talk will cover case studies in how I work with R users at various organizations to bridge the gaps that form between development and production. I'll cover reasons why CI/CD tools can enhance reproducibility for R and data science, showcase practical examples like automated testing and push-based application deployment, and point to simple resources for getting started with these tools in a number of different environments.
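
A minimal sketch of push-based deployment from a CI job, shown here for shinyapps.io; the account details are placeholders, typically injected via CI secrets.

```r
library(rsconnect)

# Register the deployment account using credentials from CI secrets
setAccountInfo(name   = "myaccount",
               token  = Sys.getenv("SHINYAPPS_TOKEN"),
               secret = Sys.getenv("SHINYAPPS_SECRET"))

# Push the app in the current directory to the hosting service
deployApp(appDir = ".", appName = "my-shiny-app")
```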


Slides

Kevin Snyder (FDA), Interactive Visualization of Standardized CDISC-SEND-Formatted Toxicology Study Data Using R Shiny
Abstract

The standardization of nonclinical study data by the Clinical Data Interchange Standards Consortium (CDISC) via the Standard for Exchange of Nonclinical Data (SEND) has created an opportunity for the collaborative development and use of open source software solutions to analyze and visualize toxicology study data. Shiny is an open source R package that facilitates the development of user-friendly, web-based applications. The Pharmaceutical Users Software Exchange (PhUSE) consortium has provided a platform for stakeholders throughout the pharmaceutical industry to collaboratively build and share tools, e.g., R Shiny applications, to enhance the effectiveness and efficiency of drug development. The modeling of standard repeat-dose toxicology study endpoints, e.g., body weights, clinical signs, clinical pathology, histopathology, toxicokinetics, etc., in SEND has created new opportunities for dynamic, interactive visualization of study data above and beyond the tables and figures typically included in static study reports. For example, clinical pathology data from nonclinical toxicology studies can be difficult to digest when presented as group means in data tables, due to the large number of potentially correlated analytes collected across treatment groups, sexes, and potentially multiple timepoints. An R Shiny application has been developed to allow end users to comprehensively examine these datasets, using a variety of analytical and visualization methods, with relative ease. The application is publicly hosted on shinyapps.io, and the source code can be found on the PhUSE GitHub website.


Slides

Leon Eyrich Jessen (Technical University of Denmark), Tidysq for Working with Biological Sequence Data in ML Driven Epitope Prediction in Cancer Immunotherapy
Abstract

We are amidst a data revolution. In just the past 5 years, the cost of sequencing a human genome has gone down approximately 10-fold. This development moves equally fast within areas such as mass spectrometry and in vitro immuno-peptide screening, among others. This facilitates the search for biomarkers, biologics, therapeutics, etc., but also redefines the requirements for storing, accessing, and working with data, and the skillset of bio data scientists. In this talk I will present tidysq, an R package aiming to extend the Tidyverse framework to include (tidy) bio data science / bioinformatics. tidysq will be presented in the context of the current status of ML-driven (neo)epitope prediction within cancer immunotherapy.


Slides

Marianna Foos (Bluebird Bio), Breaking the Speed Limit: How R Gets Faster
Abstract

No abstract available.


Slides

Mirjam Trame (Novartis), nlmixr: an R package for population PKPD modeling
Abstract

nlmixr is a free and open source R package for fitting nonlinear pharmacokinetic (PK), pharmacodynamic (PD), joint PK/PD and quantitative systems pharmacology (QSP) mixed-effects models. Currently, nlmixr is capable of fitting both traditional compartmental PK models as well as more complex models implemented using ordinary differential equations (ODEs). It is under intensive development and has succeeded in attracting extensive attention and a willingness to make contributions from the pharmaceutical modeling community. We believe that, over time, it will become a capable, credible alternative to commercial software tools, such as NONMEM, Monolix, and Phoenix NLME.
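
For flavor, here is a minimal one-compartment model in nlmixr's function-based syntax, close to the package's documented theophylline example; parameter values are illustrative.

```r
library(nlmixr)

one.compartment <- function() {
  ini({
    tka <- 0.45          # log Ka
    tcl <- 1             # log CL
    tv  <- 3.45          # log V
    eta.ka ~ 0.6         # between-subject variability terms
    eta.cl ~ 0.3
    eta.v  ~ 0.1
    add.sd <- 0.7        # additive residual error
  })
  model({
    ka <- exp(tka + eta.ka)
    cl <- exp(tcl + eta.cl)
    v  <- exp(tv + eta.v)
    d/dt(depot)  <- -ka * depot                      # ODE-based model
    d/dt(center) <-  ka * depot - cl / v * center
    cp <- center / v
    cp ~ add(add.sd)
  })
}

# theo_sd is the theophylline example data set shipped with nlmixr
fit <- nlmixr(one.compartment, theo_sd, est = "saem")
```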


Slides

Paul Schuette (FDA), Simulations and Complex Innovative Trial Designs
Abstract

No abstract available.


Slides

Will Landau (Eli Lilly), Machine learning workflow management with drake
Abstract

Machine learning workflows can be difficult to manage. A single round of computation can take several hours to complete, and routine updates to the code and data tend to invalidate hard-earned results. You can enhance the maintainability, hygiene, speed, scale, and reproducibility of such projects with the drake R package. drake resolves the dependency structure of your analysis pipeline, skips tasks that are already up to date, executes the rest with optional distributed computing, and organizes the output so you rarely have to think about data files. This talk demonstrates a deep learning project with drake-powered automation.
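
As a small illustration of the pattern (a generic sketch; fit_model() and summarize_model() are hypothetical stand-ins for project code):

```r
library(drake)

# Hypothetical stand-ins for project-specific code
fit_model       <- function(data) lm(mpg ~ wt, data = data)
summarize_model <- function(model) summary(model)$coefficients

# Declare targets in a plan; drake resolves their dependency structure
plan <- drake_plan(
  raw     = mtcars,
  model   = fit_model(raw),
  results = summarize_model(model)
)

make(plan)  # first call builds everything
make(plan)  # second call: all targets up to date, nothing reruns
```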


Slides