
Hypothesis testing for topological data analysis via the smooth Euler characteristic transform

Primary language: R. License: GNU General Public License v3.0 (GPL-3.0).

NHST-SECT

Shape-valued data are of interest in applied sciences. While some statistical estimators have been developed for analyzing shape-valued observations, current techniques for performing statistical inference on such observations are limited. In this paper, we introduce a smooth Euler characteristic transform-based randomization-style null hypothesis test to distinguish different collections of shapes. Our proposed method has a solid mathematical foundation and is computationally efficient. Simulation studies indicate that the performance of our hypothesis testing framework is satisfactory. The mandibular molars from four genera of primates are analyzed to demonstrate the applications of our proposed hypothesis testing approach to real data sets. Our proposed approach is expected to be applied to phylogenetics and bioimage informatics in future research.
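To give a feel for what a randomization-style test on curve-valued summaries looks like, here is a minimal sketch in base R. It is a generic two-sample permutation test, not the algorithm from the manuscript: the helper names (`smooth_ec_curve`, `mean_curve_distance`, `permutation_test`), the choice of test statistic, and the simulated data are all assumptions made for illustration. The smoothing step assumes an Euler characteristic curve sampled on an evenly spaced grid, which is centered and then integrated cumulatively.

```r
# Generic two-sample permutation test on discretized curves (illustration only).
# Rows are shapes, columns are grid points; all data below are simulated.
set.seed(1)

# Hypothetical helper: center an Euler characteristic curve sampled on an
# evenly spaced grid and integrate it cumulatively to obtain a smooth curve.
smooth_ec_curve <- function(ec, step) {
  cumsum(ec - mean(ec)) * step
}

# Assumed test statistic: squared L2 distance between the group mean curves.
mean_curve_distance <- function(curves, labels) {
  m1 <- colMeans(curves[labels == 1, , drop = FALSE])
  m2 <- colMeans(curves[labels == 2, , drop = FALSE])
  sum((m1 - m2)^2)
}

permutation_test <- function(curves, labels, n_perm = 999) {
  observed <- mean_curve_distance(curves, labels)
  # Recompute the statistic under random reassignments of the group labels.
  permuted <- replicate(n_perm, mean_curve_distance(curves, sample(labels)))
  # The add-one correction keeps the p-value strictly positive.
  p_value <- (1 + sum(permuted >= observed)) / (n_perm + 1)
  list(statistic = observed, p_value = p_value)
}

# Simulated integer-valued curves: group 2 follows a different profile.
n_per_group <- 20
n_grid <- 50
step <- 1 / n_grid
grid <- seq_len(n_grid) * step
lambda_1 <- rep(3, n_grid)
lambda_2 <- 3 + 2 * sin(2 * pi * grid)
raw <- rbind(
  t(replicate(n_per_group, rpois(n_grid, lambda = lambda_1))),
  t(replicate(n_per_group, rpois(n_grid, lambda = lambda_2)))
)
labels <- rep(1:2, each = n_per_group)

# Smooth each row, then test whether the two collections differ.
curves <- t(apply(raw, 1, smooth_ec_curve, step = step))
result <- permutation_test(curves, labels)
print(result$p_value)
```

For the actual test procedure and its theoretical justification, refer to the manuscript and the code in the three parts of this repository.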

The results in the manuscript were produced with the R version of the code. Following the structure of the manuscript, this repository is divided into three parts:

  1. Simulation Study (Section 3 of the manuscript)
  2. Silhouette Database (Section 4.1 of the manuscript)
  3. Mandibular Molars (Section 4.2 of the manuscript)

The R Environment

The code for this manuscript was run in R version 4.2 and is compatible with versions 3.6 through 4.2. R is a widely used, free, and open-source software environment for statistical computing and graphics. The most recent version of R can be downloaded from the Comprehensive R Archive Network (CRAN). For details on how to compile, install, and manage R and R packages, refer to the manual R Installation and Administration.
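Before running the code, it can help to confirm that the local R installation falls in the 3.6 to 4.2 range stated above. The snippet below is a small sketch using only base R; the function name `version_in_tested_range` is a hypothetical helper, not part of this repository.

```r
# Hypothetical helper: check a major.minor R version against the range
# stated in this README (3.6 through 4.2).
version_in_tested_range <- function(v) {
  v >= "3.6" && v <= "4.2"
}

# Build the major.minor version of the running R session, dropping the patch level.
minor <- strsplit(R.version$minor, ".", fixed = TRUE)[[1]][1]
current <- numeric_version(paste(R.version$major, minor, sep = "."))

if (version_in_tested_range(current)) {
  message("R ", current, " is within the stated range.")
} else {
  message("R ", current, " is outside the stated range; results may differ.")
}
```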

For the R packages required by the different algorithms, please refer to the README.md file in each section (i.e., Simulation Study, Silhouette Database, and Mandibular Molars).

Data Availability and Code Usage

Further details and required information for this manuscript are provided in the README.md file in each section. Most of the important code for the manuscript is stored as R Markdown files. R Markdown is a file format for making dynamic documents with R. An R Markdown document is written in Markdown (an easy-to-write plain-text format) and contains chunks of embedded R code. More information about R Markdown can be found in the following links.
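As a rough guide, an R Markdown file can be rendered from the R console with `rmarkdown::render()`. In the sketch below the file name `analysis.Rmd` is a placeholder, not an actual file in this repository; substitute the `.Rmd` file from the section of interest. The call is guarded so that the snippet does nothing harmful if the `rmarkdown` package or the file is missing.

```r
# Placeholder file name (an assumption); replace with a real .Rmd file
# from the relevant section folder.
rmd_file <- "analysis.Rmd"

if (!requireNamespace("rmarkdown", quietly = TRUE)) {
  message("Install the 'rmarkdown' package first: install.packages('rmarkdown')")
} else if (!file.exists(rmd_file)) {
  message("File not found: ", rmd_file)
} else {
  # Runs every code chunk and writes the output format declared in the file header.
  rmarkdown::render(rmd_file)
}
```

Rendering reruns all embedded code chunks, so a file with long simulations may take a while to complete.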

Relevant Citations

Questions and Feedback

Should you have any questions, please feel free to contact Jinyu Wang (jinyu_wang@brown.edu). We appreciate any feedback you may have on our repository and instructions.