tdaunif: Uniform manifold samplers for topological data analysis

This R package contains sampling functions for topological manifolds embedded (or immersed) in Euclidean space.

motivation

consolidation

Statistical topology and topological data analysis include a variety of methods for detecting topological structure from point cloud data sets, most famously persistent homology. (See Weinberger (2011) for a brief introduction.) These methods are often validated by applying them to point clouds sampled from spaces with known topology. Functions that generate such samples are therefore valuable to developers of topological–statistical software.

Such samplers have not been easy to find in the R package ecosystem. Handfuls exist in several different packages, but no single package or code repository has assembled a large collection for convenient use. That is the goal of this package.

uniformity

The simplest validations use points sampled uniformly from the surfaces of manifolds embedded in Euclidean space. Uniformity here is with respect to the surface area of the embedded manifold, but such uniform sampling is not trivial to do when the manifold can only be expressed via parameterization from a simpler parameter space, as when a square is rolled and bent into a torus. Uniform sampling on the parameter space may then produce uneven sampling on the manifold, regions of greater compression being oversampled relative to regions of greater expansion. One especially important motivation for this package is therefore to include uniform samplers for popular embeddings of topological manifolds, including functions that allow users to build their own with minimal effort (see help(manifold, package = "tdaunif")).
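The distortion described above can be seen in a small base R sketch for the torus. This is an illustration only, not the tdaunif implementation: the helper `sample_torus_sketch()`, its arguments, and the radii `R` and `r` are hypothetical names chosen for this example. It corrects the bias by rejection sampling, using the fact that the area element of the standard torus parameterization is proportional to `R + r * cos(theta)`.

```r
# Hypothetical helper (not part of tdaunif): sample n points from a torus
# with major radius R and minor radius r, using base R only.
sample_torus_sketch <- function(n, R = 2, r = 1, uniform = TRUE) {
  stopifnot(R > r, r > 0)
  theta <- numeric(0)
  while (length(theta) < n) {
    # candidate angles around the tube, uniform on the parameter space
    cand <- runif(n, 0, 2 * pi)
    if (uniform) {
      # accept with probability proportional to the area element,
      # which is R + r * cos(theta) up to a constant
      keep <- runif(n) < (R + r * cos(cand)) / (R + r)
      cand <- cand[keep]
    }
    theta <- c(theta, cand)
  }
  theta <- theta[seq_len(n)]
  # the angle around the central axis needs no correction
  phi <- runif(n, 0, 2 * pi)
  cbind(
    x = (R + r * cos(theta)) * cos(phi),
    y = (R + r * cos(theta)) * sin(phi),
    z = r * sin(theta)
  )
}

set.seed(1)
naive <- sample_torus_sketch(2000, uniform = FALSE)
unif <- sample_torus_sketch(2000, uniform = TRUE)

# fraction of points on the outer half of the tube (distance from axis > R)
outer_frac <- function(p, R = 2) mean(sqrt(p[, "x"]^2 + p[, "y"]^2) > R)
c(naive = outer_frac(naive), uniform = outer_frac(unif))
```

In expectation the naive sample places half of its points on the outer half of the tube, whereas the outer half accounts for a fraction 1/2 + r/(pi R) of the surface area (about 0.66 for the radii used here), so the naive sample oversamples the compressed inner region.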

usage

installation

tdaunif is not yet on CRAN, but you can install it from GitHub using the remotes package:

remotes::install_github("corybrunson/tdaunif")

illustration

An intuitive embedding of the Klein bottle into 4-space is adapted from the popular “donut” embedding of the torus in 3-space. In tdaunif this is called the “tube” parameterization. We can sample uniformly from this embedding (without noise) and examine the coordinate projections as follows:

set.seed(0)
x <- sample_klein_tube(n = 360, ar = 3, sd = 0)
pairs(x, asp = 1, pch = 19, cex = .5)

Compare these neat projections to those of the same sample with noise added in the ambient Euclidean space:

x <- add_noise(x, sd = .2)
pairs(x, asp = 1, pch = 19, cex = .5)

For a thorough discussion of this sampler, see my blog post describing the method presented in Diaconis, Holmes, and Shahshahani (2013).

another illustration

Some samplers support stratified sampling, which relies on an analytic solution to the problem of length or area distortion created by most parameterizations. For example, the planar triangle sampler can be stratified to produce a more evenly spaced arrangement of points than a truly uniform random sample would produce:

equilateral <- cbind(c(0,0), c(0.5,sqrt(3)/2), c(1,0))
x <- sample_planar_triangle(n = 720, triangle = equilateral)
y <- sample_planar_triangle(n = 720, triangle = equilateral, bins = 24)
par(mfrow = c(1L, 2L))
plot(x, asp = 1, pch = 19, cex = .5)
plot(y, asp = 1, pch = 19, cex = .5)

par(mfrow = c(1L, 1L))

See Arvo (1995) and Arvo’s notes from Siggraph 2001 for a detailed treatment of this technique.
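The idea behind stratification can be sketched in base R on the unit square, where the parameter space and the target coincide and no distortion correction is needed. The helper `sample_square_stratified()` and its arguments `n` and `k` are hypothetical and are not part of tdaunif, whose `bins` argument may be implemented differently. Carrying such a sample over to a triangle or another manifold additionally requires an area-preserving map, which is the subject of the references above.

```r
# Hypothetical helper (not part of tdaunif): stratified uniform sample of the
# unit square using a k-by-k grid with n / k^2 points per cell.
sample_square_stratified <- function(n, k) {
  stopifnot(n %% k^2 == 0)
  per_cell <- n / k^2
  # lower-left corners of the k^2 cells, each repeated per_cell times
  corners <- expand.grid(i = seq_len(k) - 1, j = seq_len(k) - 1)
  corners <- corners[rep(seq_len(nrow(corners)), each = per_cell), ]
  # jitter each point uniformly within its own cell
  cbind(
    x = (corners$i + runif(n)) / k,
    y = (corners$j + runif(n)) / k
  )
}

set.seed(2)
s <- sample_square_stratified(n = 400, k = 10)

# every grid cell receives the same number of points
counts <- table(floor(s[, "x"] * 10), floor(s[, "y"] * 10))
range(counts)
```

Because each cell contributes the same number of points, the sample cannot form the clumps and gaps that independent uniform draws produce by chance, while each point is still uniformly distributed within its cell.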