SP-Tools are a set of machine learning tools that are optimized for low latency and real-time performance. The tools can be used with Sensory Percussion sensors, ordinary drum triggers, or any audio input.
SP-Tools includes low latency onset detection, onset-based descriptor analysis, classification and clustering, corpus analysis and querying, neural network predictive regression, and a slew of other abstractions that are optimized for drum and percussion sounds.
SP-Tools is built around the FluCoMa Toolkit and requires v1.0 to be installed for this package to work.
Max 8.3 or higher or Live/M4L (Mac/Windows).
FluCoMa v1.0 or higher.
All abstractions work in 64-bit and M1/Universal Binary.
SP-Tools Teaser Video - Performance and Musical Examples
SP-Tools (alpha v0.1) - Initial Overview
SP-Tools (alpha v0.2) - Controllers and Setups
SP-Tools (alpha v0.3) - Filtering, Playback, and Realtime Analysis
SP-Tools (alpha v0.4) - Concatenation and Realtime Filtering
SP-Tools (alpha v0.5) - Grid-Based Matching, Erae Touch, and Max for Live
SP-Tools (alpha v0.6) - Max for Live Walkthrough
SP-Tools (alpha v0.7) - Ramps, Data Processing, Novelty, and Timestretching
Corpus-Based Sampler
Metal by the Foot 1/4
- BREAKING CHANGES - all objects that had a separate control inlet now take those messages in the left-most inlet
- added new "ramp" objects for structural and gestural changes (sp.ramp, sp.ramp~)
- added new "data" objects for transforming, looping, and delaying descriptors (sp.databending, sp.datadelay, sp.datagranular, sp.datalooper~, sp.datatranspose)
- added novelty-based segmentation for determining changes in material type (sp.novelty~)
- added timestretching functionality to sp.corpusplayer~ and the Corpus Match M4L device
- added Max for Live devices (16 total) which cover (nearly) all the functionality of SP-Tools
- Max codebase further commented and tidied
- added Max for Live devices for some of the main/flagship functionality (Concat Match, Controllers, Corpus Match, Descriptors, Speed)
- added sp.gridmatch abstraction for generic controller-based navigation of corpora
- added support for the Erae Touch controller (sp.eraetouch)
- improved path stability when loading example corpora
- added "concat" objects for real-time mosaicking and concatenative synthesis (
sp.concatanalysis~
,sp.concatcreate
,sp.concatmatch
,sp.concatplayer~
,sp.concatsynth~
) - added ability to apply filtering to any descriptor list (via
sp.filter
) - improved filtering to allow for multiple chained criteria (using
and
andor
joiners) - updated/improved pitch and loudness analysis algorithms slightly (you should reanalyze corpora/setups/etc...)
- added ability to filter corpora by descriptors (baked into sp.corpusmatch via filter messages)
- added improved/unified corpus playback with sp.corpusplayer~
- added realtime analysis abstractions (sp.realtimeframe~, sp.descriptorsrt~, sp.melbandsrt~, sp.mfccrt~)
- added new stereo corpus (corpus_plumbutter.json)
- improved corpus analysis to work with stereo files and files shorter than 100ms, as well as adding more comprehensive metadata
- added sp.corpuslist abstraction for visualizing and playing samples in a corpus in list form
- removed old playback abstractions (sp.corpussimpleplayer~, sp.corpusloudnessplayer~, sp.corpusmelbandplayer~)
- added "setups" (corpus scaling and neural network prediction/regression)
- added "controllers" (meta-parameters extracted from onset timings and descriptor analysis)
- added four new abstractions (sp.controllers, sp.speed, sp.setupanalysis, sp.setuptrain~)
- added new corpus (corpus_voice.json)
- added @roundrobin mode to sp.corpusmatch
Depending on your level of familiarity with machine learning, some of these terms may not make much sense, so here is a short glossary to help you get going.
class: a category or label, or "zone" in Sensory Percussion lingo
classification: the process of defining and labelling classes
cluster: a category or label that is determined by a clustering algorithm
concat: to join end-to-end; a type of resynthesis that stitches together fragments
corpus: a pre-analyzed folder of samples
descriptors: analyzed characteristics of a sound
melbands: perceptually-spread frequency bands
MFCCs: a list of numbers that describes complex timbral shapes
onsets: analyzed attacks in audio
regression: interpolating or predicting a new point given training data
sp.classifierdisplay lets you visualize what classes have been matched by sp.classmatch. It can display the typical snare/tom classes and also offers an option to visualize kick classes.
sp.classmatch will find the nearest match in a set of pre-trained classes or clusters. The classes can be the default Sensory Percussion classes, auto-generated cluster names, or any arbitrary labels.
sp.classtrain will create a classifier based on incoming class labels and descriptor analysis. The labels can be the default Sensory Percussion labels or any arbitrary input.
sp.clustertrain will create a classifier based on unlabelled onsets and descriptor analysis, automatically grouping the incoming hits into clusters whose auto-generated names can then be matched with sp.classmatch.
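If the distinction between training classes and training clusters is fuzzy, here is a minimal Python sketch of the two ideas using scikit-learn and made-up descriptor vectors — purely illustrative, not the SP-Tools/FluCoMa internals:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans

# Hypothetical per-onset descriptor vectors: [loudness, centroid, flatness, pitch]
onsets = np.array([
    [-12.0, 2500.0, 0.42, 180.0],   # rim-like hit
    [-11.5, 2600.0, 0.45, 190.0],
    [-20.0,  400.0, 0.10,  80.0],   # center-like hit
    [-19.0,  420.0, 0.12,  82.0],
])

# "classtrain"-style: supervised, we supply the labels ourselves
labels = ["rim", "rim", "center", "center"]
classifier = KNeighborsClassifier(n_neighbors=1).fit(onsets, labels)
print(classifier.predict([[-18.5, 410.0, 0.11, 81.0]]))   # -> ['center']

# "clustertrain"-style: unsupervised, the algorithm invents the cluster IDs
clusters = KMeans(n_clusters=2, n_init=10).fit(onsets)
print(clusters.predict([[-18.5, 410.0, 0.11, 81.0]]))     # -> e.g. [1]
```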
sp.concatanalysis~ is based on sp.descriptorsrt~ but has all of the appropriate settings pre-baked so its output can be sent directly to sp.concatmatch. It also takes some playback controls via its Rate and Random parameters.
sp.concatcreate analyzes all the samples in a folder for a variety of descriptors, timeframes, and metadata, and keeps track of the location of the samples when analyzed.
sp.concatmatch works in conjunction with sp.concatanalysis~ and sp.concatplayer~ to create real-time audio mosaicking via concatenative synthesis with a pre-analyzed concat corpus. sp.concatmatch handles the nearest neighbor matching.
sp.concatplayer~ is the underlying playback engine for sp.concatsynth~. It is a stripped-back granular synth engine that plays back small, windowed fragments of a single large buffer, unlike the corpus-based playback used elsewhere in SP-Tools.
sp.concatsynth~ creates real-time audio mosaicking via concatenative synthesis with a pre-analyzed concat corpus. sp.concatsynth~ handles the audio analysis and nearest neighbor matching in a single object.
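To make "audio mosaicking" concrete: the incoming audio is chopped into short grains, each grain is reduced to a few descriptor numbers, the closest-sounding grain in the pre-analyzed concat corpus is found, and that corpus grain is played back in its place. A toy numpy sketch of that loop, with a deliberately crude two-number descriptor (not what SP-Tools actually analyzes):

```python
import numpy as np

def describe(grain):
    """Toy 2-number descriptor: RMS loudness and zero-crossing rate."""
    rms = np.sqrt(np.mean(grain ** 2) + 1e-12)
    zcr = np.mean(np.abs(np.diff(np.sign(grain)))) / 2.0
    return np.array([rms, zcr])

def mosaick(target, corpus_grains, grain=512, hop=256):
    """Replace each windowed grain of `target` with its nearest corpus grain."""
    corpus_desc = np.array([describe(g) for g in corpus_grains])
    out = np.zeros(len(target))
    window = np.hanning(grain)
    for start in range(0, len(target) - grain, hop):
        d = describe(target[start:start + grain])
        nearest = np.argmin(np.sum((corpus_desc - d) ** 2, axis=1))
        out[start:start + grain] += corpus_grains[nearest][:grain] * window
    return out

# Tiny synthetic demo: three noise grains at different levels as the "corpus"
corpus = [np.random.randn(512) * a for a in (0.05, 0.2, 0.8)]
target = np.random.randn(44100) * 0.5
out = mosaick(target, corpus)
```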
sp.controllers works in conjunction with sp.descriptors~/sp.descriptorframe to create several meta-parameters based on loudness and centroid (brightness).
sp.corpusanalysis works in conjunction with sp.folderloop to analyze all the samples in a folder for a variety of descriptors, timeframes, and metadata to be used in sp.corpusmatch. Keeps track of the location of the samples when analyzed.
sp.corpuscreate analyzes all the samples in a folder for a variety of descriptors, timeframes, and metadata, and keeps track of the location of the samples when analyzed.
sp.corpuslist loads and displays the contents of the polybuffer~ at the center of a corpus, allowing you to view and play the individual samples.
sp.corpusmatch works in conjunction with sp.descriptors~ or sp.descriptorframe to find the nearest match in a pre-analyzed corpus. sp.corpusmatch also houses the required datasets, coll, and polybuffer~.
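Under the hood, this kind of matching is just nearest-neighbour search in descriptor space. A minimal sketch with hypothetical data (the real corpora are FluCoMa datasets and JSON files, not numpy arrays):

```python
import numpy as np
from scipy.spatial import cKDTree

# Pre-analyzed corpus: one descriptor vector per sample
# (e.g. loudness, centroid, flatness, pitch), plus the sample it points at.
corpus_descriptors = np.random.rand(500, 4)          # stand-in for real analysis
corpus_samples = [f"sample_{i:03d}.wav" for i in range(500)]

tree = cKDTree(corpus_descriptors)

def corpus_match(onset_descriptors):
    """Return the corpus entry closest to the incoming onset's descriptors."""
    _, index = tree.query(onset_descriptors)
    return corpus_samples[index]

print(corpus_match(np.array([0.2, 0.7, 0.1, 0.5])))
```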
sp.corpusplayer~ is an all-in-one playback object that allows for mono or stereo playback, optional loudness and spectral compensation, along with various sample playback controls and features.
sp.crossbank~ is a cascade of cross~ filters for spectral compensation. Frequencies are pre-set to adjust the spectrum based on the melband analysis/compensation. It should be used inside a poly~ object.
sp.databending takes incoming descriptor data (descriptors, melbands, or MFCCs) and applies various transformations and "bends". The input can be lists or buffers and the same will be output.
sp.datadelay takes incoming descriptors (of any kind) and sends them through a delay line. The feedback and rolloff parameters function as they would in a conventional delay. The input can be lists or buffers and the same will be output.
sp.datagranular takes incoming descriptor data (descriptors, melbands, or MFCCs) and processes it through a "granular synth"-style process.
sp.datalooper~ takes incoming descriptor data (descriptors, melbands, or MFCCs) and sends it into a looper with somewhat conventional looper controls.
sp.datatranspose takes incoming descriptor data (descriptors, melbands, or MFCCs) and "transposes" it in different ways. The input can be lists or buffers and the same will be output.
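All of the "data" objects share the idea of treating streams of descriptor frames like a signal to be processed. As one example of what that can mean, here is a toy Python sketch of a descriptor delay line with feedback, in the spirit of sp.datadelay (the actual parameter behaviour is defined by the abstraction itself):

```python
from collections import deque

class DataDelay:
    """Toy delay line for descriptor frames (lists of floats), with feedback.

    'rolloff' here just scales the fed-back values, loosely analogous to the
    damping in an audio delay; the real sp.datadelay parameters may differ.
    """
    def __init__(self, delay_frames=8, feedback=0.5, rolloff=0.9):
        self.buffer = deque([None] * delay_frames, maxlen=delay_frames)
        self.feedback = feedback
        self.rolloff = rolloff

    def process(self, frame):
        delayed = self.buffer[0]                 # frame from delay_frames ago
        if delayed is not None:
            frame = [x + self.feedback * self.rolloff * d
                     for x, d in zip(frame, delayed)]
        self.buffer.append(frame)
        return frame

delay = DataDelay()
for n in range(12):
    print(delay.process([n * 0.1, 440.0]))
```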
sp.descriptordisplay plots the incoming realtime descriptors, along with the nearest match on a radar chart for visualizing the differences between the incoming audio and its nearest match.
sp.descriptorframe outputs loudness, centroid, spectral flatness, and pitch along with the derivatives for loudness/centroid/flatness and confidence for pitch.
sp.descriptors~ outputs loudness, centroid, spectral flatness, and pitch along with the derivatives for loudness/centroid/flatness and confidence for pitch.
sp.descriptorsrt~ outputs loudness, centroid, spectral flatness, and pitch along with the derivatives for loudness/centroid/flatness and confidence for pitch, as a continuous realtime stream (see sp.realtimeframe~) rather than per onset.
sp.eraetouch acts as the API parser to/from the Erae Touch as well as creating the LED feedback for multiple zones. You can connect this to sp.gridmatch for corpus-based sample playback or use the XYZ outputs directly for anything else in Max.
sp.filter allows you to selectively send incoming descriptor messages to one of two outlets depending on whether the filtering criteria are met. This allows you to fork processing based on audio characteristics.
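A rough sketch of the forking idea, with a made-up criteria format (the real sp.filter message syntax lives in its helpfile):

```python
import operator

def passes(frame, criteria, joiner="and"):
    """frame: dict of descriptor values; criteria: list of (name, op, value).

    Hypothetical format -- this only illustrates forking on chained conditions.
    """
    ops = {">": operator.gt, "<": operator.lt, "==": operator.eq}
    results = [ops[op](frame[name], value) for name, op, value in criteria]
    return all(results) if joiner == "and" else any(results)

frame = {"loudness": -14.0, "centroid": 3200.0, "pitch": 220.0}
criteria = [("loudness", ">", -20.0), ("centroid", "<", 2000.0)]
outlet = 1 if passes(frame, criteria, joiner="and") else 2
print(outlet)   # -> 2: the centroid condition fails, so the frame goes to the right outlet
```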
sp.folderloop is used in conjunction with sp.corpusanalysis to analyze every sample in a folder for the required descriptors and metadata.
sp.gridmatch finds the nearest match in a corpus based on a grid-ified XY space. You can load/use the same corpora as you would with sp.corpusmatch but instead of matching based on incoming audio descriptors you can match using XY coordinates from a controller or UI object.
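Conceptually this swaps the descriptor input for a 2D position: every corpus entry gets an (x, y) location, and the entry nearest the controller/UI position wins. A minimal sketch with hypothetical data:

```python
import numpy as np

# Hypothetical 2D layout of a corpus: each sample gets an (x, y) position,
# e.g. from two normalized descriptors or a dimensionality reduction.
corpus_xy = np.random.rand(500, 2)
corpus_samples = [f"sample_{i:03d}.wav" for i in range(500)]

def grid_match(x, y):
    """Return the corpus entry nearest to a controller/UI position in 0..1."""
    distances = np.sum((corpus_xy - np.array([x, y])) ** 2, axis=1)
    return corpus_samples[int(np.argmin(distances))]

print(grid_match(0.25, 0.75))
```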
sp.melbandframe outputs 40 melbands which can be used for spectral compensation in corpus-based sample playback.
sp.melbands~ outputs 40 melbands which can be used for spectral compensation in corpus-based sample playback.
sp.melbandsrt~ outputs 40 melbands which can be used for spectral compensation in corpus-based sample playback, as a continuous realtime stream (see sp.realtimeframe~).
sp.mfcc~ outputs 13 MFCCs (skipping the 0th coefficient) which can be used for classification and clustering. Although abstract, they can also be used to control parameters.
sp.mfccframe outputs 13 MFCCs (skipping the 0th coefficient) which can be used for classification and clustering. Although abstract, they can also be used to control parameters.
sp.mfccrt~ outputs 13 MFCCs (skipping the 0th coefficient) as a continuous realtime stream (see sp.realtimeframe~). Although abstract, they can also be used to control parameters.
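If melbands and MFCCs are new to you, this short librosa sketch shows what the numbers are (the analysis settings here are arbitrary and not the ones SP-Tools/FluCoMa use):

```python
import numpy as np
import librosa

sr = 44100
y = 0.5 * np.sin(2 * np.pi * 220.0 * np.arange(sr) / sr)   # 1 s test tone

# 40 perceptually-spaced mel bands (what the melband objects output per frame)
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024,
                                     hop_length=512, n_mels=40)
mel_db = librosa.power_to_db(mel)

# 13 MFCCs, skipping the 0th (overall-level) coefficient
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=14, n_fft=1024, hop_length=512)
mfcc = mfcc[1:]

print(mel_db.shape, mfcc.shape)   # (40, frames) and (13, frames)
```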
sp.novelty~ takes audio input and outputs a bang and a trigger when novelty is detected. The novelty can be computed across different time frames and for different parameters.
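One common way to compute novelty (and a reasonable mental model here, though not necessarily the exact sp.novelty~ algorithm) is Foote's method: build a self-similarity matrix of descriptor frames and slide a checkerboard kernel along its diagonal; peaks mark points where the material changes:

```python
import numpy as np

def novelty_curve(frames, kernel_size=16):
    """Foote-style novelty over descriptor frames of shape (n_frames, n_descriptors)."""
    # cosine self-similarity matrix
    norm = frames / (np.linalg.norm(frames, axis=1, keepdims=True) + 1e-9)
    ssm = norm @ norm.T

    # checkerboard kernel tapered by a 2D Hann window
    half = kernel_size // 2
    sign = np.kron(np.array([[1, -1], [-1, 1]]), np.ones((half, half)))
    kernel = sign * np.outer(np.hanning(kernel_size), np.hanning(kernel_size))

    novelty = np.zeros(len(frames))
    for i in range(half, len(frames) - half):
        patch = ssm[i - half:i + half, i - half:i + half]
        novelty[i] = np.sum(patch * kernel)
    return novelty

# Two blocks of frames pointing in different directions -> novelty peaks at the join
a = np.tile([1.0, 0.1], (60, 1)) + 0.01 * np.random.rand(60, 2)
b = np.tile([0.1, 1.0], (60, 1)) + 0.01 * np.random.rand(60, 2)
print(np.argmax(novelty_curve(np.vstack([a, b]))))   # peaks near frame 60
```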
sp.onset~ takes audio input and outputs a bang, trigger, and a gate when an onset is detected. The sensitivity is adjustable (0-100%) and a threshold can be set as an absolute noise floor (in dB).
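To picture how the two controls interact, here is a deliberately simple detector: an onset is reported when the level rises sharply over the previous frame (how sharply is scaled by the sensitivity) and clears an absolute dB floor (the threshold). This is only an illustration, not the sp.onset~ algorithm:

```python
import numpy as np

def detect_onsets(y, sr, sensitivity=50.0, threshold_db=-40.0,
                  frame=256, min_gap_s=0.05):
    """Toy onset detector: per-frame RMS rise above the previous frame,
    gated by an absolute dB noise floor and a minimum re-trigger gap."""
    rise_db = 18.0 * (1.0 - sensitivity / 100.0) + 1.0   # 1..19 dB rise required
    onsets, last, prev_db = [], -1e9, -120.0
    for start in range(0, len(y) - frame, frame):
        rms = np.sqrt(np.mean(y[start:start + frame] ** 2) + 1e-12)
        db = 20.0 * np.log10(rms)
        t = start / sr
        if db > threshold_db and db - prev_db > rise_db and t - last > min_gap_s:
            onsets.append(t)
            last = t
        prev_db = db
    return onsets

# Four decaying hits, one every 250 ms
sr = 44100
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 200 * t) * np.exp(-40 * (t % 0.25))
print(detect_onsets(y, sr))   # ~[0.0, 0.25, 0.5, 0.75], give or take a frame
```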
sp.onsetframe~ takes audio input, just like sp.onset~ but instead of outputting just a bang/trigger/gate, it outputs the frame to start descriptor analysis. sp.onsetframe~ is useful when you want to analyze multiple descriptors and want them to all refer to the same exact analysis frame.
sp.playbackcore~ is the underlying poly~ that handles the polyphonic sample playback of matched corpus entries. It's not intended to be used on its own, but rather is the core component of sp.corpusplayer~.
sp.plotter is a utility for visualizing corpora and trained classes.
sp.ramp takes onsets as input (as bangs or triggers/gates) and incrementally outputs three versions of a given ramp based on a defined number of events.
sp.ramp~ takes onsets as input (as bangs or triggers/gates) and outputs three versions of a given ramp allowing for sample accurate gestures to be triggered by incoming onsets.
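The core idea: each onset nudges a ramp forward by 1/events, so a gesture unfolds over a fixed number of hits. A toy sketch (the "three versions" shown here — rising, falling, triangle — are a guess at the general idea, not the documented outputs):

```python
class EventRamp:
    """Toy event-counting ramp: each bang advances the ramp by 1/events."""
    def __init__(self, events=8):
        self.events = events
        self.count = 0

    def bang(self):
        self.count = min(self.count + 1, self.events)
        up = self.count / self.events
        down = 1.0 - up
        tri = 1.0 - abs(2.0 * up - 1.0)
        return up, down, tri

ramp = EventRamp(events=4)
for _ in range(4):
    print(ramp.bang())   # (0.25, 0.75, 0.5) ... (1.0, 0.0, 0.0)
```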
sp.realtimeframe~ is the counterpart to sp.onsetframe~: instead of outputting a frame to analyze based on onset detection, sp.realtimeframe~ outputs a constant stream of frame values to analyze, enabling realtime analysis of multiple descriptor types that remain in sync.
sp.setupanalysis is used in conjunction with sp.onsetframe~ to create analyses of multiple descriptors at 256- and 4410-sample analysis windows. This is later used to improve matching with sp.corpusmatch.
sp.setuptrain~ creates a setup, or overview, of your instrument/sticks/sources. It saves multiple descriptors at multiple time frames and can be used to scale your input to match a corpus, or to improve matching overall.
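The "setups" idea (corpus scaling via neural network prediction/regression) boils down to learning a mapping from the descriptor ranges your instrument actually produces to the ranges a given corpus covers. A heavily simplified sketch using scikit-learn's MLP regressor on made-up data — SP-Tools itself builds on FluCoMa for this, so treat the code purely as an illustration of the regression idea:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Paired descriptor frames: what YOUR instrument produced during setup
# training, and where those hits should land in the corpus's descriptor
# space (both hypothetical 4-column arrays: loudness, centroid, flatness, pitch).
your_hits = np.random.rand(200, 4)
corpus_targets = your_hits * [2.0, 1.5, 1.0, 0.8] + [0.1, 0.0, 0.0, 0.2]

scaler = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000)
scaler.fit(your_hits, corpus_targets)

# At performance time, each incoming onset's descriptors are remapped
# before matching against the corpus.
incoming = np.random.rand(1, 4)
print(scaler.predict(incoming))
```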
sp.speed works in conjunction with sp.onset~ to create several parameters based on the time between attacks.
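A toy version of turning inter-onset timing into parameters: measure the gap between consecutive onsets, convert it to a speed value, and keep a smoothed average. The actual parameters sp.speed produces are listed in its helpfile; this only shows the general shape of the idea:

```python
class Speed:
    """Toy inter-onset timing tracker: raw speed plus a smoothed average."""
    def __init__(self, smooth=0.8):
        self.prev_time = None
        self.average = 0.0
        self.smooth = smooth

    def onset(self, now):
        if self.prev_time is None:
            self.prev_time = now
            return 0.0, 0.0
        interval = now - self.prev_time
        self.prev_time = now
        speed = 1.0 / interval                     # onsets per second
        self.average = self.smooth * self.average + (1 - self.smooth) * speed
        return speed, self.average

s = Speed()
for t in [0.0, 0.5, 0.9, 1.2, 1.4]:               # onset times in seconds
    print(s.onset(t))
```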