MTG/essentia.js

Beginner documentation improvements

kmturley opened this issue · 1 comment

What is the issue about?

  • Bug
  • Feature request
  • Usage question
  • Documentation
  • Contributing / Development

What part(s) of Essentia.js is involved?

  • essentia.js-core (vanilla algorithms)
  • essentia.js-model (machine learning algorithms)
  • essentia.js-plot (plotting utility module)
  • essentia.js-extractor (typical algorithm combinations utility)

Description

The Getting Started docs show new users how to use essentia.js:
https://mtg.github.io/essentia.js/docs/api/tutorial-1.%20Getting%20started.html

They briefly show how to use the Gain and Pitch algorithms, then give a list of all algorithms and move on to the more advanced Windowing and Frames features. This sends beginners (like myself) down a rabbit hole of trying to find the right algorithms and then trying to implement Windowing and Frames. I spent a long time on these before realizing there were higher-level methods that could solve my use cases more easily.

I discovered these two resources which I feel better explain the power of essentia.js for beginners:
https://cs310.hashnode.dev/audio-features-extraction-with-javascript-and-essentia
https://essentia.upf.edu/tutorial_pitch_melody.html

They show off the library's capabilities for common audio feature detection tasks, such as:

  • bpm
  • danceability
  • duration
  • energy
  • key
  • loudness
  • scale
  • notes

It would be great to extend the Getting Started documentation to include some of these examples. I believe more examples covering common, beginner-level use cases would increase adoption of the library.

Example code

// @ts-ignore
import { Essentia, EssentiaWASM } from 'essentia.js';
import * as wav from 'node-wav';
import { readFileSync } from 'fs';

const essentia = new Essentia(EssentiaWASM);

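// Decode a WAV file and return its first channel as an Essentia vector.
function loadFile(filepath: string) {
  const fileBuffer = readFileSync(filepath);
  const audioBuffer = wav.decode(fileBuffer);
  // Only channel 0 is analysed; any other channels are ignored.
  return essentia.arrayToVector(audioBuffer.channelData[0]);
}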

function getNotes(vector) {
  const names: string[] = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];
  // Estimate the melody's pitch contour, then segment it into discrete notes.
  const melodia = essentia.PitchMelodia(vector).pitch;
  const segments = essentia.PitchContourSegmentation(melodia, vector);
  const onsets = essentia.vectorToArray(segments.onset);
  const durations = essentia.vectorToArray(segments.duration);
  const pitches = essentia.vectorToArray(segments.MIDIpitch);
  const notes: any = [];
  onsets.forEach((onset, i) => {
    // Round in case the MIDI pitch comes back as a non-integer float.
    const midi = Math.round(pitches[i]);
    notes.push({
      start: onset,
      duration: durations[i],
      midi,
      // Scientific pitch notation, where MIDI 60 is C4 (hence the -1).
      octave: Math.floor(midi / 12) - 1,
      name: names[midi % 12],
    });
  });
  return notes;
}

const vector = loadFile('./test/scale.wav');
// Run KeyExtractor once and reuse the result for both key and scale.
const keyResult = essentia.KeyExtractor(vector);
console.log('bpm', essentia.PercivalBpmEstimator(vector).bpm);
console.log('danceability', essentia.Danceability(vector).danceability);
console.log('duration', essentia.Duration(vector).duration);
console.log('energy', essentia.Energy(vector).energy);
console.log('key', keyResult.key);
console.log('loudness', essentia.DynamicComplexity(vector).loudness);
console.log('notes', getNotes(vector));
console.log('scale', keyResult.scale);
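
The note-naming arithmetic in getNotes can also be pulled out and checked without any audio file. Below is a minimal sketch of that step on its own. The helper name `midiToNote` and the `NoteLabel` type are hypothetical (they are not part of essentia.js), and the octave numbering assumes scientific pitch notation, where MIDI note 60 is C4; other conventions shift the octave by one.

```typescript
// Hypothetical helper, not part of essentia.js.
// Maps a MIDI note number to a pitch-class name and an octave.
const NOTE_NAMES: string[] = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];

interface NoteLabel {
  name: string;
  octave: number;
}

function midiToNote(midi: number): NoteLabel {
  // Round first so a float such as 60.2 still indexes the names array.
  const rounded = Math.round(midi);
  // The double modulo keeps the index non-negative for negative inputs.
  const index = ((rounded % 12) + 12) % 12;
  return {
    name: NOTE_NAMES[index],
    // Assumption: scientific pitch notation (MIDI 60 = C4).
    octave: Math.floor(rounded / 12) - 1,
  };
}
```

Under these assumptions, `midiToNote(69)` gives `{ name: 'A', octave: 4 }`, which matches the usual A4 = 440 Hz reference.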

Thanks for your suggestions @kmturley