Audio2TextJS

Audio2TextJS is a Node.js library for audio processing and transcription using the Whisper tool. It supports converting audio files to text using various pre-trained models.

Features

  • Convert audio files to text with customizable options.
  • Automatically downloads necessary model files.
  • Supports multiple output formats: JSON, TXT, CSV.
  • Flexible configuration for threading, processors, and more.

Installation

To install the library, use npm:

npm install audio2textjs

Usage

import Audio2TextJS from 'audio2textjs';

// Example usage
const converter = new Audio2TextJS({
    threads: 4,
    processors: 1,
    outputJson: true,
});

const inputFile = 'path/to/input.wav';
const model = 'tiny'; // Specify one of the available models
const language = 'auto'; // or specify the spoken language as a code, e.g. 'en'

converter.runWhisper(inputFile, model, language)
    .then(result => {
        if (result.success) {
            console.log('Conversion successful:', result.output);
        } else {
            console.error('Conversion failed:', result.message);
        }
    })
    .catch(error => {
        console.error('Error:', error);
    });
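
The same call can also be written with async/await. The sketch below assumes only what the example above shows: runWhisper returns a Promise, and its result carries success, message, and output. The helper name transcribe is hypothetical and not part of the library. The converter is passed in as an argument, so the helper does not depend on how the instance was created.

```javascript
// Hypothetical helper (not part of audio2textjs): wraps runWhisper so that
// a failed conversion surfaces as a thrown Error instead of a result flag.
async function transcribe(converter, inputFile, model = 'tiny', language = 'auto') {
    const result = await converter.runWhisper(inputFile, model, language);
    if (!result.success) {
        throw new Error(`Conversion failed: ${result.message}`);
    }
    return result.output;
}

// Usage, inside an async function, with the converter created above:
//   const output = await transcribe(converter, 'path/to/input.wav');
```

Throwing on failure lets one try/catch handle both a rejected Promise and a result with success set to false.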

Models

The following models are available. Disk is the size of the model file, and RAM is the approximate memory needed to run it:

| Model     | Disk   | RAM     |
|-----------|--------|---------|
| tiny      |  75 MB | ~390 MB |
| tiny.en   |  75 MB | ~390 MB |
| base      | 142 MB | ~500 MB |
| base.en   | 142 MB | ~500 MB |
| small     | 466 MB | ~1.0 GB |
| small.en  | 466 MB | ~1.0 GB |
| medium    | 1.5 GB | ~2.6 GB |
| medium.en | 1.5 GB | ~2.6 GB |
| large-v1  | 2.9 GB | ~4.7 GB |
| large     | 2.9 GB | ~4.7 GB |
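
As a rough illustration of how the table can guide model choice, the sketch below picks the largest model whose approximate RAM requirement fits a given budget. The helper pickModel and the constant MODEL_RAM_MB are hypothetical and not part of the library. The figures are copied from the table, with 1 GB taken as 1000 MB.

```javascript
// Approximate RAM needs in MB, copied from the table above.
// The .en variants have the same footprint as their multilingual counterparts.
const MODEL_RAM_MB = [
    ['tiny', 390],
    ['base', 500],
    ['small', 1000],
    ['medium', 2600],
    ['large', 4700],
];

// Hypothetical helper: returns the largest model whose approximate RAM need
// fits within budgetMb, or null when even 'tiny' does not fit.
function pickModel(budgetMb, englishOnly = false) {
    let choice = null;
    for (const [name, ramMb] of MODEL_RAM_MB) {
        if (ramMb <= budgetMb) choice = name;
    }
    if (choice === null) return null;
    // The table lists no English-only variant for the large models.
    return englishOnly && choice !== 'large' ? `${choice}.en` : choice;
}
```

In Node.js the budget could be derived from os.freemem(), which reports free memory in bytes. Leaving some headroom is sensible because the RAM figures are approximate.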

API Documentation

Audio2TextJS(options)

Creates an instance of Audio2TextJS, optionally configured through an options object.

Parameters

  • options (Object): Optional configuration settings for the converter, such as threads, processors, and outputJson.

Example

const converter = new Audio2TextJS({
    threads: 4,
    processors: 1,
    outputJson: true,
});

runWhisper(inputFile, model, language)

Runs the Whisper tool for audio processing and transcription.

Parameters

  • inputFile (string): Path to the input WAV file.
  • model (string): Name of the model to use (tiny, base, etc.).
  • language (string): Spoken language ('auto' for auto-detect).

Returns

A Promise that resolves to an object with the fields success, message, and, optionally, output.

Example

converter.runWhisper('path/to/input.wav', 'tiny', 'auto')
    .then(result => {
        console.log('Conversion result:', result);
    })
    .catch(error => {
        console.error('Error:', error);
    });
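
To transcribe several files, the calls can be chained one after another. The sketch below is a minimal example that assumes only the runWhisper signature and result fields documented above. The helper transcribeAll is hypothetical and not part of the library. It runs the files sequentially on the assumption that each run is already multi-threaded and memory-hungry, so parallel runs might not help.

```javascript
// Hypothetical helper: transcribes files one at a time and collects a
// per-file result, so one failing file does not abort the batch.
async function transcribeAll(converter, inputFiles, model = 'tiny', language = 'auto') {
    const results = [];
    for (const inputFile of inputFiles) {
        try {
            const result = await converter.runWhisper(inputFile, model, language);
            results.push({ inputFile, ...result });
        } catch (error) {
            // A rejected Promise is recorded in the same shape as a failed result.
            results.push({ inputFile, success: false, message: String(error) });
        }
    }
    return results;
}
```

Each entry in the returned array keeps the input path next to the result, so failed files can be filtered by their success field and retried.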

Tree

│   .gitignore
│   LICENSE
│   package.json
│   README.md
├───examples
│   │   test.js
│   │
│   ├───cli
│   │       index.js
│   │       package.json
│   │       README.md
│   │
│   ├───express
│   │       app.js
│   │       package.json
│   │       README.md
│   │
│   └───telegraf
│
└───src
    │   binFiles.json
    │   convertAudioFile.js
    │   downloadWhisperModels.js
    │   fetchBinFiles.js
    │   index.js
    │   postinstall.js
    │   Audio2TextJS.js
    │
    ├───bin
    │   ├───win32
    │   │       ffmpeg.exe
    │   │       ffprobe.exe
    │   │       whisper.exe
    │   │       .....
    │   └───linux
    │           ffmpeg
    │           ffprobe
    │           whisper
    │           .....
    │
    ├───models
    │       ggml-tiny.bin
    │       ggml-tiny.en.bin
    │       ggml-base.bin
    │       ggml-base.en.bin
    │       ggml-small.bin
    │       .....
    │
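
The layout above suggests a simple naming scheme: binaries sit under bin/<platform> and model files are named ggml-<model>.bin. The sketch below only mirrors that layout. The helpers whisperBinaryPath and modelFilePath are hypothetical, and how the library itself locates these files is not documented here. Only the win32 and linux folders appear in the tree, so other platforms are treated as unsupported in this sketch.

```javascript
// Hypothetical helpers mirroring the directory layout shown in the tree.
const SUPPORTED_PLATFORMS = ['win32', 'linux'];

function whisperBinaryPath(srcDir, platform = process.platform) {
    if (!SUPPORTED_PLATFORMS.includes(platform)) {
        throw new Error(`No bundled binaries listed for platform: ${platform}`);
    }
    const file = platform === 'win32' ? 'whisper.exe' : 'whisper';
    // Forward slashes are accepted by Node.js on Windows as well.
    return [srcDir, 'bin', platform, file].join('/');
}

function modelFilePath(srcDir, model) {
    // Model files in the tree follow the pattern ggml-<model>.bin.
    return [srcDir, 'models', `ggml-${model}.bin`].join('/');
}
```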

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact