Press a keybind, speak, and get instant text output. A speech-to-text tool that transcribes audio using OpenAI Whisper and outputs to stdout.
- Signal-driven: Press keybind → speak → get text (no GUI needed)
- UNIX philosophy: Outputs transcribed text to stdout for piping to other tools
- On-demand operation: Starts when called, processes audio, then exits
- Audio feedback: Beeps confirm recording start/stop and success
- Wayland native: Works with modern Linux desktops (Hyprland, Niri, etc.)
- Wayland desktop (Hyprland, Niri, GNOME, KDE, etc.)
- OpenAI API key (for Whisper transcription), or a Google Cloud service account key if you use the Google Speech-to-Text provider
- System packages:
# Arch Linux
sudo pacman -S pipewire
# Ubuntu/Debian
sudo apt install pipewire-pulse
# Fedora
sudo dnf install pipewire-pulseaudio
Optional (for direct typing keybindings):
# Arch Linux
sudo pacman -S ydotool
# Ubuntu/Debian
sudo apt install ydotool
# Fedora
sudo dnf install ydotool
# Set up ydotool permissions and service:
sudo usermod -a -G input $USER
# Enable and start ydotool daemon service
sudo systemctl enable --now ydotool.service
# Set socket environment variable (add to ~/.bashrc or ~/.zshrc)
echo 'export YDOTOOL_SOCKET=/tmp/.ydotool_socket' >> ~/.bashrc
# Log out and back in so the group change takes effect (source ~/.bashrc only applies the environment variable)
# Using your preferred AUR helper
yay -S waystt-bin
# or
paru -S waystt-bin
- Download from GitHub Releases
- Install:
wget https://github.com/sevos/waystt/releases/latest/download/waystt-linux-x86_64
mkdir -p ~/.local/bin
mv waystt-linux-x86_64 ~/.local/bin/waystt
chmod +x ~/.local/bin/waystt
# Add to PATH (add to ~/.bashrc or ~/.zshrc)
export PATH="$HOME/.local/bin:$PATH"
- Set up configuration:
# Create config directory and file
mkdir -p ~/.config/waystt
echo "OPENAI_API_KEY=your_api_key_here" > ~/.config/waystt/.env
- Test the application:
# Run waystt and pipe output to see it working
waystt | tee /tmp/waystt-output.txt
- Use with signals:
# Transcribe and output to stdout
pkill --signal SIGUSR1 waystt
# Start waystt and save output to file
waystt > output.txt
# Start waystt and copy output to clipboard
waystt --pipe-to wl-copy
# Start waystt and type output directly
waystt --pipe-to ydotool type --file -
# Trigger transcription (if waystt is running)
pkill --signal SIGUSR1 waystt
Most keybindings follow this pattern:
pgrep -x waystt >/dev/null && pkill --signal SIGUSR1 waystt || (waystt [OPTIONS] &)
This means: "If waystt is running, send signal to transcribe. Otherwise, start waystt with specified options."
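The same pattern can also live in a small wrapper function, which keeps compositor configs short. The sketch below is only an illustration: `waystt_toggle` is a hypothetical name, not something waystt ships, and it relies only on `pgrep`, `pkill`, and the `waystt` options shown in this README. It uses an explicit if/else, so a failing `pkill` does not fall through to starting a second instance, which can happen with the `A && B || C` form.

```shell
#!/bin/sh
# waystt_toggle: hypothetical helper, not shipped with waystt.
# If waystt is already running, signal it to transcribe;
# otherwise start it in the background with the given options.
waystt_toggle() {
    if pgrep -x waystt >/dev/null 2>&1; then
        # An instance is already running: ask it to transcribe.
        pkill --signal SIGUSR1 waystt
    else
        # No instance yet: start one, passing all options through unchanged.
        waystt "$@" &
    fi
}

# Example invocation (commented out so sourcing this file has no side effects):
# waystt_toggle --pipe-to wl-copy
```

Saved as an executable script that calls `waystt_toggle "$@"` on its last line, a keybinding would then only need to run that script with the desired `--pipe-to` arguments.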
Add to your ~/.config/hypr/hyprland.conf:
# waystt - Speech to Text (direct typing)
bind = SUPER, R, exec, pgrep -x waystt >/dev/null && pkill --signal SIGUSR1 waystt || (waystt --pipe-to ydotool type --file - &)
# waystt - Speech to Text (clipboard copy)
bind = SUPER SHIFT, R, exec, pgrep -x waystt >/dev/null && pkill --signal SIGUSR1 waystt || (waystt --pipe-to wl-copy &)
Add to your ~/.config/niri/config.kdl:
binds {
// waystt - Speech to Text (direct typing)
Mod+R { spawn "sh" "-c" "pgrep -x waystt >/dev/null && pkill --signal SIGUSR1 waystt || (waystt --pipe-to ydotool type --file - &)"; }
// waystt - Speech to Text (clipboard copy)
Mod+Shift+R { spawn "sh" "-c" "pgrep -x waystt >/dev/null && pkill --signal SIGUSR1 waystt || (waystt --pipe-to wl-copy &)"; }
}
Keybinding Functions:
- Super+R (Hyprland) / Mod+R (Niri): Direct typing via ydotool
- Super+Shift+R (Hyprland) / Mod+Shift+R (Niri): Copy to clipboard
waystt starts on-demand, records audio, transcribes it, outputs to stdout, then exits:
# Terminal 1: Start waystt with output to file
waystt > transcription.txt
# Terminal 2: Trigger transcription (or use keyboard shortcut)
pkill --signal SIGUSR1 waystt
The --pipe-to option allows you to pipe transcribed text directly to another command:
# Copy transcription to clipboard
waystt --pipe-to wl-copy
pkill --signal SIGUSR1 waystt
# Type transcription directly into focused window
waystt --pipe-to ydotool type --file -
pkill --signal SIGUSR1 waystt
# Process transcription with sed and copy to clipboard
waystt --pipe-to sh -c "sed 's/hello/hi/g' | wl-copy"
pkill --signal SIGUSR1 waystt
# Save to file with timestamp
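# (single quotes keep $(date) and $(cat) from being expanded by the calling shell before waystt runs)
waystt --pipe-to sh -c 'echo "$(date): $(cat)" >> speech-log.txt'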
pkill --signal SIGUSR1 waystt
Configuration is read from ~/.config/waystt/.env by default. You can override this location using the --envfile flag:
waystt --envfile /path/to/custom/.env
waystt supports two transcription providers: OpenAI Whisper (default) and Google Speech-to-Text. Choose the one that best fits your needs.
OpenAI Whisper offers excellent accuracy and supports automatic language detection.
Required: Create ~/.config/waystt/.env with your OpenAI API key:
OPENAI_API_KEY=your_api_key_here
Optional OpenAI settings:
# Whisper model (whisper-1 is default, most cost-effective)
WHISPER_MODEL=whisper-1
# Force specific language (default: auto-detect)
WHISPER_LANGUAGE=en
# API timeout in seconds
WHISPER_TIMEOUT_SECONDS=60
# Max retry attempts
WHISPER_MAX_RETRIES=3
Google Speech-to-Text provides fast, accurate transcription with support for many languages and dialects.
Setup Steps:
- Enable Google Cloud Speech-to-Text API:
  - Go to Google Cloud Console
  - Create a new project or select an existing one
  - Enable the "Cloud Speech-to-Text API"
  - Create a service account and download the JSON key file
- Configure waystt for Google:
# Switch to Google provider
TRANSCRIPTION_PROVIDER=google
# Path to your service account JSON file
GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/service-account-key.json
# Primary language (default: en-US)
GOOGLE_SPEECH_LANGUAGE_CODE=en-US
# Model selection (latest_long for longer audio, latest_short for shorter)
GOOGLE_SPEECH_MODEL=latest_long
# Optional: Alternative languages for auto-detection (comma-separated)
GOOGLE_SPEECH_ALTERNATIVE_LANGUAGES=es-ES,fr-FR,de-DE
Popular Google language codes:
- en-US: English (United States)
- en-GB: English (United Kingdom)
- es-ES: Spanish (Spain)
- fr-FR: French (France)
- de-DE: German (Germany)
- ja-JP: Japanese
- zh-CN: Chinese (Simplified)
Audio and system settings (apply to both providers):
# Disable audio beeps
ENABLE_AUDIO_FEEDBACK=false
# Adjust beep volume (0.0 to 1.0)
BEEP_VOLUME=0.1
# Debug logging
RUST_LOG=debug
If audio recording fails:
- Ensure PipeWire is running: systemctl --user status pipewire
- Check microphone permissions
- Verify microphone is not muted
OpenAI Provider:
- Verify your OpenAI API key is valid and has sufficient credits
- Check internet connectivity
- Review logs for specific error messages
Google Provider:
- Verify your service account JSON file path is correct
- Ensure the Speech-to-Text API is enabled in your Google Cloud project
- Check that your service account has the necessary permissions
- Verify your Google Cloud project has billing enabled
- Review logs for specific error messages
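Several of the checks above can be scripted. The following is a minimal sketch, not part of waystt: `waystt_preflight` is a hypothetical name, and it assumes the env file uses plain `KEY=value` lines without quoting or an `export` prefix. It treats any provider other than `google` as OpenAI, in line with the default described earlier. It cannot verify that an API key is valid or that billing is enabled; it only confirms that the local prerequisites are in place.

```shell
#!/bin/sh
# waystt_preflight: hypothetical diagnostic helper, not part of waystt.
# Usage: waystt_preflight [ENVFILE]
# Prints one line per check and returns non-zero if any check fails.
waystt_preflight() {
    envfile="${1:-$HOME/.config/waystt/.env}"
    status=0

    # 1. The PipeWire user service must be active for audio capture.
    if systemctl --user is-active --quiet pipewire; then
        echo "ok: pipewire is running"
    else
        echo "FAIL: pipewire is not running"
        status=1
    fi

    # 2. The configuration file must exist; later checks depend on it.
    if [ ! -f "$envfile" ]; then
        echo "FAIL: config file not found: $envfile"
        return 1
    fi
    echo "ok: config file found"

    # 3. Read the provider (assumption: plain KEY=value lines, last one wins).
    provider=$(sed -n 's/^TRANSCRIPTION_PROVIDER=//p' "$envfile" | tail -n 1)
    provider="${provider:-openai}"

    if [ "$provider" = "google" ]; then
        # Google: the service account JSON path must point to a real file.
        creds=$(sed -n 's/^GOOGLE_APPLICATION_CREDENTIALS=//p' "$envfile" | tail -n 1)
        if [ -n "$creds" ] && [ -f "$creds" ]; then
            echo "ok: Google credentials file found"
        else
            echo "FAIL: GOOGLE_APPLICATION_CREDENTIALS missing or not a file"
            status=1
        fi
    else
        # OpenAI (default): the key must be present and non-empty.
        if grep -q '^OPENAI_API_KEY=.' "$envfile"; then
            echo "ok: OPENAI_API_KEY is set"
        else
            echo "FAIL: OPENAI_API_KEY is not set"
            status=1
        fi
    fi

    return "$status"
}
```

Running `waystt_preflight` with no arguments checks the default location; pass a path to check the file you use with `--envfile`.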
cargo test
# Using default config location (~/.config/waystt/.env)
RUST_LOG=debug cargo run
# Or using project-local .env file for development
RUST_LOG=debug cargo run -- --envfile .env
git clone https://github.com/sevos/waystt.git
cd waystt
# Create config directory and copy example configuration
mkdir -p ~/.config/waystt
cp .env.example ~/.config/waystt/.env
# Edit ~/.config/waystt/.env with your API key
# Build the project
cargo build --release
# Install to local bin
mkdir -p ~/.local/bin
cp ./target/release/waystt ~/.local/bin/
Licensed under GPL v3.0 or later. Source code: https://github.com/sevos/waystt
See LICENSE for full terms.