turian
Deep learning, NLP, audio AI 🧑‍🔬 Postdoc under Bengio 🧑‍🎓 ACL 10 Year Test of Time Award (lead author) 🌟 when art meets science... 👩‍🎤
Berlin + New York
Pinned Repositories
common
Common Python library, especially for text processing and controlling experimental runs
crfchunking-with-wordrepresentations
Train a CRF for syntactic chunking (CoNLL2000), and use word representations
kea-service
KEA 5.0 (keyphrase extraction software), modified to be an XML-RPC service
neural-language-model
Implementation of neural language models, in particular Collobert + Weston (2008) and a stochastic margin-based version of Mnih's LBL.
pytextpreprocess
Preprocess text for NLP (tokenizing, lowercasing, stemming, sentence splitting, etc.)
random-indexing-wordrepresentations
Induce word representations using random indexing (RI)
save-my-browser-tabs
Extension for Mozilla Firefox and Google Chrome to save all of your open tabs to a text file (window/tab index, URL and title of each tab)
stanford-pos-tagger-service
XML-RPC version of the Stanford POS tagger
textSNE
2-d visualization of high-dimensional input: Python code for rendering t-SNE output with text labels for each point
topia.termextract
Updates to Zope's keyphrase extractor (forked from 1.1.0)
turian's Repositories
turian/inverse-audio-synthesis
Inverse audio synthesis
turian/surge-python-docker
Docker image for running Surge synthesizer through Python
turian/audio-discrimination-crowdsource
Web service to crowd-source audio discrimination data
turian/dx7pytorch
A musical instrument audio dataset generated on-the-fly using FM synthesis.
turian/dx7render-docker
Render DX7 patches, dockerized
turian/kinda-deep
Technical blog
turian/multi-task-music-transcription-replicate
Replicate model: MT3: Multi-Task Multitrack Music Transcription
turian/OpenHands
🙌 OpenHands: Code Less, Make More
turian/parsesyx
Parse SYX files (for DX7)
turian/sherlock-rest
A Django JSON REST API for Sherlock
turian/solo-learn
solo-learn: a library of self-supervised methods for visual representation learning powered by PyTorch Lightning
turian/2025-better-scores-worse-generation
A Curious Case of the Missing Measure: Better Scores and Worse Generation
turian/anticipatory-music-transformer-replicate
turian/ArchiveBox
🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
turian/cog-olmocr
Toolkit for linearizing PDFs for LLM datasets/training (now with multiple pages)
turian/cookiecutter_django_simple
turian/diffmoog
turian/docker-keyfinder-cli
turian/EfficientAT
This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training and extraction of audio embeddings.
turian/fxp2json
FXP presets, to and from JSON
turian/gollm
Unified Go interface for Language Model (LLM) providers. Simplifies LLM integration with flexible prompt management and common task functions.
turian/learnfm
A Python Yamaha DX7 module for audio learning
turian/local-pdf-nougat
Use nougat to do PDF to Markdown on local files, through replicate.com
turian/replicate_arxiv_llm_text
Replicate.com model to prepare arXiv papers for LLMs
turian/seek-tune
An implementation of Shazam's song matching algorithm.
turian/tiktoken-truncate
Fast truncation of strings to the maximum token length, using tiktoken
turian/versatile_audio_super_resolution
Versatile audio super resolution (any -> 48kHz) with AudioSR.
turian/voice-cloning-training
Voice data <= 10 mins can also be used to train a good VC model!
turian/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
turian/whisply-replicate
Whisply, but as a replicate.com service