Pinned Repositories
3d-voice-chess
A voice-driven 3D chess game for learning Voice AI
cc-crawl-statistics
Statistics of Common Crawl monthly archives mined from URL index files
cc-notebooks
Various Jupyter notebooks about Common Crawl data
cc-pyspark
Process Common Crawl data with Python and Spark
cc-warc-examples
CommonCrawl WARC/WET/WAT examples and processing code for Java + Hadoop
cldr-json
JSON Data from the Unicode CLDR Project
common-voice-tr-experiments
Our experiments with Common Voice Turkish datasets, plus Colab notebooks with training results.
cv-tbox-dataset-analyzer
Analysis and Viewer for Mozilla Common Voice Datasets
cv-tbox-metadata-viewer
WebApp for examining Common Voice metadata
cv-tbox-split-maker
Creates alternative splits for Mozilla Common Voice datasets for further analysis. Supports delta-version upgrades.
HarikalarKutusu's Repositories
HarikalarKutusu/cv-tbox-dataset-analyzer
Analysis and Viewer for Mozilla Common Voice Datasets
HarikalarKutusu/cv-tbox-split-maker
Creates alternative splits for Mozilla Common Voice datasets for further analysis. Supports delta-version upgrades.
HarikalarKutusu/cc-crawl-statistics
Statistics of Common Crawl monthly archives mined from URL index files
HarikalarKutusu/cc-pyspark
Process Common Crawl data with Python and Spark
HarikalarKutusu/cc-warc-examples
CommonCrawl WARC/WET/WAT examples and processing code for Java + Hadoop
HarikalarKutusu/cldr-json
JSON Data from the Unicode CLDR Project
HarikalarKutusu/common-voice
Common Voice is part of Mozilla's initiative to help teach machines how real people speak.
HarikalarKutusu/common-voice-bundler
Script for bundling Common Voice (https://commonvoice.mozilla.org/) clips by language
HarikalarKutusu/commonvoice-utils
Linguistic processing for Common Voice
HarikalarKutusu/CorporaCreator
Command line tool to create corpora for Common Voice
HarikalarKutusu/cv-dataset
Metadata and versioning details for the Common Voice dataset
HarikalarKutusu/cv-tbox-metadata-viewer
WebApp for examining Common Voice metadata
HarikalarKutusu/cv-sentence-extractor
Scraping Wikipedia for fair use sentences
HarikalarKutusu/cv-tbox-dataset-compiler
HarikalarKutusu/cvtools
Scripts to get stats and validate data from Mozilla's Common Voice project.
HarikalarKutusu/esbuild-plugin-d.ts
ESBuild convenience plugin for compiling TypeScript definitions along with JavaScript
HarikalarKutusu/hold-event
HarikalarKutusu/langcodes
A Python library for working with and comparing language codes.
HarikalarKutusu/language_data
An optional supplement to `langcodes` that stores names and statistics of languages.
HarikalarKutusu/omnilingo
Listening-based language learning
HarikalarKutusu/react-three-fiber
🇨🇭 A React renderer for Three.js
HarikalarKutusu/sentence-collector
Tool to collect and review sentences for Common Voice
HarikalarKutusu/socket.io
Realtime application framework (Node.js server)
HarikalarKutusu/STT-models
Open models for Coqui STT
HarikalarKutusu/three.js
JavaScript 3D Library.
HarikalarKutusu/transformers.js
Run 🤗 Transformers in your browser!
HarikalarKutusu/webarchive-indexing
Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system.
HarikalarKutusu/whirlwind-python
A whirlwind tour of Common Crawl's data using Python
HarikalarKutusu/zeyrek
Python morphological analyzer for Turkish language. Partial port of ZemberekNLP.
HarikalarKutusu/zustand
🐻 Bear necessities for state management in React