ytsaig's Stars
xtekky/gpt4free
The official gpt4free repository | a collection of various powerful language models
karpathy/minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
cleanlab/cleanlab
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
THUDM/GLM-130B
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
alirezamika/autoscraper
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
MaartenGr/BERTopic
Leveraging BERT and c-TF-IDF to create easily interpretable topics.
huggingface/alignment-handbook
Robust recipes to align language models with human and AI preferences
argilla-io/argilla
Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
jina-ai/discoart
🪩 Create Disco Diffusion artworks in one line
adbar/trafilatura
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
Nixtla/neuralforecast
Scalable and user-friendly neural 🧠 forecasting algorithms.
ddangelov/Top2Vec
Top2Vec learns jointly embedded topic, document and word vectors.
fhamborg/news-please
news-please - an integrated web crawler and information extractor for news that just works
allenai/natural-instructions
Expanding natural instructions
man-group/notebooker
Productionise & schedule your Jupyter Notebooks as easily as you wrote them.
booknlp/booknlp
BookNLP, a natural language processing pipeline for books
GEM-benchmark/NL-Augmenter
NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations
princeton-nlp/DensePhrases
[ACL 2021] Learning Dense Representations of Phrases at Scale; EMNLP'2021: Phrase Retrieval Learns Passage Retrieval, Too https://arxiv.org/abs/2012.12624
Nv7-GitHub/googlesearch
A Python library for scraping the Google search engine.
oughtinc/ice
Interactive Composition Explorer: a debugger for compositional language model programs
philschmid/easyllm
r-three/t-few
Code for T-Few from "Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning"
YuanGongND/whisper-at
Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"
leondz/hatespeechdata
Catalog of abusive language data (PLoS 2020)
JanPalasek/pretty-jupyter
Creates a dynamic HTML report from a Jupyter notebook.
label-sleuth/label-sleuth
Open source no-code system for text annotation and building of text classifiers
santhoshse7en/news-fetch
A Python package that helps scrape all news details from any news website
ExpressAI/reStructured-Pretraining
reStructured Pre-training
BBN-E/ZS4IE
ZS4IE: A Toolkit for Zero-Shot Information Extraction with Simple Verbalizations
epfl-dlab/WikiHist.html
This repo contains all the code and steps needed to download the whole English Wikipedia history, set up the process, and convert it from Wikitext to HTML format.