OhadRubin's Stars
charlax/professional-programming
A collection of learning resources for curious software engineers
jart/cosmopolitan
build-once run-anywhere C library
deepset-ai/haystack
LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
codelucas/newspaper
newspaper3k is a news, full-text, and article metadata extraction library for Python 3. Advanced docs:
stas00/ml-engineering
Machine Learning Engineering Open Book
google-research/scenic
Scenic: A Jax Library for Computer Vision Research and Beyond
ekzhu/datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
Zjh-819/LLMDataHub
A quick guide (especially) for trending instruction finetuning datasets
darrenburns/elia
A snappy, keyboard-centric terminal user interface for interacting with large language models. Chat with ChatGPT, Claude, Llama 3, Phi 3, Mistral, Gemma and more.
tldraw/make-real-starter
Make it real
claffin/cloudproxy
Hide your scraper's IP behind the cloud. Provision proxy servers across different cloud providers to improve your scraping success.
kakaobrain/coyo-dataset
COYO-700M: Large-scale Image-Text Pair Dataset
IntelLabs/fastRAG
Efficient Retrieval Augmentation and Generation Framework
google/jaxopt
Hardware accelerated, batchable and differentiable optimizers in JAX.
mkusner/wmd
Word Mover's Distance from Matthew J Kusner's paper "From Word Embeddings to Document Distances"
Dicklesworthstone/fast_vector_similarity
The Fast Vector Similarity Library is designed to provide efficient computation of various similarity measures between vectors.
google-deepmind/launchpad
facebookresearch/dpr-scale
Scalable training for dense retrieval models.
EleutherAI/cookbook
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
erfanzar/EasyDeL
Accelerate your training with this open-source library. Optimize performance with streamlined training and serving options with JAX. 🚀
faizancodes/Automated-Fundamental-Analysis
Python program that rates stocks out of 100 based on valuation, profitability, growth, and price performance metrics, relative to the company's sector.
philipbergen/zero
ZeroMQ made easy with a few wrappers around pyzmq
TheDuckAI/DuckTrack
Multimodal computer agent data collection program
google-deepmind/einshape
HKUNLP/icl-ceil
[ICML 2023] Code for our paper “Compositional Exemplars for In-context Learning”.
cloneofsimo/ezmup
Simple implementation of muP, based on Spectral Condition for Feature Learning
kwang2049/easy-elasticsearch
Using a business-level retrieval system (BM25) with Python in just a few lines.
EleutherAI/pile_dedupe
Pile Deduplication Code
sholtodouglas/multihost_dataloading
Experimenting with how best to do multi-host dataloading
sbl1996/Hax-LLM
Hastur's experiments in scaling LLM to 10B+ parameters with JAX and TPUs