jerrygaoLondon
I'm a full-stack Java/Python software/data engineer and a scientific researcher specialised in NLP/IE, misinfo/disinfo, Semantic Web and Linked Data
University of Sheffield, Sheffield
jerrygaoLondon's Stars
openai/openai-cookbook
Examples and guides for using the OpenAI API
OpenInterpreter/open-interpreter
A natural language interface for computers
geekan/MetaGPT
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
microsoft/autogen
A programming framework for agentic AI 🤖
rasbt/LLMs-from-scratch
Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step
mudler/LocalAI
🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed inference
crewAIInc/crewAI
Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
assafelovic/gpt-researcher
LLM based autonomous agent that does online comprehensive research on any given topic
dask/dask
Parallel computing with task scheduling
pgvector/pgvector
Open-source vector similarity search for Postgres
OpenMined/PySyft
Perform data science on data that remains in someone else's server
karpathy/minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
Chainlit/chainlit
Build Conversational AI in minutes ⚡️
karpathy/ng-video-lecture
facebookresearch/ijepa
Official codebase for I-JEPA, the Image-based Joint-Embedding Predictive Architecture. First outlined in the CVPR paper, "Self-supervised learning from images with a joint-embedding predictive architecture."
aboSamoor/polyglot
Multilingual text (NLP) processing toolkit
chiphuyen/dmls-book
Summaries and resources for Designing Machine Learning Systems book (Chip Huyen, O'Reilly 2022)
qdrant/fastembed
Fast, Accurate, Lightweight Python library to make State of the Art Embedding
AnswerDotAI/rerankers
A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models.
parthsarthi03/raptor
The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
JSv4/OpenContracts
Mass document analytics platform based on LlamaIndex, Pgvector, React and Django.
illuin-tech/colpali
The code used to train and run inference with the ColPali architecture.
huggingface/text-clustering
Easily embed, cluster and semantically label text datasets
KarelDO/xmc.dspy
In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples.
shmsw25/FActScore
A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"
TIGER-AI-Lab/LongRAG
Official repo for "LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs".
facebook/threat-research
Welcome to the Meta Threat Research Indicator Repository, a dedicated resource for the sharing of Indicators of Compromise (IOCs) and other threat indicators with the external research community
viswavi/few-shot-clustering
fguzman82/CLIP-Finder2
CLIP-Finder enables semantic offline searches of images from gallery photos using natural language descriptions or the camera. Built on Apple's MobileCLIP-S0 architecture, it ensures optimal performance and accurate media retrieval.
DAD-CDM/dad-cdm-admin
This repo is for administrative documents for the DAD-CDM Open Project