talmago's Stars
lavague-ai/LaVague
Large Action Model framework to develop AI Web Agents
hananedupouy/LLMs-in-Finance
LLMs in Finance - Generative AI - AI Agents
urchade/GLiNER
Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024
SapienzaNLP/relik
Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget (ACL 2024)
lightonai/pylate
Late Interaction Models Training & Retrieval
jzhang38/TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
jina-ai/annlite
⚡ A fast embedded library for approximate nearest neighbor search
tomaarsen/SpanMarkerNER
SpanMarker for Named Entity Recognition
HazyResearch/manifest
Prompt programming with foundation models (FMs).
konradhalas/dacite
Simple creation of data classes from dictionaries.
y2kappa/timed
Rust crate to time your function using derive annotations.
EleutherAI/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
apple/tensorflow_macos
TensorFlow for macOS 11.0+ accelerated using Apple's ML Compute framework.
Lightning-AI/pytorch-lightning
Pretrain and finetune any AI model of any size on multiple GPUs or TPUs with zero code changes.
minimaxir/aitextgen
A robust Python tool for text-based AI training and generation using GPT-2.
microsoft/NeuronBlocks
NLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego
UKPLab/sentence-transformers
State-of-the-Art Text Embeddings
thunlp/PLMpapers
Must-read Papers on pre-trained language models.
mpuig/spacy-lookup
Named Entity Recognition based on dictionaries
allenai/scispacy
A full spaCy pipeline and models for scientific/biomedical documents.
fnl/segtok
A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic features. Segtok v2 is available at https://github.com/fnl/syntok
sebastianruder/NLP-progress
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
miso-belica/jusText
Heuristic-based boilerplate removal tool
lambci/docker-lambda
Docker images and test runners that replicate the live AWS Lambda environment
jmcarpenter2/parfit
A package for parallelizing the fitting and flexible scoring of sklearn machine learning models, with visualization routines.
doccano/doccano
Open source annotation tool for machine learning practitioners.
mmistakes/minimal-mistakes
📐 Jekyll theme for building a personal site, blog, project documentation, or portfolio.
SelectTransform/st.js
JSON template over JSON
MagicStack/MagicPython
Cutting-edge Python syntax highlighter for Sublime Text, Atom and Visual Studio Code. Used by GitHub to highlight your Python code!
pouchdb/pouchdb
🦘 PouchDB is a pocket-sized database.