shashankmc's Stars
nektos/act
Run your GitHub Actions locally 🚀
oobabooga/text-generation-webui
A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
openai/evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
TeamPiped/Piped
An alternative privacy-friendly YouTube frontend which is efficient by design.
minimaxir/textgenrnn
Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.
bazelbuild/bazelisk
A user-friendly launcher for Bazel.
bazingagin/npc_gzip
Code for Paper: “Low-Resource” Text Classification: A Parameter-Free Classification Method with Compressors
teamclairvoyant/airflow-maintenance-dags
A series of DAGs/Workflows to help maintain the operation of Airflow
casper-hansen/AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
argilla-io/distilabel
⚗️ distilabel is a framework for synthetic data and AI feedback for AI engineers that require high-quality outputs, full data ownership, and overall efficiency.
Marker-Inc-Korea/AutoRAG
RAG AutoML Tool - Find optimal RAG pipeline for your own data.
tecoholic/ner-annotator
Named Entity Recognition (NER) Annotation tool for SpaCy. Generates Training Data as a JSON which can be readily used.
HumanSignal/label-studio-ml-backend
Configs and boilerplates for Label Studio's Machine Learning backend
tomaarsen/SpanMarkerNER
SpanMarker for Named Entity Recognition
inseq-team/inseq
Interpretability for sequence generation models 🐛 🔍
wietsedv/bertje
BERTje is a Dutch pre-trained BERT model developed at the University of Groningen. (EMNLP Findings 2020) "What’s so special about BERT’s layers? A closer look at the NLP pipeline in monolingual and multilingual models"
tsproisl/textcomplexity
Linguistic and stylistic complexity measures for (literary) texts
GateNLP/python-gatenlp
Python text processing, pattern matching, and NLP framework
JSv4/OpenContracts
Free, Open Source collaborative text annotating platform based on React and Django
joeddav/blog
JSv4/GremlinServer
A low-code microservices platform designed for legal engineers. Given a document, Gremlin will apply a series of Python scripts to it and return transformed documents and/or extracted data. Use with GremlinUI for an open source, modern, React-based low-code experience (https://github.com/JSv4/GremlinGUI)
MBAigner/PDFSegmenter
This library builds a graph-representation of the content of PDFs. The graph is then clustered, resulting page segments are classified and returned. Tables are retrieved formatted as a CSV.
saran9991/llm-data-annotation
Use Large Language Models like OpenAI's GPT-3.5 for data annotation and model enhancement. This framework combines human expertise with LLMs, employs Iterative Active Learning for continuous improvement, and integrates CleanLab (Confident Learning) to ensure high-quality datasets and better model performance
JHUAPL/PINE
Collaborative NLP annotation tool supporting enterprise authentication, inter-annotator statistics, active learning
JamesShakarji/Kolmogorov-Entropy-Implementations
I was disappointed there wasn't more open source material/expressions for Kolmogorov complexity/entropy. This repo contains implementations in various languages.
katehret/measuring-language-complexity
Kolmogorov complexity, language complexity, compression
Andrewymd/DMLPlayground
This repository contains code for training and evaluating various Deep Metric Learning (DML) algorithms on the CUB200-2011, Cars196 and SOP datasets.
raphael-sch/alaf
ALAF - Active Learning Annotation Framework
gabyarte/active-learning-in-ehealth
Active Learning for Named Entity Recognition on eHealth Corpus
lambdavi/SpanLuke
Legal Named Entity Recognition through combination of SpanMarkers and Luke