xtyrrell's Stars
rapid7/metasploit-framework
Metasploit Framework
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
exaloop/codon
A high-performance, zero-overhead, extensible Python compiler using LLVM
openai/evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
tensorflow/playground
Play with neural networks!
Doriandarko/claude-engineer
Claude Engineer is an interactive command-line interface (CLI) that leverages the power of Anthropic's Claude-3.5-Sonnet model to assist with software development tasks. This tool combines the capabilities of a large language model with practical file system operations and web search functionality.
clovaai/donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. 接近GPT-4o表现的开源多模态对话模型
open-mmlab/mmocr
OpenMMLab Text Detection, Recognition and Understanding Toolbox
mindee/doctr
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
lm-sys/RouteLLM
A framework for serving and evaluating LLM routers - save LLM costs without compromising quality!
cstate/cstate
🔥 Open source static (serverless) status page. Uses hyperfast Go & Hugo, minimal HTML/CSS/JS, customizable, outstanding browser support (IE8+), preloaded CMS, read-only API, badges & more.
mediar-ai/screenpipe
24/7 local AI screen & mic recording. Build AI apps that have the full context. Works with Ollama. Alternative to Rewind.ai. Open. Secure. You own your data. Rust.
atopile/atopile
Design circuit boards with code! ✨ Get software-like design reuse 🚀, validation, version control and collaboration in hardware; starting with electronics ⚡️
sinaatalay/rendercv
The engine of the RenderCV App
DAGWorks-Inc/burr
Build applications that make decisions (chatbots, agents, simulations, etc...). Monitor, trace, persist, and execute on your own infrastructure.
mezbaul-h/june
Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit
Yuliang-Liu/MultimodalOCR
On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
3rd/tsdiagram
Create diagrams and plan your code with TypeScript.
rectanglehq/Shapeshift
Transform JSON objects using vector embeddings
xiaoyu258/DocProj
Document Rectification and Illumination Correction using a Patch-based CNN
ZZZHANG-jx/DocRes
[CVPR 2024] DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks
ZZZHANG-jx/Recommendations-Document-Image-Processing
This repository contains a paper collection of the methods for document image processing, including appearance enhancement, deshadow, dewarping, deblur, and binarization.
yfzhang114/SliME
✨✨Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models
kyegomez/Python-Package-Template
A easy, reliable, fluid template for python packages complete with docs, testing suites, readme's, github workflows, linting and much much more
jeromewu/tesseract.js-video
An example app to recognize video clip with tesseract.js
andreluisjunqueira/react-native-document-scanner-android
Document scanner android, feature live detection, auto-capture, perspective correction :vibration_mode: :camera: -- :trophy:
ParadoxZW/LLaVA-UHD-Better
A bug-free and improved implementation of LLaVA-UHD, based on the code from the official repo
fahmiaziz98/receipt_parsing
receipt parsing using donut model, next we will add using LLM + OCR or VLM