Pinned Repositories
3D-ResNets-PyTorch
3D ResNets for Action Recognition (CVPR 2018)
A-summary-of-detection-and-segmentation
AdaFocus
Reducing spatial redundancy in video recognition, achieving state-of-the-art computational efficiency.
Additive-Margin-Softmax
This is the implementation of the paper "Additive Margin Softmax for Face Verification".
AdelaiDet
AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.
albumentations
Fast image augmentation library and an easy-to-use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
Algorithm-Practice-in-Industry
A collection of industry practice articles on search, recommendation, advertising, and user growth (sources: Zhihu, DataFunTalk, and tech WeChat official accounts).
Alpaca-CoT
We unified the interfaces of instruction-tuning data (e.g., CoT data, still being expanded), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-Tuning) for easy use, creating an LLM-IFT research platform that is easy for researchers to get started with. Meanwhile, the tabular_llm branch builds an LLM for tabular intelligence tasks.
detectron2
Detectron2 is FAIR's next-generation research platform for object detection and segmentation.
PolarMask
Code for 'PolarMask: Single Shot Instance Segmentation with Polar Representation'
Xlsean's Repositories
Xlsean/Algorithm-Practice-in-Industry
A collection of industry practice articles on search, recommendation, advertising, and user growth (sources: Zhihu, DataFunTalk, and tech WeChat official accounts).
Xlsean/Alpaca-CoT
We unified the interfaces of instruction-tuning data (e.g., CoT data, still being expanded), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-Tuning) for easy use, creating an LLM-IFT research platform that is easy for researchers to get started with. Meanwhile, the tabular_llm branch builds an LLM for tabular intelligence tasks.
Xlsean/AtomBulb
Aims to provide an intuitive, concrete, and standardized evaluation of current mainstream LLMs.
Xlsean/BELLE
BELLE: Be Everyone's Large Language Model Engine (an open-source Chinese conversational large language model).
Xlsean/CGSTVG
Implementation of "Context-Guided Spatio-Temporal Video Grounding"
Xlsean/CLIP-ImageSearch-NCNN
Natural-language image search (search images by text) built on CLIP and NCNN; supports x86 and Android.
Xlsean/CLIP-It
Xlsean/CLIP4Clip
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
Xlsean/CLUE
CLUE: Chinese Language Understanding Evaluation Benchmark — datasets, baselines, pre-trained models, corpus, and leaderboard.
Xlsean/Deep-Video-Super-Resolution
State-of-the-art video super-resolution (VSR) methods.
Xlsean/FollowAnything
Xlsean/generative_agents
Generative Agents: Interactive Simulacra of Human Behavior
Xlsean/google-gemini-yt-video-summarizer-AI-p
Summarize Videos and Generate Timestamps Efficiently using Google Gemini Pro
Xlsean/GroundingDINO
Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Xlsean/langchain
⚡ Building applications with LLMs through composability ⚡
Xlsean/LangChain-Chinese-Getting-Started-Guide
A Chinese getting-started tutorial for LangChain.
Xlsean/Llama2-Chinese
Llama Chinese community; the best Chinese Llama large language model, fully open source and commercially usable.
Xlsean/MMSum_Data_Collection_Tool
MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos
Xlsean/MMSum_model
MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos
Xlsean/MQ-Det
Official PyTorch implementation of "Multi-modal Queried Object Detection in the Wild" (accepted by NeurIPS 2023)
Xlsean/opencompass
OpenCompass is an LLM evaluation platform supporting a wide range of models (LLaMA, LLaMA 2, ChatGLM2, ChatGPT, Claude, etc.) on 50+ datasets.
Xlsean/ovtrack
OVTrack: Open-Vocabulary Multiple Object Tracking [CVPR 2023]
Xlsean/Qwen-VL
The official repo of Qwen-VL (通义千问-VL), the chat and pretrained large vision-language model proposed by Alibaba Cloud.
Xlsean/RICE
Xlsean/scenic
Scenic: A Jax Library for Computer Vision Research and Beyond
Xlsean/Segment-Everything-Everywhere-All-At-Once
[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"
Xlsean/TSP
TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks (ICCVW 2021)
Xlsean/UniVTG
[ICCV2023] UniVTG: Towards Unified Video-Language Temporal Grounding
Xlsean/VAST
Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
Xlsean/VidChapters
[NeurIPS 2023 D&B] VidChapters-7M: Video Chapters at Scale