wangxidong06
Towards (Medical) LLMs’ interactivity
PhD @ The Chinese University of Hong Kong, Shenzhen; BA @ Beijing Institute of Technology; xidongwang1@link.cuhk.edu.cn
Pinned Repositories
Apollo
Multilingual Medicine: Model, Dataset, Benchmark, Code
CMB
CMB, A Comprehensive Medical Benchmark in Chinese
FastLLM
Fast LLM training codebase with dynamic strategy selection [DeepSpeed + Megatron + FlashAttention + CUDA fusion kernels + compiler]
Huatuo-26M
The largest-scale Chinese medical QA dataset, with 26,000,000 question-answer pairs.
LongLLaVA
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture
Medical_NLP
Medical NLP Competition, dataset, large models, paper
acl-2023
Repository for the ACL 2023 conference website
BLAS_testbench
Basic Linear Algebra Subprograms testbench
Notes-and-Assigns-for-CS224N
Homework and Notes of CS224N
Optimized-LLM.cpp
Optimized LLM.cpp code (llama.cpp, bloomz.cpp, whisper.cpp) with matrix multiplication implemented via BLIS
wangxidong06's Repositories
wangxidong06/Notes-and-Assigns-for-CS224N
Homework and Notes of CS224N
wangxidong06/BLAS_testbench
Basic Linear Algebra Subprograms testbench
wangxidong06/Optimized-LLM.cpp
Optimized LLM.cpp code (llama.cpp, bloomz.cpp, whisper.cpp) with matrix multiplication implemented via BLIS
wangxidong06/acl-2023
Repository for the ACL 2023 conference website
wangxidong06/DoLa
Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"
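The core idea of DoLa — contrasting next-token distributions from an early ("premature") layer and the final ("mature") layer — can be sketched in a few lines of Hugging Face code. This is an illustrative sketch only, not the paper's official implementation; the GPT-2 backbone and the fixed layer index are arbitrary choices for demonstration.

```python
# Illustrative sketch of DoLa-style layer contrast (not the official implementation).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")             # small backbone, just for the demo
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

inputs = tok("The capital of France is", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# Project a "premature" early layer and the "mature" final layer onto the vocabulary,
# then contrast the two log-distributions to down-weight tokens the early layer already prefers.
early_h = model.transformer.ln_f(out.hidden_states[6][:, -1])   # layer 6 is an arbitrary pick
final_h = out.hidden_states[-1][:, -1]                          # final layer, already normalized
contrast = model.lm_head(final_h).log_softmax(-1) - model.lm_head(early_h).log_softmax(-1)
print(tok.decode(contrast.argmax(-1)))
```

The full method also selects the premature layer dynamically and applies a plausibility constraint; both are omitted here for brevity.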
wangxidong06/emnlp-2023
Repository containing the website for the EMNLP 2023 conference
wangxidong06/EasyContext
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
wangxidong06/Firefly
Firefly (流萤): a Chinese conversational large language model (full-parameter fine-tuning + QLoRA), supporting fine-tuning of Baichuan2, CodeLlama, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya, Bloom, and other large models
wangxidong06/flash-attention
Fast and memory-efficient exact attention
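For reference, a minimal usage sketch of the upstream flash_attn package is shown below (assuming a CUDA GPU and half-precision tensors); this is generic library usage, not code specific to this fork.

```python
# Minimal flash_attn usage sketch (requires a CUDA GPU; fp16 or bf16 inputs).
import torch
from flash_attn import flash_attn_func

# q, k, v have shape (batch, seqlen, num_heads, head_dim)
q = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
k = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
v = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")

out = flash_attn_func(q, k, v, causal=True)   # exact attention without materializing the N x N matrix
print(out.shape)                              # torch.Size([2, 1024, 8, 64])
```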
wangxidong06/flash-linear-attention
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
wangxidong06/llama-mistral
Inference code for Mistral and Mixtral hacked up into original Llama implementation
wangxidong06/llama.cpp
Port of Facebook's LLaMA model in C/C++
wangxidong06/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
wangxidong06/LLMSFT_template
Scripts and code for various SFT acceleration frameworks
wangxidong06/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
wangxidong06/Megatron-LLaMA
Best practice for training LLaMA models in Megatron-LM
wangxidong06/neurips_llm_efficiency_challenge
NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day
wangxidong06/OpenAIAPI
Use the OpenAI API stably and quickly
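"Stably" usually comes down to retrying transient failures; a minimal sketch with the current openai Python SDK is below. The wrapper name, model name, and retry policy are illustrative assumptions, not this repository's actual interface.

```python
# Hedged sketch: call the OpenAI API with exponential-backoff retries.
# The wrapper name, model, and retry policy are illustrative, not this repo's API.
import time
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def chat_with_retry(messages, model="gpt-4o-mini", max_retries=5):
    for attempt in range(max_retries):
        try:
            resp = client.chat.completions.create(model=model, messages=messages)
            return resp.choices[0].message.content
        except Exception:                     # in practice, catch openai.RateLimitError / APIError
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)          # back off: 1s, 2s, 4s, ...

print(chat_with_retry([{"role": "user", "content": "Say hello."}]))
```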
wangxidong06/opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (LLaMA, LLaMA2, ChatGLM2, ChatGPT, Claude, etc.) over 50+ datasets.
wangxidong06/OpenRLHF
A Ray-based high-performance RLHF framework (7B on an RTX 4090, 34B on an A100)
wangxidong06/PromethAI-Memory
Memory management for AI applications and AI agents
wangxidong06/TensorRT
NVIDIA® TensorRT™, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for inference applications.
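As a reference point, building an FP16 engine from an ONNX model with the TensorRT Python API looks roughly like the sketch below (file names are placeholders; API as in TensorRT 8.x).

```python
# Rough sketch: ONNX -> serialized TensorRT engine with FP16 enabled (TensorRT 8.x Python API).
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:            # placeholder path
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)          # enable half-precision kernels
engine_bytes = builder.build_serialized_network(network, config)

with open("model.engine", "wb") as f:          # placeholder output path
    f.write(engine_bytes)
```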
wangxidong06/TinyLlama
wangxidong06/torchtitan
A native PyTorch Library for large model training
wangxidong06/UltraFastBERT
The repository for the code of the UltraFastBERT paper