A curated list of general AI methods for Anything: AnyObject, AnyGeneration, AnyModel, AnyTask, etc.
Contributions are welcome!
- Awesome-Anything
- AnyObject - Segmentation, Detection, Classification, Medical Image, OCR, etc.
- AnyGeneration - Text-to-Image Generation, Editing, Inpainting, 3D, etc.
- AnyTask - LLM Controller + ModelZoo, General Decoding, Multi-Task Learning.
- AnyModel - Network Pruning, Network Quantization, Model Reuse.
- AnyX - Other Topics: Captioning, etc.
- Paper List
Title & Authors | Intro | Useful Links |
---|---|---|
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace. Yongliang Shen, Kaitao Song, Xu Tan, Dongsheng Li, Weiming Lu, Yueting Zhuang. Preprint'23. [Jarvis (Project)] | | [Github] [Demo] |
TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs. Yaobo Liang, Chenfei Wu, Ting Song, Wenshan Wu, Yan Xia, Yu Liu, Yang Ou, Shuai Lu, Lei Ji, Shaoguang Mao, Yun Wang, Linjun Shou, Ming Gong, Nan Duan. Preprint'23. | | [Github] |
Generalized Decoding for Pixel, Image and Language. Xueyan Zou, Zi-Yi Dou, Jianwei Yang, Zhe Gan, Linjie Li, Chunyuan Li, Xiyang Dai, Harkirat Behl, Jianfeng Wang, Lu Yuan, Nanyun Peng, Lijuan Wang, Yong Jae Lee, Jianfeng Gao. CVPR'23. [X-Decoder (Project)] | | [Github] [Page] [Demo] |
Pre-Trained Image Processing Transformer. Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao. CVPR'21. [Pretrained-IPT (Project)] | | [Github] |
OpenAGI: When LLM Meets Domain Experts. Yingqiang Ge, Wenyue Hua, Jianchao Ji, Juntao Tan, Shuyuan Xu, Yongfeng Zhang. [OpenAGI (Project)] | | [Github] |
Title & Authors | Intro | Useful Links |
---|---|---|
DepGraph: Towards Any Structural Pruning. Gongfan Fang, Xinyin Ma, Mingli Song, Michael Bi Mi, Xinchao Wang. CVPR'23. [Torch-Pruning (Project)] | | [Github] [Demo] |
MQBench: Towards Reproducible and Deployable Model Quantization Benchmark. Yuhang Li, Mingzhu Shen, Jian Ma, Yan Ren, Mingxin Zhao, Qi Zhang, Ruihao Gong, Fengwei Yu, Junjie Yan. NeurIPS'21. [MQBench (Project)] | | [Github] [Page] |
OTOv2: Automatic, Generic, User-Friendly. Tianyi Chen, Luming Liang, Tianyu Ding, Ilya Zharkov. ICLR'23. [Only Train Once (Project)] | | [Github] |
Deep Model Reassembly. Xingyi Yang, Daquan Zhou, Songhua Liu, Jingwen Ye, Xinchao Wang. NeurIPS'22. [Deep Model Reassembly (Project)] | | [Github] [Page] |
Title & Authors | Intro | Useful Links |
---|---|---|
Caption Anything (Project). Teng Wang, Jinrui Zhang, Junjie Fei, Yunlong Tang, Zhe Li, Mingqi Gao. | | [Github] [Demo] |
Image2Paragraph: Transform Image into Unique Paragraph (Project). Jinpeng Wang. | | [Github] |
... |
A paper list for Anything AI
Paper | First Author | Venue | Topic |
---|---|---|---|
Segment Anything | Alexander Kirillov | Preprint'23 | Segmentation |
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection | Shilong Liu | Preprint'23 | Grounding + Detection |
SegGPT: Segmenting Everything In Context | Xinlong Wang | Preprint'23 | Segmentation |
V3Det: Vast Vocabulary Visual Detection Dataset | Jiaqi Wang | Preprint'23 | Dataset |
Paper | First Author | Venue | Topic |
---|---|---|---|
High-Resolution Image Synthesis with Latent Diffusion Models | Robin Rombach | CVPR'22 | Text-to-Image Generation |
Adding Conditional Control to Text-to-Image Diffusion Models | Lvmin Zhang | Preprint'23 | Controllable Generation |
GigaGAN: Large-scale GAN for Text-to-Image Synthesis | Minguk Kang | CVPR'23 | Large-scale GAN |
Paper | First Author | Venue | Topic |
---|---|---|---|
DepGraph: Towards Any Structural Pruning | Gongfan Fang | CVPR'23 | Network Pruning |
MQBench: Towards Reproducible and Deployable Model Quantization Benchmark | Yuhang Li | NeurIPS'21 | Network Quantization |
OTOv2: Automatic, Generic, User-Friendly | Tianyi Chen | ICLR'23 | Network Pruning |
Deep Model Reassembly | Xingyi Yang | NeurIPS'22 | Model Reuse |
Paper | First Author | Venue | Topic |
---|---|---|---|
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace | Yongliang Shen | Preprint'23 | ModelZoo + LLM |
TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs | Yaobo Liang | Preprint'23 | ModelZoo + LLM |
Generalized Decoding for Pixel, Image and Language | Xueyan Zou | CVPR'23 | Multi Tasking |
Pre-Trained Image Processing Transformer | Hanting Chen | CVPR'21 | Low-level Vision |