Pinned Repositories
RLIPv2
[ICCV 2023] RLIPv2: Fast Scaling of Relational Language-Image Pre-training
planets
Video-ChatGPT
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversations about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous quantitative evaluation benchmark for video-based conversational models.
Azure-Kinect-Sensor-SDK
A cross-platform (Linux and Windows) user-mode SDK to read data from your Azure Kinect device.
torchscale
Foundation Architecture for (M)LLMs