naykun

Joint Ph.D. student at MSR Asia & BUAA.

Beihang University, MSRA

Haidian, Beijing

naykun's Stars

qier222/YesPlayMusic
A good-looking third-party NetEase Cloud Music player, supporting Windows / macOS / Linux :electron:
Language: Vue 30.1k 221 1.8k 4.4k
remy/nodemon
Monitor for any changes in your node.js application and automatically restart the server - perfect for development
Language: JavaScript 26.4k 260 1.7k 1.7k
guidance-ai/guidance
A guidance language for controlling large language models.
Language: Jupyter Notebook 19.4k 119 5521.1k
BlinkDL/RWKV-LM
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RNN and transformer - great performance, linear time, constant space (no kv-cache), fast training, infinite ctx_len, and free sentence embedding.
Language: Python 13k 133 228882
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
Language: Jupyter Notebook 10.2k 97 679989
facebookresearch/dinov2
PyTorch code and models for the DINOv2 self-supervised learning method.
Language: Jupyter Notebook 9.7k 95 413862
threestudio-project/threestudio
A unified framework for 3D content generation.
Language: Jupyter Notebook 6.5k 77 337498
princeton-vl/infinigen
Infinite Photorealistic Worlds using Procedural Generation
Language: Python 6k 93 330497
tebelorg/RPA-Python
Python package for doing RPA
Language: Python 5k 106 547679
sanchit-gandhi/whisper-jax
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
Language: Jupyter Notebook 4.5k 44 182387
xinyu1205/recognize-anything
Open-source and strong foundation image recognition models.
Language: Jupyter Notebook 3k 27 163282
z-x-yang/Segment-and-Track-Anything
An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation purposes.
Language: Jupyter Notebook 2.9k 52 154343
One-2-3-45/One-2-3-45
[NeurIPS 2023] Official code of "One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization"
Language: Python 1.6k 71 5691
gitwatch/gitwatch
Watch a file or folder and automatically commit changes to a git repo easily.
Language: Shell 1.5k 45 58217
NVIDIA/aistore
AIStore: scalable storage for AI applications
Language: Go 1.4k 48 102188
facebookresearch/home-robot
Mobile manipulation research tools for roboticists
Language: Python 976 31 170133
Totoro97/f2-nerf
Fast neural radiance field training with free camera trajectories
Language: C 936 27 12568
allenai/mmc4
MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text.
Language: Python 916 9 1735
showlab/Image2Paragraph
[A toolbox for fun.] Transform Image into Unique Paragraph with ChatGPT, BLIP2, OFA, GRIT, Segment Anything, ControlNet.
Language: Python 800 11 3055
booydar/recurrent-memory-transformer
[NeurIPS 22] [AAAI 24] Recurrent Transformer-based long-context architecture.
Language: Jupyter Notebook 762 10 061
StanfordVL/OmniGibson
OmniGibson: a platform for accelerating Embodied AI research built upon NVIDIA's Omniverse engine. Join our Discord for support: https://discord.gg/bccR5vGFEx
Language: Python 587 21 50459
iejMac/video2dataset
Easily create large video dataset from video urls
Language: Python 560 9 15667
facebookresearch/LaViLa
Code release for "Learning Video Representations from Large Language Models"
Language: Python 499 8 3845
JialianW/GRiT
GRiT: A Generative Region-to-text Transformer for Object Understanding (https://arxiv.org/abs/2212.00280)
Language: Python 310 2 2230
iejMac/video2numpy
Optimized library for large-scale extraction of frames and audio from video.
Language: Python 202 3 2611
yilundu/cross_attention_renderer
CVPR 2023: Learning to Render Novel Views from Wide-Baseline Stereo Pairs
Language: Python 140 4 1111
iejMac/clip-video-encode
Easily compute clip embeddings from video frames
Language: Python 139 3 4119
JamesQFreeman/MicEye
Record radiologists' eye gaze when they are labeling images.
Language: Python 49 4 412
facebookresearch/vq2d_cvpr
This repo contains the code for the recipe of the winning entry to the Ego4d VQ2D challenge at CVPR 2022.
Language: Python 41 5 16
WikiChao/Ego-AV-Loc
[CVPR 2023] Egocentric Audio-Visual Object Localization
Language: Python 23 3 40
