Pinned Repositories
trl
Train transformer language models with reinforcement learning.
Adversarial-Inverse-Graphics-Networks-for-Faces
AlignProp
AlignProp uses direct reward backpropagation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more sample and compute efficient than reinforcement learning methods (PPO) for finetuning Stable Diffusion.
Diffusion-TTA
Diffusion-TTA improves pre-trained discriminative models such as image classifiers or segmentors using pre-trained generative models.
Disentangling-3D-Prototypical-Nets
We present neural architectures that disentangle RGB-D images into objects' shapes and styles and a map of the background scene, and explore their applications for few-shot 3D object detection and few-shot concept classification.
EmbLang
Embodied Language Grounding With 3D Visual Feature Representations
Navigation-Deep-RL
This is a repo for navigating in a Unity environment using a deep Q-learning network in PyTorch.
ProbabilisticNeuralProgrammedNetwork_Tensorflow
Code for "Probabilistic Neural Programmed Networks for Scene Generation", Deng et al., NIPS 2018. This code base is a TensorFlow 2.0 port of the official PyTorch implementation.
Slot-TTA
Slot-TTA shows that test-time adaptation using slot-centric models can improve image segmentation on out-of-distribution examples.
VADER
Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope and StableVideoDiffusion by finetuning them using various reward models such as HPS, PickScore, VideoMAE, VJEPA, YOLO, Aesthetics etc.
mihirp1998's Repositories
mihirp1998/AlignProp
AlignProp uses direct reward backpropagation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more sample and compute efficient than reinforcement learning methods (PPO) for finetuning Stable Diffusion.
mihirp1998/VADER
Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope and StableVideoDiffusion by finetuning them using various reward models such as HPS, PickScore, VideoMAE, VJEPA, YOLO, Aesthetics etc.
mihirp1998/Diffusion-TTA
Diffusion-TTA improves pre-trained discriminative models such as image classifiers or segmentors using pre-trained generative models.
mihirp1998/Slot-TTA
Slot-TTA shows that test-time adaptation using slot-centric models can improve image segmentation on out-of-distribution examples.
mihirp1998/cmu-vision.github.io
mihirp1998/ComplexAutoEncoder
Code for the paper: Complex-Valued Autoencoders for Object Discovery
mihirp1998/DALLE-pytorch
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch
mihirp1998/Depth-Adaptive-Visual-Tracking-tf
This is an unofficial implementation of the paper "Depth-Adaptive Computational Policies for Efficient Visual Tracking" by Chris Ying and Katerina Fragkiadaki.
mihirp1998/dino
PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
mihirp1998/DiT
mihirp1998/dotfiles
mihirp1998/grok
mihirp1998/hora
In-Hand Object Rotation via Rapid Motor Adaptation (CoRL 2022)
mihirp1998/mae_aug
PyTorch implementation of MAE https://arxiv.org/abs/2111.06377
mihirp1998/Maskgit-pytorch
mihirp1998/multiobjective_symbolic_regression
This is a Python library that implements a Multi-objective Symbolic Regression algorithm. It can be used as a Machine Learning algorithm to create predictive models in the form of mathematical expressions.
mihirp1998/OFA
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
mihirp1998/project_template
mihirp1998/pymunk
Pymunk is an easy-to-use pythonic 2d physics library that can be used whenever you need 2d rigid body physics from Python.
mihirp1998/retrieval-latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
mihirp1998/rlbench
mihirp1998/RLBench-1
A large-scale benchmark and learning environment.
mihirp1998/SlotCon
(NeurIPS 2022) Self-Supervised Visual Representation Learning with Semantic Grouping
mihirp1998/stable-dreamfusion
A working implementation of text-to-3D dreamfusion, powered by stable diffusion.
mihirp1998/time-change
mihirp1998/tmp
mihirp1998/tmp1
mihirp1998/tmp3
mihirp1998/toy-diffusion
A toy implementation of a diffusion model for low-dimensional data
mihirp1998/trl
Train transformer language models with reinforcement learning.