

LDM: Large Tensorial SDF Model for Textured Mesh Generation

This is the official implementation of LDM: Large Tensorial SDF Model for Textured Mesh Generation.



Features and Todo List

  • 🔥 Release Hugging Face Gradio demo.
  • 🔥 Release inference and training code.
  • 🔥 Release pretrained models.
  • Release the training data list.
  • Support text to 3D generation.
  • Support image to 3D generation using various multi-view diffusion models, including Imagedream and Zero123plus.

Install

# xformers is required! Please refer to https://github.com/facebookresearch/xformers for details.
# We recommend `Python>=3.10`, `PyTorch>=2.1.0`, and `CUDA>=12.1`.

pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cu121

pip3 install -U xformers --index-url https://download.pytorch.org/whl/cu121

# other dependencies
pip install -r requirements.txt
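After installation, it can help to confirm which versions actually landed in the environment. The following is a minimal sketch, not part of the repository: the helper names `package_version` and `environment_report` are hypothetical, and the script only reads package metadata, so it runs even when a package is missing.

```python
import importlib.metadata
import importlib.util
import sys


def package_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        return None


def environment_report(packages=("torch", "torchvision", "torchaudio", "xformers")):
    """Collect Python and package versions without importing the packages."""
    report = {"python": ".".join(str(p) for p in sys.version_info[:3])}
    for name in packages:
        report[name] = package_version(name)
    return report


if __name__ == "__main__":
    for name, version in environment_report().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
    # Only query CUDA when torch is actually importable.
    if importlib.util.find_spec("torch") is not None:
        import torch

        print("CUDA available:", torch.cuda.is_available())
        print("CUDA build:", torch.version.cuda)
```

If `xformers` is reported as not installed, or the CUDA build does not match the `cu121` wheels above, reinstalling from the same index URL is the first thing to try.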

Pretrained Weights

Our pretrained weights can be downloaded from Hugging Face.

For example, to download the fp16 model for inference:

mkdir pretrained && cd pretrained
wget https://huggingface.co/rgxie/LDM/resolve/main/LDM_6V_SDF.ckpt
cd ..
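Where `wget` is not available, the same download can be sketched in Python with the standard library only. The function names `resolve_url` and `download_checkpoint` are hypothetical helpers written for this example; the repository id and file name are taken from the command above.

```python
import os
import urllib.request


def resolve_url(repo_id, filename, revision="main"):
    """Build the direct download URL for a file in a Hugging Face repo."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


def download_checkpoint(repo_id="rgxie/LDM", filename="LDM_6V_SDF.ckpt",
                        out_dir="pretrained"):
    """Download the checkpoint into out_dir, skipping files that already exist."""
    os.makedirs(out_dir, exist_ok=True)
    target = os.path.join(out_dir, filename)
    if os.path.exists(target):
        return target
    # Write to a temporary name first so an interrupted download
    # never leaves a truncated file under the final name.
    partial = target + ".part"
    urllib.request.urlretrieve(resolve_url(repo_id, filename), partial)
    os.replace(partial, target)
    return target
```

Calling `download_checkpoint()` from the repository root should leave the file at `pretrained/LDM_6V_SDF.ckpt`, the path used by the inference commands.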

The weights of the diffusion model will be downloaded automatically.

Inference

# Gradio app for both text-to-3D and image-to-3D; the weights of our model are downloaded automatically.
python app.py

# image to 3d
# --workspace: folder to save output (*.obj,*.jpg)
# --test_path: path to a folder containing images, or a single image
python infer.py tiny_trf_trans_sdf_123plus --resume pretrained/LDM_6V_SDF.ckpt --workspace workspace_test --test_path example --seed 0

# text to 3d
# --workspace: folder to save output (*.obj,*.jpg)
python infer.py tiny_trf_trans_sdf --resume pretrained/LDM_6V_SDF.ckpt --workspace workspace_test --txt_or_image True --mvdream_or_zero123 True --text_prompt 'a hamburger' --seed 0

For more options, please check options. If you find the output unsatisfying, try using different multi-view diffusion models or seeds!
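Trying several seeds can be scripted. The sketch below only assembles the image-to-3D command shown above once per seed, using the flags that appear in that command; the helper names and the per-seed workspace naming (`workspace_seed0`, `workspace_seed1`, ...) are assumptions made for this example. With `dry_run=True` (the default) nothing is executed and the commands are just returned.

```python
import subprocess
import sys


def build_infer_cmd(config, ckpt, workspace, test_path, seed):
    """Assemble one infer.py invocation as an argument list."""
    return [
        sys.executable, "infer.py", config,
        "--resume", ckpt,
        "--workspace", workspace,
        "--test_path", test_path,
        "--seed", str(seed),
    ]


def sweep_seeds(seeds, config="tiny_trf_trans_sdf_123plus",
                ckpt="pretrained/LDM_6V_SDF.ckpt", test_path="example",
                dry_run=True):
    """Build (and optionally run) one inference command per seed."""
    commands = []
    for seed in seeds:
        # Hypothetical naming: a separate output folder per seed.
        cmd = build_infer_cmd(config, ckpt, f"workspace_seed{seed}", test_path, seed)
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)
    return commands
```

For example, `sweep_seeds([0, 1, 2], dry_run=False)` would run three inferences in sequence from the repository root, writing each result to its own folder for side-by-side comparison.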

Training

Preparing:

Training dataset: Our training dataset is based on GObjaverse, which can be downloaded from here. Specifically, we use the filtered subset of about 80K objects listed by LGM. The data list can be found here. Then configure the options as follows:

  • data_path: The directory where your downloaded dataset is stored.
  • data_list_path: The path to the data list file.
  • The structure of dataset:
|-- data_path
    |-- dictionary_id
        |-- instance_id.rar    
        |-- ...
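Before launching training, the downloaded data can be checked against this layout. The following is a minimal sketch under the assumption that every first-level folder under `data_path` holds `.rar` archives named by instance id, as in the tree above; `scan_dataset` and `summarize` are hypothetical helpers, and the format of the data list file is not assumed here.

```python
from pathlib import Path


def scan_dataset(data_path):
    """Map each first-level folder name to the sorted .rar archive stems inside it."""
    root = Path(data_path)
    layout = {}
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        layout[folder.name] = sorted(f.stem for f in folder.glob("*.rar"))
    return layout


def summarize(layout):
    """Count folders and archives, and list folders that contain no archive."""
    empty = [name for name, items in layout.items() if not items]
    total = sum(len(items) for items in layout.values())
    return {"folders": len(layout), "archives": total, "empty_folders": empty}
```

Running `summarize(scan_dataset(data_path))` gives a quick count to compare with the number of entries expected from the data list, and flags folders where the download may be incomplete.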

Pretrained model: Our model is trained starting from the pretrained OpenLRM model, so please download the pretrained model here and place it in the ‘pretrained’ directory.

Training: The minimum recommended configuration for training is 8 × A6000 GPUs with 48 GB of memory each.

# step 1: To speed up the convergence of training, we start by not cropping patches. Instead, we use a lower resolution and train with a larger batch size initially.
accelerate launch --config_file acc_configs/gpu8.yaml main.py tiny_trf_trans_sdf --output_size 64 --batch_size 4 --lr 4e-4 --num_epochs 50 --is_crop False --resume pretrained/openlrm_m_l.safetensors --workspace workspace_nocrop


# step 2: Next, we introduce patch cropping and increase the patch resolution to capture finer details.
accelerate launch --config_file acc_configs/gpu8.yaml main.py tiny_trf_trans_sdf --output_size 128 --batch_size 1 --gradient_accumulation_steps 2 --lr 2e-5 --num_epochs 50 --is_crop True --resume workspace_nocrop/last.ckpt --workspace workspace_crop

# (optional) step 3: To adapt the model to the 6-view inputs from Zero123plus, we refine the model obtained in the earlier stages.
accelerate launch --config_file acc_configs/gpu8.yaml main.py tiny_trf_trans_sdf_123plus --output_size 128 --batch_size 1 --gradient_accumulation_steps 2 --lr 1e-5 --num_epochs 20 --resume workspace_crop/last.ckpt --workspace workspace_refine


# (optional) step 4: Use a FlexiCubes layer to further improve the texture details.
accelerate launch --config_file acc_configs/gpu8.yaml main.py tiny_trf_trans_mesh --output_size 512 --batch_size 1 --gradient_accumulation_steps 1 --lr 1e-5 --num_epochs 20 --resume the_path_of_sdf_ckpt/last.ckpt --workspace workspace_mesh
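When adapting these commands to fewer GPUs, it is useful to know the effective batch size each stage implies. The sketch below uses the usual definition (processes × per-process batch size × gradient accumulation steps) and rests on three assumptions that are not confirmed by this README: that `acc_configs/gpu8.yaml` launches 8 processes, that `--batch_size` is per process, and that `--gradient_accumulation_steps` defaults to 1 when omitted.

```python
def effective_batch_size(num_gpus, batch_size, grad_accum_steps=1):
    """Samples contributing to one optimizer step across all processes."""
    return num_gpus * batch_size * grad_accum_steps


# Values copied from the commands above; the default of 1 accumulation
# step for step 1 is an assumption, since that command omits the flag.
STAGES = {
    "step 1": {"batch_size": 4, "grad_accum_steps": 1},
    "step 2": {"batch_size": 1, "grad_accum_steps": 2},
    "step 3": {"batch_size": 1, "grad_accum_steps": 2},
    "step 4": {"batch_size": 1, "grad_accum_steps": 1},
}

if __name__ == "__main__":
    for name, cfg in STAGES.items():
        print(name, effective_batch_size(num_gpus=8, **cfg))
```

Under these assumptions, step 1 uses 32 samples per optimizer step, steps 2 and 3 use 16, and step 4 uses 8. On 4 GPUs, for instance, doubling `--gradient_accumulation_steps` would keep the same effective batch size, although memory use and training time will differ.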

Acknowledgement

This work is built on many amazing research works and open-source projects. Thanks a lot to all the authors for sharing!

Citation

@article{xie2024ldm,
  title={LDM: Large Tensorial SDF Model for Textured Mesh Generation},
  author={Xie, Rengan and Zheng, Wenting and Huang, Kai and Chen, Yizheng and Wang, Qi and Ye, Qi and Chen, Wei and Huo, Yuchi},
  journal={arXiv preprint arXiv:2405.14580},
  year={2024}
}