Weixiang Sun1*, Xiaocao You2*, Ruizhe Zheng3*, Zhengqing Yuan4, Xiang Li5, Lifang He6, Quanzheng Li5, Lichao Sun6
1Northeastern University, 2Shanghai University of Finance and Economics, 3Fudan University, 4University of Notre Dame, 5Massachusetts General Hospital and Harvard Medical School, 6Lehigh University
- [2024.6.19] We release Bora, a video generation model designed specifically for the biomedical domain.
Endoscopy | Ultrasound | RT-MRI | Cell |
---|---|---|---|
# create a virtual env
conda create -n bora python=3.10
# activate virtual environment
conda activate bora
# install torch
# We recommend torch==2.2.2 with CUDA 12.1
pip install torch torchvision
# install flash attention
pip install packaging ninja
pip install flash-attn --no-build-isolation
# install apex
# We recommend installing from source
git clone https://github.com/NVIDIA/apex.git
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./
# install xformers
pip install -U xformers --index-url https://download.pytorch.org/whl/cu121
# install opensora (run from the root of this repository, not from the apex directory)
pip install -v .
Before running, besides Bora's weights, you also need to download the weights for the VAE and Text Encoder. We have provided all the links in the table below:
Bora | Video Encoder | Text Encoder |
---|---|---|
Bora | VAE | T5 |
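Since inference and training need all three sets of weights, it can help to confirm they are on disk before launching. The following is a minimal sketch, not part of the Bora repository: the helper name `check_weights` and the example paths are hypothetical, so substitute the locations where you saved each download.

```shell
# check_weights: print every path that does not exist and
# return non-zero if at least one is missing (hypothetical helper).
check_weights() {
  missing=0
  for p in "$@"; do
    if [ ! -e "$p" ]; then
      echo "missing: $p" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call with hypothetical paths (uncomment and adjust):
# check_weights ./pretrained/bora.pth ./pretrained/vae ./pretrained/t5
```

If the function returns non-zero, download the missing item from the table above before running the commands below.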
# on single card
torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/infer.py --ckpt-path Bora_CKPT
# on multiple cards (replace N with the number of GPUs)
torchrun --standalone --nproc_per_node N scripts/inference.py configs/infer.py --ckpt-path Bora_CKPT
# on four cards
torchrun --nnodes=1 --nproc_per_node=4 scripts/train_origin.py configs/train.py --data-path CSV_PATH --ckpt-path Bora_CKPT
To launch training on multiple nodes, prepare a hostfile as described in the ColossalAI documentation, then run the following command.
colossalai run --nproc_per_node 8 --hostfile hostfile scripts/train_origin.py configs/train.py --data-path CSV_PATH --ckpt-path Bora_CKPT
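As a rough sketch of the hostfile mentioned above: to our understanding of the ColossalAI launcher, the file lists one hostname per line, and the nodes must be reachable from the launching machine over passwordless SSH. The hostnames `node-1` and `node-2` are placeholders, not real machines; check the ColossalAI documentation for the exact requirements of your version.

```shell
# Write a two-node hostfile; replace the placeholder hostnames
# with the names of your own machines.
cat > hostfile <<'EOF'
node-1
node-2
EOF

# Show how many nodes the file lists.
wc -l < hostfile
```

With `--nproc_per_node 8`, a two-line hostfile like this would start eight processes on each of the two nodes.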
If you're using Bora in your research or applications, please cite using this BibTeX:
@article{sun2024bora,
title={Bora: Biomedical Generalist Video Generation Model},
author={Sun, Weixiang and You, Xiaocao and Zheng, Ruizhe and Yuan, Zhengqing and Li, Xiang and He, Lifang and Li, Quanzheng and Sun, Lichao},
journal={arXiv preprint arXiv:2407.08944},
year={2024}
}
We are grateful for the following works and their generous contributions to open source.
Open-Sora: Democratizing Efficient Video Production for All
LLaVA: Large Language and Vision Assistant
Apex: A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch