snap-stanford/med-flamingo

GPU requirements (and other dependencies)

Opened this issue · 31 comments

evolu8 commented

Hi

It would be great to include GPU RAM and storage requirements in the README. The requirements appear to be non-trivial for many.

Great work. Thank you for opening! It is much appreciated.

Hello, I cannot use this model in Google Colab. Is that related to the GPU and RAM requirements? I'm new to working with LLMs and couldn't figure out how to use this model on Colab.

mi92 commented

Thank you, we will look into this. We used a single 40 GB A100 for inference (without any inference optimization). It may also work with slightly less GPU memory (I think 35 GB), but I will update the README once I have rerun some jobs to check.
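As a rough cross-check of these figures, the memory needed just to hold the weights can be estimated from the parameter count and the numeric precision. The sketch below assumes about 7 billion parameters for the language model alone (the `huggyllama/llama-7b` checkpoint named later in this thread). It ignores the vision encoder, the cross-attention layers, activations, and the CUDA context, so real usage will be higher. The function name and the parameter count are illustrative assumptions, not measurements.

```python
# Back-of-the-envelope estimate of weight memory; not a measurement.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the GiB needed to store `num_params` weights in `dtype`."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    assumed_params = 7e9  # assumption: language model only, roughly 7B parameters
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_memory_gib(assumed_params, dtype):.1f} GiB")
```

Under these assumptions, full precision needs about 26 GiB for the language-model weights alone, which already exceeds a 24 GB card, while half precision needs about 13 GiB.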

It doesn't run on my 3090; it fails when trying to allocate only 64 MiB:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 23.69 GiB total capacity; 23.01 GiB already allocated; 55.94 MiB free; 23.02 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
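The error message itself says when `max_split_size_mb` is worth trying: only if reserved memory is much larger than allocated memory. A minimal sketch for reading those figures out of the message follows. The helper names and the 1.2 ratio are assumptions chosen for illustration; PyTorch does not define a specific threshold.

```python
import re

# Matches figures such as "23.69 GiB total capacity" or "55.94 MiB free".
_FIELD = re.compile(
    r"([\d.]+) (MiB|GiB) (total capacity|already allocated|free|reserved)"
)
_TO_GIB = {"MiB": 1 / 1024, "GiB": 1.0}

# The message reported in this thread.
MESSAGE = (
    "CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 23.69 GiB total "
    "capacity; 23.01 GiB already allocated; 55.94 MiB free; 23.02 GiB reserved "
    "in total by PyTorch)"
)


def parse_oom_message(message: str) -> dict:
    """Extract the memory figures (in GiB) from a CUDA OOM message."""
    return {
        label: float(value) * _TO_GIB[unit]
        for value, unit, label in _FIELD.findall(message)
    }


def looks_fragmented(stats: dict, ratio: float = 1.2) -> bool:
    """Heuristic with an assumed threshold: reserved is much larger than allocated."""
    return stats["reserved"] > ratio * stats["already allocated"]


if __name__ == "__main__":
    stats = parse_oom_message(MESSAGE)
    print(stats)
    print("fragmentation suspected:", looks_fragmented(stats))
```

For the message above, reserved (23.02 GiB) and allocated (23.01 GiB) are nearly equal, so the card is genuinely full and tuning `max_split_size_mb` is unlikely to help.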

Works for me.

accelerator = Accelerator() #when using cpu: cpu=True

Add cpu=True

accelerator = Accelerator(cpu=True)
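A minimal sketch of making that switch explicit, assuming the `accelerate` and `torch` packages from the demo are installed. `accelerator_kwargs` and `build_accelerator` are hypothetical helper names, not part of the Med-Flamingo code; the only library features used are the `cpu` argument of `Accelerator` and `torch.cuda.is_available()`.

```python
def accelerator_kwargs(cuda_available: bool, force_cpu: bool = False) -> dict:
    """Keyword arguments for `accelerate.Accelerator`.

    CPU is selected when it is forced (e.g. the GPU is too small)
    or when no CUDA device is available at all.
    """
    return {"cpu": force_cpu or not cuda_available}


def build_accelerator(force_cpu: bool = False):
    """Create an Accelerator, falling back to CPU as described above."""
    # Imported lazily so accelerator_kwargs can be used without these packages.
    import torch
    from accelerate import Accelerator

    return Accelerator(**accelerator_kwargs(torch.cuda.is_available(), force_cpu))
```

In the demo, `accelerator = build_accelerator(force_cpu=True)` would take the place of the `Accelerator()` line. Running on CPU avoids the GPU out-of-memory error but moves the memory demand to system RAM and is much slower.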

I tried running on Google Colab. Are there any specific requirements or steps that I might be missing?

2023-08-12 08:59:25.005895: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Loading model..
Using pad_token, but it is not set yet.
^C

I found a way to make it work on free Colab. I'm going to share my notebook soon!
The end-to-end demo needs 14.2 GB to run!

Check out this notebook for running Med-Flamingo on Google Colab, even with limited GPU:
Med_Flamingo on free colab
Big thanks to @NouamaneTazi for the helpful instructions. Give it a star if you find it useful!!

Hey Abir196, thanks for the notebook. When I ran it on my Colab it hit a GPU RAM issue, so I got Colab Pro and had to wait some time to get an A100 (with 40 GB), and your notebook works! So kudos for that. Your code consumes 35.9 GB out of 40 GB on the A100.

Hey folks!! Could you share your outputs? I have tested the notebook provided by @Abir196, thx a lot btw.

Hey, could you run the notebook on free Colab?

This is https://colab.research.google.com/drive/15yxkczdWkB2Aagf-6c4yx7l_rjguPBVK?usp=sharing

Stanford team, do you have a video about how to run inference, with a little explanation? I think it would be a great idea to have one, because I have run demo.py and I have some questions about how the output works and even how the prompt should be written.

To anyone using this notebook: please create your own copy in Drive. Thx.

I tried to run this notebook, but my session crashed after running this block:

# Initialize Accelerator
accelerator = Accelerator()  # Use hardware acceleration (GPU or TPU) based on availability
device = accelerator.device

print('Loading model..')

# Import create_model_and_transforms function
from open_flamingo import create_model_and_transforms

# Initialize the Flamingo model, image processor, and tokenizer
model, image_processor, tokenizer = create_model_and_transforms(
    clip_vision_encoder_path="ViT-L-14",
    clip_vision_encoder_pretrained="openai",
    lang_encoder_path="huggyllama/llama-7b",
    tokenizer_path="huggyllama/llama-7b",
    cross_attn_every_n_layers=4,

Make sure that you are using GPU runtime.
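One way to confirm that the runtime actually exposes a GPU is to ask `nvidia-smi`, which needs nothing beyond the standard library to call. This is a sketch under the assumption that an NVIDIA driver is what the runtime provides; `visible_gpus` is a hypothetical helper name.

```python
import shutil
import subprocess


def visible_gpus() -> list:
    """Return one 'name, total memory' string per GPU, or [] if none is visible."""
    if shutil.which("nvidia-smi") is None:
        return []  # no NVIDIA driver tools on PATH, so probably a CPU runtime
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = visible_gpus()
    print(gpus if gpus else "No GPU visible: switch the runtime type to GPU.")
```

An empty list means the notebook is on a CPU runtime, or that the driver tools are missing.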

I forgot something. The requirements.txt did not work for me, so if you have compatibility issues, you can change the following lines in the file:

backports.zoneinfo==0.2.1;python_version<"3.9"
decorator==4.4.2
numpy==1.23.4
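The three pins above can also be applied with a short script rather than by hand. This is only a sketch: `apply_pins` is a hypothetical helper, and it assumes each package sits on its own line in requirements.txt.

```python
import re

# Pins suggested in this thread; every other line in the file is left untouched.
PINS = {
    "backports.zoneinfo": 'backports.zoneinfo==0.2.1;python_version<"3.9"',
    "decorator": "decorator==4.4.2",
    "numpy": "numpy==1.23.4",
}


def apply_pins(requirements_text: str, pins: dict = PINS) -> str:
    """Replace the lines for pinned packages, keeping all other lines as they are."""
    out = []
    for line in requirements_text.splitlines():
        # The package name is whatever precedes the first version or marker symbol.
        name = re.split(r"[=<>!~;\[\s]", line.strip(), maxsplit=1)[0].lower()
        out.append(pins.get(name, line))
    return "\n".join(out) + "\n"
```

With `path = pathlib.Path("requirements.txt")` (the location is an assumption), `path.write_text(apply_pins(path.read_text()))` rewrites the file in place.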

I changed these but got an error:
ERROR: Cannot uninstall 'blinker'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.

You can skip it.

After running this block, I get a bug report:

from huggingface_hub import hf_hub_download
import torch
import os
from open_flamingo import create_model_and_transforms
from accelerate import Accelerator
from einops import repeat
from PIL import Image
import sys

# Append paths for custom modules
sys.path.append('/content/med-flamingo/scripts')
sys.path.append('/content/med-flamingo')
from src.utils import FlamingoProcessor
from demo_utils import image_paths, clean_generation

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues

bin /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cuda118.so
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so.11.0
CUDA SETUP: Highest compute capability among GPUs detected: 7.5
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cuda118.so...
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: /usr/lib64-nvidia did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
warn(msg)
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/sys/fs/cgroup/memory.events /var/colab/cgroup/jupyter-children/memory.events')}
warn(msg)
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('http'), PosixPath('8013'), PosixPath('//172.28.0.1')}
warn(msg)
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('--logtostderr --listen_host=172.28.0.12 --target_host=172.28.0.12 --tunnel_background_save_url=https'), PosixPath('//colab.research.google.com/tun/m/cc48301118ce562b961b3c22d803539adc1e0c19/gpu-t4-s-1ajde1tuf2ve4 --tunnel_background_save_delay=10s --tunnel_periodic_background_save_frequency=30m0s --enable_output_coalescing=true --output_coalescing_required=true')}
warn(msg)
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/env/python')}
warn(msg)
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('//ipykernel.pylab.backend_inline'), PosixPath('module')}
warn(msg)
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: Found duplicate ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] files: {PosixPath('/usr/local/cuda/lib64/libcudart.so.11.0'), PosixPath('/usr/local/cuda/lib64/libcudart.so')}.. We'll flip a coin and try one of these, in order to fail forward.
Either way, this might cause trouble in the future:
If you get CUDA error: invalid device function errors, the above might be the cause and the solution is to make sure only one ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] in the paths that we search based on your env.
warn(msg)

After that I still ran the code I mentioned in my last comment, and the session crashed again. I watched my GPU usage and it didn't move: although I'm on a GPU runtime, only system RAM was used.

That is a warning. You can continue running the notebook.

[Screenshot: photo_2023-08-18 00 54 19]

Here I am using the GPU runtime. Why isn't GPU RAM being used?

Continue running it; after running inference you should see a peak in GPU RAM use.

But I cannot continue running. My session crashed and restarted after that RAM usage.
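The symptom described here (system RAM fills up, GPU usage stays flat, then the session restarts) is consistent with the checkpoint being loaded into host memory before anything is moved to the GPU. A quick sketch for checking how much system RAM the runtime has, using only the standard library. The helper names and the default headroom are assumptions, and the `sysconf` names used are available on Linux runtimes such as Colab.

```python
import os


def total_system_ram_gib() -> float:
    """Total physical RAM in GiB, from the POSIX sysconf values."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    page_count = os.sysconf("SC_PHYS_PAGES")
    return page_size * page_count / 2**30


def fits_in_ram(required_gib: float, headroom_gib: float = 1.0) -> bool:
    """True if `required_gib` plus some headroom fits in physical RAM."""
    return required_gib + headroom_gib <= total_system_ram_gib()


if __name__ == "__main__":
    print(f"System RAM: {total_system_ram_gib():.1f} GiB")
```

If the estimated size of the weights does not fit, the load step will crash the session regardless of which GPU is attached.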

Remember this point, @thedaffodil: for another user the notebook consumed 35.9 GB out of 40 GB on an A100. Try using Colab Pro.

That's why I asked at the beginning if you used the free version:) Thank you for your time.

I missed that point. My apologies!! I'm busy with school and work stuff, so I got confused. Sorry, man. Please reach out to me at genarohazael@gmail.com for future discussions.

It's okay. Good luck with your work. I'll reach out, thanks again.

Yes, I used free colab (using T4 GPU). When running it cell by cell, at which cell does it crash for you?

# Import necessary libraries
from huggingface_hub import hf_hub_download
import torch
import os
from accelerate import Accelerator  # Import Accelerate library for hardware acceleration
from einops import repeat
from PIL import Image
import sys

# Append paths for custom modules
sys.path.append('/content/med-flamingo/scripts')
sys.path.append('/content/med-flamingo')
from src.utils import FlamingoProcessor
from demo_utils import image_paths, clean_generation

# Initialize Accelerator
accelerator = Accelerator()  # Use hardware acceleration (GPU or TPU) based on availability
device = accelerator.device

print('Loading model..')

# Import create_model_and_transforms function
from open_flamingo import create_model_and_transforms

# Initialize the Flamingo model, image processor, and tokenizer
model, image_processor, tokenizer = create_model_and_transforms(
    clip_vision_encoder_path="ViT-L-14",
    clip_vision_encoder_pretrained="openai",
    lang_encoder_path="huggyllama/llama-7b",
    tokenizer_path="huggyllama/llama-7b",
    cross_attn_every_n_layers=4,

[Screenshot: photo_2023-08-23 08 46 27]

After this, my session crashed.

Could you give this new version a try? You can find it at Med_Flamingo_on_free_colab. The cell you indicated worked successfully on my end!

It worked thank you!

@Abir196 hi, thank you for your colab! I tried to run it, but I got an error in the final step of generation, the details are shown as follows:

Generate from multimodal few-shot prompt
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-16-92b680a2f954> in <cell line: 9>()
     10 
     11     # Generate text using the model
---> 12     generated_text = model.generate(
     13         vision_x=pixels.to(device),  # Convert images to the device
     14         lang_x=tokenized_data["input_ids"].to(device),  # Convert text input to the device

5 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in __getattr__(self, name)
   1612             if name in modules:
   1613                 return modules[name]
-> 1614         raise AttributeError("'{}' object has no attribute '{}'".format(
   1615             type(self).__name__, name))
   1616 

AttributeError: 'FlamingoLayer' object has no attribute 'self_attn'

Do you have any idea how to address it? I really appreciate your help!
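An `AttributeError` on a layer attribute can come from a mismatch between package versions, so listing what is installed is a reasonable first step when reporting it. The sketch below does not fix the error; it only gathers versions. The distribution names in `PACKAGES` are assumptions about what the notebook installs.

```python
from importlib import metadata

# Distribution names are assumptions about what the notebook installs.
PACKAGES = ("torch", "transformers", "accelerate", "open-flamingo", "bitsandbytes")


def installed_versions(packages=PACKAGES) -> dict:
    """Map each package to its installed version, or None if it is not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Posting this output next to the traceback makes it easier for others to compare against an environment where the notebook works.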