lucidrains/transfusion-pytorch

PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from Meta AI

Python, MIT

Issues

todo
#29 opened 20 days ago by lucidrains
0
Image Generation
#22 opened a month ago by taeinkwon
75
issue with checkpoint saving
#30 opened 19 days ago by mijanur132
1
Generate text description of image prompt
#27 opened 20 days ago by mijanur132
4
Unsure of the functionality of one line
#28 opened 21 days ago by Ivan-Zhong
1
Any idea about image caption implementation.
#26 opened a month ago by Ma-Weijian
3
Example of using VAE for image.
#20 opened a month ago by dingkwang
3
Overfit example seems to modify the text_and_images in place
#24 opened a month ago by robertmash2
2
Default times are multiplied by batch size
#23 opened a month ago by RefractAI
4
Issue with "decoding text" not finishing
#21 opened 2 months ago by siyuan5
3
Why are you eliminating the influence of positional information on modalities using RoPE?
#19 opened 2 months ago by shin-wn
15
modality_length_to_times_fn always default
#18 opened 2 months ago by RefractAI
1
[BUG] Fail to run the test samples
#17 opened 3 months ago by Masaaki-75
4
Offset setting miss due to addition of meta_id and modality_meta_info?
#16 opened 3 months ago by shin-wn
1
omnigen plz
#15 opened 3 months ago by af-74413592
0
Could you prepare a reasoning demo?
#13 opened 3 months ago by win10ogod
1
Got error when running example
#12 opened 3 months ago by Bing1002
6
text layernorm
#11 opened 3 months ago by cliangyu
2
Question about Diffusion Loss
#8 opened 4 months ago by JJJYmmm
1
Bug: modality_token_transform is empty?
#6 opened 4 months ago by GindaChen
3
Is there any pretrained model weight?
#5 opened 4 months ago by chenfengshijie
1
Question
#1 opened 4 months ago by zhang-haojie
1
Batch size >1 not working, and loss queries
#3 opened 4 months ago by RefractAI
1
Question about num_text_tokens setting in Transfusion
#4 opened 4 months ago by wdlctc
4