fe1ixxu/ALMA

State-of-the-art LLM-based translation models.

Ruby · MIT

Issues

Unable to Reproduce ALMA-7b-LoRA Performance, Seeking Assistance
#58 opened 2 months ago by liangyingshao
6
Fail to load ALMA-13B
#50 opened 2 months ago by wygao8
1
error on pretraining
#57 opened 2 months ago by tsbiosky
1
Using custom monolingual data instead of OSCAR dataset
#28 opened 8 months ago by learnercat
3
Construction of the monolingual dataset
#41 opened 7 months ago by ywlq
2
Issues with Translation Quality Using ALMA/ALMA-R Models on Multi-Domain Dataset
#56 opened 3 months ago by cocaer
2
citation for CPO paper
#48 opened 3 months ago by kashif
1
[Bug/Feature] The dataset isn't reading the same cache_dir
#35 opened 3 months ago by alvations
2
GPUs used during parallel data fine-tuning
#55 opened 3 months ago by liangyingshao
1
I changed the .sh script to evaluate my data, but got an error
#54 opened 4 months ago by DengNingyuan
0
torch version issue
#53 opened 4 months ago by DengNingyuan
0
Unknown source language
#52 opened 4 months ago by noahdasanaike
0
predict problem
#51 opened 4 months ago by leee-SeungHyeon
0
Questions about Inference
#47 opened 5 months ago by kira-lin
1
preprocess_cpo_data
#46 opened 5 months ago by martimfasantos
1
Question about ALMA(R)
#45 opened 5 months ago by mru4913
2
OOM issue; the GPU is an A00 40G
#42 opened 6 months ago by gongye19
4
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory
#43 opened 6 months ago by gongye19
2
CPO question
#44 opened 6 months ago by gongye19
2
Error on running the evaluation command
#39 opened 7 months ago by Amrit-Bhaskar-abhask10
2
[Question] Exception during parallel data finetuning
#40 opened 7 months ago by aiyubx
1
Question on cpo loss
#36 opened 7 months ago by vince62s
1
No such file or directory
#38 opened 7 months ago by hxue3
0
Training metrics currently not logged?
#37 opened 7 months ago by SirRob1997
2
Loading ALMA-7B-R (LORA merged) through huggingface downloads Pretrained + LORA
#30 opened 7 months ago by tranvaj
1
How much parallel data?
#27 opened 7 months ago by zidsi
1
[Question] Replicating ALMA by training from scratch
#34 opened 7 months ago by alvations
2
Got Errors when pretraining LLaMA-2 on Monolingual Dataset
#33 opened 8 months ago by vhientran
4
[Question] Suggested machine and GPUs to run the training
#6 opened a year ago by alvations
2
Running parallel_ft_lora.sh
#31 opened 8 months ago by zhengkid
2
Pretraining inquiry.
#32 opened 8 months ago by gyupro
3
A couple of questions for your theory
#29 opened 8 months ago by gyupro
2
Incomplete Translation from English to Chinese although `max_tokens` is enough
#24 opened 8 months ago by DeyangKong
3
Running `runs/parallel_ft_lora.sh` gives overflow
#25 opened 9 months ago by hndrstwn
2
How much CPO data set is expected to be needed when creating a one-to-one machine translator?
#26 opened 9 months ago by qwopqwop200
3
Fine-tuning on longer contexts for better performance?
#23 opened 9 months ago by NilanEkanayake
1
</s> or eos needed for other base models?
#22 opened 9 months ago by zidsi
2
DPODataCollatorWithPadding( TypeError: __init__() got an unexpected keyword argument 'max_length'
#21 opened 9 months ago by sahsaeedi
2
The English-Chinese translation is incomplete.
#20 opened 9 months ago by detectRecog
2
How to fix error to access huggingface?
#18 opened 9 months ago by vhientran
1
Polite form selection
#13 opened 10 months ago by cmp-nct
5
Regarding the memory usage of full-weight fine-tuning
#17 opened a year ago by Franciscus-Carolus
2
Suggestion on foundation model
#16 opened a year ago by cmp-nct
4
Release `Random` and `Filtered` parallel corpora
#14 opened a year ago by zwhe99
2
NotImplementedError: all_exhausted stopping strategy in `interleave_datasets` is not implemented yet with a list of <class 'datasets.iterable_dataset.IterableDataset'>.
#15 opened a year ago by zwhe99
2
About how to specify pairs in `Parallel_ft.sh`.
#12 opened a year ago by kyoto7250
2
Questions about reproducing the results of the paper
#9 opened a year ago by ZeroneBo
4
What do I need to add a new language?
#8 opened a year ago by MohamedAliRashad
1
7B or 13B?
#7 opened a year ago by geronimi73
1
[Question] Is "this" needed in the general prompt?
#5 opened a year ago by alvations
1