Issues
Fail to load ALMA-13B
#50 opened by wygao8 - 1
error on pretraining
#57 opened by tsbiosky - 3
Issues with Translation Quality Using ALMA/ALMA-R Models on Multi-Domain Dataset
#56 opened by cocaer - 1
citation for CPO paper
#48 opened by kashif - 2
GPUs used during parallel data fine-tuning
#55 opened by liangyingshao - 0
torch version issue
#53 opened by DengNingyuan - 0
Unknown source language
#52 opened by noahdasanaike - 0
predict problem
#51 opened by leee-SeungHyeon - 1
Questions about Inference
#47 opened by kira-lin - 1
preprocess_cpo_data
#46 opened by martimfasantos - 2
Question about ALMA(R)
#45 opened by mru4913 - 4
OOM issue, GPU is A00 40G
#42 opened by gongye19 - 2
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory
#43 opened by gongye19 - 2
CPO question
#44 opened by gongye19 - 2
Question on cpo loss
#36 opened by vince62s - 0
No such file or directory
#38 opened by hxue3 - 2
Training metrics currently not logged?
#37 opened by SirRob1997 - 1
Loading ALMA-7B-R (LORA merged) through huggingface downloads Pretrained + LORA
#30 opened by tranvaj - 1
How much parallel data?
#27 opened by zidsi - 2
Running parallel_ft_lora.sh
#31 opened by zhengkid - 3
Pretraining inquiry.
#32 opened by gyupro - 2
A couple of questions for your theory
#29 opened by gyupro - 3
Incomplete Translation from English to Chinese although `max_tokens` is enough
#24 opened by DeyangKong - 2
running `runs/parallel_ft_lora.sh` gives overflow
#25 opened by hndrstwn - 3
How much CPO data set is expected to be needed when creating a one-to-one machine translator?
#26 opened by qwopqwop200 - 1
</s> or eos needed for other base models?
#22 opened by zidsi - 2
DPODataCollatorWithPadding( TypeError: __init__() got an unexpected keyword argument 'max_length'
#21 opened by sahsaeedi - 2
How to fix error to access huggingface?
#18 opened by vhientran - 5
Polite form selection
#13 opened by cmp-nct - 2
Suggestion on foundation model
#16 opened by cmp-nct - 2
Release `Random` and `Filtered` parallel corpora
#14 opened by zwhe99 - 2
NotImplementedError: all_exhausted stopping strategy in `interleave_datasets` is not implemented yet with a list of <class 'datasets.iterable_dataset.IterableDataset'>.
#15 opened by zwhe99 - 2
About how to specify pairs in `Parallel_ft.sh`.
#12 opened by kyoto7250 - 4
7B or 13B ?
#7 opened by geronimi73 - 1