Issues
LLaMA3 supports an 8K-token context length. When continually pretraining on proprietary data, most of the text is significantly shorter than 8K tokens, which results in a substantial amount of padding. To improve training efficiency and effectiveness, multiple short texts need to be merged into a longer sequence whose length stays below 8K tokens. The question is how these short texts should be combined into a single training sequence: should they be separated by delimiters, or should cross-document attention be masked during pretraining?
#1128 opened by Karliz24 - 0
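The packing question in #1128 can be illustrated with a minimal sketch. This is not the method used by any official training script; the `EOS` id and the greedy first-fit strategy are assumptions for illustration, and documents here are plain lists of token ids.

```python
# Hypothetical sketch: greedily pack short tokenized documents into
# training sequences of at most MAX_LEN tokens, inserting an EOS
# delimiter after each document (the delimiter approach from the issue).

MAX_LEN = 8192
EOS = 128001  # assumed end-of-text token id; the real value depends on the tokenizer

def pack_documents(docs):
    """docs: list of token-id lists. Returns packed sequences, each <= MAX_LEN."""
    packed, current = [], []
    for tokens in docs:
        # +1 accounts for the EOS delimiter appended after this document
        if current and len(current) + len(tokens) + 1 > MAX_LEN:
            packed.append(current)
            current = []
        current.extend(tokens)
        current.append(EOS)
    if current:
        packed.append(current)
    return packed

# Toy example: three "documents" of 5000, 2000, and 4000 tokens
docs = [[1] * 5000, [2] * 2000, [3] * 4000]
seqs = pack_documents(docs)
print([len(s) for s in seqs])  # -> [7002, 4001]
```

The masking alternative mentioned in the issue would keep the same packing but additionally zero out attention between tokens belonging to different documents, so that packed neighbors do not leak context into each other.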
[Parallel MD5] Accelerating `download.sh`
#1127 opened by DEKHTIARJonathan - 0
Unable to access the Hugging Face Llama-3 model repo
#1126 opened by Dounx - 0
how to download this model
#1121 opened by ZHEGG - 3
download.sh didn't work well
#1110 opened by BiyuHuang - 0
Providing SHA-256 hashes
#1122 opened by vargar2 - 0
No response from request to access models
#1120 opened by pwo42 - 0
Test Tokenizer gives Incorrect padding error
#1119 opened by dinuransika - 0
How can I evaluate on math datasets like GSM8K?
#1118 opened by junseo-jang - 0
Unable to access the model
#1100 opened by felix-hh - 4
How can I solve this? I can't use the llama-7B demo
#1065 opened by handicaps - 14
Agnostic Atheist AI not Normal
#1108 opened by JairusDavid - 4
Can not download Python model - 403 Forbidden
#1102 opened by shakasaki - 0
[Generation, Question] Why does the `seed` have to be the same in different processors (`Llama.build`)?
#1114 opened by keli-wen - 0
Change the name of openai to closeai and change the project name to openai.
#1112 opened by liupengzhouyi - 0
parameter count of Llama2-70B and Llama2-13B
#1111 opened by joyjitkundu032 - 0
Discussing a potential bias in Llama2-Chat that can lead to content safety issues
#1109 opened by LLM-DRA - 1
### System Info
#1105 opened by Karliz24 - 0
Architecture
#1106 opened by daniel-deychakiwsky - 0
Review of Hugging Face model access request pending for too long
#1104 opened by Coco58323 - 0
ValidationError: Input validation error: `inputs` must have less than 4096 tokens. Given: 4545
#1103 opened by asma-10 - 0
How can I run inference in C?
#1101 opened by toutouya - 0
Will the cache kv become invalid?
#1099 opened by oslijunw - 0
After adding tokens, the model doubles in size.
#1095 opened by supech - 0
Translator Layer proposal
#1096 opened by IlyaGazman - 2
Some generation issues.
#1081 opened by zero-NP - 0
If I select China as my country, it will show that my link is invalid.
#1093 opened by ZhangYouJie-Major - 0
The response from meta-llama/Llama-2-7b-chat-hf ends with an incomplete sentence when I try to run inference.
#1088 opened by YanjingRen - 0
Can't download llama weight file
#1087 opened by shigengtian - 0
How can I pretrain it with my own dataset
#1086 opened by aritralegndery - 6
evaluation
#1085 opened by ZHANGJINKUI - 0
Why are there two sets of large files here, one with the `.safetensors` suffix and the other with the `.bin` suffix?
#1083 opened by hanxunyu - 3
example_chat_completion.py demo for llama-2-7B-chat is unusable. Dependency bugs
#1066 opened by tvarner - 0
Secure Delivery of Trained LLM for Client Demo
#1080 opened by humza-sami - 5
llama-2-7b-hf output almost OOOO
#1078 opened by SXxinxiaosong - 1
cannot find pytorch_model-00001-of-00003.bin
#1073 opened by Hannan1002 - 1
Seems to keep answering NULL string
#1071 opened by GlintFreedom - 0
403 Forbidden, after downloading 96%
#1070 opened by jkwiatk1 - 0
Could embeddings be dynamically constructed with reserved weights to indicate end of word / word with comma after / etc to save tokens?
#1072 opened by CodeExplode