Issues
For llama3 incremental pretraining, which layers should be frozen and which trained for the best results?
#137 opened by CanvaChen
Is there a gender and age detection model available?
#136 opened by broadcast98
How do I use the 70B model? The files on Hugging Face look different from the other models
#135 opened by afezeriaWrnbbmm
The example for pretrain.py seems to contain an error
#134 opened by xinghudamowang
How much CPU memory is needed when fine-tuning with deepspeed?
#126 opened by Batmana
What are the minimum server hardware requirements?
#132 opened by jqs1124
Does anyone have the Pile dataset? The 825 GB version with 22 sources
#131 opened by AI-Study-Han
Are there any plans to train a 3B model? The use case cannot tolerate high latency
#130 opened by zjuzhfbloodz
The group QR code in the README has expired
#129 opened by potong
On preprocessing of parallel corpora
#93 opened by lyy-zz
What exactly is the prompt template for Chinese-LLaMA-2-13B-hf samples?
#128 opened by sunzhaowei
About Chinese-LLaMA-2-13B (HF format)
#125 opened by sun1092469590
Please clarify the license for Chinese-LLaMA-2
#124 opened by JayLiangs
The WeChat group is full; please upload a new WeChat QR image. I can serve as an admin for free
#123 opened by ArtificialZeng
Asking a question in multi-turn dialogue immediately raises an error
#122 opened by caowenhero
python3 llama_server.py produces garbled output
#121 opened by caowenhero
ChatFlow-13B.bin is only 136 bytes
#120 opened by NewEricWang
About the pretraining corpus length for the 33B model
#96 opened by minlik
The openllama-13b model on Hugging Face is 26.4 GB; after conversion to the Hugging Face format it is 24.7 GB, which suggests the model is stored in fp16 or bf16
#119 opened by belle9217
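A quick sanity check on the sizes reported in #119, assuming a checkpoint of roughly 13 billion parameters stored at 2 bytes per parameter (fp16 or bf16):
13 × 10^9 params × 2 bytes/param ≈ 26 GB,
which is consistent with the ~26.4 GB and ~24.7 GB files mentioned above; the exact figure depends on the true parameter count and on GB versus GiB reporting.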
How many GPUs was Chinese-LLaMA-33B trained on, and for how long?
#117 opened by JingxinLee
Any plans to extend the context window of the language model to 32K via position interpolation?
#116 opened by xfg0913
Does the loss function used for instruction fine-tuning differ from the one used in pretraining?
#115 opened by dazhaxie0526
When running inference with open-llama13B, the output is in English
#114 opened by yating0823
When running inference with openllama13B + openmodel, the output is all numbers. Are additional steps required?
#104 opened by suhaibo1
Two related questions about openllama
#106 opened by lucasjinreal
Incremental training of falcon on a Chinese corpus
#113 opened by fengstar7827
The group QR code in the README has expired
#99 opened by aihaidong
Am I using it wrong? Even simple inference does not work
#112 opened by Mousaic
Multi-machine pre-training hangs
#111 opened by BUPTAnderson
Is there a download link for a Chinese falcon model?
#110 opened by AlexXx-Wu
Wrong arguments
#109 opened by jeffchy
The openllama 13b base model generates strange content
#107 opened by lucasjinreal
Are there evaluation results for the incrementally pretrained 13B base model?
#105 opened by caihaunqai
How to cite?
#98 opened by hackerchenzhuo
Incremental pretraining exits with return code = -9 on a single A100 with 80 GB of memory
#103 opened by pydaxing
When converting OpenLLaMA-13B to the HF format, convert_llama_from_tencentpretrain_to_hf.py simply copies the tokenizer.model vocabulary and open_llama.model is never used. Is this expected?
#94 opened by chk4991
Pretraining corpus formatting
#101 opened by treya-lin
Performance comparison between the 7B model and billa
#100 opened by lucasjinreal
When will the 65B model be released?
#97 opened by Expert68
How to convert openllama 13b to HF format?
#95 opened by lin1490188
The group QR code has expired; could a new one be posted?
#87 opened by zhangfan-algo
Is it possible to support OPT models
#88 opened by treya-lin
openllama performance evaluation
#92 opened by enbacoo
How to deploy the Chinese-LLaMA-33B (HF format) model and run inference?
#90 opened by xfg0913
How to convert the 33B Hugging Face format model to TencentPretrain format?
#89 opened by lyy-zz