baichuan-inc/Baichuan-7B
A large-scale 7B pretraining language model developed by BaiChuan-Inc.
Python · Apache-2.0
Issues
- 0
[Question] Terminal error when installing dependencies (deepspeed)
#146 opened by duolaBmeng673 - 0
[Question] The WeChat group QR code has expired
#145 opened by yzhao-2023 - 1
[Question] Cannot install xformers
#144 opened by Acid-uncoin - 1
- 0
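[Question] Is there anything to watch out for after merging parameters? After merging the 7B weights with my fine-tuned weights and loading the new model, GPU memory usage exceeds 24 GB, far more than the original 7B needs. What could cause this?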
#142 opened by Micla-SHL - 0
What is the difference between the baichuan2 and baichaun2-7B repositories?
#141 opened by fxb392 - 7
[Question] Single-machine, single-GPU training fails with an error: gradients cannot be initialized.
#109 opened by xkjcf - 0
[Question] Will Baichuan-Text-Embedding be open-sourced, or made available through a free or paid API? Thanks
#140 opened by Yazooliu - 0
[Question] I want to use Baichuan-7B to build a Chinese text-correction feature, mainly for wrongly written characters. Is this feasible?
#139 opened by suchstar - 3
[BUG] CUDA Out of Memory when eval model.
#133 opened by Crystalxd - 0
Regarding the throughput measured on the A800: converted to inference speed, how many tokens/s is that?
#138 opened by HJT9328 - 0
[Typo]
#137 opened by Chandler-Bing - 11
Extending the context length to 8k or 32k with position interpolation
#97 opened by Louis-y-nlp - 0
[Question] The RoPE implementation is inconsistent with the paper
#136 opened by zehmaaa - 0
[Question] Can you provide a download source for the model inside China?
#134 opened by liulfy - 0
Could you provide a file similar to open_api.py so that we can test through an API?
#131 opened by mawenju203 - 1
[Question] Does the 7B model not use FlashAttention?
#130 opened by nezhazheng - 0
[Evaluation] Provide evaluation results for Baichuan models on OpenCompass
#128 opened by Leymore - 0
[Question] Baichuan-7B multi-GPU native deployment, and int8 and int4 quantized deployment
#127 opened by potong - 0
[Question] How to do multi-GPU native deployment, and int8 and int4 quantized deployment, of Baichuan-7B
#126 opened by potong - 0
[Question] How to deploy Baichuan-7B on multiple GPUs
#125 opened by potong - 9
[Question] Training on a single A100 80G still runs out of memory
#92 opened by wac81 - 0
[Question] A question about data processing
#124 opened by mynewstart - 2
[Question] When continuing pretraining, the loss stays at around 2.2. Was this also the case during the authors' pretraining stage?
#106 opened by chenglu66 - 0
I want to pretrain a general-purpose model. Could you provide demo data for the sample data loading here?
#121 opened by wangweihua11 - 1
- 0
How should I write the prompt to have the model supply the preceding or following line of a classical poem?
#120 opened by goog - 0
pretrain learning rate is le-8?
#119 opened by hegang1-tal - 0
After deployment, how do I call the model through an API?
#118 opened by lemon-simple - 0
[Question] Hello, could you share the code for training the tokenizer model? Or is there any reference?
#117 opened by StarrySeas1 - 1
[Question] How do I continue pretraining on a single machine with multiple GPUs?
#94 opened by xiaozhu1106 - 3
[Question] When training a vertical-domain model, how many tokens of incremental pretraining are needed to get reasonably good results?
#112 opened by parkLGW - 0
[Question]
#116 opened by wqmoran - 1
What is the minimum GPU memory required for inference deployment? And how much system memory? [Question]
#110 opened by ArlanCooper - 0
[Question] What loss level counts as acceptable for continued pretraining?
#115 opened by parkLGW - 0
Can I use baichuan 7b for reading comprehension?
#114 opened by powerpistn - 0
Can the 7B train.py be used for full-parameter fine-tuning and full-parameter instruction fine-tuning of the 13B model? [Question]
#113 opened by quzx - 1
What package is categories in evaluate_mmlu.py? Is it the pycategories package?
#101 opened by kunzeng-ch - 0
[Question] Is there example word-segmentation code for the Tasks.word_segmentation task?
#108 opened by luxiaobai007 - 0
- 0
[Question]
#105 opened by felixdae - 0
[Question] Why does the output include the input?
#104 opened by gaogaocn - 5
[Question] The model uses 28 GB of GPU memory?
#96 opened by goodnessSZW - 0
[Question] A question about model parameters
#103 opened by L-hongbin - 0
[Question] Are there plans to release smaller versions, such as 3B or 1B?
#102 opened by tingxinli1 - 0
[Question] How can I test output that reaches the Max_token limit?
#100 opened by cason0126 - 1
[Question] Details of the model's evaluation on agi-eval
#98 opened by yangkexin - 1
How can this model be used for multiple-choice questions? [Question]
#90 opened by sc-lj - 1