deepseek-ai/DeepSeek-V2
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
MIT
Issues
How do I compute embeddings with DeepSeek in LangChain?
#67 opened by ShuoAndy - 3
Is there an example of 128k inference?
#56 opened by 520jefferson - 1
Request: a VS2022 extension
#68 opened by woaidianqian - 2
How to reliably trigger tool_calls with the online API
#84 opened by wssnail - 0
Inquiry about Key/Value Storage and Matrix Merging in DeepSeekerV2 Inference Code
#92 opened by xlim1996 - 3
ValueError: The model's max seq len (163840) is larger than the maximum number of tokens that can be stored in KV cache (13360). Try increasing gpu_memory_utilization or decreasing max_model_len when initializing the engine.
#83 opened by ArtificialZeng - 5
Function Calling is harder to trigger than before
#88 opened by whoisfucker - 1
Will the Deepseek platform's API call be updated to support generating multiple texts (n>1)?
#55 opened by zchuz - 0
Exploring the Combined Effects of YaRN and Adjusted rope_base Values in deepseek v2
#87 opened by hannlp - 1
Error executing method determine_num_available_blocks: vLLM multi node fails for both DeepSeek-Coder-V2-Instruct and DeepSeek-Coder-V2-Lite-Instruct
#76 opened by liangfang - 1
Dependency problem when starting the DeepSeek-V2-Lite-Chat model
#78 opened by Malowking - 0
Question about the design of bos and eos token
#85 opened by jojo23333 - 1
The open-source code on HuggingFace does not seem to implement matrix merging
#80 opened by meteorlin - 2
empty response from server
#82 opened by 879611427 - 4
It won't answer questions about the events that transpired in Tiananmen Square from April 15, 1989, to June 4, 1989.
#57 opened by richpav - 3
Does multi-turn training need a special separator, and which separator should be used?
#79 opened by AceCHQ - 0
How to choose the GPU, CPU, and memory for a self-built LLM server
#77 opened by zhanghanting - 6
How to fine-tune deepseek v2 models?
#40 opened by satheeshkatipomu - 0
Error when loading the 0628 version
#75 opened by bestpredicts - 2
How do I call DeepSeek using dspy's methods?
#71 opened by buchikeke - 0
about the active param counts of DeepSeek-V2-Lite
#73 opened by imhmhm - 0
hi, could you provide a code like llama3?
#53 opened by lambda7xx - 2
Add support llama.cpp
#69 opened by techn0man1ac - 0
Hello, is the source code available to view?
#65 opened by Darleen71 - 0
The same request fails with the same error on repeated consecutive attempts
#64 opened by gauss-clb - 2
How to optimize the prompt definition when using DeepSeek for text moderation
#61 opened by xfghvgnfyjssjgte - 1
About the 128k needle-in-a-haystack test results for DeepSeek-Coder-V2-Lite-Base
#59 opened by chaochen99 - 0
How to make the model stop automatically once it has finished answering
#60 opened by hensiesp32 - 1
The role field in Chat API responses should not be set to null
#54 opened by jichulu - 2
Drop Token
#48 opened by Richie-yan - 1
Hello, function/tool calling is not supported at the moment; are there plans to support it?
#47 opened by cristianohello - 1
has it function calling?
#45 opened by cristianohello - 1
has it function calling?
#46 opened by cristianohello - 0
Knowledge cutoff date
#50 opened by Shadow-Alex - 1
docker for vllm. with deepseekv2 support merged
#44 opened by supdizh - 0
Are there plans to upload deepseek-v2-lite to ModelScope?
#43 opened by Tendo33 - 0
Does the cached C<sup>KV</sup><sub>t</sub> need to be stored on every GPU for multi-GPU parallel inference?
#41 opened by c-dafan