tensorchord/modelz-llm
OpenAI-compatible API for LLMs and embeddings (LLaMA, Vicuna, ChatGLM, and many others)
Python · Apache-2.0
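Since the project advertises an OpenAI-compatible API, clients are expected to send the standard chat-completions request shape to a local server. Below is a minimal sketch of that request body; the base URL, endpoint path, and model name are assumptions for illustration, not taken from the project's documentation (note that issue #8 discusses removing `v1` from the URI, so the exact path may differ).

```python
import json

# Assumed local server address for a modelz-llm deployment (hypothetical).
BASE_URL = "http://localhost:8000"
# Endpoint path follows the OpenAI convention; see issue #8 on the v1 prefix.
CHAT_PATH = "/v1/chat/completions"


def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completions request body.

    The `model` field is required by the API (see issue #50,
    "Object missing required field `model`").
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


# "vicuna-7b" is a placeholder model name, not a confirmed identifier.
payload = build_chat_request("vicuna-7b", "Hello!")
print(json.dumps(payload))
```

Any OpenAI SDK or plain HTTP client can POST this payload to the server; the response mirrors the OpenAI chat-completions schema.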
Issues
- Local custom .gguf models supported? (#102, opened by sengiv)
- Can't use it on Windows (#98, opened by yogeshhk)
- Missing a LICENSE (#99, opened by loleg)
- bug: Failed to generate outputs (#97, opened by gaocegege)
- Function calling feature (#96, opened by willswordh)
- add llama-2 (#90, opened by antonkulaga)
- do we support vicuna 13b, chatglm2? (#88, opened by timiil)
- feat: provide instructions on how community members can wrap models for this project (#87, opened by PaulConyngham)
- bug: Completion request returns wrong response (#83, opened by gaocegege)
- chore: Fix vicuna 7b (#82, opened by gaocegege)
- feat: support chatgpt web (#79, opened by dgqyushen)
- feat: Support falcon 7b (#77, opened by gaocegege)
- bug: Unexpected OOM in ChatGLM 6B (#69, opened by gaocegege)
- feat: Support more models (#11, opened by gaocegege)
- bug: Vicuna performance is not great (#73, opened by gaocegege)
- bug: RuntimeError: "LayerNormKernelImpl" not implemented for 'Half' in chatglm int4 (#62, opened by gaocegege)
- bug: AttributeError: 'ChatGLMForConditionalGeneration' object has no attribute 'encoder' (#63, opened by arugal)
- bug: CUDA installed again in the image (#55, opened by gaocegege)
- bug: RuntimeError: Only Tensors of floating point and complex dtype can require gradients (#60, opened by gaocegege)
- bug: Object missing required field `model` (#50, opened by gaocegege)
- test: check if the embedding API is compatible (#45, opened by kemingy)
- bug: Extra blank space in output (#48, opened by gaocegege)
- feat: Support flag to use CPU/GPU (#13, opened by gaocegege)
- feat: Add CLI argument int8 and int4 (#38, opened by gaocegege)
- feat: Support embedding API (#14, opened by gaocegege)
- bug: 500 with langchain sdk (#16, opened by gaocegege)
- feat: Remove v1 from URI (#8, opened by gaocegege)