Explanation:
This interface adopts the same input and output format as openai, fully compatible with the openai interface, and can be adapted to your application by simply changing one line of code, so as to access a large number of open source projects, open source models, and faster and more cost-effective services. This interface does not include the knowledge base processing capability of the Exhibition Wisdom Assistant.
Interface format reference: https://platform.openai.com/docs/api-reference/chat/object
API test page:
API access point: Please contact us on WeChat for OpenID.
Overseas node (faster, supports stream): https://v.stylee.top:8899/llm_api/{openid}/{userid}/{AI_system}/v1/chat/completions
China (does not support native stream):
https://ai.zyinfo.pro:8882/llm_api/{openid}/{userid}/{AI_system}/v1/chat/completions
API example: https://v.stylee.top:8899/llm_api/test/test/qwen/v1/chat/completions
Method: POST
URL parameter description:
URL filling format interpretation: fixed part of the URL access point + openid / userid / AI_system / + v1/chat ...
openid: please contact us to create, WeChat youkpan. There are more discounts for large quantities.
userid: Please refer to Exhibition Wisdom Assistant Knowledge Base AI Dialogue Interface to create userid, or fill in any one.
AI_system: processing model (if AI_system is empty, the model parameter in POST json will be used. API access point: ...{userid}//v1/chat...) :
Support temprature, top_p, frequency penalty (default is 0) and other parameters, compatible with openai interface.
AI_system Processing Model |
Model Name | Context | Price Yuan/1K token |
Chinese Ability | Explanation |
---|---|---|---|---|---|
gpt-3.5-turbo | ChatGPT (for testing, not supported yet) |
4K | 0.042 | Good | Usage limit |
llama2-70b | LLama2 70B | 4k | 0.019 | Average | Unlimited |
qwen | Alibaba Qianwen2 72B | 32k | 0.04 | Excellent | Unlimited |
baichuan | Baichuan 53B | 4k | 0.04 | Excellent | Limited, maximum 60 requests/min |
llama2-32k | LLAMA2 7B | 32K | 0.0042 | Poor | Unlimited |
mistral | mistral 7B language model | 4K | 0.0042 | Average | Unlimited, good 7B model. It is recommended to put the content in the first Message information. This model does not follow the instruction output. |
misins | mistral chat 7B | 4K | 0.0042 | Average | Unlimited, with instruction fine-tuning |
misorca | mistral chat + orca 7B | 8K | 0.0042 | Average | Unlimited, with instruction fine-tuning of large models, recommended for logical processing environment |
misins-moe | Mixtral-8x7B Instruct expert model | 32K | 0.0084 | Average | Unlimited, used for expert-level reasoning, with good effect |
misins-moe2 | Mixtral-8x7B Instruct expert model 2 |
32K | 0.0084 | Average | Unlimited, third-party fine-tuning version, used for expert-level reasoning, GSM8k effect is average. |
chat5-v | Visual model 5 | 8K | Calculated by processing time 0.015 yuan/S |
Good | Unlimited, used for visual input processing. Before calling, please refer to the uploadfile interface to upload image streams and instructions. |
deepseek-coder-33b | Programming expert 33b | 16K | 0.017 | Excellent | Unlimited, used for general software writing interaction, instruction interface |
codellama-34b | Code LLama 34b Instruct | 8K | 0.017 | Average | Unlimited, used for general software writing interaction, instruction interface |
codellama-13b | Code LLama 13b Instruct | 8K | 0.0063 | Average | Unlimited, used for general software writing interaction, instruction interface |
vicuna-16k | vicuna 13b 16k | 16K | 0.0063 | Average | Unlimited, used for longer context English processing |
yi-34b | Yi 34b Chat | 4K | 0.011 | Very good | Unlimited, used for longer context Chinese processing, shorter output content. |
code3 | Code expert model AIcoder 34B | 16K | 0.011 | Average | Unlimited, used for code completion |
pycode3 | Code expert model AIcoder 34B | 16K | 0.012 | Average | Unlimited, used for code completion, instruction fine-tuning |
yi-34b-t | Yi 34b Chat | 4K | 0.019 yuan/S | Very good | Unlimited, used for longer context Chinese processing, shorter output content. |
llama2-70b-t | llama2 70b for time fee | 4K | 0.019 yuan/S | Average | Unlimited, used for longer context English processing, shorter output content. |
qwen-14b-t | qwen-14b-chat | 8K | 0.015 yuan/S | Very good | Unlimited |
orca2-t | orca2 13B | 4K | 0.015 yuan/S | Average | Unlimited, good output logic |
mistral-t | mistral 7B | 8K | 0.012 yuan/S | Average | Unlimited |
misorca-t | mistral chat + orca 7B | 8K | 0.012 yuan/S | Average | Unlimited |
misins-moe-t | Mixtral-8x7B Instruct expert model | 32K | 0.019 yuan/S | Average | Unlimited, used for expert-level reasoning, if the output is shorter, the cost is lower, and the effect is good |
pycode3-t | python code expert model AIcoder 34b | 16K | 0.012 yuan/S | Average | Unlimited, used for code completion |
pycode3.1-t | python code expert model AIcoder 34b | 16K | 0.016 yuan/S | Average | Unlimited, used for code completion |
code3-t | Code expert model AIcoder 34B | 16K | 0.01 yuan/S | Average | Unlimited, used for code completion |
code3.0-t | Code expert model AIcoder 34B | 16K | 0.01 yuan/S | Average | Unlimited, used for code completion |
code3.1-t | Code expert model AIcoder 34B | 16K | 0.019 yuan/S | Average | Unlimited, used for code completion |
code2-t | Code expert model AIcoder 15B | 16K | 0.016 yuan/S | Average | Unlimited, used for code completion |
code2.1-t | Code expert model AIcoder 15B | 16K | 0.016 yuan/S | Average | Unlimited, used for code completion |
phi2-t | Small model phi 2.7B | 2K | 0.0032 yuan/S | Average | Unlimited, |
● X yuan/S, represents billing based on processing time, used for processing longer context, shorter output content. However, for models that have not been requested for a long time or are not frequently used, it may be necessary to reload the model, which takes 3 to 5 minutes. The cost does not include the model loading period.
Interrupt the current generation by calling the interface:
API: APIendpoint + "/suspend_current_generate?openid=&userid="
Check your account balance:
https://v.stylee.top:8882/api_user/info?openid=&type=
Please complete the openid field, and the type can be json.
For example, for the test account test:
https://ai.zyinfo.pro:8882/api_user/info?openid=test&type=
Privacy statement:
For companies with privacy concerns, we can help build a local deployment environment in the future. Except for legal requirements, we do not retain your content, make every effort to protect information security, do not involve your data in training, and employees have signed confidentiality agreements. Generally, employees do not have permission to view the processed content.
For access and obtaining openid, please contact us:
WeChat: youkpan
Email: youkpan@gmail.com
Shenzhen ZhanYing Information Technology zyinfo.pro