Hk-Gosuto/ChatGPT-Next-Web-LangChain

[Feature] Consider supporting file upload

lialzm opened this issue · 44 comments

Is your feature request related to a problem? Please describe.

GPT-4 can now process files. Would you consider adding file upload as well?

gpt-4-vision-preview?

Could you please also update the Docker image 😂 so that it supports image upload?

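You can use this image: gosuto/chatgpt-next-web-langchain:20231206
Whenever there is a major feature change, I manually build an image tagged with the date. If you want to try new features early, keep an eye on the tags page:
https://hub.docker.com/repository/docker/gosuto/chatgpt-next-web-langchain/tags?page=1&ordering=last_updated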

image

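Found a bug: this Docker image handles image recognition fine, but after switching to another model, plain text input cannot be sent and a JS error is reported.
@Hk-Gosuto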

I believe that was fixed yesterday. I'll build a new image shortly.

Use gosuto/chatgpt-next-web-langchain:20231207

Thanks, it works now. Very nice and powerful 👍

There seems to be an intermittent bug: sometimes the API request is sent without the image.
Steps to reproduce: select an image first, then type the text.

I found that it reproduces when sending with the keyboard shortcut; clicking the send button with the mouse works.

gpt-4-vision-preview?

Yes, and Whisper as well. There are many scenarios where a file needs to be uploaded and handed to a plugin for processing. I also wanted to ask: are there many restrictions on doing file operations directly on Vercel? I tried it, and file downloads were very slow; larger files did not work at all.

Vercel's Edge Functions cannot call some core Node.js functions, such as those for file operations.
The runtime can be switched to Node.js, but for ordinary users the execution time of an endpoint is limited under the Node.js runtime, so requests time out easily.
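The runtime trade-off described here can be sketched as Next.js route segment config. This is only an illustration and not taken from the project: the file path is hypothetical, and the 10-second value is a placeholder, since the limit Vercel enforces depends on the account plan.

```typescript
// Hypothetical route file, e.g. app/api/file/route.ts (the path is an assumption).

// "edge" cannot use Node.js core modules such as fs; "nodejs" allows them.
export const runtime = "nodejs";

// Upper bound on execution time, in seconds. Placeholder value: the maximum
// that is actually permitted depends on the Vercel plan.
export const maxDuration = 10;
```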

There seems to be an intermittent bug: sometimes the API request is sent without the image. Steps to reproduce: select an image first, then type the text.

I found that it reproduces when sending with the keyboard shortcut; clicking the send button with the mouse works.

This is fixed on the main branch. Use this Docker image: gosuto/chatgpt-next-web-langchain:20231207-2

image The current approach looks like it will exceed the storage limit.

I see that images are currently sent as base64. If they were uploaded to R2 and passed to OpenAI as an image URL, wouldn't that also save on token costs?

Images are billed separately, based on their pixel dimensions.
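A rough sketch of what "billed by pixels" means, following the tile-based rule OpenAI documented for gpt-4-vision-preview at the time: fit the image within 2048 x 2048, shrink the shortest side to 768, then charge 170 tokens per 512-pixel tile plus a flat 85. The function name is made up for illustration and the rule may have changed since. The point is that the cost depends on image dimensions and the `detail` setting, not on whether the image arrives as base64 or as a URL.

```typescript
type Detail = "low" | "high";

// Hypothetical helper: estimates the token cost of one image.
function estimateImageTokens(
  width: number,
  height: number,
  detail: Detail = "high",
): number {
  // Low detail is a flat fee regardless of size.
  if (detail === "low") return 85;

  // Step 1: scale down (never up) to fit within a 2048 x 2048 square.
  const fit = Math.min(1, 2048 / Math.max(width, height));
  let w = width * fit;
  let h = height * fit;

  // Step 2: scale down so the shortest side is at most 768 px.
  // Assumption: images that are already smaller are left as they are.
  const shrink = Math.min(1, 768 / Math.min(w, h));
  w *= shrink;
  h *= shrink;

  // Step 3: count 512 px tiles; 170 tokens each plus a base of 85.
  const tiles = Math.ceil(w / 512) * Math.ceil(h / 512);
  return 85 + 170 * tiles;
}
```

Under these rules a 1024 x 1024 image costs the same number of tokens whether it is inlined or linked, so moving images to object storage helps with storage and request size rather than with token fees.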

image The current approach looks like it will exceed the storage limit.

localStorage has a maximum capacity of about 5 MB, so it looks like images will need to be stored some other way.
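To see why base64 images use up that budget quickly, here is a small sketch. Base64 encodes every 3 bytes as 4 characters, so stored data grows by about a third. The 5 MB figure is the one quoted in this thread; actual quotas, and how they are counted, vary by browser. The helper names are hypothetical.

```typescript
// Budget quoted in the thread; real browser quotas may differ.
const LOCAL_STORAGE_QUOTA = 5 * 1024 * 1024;

// Length of the padded base64 string for a payload of `byteLength` bytes.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// True if all images, once base64-encoded, stay within the quota.
// Simplification: one base64 character is counted as one unit of quota,
// and everything else stored alongside the images is ignored.
function fitsInQuota(
  imageByteSizes: number[],
  quota: number = LOCAL_STORAGE_QUOTA,
): boolean {
  const total = imageByteSizes.reduce((sum, n) => sum + base64Length(n), 0);
  return total <= quota;
}
```

With these assumptions, four images of 1,000,000 bytes each already exceed the budget, before counting the chat text itself.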

I think this is still necessary. It would also make it easier to support PDFs and more formats in the browser.

@Hk-Gosuto I am seeing the same error, using gosuto/chatgpt-next-web-langchain:20231207:
"Found a bug: this image handles image recognition fine, but after switching to another model, text input cannot be sent and a JS error is reported."

gosuto/chatgpt-next-web-langchain:20231207-2

gpt-4-vision-preview supports image upload. Is there also an entry point for uploading various kinds of log files?

The vision model can only parse images; it cannot do anything with uploaded logs and the like. Or are you after the Code Interpreter feature available to Plus accounts?

image The current approach looks like it will exceed the storage limit.

This has been updated to use object storage to relay image files. See the documentation for configuration: [docs/s3-oss.md](https://github.com/Hk-Gosuto/ChatGPT-Next-Web-LangChain/blob/main/docs/s3-oss.md)
Docker image: gosuto/chatgpt-next-web-langchain:20231210
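Once an image lives in object storage, it can be referenced by URL instead of being embedded. Below is a minimal sketch of the two request shapes in the OpenAI chat completions format for vision models. The helper name and the types are hypothetical, and the thread does not say whether this project sends a URL or still inlines base64 after the change.

```typescript
// Where the image comes from: a public URL or raw base64 data.
type ImageSource =
  | { kind: "url"; url: string }
  | { kind: "base64"; mimeType: string; data: string };

// Content parts as used by vision-capable chat completion requests.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

// Hypothetical helper: builds one user message with text plus one image.
function buildUserMessage(
  text: string,
  image: ImageSource,
): { role: "user"; content: ContentPart[] } {
  // A base64 image is sent as a data URL in the same field as a normal URL.
  const url =
    image.kind === "url"
      ? image.url
      : `data:${image.mimeType};base64,${image.data}`;
  return {
    role: "user",
    content: [
      { type: "text", text },
      { type: "image_url", image_url: { url } },
    ],
  };
}
```

With a URL, the chat history kept in the browser only needs to hold a short string per image rather than the full encoded file.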

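gpt-4-vision-preview supports image upload. Is there also an entry point for uploading various kinds of log files?

The vision model can only parse images; it cannot do anything with uploaded logs and the like. Or are you after the Code Interpreter feature available to Plus accounts?

You could consider integrating the open-source Code Interpreter implementation built on LangChain: https://github.com/shroominic/codeinterpreter-api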

I have looked at that project before. Supporting it would require reworking this project, which is fairly costly. If I find some free time later, I will look into it again.