pypi/support

File Limit Request: dashinfer - 1024 MiB

Closed this issue · 5 comments

Project URL

https://pypi.org/project/dashinfer/

Does this project already exist?

  • Yes

New Limit

1024

Update issue title

  • I have updated the title.

Which indexes

PyPI

About the project

DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including CUDA, x86 and ARMv9.
We open-sourced DashInfer in April 2024, and recently updated it to v2.0.0, which includes some prebuilt GPU binaries and exceeds the upload limit.
https://github.com/modelscope/dash-infer

Reasons for the request

In v2.0.0 (not uploaded yet), we added some prebuilt binaries to maintain GPU support, which increases the package size and exceeds the upload limit.

Code of Conduct

  • I agree to follow the PSF Code of Conduct

Hey @laiwenzh 👋
Have you tried shipping the prebuilt GPU code or other heavy files differently?
Projects like NLTK provide a runtime download for additional data files, which might work in your case, or you could even publish the GPU-related dependencies as a separate package.
I'm asking because a 1 GB wheel doesn't sound like a good idea. It's nice to have all the dependencies in one place, but this is certainly something that can be improved.
Let me know!
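The NLTK-style runtime download suggested above could look roughly like the sketch below: fetch a large binary on first use, cache it locally, and verify a checksum. The URL, checksum, and cache directory here are hypothetical placeholders, not real DashInfer endpoints.

```python
import hashlib
import urllib.parse
import urllib.request
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file, streamed in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def fetch_asset(url: str, expected_sha256: str,
                cache_dir: str = "~/.cache/dashinfer") -> Path:
    """Download a large binary on first use, cache it, and verify its checksum."""
    dest = Path(cache_dir).expanduser()
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / Path(urllib.parse.urlparse(url).path).name
    if target.exists() and sha256_of(target) == expected_sha256:
        return target  # valid cached copy, skip the download
    urllib.request.urlretrieve(url, target)
    if sha256_of(target) != expected_sha256:
        target.unlink()  # discard the corrupt download
        raise RuntimeError(f"checksum mismatch for {url}")
    return target
```

This keeps the wheel small while the heavy artifact is fetched once per machine; the checksum guards against truncated or tampered downloads.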

Hi @cmaureir, thanks for your quick reply!

Our project has many CUDA kernels, and these kernels (not data files) need to be compiled for multiple SM architectures. This results in a large shared library (.so) in our package. CUDA-related dependencies (such as the CUDA toolkit and system libraries) are not part of our release package.

Currently our CUDA release package on GitHub is about 655 MB. I hope the upload limit can be expanded beyond this size, because we plan to add more CUDA kernels to the project.

Hi @cmaureir

Could you review our project's case again? As @laiwenzh said, our project includes support for many GPU kernels, which results in a large .so build.

I checked the guideline, which says a limit increase may be granted when a "project contains large compiled binaries to maintain platform/architecture/GPU support".

For the current build, could you raise the limit to 700 MB so we can release our current version?

For future versions, we can split the package. Do you have a suggestion about sizes? Looking at our package, we could split it into two packages, one around 400 MB and another around 200 MB, because we only have two large shared libraries.
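One common way to wire up such a split is for the core package to lazily import an optional companion package that carries the large GPU shared libraries, falling back to CPU kernels when it is absent. A minimal sketch, assuming a hypothetical companion distribution named `dashinfer_cuda` (illustrative only, not an existing package):

```python
import importlib


def load_gpu_backend(module_name: str = "dashinfer_cuda"):
    """Return the optional GPU backend module if installed, else None.

    The heavy .so files would ship in the companion package, keeping the
    core wheel small; 'dashinfer_cuda' is a hypothetical name.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


backend = load_gpu_backend()
if backend is None:
    print("GPU backend not installed; using CPU kernels")
```

Users who need GPU support would then install both packages, while CPU-only users avoid the large download entirely.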

Hi @cmaureir, our latest release already cuts some unnecessary CUDA binaries, which brings the size down to around 458 MB; see https://github.com/modelscope/dash-infer/releases/tag/v2.1.0 (DashInfer-2.1.0.cuda-12.4-shared.x86_64.tar.gz).

So could you change our project's limit to 500 MB, so we can release this binary package to the PyPI repository?

Thanks a lot!

Hey,
I have increased the file limit to 500 MB, which is the limit I usually give to projects for upload size. I understand each project is different, but I do believe you can still find ways of providing the big files through different methods.
Have a nice week!