jina-ai/GSoC

Research about deploying LLM with Jina

Nick17t opened this issue · 14 comments

Project idea 3: Research about deploying LLM with Jina

Skills needed: Python, PyTorch, CUDA, Docker, Kubernetes
Project size: 350 hours
Difficulty level: Hard
Mentors: @Alaeddine Abdessalem, @Joan Martínez

Project Description

Recently, large language models (LLMs) have gained attention for their ability to generate text to solve various tasks, such as question answering, reading comprehension, and coding. However, most of these models are very large, and deploying them at scale on GPU resources requires several supporting technologies. These include:

  • model optimizer and weight partitioning across multiple GPU devices, whether within the same node or across different nodes
  • weight offloading
  • model compression
  • optimizing the model for the underlying hardware
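
To make the first three items more concrete, here is a minimal sketch of how weight partitioning, offloading, and compression interact. It is not how DeepSpeed, Accelerate, or FlexGen actually work; the function names (`layer_size_gb`, `plan_placement`), the greedy strategy, and the layer sizes and GPU budgets are all hypothetical and chosen only for illustration.

```python
# Hypothetical illustration only: a greedy placement planner, not a real library API.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def layer_size_gb(num_params, dtype="fp16"):
    """Approximate weight memory of one layer in GB (weights only, no activations)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def plan_placement(layer_params, gpu_budgets_gb, dtype="fp16"):
    """Assign consecutive layers to GPUs in order; offload to CPU once GPUs are full."""
    placement = {}
    remaining = list(gpu_budgets_gb)  # free memory left on each GPU
    gpu = 0
    for idx, num_params in enumerate(layer_params):
        size = layer_size_gb(num_params, dtype)
        # Move on to the next GPU while the current one cannot hold this layer.
        while gpu < len(remaining) and remaining[gpu] < size:
            gpu += 1
        if gpu < len(remaining):
            remaining[gpu] -= size
            placement[idx] = f"gpu:{gpu}"
        else:
            placement[idx] = "cpu"  # weight offloading
    return placement


# Four layers of one billion parameters each, two GPUs with 4 GB and 2 GB free.
layers = [1_000_000_000] * 4
fp16_plan = plan_placement(layers, [4.0, 2.0], dtype="fp16")
int8_plan = plan_placement(layers, [4.0, 2.0], dtype="int8")
print(fp16_plan)
print(int8_plan)
```

Under these assumed numbers, the fp16 plan has to offload the last layer to CPU, while the int8 plan (a stand-in for model compression) fits every layer on the first GPU.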

Several libraries apply these technologies to LLMs to ease deployment, for instance DeepSpeed, Accelerate, and FlexGen.

We aim to assess the capability of deploying such models with Jina and explore what integrations we can build with the existing ecosystem to enable LLM inference using the Jina stack.

The idea is to build demos/showcases that use these technologies to host an LLM with Jina. If for some reason these libraries cannot be used within the Jina framework, we would build integrations that make them usable within Jina.

Expected outcomes

The project aims to demonstrate the capability of Jina to deploy and scale LLMs and build generative applications in a cost-efficient manner. Specific outcomes include:

  • Implementation of LLM deployment using Jina and assessing scalability with GPU resources
  • Documentation and example code demonstrating the use of Jina for LLM deployment and inference
  • Building integrations with the mentioned libraries in order to use them within Jina
  • Evaluation of the cost-efficiency of deploying and scaling LLMs with Jina compared to other technologies

@JoanFM , @alaeddine-13
I am Aman Kumar, a university student majoring in Computer Science Engineering.

My expertise: Machine Learning, Deep Learning, Advanced NLP, MLOps, Kubernetes, CUDA, PyTorch, TensorFlow
My previous projects: Language Translation with RNNs, Image Captioning, Resume ATS Cracker, and many more

Experience :

  1. ML engineer internship at Suvidha Foundation
  2. Google Cloud Facilitator - Cloud Professional
  3. SEO internship at Wedcell Pvt Ltd.
    I have two more as well.

Location: Delhi, India

Connect with me !
My GitHub : https://github.com/Aman123lug
My LinkedIn :
https://www.linkedin.com/in/aman-kumar-5bb609228
Twitter : https://twitter.com/lug__aman
Blog Post : https://amanblog.hashnode.dev/
My Resume : https://drive.google.com/file/d/111CmblHkGmcSTAcpige_OaIklsVwjwvP/view?usp=sharing
Contact : 9711576118
ak06465676@gmail.com

Native Language : Hindi
Proficient Language : English

I have been exploring the organisation, and my skill set and interests match it well. I also want to get involved in global communities to keep learning, and open source is a great way to do this.
Your project's goals and mission are truly inspiring, and I am eager to offer my skills and expertise to help you achieve them. I believe that our collaboration could lead to significant progress and positive impact in the field.
Community is a fundamental aspect of any successful project, and I am drawn to your team's collaborative and inclusive approach. I believe that by working together, we can leverage our diverse backgrounds and perspectives to achieve greater outcomes than we could individually.
I am familiar with the technologies used in your projects and have some experience in this domain.

@JoanFM @alaeddine-13
Hi, I am Prakalp Choubey, a second-year electronics undergraduate at BITS Pilani, India. I successfully completed GSoC'22 with the JBoss Community. My work revolved around workload classification using various machine learning techniques. This included collecting system data using different microbenchmarks and automating the process using Python. I containerised the benchmarks by writing their Dockerfiles and deploying the images on my local system.
My GitHub: https://github.com/Prakalp23
My LinkedIn: https://www.linkedin.com/in/prakalp-choubey-069138240/
My Twitter: https://twitter.com/ChoubeyPrakalp
I would like a link to open issues that I can resolve, so as to get an idea of what I would be working on during the summer.

@JoanFM @alaeddine-13
My name is Jinman Xie and I am a final-year postgraduate student majoring in Computer Science at Zhejiang University, China.

Experience:

  1. Deep learning framework internship in AI and Data Platform Department, Baidu Inc
  2. ML engineer Internship in Geographic location Tech center, ByteDance Inc.

Skills:
C++, Python, Machine learning, deep learning, pytorch, and CUDA programming.
Additionally, I have experience using Docker and won first prize in Python programming (ranking third) in the 11th Blue Bridge Cup National Competition.

I have been responsible for tasks such as accelerating model inference deployment and model pruning. Specifically, I participated in team projects involving model pruning and accelerating model computation using TensorRT and custom CUDA kernels. I believe my previous experience can be useful to the organization.

I am thoroughly impressed by the capabilities of LLMs and am eager to join the open source community around them. The opportunity to contribute my skills and knowledge to this community has filled me with excitement. If you feel I am a suitable candidate, please contact me.
My GitHub: https://github.com/xjmxyt
My Resume: https://drive.google.com/file/d/1oMOtQleDKAZLtNk7PbYvbl-bHMvrXqGM/view?usp=sharing
Contact: xjmxyt@gmail.com

@JoanFM , @alaeddine-13

Hi, this is Jimmy from Canada, a second-year PhD candidate in SAIL (Software Analysis & Intelligence Lab) at Queen's University. My main research direction is machine learning experiment management and quality assurance.

Experience:

  • Full-time C++ software engineer in 2K, Los Angeles
  • Part-time machine learning mentor in Harbour Education, Beijing
  • Software engineering research intern in Lab 2012 at Huawei, Hangzhou

I have a solid background in both software engineering and machine learning model development. In particular, I have a strong theoretical grounding and once served as a mentor at Harbour Education, teaching college students about model compression and reinforcement learning. I also have some experience with CUDA, Docker, and Kubernetes, since the daily experiments in my current lab depend on them.

My LinkedIn: https://www.linkedin.com/in/zhiminz/
My Resume: https://www.overleaf.com/read/tszwcvftnwdf
My Contact: z.zhao@queensu.ca

@alaeddine-13,@JoanFM
My name is Dheeraj, a second-year undergraduate student at BMSIT majoring in Information Science and Engineering.

Experience:

  • Data Science Intern at Innomatics Research labs.
  • Intern at Coincent.ai(NLTK)

Skills:
Programming languages: Python, C, HTML, CSS
Libraries / Frameworks: TensorFlow, OpenCV, NLTK, PyTorch
Tools / Platforms: Git, GitHub, Jupyter Notebook

My GitHub : https://github.com/Dheeraj-2022
My LinkedIn: https://www.linkedin.com/in/dheerajreddy20/
My resume: https://drive.google.com/drive/folders/1iwxRFLj292yHdrL-jC3B7RTdwIzSdeLs?usp=share_link
My Email : dheerureddy.s03@gmail.com

I am particularly interested in deploying LLMs, and I want to work with Jina to develop my skills and knowledge.

@JoanFM @alaeddine-13
My name is Ashutosh Srivastava, 2nd Year Student at IIT Roorkee.

Experience:

  1. I have worked with the Kubernetes KubeAPI for Go, building complex clusters with automated orchestration while working at SDSLabs.
  2. As a member of InfoSecIITR and SDSLabs, I represented my college at CSAW ESC 2022, whose topic was adversarial attacks on machine learning models. There I worked on optimizing PyTorch and TensorFlow model weights and watermarking them. We placed 1st in the Research Vertical.

Skills: I am very fluent in Python and Go. I have worked with PyTorch and TensorFlow models. I also have good experience with Kubernetes, KubeAPI, and Knative. I have experimented with CUDA programming as well. Additionally, I have worked on other projects written in Rust and JavaScript.

I am very keen to work on this project and would love some guidance and issues to work on in preparation.

LinkedIn: https://www.linkedin.com/in/ashutosh-srivastava-1bbb0a223/
Resume: https://www.overleaf.com/read/cvwjjdgffqbc
Contact: ashutosh3002@gmail.com

I hope to make contributions to this project and hopefully, leave an impact.

Here are some brief definitions for those who might need them.

What Exactly are LLMs?

Large language models (LLMs) are a type of machine learning model that uses deep learning techniques to process and generate natural language. These models are trained on large amounts of text data, such as books, articles, and web pages, to learn the patterns and structures of language. The training process involves feeding the model a large amount of text data and adjusting its parameters to predict the next word or sequence of words in a sentence.
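
As a rough illustration of the "predict the next word" objective, here is a toy sketch. It swaps deep learning for simple bigram counting, so it is not an LLM and has no trainable parameters in the neural sense; the function names and the two-sentence corpus are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for every word, which words follow it in the training sentences."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Made-up toy corpus; a real LLM is trained on vastly more text.
corpus = ["the cat sat on the mat", "the cat ate the fish"]
model = train_bigram(corpus)
print(predict_next(model, "the"))
print(predict_next(model, "dog"))
```

An LLM replaces the count table with a neural network whose parameters are adjusted during training, which lets it condition on long contexts rather than on a single preceding word.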

LLMs have become increasingly popular in recent years due to their ability to generate human-like text, perform language-related tasks such as language translation and sentiment analysis, and answer complex questions. Examples of LLMs include GPT (Generative Pre-trained Transformer), BERT (Bidirectional Encoder Representations from Transformers), and T5 (Text-to-Text Transfer Transformer). These models have achieved state-of-the-art performance on a wide range of natural language processing (NLP) tasks and are used in many applications, including chatbots, virtual assistants, and search engines.

PyTorch
PyTorch is a popular deep learning framework that provides support for training and deploying neural networks. PyTorch is used extensively in the training and fine-tuning of LLMs, and it can also be used for deploying LLMs in production. PyTorch provides support for exporting trained models in various formats that can be used with Jina or other deployment frameworks.

CUDA
CUDA is a parallel computing platform and API developed by NVIDIA that allows you to use GPUs for general-purpose computing. GPUs can significantly speed up the training and inference of LLMs due to their ability to perform parallel computations. PyTorch provides support for CUDA and allows you to use GPUs for training and deploying LLMs.

Kubernetes
Kubernetes is a container orchestration system that allows you to deploy, manage, and scale containerized applications. Jina can be deployed on Kubernetes to provide a scalable and distributed neural search system. Kubernetes allows you to manage the deployment of Jina components, such as the Jina Gateway and Jina Pods, and scale them up or down depending on the workload.
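
To give a feel for what "scale them up or down depending on the workload" means, the sketch below reproduces the core formula documented for the Kubernetes Horizontal Pod Autoscaler, `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`. It is a simplification: the real autoscaler also applies a tolerance and stabilization windows, and the function name, replica limits, and metric values here are assumptions for illustration.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Simplified autoscaling rule: scale replicas in proportion to metric / target."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds (hypothetical defaults).
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical metric: average requests per second handled by each replica.
scale_up = desired_replicas(2, current_metric=90, target_metric=60)
scale_down = desired_replicas(4, current_metric=20, target_metric=60)
capped = desired_replicas(5, current_metric=300, target_metric=60)
print(scale_up, scale_down, capped)
```

With these assumed numbers, two overloaded replicas grow to three, four mostly idle replicas shrink to two, and a large spike is capped at the configured maximum.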

In summary, PyTorch can be used for training and deploying LLMs, CUDA can be used for GPU acceleration, and Kubernetes can be used for deploying and scaling Jina components. These tools can be used in various combinations depending on your specific deployment scenario and requirements.

This is in case you're new to the research phase and need a little insight. Next, I'll try to break the project itself down into layman's terms.

Hi @5hv5hvnk @zhimin-z @h4shk4t @xjmxyt @robinokwanma @Dheeraj-2022 @Aman123lug @Prakalp23

I am delighted to hear that you are interested in contributing to the Jina AI community! 🎉

To get started, please take a moment to fill out our survey so that we can learn more about you and your skills.

Also, don't forget to mark your calendars for the GSoC x Jina AI webinar on March 23rd at 2 pm (CET). This is an excellent opportunity to learn more about the projects and ask any questions you have about the requirements and expectations.

Our mentors will provide an in-depth overview of the projects and answer any questions you may have. So please don't hesitate to ask any questions or seek clarification on any aspect of the project.

Is there anything specific you would like to learn from the webinar? Do you have any questions about the Research about deploying LLM with Jina project that you would like to see clarified during the Q&A session? Let me know, and I'll be happy to help!

Looking forward to seeing you at the webinar, and thank you for your interest in the Jina AI community! 😊

Dear @JoanFM and @alaeddine-13,

I am Ali Quidwai, a Computer Engineering graduate from NYU Tandon School of Engineering. I am enthusiastic about joining your project on deploying LLMs with Jina and believe that my expertise in machine learning, deep learning, advanced NLP, MLOps, Kubernetes, CUDA, PyTorch, and TensorFlow makes me an excellent candidate for this project.

My experience includes:

  • Master's in Engineering from NYU Tandon School of Engineering.
  • Led a research team developing a computer vision application
  • Member of Plant Systems Biology at The Coruzzi Lab
  • Research on topological analysis of word embeddings with Prof. Parijat Dube

My GitHub repository showcases my work in projects such as

Here are some additional ways to connect with me:

I am excited about the opportunity to contribute to this project because I am passionate about open-source development and want to engage with global communities to learn and collaborate. Your project's goals and mission align with my interests and expertise, and I am eager to offer my skills to help you achieve them.

I am drawn to your team's collaborative and inclusive approach, and I believe that by working together, we can leverage our diverse backgrounds and perspectives to achieve greater outcomes. I am familiar with the technologies used in your projects and have relevant experience in the domain.

Looking forward to contributing to this project and discussing my qualifications further.

Best regards,
Ali Quidwai

Dear @JoanFM @alaeddine-13:
My name is Wuyang, 1st Year Postgraduate at University of Science and Technology of China.

Experience:

I worked as an intern in the Natural Language Processing Laboratory of Tsinghua University as an undergraduate. I independently wrote the back end of the CUGE website using the Python Flask framework. Meanwhile, I learned a lot about AI, including but not limited to NLP, Federated Learning, NAS, Model Acceleration, and LLMs.
Now, as a graduate student at USTC majoring in Cloud Computing and Edge AI, I am deeply interested in open source projects that combine cloud and AI.

Skills:

  • I am very fluent in Python and Go.
  • I have worked with PyTorch and Tensorflow models. I have read a lot of papers on AI and done a lot of related experiments, including but not limited to NLP, Federated Learning, NAS, Model Acceleration and LLM.
  • I have good experience with Kubernetes and have read its source code. I know and have used other cloud native products such as Prometheus and Volcano.

I am very keen to work on this project and would love some guidance and issues to work on in preparation.

Contact: wyoung@mail.ustc.edu.cn

@Nick17t
I'm Farhad Jaman, currently in my final year of Computer Science at Bangladesh University of Professionals, Bangladesh.

I came across your org in GSoC 2023, and I am interested in contributing. While I'm new to open-source work, I'm proficient in Machine Learning, JavaScript, C++, and Python.

My background includes a strong understanding of ML & DL. I've been working on a research project using models like VGG-16, ResNet-50, and a hybrid approach for pneumonia classification. Now I am learning about Transformers and LLMs and contributing while learning new stuff.

Given my skills and academic focus, I believe I could contribute effectively to your project. I'd appreciate it if you could guide me to the potential areas where I could contribute to the project and how I can get started. Thank you.

I am Sanandi Naik, a fourth-year student at IIT Delhi. I am enthusiastic about joining your project on deploying LLMs with Jina and believe that my expertise in machine learning, deep learning, Kubernetes, CUDA, PyTorch, and TensorFlow makes me an excellent candidate for this project in GSoC'24.

Some of my projects include:
Image processing (Prof. Agam Gupta) | Image Recognition | IIT Delhi (April’23 - Sept'23)
• Constructing Anomaly Detection model of Satellite images using Light Siamese Neural Network (LightCDNet).
• A deep neural network algorithm for large-scale remote sensing binary change detection on bitemporal images.
• Incorporating additional backbone structures with MobileNet, ResNet, and Xception using TensorFlow, Keras, and PyTorch.
Prediction of Bioreactor Injection Time — (Prof. Anurag Rathore) | IIT Delhi (June’22 - January’23)
• Development of an AI-ML model for the prediction of bioreactor injection time for the development of microbials.
• Initial phase: Development of PCA classification plots for the chromatography peaks based on concentration.
• Second phase: Implementation of LSTM models, evaluation of other models-RNN, 1DCNN.

I recently finished a short course on LangChain for LLM Application Development by Andrew Ng.

I would like to work further on LLMs. I believe I could contribute effectively to your project. I'd appreciate it if you could guide me to the potential areas where I could contribute to the project and how I can get started. Thank you.

Contact details:
email: sanandinaik.iitd@gmail.com
LinkedIn: https://www.linkedin.com/in/sanandinaik/