
Rust CLI that summarizes text with pre-trained models

Primary language: Rust. License: Creative Commons Zero v1.0 Universal (CC0-1.0).

Badges: Tests | Build binary release | Clippy | Rustfmt | Publish to Dockerhub | Benchmark

IDS721 Spring 2023 Individual Project 1 - Rust CLI Tool for Text Summarization

This project builds a Rust CLI tool that summarizes text, motivated by a common task among students: reading and summarizing books. It uses the Rust crate clap for command-line argument parsing and libtorch to run a pre-trained Hugging Face summarization model.
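The flow described above (take text from the command line, then hand it to a summarizer) can be sketched with the standard library alone. This is a minimal illustration, not the project's code: `text_from_args` and `summarize_stub` are hypothetical names, `std::env::args`-style input stands in for clap, and the stub returns the first sentence instead of calling the pre-trained model.

```rust
/// Collects the text to summarize from raw command-line arguments.
/// `args` is expected to start with the program name, as `std::env::args()` does.
/// Hypothetical helper: the real project uses clap for this step.
pub fn text_from_args<I>(args: I) -> Result<String, String>
where
    I: IntoIterator<Item = String>,
{
    // Skip the program name, then join the remaining words with single spaces.
    let words: Vec<String> = args.into_iter().skip(1).collect();
    let text = words.join(" ");
    if text.trim().is_empty() {
        Err("usage: <binary> <text to summarize>".to_string())
    } else {
        Ok(text)
    }
}

/// Placeholder for the model call: returns the first sentence of the input.
/// This is NOT the pre-trained model; it only marks where inference would happen.
pub fn summarize_stub(text: &str) -> String {
    match text.find(". ") {
        // '.' is a single byte, so the inclusive slice ends on a char boundary.
        Some(idx) => text[..=idx].to_string(),
        None => text.to_string(),
    }
}
```

In a `main` function, one would call `text_from_args(std::env::args())` and print the result of the summarizer, replacing `summarize_stub` with the actual model inference.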

Architectural Diagram

[Image: architectural diagram]

Project Goals/Outcomes

  • Develop my first Rust project
  • Use GitHub Codespaces and Copilot
  • Integrate libtorch and pre-trained Hugging Face models into a Rust CLI project

Setup

  1. Install Rust via rustup
  2. Install libtorch (Mac M1 only; users with Intel chips can skip this step)
brew install pytorch@1.13.1

For users without Mac ARM chips

  • Run. You can pass any text as the PARAMETER value at the end of the command, as in the example below.
make run PARAMETER='The Chinese monarchy collapsed in 1912 with the Xinhai Revolution, when the Republic of China (ROC) replaced the Qing dynasty. In its early years as a republic, the country underwent a period of instability known as the \"Warlord Era\" before mostly reunifying in 1928 under a Nationalist government. A civil war between the nationalist Kuomintang (KMT) and the Chinese Communist Party (CCP) began in 1927. Japan invaded China in 1937, starting the Second Sino-Japanese War and temporarily halting the civil war. The surrender and expulsion of Japanese forces from China in 1945 left a power vacuum in the country, which led to renewed fighting between the CCP and the Kuomintang.'

  • Release
make releasex86
  • Bench
make benchx86

For users with Mac ARM chips

  • Change the path in the Makefile to your libtorch path
export LIBTORCH=/opt/homebrew/Cellar/pytorch/1.13.1 && export LD_LIBRARY_PATH=${LIBTORCH}/lib:$LD_LIBRARY_PATH
  • Run. You can pass any text as the PARAMETER value at the end of the command, as in the example below.
make runarm PARAMETER='The Chinese monarchy collapsed in 1912 with the Xinhai Revolution, when the Republic of China (ROC) replaced the Qing dynasty. In its early years as a republic, the country underwent a period of instability known as the \"Warlord Era\" before mostly reunifying in 1928 under a Nationalist government. A civil war between the nationalist Kuomintang (KMT) and the Chinese Communist Party (CCP) began in 1927. Japan invaded China in 1937, starting the Second Sino-Japanese War and temporarily halting the civil war. The surrender and expulsion of Japanese forces from China in 1945 left a power vacuum in the country, which led to renewed fighting between the CCP and the Kuomintang.'

  • Release
make release
  • Bench
make bench
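The `LIBTORCH` export in the ARM steps can be sanity-checked before building. The sketch below is an assumption-laden illustration, not part of the project: `libtorch_lib_dir` and `check_libtorch_env` are hypothetical helpers that mirror the `${LIBTORCH}/lib` path used in the export and verify that the directory exists.

```rust
use std::path::PathBuf;

/// Derives the library directory from a LIBTORCH root, mirroring `${LIBTORCH}/lib`.
/// Hypothetical helper for illustration only.
pub fn libtorch_lib_dir(libtorch_root: &str) -> PathBuf {
    PathBuf::from(libtorch_root).join("lib")
}

/// Reads LIBTORCH from the environment and checks that its `lib` directory exists.
/// Returns the directory on success, or a short message describing the problem.
pub fn check_libtorch_env() -> Result<PathBuf, String> {
    let root = std::env::var("LIBTORCH")
        .map_err(|_| "LIBTORCH is not set".to_string())?;
    let lib_dir = libtorch_lib_dir(&root);
    if lib_dir.is_dir() {
        Ok(lib_dir)
    } else {
        Err(format!("{} is not a directory", lib_dir.display()))
    }
}
```

Calling `check_libtorch_env()` at startup would surface a wrong Homebrew path early, rather than as a linker or loader error later.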

Below is a screenshot of the results.

CI/CD

GitHub Actions workflows are configured in .github/workflows.

Docker

  • The main branch of this repo is automatically published to Docker Hub through CI/CD. You can pull the image with:
docker pull szheng3/sz-rust-ml-cli:latest
  • Run the Docker image. You can pass any text as the parameter at the end of the command, as in the example below.
docker run szheng3/sz-rust-ml-cli:latest 'The Chinese monarchy collapsed in 1912 with the Xinhai Revolution, when the Republic of China (ROC) replaced the Qing dynasty. In its early years as a republic, the country underwent a period of instability known as the \"Warlord Era\" before mostly reunifying in 1928 under a Nationalist government. A civil war between the nationalist Kuomintang (KMT) and the Chinese Communist Party (CCP) began in 1927. Japan invaded China in 1937, starting the Second Sino-Japanese War and temporarily halting the civil war. The surrender and expulsion of Japanese forces from China in 1945 left a power vacuum in the country, which led to renewed fighting between the CCP and the Kuomintang.'

GitHub releases

The binary can be downloaded from the releases page.

Benchmark Results

[Image: benchmark results]
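For readers who want a rough timing without the project's benchmark setup, a mean wall-clock measurement can be sketched with `std::time::Instant`. This is not what `make bench` runs (the source does not say how the benchmark is implemented); `mean_duration` is a hypothetical helper, and the closure passed to it would wrap whatever summarization call is being measured.

```rust
use std::time::{Duration, Instant};

/// Runs `work` `runs` times and returns the mean wall-clock time per run.
/// Returns `Duration::ZERO` when `runs` is 0.
/// Hypothetical helper; a dedicated benchmarking tool gives more reliable numbers.
pub fn mean_duration<F: FnMut()>(runs: u32, mut work: F) -> Duration {
    if runs == 0 {
        return Duration::ZERO;
    }
    let start = Instant::now();
    for _ in 0..runs {
        work();
    }
    // Total elapsed time divided by the number of runs.
    start.elapsed() / runs
}
```

A simple loop like this ignores warm-up and variance, so it is only suitable for a first impression of how long one summarization takes.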

Progress Log

  • Configure GitHub Codespaces
  • Initialize the Rust project with a pre-trained model from Hugging Face
  • Add clap command-line parsing for arguments (text)
  • Dockerize the project
  • Set up CI/CD with GitHub Actions
  • Tag and Releases
  • Benchmark

References