This React app demonstrates ML inference in the browser using
- Cloudflare Pages to deliver the React app and the model via a worldwide Content Delivery Network (CDN)
- ONNX Runtime Web for model inference in the browser
- Hugging Face for NLP model hosting and training API (Transformers library)
- Google Colab for model training on GPU instances
Live demo at https://aiserv.cloud/.
See also my blog post Moving ML Inference from the Cloud to the Edge and the YouTube video Deploy Transformer Models in the Browser with #ONNXRuntime.
The emotion prediction model is a fine-tuned version of the pre-trained language model microsoft/xtremedistil-l6-h384-uncased. The model has been fine-tuned on the GoEmotions dataset, a multi-label text classification task.
GoEmotions is a human-annotated dataset of 58k Reddit comments extracted from popular English-language subreddits and labeled with 27 emotion categories. It is the largest fully annotated English-language fine-grained emotion dataset to date. In contrast to the basic six emotions, which include only one positive emotion (joy), the taxonomy includes 12 positive, 11 negative, and 4 ambiguous emotion categories plus 1 “neutral” category, making it widely suitable for conversation-understanding tasks that require a subtle differentiation between emotion expressions.
See the paper GoEmotions: A Dataset of Fine-Grained Emotions.
- The fine-tuned model is hosted on Hugging Face: bergum/xtremedistil-l6-h384-go-emotion.
- The go_emotions dataset is available on the Hugging Face dataset hub.
See TrainGoEmotions.ipynb for how to train a model on the dataset and export the fine-tuned model to ONNX.
The model is quantized to int8 weights and has 22M trainable parameters.
Inference is multi-threaded. To use multiple inference threads, specific HTTP headers must be set by the CDN; see Making your website "cross-origin isolated" using COOP and COEP. Three threads are used for inference. Due to this bug, multi-threading and the COOP headers had to be disabled, as the model would otherwise silently fail to initialize on iOS devices.
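For illustration, a minimal sketch of how onnxruntime-web threading can be configured. The thread count comes from the section above; the feature check is an assumption about how one might guard it, not necessarily the repo's exact code:

```js
import * as ort from 'onnxruntime-web';

// Multi-threaded WASM inference only works when the page is cross-origin
// isolated, i.e. served with the COOP/COEP headers described above.
// Otherwise fall back to a single thread so the session still initializes.
ort.env.wasm.numThreads = self.crossOriginIsolated ? 3 : 1;
```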
For development, src/setupProxy.js adds the required headers; see React issue 10210.
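A minimal sketch of that approach (the repo's file may differ): create-react-app automatically loads src/setupProxy.js and passes it the development server's Express app, so the headers can be attached to every response:

```js
// src/setupProxy.js
module.exports = function (app) {
  app.use((req, res, next) => {
    // The two headers that make the dev server cross-origin isolated,
    // enabling SharedArrayBuffer and multi-threaded WASM inference.
    res.setHeader('Cross-Origin-Opener-Policy', 'same-origin');
    res.setHeader('Cross-Origin-Embedder-Policy', 'require-corp');
    next();
  });
};
```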
- The app frontend logic is in src/App.js.
- The model inference logic is in src/inference.js; a simplified sketch follows this list.
- The tokenizer is in src/bert_tokenizer.js, which is a copy of the Google TFJS implementation (Apache 2.0).
- The Cloudflare header override for the cross-origin (COOP) policy to enable multi-threaded inference is in public/_headers; an example follows below.
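The sketch referenced above: a simplified view of what the inference path does with onnxruntime-web. The model file name and input tensor names below are assumptions for illustration, not the repo's exact code:

```js
import * as ort from 'onnxruntime-web';

// Hypothetical model path; the real app serves the quantized ONNX file from the CDN.
const MODEL_URL = './xtremedistil-l6-h384-go-emotion-int8.onnx';

// BERT-style inputs are int64 tensors of shape [batch, sequence_length].
const toTensor = (arr) =>
  new ort.Tensor('int64', BigInt64Array.from(arr.map(BigInt)), [1, arr.length]);

export async function predictEmotions(inputIds, attentionMask, tokenTypeIds) {
  // In real code the session would be created once and cached.
  const session = await ort.InferenceSession.create(MODEL_URL);
  const results = await session.run({
    input_ids: toTensor(inputIds),
    attention_mask: toTensor(attentionMask),
    token_type_ids: toTensor(tokenTypeIds),
  });
  // GoEmotions is multi-label, so each logit gets an independent sigmoid
  // rather than a softmax over all 28 categories (27 emotions plus neutral).
  const logits = results[session.outputNames[0]].data;
  return Array.from(logits, (x) => 1 / (1 + Math.exp(-x)));
}
```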
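Cloudflare Pages reads header overrides from a _headers file in the deployed output. The two headers needed for cross-origin isolation look like this (currently kept disabled because of the iOS bug mentioned above):

```
/*
  Cross-Origin-Opener-Policy: same-origin
  Cross-Origin-Embedder-Policy: require-corp
```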
The pre-trained language model was trained on text with biases; see On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? for a study of the dangers of pre-trained language models and transfer learning.
From the dataset paper GoEmotions: A Dataset of Fine-Grained Emotions:
Data Disclaimer: We are aware that the dataset contains biases and is not representative of global diversity. We are aware that the dataset contains potentially problematic content. Potential biases in the data include: Inherent biases in Reddit and user base biases, the offensive/vulgar word lists used for data filtering, inherent or unconscious bias in assessment of offensive identity labels, annotators were all native English speakers from India. All these likely affect labeling, precision, and recall for a trained model. The emotion pilot model used for sentiment labeling, was trained on examples reviewed by the research team. Anyone using this dataset should be aware of these limitations of the dataset.
Install Node.js/npm; see Installing Node.js.
In the project directory, you can run:
`npm start` runs the app in development mode. Open http://localhost:3000 to view it in your browser. The page will reload when you make changes. You may also see any lint errors in the console.
`npm run build` builds the app for production to the build folder. It correctly bundles React in production mode and optimizes the build for the best performance.
Clone this repo and deploy it with Cloudflare Pages.
- Fix the build to copy the WASM files from node_modules to build, to avoid keeping the WASM files under source control; a possible approach is sketched after this list.
- PRs and feedback are welcome; create an issue to get in contact.
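One possible way to handle the WASM copy step, as a sketch: the dist path inside onnxruntime-web and the script location are assumptions, not existing repo files:

```js
// scripts/copy-wasm.js — run after `react-scripts build`, e.g. via a
// "postbuild" script in package.json.
const fs = require('fs');
const path = require('path');

const src = path.join('node_modules', 'onnxruntime-web', 'dist');
const dest = 'build';

// Copy every .wasm binary shipped with onnxruntime-web into the build output
// so the files never need to live in source control.
for (const file of fs.readdirSync(src)) {
  if (file.endsWith('.wasm')) {
    fs.copyFileSync(path.join(src, file), path.join(dest, file));
  }
}
```

With that in place, a `"postbuild": "node scripts/copy-wasm.js"` entry in package.json would run the copy automatically after every `npm run build`.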