LTI's Large Language Model Deployment

TODO: Add a description of the project.

This repo was originally a fork of Hugging Face's BLOOM inference demos, moved to its own repo to allow for more flexibility in the future.

Installation

pip install -e .

Example API Usage

Currently, the client must be run from a compute node on the tir cluster. If you don't have access to the tir cluster, please contact your advisor and ask.

Run the following commands, where tir-x-xx is the node on which the lti-llm process is currently running. The first parameter, text, is the prompt that will be force-decoded by the model. The function returns a list of Output objects, one for every prompt in the input list.

import llm_client

client = llm_client.Client(address="tir-x-xx")
outputs = client.prompt("CMU's PhD students are")
print(outputs[0].text)

Model State

It is also possible to obtain the raw logit scores / output distribution from the model.

import llm_client

client = llm_client.Client(address="tir-x-xx")
outputs = client.prompt("CMU's PhD students are", output_scores=True)
print(outputs[0].scores.shape)
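If the returned scores are raw logits rather than normalized probabilities, a softmax turns each row into a distribution over the vocabulary. The following is a minimal sketch in pure Python on made-up numbers. It assumes one row of scores per decoding step, which the text above does not confirm, and the helper names softmax and top_k are hypothetical and not part of llm_client.

```python
import math


def softmax(logits):
    """Convert one row of raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, k=3):
    """Return the k most likely (token_id, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy logits over a 4-token vocabulary (made-up values, not model output).
toy_logits = [2.0, 1.0, 0.1, -1.0]
for token_id, prob in top_k(toy_logits, k=2):
    print(f"token {token_id}: p={prob:.3f}")
```

With real output, each row of outputs[0].scores would take the place of toy_logits, provided the rows can be iterated as plain numbers.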

It is likewise possible to obtain the raw hidden states from the model.

import llm_client

client = llm_client.Client(address="tir-x-xx")
outputs = client.prompt("CMU's PhD students are", output_hidden_states=True)
# Print each layer's index and the shape of its hidden states.
for i, layer in enumerate(outputs[0].hidden_states):
    print(f"Layer {i}: {layer.shape}")
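One common use of hidden states is to average the per-token vectors of a layer into a single fixed-size vector for the whole prompt. The sketch below shows that mean pooling on toy data. It assumes a layer can be read as a list of per-token vectors, which the example above does not confirm, and mean_pool is a hypothetical helper, not part of llm_client.

```python
def mean_pool(token_vectors):
    """Average a list of equal-length per-token vectors into one vector."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    width = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(vec[i] for vec in token_vectors) / count for i in range(width)]


# Toy layer: 3 tokens, hidden size 2 (made-up values, not model output).
toy_layer = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
pooled = mean_pool(toy_layer)
print(pooled)
```

For real tensors, the array library's own mean over the token axis would do the same job more efficiently.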

Other Available Arguments

The remaining arguments are essentially identical to those of Hugging Face transformers' model.generate function. However, not all arguments are available, and better documentation of the supported ones will be provided in the future.
