Constitutional-AI


Project Overview

Constitutional AI for Python Code Generation

This project showcases training and evaluating a language model using the concept of Constitutional AI. Constitutional AI aligns AI behavior with a set of predefined ethical guidelines, ensuring that the AI's outputs are consistent with these principles. The project demonstrates an implementation of Constitutional AI (closely related to reinforcement learning from AI feedback, RLAIF) for generating Python code from natural-language instructions. The primary goal is to fine-tune a language model to produce high-quality Python code while adhering to a set of rules or guidelines defined by a "constitution."

Dataset

The dataset used for fine-tuning is the python_code_instructions_18k_alpaca dataset, loaded with the Hugging Face Datasets library. It contains 18,000 examples of natural-language instructions and their corresponding Python code outputs.
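A minimal loading sketch, for orientation. The Hugging Face Hub repository id (iamtarun/python_code_instructions_18k_alpaca) and the field names (instruction, input, output, following the Alpaca convention) are assumptions; adjust them to match the notebook.

```python
from datasets import load_dataset

# Repository id is assumed from the dataset name; adjust if the notebook uses a different one.
dataset = load_dataset("iamtarun/python_code_instructions_18k_alpaca", split="train")

# Each record pairs a natural-language instruction (plus optional input) with Python code.
example = dataset[0]
print(example["instruction"])
print(example["output"])
```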

Approach

1. Load the GPT-2 model and tokenizer: The GPT-2 language model (GPT2LMHeadModel) and tokenizer (GPT2Tokenizer) are loaded from the Hugging Face Transformers library; the gpt2-medium variant is used.
2. Add special tokens: A special [PAD] token is added to the tokenizer and to the model's vocabulary.
3. Format the data: The dataset is formatted into a structure suitable for fine-tuning, with prompts (instructions and inputs) and target outputs (Python code).
4. Generate the Constitutional AI dataset: The Constitutional AI approach is used to iteratively refine the model's outputs based on a set of rules or guidelines defined by a "constitution." Two prompts are used (a combined sketch follows their descriptions below):

- constitution: identifies issues in the model's output, such as incorrect code, lack of clarity, or poor writing.
- constitution_revision: instructs the model to rewrite the response, addressing the identified issues and producing a revised, improved output.
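A condensed sketch of steps 1, 2, and 4, assuming greedy decoding for the critique and revision passes. The helper names and prompt wording (generate, critique_and_revise, CONSTITUTION_PROMPT, REVISION_PROMPT) are illustrative, not the notebook's exact text.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

# Steps 1-2: load gpt2-medium and register a [PAD] token.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2-medium")
tokenizer.add_special_tokens({"pad_token": "[PAD]"})
model = GPT2LMHeadModel.from_pretrained("gpt2-medium").to(device)
model.resize_token_embeddings(len(tokenizer))

# Step 4: illustrative critique/revision prompts; the notebook's wording may differ.
CONSTITUTION_PROMPT = (
    "Identify any issues in the response below (incorrect code, unclear or poorly "
    "written text):\n{response}\nIssues:"
)
REVISION_PROMPT = (
    "Rewrite the response below so it fixes these issues:\n{critique}\n\n"
    "Original response:\n{response}\n\nRevised response:"
)

def generate(prompt, max_new=256):
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    prompt_len = inputs["input_ids"].shape[1]
    output_ids = model.generate(
        **inputs,
        max_length=prompt_len + max_new,
        pad_token_id=tokenizer.pad_token_id,
    )
    # Return only the newly generated tokens.
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)

def critique_and_revise(response):
    critique = generate(CONSTITUTION_PROMPT.format(response=response))
    revised = generate(REVISION_PROMPT.format(critique=critique, response=response))
    return revised
```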

5. Prepare training data: The formatted dataset is converted into a format suitable for training, with prompts and target outputs concatenated and tokenized using the GPT-2 tokenizer.
6. Create a PyTorch Dataset and DataLoader: A custom Dataset class is defined to handle the tokenized data, and a DataLoader is created for efficient batching and shuffling during training.
7. Fine-tune the model: The GPT-2 model is fine-tuned on the prepared dataset with the AdamW optimizer, using gradient accumulation and adjustable epochs and batch sizes (see the sketch after this list).
8. Generate outputs: After fine-tuning, the model generates Python code for new natural-language instructions while adhering to the defined constitution or rules.
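A minimal fine-tuning sketch for steps 5-7, reusing the model, tokenizer, and device from the sketch above and assuming each training example is a single string that concatenates the prompt and its target code. The names CodeDataset and train_texts are illustrative.

```python
from torch.optim import AdamW
from torch.utils.data import Dataset, DataLoader

class CodeDataset(Dataset):
    """Tokenizes concatenated prompt + target strings for causal-LM fine-tuning."""
    def __init__(self, texts, tokenizer, max_length=512):
        self.encodings = tokenizer(
            texts, truncation=True, max_length=max_length,
            padding="max_length", return_tensors="pt",
        )

    def __len__(self):
        return self.encodings["input_ids"].shape[0]

    def __getitem__(self, idx):
        return {key: val[idx] for key, val in self.encodings.items()}

# train_texts: list of "prompt + code" strings built from the formatted dataset.
loader = DataLoader(CodeDataset(train_texts, tokenizer), batch_size=4, shuffle=True)
optimizer = AdamW(model.parameters(), lr=5e-5)
accumulation_steps, epochs = 4, 3

model.train()
for epoch in range(epochs):
    for step, batch in enumerate(loader):
        batch = {k: v.to(device) for k, v in batch.items()}
        labels = batch["input_ids"].clone()
        labels[batch["attention_mask"] == 0] = -100  # ignore padding in the loss
        loss = model(**batch, labels=labels).loss / accumulation_steps
        loss.backward()
        if (step + 1) % accumulation_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
```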

Technical Details

- Libraries: Hugging Face Transformers, Hugging Face Datasets, PyTorch, tqdm
- Model: GPT-2 (gpt2-medium variant)
- Tokenizer: GPT-2 tokenizer with added special tokens
- Fine-tuning: AdamW optimizer, gradient accumulation, adjustable epochs and batch sizes
- Output generation: beam search with customizable parameters (max_length, num_return_sequences, no_repeat_ngram_size); see the sketch below
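A generation sketch showing how the listed beam-search parameters map onto model.generate. The prompt template is an assumption and should match the format used during fine-tuning.

```python
prompt = "Instruction: Write a Python function that reverses a string.\nOutput:"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

output_ids = model.generate(
    **inputs,
    max_length=256,             # total length of prompt plus generated tokens
    num_beams=5,                # beam search
    num_return_sequences=3,     # must not exceed num_beams
    no_repeat_ngram_size=2,     # block repeated bigrams
    early_stopping=True,
    pad_token_id=tokenizer.pad_token_id,
)
for ids in output_ids:
    print(tokenizer.decode(ids, skip_special_tokens=True))
```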

Usage

1. Install the required libraries (Hugging Face Transformers, Hugging Face Datasets, PyTorch, tqdm).
2. Clone or download the project repository.
3. Run the script: it loads the dataset, formats the data, fine-tunes the GPT-2 model using the Constitutional AI approach, and generates Python code outputs from the provided instructions while adhering to the defined constitution or rules.

Note: The provided code is a demonstration and may require adjustments based on your specific requirements and computational resources.