Coursera DeepLearning.ai: Generative AI for Everyone
Generative AI for Everyone: Lecture Notes
Introduction to Generative AI
What is Generative AI
AI is already pervasive in our lives and many of us use it dozens of times a day or more without even thinking about it. e.g. Every time you do a web search on Google or Bing, that's AI.
But many AI systems have been complex and expensive to build, and generative AI is making many AI applications much easier to build.
How Generative AI works
- supervised learning, which turns out to be really good at labeling things.
- supervised learning and generative AI, are the two most important tools in AI today.
- Supervised learning is the technology that has made computers very good at taking an input (call it A) and generating a corresponding output (call it B).
- But starting around 2010, we found that for many applications we had a lot of data, yet small AI models didn't perform much better even as we fed them more of it.
- For example, a speech recognition system that listened to hundreds of thousands of hours of audio wasn't much more accurate than one trained on only a small amount of audio data.
- More and more researchers realized through this period that if you train a very large AI model, on very fast, powerful computers with a lot of memory, its performance keeps getting better and better as you feed it more data.
- It uses supervised learning to repeatedly predict what is the next word.
- But at the heart of LLMs is this technology that's learned from a lot of data to predict what is the next word.
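The next-word prediction idea above can be sketched as a toy loop. This is only an illustration: a real LLM learns its probabilities from enormous text corpora, whereas the hand-built table below is an invented stand-in.

```python
# Toy sketch (not a real LLM): repeatedly predict the next word
# from a hand-built table of next-word probabilities.
next_word_probs = {
    "my":       {"favorite": 0.7, "cat": 0.3},
    "favorite": {"drink": 0.6, "food": 0.4},
    "drink":    {"is": 1.0},
    "is":       {"lychee": 0.8, "tea": 0.2},
}

def generate(prompt, max_words=4):
    words = prompt.split()
    for _ in range(max_words):
        choices = next_word_probs.get(words[-1])
        if not choices:
            break
        # Pick the most probable next word (greedy decoding).
        words.append(max(choices, key=choices.get))
    return " ".join(words)

print(generate("my"))  # my favorite drink is lychee
```

An LLM does the same thing at vastly larger scale, choosing each next word from probabilities learned over its whole vocabulary.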
LLMs as a thought partner
- given the propensity of LLMs to make things up, and to sound very authoritative and confident while doing so, I would want to double-check anything it says about healthcare or medicine before following its suggestions.
AI is a general purpose technology
1.2 Generative AI Applications
Writing
- Output quality increases with context and specificity provided
- e.g. in this translation it's not as good as a native speaker
- "front desk" translated literally vs "reception"
- providing a prompt of "formal spoken Hindi" improves the translation
Reading
Can be used for e.g.
- proofreading
- summarizing text
- summarizing conversations
- automating tasks
- reputation monitoring
When building your own app, be very specific
Chatting
Can build specialized chatbots e.g. travel specific
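A specialized chatbot can be sketched by constraining the model with a system message. The sketch below only builds the message list; it could be passed to a chat API such as the `openai.ChatCompletion.create` call shown later in these notes. The prompt wording is an illustrative assumption.

```python
# Hypothetical travel-specific chatbot: the system message limits
# the assistant's scope to travel questions.
SYSTEM_PROMPT = ("You are a travel assistant. Only answer questions "
                 "about trip planning, flights, and hotels; politely "
                 "decline anything else.")

def build_messages(history, user_input):
    # System instruction first, then the running conversation,
    # then the newest user message.
    return ([{"role": "system", "content": SYSTEM_PROMPT}]
            + history
            + [{"role": "user", "content": user_input}])

msgs = build_messages([], "Find me a hotel in Kyoto.")
print(msgs[0]["role"], "->", msgs[-1]["content"])
```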
What LLMs can and cannot do
Mental framework for LLMs: could a fresh college grad perform the task?
Assume the grad has:
- no access to the internet or other external resources
- no training specific to your company/business
- no memory of previous tasks completed
Other limitation:
- knowledge cutoffs:
- the model only knows the internet as of the last time its training data was collected, e.g. asking "what is the highest-grossing film" may return an outdated answer
- learns erroneous information from commonly repeated wrong information, e.g. the temperature of superconductor LK-99
- Hallucinations: will sometimes make up information in an authoritative voice
- input and output length is limited, e.g. when trying to summarize a very long paper
- Does not work well with structured data
- e.g. given table of home prices, estimate median price
- in this case can use supervised learning instead
- Works best with unstructured data
- Has bias and toxicity learned from the internet
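The structured-data example above (estimating a median home price from a table) is exactly the kind of task better handled by ordinary code or supervised learning than by an LLM. A minimal sketch, with invented data:

```python
# Given a table of home prices, compute the median directly with
# code rather than asking an LLM to do the arithmetic.
import statistics

homes = [
    {"bedrooms": 3, "sqft": 1500, "price": 410_000},
    {"bedrooms": 2, "sqft": 900,  "price": 285_000},
    {"bedrooms": 4, "sqft": 2200, "price": 620_000},
    {"bedrooms": 3, "sqft": 1700, "price": 455_000},
    {"bedrooms": 2, "sqft": 1100, "price": 330_000},
]

median_price = statistics.median(h["price"] for h in homes)
print(median_price)  # 410000
```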
Tips for prompting
- Be detailed and specific
- Guide the model to think through its answer
- Experiment and iterate
Image generation (optional)
Diffusion Models have learned from huge numbers of images. This is an example of supervised learning.
- Learn to generate slightly less noisy image from noisy image at every step
- This takes lots of steps
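The step-by-step denoising idea can be sketched numerically. In this toy, the learned denoiser is faked by nudging each pixel toward a fixed target; a real diffusion model instead predicts the noise to remove at each step, and the target image is not known in advance.

```python
# Toy sketch of diffusion: start from pure noise and take many
# small steps, each producing a slightly less noisy "image".
import random

random.seed(0)
target = [0.2, 0.8, 0.5, 0.9]              # stand-in "clean image"
image = [random.random() for _ in target]  # start from pure noise

for _ in range(50):  # many small denoising steps
    # Fake denoiser: blend 10% of the way toward the clean image.
    image = [x + 0.1 * (t - x) for x, t in zip(image, target)]

print([round(x, 2) for x in image])
```

After 50 steps almost all of the initial noise is gone, which mirrors why generation takes many steps.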
1.3 Week 1 resources
Web UI chatbots to try
If you'd like to experiment with prompting a large language model (LLM), you can visit one of the chatbots linked below:
- ChatGPT from OpenAI
- Bard from Google
- Bing Chat from Microsoft
Generative AI and the Economy
You can learn about the impact of generative AI on the economy by reading these reports and articles:
- McKinsey: The economic potential of generative AI: The next productivity frontier, McKinsey Digital report, June 2023
- GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models, Tyna Eloundou, Sam Manning, Pamela Miskin, and Daniel Rock, March 2023 (arXiv:2303.10130)
- Goldman Sachs: The Potentially Large Effects of Artificial Intelligence on Economic Growth, Joseph Briggs and Devesh Kodnani, March 2023
2.1 Software Applications
Using generative AI in software applications
Generative AI makes deploying apps like sentiment analysis much simpler
Trying generative AI code yourself (optional)
```python
import openai
import os

openai.api_key = os.getenv("OPENAI_API_KEY")

def llm_response(prompt):
    response = openai.ChatCompletion.create(
        model='gpt-3.5-turbo',
        messages=[{'role': 'user', 'content': prompt}],
        temperature=0
    )
    return response.choices[0].message['content']

prompt = '''
Classify the following review
as having either a positive or
negative sentiment:

The banana pudding was really tasty!
'''

response = llm_response(prompt)
print(response)
```
Lifecycle of a generative AI project
- Here the sentiment was classified as positive even though it shouldn't have been
- Have to continuously improve the system
- list of techniques to improve the system
Cost intuition
- Given that a typical adult reads at 250 words/min, the cost is approx 8 cents per hour
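The 8 cents/hour figure can be checked with back-of-envelope arithmetic. The token-per-word ratio and the $0.002 per 1,000 tokens price (typical of gpt-3.5-turbo in 2023) are assumptions, not from the notes:

```python
# Rough cost of generating an hour's worth of reading material.
words_per_minute = 250
words_per_hour = words_per_minute * 60    # 15,000 words
tokens_per_hour = words_per_hour * 1.33   # ~20,000 tokens (assumed ratio)
price_per_1k_tokens = 0.002               # USD, assumed 2023 pricing

output_cost = tokens_per_hour / 1000 * price_per_1k_tokens
# Roughly double to account for the prompt/input tokens as well:
total_cost = 2 * output_cost
print(round(total_cost * 100, 1), "cents/hour")  # 8.0 cents/hour
```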
2.2 Advanced Technologies: Beyond Prompting
Retrieval Augmented Generation (RAG)
RAG gives the LLM additional context by retrieving relevant documents and including them in the prompt
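A minimal RAG sketch: retrieve the most relevant document, then paste it into the prompt as context. Real systems use embeddings and a vector store rather than the naive keyword overlap below, and the document texts here are invented.

```python
# Minimal RAG sketch: keyword-overlap retrieval + prompt assembly.
docs = [
    "Parking: employees may park in lot B on weekdays.",
    "Vacation policy: employees receive 15 days of paid leave.",
    "Expenses: submit receipts within 30 days of purchase.",
]

def retrieve(question, documents):
    q_words = set(question.lower().split())
    # Score each document by how many question words it contains.
    return max(documents,
               key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(question):
    context = retrieve(question, docs)
    return (f"Use the context to answer the question.\n"
            f"Context: {context}\n"
            f"Question: {question}")

print(build_prompt("How many vacation days do employees get?"))
```

The assembled prompt would then be sent to the LLM, which answers using the retrieved context rather than only its training data.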
Fine-tuning
Fine tuning is another technique
- more complicated than RAG
- used to get output to fit a certain style
e.g. Want output to be "optimistic"
- Give it additional text (10k-100k words or more) to learn from
Also used for apps where task isn't easy
e.g. summarizing customer service calls
e.g. mimicking a writing/speaking style
Also used to learn specific domain knowledge e.g.
- medical notes for a patient with shortness of breath
Pt c/o SOB, DOE. PE: RRR, JVD absent, CTAB. EKG: NSR. Tx: F/u w/ PCP, STAT CXR, cont. PRN O2.
- legal doc:
Licensor grants to Licensee, per Section 2(a)(iii), a non-exclusive right to use the intellectual property, contingent upon compliance with fiduciary duties outlined in Section 8, paragraphs 1-4, and payment as specified in Schedule B, within 15 days hereof.
Some applications don't require an expensive 100B+ parameter model
- With fine-tuning you can get a smaller model (on the order of 500M-1B parameters) to work
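Fine-tuning starts from a dataset of input/output pairs, such as the call-summarization example above. A sketch of preparing such data in JSONL (one JSON object per line), a common fine-tuning data format; the examples themselves are invented:

```python
# Prepare fine-tuning examples: each pairs an input with the
# desired output, e.g. call transcripts with their summaries.
import json

examples = [
    {"prompt": "Call: customer reports a late delivery and asks for a refund.",
     "completion": "Issue: late delivery. Action: refund requested."},
    {"prompt": "Call: customer cannot log in after a password reset.",
     "completion": "Issue: login failure. Action: password reset support."},
]

# Write one JSON object per line (JSONL).
jsonl = "\n".join(json.dumps(e) for e in examples)
print(jsonl.splitlines()[0])
```

A real dataset would contain far more examples (the notes suggest 10k-100k+ words of text), and the exact record format depends on the fine-tuning service used.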
Pretraining an LLM
Pretraining is effective but very expensive
Choosing a model
Some guidelines for choosing a model size
How LLMs follow instructions: Instruction tuning and RLHF (optional)
Given standard training on internet text, the output might not be what you expect
- want the output to be 'Paris'
Instruction tuning: further train the model on examples of "good answers" to instructions
RLHF (Reinforcement Learning from Human Feedback) is a technique to "score" answers in order to train the LLM
- Score is a reward to LLM when it learns good answers
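The scoring idea can be sketched with a toy reward function. In real RLHF the reward model is itself trained from human preference ratings; the hand-written rules below are a fake stand-in for illustration only.

```python
# Toy RLHF sketch: a "reward model" scores candidate answers, and
# training pushes the LLM toward high-scoring ones.
def reward(answer):
    score = 0.0
    if "Paris" in answer:            # factually correct answers score higher
        score += 1.0
    if len(answer.split()) <= 10:    # concise answers preferred
        score += 0.5
    return score

candidates = [
    "I don't know.",
    "The capital of France is Paris.",
    "France is a country in Europe with many cities.",
]

best = max(candidates, key=reward)
print(best)  # The capital of France is Paris.
```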
Tool use and agents (optional)
- Add a user confirmation step to avoid mistakes
- Do not use LLMs in mission/life critical apps
- LLMs aren't great at precise math
- can use external tools, e.g. a CALCULATOR command, to improve capabilities
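The CALCULATOR idea can be sketched as follows: the LLM emits a tool call in its output text, and the application runs the tool and substitutes the result. The `CALCULATOR[...]` syntax is an assumption for illustration, not a standard.

```python
# Sketch of tool use: find CALCULATOR[...] calls in the LLM's
# output, evaluate the arithmetic, and splice the result back in.
import re

def run_tools(llm_output):
    def calc(match):
        expr = match.group(1)
        # Only evaluate simple arithmetic, never arbitrary code.
        if not re.fullmatch(r"[0-9+\-*/(). ]+", expr):
            return match.group(0)
        return str(eval(expr))
    return re.sub(r"CALCULATOR\[([^\]]*)\]", calc, llm_output)

print(run_tools("The total is CALCULATOR[72 * 3] dollars."))
# The total is 216 dollars.
```

As the notes say, a user confirmation step before executing any tool action helps avoid costly mistakes.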
"Reasoning" Agents also extend capabilities
3.1 Generative AI and Business
Day-to-day usage of web UI LLMs
- can be ok at writing simple code
Task analysis of jobs
Framework by Erik Brynjolfsson, Tom Mitchell, and Daniel Rock for analyzing work tasks for possible automation using AI
- Businesses start with augmentation and iterate into automation
- experiment with LLM to see if it has potential to automate a task
- O*NET is a resource to get ideas, but may not be accurate for your business
Additional job analysis examples
New workflows and new opportunities
GenAI can help reduce toil in many professions