AI Notes

Notes on the AI state of the art, with a focus on generative and large language models. These are the "raw materials" for the https://lspace.swyx.io/ newsletter.

This repo used to be called https://github.com/sw-yx/prompt-eng, but was renamed because Prompt Engineering is Overhyped. This is now an AI Engineering notes repo.

This README is just the high-level overview of the space; you should see the most updates in the OTHER markdown files in this repo:

TEXT.md - text generation, mostly with GPT-4
- TEXT_CHAT.md - information on ChatGPT and competitors, as well as derivative products
- TEXT_SEARCH.md - information on GPT-4 enabled semantic search and other info
- TEXT_PROMPTS.md - a small swipe file of good GPT3 prompts
INFRA.md - raw notes on AI Infrastructure, Hardware and Scaling
AUDIO.md - tracking audio/music/voice transcription + generation
CODE.md - codegen models, like Copilot
IMAGE_GEN.md - the most developed file, with the heaviest emphasis on Stable Diffusion, and some notes on Midjourney and DALL-E.
- IMAGE_PROMPTS.md - a small swipe file of good image prompts
Resources: standing, cleaned up resources that are meant to be permalinked to
stub notes - very small/lightweight proto pages of future coverage areas
- AGENTS.md - tracking "agentic AI"
blog ideas - potential blog post ideas derived from these notes bc

Table of Contents

Motivational Use Cases
Top AI Reads
Communities
People
Misc
Quotes, Reality & Demotivation
Legal, Ethics, and Privacy

Motivational Use Cases

images
video
- img2img of famous movie scenes (lalaland)
  - img2img transforming actor with ebsynth + koe_recast
  - how ebsynth works https://twitter.com/TomLikesRobots/status/1612047103806545923?s=20
- virtual fashion (karenxcheng)
- seamless tiling images
- evolution of scenes (xander)
- outpainting https://twitter.com/orbamsterdam/status/1568200010747068417?s=21&t=rliacnWOIjJMiS37s8qCCw
- webUI img2img collaboration https://twitter.com/_akhaliq/status/1563582621757898752
- image to video with rotation https://twitter.com/TomLikesRobots/status/1571096804539912192
- "prompt paint" https://twitter.com/1littlecoder/status/1572573152974372864
- audio2video animation of your face https://twitter.com/siavashg/status/1597588865665363969
- physical toys to 3d model + animation https://twitter.com/sergeyglkn/status/1587430510988611584
- music videos
  - video killed the radio star, colab: uses OpenAI's Whisper speech-to-text to take a YouTube video and create a Stable Diffusion animation prompted by the lyrics in that video
  - Stable Diffusion Videos generates videos by interpolating between prompts and audio
- direct text2video project
text-to-3d https://twitter.com/_akhaliq/status/1575541930905243652
- https://dreamfusion3d.github.io/
- open source impl: https://github.com/ashawkey/stable-dreamfusion
- demo https://twitter.com/_akhaliq/status/1578035919403503616
text products
- has a list of usecases at the end https://huyenchip.com/2023/04/11/llm-engineering.html
Jasper
GPT for Obsidian https://reasonabledeviations.com/2023/02/05/gpt-for-second-brain/
gpt3 email https://github.com/sw-yx/gpt3-email
gpt3() in Google Sheets (2020, 2022) - sheet, google sheets https://twitter.com/mehran__jalali/status/1608159307513618433
- https://gpt3demo.com/apps/google-sheets
- Charm https://twitter.com/shubroski/status/1620139262925754368?s=20
https://www.summari.com/ Summari helps busy people read more
market maps/landscapes
- sequoia market map jan 2023, july 2023
- base10 market map https://twitter.com/letsenhance_io/status/1594826383305449491
- matt shumer market map https://twitter.com/mattshumer_/status/1620465468229451776 https://docs.google.com/document/d/1sewTBzRF087F6hFXiyeOIsGC1N4N3O7rYzijVexCgoQ/edit
- nfx https://www.nfx.com/post/generative-ai-tech-5-layers?ref=context-by-cohere
- a16z https://a16z.com/2023/01/19/who-owns-the-generative-ai-platform/
  - https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/
- madrona https://www.madrona.com/foundation-models/
game assets -
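
The "interpolating between prompts" idea mentioned under music videos can be sketched in a few lines. This is a minimal illustration, not the Stable Diffusion Videos implementation: it assumes prompts have already been turned into embedding vectors (the short vectors below are made up) and that spherical interpolation (slerp) is used to walk between them, which is a common choice but an assumption here. The function names are hypothetical.

```python
import math


def slerp(v0, v1, t):
    """Spherical linear interpolation between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(v0, v1))
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm0 * norm1)))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if sin_theta < 1e-6:
        # Vectors are (anti-)parallel: fall back to plain linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


def interpolation_path(start, end, num_frames):
    """Return num_frames vectors moving from start to end (one per video frame)."""
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")
    return [slerp(start, end, i / (num_frames - 1)) for i in range(num_frames)]


# Hypothetical 2-d "prompt embeddings"; real ones have hundreds of dimensions.
frames = interpolation_path([1.0, 0.0], [0.0, 1.0], num_frames=5)
```

Each vector in `frames` would then be fed to the image model in place of a single prompt embedding, so consecutive frames change gradually.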

Communities

Discords
- Latent Space Discord (ours!)
- General hacking and learning
  - ChatGPT Hackers Discord
  - AI Alignment Lab Discord
  - Akhaliq Discord: https://discord.gg/nYqfg4gnBt
  - Karpathy Discord: https://discord.gg/3zy8kqD9Cp
  - HuggingFace Discord
- Art
  - StableDiffusion Discord
  - Deforum Discord https://discord.gg/upmXXsrwZc
  - Lexica Discord https://discord.com/invite/bMHBjJ9wRh
- AI research
  - LAION discord https://discord.gg/xBPBXfcFHd
  - Eleuther discord: https://www.eleuther.ai/get-involved/ (primer)
- Various startups
  - Perplexity Discord https://discord.com/invite/kWJZsxPDuX
  - Midjourney's discord
    - how to use midjourney v4 https://twitter.com/fabianstelzer/status/1588856386540417024?s=20&t=PlgLuGAEEds9HwfegVRrpg
https://stablehorde.net/
- Agents
  - AutoGPT discord
  - BabyAGI discord
Reddit

People

This list will be out of date but will get you started. My live list of people to follow is at: https://twitter.com/i/lists/1585430245762441216

Quotes, Reality & Demotivation

Narrow, tedious domain use cases https://twitter.com/WillManidis/status/1584900092615528448 and https://twitter.com/WillManidis/status/1584900100480192516
antihype https://twitter.com/alexandr_wang/status/1573302977418387457
antihype https://twitter.com/fchollet/status/1612142423425138688?s=46&t=pLCNW9pF-co4bn08QQVaUg
prompt eng memes
- https://twitter.com/_jasonwei/status/1516844920367054848
things stablediffusion struggles with https://opguides.info/posts/aiartpanic/
New Google
- https://twitter.com/alexandr_wang/status/1585022891594510336
New Powerpoint
via emad
Appending prompts by default in UI
DALLE: https://twitter.com/levelsio/status/1588588688115912705?s=20&t=0ojpGmH9k6MiEDyVG2I6gg
There have been two previous AI winters, one in 1974-1980 and one in 1987-1993 (https://www.erichgrunewald.com/posts/the-prospect-of-an-ai-winter/). bit more commentary here. related - AI Effect - "once it works, it's not AI"
It's just matrix multiplication/stochastic parrots
- Even LLM skeptic Yann LeCun says LLMs have some level of understanding: https://twitter.com/ylecun/status/1667947166764023808

Legal, Ethics, and Privacy

NSFW filter https://vickiboykis.com/2022/11/18/some-notes-on-the-stable-diffusion-safety-filter/
On "AI Art Panic" https://opguides.info/posts/aiartpanic/
- I lost everything that made me love my job through Midjourney
Yannick influencing OPENRAIL-M https://www.youtube.com/watch?v=W5M-dvzpzSQ
art schools accepting AI art https://twitter.com/DaveRogenmoser/status/1597746558145265664
DRM issues https://undeleted.ronsor.com/voice.ai-gpl-violations-with-a-side-of-drm/
stealing art https://stablediffusionlitigation.com
- http://www.stablediffusionfrivolous.com/
- stable attribution https://news.ycombinator.com/item?id=34670136
- counter argument for disney https://twitter.com/jonst0kes/status/1616219435492163584?s=46&t=HqQqDH1yEwhWUSQxYTmF8w
- research on stable diffusion copying https://twitter.com/officialzhvng/status/1620535905298817024?s=20&t=NC-nW7pfDa8nyRD08Lx1Nw This paper used Stable Diffusion to generate 175 million images over 350,000 prompts and only found 109 near copies of training data. Am I right that my main takeaway from this is how good Stable Diffusion is at not memorizing training examples?
scraping content
- https://blog.ericgoldman.org/archives/2023/08/web-scraping-for-me-but-not-for-thee-guest-blog-post.htm
- sarah silverman case - openai response https://arstechnica.com/tech-policy/2023/08/openai-disputes-authors-claims-that-every-chatgpt-response-is-a-derivative-work/
Licensing
- AI weights are not open "source" - Sid Sijbrandij
Diversity and Equity
- sexualizing minorities https://twitter.com/lanadenina/status/1680238883206832129 the reason is porn is good at bodies
- OpenAI tacking on "black" randomly to make DallE diverse

Alignment, Safety

Anthropic - https://arxiv.org/pdf/2112.00861.pdf
- Helpful: attempt to do what is asked. concise, efficient. ask followups. redirect bad questions.
- Honest: give accurate information, express uncertainty. don't imitate responses expected from an expert if it doesn't have the capabilities/knowledge
- Harmless: not offensive/discriminatory. refuse to assist dangerous acts. recognize when providing sensitive/consequential advice
- criticism and boundaries as future direction https://twitter.com/davidad/status/1628489924235206657?s=46&t=TPVwcoqO8qkc7MuaWiNcnw
Just Eliezer's entire body of work
- https://twitter.com/esyudkowsky/status/1625922986590212096
- agi list of lethalities https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-list-of-lethalities
- note that eliezer has made controversial comments in the past and also in recent times (TIME article)
Connor Leahy may be a more sane/measured/technically competent version of yud https://overcast.fm/+aYlOEqTJ0
- it's not just paperclip factories
- https://www.lesswrong.com/posts/HBxe6wdjxK239zajf/what-failure-looks-like
the 6 month pause letter
- https://futureoflife.org/open-letter/pause-giant-ai-experiments/
- yann lecun vs andrew ng https://www.youtube.com/watch?v=BY9KV8uCtj4
- https://scottaaronson.blog/?p=7174
- emily bender response
- Geoffrey Hinton leaving Google
- followed up by one sentence public letter https://www.nytimes.com/2023/05/30/technology/ai-threat-warning.html
xrisk
- Is avoiding extinction from AI really an urgent priority? (link)
- AI Is not an arms race. (link)
- If we’re going to label AI an ‘extinction risk,’ we need to clarify how it could happen. (link)
OpenAI superalignment https://www.youtube.com/watch?v=ZP_N4q5U3eE

regulation

chinese regulation https://www.chinalawtranslate.com/en/overview-of-draft-measures-on-generative-ai/
- https://twitter.com/mmitchell_ai/status/1647697067006111745?s=46&t=90xQ8sGy63D2OtiaoGJuww
- China is the only major world power that explicitly regulates generative AI
italy banning chatgpt
- At its annual meeting in Japan, the Group of Seven (G7), an informal bloc of industrialized democratic governments, announced the Hiroshima Process, an intergovernmental task force empowered to investigate risks of generative AI. G7 members, which include Canada, France, Germany, Italy, Japan, the United Kingdom, and the United States, vowed to craft mutually compatible laws and regulate AI according to democratic values. These include fairness, accountability, transparency, safety, data privacy, protection from abuse, and respect for human rights.
U.S. President Joe Biden issued a strategic plan for AI. The initiative calls on U.S. regulatory agencies to develop public datasets, benchmarks, and standards for training, measuring, and evaluating AI systems.
Earlier this month, France’s data privacy regulator announced a framework for regulating generative AI.
regulation vs Xrisk https://1a3orn.com/sub/essays-regulation-stories.html

Misc

Whisper
- https://huggingface.co/spaces/sensahin/YouWhisper YouWhisper converts Youtube videos to text using openai/whisper.
- https://twitter.com/jeffistyping/status/1573145140205846528 youtube whisperer
- multilingual subtitles https://twitter.com/1littlecoder/status/1573030143848722433
- video subtitles https://twitter.com/m1guelpf/status/1574929980207034375
- you can join whisper to stable diffusion for reasons https://twitter.com/fffiloni/status/1573733520765247488/photo/1
- known problems https://twitter.com/lunixbochs/status/1574848899897884672 (edge case with catastrophic failures)
textually guided audio https://twitter.com/FelixKreuk/status/1575846953333579776
Codegen
- CodegeeX https://twitter.com/thukeg/status/1572218413694726144
- https://github.com/salesforce/CodeGen https://joel.tools/codegen/
pdf to structured data - Impira used to do it (dead link: https://www.impira.com/blog/hey-machine-whats-my-invoice-total) but if you look hard enough on twitter there are some alternatives
text to Human Motion diffusion https://twitter.com/GuyTvt/status/1577947409551851520
- abs: https://arxiv.org/abs/2209.14916
- project page: https://guytevet.github.io/mdm-page/
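
The subtitle use cases listed under Whisper come down to turning timed transcript segments into a subtitle file. A minimal sketch of that last step follows, assuming each segment is a dict with "start" and "end" (seconds) and "text", which is the shape the open source whisper package returns under "segments"; the helper names are hypothetical and the sample segments are made up.

```python
def format_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Build SRT text from a list of {"start", "end", "text"} segments."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Blank line between cues, as the SRT format expects.
    return "\n".join(blocks)


# Made-up segments standing in for a real transcription result.
example_segments = [
    {"start": 0.0, "end": 2.5, "text": " Hello there."},
    {"start": 2.5, "end": 5.0, "text": " Welcome to the video."},
]
srt_text = segments_to_srt(example_segments)
```

With a real transcription, `example_segments` would be replaced by the "segments" list from the transcription result.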
