AI Notes

Notes on the AI state of the art, with a focus on generative and large language models. These are the "raw materials" for the https://lspace.swyx.io/ newsletter.

This repo used to be called https://github.com/sw-yx/prompt-eng, but was renamed because Prompt Engineering is Overhyped.

This README is just a high-level overview of the space; you will see the most updates in the OTHER markdown files in this repo:

IMAGE_GEN.md - the most developed file, with the heaviest emphasis on Stable Diffusion, plus some notes on Midjourney and DALL-E.
- IMAGE_PROMPTS.md - a small swipe file of good image prompts
TEXT.md - text generation, mostly with GPT3
- TEXT_CHAT.md - information on ChatGPT and competitors, as well as derivative products
- TEXT_SEARCH.md - information on GPT3-enabled semantic search and other info
- TEXT_PROMPTS.md - a small swipe file of good GPT3 prompts
INFRA.md - raw notes on AI Infrastructure, Hardware and Scaling
AUDIO.md - tracking audio/music/voice transcription + generation
stub notes - very small/lightweight proto pages
- AGENTS.md - tracking "agentic AI"
- CODE.md - codegen models, like Copilot
- etc...

Table of Contents

Motivational Use Cases
Top AI Reads
Communities
People
Misc
Quotes, Reality & Demotivation
Legal, Ethics, and Privacy

Motivational Use Cases

images
video
- img2img of famous movie scenes (lalaland)
  - img2img transforming actor with ebsynth + koe_recast
  - how ebsynth works https://twitter.com/TomLikesRobots/status/1612047103806545923?s=20
- virtual fashion (karenxcheng)
- seamless tiling images
- evolution of scenes (xander)
- outpainting https://twitter.com/orbamsterdam/status/1568200010747068417?s=21&t=rliacnWOIjJMiS37s8qCCw
- webUI img2img collaboration https://twitter.com/_akhaliq/status/1563582621757898752
- image to video with rotation https://twitter.com/TomLikesRobots/status/1571096804539912192
- "prompt paint" https://twitter.com/1littlecoder/status/1572573152974372864
- audio2video animation of your face https://twitter.com/siavashg/status/1597588865665363969
- physical toys to 3d model + animation https://twitter.com/sergeyglkn/status/1587430510988611584
- music videos
  - video killed the radio star, colab: this uses OpenAI's Whisper speech-to-text to take a YouTube video and create a Stable Diffusion animation prompted by the lyrics in that video
  - Stable Diffusion Videos generates videos by interpolating between prompts and audio
- direct text2video project
text-to-3d https://twitter.com/_akhaliq/status/1575541930905243652
- https://dreamfusion3d.github.io/
- open source impl: https://github.com/ashawkey/stable-dreamfusion
- demo https://twitter.com/_akhaliq/status/1578035919403503616
text products
Jasper
GPT for Obsidian https://reasonabledeviations.com/2023/02/05/gpt-for-second-brain/
gpt3 email https://github.com/sw-yx/gpt3-email
gpt3() in google sheet 2020, 2022 - sheet google sheets https://twitter.com/mehran__jalali/status/1608159307513618433
- https://gpt3demo.com/apps/google-sheets
- Charm https://twitter.com/shubroski/status/1620139262925754368?s=20
https://www.summari.com/ Summari helps busy people read more
sequoia market map https://twitter.com/sonyatweetybird/status/1584580362339962880
base10 market map https://twitter.com/letsenhance_io/status/1594826383305449491
matt shumer market map https://twitter.com/mattshumer_/status/1620465468229451776 https://docs.google.com/document/d/1sewTBzRF087F6hFXiyeOIsGC1N4N3O7rYzijVexCgoQ/edit
game assets -

Communities

StableDiffusion Discord https://discord.com/invite/stablediffusion
LAION discord https://discord.gg/xBPBXfcFHd
Eleuther discord: https://www.eleuther.ai/get-involved/ (primer)
https://reddit.com/r/stableDiffusion
Akhaliq Discord: https://discord.gg/nYqfg4gnBt
Karpathy Discord: https://discord.gg/3zy8kqD9Cp
HuggingFace Discord: https://discuss.huggingface.co/t/join-the-hugging-face-discord/11263
Deforum Discord https://discord.gg/upmXXsrwZc
Lexica Discord https://discord.com/invite/bMHBjJ9wRh
Perplexity Discord https://discord.com/invite/kWJZsxPDuX
Midjourney's discord
- how to use midjourney v4 https://twitter.com/fabianstelzer/status/1588856386540417024?s=20&t=PlgLuGAEEds9HwfegVRrpg
https://stablehorde.net/

People

This list will be out of date but will get you started. My live list of people to follow is at: https://twitter.com/i/lists/1585430245762441216

Quotes, Reality & Demotivation

Narrow, tedious domain use cases https://twitter.com/WillManidis/status/1584900092615528448 and https://twitter.com/WillManidis/status/1584900100480192516
antihype https://twitter.com/alexandr_wang/status/1573302977418387457
antihype https://twitter.com/fchollet/status/1612142423425138688?s=46&t=pLCNW9pF-co4bn08QQVaUg
prompt eng memes
- https://twitter.com/_jasonwei/status/1516844920367054848
things stablediffusion struggles with https://opguides.info/posts/aiartpanic/
New Google
- https://twitter.com/alexandr_wang/status/1585022891594510336
New Powerpoint
via emad
Appending prompts by default in UI
DALLE: https://twitter.com/levelsio/status/1588588688115912705?s=20&t=0ojpGmH9k6MiEDyVG2I6gg

Legal, Ethics, and Privacy

NSFW filter https://vickiboykis.com/2022/11/18/some-notes-on-the-stable-diffusion-safety-filter/
On "AI Art Panic" https://opguides.info/posts/aiartpanic/
Yannic influencing OpenRAIL-M https://www.youtube.com/watch?v=W5M-dvzpzSQ
art schools accepting AI art https://twitter.com/DaveRogenmoser/status/1597746558145265664
DRM issues https://undeleted.ronsor.com/voice.ai-gpl-violations-with-a-side-of-drm/
stealing art https://stablediffusionlitigation.com
- http://www.stablediffusionfrivolous.com/
- stable attribution https://news.ycombinator.com/item?id=34670136
- counter argument for disney https://twitter.com/jonst0kes/status/1616219435492163584?s=46&t=HqQqDH1yEwhWUSQxYTmF8w
- research on stable diffusion copying https://twitter.com/officialzhvng/status/1620535905298817024?s=20&t=NC-nW7pfDa8nyRD08Lx1Nw This paper used Stable Diffusion to generate 175 million images over 350,000 prompts and only found 109 near copies of training data. Am I right that my main takeaway from this is how good Stable Diffusion is at not memorizing training examples?
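
To put the figures quoted above in proportion, here is a rough back-of-the-envelope sketch. It only uses the three numbers from the quoted tweet (175 million images, 350,000 prompts, 109 near copies); the variable names are made up for illustration, and the resulting rate applies to that particular experiment, not to Stable Diffusion usage in general.

```python
# Figures as quoted in the tweet above; nothing here is independently verified.
images_generated = 175_000_000
prompts = 350_000
near_copies = 109

# How many images were sampled for each prompt, on average.
images_per_prompt = images_generated // prompts

# Fraction of generated images flagged as near copies of training data.
copy_rate = near_copies / images_generated

# The same rate expressed as "one near copy per N generated images".
images_per_copy = images_generated / near_copies

print(f"{images_per_prompt} images per prompt")
print(f"near-copy rate: {copy_rate:.2e}")
print(f"about 1 near copy per {images_per_copy:,.0f} images")
```

The near-copy rate works out to well under one in a million generated images for this setup.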

Alignment, Safety

Anthropic - https://arxiv.org/pdf/2112.00861.pdf
- Helpful: attempt to do what is asked. concise, efficient. ask followups. redirect bad questions.
- Honest: give accurate information, express uncertainty. don't imitate responses expected from an expert if it doesn't have the capabilities/knowledge
- Harmless: not offensive/discriminatory. refuse to assist dangerous acts. recognize when providing sensitive/consequential advice
Just Eliezer's entire body of work
- https://twitter.com/esyudkowsky/status/1625922986590212096

Misc

Whisper
- https://huggingface.co/spaces/sensahin/YouWhisper YouWhisper converts Youtube videos to text using openai/whisper.
- https://twitter.com/jeffistyping/status/1573145140205846528 youtube whisperer
- multilingual subtitles https://twitter.com/1littlecoder/status/1573030143848722433
- video subtitles https://twitter.com/m1guelpf/status/1574929980207034375
- you can join whisper to stable diffusion for reasons https://twitter.com/fffiloni/status/1573733520765247488/photo/1
- known problems https://twitter.com/lunixbochs/status/1574848899897884672 (edge case with catastrophic failures)
textually guided audio https://twitter.com/FelixKreuk/status/1575846953333579776
Codegen
- CodeGeeX https://twitter.com/thukeg/status/1572218413694726144
- https://github.com/salesforce/CodeGen https://joel.tools/codegen/
pdf to structured data - Impira used to do it (dead link: https://www.impira.com/blog/hey-machine-whats-my-invoice-total) but if you look hard enough on twitter there are some alternatives
text to Human Motion diffusion https://twitter.com/GuyTvt/status/1577947409551851520
- abs: https://arxiv.org/abs/2209.14916
- project page: https://guytevet.github.io/mdm-page/
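
For the Whisper subtitle use cases listed above, a minimal sketch of the last step (turning transcription segments into an SRT subtitle file) might look like the following. It assumes segments shaped like the ones openai/whisper returns from `transcribe` (dicts with `start`, `end` in seconds, and `text`); the helper names and the sample data are hypothetical, and this is not the code used by any of the linked projects.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from a list of {"start", "end", "text"} dicts."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        text = seg["text"].strip()  # segment text often has a leading space
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # A blank line separates consecutive subtitle entries.
    return "\n".join(blocks)


# Hypothetical sample data standing in for real transcription output.
sample_segments = [
    {"start": 0.0, "end": 2.5, "text": " Hello there."},
    {"start": 2.5, "end": 61.25, "text": " This is a test."},
]
srt_text = segments_to_srt(sample_segments)
print(srt_text)
```

With a real transcription, the `sample_segments` list would be replaced by the `segments` entry of the transcription result, and `srt_text` written to a `.srt` file next to the video.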

Top AI Reads

Beginner Reads

Intermediate Reads

Advanced Reads