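Quite a few things left over from 2023

  1. Andrew Ng's course: stopped at lesson 61, on 2023-11-07
  2. How to build my own pip package: stopped on 2023-10-18
  3. Keep using various models and summarize
    1. https://start.chatgot.io/ aggregates several common ones
    2. https://www.chatpdf.com/
    3. Moonshot AI https://moonshot.feishu.cn/docx/RnkWdeFo8oQabzxYFVwcNg1Mn9g
    4. Zhipu AI (Tsinghua), Tongyi Qianwen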



Publishing a pip package: the two articles I found earlier were not much use, and I no longer remember much about the articles I found on 2023-10-18.

  1. https://packaging.python.org/en/latest/guides/writing-pyproject-toml/
  2. https://mathspp.com/blog/how-to-create-a-python-package-in-2022

Googled "how to publish pip package" again and went through the top-ranked articles. For the build tool, use the most common one, setuptools.

  1. https://builtin.com/data-science/how-to-publish-python-code-pypi simple and fairly clear, but has no concrete example
  2. https://www.turing.com/kb/how-to-create-pypi-packages uses python setup.py sdist; for an explanation of setup.py see https://docs.python.org/3.10/distutils/introduction.html#distutils-simple-example. Open question: is __init__.py required? This article does not mention __init__.py either
  3. Re-read the first article from last year: https://www.freecodecamp.org/news/how-to-create-and-upload-your-first-python-package-to-pypi/
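
To make the setuptools workflow above concrete, here is a minimal sketch that scaffolds a project layout in a temporary directory. The names demo-project and demo_pkg are hypothetical, and the pyproject.toml content is only the small subset I am sure setuptools (version 61 or later) accepts. On the __init__.py question: as far as I know, a directory with __init__.py is a regular package, while one without it is treated as a namespace package, which setuptools' plain find_packages() does not pick up, so including the file is the safe default.

```python
from pathlib import Path
import tempfile

# Hypothetical project and package names, used only for illustration.
PROJECT = "demo-project"
PACKAGE = "demo_pkg"

PYPROJECT = f"""\
[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

[project]
name = "{PROJECT}"
version = "0.0.1"
description = "Minimal example package"
requires-python = ">=3.8"
"""


def scaffold(root: Path) -> Path:
    """Create a minimal setuptools project layout under root and return its path."""
    project = root / PROJECT
    pkg = project / "src" / PACKAGE
    pkg.mkdir(parents=True)
    # __init__.py marks the directory as a regular package.
    (pkg / "__init__.py").write_text('__version__ = "0.0.1"\n')
    (project / "pyproject.toml").write_text(PYPROJECT)
    return project


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        project = scaffold(Path(tmp))
        # List the generated files relative to the project root.
        for path in sorted(project.rglob("*")):
            print(path.relative_to(project))
```

From such a project directory, the commonly documented next steps are python -m build (from the build package) to produce the sdist and wheel, then twine upload dist/* to publish.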

Also: Difference between Module and Class in Python



Recommended by Bard

  1. Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention)
  2. https://jalammar.github.io/illustrated-transformer/

Latest book to read: https://udlbook.github.io/udlbook/


udl: read up to Chapter 3, Shallow neural networks. I did not fully understand Figure 3.8 (visualise a linear function of the two inputs); moving on to Chapter 4.

https://docs.wandb.ai/tutorials Weights & Biases (W&B) is the AI developer platform, with tools for training models, fine-tuning models, and leveraging foundation models.

https://docs.wandb.ai/guides give it a try

https://nlp.seas.harvard.edu/annotated-transformer/ to read and also experiment with, but first get a clearer understanding of the transformer




https://realpython.com/python-requests/ needed it while writing code, so reviewed it again

A Review: Pipenv vs. Poetry vs. PDM (all three tools can specify the Python version)

udl: read up to Chapter 5, Loss functions



RAG & LangChain

Advanced RAG Techniques: an Illustrated Overview (reading)

How to Improve LLMs with RAG

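Transformer learning materials


  1. Understanding Transformers and Attention (written in 2023, marked as a 7-minute read). It gives a brief description of the model, but I had already picked that up bit by bit elsewhere. I now want to go deeper, so I need to read other material

  2. Transformers: A Beginner’s Guide (as an introduction, this one is better than the previous one)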

Ketan Doshi series

  1. Transformers Explained Visually (Part 1): Overview of Functionality

  2. Transformers Explained Visually (Part 2): How it works, step-by-step (skimmed; did not read the mask part closely)

  3. Transformers Explained Visually (Part 3): Multi-head Attention, deep dive (did not understand "Reshaping the Q, K, and V matrices" well)

  4. Transformers Explained Visually — Not Just How, but Why They Work So Well (written in 2021; finished on 4.3; fairly easy to understand)
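
Since the "Reshaping the Q, K, and V matrices" step did not click, here is a minimal sketch of how I currently understand it: the reshape does not compute anything new, it just slices the d_model features of every token into n_heads chunks so that each head sees all positions but only its own slice. The function names and the toy numbers are my own, not from the article.

```python
def split_heads(x, n_heads):
    """Reshape (seq_len, d_model) into (n_heads, seq_len, d_head).

    x is a list of seq_len rows, each a list of d_model numbers.
    """
    d_model = len(x[0])
    if d_model % n_heads != 0:
        raise ValueError("d_model must be divisible by n_heads")
    d_head = d_model // n_heads
    # Head h takes columns [h * d_head, (h + 1) * d_head) of every row.
    return [
        [row[h * d_head:(h + 1) * d_head] for row in x]
        for h in range(n_heads)
    ]


def merge_heads(heads):
    """Inverse of split_heads: (n_heads, seq_len, d_head) -> (seq_len, d_model)."""
    seq_len = len(heads[0])
    return [
        [value for head in heads for value in head[pos]]
        for pos in range(seq_len)
    ]


# Toy Q matrix: 3 tokens, d_model = 4 (made-up numbers).
Q = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
heads = split_heads(Q, n_heads=2)
print(heads[0])  # [[1, 2], [5, 6], [9, 10]]
print(heads[1])  # [[3, 4], [7, 8], [11, 12]]
print(merge_heads(heads) == Q)  # True
```

Attention is then run independently on each head's slice, and merge_heads corresponds to the concatenation step that follows.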

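XQ series: includes real Python examples, which makes it easier to understand

  1. Explained: Transformers for Everyone (2024, 15 minutes)

  2. Explained: Tokens and Embeddings in LLMs (finished; I now have some understanding of embeddings)

  3. Explained: Attention Mechanism in AI (tried the code in a notebook, https://hex.tech/blog/beginners-guide-to-python-notebooks/; finished on 4.1; some concepts were not explained well, so I left the author a comment)

  4. Explained: Hyperparameters in Deep Learning (finished on 4.1; left a comment asking how exactly the loss function shows up in the transformer)

4.3: finished the 10 articles above. https://opencv.org/blog/pytorch-vs-tensorflow/

4.7: finished the following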

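An Intuitive Explanation of ‘Attention Is All You Need’: The Paper That Revolutionized AI and Created Generative AI like ChatGPT (2023, 9 minutes; finished, nothing much useful)

Understanding the Transformer Architecture in Simple English (2024, 8 minutes; explains things more clearly than the previous one, good as an introduction)

Self-Attention: A step-by-step guide to calculating the context vector (2023, 7 minutes; read it first because I was somewhat interested in vectors, but I swallowed it whole without digesting it, and it did not seem to help my understanding much)

Mika.i Chak series, 8 parts: pretty good, short and concise

Transformers — In Plaintext. Part 1 (looks decent at first glance)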

Transformers — Unknown Hero. Part 2

Transformers — In Deep Dive. Part 3

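Transformers — does not exist without Input Processing. Part 4 (positional encoding uses sin/cos; the explanation given is that a sin/cos representation is more efficient for floating-point numbers)

Transformers — Is All About Attention. Part 5 (the Q/K/V computation presented graphically)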

Transformers — Multi-Head Attention. Part 6

Transformers — Masked Multi-Head Attention. Part 7

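Transformers — Feed Forward and Output. Part 8 (a question came to mind: why does the loss function not appear anywhere in the whole process?)

The Illustrated Transformer (finished; written in 2018, but still the most comprehensive)

4.8: started

What are Query, Key, and Value in the Transformer Architecture and Why Are They Used? (2023, 10 minutes; started reading on 4.8)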

The Math Behind Neural Networks

Jay Alammar series

  1. The Illustrated Transformer (finished)
  2. Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention) 2018
  3. https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/ 2018

Roadmap to Learn AI in 2024

Noted on 2.28: https://nlp.seas.harvard.edu/annotated-transformer/ In the end I must experiment with it and finish reading it!! Code: https://github.com/harvardnlp/annotated-transformer/

Why ChatGPT Uses Decoder-Only

ChatGPT's Architecture - Decoder Only? Or Encoder-Decoder?


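A small detour: got bitten by Python list iteration. See "How slicing in Python works", e.g. ::-1, ::2, for i in range(0, len(parts), 2). "Understanding string reversal via slicing" says: You can omit one or more of the elements and it does "the right thing"

One more small lesson to remember: when iterating over a list and handling more than one element per step, a plain for ... in over the elements will not do; use an index instead. Also, reassigning i inside for i in range(...) does not change the value i takes on the next iteration. To make i advance differently, either set a step (if the step is fixed, e.g. for i in range(0, len(parts), 2)) or use a while loop and advance i yourself.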
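
A small sketch of the slicing and iteration points above. The parts list mirrors the name used in the note; the contents of parts and tokens, and the "SKIP" marker, are made up for illustration.

```python
parts = ["a", 1, "b", 2, "c", 3]

# Slicing is [start:stop:step]; any of the three may be omitted.
print(parts[::2])    # ['a', 'b', 'c']
print(parts[1::2])   # [1, 2, 3]
print(parts[::-1])   # [3, 'c', 2, 'b', 1, 'a']

# Fixed step: index-based loop that handles two elements per iteration.
pairs = []
for i in range(0, len(parts), 2):
    pairs.append((parts[i], parts[i + 1]))
print(pairs)         # [('a', 1), ('b', 2), ('c', 3)]

# Reassigning i inside the body does not affect the next iteration.
seen = []
for i in range(4):
    seen.append(i)
    i += 2  # no effect on the values produced by range
print(seen)          # [0, 1, 2, 3]

# Variable step: use while and advance the index yourself.
tokens = ["x", "SKIP", "ignored", "y", "z"]
kept = []
i = 0
while i < len(tokens):
    if tokens[i] == "SKIP":
        i += 2  # jump over the marker and the element after it
        continue
    kept.append(tokens[i])
    i += 1
print(kept)          # ['x', 'y', 'z']
```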

# Is there no simpler way to write this?
weighted_embeddings = {word: [weight * val for val in embedding]
                       for word, embedding in word_embeddings.items()
                       for word_weight, weight in attention_weights.items() if word == word_weight}
# For example, the version below
# (it assumes every word in word_embeddings also has an entry in attention_weights)
weighted_embeddings = {word: [v * attention_weights[word] for v in word_embeddings[word]]
                        for word in word_embeddings}
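
A quick check, with made-up toy data, that the two comprehensions above give the same result when both dictionaries share the same keys. The variable names nested and direct are mine.

```python
# Made-up toy data for illustration only.
word_embeddings = {"cat": [1.0, 2.0], "sat": [0.5, -1.0]}
attention_weights = {"cat": 0.75, "sat": 0.25}

# Original version: nested loop that matches keys by comparison.
nested = {word: [weight * val for val in embedding]
          for word, embedding in word_embeddings.items()
          for word_weight, weight in attention_weights.items() if word == word_weight}

# Simpler version: look the weight up directly by key.
direct = {word: [v * attention_weights[word] for v in word_embeddings[word]]
          for word in word_embeddings}

print(direct)            # {'cat': [0.75, 1.5], 'sat': [0.125, -0.25]}
print(nested == direct)  # True
```

The one behavioural difference: if a word has no attention weight, the nested version silently drops it, while the direct lookup raises KeyError.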



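Found a Chinese-language explanation, "Transformer 详解" (Transformer explained in detail); studying it together with code.

https://www.zhihu.com/question/347678607 positional encoding

Transformer 中的 Positional Encoding (Positional Encoding in the Transformer)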

Master Positional Encoding: Part I
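
For reference while reading the positional encoding articles above, a minimal pure-Python sketch of the sinusoidal encoding from the original Transformer paper: PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)). The function name and the toy sizes are my own.

```python
import math


def positional_encoding(seq_len, d_model, base=10000.0):
    """Return a seq_len x d_model table of sinusoidal position encodings."""
    table = []
    for pos in range(seq_len):
        row = []
        for j in range(d_model):
            # Dimensions 2i and 2i+1 share the same frequency.
            i = j // 2
            angle = pos / (base ** (2 * i / d_model))
            # Even dimensions use sin, odd dimensions use cos.
            row.append(math.sin(angle) if j % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


pe = positional_encoding(seq_len=4, d_model=6)
print(pe[0])  # position 0: every sin term is 0.0 and every cos term is 1.0
for row in pe:
    print([round(v, 3) for v in row])
```

Every value stays within [-1, 1] regardless of position, and the table is simply added element-wise to the token embeddings.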

AI tools

What's the difference between Cursor and the new version of Github Copilot?

How to maximise the Copilot's context awareness?

Tried https://codeium.com/; it kept throwing errors, so giving up for now


transformer cont.

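What are Query, Key, and Value in the Transformer Architecture and Why Are They Used? Finished. I still feel the V matrix is redundant, and it turns out that Simplified Transformer Block Architecture: Insights and Impact also says the simplification efforts include removing the V matrix

Transformer Architecture Simplified (I thought it was about how to simplify the transformer, but it is actually a brief introduction)

Finished the Chen Margalit series; not much new in it

  1. Simplifying Transformers: State of the Art NLP Using Words You Understand — part 3— Attention (has code; finished!)
  2. Simplifying Transformers: State of the Art NLP Using Words You Understand — part 2— Input (have already seen a lot of related material; skimmed)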
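
To see where V actually enters, here is a minimal pure-Python sketch of scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V, on made-up 2x2 matrices. The attention weights depend only on Q and K; V is just what gets averaged. Passing the raw inputs as V in the last call is my own illustration of what dropping a separate value projection could look like, not a claim about what the cited article does.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of numbers."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Scaled dot-product attention; returns (output rows, attention weights)."""
    d_k = len(K[0])
    output, weights = [], []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)
        weights.append(w)
        # Each output row is a weighted average of the rows of V.
        output.append([sum(wj * v[c] for wj, v in zip(w, V)) for c in range(len(V[0]))])
    return output, weights


# Made-up toy inputs: 2 tokens, dimension 2.
X = [[1.0, 0.0], [0.0, 1.0]]
Q, K = X, X
V = [[10.0, 0.0], [0.0, 10.0]]

out, w = attention(Q, K, V)
print([[round(x, 3) for x in row] for row in w])
print([[round(x, 3) for x in row] for row in out])

# Same weights, but averaging the raw inputs instead of a separate V.
out_no_v, w_no_v = attention(Q, K, X)
print(w_no_v == w)  # True: the weights never depended on V
```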
  3. Simplifying Transformers: State of the Art NLP Using Words You Understand — Part 4 — Feed-Forward- Layer
  4. Simplifying Transformers: State of the Art NLP Using Words You Understand — Part 5 — Decoder and Final Output

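What Is ChatGPT Doing … and Why Does It Work? (finished)

The Math Behind Neural Networks (skimmed through; the main difficulty is still backpropagation, which I had seen in Grokking DP, where I previously read up to Chapter 12; now reviewing its description of word embeddings)

Back to udl, continuing from Chapter 5, Loss functions, but I immediately wondered how the loss is computed in NLP and which loss functions would be chosen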


loss function in NLP

Cross Entropy in Large Language Models (LLMs)
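
On the recurring question of where the loss function shows up: as far as I understand, it is not part of the forward-pass diagrams at all. During training, the final layer outputs one vector of logits per position, and cross-entropy compares each of those with the actual next token. A minimal pure-Python sketch with a made-up 3-token vocabulary follows; the function names are my own.

```python
import math


def log_softmax(logits):
    """Log of the softmax probabilities, computed stably."""
    m = max(logits)
    log_total = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_total for x in logits]


def cross_entropy(logits_per_position, target_ids):
    """Mean negative log-likelihood of the target token at each position."""
    losses = [
        -log_softmax(logits)[target]
        for logits, target in zip(logits_per_position, target_ids)
    ]
    return sum(losses) / len(losses)


# Toy vocabulary of 3 tokens; made-up logits for 2 positions.
logits = [
    [2.0, 0.5, 0.1],  # model favours token 0
    [0.0, 0.0, 0.0],  # model has no preference
]
targets = [0, 2]      # the tokens that actually came next

loss = cross_entropy(logits, targets)
print(round(loss, 4))

# A uniform prediction over 3 tokens costs exactly log(3).
print(math.isclose(cross_entropy([[0.0, 0.0, 0.0]], [2]), math.log(3)))  # True
```

Backpropagating this scalar through the whole network is what updates the embeddings and the Q, K, V weights.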

Learning langchain: https://github.com/liaokongVFX/LangChain-Chinese-Getting-Started-Guide

LangChain Agents: Unleashing the Power of Language Models for Real-World Automation


LangChain && RAG

Building a Document-based Question Answering System with LangChain using LLM model

AI Chatbot with your Knowledge base

Building Next-Gen Apps with AI Agents


Intro to LLM Agents with LangChain: Beyond Simple Prompts (the code does not run; starting from the beginning with https://python.langchain.com/v0.2/docs/introduction/)

No good (or at all) reStructuredText editor https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html


How to Improve LLMs with RAG

What is an LLM Agent and how does it work?

Ran the first example, Build a Simple LLM Application with LCEL; the problem I ran into is how to view the LangSmith trace