Quite a few things carried over from 2023
- Andrew Ng's course: stopped at lesson 61, Nov 7, 2023
- How to publish my own pip package: stalled on Oct 18, 2023
- Keep using the various models and write up a summary
- https://start.chatgot.io/ brings together several common ones
- https://www.chatpdf.com/
- Moonshot AI https://moonshot.feishu.cn/docx/RnkWdeFo8oQabzxYFVwcNg1Mn9g
- Zhipu AI (Tsinghua), Tongyi Qianwen
Publishing a pip package: the two articles I found earlier were not much use, and I barely remember the ones I found on Oct 18, 2023 either.
- https://packaging.python.org/en/latest/guides/writing-pyproject-toml/
- https://mathspp.com/blog/how-to-create-a-python-package-in-2022
Re-googled "how to publish pip package" and read the top-ranked articles; using the most common build tool, setuptools
- https://builtin.com/data-science/how-to-publish-python-code-pypi simple and fairly clear, but no actual example
- https://www.turing.com/kb/how-to-create-pypi-packages
For an explanation of python setup.py sdist and setup.py, see https://docs.python.org/3.10/distutils/introduction.html#distutils-simple-example
Open question: is __init__.py mandatory? That page does not mention __init__.py at all.
- Re-read last year's first article: https://www.freecodecamp.org/news/how-to-create-and-upload-your-first-python-package-to-pypi/
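To make the setup.py question concrete, a minimal sketch with setuptools (the package name mypkg and the layout are made up, not taken from any of the articles above):

# assumed layout (mypkg is a made-up name):
#   setup.py
#   mypkg/__init__.py   <- find_packages() only picks up directories that contain __init__.py
from setuptools import setup, find_packages

setup(
    name="mypkg",
    version="0.1.0",
    packages=find_packages(),
)

# build a source distribution and upload it:
#   python setup.py sdist
#   twine upload dist/*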
Also: Difference between Module and Class in Python
Recommended by Bard:
- Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention)
- https://jalammar.github.io/illustrated-transformer/
Newest book to read: https://udlbook.github.io/udlbook/
UDL: reached Chapter 3, shallow neural networks; did not fully understand Figure 3.8 (visualising a linear function of the two inputs); moving on to Chapter 4
https://docs.wandb.ai/tutorials Weights & Biases (W&B) is the AI developer platform, with tools for training models, fine-tuning models, and leveraging foundation models.
https://docs.wandb.ai/guides give it a try
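Before going through the guides, a minimal logging sketch as I understand the W&B basics (the project name and the loss values are placeholders):

import wandb

run = wandb.init(project="demo-project")        # placeholder project name
for step, loss in enumerate([0.9, 0.6, 0.4]):    # made-up loss values, just to show logging
    wandb.log({"loss": loss}, step=step)
run.finish()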
https://nlp.seas.harvard.edu/annotated-transformer/ need to read it and experiment with it, but first get a clearer understanding of the transformer itself
https://jalammar.github.io/illustrated-transformer/
https://realpython.com/python-requests/ needed it while writing code, so reviewed it again
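The parts I actually needed, as a small refresher sketch (the URL and params are placeholders):

import requests

resp = requests.get("https://httpbin.org/get",
                    params={"q": "transformer"},   # query string
                    timeout=10)                    # always set a timeout
resp.raise_for_status()                            # raise on 4xx/5xx responses
print(resp.json()["args"])                         # parsed JSON body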
A Review: Pipenv vs. Poetry vs. PDM; all three tools can pin the Python version
UDL: reached Chapter 5, Loss functions
RAG & LangChain
Advanced RAG Techniques: an Illustrated Overview, currently reading
Two introductory pieces
- Understanding Transformers and Attention. Written in 2023, marked as a 7-minute read. It only sketches the model, and I have already picked up the basics from other sources; what I want now is a deeper understanding, so I still need to read more.
- Transformers: A Beginner's Guide. As an introduction, this one is better than the previous piece.
Ketan Doshi series
- Transformers Explained Visually (Part 1): Overview of Functionality
- Transformers Explained Visually (Part 2): How it works, step-by-step. Skimmed; did not look closely at the mask part.
- Transformers Explained Visually (Part 3): Multi-head Attention, deep dive. Did not fully understand the "Reshaping the Q, K, and V matrices" part; see the reshaping sketch after this list.
- Transformers Explained Visually — Not Just How, but Why They Work So Well. Written in 2021; finished on 4.3, fairly easy to follow.
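To sort out the Part 3 reshaping for myself, a minimal numpy sketch (all sizes are made up); it only shows how (batch, seq, d_model) gets split into (batch, heads, seq, d_head):

import numpy as np

batch, seq_len, d_model, n_heads = 2, 4, 8, 2   # made-up sizes
d_head = d_model // n_heads

Q = np.random.rand(batch, seq_len, d_model)     # output of the Q projection
# split the last axis into heads, then move the head axis in front of the sequence axis
Q_heads = Q.reshape(batch, seq_len, n_heads, d_head).transpose(0, 2, 1, 3)
print(Q_heads.shape)                            # (2, 2, 4, 2) = (batch, heads, seq, d_head)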
XQ series; the actual Python examples make these easier to understand
- Explained: Transformers for Everyone. 2024, 15 minutes.
- Explained: Tokens and Embeddings in LLMs. Finished; have some understanding of embeddings now (see the embedding-lookup sketch below).
- Explained: Attention Mechanism in AI. Tried the code in a notebook (https://hex.tech/blog/beginners-guide-to-python-notebooks/); finished on 4.1; felt some concepts were not explained well and left the author a comment.
- Explained: Hyperparameters in Deep Learning. Finished on 4.1; left a comment asking how exactly the loss function shows up in a transformer.
4.3: finished the 10 articles above, plus https://opencv.org/blog/pytorch-vs-tensorflow/
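As a note-to-self on the tokens-and-embeddings piece, a tiny sketch of token ids indexing an embedding matrix (vocabulary size, ids and values are made up):

import numpy as np

vocab_size, d_model = 10, 4                             # made-up sizes
embedding_matrix = np.random.rand(vocab_size, d_model)  # one row per token id

token_ids = [3, 7, 1]                                   # made-up ids for a short sentence
token_embeddings = embedding_matrix[token_ids]          # each id selects one row
print(token_embeddings.shape)                           # (3, 4) = (seq_len, d_model)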
4.7: finished the following
An Intuitive Explanation of ‘Attention Is All You Need’: The Paper That Revolutionized AI and Created Generative AI like ChatGPT. 2023, 9 minutes; finished, nothing particularly useful.
Understanding the Transformer Architecture in Simple English. 2024, 8 minutes; explains things more clearly than the previous one, good as an introduction.
Self-Attention: A step-by-step guide to calculating the context vector. 2023, 7 minutes; read it first out of curiosity about the vectors, but I skimmed without really digesting it and it did not help my understanding much.
Mika.i Chak series, 8 parts; not bad, short and to the point
Transformers — In Plaintext. Part 1. At first glance it seems OK.
Transformers — Unknown Hero. Part 2
Transformers — In Deep Dive. Part 3
Transformers — does not exist without Input Processing. Part 4. Positional encoding uses sin/cos; the explanation is that for floating-point values a sin/cos representation is more effective.
Transformers — Is All About Attention. Part 5. The QKV computation is presented graphically (a numpy sketch of the same computation follows after this list).
Transformers — Multi-Head Attention. Part 6
Transformers — Masked Multi-Head Attention. Part 7
Transformers — Feed Forward and Output. Part 8. This raises a question for me: where in this whole process does the loss function come in?
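The numpy sketch of the Part 5 QKV computation mentioned above (single head, made-up sizes; this is just scaled dot-product attention, not a full transformer layer):

import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

seq_len, d_k = 4, 8                   # made-up sizes
Q = np.random.rand(seq_len, d_k)      # queries
K = np.random.rand(seq_len, d_k)      # keys
V = np.random.rand(seq_len, d_k)      # values

scores = Q @ K.T / np.sqrt(d_k)       # similarity of every query with every key
weights = softmax(scores, axis=-1)    # each row sums to 1
context = weights @ V                 # weighted sum of values
print(context.shape)                  # (4, 8)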
The Illustrated Transformer: finished. Written in 2018, but still the most comprehensive one.
4.8: started
What are Query, Key, and Value in the Transformer Architecture and Why Are They Used? 2023, 10 minutes; started reading on 4.8
The Math Behind Neural Networks (long)
Jay Alammar series
- The Illustrated Transformer: finished
- Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention), 2018: https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/
Noted on 2.28: https://nlp.seas.harvard.edu/annotated-transformer/ must experiment with it and finish reading it in the end!! Code: https://github.com/harvardnlp/annotated-transformer/
ChatGPT's Architecture - Decoder Only? Or Encoder-Decoder?
Various articles from WeChat official accounts
Side note: got bitten by Python list traversal. How slicing in Python works, e.g. ::-1, ::2, for i in range(0, len(parts), 2).
As "Understanding string reversal via slicing" puts it: "You can omit one or more of the elements and it does 'the right thing'".
Another small lesson to remember: when walking a list, if each step has to consume more than one element you cannot use a plain for ... in; you have to index. And inside for i in range(...) you cannot change i from the loop body, so either use a fixed step (e.g. for i in range(0, len(parts), 2)) or use a while loop and advance i yourself.
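A small sketch of the slicing and stepped-iteration behaviour above (parts is a made-up list):

parts = ["a", "1", "b", "2", "c", "3"]   # made-up data: name/value pairs flattened into one list

print(parts[::-1])                        # reversed copy: ['3', 'c', '2', 'b', '1', 'a']
print(parts[::2])                         # every other element: ['a', 'b', 'c']

# two elements per step: index with a fixed step instead of "for x in parts"
for i in range(0, len(parts), 2):
    name, value = parts[i], parts[i + 1]
    print(name, value)

# when the step is not fixed, use while and advance i yourself
i = 0
while i < len(parts):
    print(parts[i])
    i += 2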
# Isn't there a simpler way to write this? It multiplies each word's embedding by its attention weight.
weighted_embeddings = {word: [weight * val for val in embedding]
                       for word, embedding in word_embeddings.items()
                       for word_weight, weight in attention_weights.items()
                       if word == word_weight}
# For example, the version below:
weighted_embeddings = {word: [v * attention_weights[word] for v in word_embeddings[word]]
                       for word in word_embeddings}
Found a Chinese walkthrough, "Transformer 详解", to study along with its code
https://www.zhihu.com/question/347678607 on positional encoding
Transformer 中的 Positional Encoding
Master Positional Encoding: Part I
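A minimal numpy sketch of the sin/cos positional encoding these articles describe (sizes are made up):

import numpy as np

seq_len, d_model = 6, 8                            # made-up sizes
pos = np.arange(seq_len)[:, None]                  # (seq_len, 1)
i = np.arange(d_model // 2)[None, :]               # (1, d_model/2)
angles = pos / np.power(10000, 2 * i / d_model)    # one frequency per pair of dimensions

pe = np.zeros((seq_len, d_model))
pe[:, 0::2] = np.sin(angles)                       # even dimensions get sin
pe[:, 1::2] = np.cos(angles)                       # odd dimensions get cos
print(pe.shape)                                    # (6, 8)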
What's the difference between Cursor and the new version of Github Copilot?
How to maximise the Copilot's context awareness?
Tried https://codeium.com/ but it kept throwing errors; giving up for now
What are Query, Key, and Value in the Transformer Architecture and Why Are They Used? Finished; I still feel the V matrix is redundant, and it turns out "Simplified Transformer Block Architecture: Insights and Impact" also says the simplification efforts include removing the V matrix
Transformer Architecture Simplified: I thought it was about how to simplify the transformer, but it is really just an overview
Finished the Chen Margalit series; not much new in it
- Simplifying Transformers: State of the Art NLP Using Words You Understand — part 3— Attention. Has code, finished!
- Simplifying Transformers: State of the Art NLP Using Words You Understand — part 2— Input. Have seen most of this material already, skimmed it.
- Simplifying Transformers: State of the Art NLP Using Words You Understand — Part 4 — Feed-Forward- Layer
- Simplifying Transformers: State of the Art NLP Using Words You Understand — Part 5 — Decoder and Final Output
What Is ChatGPT Doing … and Why Does It Work? Finished.
The Math Behind Neural Networks: skimmed through it. The main difficulty is still backpropagation, which I went through in Grokking DP (got up to Chapter 12); now reviewing its description of word embeddings again.
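To pin down the backpropagation step, a minimal numpy sketch of the chain rule through one linear neuron with squared-error loss (all numbers are made up):

import numpy as np

x = np.array([1.0, 2.0])      # made-up input
w = np.array([0.5, -0.3])     # made-up weights
b = 0.1
y_true = 1.0                  # made-up target

# forward pass
y_pred = w @ x + b
loss = 0.5 * (y_pred - y_true) ** 2

# backward pass (chain rule)
dloss_dy = y_pred - y_true    # derivative of 0.5*(y - t)^2 w.r.t. y
grad_w = dloss_dy * x         # derivative of w.x + b w.r.t. w, times dloss_dy
grad_b = dloss_dy

# one gradient-descent step
lr = 0.1
w -= lr * grad_w
b -= lr * grad_b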
Back to UDL, continuing from Chapter 5, loss functions; it immediately raises the question of how the loss is computed for NLP and which loss functions get chosen.
Reading Cross Entropy in Large Language Models (LLMs)
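As far as I understand, this is also where the loss function finally shows up in a transformer: a minimal sketch of cross-entropy for next-token prediction (vocabulary and numbers are made up):

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

vocab = ["the", "cat", "sat", "mat"]          # made-up tiny vocabulary
logits = np.array([2.0, 0.5, 1.0, -1.0])      # made-up model output for the next position
probs = softmax(logits)

target_id = vocab.index("the")                # suppose the true next token is "the"
loss = -np.log(probs[target_id])              # cross-entropy = negative log-likelihood of the true token
print(loss)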
Studying LangChain: https://github.com/liaokongVFX/LangChain-Chinese-Getting-Started-Guide
LangChain Agents: Unleashing the Power of Language Models for Real-World Automation
Building a Document-based Question Answering System with LangChain using LLM model
AI Chatbot with your Knowledge base
Building Next-Gen Apps with AI Agents
https://www.promptingguide.ai/research/llm-agents
Intro to LLM Agents with LangChain: Beyond Simple Prompts. The code does not run; starting over from https://python.langchain.com/v0.2/docs/introduction/
No good reStructuredText editor (maybe none at all): https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html
https://github.com/liaokongVFX/LangChain-Chinese-Getting-Started-Guide
What is an LLM Agent and how does it work?
Ran the first example, Build a Simple LLM Application with LCEL; the problem I hit was how to view the LangSmith trace
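My understanding of that example, as a sketch; the imports assume the 0.2-era packages langchain-core and langchain-openai plus an OPENAI_API_KEY in the environment, and as far as I can tell the LangSmith trace should appear once the two environment variables below are set (the key value is a placeholder):

import os

# turn on LangSmith tracing (key value is a placeholder)
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "..."

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_messages([
    ("system", "Translate the following into {language}:"),
    ("user", "{text}"),
])
model = ChatOpenAI(model="gpt-3.5-turbo")   # model name is an assumption
chain = prompt | model | StrOutputParser()  # LCEL: compose runnables with |

print(chain.invoke({"language": "French", "text": "hello"}))
# the run should then show up as a trace in the LangSmith web UI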