/contents-agent-datasets

GitHub issues used as a blog with a comment feature, a place to publish and/or relay content, and an open discussion forum.

New developments after I posted the above on Zulip on Feb 14: alreadydone#5 (comment)

LLaMA/Alpaca ecosystem

Open-source models

OpenAI

Other corporations

  • Microsoft: Bing Chat (search + browsing tool use, GPT-4 based, in effect a preview of ChatGPT plugins), Bing Image Creator (DALL-E based)

  • Anthropic: Claude

  • Google: PaLM API, Bard

  • Baidu: ERNIE Bot (文心一言; Chinese-language)

  • https://twitter.com/johnjnay/status/1637843926840164353, a list of ways to adapt an LLM to your own data and tasks (a sketch of the "filter with RM" item follows this list):
    - Supervised fine-tuning on your tasks
    - Self-supervised learning (SSL) on your text
    - RL w/ your reward model (RM)
    - Filter high-temp outputs w/ RM
    - Conditional SSL on RM-scored text
    - Prompt w/ context
    - Give it access to your tools
    - Train (soft) parts of prompts
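To make the "Filter high-temp outputs w/ RM" item concrete, here is a minimal best-of-n sketch in Python: sample several high-temperature completions and keep the one a reward model scores highest. The model names are placeholders (the classifier head is even untrained), so this illustrates the control flow rather than a working pipeline.

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
)

# Placeholder generator; any causal LM works the same way.
gen_tok = AutoTokenizer.from_pretrained("gpt2")
gen_model = AutoModelForCausalLM.from_pretrained("gpt2")
# Placeholder "reward model": any sequence classifier whose logit ranks quality.
rm_tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
rm_model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")

prompt = "Explain gradient descent in one sentence."
inputs = gen_tok(prompt, return_tensors="pt")

# 1. Sample n candidates at high temperature.
outputs = gen_model.generate(
    **inputs,
    do_sample=True,
    temperature=1.2,
    max_new_tokens=64,
    num_return_sequences=8,
    pad_token_id=gen_tok.eos_token_id,
)
candidates = gen_tok.batch_decode(outputs, skip_special_tokens=True)

# 2. Score each candidate with the reward model and keep the best one.
with torch.no_grad():
    scores = [
        rm_model(**rm_tok(c, return_tensors="pt", truncation=True)).logits[0, 0].item()
        for c in candidates
    ]
print(candidates[scores.index(max(scores))])
```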

RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation (Microsoft / Wuhan), https://arxiv.org/abs/2303.12570
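The title summarizes the loop; the distinctive step is that the model's own draft completion becomes the next retrieval query. A hedged sketch (the Jaccard retriever, the snippet granularity, and the `generate` callable are illustrative stand-ins, not the paper's exact setup):

```python
def jaccard(a: set, b: set) -> float:
    return len(a & b) / max(len(a | b), 1)

def retrieve(query: str, repo_snippets: list[str], k: int = 3) -> list[str]:
    """Rank repository snippets by token-set similarity to the query."""
    q = set(query.split())
    return sorted(repo_snippets, key=lambda s: -jaccard(q, set(s.split())))[:k]

def repocoder(unfinished_code: str, repo_snippets: list[str],
              generate, iterations: int = 2) -> str:
    """Iterative retrieval-and-generation; `generate` is any code-LLM call."""
    query, draft = unfinished_code, ""
    for _ in range(iterations):
        context = "\n".join(retrieve(query, repo_snippets))
        draft = generate(context + "\n" + unfinished_code)
        query = draft  # key step: the draft completion drives the next retrieval
    return draft
```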

OpenChatKit, https://www.together.xyz/blog/openchatkit, featuring customization recipes for fine-tuning the model and an extensible retrieval system for live-updating answers.
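A minimal sketch of what such a live-updating retrieval layer can look like, assuming a bag-of-words index and a `chat_model` callable as stand-ins (OpenChatKit's actual system differs): documents added to the index at any time are immediately available to ground answers, with no retraining.

```python
import math
from collections import Counter

class LiveIndex:
    """Toy retrieval index that can be extended while the chatbot is running."""

    def __init__(self):
        self.docs: list[str] = []

    def add(self, doc: str) -> None:
        self.docs.append(doc)  # call whenever fresh data arrives

    def top_k(self, query: str, k: int = 2) -> list[str]:
        q = Counter(query.lower().split())
        def cosine(doc: str) -> float:
            d = Counter(doc.lower().split())
            num = sum(q[w] * d[w] for w in q)
            den = (math.sqrt(sum(v * v for v in q.values()))
                   * math.sqrt(sum(v * v for v in d.values())))
            return num / den if den else 0.0
        return sorted(self.docs, key=cosine, reverse=True)[:k]

def answer(question: str, index: LiveIndex, chat_model) -> str:
    """Ground the chat model's answer in the freshest retrieved context."""
    context = "\n".join(index.top_k(question))
    return chat_model(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```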

ChatGPT plugins, https://openai.com/blog/chatgpt-plugins

Copilot for Docs, https://githubnext.com/projects/copilot-for-docs; compare https://github.com/context-labs/autodoc

Tool use

Tool building (Coding)

AI efficiency

Algorithm optimization

Theorem proving

Future of Mathematics

More

GPT-4: The Bitterer Lesson, Alberto Romero, https://thealgorithmicbridge.substack.com/p/gpt-4-the-bitterer-lesson


LLM agents

Active learning

Iterated improvement

AI for Science


Datasets for fine-tuning

Toolformer / plugin / APIs:

StackLLaMA: A hands-on guide to train LLaMA with RLHF, https://huggingface.co/blog/stackllama
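The core of the recipe is a PPO loop over (prompt, response, reward) triples. A hedged sketch using trl's PPOTrainer, with the API roughly as in trl 0.4 (the version era of the post; later releases changed signatures); the tiny model, toy prompts, and length-based reward are stand-ins for the post's LLaMA-7B, StackExchange questions, and trained reward model:

```python
import torch
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer

model_name = "gpt2"  # small stand-in; the post fine-tunes LLaMA-7B with LoRA
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token
model = AutoModelForCausalLMWithValueHead.from_pretrained(model_name)

config = PPOConfig(batch_size=4, mini_batch_size=2)
ppo_trainer = PPOTrainer(config, model, tokenizer=tokenizer)

def reward_fn(texts):
    # Stand-in reward that favors longer answers; the post uses a trained RM.
    return [torch.tensor(float(len(t.split()))) for t in texts]

prompts = ["How do I sort a list in Python?"] * 4  # toy batch of "questions"
query_tensors = [tokenizer(p, return_tensors="pt").input_ids[0] for p in prompts]
response_tensors = ppo_trainer.generate(
    query_tensors, return_prompt=False, max_new_tokens=48
)
responses = [tokenizer.decode(r, skip_special_tokens=True) for r in response_tensors]
stats = ppo_trainer.step(query_tensors, response_tensors, reward_fn(responses))  # one PPO update
```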

Context length

Scaling Transformer to 1M tokens and beyond with RMT, https://arxiv.org/abs/2304.11062 (2M tokens)
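RMT reaches such lengths through segment-level recurrence: the long sequence is processed in fixed-size segments, and a small set of memory vectors is carried from one segment to the next. A simplified sketch (the paper places memory tokens at both ends of each segment and trains the recurrence with backpropagation through time; all sizes here are illustrative):

```python
import torch
import torch.nn as nn

d_model, n_mem, seg_len = 64, 4, 128  # illustrative sizes
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
    num_layers=2,
)

memory = torch.zeros(1, n_mem, d_model)             # carried across segments
long_input = torch.randn(1, 10 * seg_len, d_model)  # stand-in token embeddings

for segment in long_input.split(seg_len, dim=1):
    # Prepend memory tokens, run the transformer over memory + segment,
    # then read the updated memory back out of the output.
    out = encoder(torch.cat([memory, segment], dim=1))
    memory = out[:, :n_mem, :].detach()  # recurrence: memory flows forward
```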