Feature: `--plan` flag

Question

swyxio opened this issue 2 years ago · 2 comments

swyxio commented 2 years ago

this is an easy one

https://news.ycombinator.com/item?id=35970653
Self-planning Code Generation with Large Language Model
https://arxiv.org/pdf/2303.06689.pdf idea
https://news.ycombinator.com/item?id=35735375
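
A rough sketch of what a `--plan` flag could look like, assuming the tool is a Python CLI that takes a prompt. Everything here is hypothetical (`build_parser`, `generate`, `echo_llm`, and the prompt wording); the only part taken from the linked paper is the idea of asking the model for a plan first and then generating code from that plan.

```python
import argparse
from typing import Callable


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical CLI: only the --plan flag itself comes from this issue.
    parser = argparse.ArgumentParser(description="Generate code from a prompt.")
    parser.add_argument("prompt", help="natural-language task description")
    parser.add_argument(
        "--plan",
        action="store_true",
        help="ask the model for a step-by-step plan before writing code",
    )
    return parser


def generate(prompt: str, llm: Callable[[str], str], plan: bool = False) -> str:
    """Generate code directly, or plan first and then implement the plan."""
    if not plan:
        return llm(f"Write code for this task:\n{prompt}")
    # Phase 1: ask for a plan only (prompt wording is an assumption).
    steps = llm(f"List the steps needed to solve this task. Do not write code.\n{prompt}")
    # Phase 2: ask for code that follows the plan.
    return llm(f"Write code for this task:\n{prompt}\n\nFollow this plan:\n{steps}")


def echo_llm(text: str) -> str:
    # Stand-in for a real model call so the sketch runs offline.
    return f"[model output for a {len(text)}-character prompt]"


# Explicit argv list so the example does not depend on the real command line.
args = build_parser().parse_args(["--plan", "sum two numbers"])
print(generate(args.prompt, echo_llm, plan=args.plan))
```

With `--plan` the model is called twice (plan, then code); without it, once.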

Answer 1 · 2023-05-18T06:26:05.000Z

swyxio commented 2 years ago

Answer 2 · 2023-06-19T05:36:08.000Z

The discovery that only GPT-4 can self-improve, while weaker models cannot, is very intriguing, indicating a new type of emergent ability (i.e. to improve upon natural language feedback) may only exist when the model is "mature" (large and well-aligned) enough.
https://twitter.com/Francis_YAO_/status/1670618013089820674

Large Language Models (LLMs) have shown remarkable aptitude in code generation but still struggle on challenging programming tasks. Self-repair -- in which the model debugs and fixes mistakes in its own code -- has recently become a popular way to boost performance in these settings. However, only very limited studies on how and when self-repair works effectively exist in the literature, and one might wonder to what extent a model is really capable of providing accurate feedback on why the code is wrong when that code was generated by the same model. In this paper, we analyze GPT-3.5 and GPT-4's ability to perform self-repair on APPS, a challenging dataset consisting of diverse coding challenges.
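
For reference, the self-repair loop the abstract describes can be sketched roughly as below. This is not the paper's evaluation harness; `run_tests`, `self_repair`, `scripted_llm`, and the prompt wording are all made up for illustration, and `exec` is used only because the demo code is trusted.

```python
from typing import Callable, Optional


def run_tests(code: str, tests: str) -> Optional[str]:
    """Return None if the tests pass, otherwise a short error description."""
    namespace: dict = {}
    try:
        exec(code, namespace)   # never do this with untrusted model output
        exec(tests, namespace)
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def self_repair(task: str, tests: str, llm: Callable[[str], str],
                max_rounds: int = 3) -> Optional[str]:
    """Generate code, then let the same model critique and fix its failures."""
    code = llm(f"Write Python code for this task:\n{task}")
    error = run_tests(code, tests)
    rounds = 0
    while error is not None and rounds < max_rounds:
        # Step 1: the model explains why its own code failed.
        feedback = llm(f"Explain why this code fails.\nCode:\n{code}\nError:\n{error}")
        # Step 2: the model rewrites the code using that feedback.
        code = llm(f"Fix the code using this feedback.\nTask:\n{task}\n"
                   f"Code:\n{code}\nFeedback:\n{feedback}")
        error = run_tests(code, tests)
        rounds += 1
    return code if error is None else None


def scripted_llm(prompt: str) -> str:
    # Stand-in model: buggy first draft, then a correct fix.
    if prompt.startswith("Write"):
        return "def add(a, b):\n    return a - b"
    if prompt.startswith("Explain"):
        return "The function subtracts instead of adding."
    return "def add(a, b):\n    return a + b"


print(self_repair("add two numbers", "assert add(2, 3) == 5", scripted_llm))
```

The feedback step is the part the tweet above is about: if the model cannot explain its own failure accurately, the repair step has little to work with.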