MaurerKrisztian/issue-improver-action

Context of issue and comments > 4000 tokens causes 400 errors

Closed this issue · 8 comments

If the issue contains more than the maximum allowable tokens, the action throws an Error: Request failed with status code 400.

If the context is greater than 4000 tokens, it should be split into multiple sections, each summarized individually, and the summaries then merged using OpenAI. That makes x+1 operations, where x is the number of chunks.
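Something like this, roughly (just a sketch, assuming the openai v4 Node client; the splitter, prompt, and chunk size are placeholders, not what the action actually does):

```ts
// Rough sketch only: assumes the official `openai` v4 Node client; the splitter,
// prompt, and chunk size are placeholders.
import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Naive splitter by characters; real chunking should count tokens instead.
function splitIntoChunks(text: string, maxChars = 8000): string[] {
    const chunks: string[] = [];
    for (let i = 0; i < text.length; i += maxChars) {
        chunks.push(text.slice(i, i + maxChars));
    }
    return chunks;
}

async function summarize(text: string): Promise<string> {
    const res = await openai.chat.completions.create({
        model: 'gpt-3.5-turbo',
        messages: [{ role: 'user', content: `Summarize this issue context:\n\n${text}` }],
    });
    return res.choices[0].message.content ?? '';
}

// x summarization calls for the chunks + 1 call to merge the summaries = x + 1 operations.
async function summarizeLongContext(context: string): Promise<string> {
    const chunks = splitIntoChunks(context);
    const partialSummaries = await Promise.all(chunks.map(summarize));
    return summarize(partialSummaries.join('\n\n'));
}
```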

Good idea. We need to find a way to count the text tokens (https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them)
and then split the text according to the model, because different models have different token limits (https://platform.openai.com/docs/models/gpt-4), e.g. some models have a 32k token limit.
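For example, something like this could decide when chunking is needed (just a sketch; gpt-tokenizer is one option for counting, and the limits below are illustrative, not authoritative):

```ts
// Sketch only: counts tokens with the gpt-tokenizer package and compares against
// per-model limits; the numbers are illustrative and should be checked against the models page.
import { encode } from 'gpt-tokenizer';

const MODEL_TOKEN_LIMITS: Record<string, number> = {
    'gpt-3.5-turbo': 4096,
    'gpt-4': 8192,
    'gpt-4-32k': 32768,
};

function needsChunking(text: string, model: string): boolean {
    const limit = MODEL_TOKEN_LIMITS[model] ?? 4096;
    return encode(text).length > limit;
}
```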

Perhaps we should consider implementing a prompt token limit input, since API usage is also billed per token. https://openai.com/pricing
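For illustration, the limit could come from an action input read with @actions/core (the input name prompt-token-limit and its default are hypothetical, just to show the idea):

```ts
// Hypothetical: the input name `prompt-token-limit` and its default are only examples.
import * as core from '@actions/core';

const promptTokenLimit = Number(core.getInput('prompt-token-limit') || '4000');
core.info(`Using prompt token limit: ${promptTokenLimit}`);
```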

I started work on this using gpt-tokenizer to encode and count the tokens and chunk it up. I think you're right about the token limit. It probably needs to have both a total_token_limit and an input_token_limit. The 4k, 8k, and 32k token limits include both input and output, so you need to leave room for the prompt and the response.
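Roughly what I'm experimenting with (a sketch, not the final PR; the total limit and the reserve for the response are placeholder values):

```ts
// Sketch of token-based chunking with gpt-tokenizer; the total_token_limit and the
// reserve for the response are placeholders, not decided yet.
import { encode, decode } from 'gpt-tokenizer';

const TOTAL_TOKEN_LIMIT = 4096;  // model limit covering input + output together
const RESPONSE_RESERVE = 1024;   // room kept free for the completion
const INPUT_TOKEN_LIMIT = TOTAL_TOKEN_LIMIT - RESPONSE_RESERVE;

// Split text into chunks whose token counts fit the input budget.
// Note: slicing the token array can split a word at a chunk boundary; good enough for a sketch.
function chunkByTokens(text: string, inputLimit = INPUT_TOKEN_LIMIT): string[] {
    const tokens = encode(text);
    const chunks: string[] = [];
    for (let i = 0; i < tokens.length; i += inputLimit) {
        chunks.push(decode(tokens.slice(i, i + inputLimit)));
    }
    return chunks;
}
```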

I'm just trying to get the release testing set up properly and I can post a PR eventually, unless you get to it first.

Thank you for your help! You can do it. I'm experimenting in the "rewrite" branch with different structures, trying to find an easier way to add, use, and resolve placeholders by creating placeholder resolvers and a service that can resolve prompt templates from whatever template placeholders appear in them (currently the different sections use different placeholders), plus dependency injection and some caching. I can open a discussion where you can share your opinion on it: https://github.com/MaurerKrisztian/issue-improver-action/discussions

@kostecky I just looked at your PR; I like the direction.

Thanks! That PR was actually a mistake, as I'm just orienting myself to the TypeScript development workflow and the language, which I don't typically use. I hope to have something in a couple of days, poking at it here and there when I have a moment.

Hi @MaurerKrisztian - I've made some progress, but I wanted to leap ahead and ask a few questions to improve my dev environment, as I'm unfamiliar with GitHub Actions and TypeScript/JavaScript development. I'm just learning as I bump into issues.

  1. I've more or less figured out how the two actions in the repo work for building the release and then publishing it for use. I'm still getting errors and will need to debug them, but I can at least release it manually and trigger it to test. Would you happen to have a doc or reference on how to set up and align the GitHub repo I forked from you with my local development environment?

  2. I would like to test the action out locally as I develop before going through the release pipeline and triggering an event. This is quite a long process to test small changes. Running the .ts code in a tight loop while I make changes would be ideal. I found https://github.com/nektos/act, but please let me know if there is a better or simpler way.

@kostecky The GitHub action is implemented in TypeScript. The repository must include all the dependencies, so the branch that serves as the GitHub action version (tag or branch - name@version) should contain the compiled TypeScript code (dist) along with the necessary dependencies. I have created a GitHub workflow, main.yml, that triggers automatically on every push event: it builds the code (basically runs npm run package) and pushes the result to the "latest" branch, allowing me to test it.
Maybe your build step is broken; try running npm run package before you push.

Honestly, this is my first time creating a custom GitHub Action, and I'm still figuring out how things should work. The testing process has been challenging: unlike regular applications, GitHub Actions cannot simply be run locally. However, https://github.com/nektos/act can simulate the GitHub Actions environment for testing purposes. I only found it recently and haven't tried it yet, but everyone recommends it.

Some models can now handle up to 128,000 tokens of context, so I think this will be enough for most uses.