daveshap/Raspberry

How to contribute data.

Closed this issue · 1 comments

There are some questions/tasks I'd like an open-source reasoning model to be able to answer. Is there some way I can contribute those to this dataset?

Also, are you going to do synthetic generation, and if so, what's the process to participate in that?

Post suggestions or questions about data here:

https://github.com/daveshap/Raspberry/discussions/categories/data

We aren't quite at the synthesis phase yet. We're still gathering ideas. So far:

  • CoT: yes, we all tend to agree this is the central concept
  • Reflection: Probably
  • MCTS: this is being talked about a bit

Then, as far as actually synthesizing data for reasoning, we're talking about using provable tests (i.e. something that lets us verify that the reasoning actually worked), for example:

  • math
  • chess
  • battleship
  • mastermind
  • coding
  • etc.

In other words, testbeds or simulation environments where we can accurately calculate whether or not the reasoning was perfect. In my video this morning ( https://youtu.be/zzaEBGOVKIg ) I demonstrated that Claude seems to be able to validate and clean data, but we'll need to verify that.
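
To make the "provable test" idea a bit more concrete, here is a minimal sketch of what a verifier for one of these testbeds (mastermind) might look like. This is only an illustration, not something the project has settled on: the function names `mastermind_feedback` and `verify_trace`, and the idea of a "trace" as a list of guesses plus the feedback the model claims it received, are assumptions made up for this example.

```python
from collections import Counter


def mastermind_feedback(secret, guess):
    """Return (exact, partial) for a guess.

    exact   = pegs with the right colour in the right position
    partial = pegs with the right colour in the wrong position
    """
    if len(secret) != len(guess):
        raise ValueError("secret and guess must have the same length")
    exact = sum(s == g for s, g in zip(secret, guess))
    # Colour overlap ignoring position; subtracting exact matches
    # leaves the colours that are present but misplaced.
    overlap = sum((Counter(secret) & Counter(guess)).values())
    return exact, overlap - exact


def verify_trace(secret, guesses, claimed_feedback):
    """Hypothetical check for a synthetic reasoning sample.

    Accept the sample only if every feedback pair the model claimed
    matches the ground truth and the final guess equals the secret.
    """
    if not guesses or len(guesses) != len(claimed_feedback):
        return False
    for guess, claimed in zip(guesses, claimed_feedback):
        if mastermind_feedback(secret, guess) != tuple(claimed):
            return False
    return guesses[-1] == secret


if __name__ == "__main__":
    # A sample whose intermediate feedback and final answer both check out.
    print(verify_trace("RGBY", ["RGYB", "RGBY"], [(2, 2), (4, 0)]))
```

The point of a setup like this is that the check is mechanical: a sample is kept only when the game rules confirm every step, so no human or model judgment is needed to decide whether the reasoning worked.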