How to contribute data.

Question

There are some questions/tasks I'd like an open-source reasoning model to be able to answer. Is there some way I can contribute those to this dataset?

Also, are you going to do synthetic generation, and if so, what's the process to participate in that?

Answer 1 · 2024-09-14T16:27:08.000Z

Post suggestions and questions about data here:

https://github.com/daveshap/Raspberry/discussions/categories/data

We aren't quite at the synthesis phase yet. We're still gathering ideas. So far:

CoT: yes, we all tend to agree this is the central concept
Reflection: Probably
MCTS: this is being talked about a bit

Then, as far as actually synthesizing data for reasoning, we're talking about using provable tests for synthesis (i.e. something that lets us provably verify that the reasoning worked):

math
chess
battleship
mastermind
coding
etc

In other words, testbeds or simulation environments where we can accurately calculate whether or not the reasoning was perfect. In my video this morning ( https://youtu.be/zzaEBGOVKIg ) I demonstrated that Claude seems to be able to validate and clean data. But we'll need to verify that.
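
To make the "provable test" idea concrete, here is a rough sketch of what a checker for one of those testbeds (mastermind) might look like. This is only an illustration, not anything the project has settled on: the function names `score`, `is_consistent`, and `verify_trace` are made up for this example, and it assumes a model's reasoning is recorded as a plain sequence of guesses.

```python
from collections import Counter


def score(secret, guess):
    """Return (exact, partial) Mastermind feedback for a guess."""
    # Exact hits: right color in the right position.
    exact = sum(s == g for s, g in zip(secret, guess))
    # Multiset overlap counts every color match regardless of position.
    overlap = sum((Counter(secret) & Counter(guess)).values())
    return exact, overlap - exact


def is_consistent(guess, history):
    """True if `guess` could still be the secret given earlier (guess, feedback) pairs."""
    return all(score(guess, past) == feedback for past, feedback in history)


def verify_trace(secret, guesses):
    """Accept a guess sequence only if every guess respects the feedback
    seen so far and the final guess is the secret (hypothetical criterion)."""
    history = []
    for guess in guesses:
        if not is_consistent(guess, history):
            return False  # the guess contradicts feedback already received
        history.append((guess, score(secret, guess)))
    return bool(guesses) and guesses[-1] == secret


if __name__ == "__main__":
    print(verify_trace("RGBY", ["RRGG", "RGBY"]))
    print(verify_trace("RGBY", ["RRGG", "RRGG"]))
```

The point is that the environment, not another model, decides whether a trace counts as good reasoning, so synthetic examples could be filtered automatically. The same pattern would presumably carry over to math, chess, or coding with a different `score` function.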