OpenBioLink/ThoughtSource
A central, open resource for data and tools related to chain-of-thought reasoning in large language models. Developed @ Samwald research group: https://samwald.info/
Jupyter Notebook · MIT
Issues
mawps dataset cot incorrect
#138 opened by KonstantinHebenstreit - 0
New CoT Dataset Report: CoT-Collection
#137 opened by chunhuizng - 1
Tutorial notebook does not open
#128 opened by TolgaCorbaci - 0
Annotator sort by dataset
#100 opened by KonstantinHebenstreit - 0
Add info on ThoughtSource-100 to Readme
#115 opened by matthias-samwald - 1
Generate takes very long time to finish
#129 opened by michalrzak - 0
krippendorff scores
#130 opened by KonstantinHebenstreit - 2
save generated cots in case of error
#102 opened by KonstantinHebenstreit - 1
loading collections and generated_cots accept single strings, not only lists
#107 opened by KonstantinHebenstreit - 1
link to Annotator example file broken
#113 opened by tmontana - 0
change saving of default template
#105 opened by KonstantinHebenstreit - 0
PubmedQA dataset: Add reference CoT based on LONG_ANSWER field in source
#81 opened by matthias-samwald - 0
add conda
#90 opened by KonstantinHebenstreit - 0
Add datasets to Hugging Face dataset hub
#82 opened by matthias-samwald - 1
Default values for str if empty: None or ""
#48 opened by nomisto - 1
Datasets: MedQA, MedMCQA, PubmedQA
#68 opened by matthias-samwald - 1
Rename templates.json
#50 opened by nomisto - 2
Document our standardized data schema
#16 opened by matthias-samwald - 0
Build tests sometimes fail because of "too many requests" HTTP error in our remote sources
#61 opened by matthias-samwald - 0
`document_id` is redundant
#47 opened by nomisto - 8
Dataset: AQuA
#18 opened by nomisto - 0
Grant application writing
#14 opened by matthias-samwald - 0
Add datasets from 'Can large language models reason about medical questions?' paper (?)
#22 opened by matthias-samwald - 1
Create script that generates an overview of our converted datasets and their contents
#12 opened by matthias-samwald - 1
Create motivating demos of CoT streams
#13 opened by matthias-samwald - 1
Collect prompts used to generate CoTs
#10 opened by matthias-samwald - 0
Dataset: EntailmentBank
#7 opened by matthias-samwald - 1
Dataset: OpenBookQA
#4 opened by matthias-samwald - 8
Dataset: CommonsenseQA
#19 opened by nomisto - 0
Datasets: Datasets from Wei2022 repository (aqua, asdiv, commonsenseqa, date_understanding, gsm, mawps, sports_understanding, strategy_qa, svamp)
#9 opened by matthias-samwald - 5
Dataset: WorldTree
#5 opened by matthias-samwald - 0
Add field 'generated_answer' to schema
#21 opened by matthias-samwald - 0
Error during pip install
#20 opened by matthias-samwald