bigcode-project/bigcode-analysis
Repository for analysis and experiments in the BigCode project.
Jupyter Notebook · Apache-2.0
Issues
- [Decontamination] Add readme and instructions to run substring decontamination (#19 opened by RaymondLi0)
- [Near Deduplication] Tokenization (#10 opened by ChenghaoMou)
- cannot import AttentionType from gpt2 (#20 opened by ocramz)
- Decontamination (#13 opened by ChenghaoMou)
- github scraping speed limit (#15 opened by bigximik)
- Broken link (#12 opened by Sleepyhead01)
- [Near Deduplication] Benchmark (#7 opened by ChenghaoMou)
- [Exact Substring Deduplication] Analysis (#8 opened by ChenghaoMou)
- [Near Deduplication] Post processing (#9 opened by ChenghaoMou)
- Rename model names on HF hub (#2 opened by harm-devries)
- Reorganize bigcode-data-analysis repository (#3 opened by loubnabnl)