swj0419/detect-pretrain-code
This repository provides an original implementation of "Detecting Pretraining Data from Large Language Models" by *Weijia Shi, *Anirudh Ajith, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, Luke Zettlemoyer.
Python · Apache-2.0
Issues
- Typos in Figure 1 (#16, opened by xiami2019, 0 comments)
- What's the length setting of Table 1? (#7, opened by Spico197, 4 comments)
- Deprecated model (#14, opened by dtumkaya, 0 comments)
- Request requirements (#13, opened by zc1023, 0 comments)
- Llama2 contamination (#8, opened by wendlerc, 2 comments)
- About baseline methods (#12, opened by CUCHon, 0 comments)
- Data Collection Code (#11, opened by zhentingqi, 0 comments)
- Accuracy formula (#10, opened by lashoun, 0 comments)
- Add requirments.txt (#9, opened by yhyu13, 0 comments)
- Wrong token probabilities calculation? (#5, opened by Shaier, 4 comments)
- datasets package (#4, opened by xiaoshuang0918, 0 comments)
- Regarding Thresholding (#3, opened by aflah02, 2 comments)
- Question about logits from OpenAI models (#2, opened by jiangjiechen, 2 comments)
- New paper??? (#1, opened by BunsenFeng, 0 comments)