Support multiple submissions per student
This issue is primarily meant as a braindump/storm to collect ideas and remarks. Feel free to add your own!
Problem
Most programming platforms allow students to submit more than once for a given exercise. If that is the case, students who hand in plagiarized submissions will sometimes copy another student's solution as-is to confirm it is correct, and afterwards submit an altered version to hide the plagiarism.
To discover this kind of plagiarism, you can pass all submissions for a single exercise to Dolos. However, Dolos currently considers each file its own submission and will match files from the same student together. Since these submissions are often very similar, they will create high-similarity pairs that drown out pairs between different students, reducing the effectiveness of the report.
Solution
- First, there needs to be a way to communicate to Dolos which submissions belong to the same student. I see two different methods:
  - If the `paths` given as argument to the CLI are directories, all submissions within the same directory could be considered to be from the same student.
  - When receiving input from an `input.csv`, a field `student_id` could tell Dolos which submissions belong together. As this is the output of Dodona's export format, there is no need for Dodona to change anything to support this.
- Second, the Dolos algorithm should probably ignore matches between submissions of the same student and not generate `Pairs` between them.
  - However, it could be interesting to be able to view the changes a student made between subsequent submissions as well.
- Finally, careful consideration is needed on how to implement this in the UI / CSV generation:
  - A `Pair` could become the "most similar pair" between two students' submissions. However, this could differ between each pair of students.
  - There should be a way to go through the individual submissions of each student and compare them separately. Often, previous or later submissions contain more useful information than the most similar one.
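To make the three steps above more concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not Dolos code: `Submission`, `student_from_path`, `cross_student_pairs`, `most_similar_per_student_pair` and the `toy_similarity` scores are all hypothetical names and values, and the similarity function stands in for whatever score Dolos would compute.

```python
from dataclasses import dataclass
from itertools import combinations
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Submission:
    path: str     # e.g. "alice/v1.py"
    student: str  # grouping key: parent directory or a student_id field


def student_from_path(path: str) -> str:
    """Assumption: the parent directory name identifies the student."""
    return PurePosixPath(path).parent.name


def cross_student_pairs(submissions):
    """Yield only pairs whose submissions come from different students."""
    for a, b in combinations(submissions, 2):
        if a.student != b.student:
            yield a, b


def most_similar_per_student_pair(submissions, similarity):
    """Keep the highest-scoring submission pair for each pair of students."""
    best = {}
    for a, b in cross_student_pairs(submissions):
        key = tuple(sorted((a.student, b.student)))
        score = similarity(a, b)
        if key not in best or score > best[key][0]:
            best[key] = (score, a, b)
    return best


# Toy data: alice submitted twice, bob and carol once.
paths = ["alice/v1.py", "alice/v2.py", "bob/v1.py", "carol/v1.py"]
subs = [Submission(p, student_from_path(p)) for p in paths]

# Made-up scores standing in for a real similarity measure.
toy_scores = {
    frozenset({"alice/v1.py", "bob/v1.py"}): 0.95,  # copied as-is
    frozenset({"alice/v2.py", "bob/v1.py"}): 0.40,  # altered afterwards
}


def toy_similarity(a: Submission, b: Submission) -> float:
    return toy_scores.get(frozenset({a.path, b.path}), 0.0)


best = most_similar_per_student_pair(subs, toy_similarity)
score, left, right = best[("alice", "bob")]
print(score, left.path, right.path)
```

In this toy data the pair reported for alice and bob is alice's first, unaltered submission, which is exactly the one that a single-submission analysis would miss. The sketch also shows the open question from the list: the submission that represents a student can differ per pair, so the UI would still need a way to browse each student's other submissions.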
Sidenote
Currently at Aalto they are using labels to group students together. While this works surprisingly well, it does cause some issues in the UI (e.g. a plagiarism graph that is very long).