- Do not open the objectives section (introduction) by talking about kiskadee
- Set index words
- Improve kiskadee screenshots
- Future work: use user feedback to enhance the ranking model
- Extend the terminology section (see tcc)
- Add an overview of kiskadee bug reports in the introduction
- It would be interesting to show the number of distinct warnings in the histogram of warning severities for each tool (see Table IV)
- Show a comparison of cases where the tools cover the same flaw versus cases where they do not.
- There are 174 warning lines where all the tools report a flaw but the label is false according to Juliet. At the same time, there is no case where the same situation holds and the label is true. This could be discussed in the paper (perhaps with an example?)
- Maybe other tools with a broader range of findings could be more reliable for the validation.
- How does the true negative rate influence the classifier?
- In the attached file there are some interesting cases that should be investigated in more detail, e.g., lines where all tools give warnings while Juliet labels them as false.