daniel4x/daniel4x.github.io

Reproduce "Unveiling Hidden Links Between Unseen Security Entities"

Closed this issue · 8 comments

Hi, I read your paper "Unveiling Hidden Links Between Unseen Security Entities" and tried to reproduce it. However, I was unable to improve the MRR when I fine-tuned ULTRA directly on the threat knowledge graph. I used the dataset provided in the paper "Uncovering CWE-CVE-CPE Relations with Threat Knowledge Graphs". Could you check whether there is a problem with this dataset? Or have you made any changes to ULTRA's training source code? I'd be very grateful if you could tell me!
image
To make the log easier to read, I deleted part of the loss record.
image

Hi @jr011, thanks for your interest in my paper.

In the paper, I have multiple configurations of ULTRA:

  • ULTRA's original implementation without any change:
    • Zero-shot (no training)
    • Fine-tuned
  • VulnScopper, which is ULTRA combined with the language model embeddings

Which one of them are you trying to run?

In addition, note that the dataset used in the paper is slightly different. I used all the CVEs that were available up to the date mentioned in the paper, subject to the restrictions described there.

Thank you for your reply! I'm trying to run the fine-tuned configuration of ULTRA's original implementation, without any change. This is the first step toward implementing VulnScopper, and I'm stuck here. I wonder if my dataset is incorrect, or if I need to change some of ULTRA's hyperparameters for the threat dataset.

My zero-shot results with ULTRA were better than the results you posted. Therefore, I suggest starting from zero-shot, which will allow you to focus on the data and the graph rather than the model.

Attaching zero-shot results (transductive NVD):
image

This is my graph construction:
image

And I guess you are already familiar with the transductive dataset concept in ULTRA, but just in case, this is how I defined my dataset (I skip download as I already have the data in the relevant path):

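from ultra.datasets import TransductiveDataset  # base class from the ULTRA repository


class NvdTransductive(TransductiveDataset):

    name = "NvdTransductive"
    delimiter = '\t'  # triples are stored as tab-separated lines

    def download(self):
        # The data files are already in the expected path, so skip downloading.
        pass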

I'm planning to upload my model and data as soon as I have the time

@jr011 another thing: are you evaluating only CPE->CVE and CVE->CWE edges?
Note that in my paper I'm interested only in these particular edges.
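
A minimal sketch of restricting an evaluation split to those edge types, assuming tab-delimited `head<TAB>relation<TAB>tail` lines as in the dataset definition above. `MatchingCVE` is the relation name mentioned in this thread for CPE-to-CVE edges; `MatchingCWE`, `HasVendor`, and all entity IDs are hypothetical placeholders, not names taken from the paper's data.

```python
# Relations to keep. Only "MatchingCVE" appears in this thread;
# "MatchingCWE" is a hypothetical name for the CVE-to-CWE relation.
TARGET_RELATIONS = {"MatchingCVE", "MatchingCWE"}


def filter_triples(lines, relations=TARGET_RELATIONS, delimiter="\t"):
    """Keep only well-formed triples whose relation is in `relations`."""
    kept = []
    for line in lines:
        parts = line.rstrip("\n").split(delimiter)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        head, relation, tail = parts
        if relation in relations:
            kept.append((head, relation, tail))
    return kept


# Made-up sample lines, for illustration only.
sample = [
    "cpe:a\tMatchingCVE\tCVE-0000-0001\n",
    "CVE-0000-0001\tMatchingCWE\tCWE-79\n",
    "cpe:a\tHasVendor\tvendor:x\n",
    "\n",
]
filtered = filter_triples(sample)
print(filtered)
```

Applying the same filter to both the validation and test files keeps the reported metrics limited to the edge types of interest, while the training file can still contain every relation.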

Yes, I ran a zero-shot experiment and the results were similar to yours. Both my validation set and test set contain only CVE2CWE and CVE2CPE edges, and my dataset definition is the same as yours. The dataset I used looks like this and should be similar to yours. But I don't understand why fine-tuning performs so poorly.
image

OK, that's great that the zero shot experiment was successful.

The results you are showing in the log (Hits@...) seem poor. Are they the overall results for all edge types, or only for CWE, CPE, and CVE?
I remember that my code has a custom evaluation to make sure I evaluate only the relevant head-to-tail or tail-to-head relations.
For example, if an edge was of type MatchingCVE (CPE to CVE), I only evaluated queries such as (?, MatchingCVE, CVE).
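
The idea of evaluating only one query direction per relation could be sketched roughly as follows. This is not the custom evaluation code referred to above, only an illustration under stated assumptions: the ranks are taken as already computed by the model (1-based), the `QUERY_SIDE` mapping and the `restricted_metrics` helper are hypothetical, and `HasVendor` is a made-up relation.

```python
# Side of the triple to predict for each relation (assumed mapping).
# "head" means queries of the form (?, relation, tail).
QUERY_SIDE = {"MatchingCVE": "head"}


def restricted_metrics(ranked, query_side=QUERY_SIDE, ks=(1, 10)):
    """Compute MRR and Hits@k over the chosen query direction only.

    ranked: iterable of (relation, head_rank, tail_rank), ranks are 1-based.
    """
    ranks = []
    for relation, head_rank, tail_rank in ranked:
        side = query_side.get(relation)
        if side == "head":
            ranks.append(head_rank)
        elif side == "tail":
            ranks.append(tail_rank)
        # Relations without an entry in query_side are ignored entirely.
    if not ranks:
        return {}
    metrics = {"mrr": sum(1.0 / r for r in ranks) / len(ranks)}
    for k in ks:
        metrics[f"hits@{k}"] = sum(r <= k for r in ranks) / len(ranks)
    return metrics


# Made-up ranks, for illustration only.
example = [
    ("MatchingCVE", 1, 250),
    ("MatchingCVE", 4, 900),
    ("HasVendor", 3, 3),
]
result = restricted_metrics(example)
print(result)
```

Averaging both directions would mix in the tail ranks (250 and 900 here), which can make the metrics look much worse than the direction the task actually cares about.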

Additionally, something seems odd about the drop in your loss.
How many batches per epoch are actually used (print it from the training code)?
Can you try decreasing or increasing the number of negatives (e.g., 512)?

Try to begin with 500-1k batches per epoch.
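
As a rough check on how many batches an epoch actually contains, the arithmetic can be sketched as below. It assumes that an unset (null) batches-per-epoch value falls back to one full pass over the training triples and that a cap larger than a full pass has no effect; the helper name and the triple count and batch size in the example are hypothetical.

```python
import math


def batches_per_epoch(num_train_triples, batch_size, cap=None):
    """Number of batches in one epoch, optionally capped.

    Assumption: cap=None means one full pass over the training triples.
    """
    full_pass = math.ceil(num_train_triples / batch_size)
    if cap is None:
        return full_pass
    return min(cap, full_pass)


# Hypothetical sizes, for illustration only.
uncapped = batches_per_epoch(1_000_000, 64)
capped = batches_per_epoch(1_000_000, 64, cap=1000)
print(uncapped, capped)
```

With these made-up sizes an uncapped epoch is far longer than the 500 to 1k batches suggested above, so validation would run much less often per training step.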

Both my validation set and test set contain only CWE, CPE, and CVE, so the results are just for CWE, CPE, and CVE. I also find the drop in my loss strange. My bpe (batches per epoch) is set to null; I'll try setting it to other values. Thanks again for your reply. I hope your paper will be published!


Thanks @jr011, feel free to reach out again. I'm curious about how other researchers will use my technique and improve it.