Question about decode threshold

Question

Closed this issue 3 years ago · 2 comments

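Hello author, when decoding objects, do the results in the paper correspond to a threshold of 0.5? Also, did you run any comparison experiments on different values of the threshold?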

# ./.utils.py: L39

def __call__(self, text: str, threshold: float = 0.5) -> Set:
    tokened = self.tokenizer.encode(text)
    token_ids, segment_ids = np.array([tokened.ids]), np.array([tokened.type_ids])
    mapping = rematch(tokened.offsets)
    entity_heads_logits, entity_tails_logits = self.entity_model.predict([token_ids, segment_ids])
    # Keep the positions whose head/tail score exceeds the threshold
    entity_heads = np.where(entity_heads_logits[0] > threshold)
    entity_tails = np.where(entity_tails_logits[0] > threshold)
    subjects = []

Answer 1 · 2021-12-22T09:40:26.000Z

The threshold used in the paper is 0.5. We did not run an ablation study over different thresholds.
In practice, adjusting the threshold for your scenario may help: raise it when precision matters more, and lower it when recall matters more.
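
To make the precision/recall trade-off concrete, here is a minimal sketch that sweeps a few thresholds over toy scores. The helper `precision_recall_at` and the `scores` and `labels` arrays are hypothetical and made up for illustration; they are not taken from the repository or the paper. The sketch only assumes that a position is predicted positive when its score is greater than the threshold, as in the snippet quoted in the question.

```python
import numpy as np


def precision_recall_at(scores: np.ndarray, labels: np.ndarray, threshold: float):
    """Return (precision, recall) when positions with score > threshold are predicted positive."""
    predicted = scores > threshold
    gold = labels == 1
    true_pos = int(np.sum(predicted & gold))
    n_pred = int(np.sum(predicted))
    n_gold = int(np.sum(gold))
    # Convention used here (an assumption): an empty denominator yields 1.0
    precision = true_pos / n_pred if n_pred else 1.0
    recall = true_pos / n_gold if n_gold else 1.0
    return precision, recall


# Hypothetical per-position scores and gold labels, for illustration only
scores = np.array([0.95, 0.80, 0.62, 0.55, 0.41, 0.35, 0.10])
labels = np.array([1, 1, 1, 0, 1, 0, 0])

for t in (0.3, 0.5, 0.7):
    p, r = precision_recall_at(scores, labels, t)
    print(f"threshold={t:.1f}  precision={p:.2f}  recall={r:.2f}")
```

With this toy data, precision rises from about 0.67 to 1.00 and recall falls from 1.00 to 0.50 as the threshold moves from 0.3 to 0.7. On real data you would run the same sweep on a held-out dev set and pick the threshold that suits your scenario.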

Answer 2 · 2021-12-23T02:30:30.000Z

OK, thanks a lot.