sfischer13/python-arpa

The efficiency of computing backoff

huangruizhe opened this issue · 0 comments

def log_p_raw(self, ngram):
    try:
        return self._log_p(ngram)
    except KeyError:
        if len(ngram) == 1:
            raise KeyError
        else:
            try:
                log_bo = self._log_bo(ngram[:-1])
            except KeyError:
                log_bo = 0
            return log_bo + self.log_p_raw(ngram[1:])

Using try...except to implement the backoff may not be efficient enough.
According to the Python documentation:

A try/except block is extremely efficient if no exceptions are raised. Actually catching an exception is expensive.

However, it is common for a language model to encounter unseen n-grams and back off to lower orders, so the exception path here is the normal case rather than a rare one. I therefore think it may be more appropriate to implement this with if...else (as the documentation also suggests) instead of try...except.
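
To make the suggestion concrete, here is a minimal sketch of the same backoff logic without exception handling on the lookup path. It assumes the probabilities and backoff weights live in two plain dicts keyed by n-gram tuples; the class `BackoffModel` and the attributes `_ps` and `_bos` are hypothetical names for illustration, not necessarily how the library stores them. The recursion is also unrolled into a loop, which is a separate choice from dropping try...except.

```python
class BackoffModel:
    """Toy stand-in for an ARPA model (hypothetical, for illustration only)."""

    def __init__(self, ps, bos):
        self._ps = ps    # assumed: n-gram tuple -> log probability
        self._bos = bos  # assumed: n-gram tuple -> log backoff weight

    def log_p_raw(self, ngram):
        log_bo_sum = 0.0
        while True:
            # dict.get returns None on a miss instead of raising KeyError
            log_p = self._ps.get(ngram)
            if log_p is not None:
                return log_bo_sum + log_p
            if len(ngram) == 1:
                # Unseen unigram: same outcome as the original code
                raise KeyError(ngram)
            # A missing backoff weight counts as 0, as in the original
            log_bo_sum += self._bos.get(ngram[:-1], 0.0)
            # Back off to the next lower order by dropping the first word
            ngram = ngram[1:]


model = BackoffModel(
    ps={("a",): -1.0, ("b",): -2.0, ("a", "b"): -0.5},
    bos={("a",): -0.3},
)
seen = model.log_p_raw(("a", "b"))        # found directly
backed_off = model.log_p_raw(("a", "a"))  # bo(a) + p(a)
```

Whether this is actually faster depends on how often lookups miss, so it would be worth measuring both versions with `timeit` on a real model before changing anything.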