jacobeisenstein/gt-nlp-class

Formal Language Theory and Dijkstra's algorithm, Chapter 9

Closed this issue · 2 comments

Chapter 9, page 194 states that for the membership operation with we should (can) use Dijkstra's algorithm for a DFA. Surely with a DFA the membership test is much simpler, as a string has a unique path. So we need only follow the M edges labelled with the input symbol sequence, then check if this state is a final state F. Thus no need for Dijsktra, and the complexity is linear in the string size (and the size of the DFA is irrelevant). The need for Dijkstra would come if we use a NFA that hasn't been determinised, or a FST. My comment also pertains to p197, below [9.5] which repeats the bit about shortest-path for WFSAs.

Maybe I'm missing something, please let me know.

Thanks for the comment! For an unweighted DFA, I think you're right, and I'll make a change -- too late for the print edition, unfortunately. As described in 9.1.3, WFSAs are not assumed to be deterministic in the sense of having only one outgoing edge for each (state, symbol) pair. Rather there is a scoring function on transitions (old_state, symbol, new_state). So there can be multiple paths for a given string, and search is required to find the best-scoring one.

Thanks Jacob. Yes, that makes sense. The bit about WSFA was assuming that determinising them would be easy, but this doesn't appear to be the case (see e.g., http://adambuchsbaum.com/papers/det-sicomp.pdf), so it makes sense to design the matching algorithm for the non-deterministic WSFA directly.