If is a set of airports and means “there is a direct flight from airport to airport ” (for and in ), then the transitive closure of on is the relation such that means “it is possible to fly from to in one or more flights”.
The problem can also be solved by the Floyd–Warshall algorithm in .
Consider a transit airport , if there is a flight from to and another one from to , then is reachable from .
Define the state , which represents whether there is a path from to restricted to intermediate nodes from the set . When computing , there are only two possible scenarios:
is reachable from without using as an intermediate node.
Now, we not only want to ask whether it is possible to fly from to , but we also want to find the shortest path between them.
Consider an intermediate airport , if the sum of distances from to and to is strictly less than the currently known shortest path from to , then the optimal path must route through .
Formula:
1 2 3 4 5 6 7 8 9 10 11 12 13
adj_mat: Float[T, "i=n j=n"] # input, all positive, unreachable paths are weighted by INF transit: Float[T, "i=n j=n"] # INF initialized
transit[:, :] = adj_mat[:, :] for i inrange(transit.shape[-1]): transit[i, i] = 0.0
for k inrange(transit.shape[0]): for i inrange(transit.shape[0]): for j inrange(transit.shape[0]): transit[i, j] = min(transit[i, j], transit[i, k] + transit[k, j])
Given a string and context-free grammar , the CKY algorithm is used to determine whether is grammatically valid and to construct its parse tree.
The context-free grammar (assumed to be in Chomsky Normal Form) contains two kinds of rules and . To generate from , there must exist a split point such that yields the substring and yields the substring . This logic is highly analogous to the intermediate node concept in the Floyd algorithm.
Note that here represents the set of nonterminal symbols capable of generating the substring spanning from index to . Consequently, all parsing results for strictly smaller constituent substrings must be computed beforehand. This dependency directly dictates the distinct loop execution order in the CKY algorithm.
Coding
There are several variants of CKY algorithm tailored to specific tasks, such as recognition, tree counting, CFG parsing, and PCFG parsing. Since all these variants share the same core logic, they can be elegantly implemented by simply swapping out the underlying Chart class.
def_cky_one_sentence(self, sentence: list[str], chart: ChartBase) -> None: """ Main logic of CKY algorithm. """ # Leaf records are initialized by the Chart class. # See ChartBase.__init__
# for each width b from 2 to n: for length inrange(1, len(sentence)): # for each start position i from 1 to n-b+1: for left inrange(0, len(sentence) - length): right = left + length # for each left width k from 1 to b-1: for mid inrange(left, right): self._reduce(chart, left, right, mid)
def_reduce(self, chart: ChartBase, left: int, right: int, mid: int) -> None: """ Try to reduce (left, mid) (mid+1, right) -> (left, right) with all possible rules """ # for each key B in Chart(i,i+k) and key C in Chart(i+k,i+b): for left_nt, right_nt in product( chart.get(left, mid).keys(), chart.get(mid + 1, right).keys() ): # for each production rule A -> B C: for pa_nt inself._inv_nonterminal_production.get((left_nt, right_nt), ()): # Chart(i,i+b).reduce(key=A, left=(i,i+k,B), right=(i+k,i+b,C)) chart.reduce(left, right, mid, left_nt, right_nt, pa_nt)
All of the above algorithms have the same time complexity of .
The moment I first saw the CKY algorithm in the CoLi lecture, the Floyd algorithm popped into my brain! It’s really an interesting callback!