Channel: The House Carpenter

Dualities between depth-first search and breadth-first search


Something which I think is fairly well-known among programmers (I first learned it from reading Higher Order Perl) is that depth-first search and breadth-first search can be implemented in such a way that they differ only in their choice of data structure—depth-first search uses a stack (where you add to and remove from the same end), breadth-first search uses a queue (where you add to one end and remove from the other). Here is some illustrative Python code:

from collections import deque
from collections.abc import Iterable, Iterator
from typing import Protocol, TypeVar

T = TypeVar('T')

class Tree(Protocol[T]):
    def __call__(self, vertex: T) -> Iterable[T]:
        """Yield each of the vertex's children."""

def dfs(tree: Tree[T], start: T) -> Iterator[T]:
    stack = deque([start])
    
    while stack:
        vertex = stack.pop()
        yield vertex
        stack.extend(tree(vertex))
        
def bfs(tree: Tree[T], start: T) -> Iterator[T]:
    queue = deque([start])
    
    while queue:
        vertex = queue.pop()
        yield vertex
        queue.extendleft(tree(vertex))

Technical notes:

  • The Tree class is just an interface to say that a tree is a function (or another callable object) that, given something (a vertex), returns some other things of the same type (the children of the vertex).
  • Ordinary Python lists only support adding and removing elements efficiently (i.e. in constant time) at the back, because adding or removing an element at the front requires all the rest of the elements in the list to shift up or down by one place. They are therefore suitable for use as stacks, but not as queues. Instead I’ve used “deques”, provided by the deque class in the collections module, which do support efficient adding and removing at both ends and hence make good queues as well as good stacks. Using an ordinary list for dfs would have been perfectly fine, but I’ve used deques for both functions just to make the code more symmetric.

It’s worth noting, however, that there is a slight difference between how these two functions work: dfs yields sibling vertices in reverse order, compared to bfs (or to the order within the tree itself). For example, list(dfs(lambda v: (1, 2) if v == 0 else (), 0)) evaluates to [0, 2, 1], while list(bfs(lambda v: (1, 2) if v == 0 else (), 0)) evaluates to [0, 1, 2]. (The conditional in the lambda is needed to make the tree finite; with lambda _: (1, 2) every vertex would have children, and list would never finish consuming the search.) This isn’t an inherent consequence of the depth-first-ness or breadth-first-ness of the search order; it’s just due to the fact that with a stack, you take things out in the reverse order compared to how you put them in.
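
If only dfs needs fixing, there’s also a quick fix that gives up the stack/queue symmetry: reverse each batch of children before pushing it, so that the stack’s reversal cancels out. A minimal sketch (the dfs_in_order name and the example tree are mine, not from the article):

```python
from collections import deque

def dfs_in_order(tree, start):
    stack = deque([start])

    while stack:
        vertex = stack.pop()
        yield vertex
        # Push children reversed; popping then recovers the tree's order.
        stack.extend(reversed(list(tree(vertex))))

tree = {0: (1, 2), 1: (3, 4), 2: (), 3: (), 4: ()}
order = list(dfs_in_order(tree.__getitem__, 0))
print(order)  # → [0, 1, 3, 4, 2]
```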

One way to rectify this while preserving the stack/queue duality is to keep vertices themselves out of the stack/queue. Instead, we can directly store the iterators over child vertices that we get from the tree:

def dfs(tree: Tree[T], start: T) -> Iterator[T]:
    stack = deque([iter([start])])
    
    while stack:
        try:
            vertex = next(stack[-1])
        except StopIteration:
            del stack[-1]
        else:
            yield vertex
            stack.append(iter(tree(vertex)))

def bfs(tree: Tree[T], start: T) -> Iterator[T]:
    queue = deque([iter([start])])
    
    while queue:
        try:
            vertex = next(queue[-1])
        except StopIteration:
            del queue[-1]
        else:
            yield vertex
            queue.appendleft(iter(tree(vertex)))

For dfs, this approach is likely to improve memory usage as well: only one iterator needs to be stored for each ancestor of the vertex currently being searched, whereas the old dfs function stores every younger sibling of every ancestor. (It’s possible for there to be no younger siblings, in which case the old implementation uses less memory, but in practice there will usually be many.)
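
A rough way to see the difference is to instrument both variants to track the peak length of their deques. This sketch (my own; the complete 4-ary tree, with vertices represented as paths from the root, is just an example) strips the yields and keeps only the bookkeeping:

```python
from collections import deque

def children(vertex):
    # Complete 4-ary tree of depth 5; a vertex is its path from the root.
    return [vertex + (i,) for i in range(4)] if len(vertex) < 5 else []

def peak_with_vertices(tree, start):
    # Stack of vertices, as in the original dfs.
    stack = deque([start])
    peak = 1
    while stack:
        vertex = stack.pop()
        stack.extend(tree(vertex))
        peak = max(peak, len(stack))
    return peak

def peak_with_iterators(tree, start):
    # Stack of child iterators, as in the new dfs.
    stack = deque([iter([start])])
    peak = 1
    while stack:
        try:
            vertex = next(stack[-1])
        except StopIteration:
            del stack[-1]
        else:
            stack.append(iter(tree(vertex)))
            peak = max(peak, len(stack))
    return peak

peak_v = peak_with_vertices(children, ())
peak_i = peak_with_iterators(children, ())
print(peak_v)  # 16: three younger siblings stored per level descended
print(peak_i)  # 7: one iterator per level descended
```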

For bfs, on the other hand, there is actually only a tiny improvement in memory usage: it will still use the same amount of memory; it will just take a bit longer to get to that point. To be more precise, note that we can divide a tree into “levels”, where the first level consists of the root alone, the second level consists of the root’s children, the third level consists of the root’s grandchildren (i.e. all the children of vertices on the second level), etc. The distinctive characteristic of a breadth-first search is that it always completes one level before it moves on to the next. We can call vertices on the same level “cousins”. Also, given a bunch of cousin vertices, which we can think of as a “part” of the level they’re on, we can say that the part of the previous level consisting of the parents of those cousins is immediately “above”, and the part of the next level consisting of the children of those cousins is immediately “below”.

Now, suppose we’re doing a breadth-first search, and a vertex v at level n is currently being searched. The queue will contain the remaining vertices on level n (i.e. v’s younger cousins), and the vertices on level n + 1 that have already been added from searching older cousins of v (i.e. the children of those older cousins). In the original bfs function these vertices are actually there in the queue, so the size S of the queue is the size of the remaining part of level n (= the number of younger cousins v has), plus the size of the part of level n + 1 that’s below the reached part of level n (= the total number of children v’s older cousins have). In iterator-based bfs, on the other hand, the vertices are packaged into iterators according to their common parents, so the size of the queue is the size of the part of level n − 1 that’s above the remaining part of level n, plus the size of the reached part of level n. If the levels are increasing in size, this will generally be smaller than S.

But at some point later on in the search, we’ll get to the children of v on level n + 1. Then the size of the queue will be the size of the part of level n that’s above the remaining part of level n + 1, plus the size of the reached part of level n + 1. But the vertices remaining at this point are just the younger cousins of v’s children, so the part of level n above them consists of v’s younger cousins. Likewise the reached part of level n + 1 at this point consists of the children of v’s older cousins. In other words, the queue is the same size as it would have been for the original bfs function when searching v.

(There are better ways of doing a memory-efficient breadth-first search, which we’ll say more about shortly.)

Anyway, an alternative way of making dfs yield sibling vertices in the right order would be to write it recursively. This amounts to doing exactly the same thing as using a stack of iterators, except that rather than using an explicit stack we use the language’s call stack, and rather than explicitly working with iterators we use a for loop.

def dfs(tree: Tree[T], start: T) -> Iterator[T]:
    yield start
    
    for vertex in tree(start):
        yield from dfs(tree, vertex)

This is definitely the shortest and clearest way to write a depth-first search. Its main drawback is that in a language like Python you might end up running into a recursion limit, although it would have to be a very large and deep tree to get to that point.
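
The recursive version also yields siblings in tree order, like iterator-based dfs. A quick check on a small dictionary-backed tree (the example tree is mine):

```python
def dfs(tree, start):
    yield start

    for vertex in tree(start):
        yield from dfs(tree, vertex)

tree = {1: (2, 3), 2: (4,), 3: (5, 6), 4: (7,), 5: (), 6: (8,), 7: (), 8: ()}
order = list(dfs(tree.__getitem__, 1))
print(order)  # → [1, 2, 4, 7, 3, 5, 6, 8]
```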

And although it’s not really a meaningful drawback, the connection with breadth-first search seems to be lost by writing dfs recursively like this. Obviously one can’t write a breadth-first search recursively in the same way as a straightforward translation of iterator-based bfs, since recursion necessarily uses an implicit stack rather than an implicit queue.

However, quite surprisingly, it turns out that there is still a simple alteration we can make in order to make this recursive dfs function do a breadth-first search instead. This is clearest in Haskell notation, where it amounts to reversing the order of a Kleisli composition:

import Control.Monad
data Tree a = Tree a [Tree a]

children :: Tree a -> [Tree a]
children (Tree _ vs) = vs

dfs :: Tree a -> [Tree a]
dfs t = [t] ++ (children >=> dfs) t

bfs :: Tree a -> [Tree a]
bfs t = [t] ++ (bfs >=> children) t

In Python, that new bfs function would look like this:

def bfs(tree: Tree[T], start: T) -> Iterator[T]:
    yield start
    
    for vertex in bfs(tree, start):
        yield from tree(vertex)

Basically we’ve exchanged the places of the call to tree and the recursive call to the search function.
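
As a quick sanity check that this really does yield vertices in breadth-first order, here is the function run on a small dictionary-backed tree (the tree and driver are mine; note that the output is sliced rather than exhausted, for reasons that will become apparent shortly):

```python
import itertools as it

def bfs(tree, start):
    yield start

    for vertex in bfs(tree, start):
        yield from tree(vertex)

tree = {1: (2, 3), 2: (4,), 3: (5, 6), 4: (7,), 5: (), 6: (8,), 7: (), 8: ()}
# Take only the first 8 vertices; the generator itself never finishes.
order = list(it.islice(bfs(tree.__getitem__, 1), 8))
print(order)  # → [1, 2, 3, 4, 5, 6, 7, 8]
```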

While this is a very short way to write a breadth-first search, I would say it is rather more difficult to understand than recursive dfs. This is mainly because, unlike recursive dfs (which could easily be rewritten to return a sequence, rather than just an iterator), it makes essential use of the fact that it’s a generator function, so that the recursive calls in it don’t simply nest, but are rather interwoven with each other. The best way to understand how it works is to insert some logging and see it in action.

from functools import partial
from operator import getitem
import itertools as it
    
def bfs(tree: Tree[T], start: T) -> Iterator[T]:
    call_counter = it.count()

    def _bfs() -> Iterator[T]:
        call_id = next(call_counter)
        print(f'call ID {call_id} begun')
        print(f'call ID {call_id}: yielding {start}')
        yield start
        
        for vertex in _bfs():
            print(f'call ID {call_id}: yielding children of {vertex}')
        
            for child in tree(vertex):
                print(f'call ID {call_id}: yielding {child}')
                yield child
            
    yield from _bfs()

>>> tree = {1: (2, 3), 2: (4,), 3: (5, 6), 4: (7,), 5: (), 6: (8,), 7: (), 8: ()}
>>> search = bfs(partial(getitem, tree), 1)
>>> list(it.islice(search, 8))
call ID 0 begun
call ID 0: yielding 1
call ID 1 begun
call ID 1: yielding 1
call ID 0: yielding children of 1
call ID 0: yielding 2
call ID 0: yielding 3
call ID 2 begun
call ID 2: yielding 1
call ID 1: yielding children of 1
call ID 1: yielding 2
call ID 0: yielding children of 2
call ID 0: yielding 4
call ID 1: yielding 3
call ID 0: yielding children of 3
call ID 0: yielding 5
call ID 0: yielding 6
call ID 3 begun
call ID 3: yielding 1
call ID 2: yielding children of 1
call ID 2: yielding 2
call ID 1: yielding children of 2
call ID 1: yielding 4
call ID 0: yielding children of 4
call ID 0: yielding 7
call ID 2: yielding 3
call ID 1: yielding children of 3
call ID 1: yielding 5
call ID 0: yielding children of 5
call ID 1: yielding 6
call ID 0: yielding children of 6
call ID 0: yielding 8
[1, 2, 3, 4, 5, 6, 7, 8]
>>> next(search)
call ID 4 begun
call ID 4: yielding 1
call ID 3: yielding children of 1
call ID 3: yielding 2
call ID 2: yielding children of 2
call ID 2: yielding 4
call ID 1: yielding children of 4
call ID 1: yielding 7
call ID 0: yielding children of 7
call ID 3: yielding 3
call ID 2: yielding children of 3
call ID 2: yielding 5
call ID 1: yielding children of 5
call ID 2: yielding 6
call ID 1: yielding children of 6
call ID 1: yielding 8
call ID 0: yielding children of 8
call ID 5 begun
call ID 5: yielding 1
call ID 4: yielding children of 1
call ID 4: yielding 2
call ID 3: yielding children of 2
call ID 3: yielding 4
call ID 2: yielding children of 4
call ID 2: yielding 7
call ID 1: yielding children of 7
call ID 4: yielding 3
call ID 3: yielding children of 3
call ID 3: yielding 5
call ID 2: yielding children of 5
call ID 3: yielding 6
call ID 2: yielding children of 6
call ID 2: yielding 8
call ID 1: yielding children of 8
call ID 6 begun
...

Note that although it does yield the vertices in the correct, breadth-first order, it also gets itself into an infinite loop once it’s done, rather than just terminating! So in that respect, its behaviour is not quite simply that of recursive dfs plus doing the search in breadth-first order.

Let’s forget for a moment about the fact that only the yields from call ID 0 are actually sent back to the original caller, and just look at the sequence of vertices that are yielded, regardless of which recursive sub-call does the yielding. We’ll also break this sequence into segments, with a new segment starting whenever a new call begins.

1
1, 2, 3
1, 2, 4, 3, 5, 6
1, 2, 4, 7, 3, 5, 6, 8

Do you see the pattern?

  • In each segment, the vertices yielded are all those up to a certain level in the tree. At first we only see 1, the unique vertex at the first level; in each subsequent segment the maximum reachable level increases by 1.
  • Within a segment, all vertices within the reachable levels are yielded in depth-first order. (But only the ones from the newest reachable level end up reaching the original caller, since those are the ones yielded by call ID 0).

So what this bfs function is actually doing is really something called iterative deepening depth-first search. It simply does a depth-first search, but only to a limited depth. Initially it only goes to the first level, but whenever it finishes searching, it starts again while increasing the depth limit by one. And it only yields the new vertices on the most recently-reached level. The effect of this is to yield vertices in breadth-first order, so in terms of the result it’s a breadth-first search, but in terms of the algorithm itself it’s more like a variation on depth-first search. It also now makes sense why it gets into an infinite loop eventually: after a certain point any deeper level contains no more vertices, but the algorithm will keep repeating the depth-first search and searching the new empty level.

This might seem like quite a silly way to do things, and it certainly involves a lot of redundant work compared to the alternative, since the higher levels of the tree are searched repeatedly again and again. However, it does have a big advantage in terms of memory usage, because, since it’s really a depth-first search, it has the memory usage of a depth-first search: it only needs to store the ancestors of the vertex currently being searched; in short, its memory usage corresponds to the depth of the tree. With a queue-based breadth-first search, on the other hand, the size of the queue is basically proportional to the size of the level currently being searched, i.e. the tree’s breadth. And trees in practice tend to be broader than they are deep. In fact they are often exponentially broader than they are deep, since all it takes for that to be the case is for the average number of children per vertex (the “branching factor”) to be greater than 1. In such a situation, the redundant work from searching higher levels repeatedly is less of an issue, since the deepest level reached accounts for the majority of the work in searching the tree anyway.
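
We can put a rough number on “less of an issue”: for a complete b-ary tree of depth d, compare the total number of vertex visits IDDFS makes (summed over all depth limits) with the number a single full traversal makes. A back-of-the-envelope sketch (mine; it counts visits analytically rather than running a search):

```python
def iddfs_overhead(b, d):
    # Number of vertices on level i of a complete b-ary tree.
    level = lambda i: b ** i
    # A single full traversal visits every vertex once.
    single_pass = sum(level(i) for i in range(d + 1))
    # IDDFS with depth limit L visits every vertex down to level L.
    iddfs_visits = sum(sum(level(i) for i in range(limit + 1))
                       for limit in range(d + 1))
    return iddfs_visits / single_pass

ratio2 = iddfs_overhead(2, 10)
ratio10 = iddfs_overhead(10, 10)
print(ratio2)   # ≈ 2.0: modest even for branching factor 2
print(ratio10)  # ≈ 1.11: tiny for branching factor 10
```

The overhead factor tends to b/(b − 1) as the depth grows, so the higher the branching factor, the cheaper the repetition is relative to the final pass.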

Iterative deepening depth-first search is known as IDDFS for short, and it’s generally associated with the field of artificial intelligence, since that’s where people are (or were) most likely to deal with problems involving brute-force search of massive trees with high branching factors, such as deciding the next move to play in a game of chess; this involves considering each of the possible moves, and for each move, the possible moves that could be made afterwards, and for each of the resulting 2-move sequences, the third moves that could be made afterwards; so it can naturally be modelled as a big tree. Although, nowadays, with the field being more focussed on machine learning, people might not be directly programming these kinds of tree searches so often any more (though I’m just guessing here—I don’t know that much about AI).

I’m not sure how well-known it is that you can write IDDFS as the simple recursive function above. You can of course write it in a more “direct” manner, and this is how it’s presented in standard references such as Korf (1985) or Russell & Norvig (2010). I learned about the connection between the recursive bfs implementation and IDDFS from a comment on an ActiveState code recipe by David Eppstein (and I learned about the recursive bfs implementation in the first place from a StackOverflow answer by Herrington Darkholme, who, it turns out, got it originally from the same code recipe as mentioned on this blog post of theirs).

For the record, here’s a more direct IDDFS implementation, including a termination check so that it stops once it reaches a level where there are no vertices:

import itertools as it

def depth_labelled(tree: Tree[T], start: T) -> tuple[Tree[tuple[T, int]], tuple[T, int]]:
    def new_tree(vertex_with_depth: tuple[T, int]) -> Iterator[tuple[T, int]]:
        vertex, depth = vertex_with_depth
        
        for child in tree(vertex):
            yield (child, depth + 1)

    return new_tree, (start, 0)

def depth_limited(tree: Tree[T], start: T, limit: int) -> tuple[Tree[tuple[T, int]], tuple[T, int]]:
    tree, start = depth_labelled(tree, start)
    
    def new_tree(vertex_with_depth: tuple[T, int]) -> Iterator[tuple[T, int]]:
        vertex, depth = vertex_with_depth

        if depth < limit:
            yield from tree(vertex_with_depth)

    return new_tree, start

def iddfs(tree: Tree[T], start: T) -> Iterator[T]:
    for limit in it.count():
        limited_tree = depth_limited(tree, start, limit)
        level_size = 0

        for vertex, depth in dfs(*limited_tree):
            if depth == limit:
                yield vertex
                level_size += 1

        if not level_size:
            break
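
For comparison, here is a compact, self-contained sketch that folds the depth labelling and depth limiting into a single recursive helper (the structure is mine; it computes the same thing as iddfs above, though it builds each level as a list rather than yielding lazily):

```python
import itertools as it

def iddfs_compact(tree, start):
    def deepest(vertex, depth, limit):
        # Depth-limited dfs that yields only vertices on the limit level.
        if depth == limit:
            yield vertex
        else:
            for child in tree(vertex):
                yield from deepest(child, depth + 1, limit)

    for limit in it.count():
        level = list(deepest(start, 0, limit))
        if not level:  # no vertices this deep: the whole tree is searched
            return
        yield from level

tree = {1: (2, 3), 2: (4,), 3: (5, 6), 4: (7,), 5: (), 6: (8,), 7: (), 8: ()}
order = list(iddfs_compact(tree.__getitem__, 1))
print(order)  # → [1, 2, 3, 4, 5, 6, 7, 8]
```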

To adapt the termination check to the original recursive bfs function, we have to modify the logic a bit so that we only get one level at a time:

def bfs(tree: Tree[T], start: T) -> Iterator[T]:
    yield start
    iterator = bfs(tree, start)
    level_size = 1
    
    while level_size:
        level = it.islice(iterator, level_size)
        level_size = 0
    
        for vertex in level:
            for child in tree(vertex):
                yield child
                level_size += 1

The function is now behaviourally a more exact counterpart of dfs, although its implementation is no longer so similar.
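
To check the termination behaviour, we can run this version to exhaustion on the example tree from the logging session (a self-contained copy of the function, with my own driver):

```python
import itertools as it

def bfs(tree, start):
    yield start
    iterator = bfs(tree, start)
    level_size = 1

    while level_size:
        # Pull exactly one level's worth of vertices from the inner search.
        level = it.islice(iterator, level_size)
        level_size = 0

        for vertex in level:
            for child in tree(vertex):
                yield child
                level_size += 1

tree = {1: (2, 3), 2: (4,), 3: (5, 6), 4: (7,), 5: (), 6: (8,), 7: (), 8: ()}
# list() returns here, unlike with the plain recursive bfs.
order = list(bfs(tree.__getitem__, 1))
print(order)  # → [1, 2, 3, 4, 5, 6, 7, 8]
```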

I still find it to be a bit of a startling coincidence that just switching the order of those two calls within the recursive implementation of dfs makes it into a breadth-first search (even if it behaves somewhat differently with regard to termination). It seems like something there should be a deeper explanation for, but I don’t know what it would be. And does it have anything to do with the stack/queue duality? The way IDDFS works doesn’t seem to involve a queue, not even implicitly, so maybe not. I don’t have answers to these questions, unfortunately.


Code snippets from this article are available on GitHub.

