Reading: Russell and Norvig, chapter 3.
Search algorithms walk through a set of alternatives to find the best, or an acceptable, alternative. Except for toy problems, search is not used by itself to solve AI problems. Rather, it is a standard component in many algorithms. Also bear in mind that search is a last resort: if there is a smarter algorithm for your particular problem, use it.
The specs for a search problem consist of:
Issues to consider, when examining a search problem, include:
Route planning. Given a road map, with distances for each road segment, find the shortest route between two specified locations. It is much easier to solve such problems if the program is also given 2D coordinates for the ends of the road segments (which are often cities or other interesting locations). A state is an ordered list of places (e.g. cities).
Missionaries and cannibals. Three missionaries and three cannibals must cross a river. Their boat can carry 1-2 people. However, the missionaries must never be outnumbered by the cannibals, on either side of the river. A state is a pair of numbers (the number of missionaries and cannibals on the left bank), plus which bank the boat is on.
This is a very standard example, because it is tricky for people to solve but extremely easy to solve with a computer program.
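To make the state space concrete, here is a sketch of a successor function for this problem in Python. The representation (a triple of missionary count, cannibal count, and boat flag) and the function names are my own illustration, not part of the original notes.

```python
def safe(m, c):
    """True if no bank has missionaries outnumbered by cannibals.
    (m, c) are the counts on the left bank; the right bank holds the rest."""
    return (m == 0 or m >= c) and (3 - m == 0 or 3 - m >= 3 - c)

def successors(state):
    """All legal states reachable by one boat crossing."""
    m, c, boat = state                # boat: 1 = on left bank, 0 = on right
    moves = [(1, 0), (2, 0), (0, 1), (0, 2), (1, 1)]   # boat carries 1-2
    result = []
    for dm, dc in moves:
        if boat:                      # crossing left -> right removes people
            nm, nc = m - dm, c - dc
        else:                         # crossing right -> left adds people
            nm, nc = m + dm, c + dc
        if 0 <= nm <= 3 and 0 <= nc <= 3 and safe(nm, nc):
            result.append((nm, nc, 1 - boat))
    return result
```

From the start state (3, 3, 1), only three crossings are legal: send one cannibal, two cannibals, or one of each.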
Cryptarithmetic. Some piece of arithmetic written with letters rather than digits. Must find an assignment of digits to letters that makes the arithmetic work out. No digit can be assigned to two distinct letters. Leading digits cannot be zero. Example:
 CATS     6154
+ EAT    + 915
-----    -----
 MICE     7069
This is an example of a "constraint satisfaction problem." It is frequently possible to determine that a partial assignment of values to letters cannot be extended to a full assignment, because problems with the arithmetic are appearing.
Knot equivalences. Given 2D pictures of two knots, it is possible to prove that they are the same 3D knot using 4 types of operations on the 2D pictures. See figure.
Several things are interesting about this problem. First, the set of possible 2D deformations is infinite. Second, it is not possible to tell in advance how many steps will be required to get from one knot to another. Finally, this procedure cannot tell when two pictures represent distinct 3D knots: not having found a sequence of transformations after k steps isn't proof that no sequence exists.
The main data structure for all search methods is a linked list of states. For example, for route planning:
((merced fresno)            ;; one partial path
 (merced yosemite)          ;; another partial path
 (merced fresno yosemite))  ;; a third partial path
Depending on the search method, this linked list implements either a queue or a stack. Although queues and stacks can be implemented with arrays (see CS 70 for details), linked lists are traditional in AI because the length of the list is hard to predict in advance.
There are two general classes of search algorithms. Some algorithms use only the topology of the state/operation graph. These include breadth-first search, depth-first search, depth-limited search, iterative deepening, and bidirectional search. Other algorithms speed up the search using information about the cost of operations and/or estimates of the cost of getting between two states. Such algorithms include uniform-cost search, best-first search, and A* search.
Depth-first search (DFS). DFS extends one path until it reaches the goal or gets stuck. When stuck, it backs up one link in the path and tries other possible extensions of the path. This can be implemented like BFS, except that a (LIFO) stack is used rather than a queue. Alternatively, it can be implemented using recursive function calls.
DFS tends to be fast and consume little memory. However, it returns whatever path it finds first, not necessarily a short one. Moreover, it is not guaranteed to find a path, even if one exists, when the set of states is infinite (e.g. the knot equivalence example). Precautions must be taken (see below) to make sure it does not loop. Even if it does not loop, it can still end up following an infinite path of states that never leads to a goal state.
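A recursive DFS can be sketched in a few lines of Python. The graph below reuses the route-planning place names from earlier; the dictionary representation and function name are my own illustration.

```python
def dfs(graph, path, goal):
    """Extend `path` depth-first; return a full path to `goal` or None."""
    state = path[-1]
    if state == goal:
        return path
    for nxt in graph.get(state, []):
        if nxt not in path:           # don't revisit a state on this path
            result = dfs(graph, path + [nxt], goal)
            if result is not None:
                return result
    return None                       # stuck: back up one link

roads = {"merced": ["fresno", "yosemite"], "fresno": ["yosemite"]}
```

Note that dfs(roads, ["merced"], "yosemite") returns the longer route through fresno, because that branch happens to be explored first.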
Breadth-first search (BFS). BFS generates all the paths of length 1, then all the paths of length 2, and so forth. This is usually implemented by storing the partial paths on a FIFO queue. The first path on the queue is then removed, its extensions computed, and these extended paths added to the end of the queue. See note below on how to do this with linked lists.
BFS is guaranteed to find a path (if one exists), and to return a shortest path if there are multiple paths to the goal. It tends to be slow and it can use a lot of memory because it must store an entire level in its queue at once. Loop detection is not required, but it will speed up the search.
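The queue-of-partial-paths version of BFS described above can be sketched as follows in Python (using the same illustrative road graph as before; a deque plays the role of the linked-list queue).

```python
from collections import deque

def bfs(graph, start, goal):
    """Return a shortest path from start to goal, or None."""
    queue = deque([[start]])          # FIFO queue of partial paths
    while queue:
        path = queue.popleft()        # remove the first path on the queue
        state = path[-1]
        if state == goal:
            return path
        for nxt in graph.get(state, []):
            if nxt not in path:
                queue.append(path + [nxt])   # extensions go on the end
    return None

roads = {"merced": ["fresno", "yosemite"], "fresno": ["yosemite"]}
```

Unlike the DFS sketch, bfs(roads, "merced", "yosemite") returns the direct two-city route, since all length-1 paths are tried before any length-2 path.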
Depth-limited search. This is like DFS except that it stops when the current candidate path reaches length k. This technique is only appropriate when the particular search problem suggests a good choice for the bound k. However, unlike DFS, it is guaranteed to terminate, and it will find a path whenever one of length at most k exists.
Iterative deepening. Do depth-limited search repeatedly, increasing the bound on each iteration until a solution is found. This method uses only a small amount of memory (like DFS), but it is guaranteed to return the shortest path and it cannot go off into space. Its running time is similar to that of BFS because, for most search problems, the candidate paths of length k are a very high percentage of the total set of candidate paths of length <= k.
Specifically, for a completely filled out binary tree of height k, there are 2**k nodes in the bottom level. In the whole tree, there are 1 + 2 + ... + 2**k = 2**(k+1) - 1 nodes. So approximately half the nodes in the tree are in the bottom level. In AI problems, nodes typically have more than two children, so more than half (often a lot more than half) of the nodes in the tree will be in the bottom level.
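Depth-limited search and iterative deepening together can be sketched as follows; the graph format matches the earlier route-planning illustrations, and the cap max_limit is an arbitrary safety bound of my own, not part of the algorithm.

```python
def depth_limited(graph, path, goal, limit):
    """DFS that refuses to extend a path longer than `limit` states."""
    state = path[-1]
    if state == goal:
        return path
    if len(path) > limit:             # bound reached: give up on this path
        return None
    for nxt in graph.get(state, []):
        if nxt not in path:
            result = depth_limited(graph, path + [nxt], goal, limit)
            if result is not None:
                return result
    return None

def iterative_deepening(graph, start, goal, max_limit=50):
    """Repeat depth-limited search, raising the bound each iteration."""
    for limit in range(1, max_limit + 1):
        result = depth_limited(graph, [start], goal, limit)
        if result is not None:
            return result
    return None

roads = {"merced": ["fresno", "yosemite"], "fresno": ["yosemite"]}
```

Because shallow bounds are tried first, iterative_deepening returns the shortest path, like BFS, while each pass uses only DFS-sized memory.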
Bidirectional search. Bidirectional search works forwards from the start state and backwards from the goal state(s) in parallel, usually using BFS for both searches. A hash table is used to check whether the two searches have reached a common intermediate state. This can be very effective if there are only a few goal states and if operations are reversible (i.e. given a state it is possible to quickly find all states from which it could have been reached).
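A minimal Python sketch of the idea, assuming a single goal state and explicit successor and predecessor tables (the two dictionaries fwd and bwd play the role of the hash table; all names are my own illustration):

```python
from collections import deque

def bidirectional(successors, predecessors, start, goal):
    """BFS from both ends; stop when the frontiers share a state."""
    if start == goal:
        return [start]
    fwd = {start: [start]}            # state -> path from start to state
    bwd = {goal: [goal]}              # state -> path from state to goal
    fq, bq = deque([start]), deque([goal])
    while fq and bq:
        s = fq.popleft()              # expand one state forwards
        for nxt in successors.get(s, []):
            if nxt not in fwd:
                fwd[nxt] = fwd[s] + [nxt]
                if nxt in bwd:        # the two searches have met
                    return fwd[nxt] + bwd[nxt][1:]
                fq.append(nxt)
        s = bq.popleft()              # expand one state backwards
        for prev in predecessors.get(s, []):
            if prev not in bwd:
                bwd[prev] = [prev] + bwd[s]
                if prev in fwd:
                    return fwd[prev] + bwd[prev][1:]
                bq.append(prev)
    return None
```

The predecessors table is exactly the "reversible operations" requirement: without it, the backward search cannot be run.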
Search algorithms frequently incorporate some mechanism to prevent looping. This is essential for depth-first search, which can go into infinite loops. It is useful in other search methods to prevent wasted effort. Techniques for preventing looping include:
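One common technique is to keep a hash table of states that have already been reached, and never put a state on the queue twice. A Python sketch (using the same style of graph as the earlier examples; names are my own):

```python
from collections import deque

def bfs_no_revisit(graph, start, goal):
    """BFS with a visited set: each state is expanded at most once."""
    visited = {start}
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in visited:    # already-reached states are skipped
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

# A graph with a cycle between a and b: the visited set prevents the
# search from shuttling back and forth between them.
cyclic = {"a": ["b"], "b": ["a", "c"]}
```

Checking membership in the current path (as the earlier DFS sketch does) also prevents loops, but a global visited set additionally avoids re-exploring a state reached by two different routes.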
When implementing a queue using linked lists, you need to add values to the far end of the list. Do not do this using the append operation: this does lots of extra list copying. Rather, use the "destructive" operation set-cdr!.
For example, suppose the list foo is set to '(a b c d) and you want to add 'e to the end of foo. Doing the following copies all of foo into a new list; the old copy is immediately thrown away. This is wasteful and can slow down code substantially when lists get very long.
(set! foo (append foo '(e)))
Instead, find the last cell in the linked list. (I'll let you write the function to do this.) Let's call it bar. In this example, bar looks like '(d), but this one-cell linked list '(d) is the same as the last cell in foo.
Now, remember that the value in a one-cell linked list is called the car in scheme and the "next" pointer is called the cdr. So the following command will splice the one-cell linked list '(e) onto the end of bar.
(set-cdr! bar '(e))
Since bar is part of foo, foo is now changed to contain '(a b c d e).
This page is maintained by Margaret Fleck.