A.I.
I've been working through Stuart Russell and Peter Norvig's Artificial Intelligence: A Modern Approach
(thanks to the Square engineering library) and one of the most helpful chapters involved methodically
demonstrating basic graph traversal algorithms for problem solving. If that sounds heady, it's not; I think
you'll enjoy it.
First, let's talk about graphs. A graph is any set of points (nodes) and the lines (edges) between those
points. A simple kind of graph is a tree structure: there's a single root which has one or more
branches, each of which then has its own branches, and so on. Many things in computer science can be expressed using
some kind of tree-structured graph.
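As a tiny illustration (this `Node` structure is purely hypothetical, not from the book):

```ruby
# A node holds a value plus the nodes reachable from it along edges.
Node = Struct.new(:value, :branches)

# A single root with two branches, one of which branches again.
tree = Node.new(:root, [
  Node.new(:left,  [Node.new(:leaf, [])]),
  Node.new(:right, [])
])
```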
Here's how this relates to AI: the root node (the one at the top of the tree) is the state the world is in right
now. Each other node represents a different state of the world that's reachable across an edge from the node
immediately above it. The line between them is some kind of action: moving a car's wheels, turning on a
servo, whatever. The A.I. just needs to know that if you start at node A and do action X you'll get to node B.
Your software will appear to be intelligent if it can start at the root node and find its way to a better state of
the world through a series of actions.
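A recursive depth-first search can be sketched in a few lines (a minimal version; `#solution?` and `#branches` are hypothetical helpers a node would provide):

```ruby
# Walk as deep as possible down each branch, backtracking on failure.
# Returns the first solution node found, or nil.
def depth_first_search(node)
  return node if node.solution?
  node.branches.each do |branch|
    found = depth_first_search(branch)
    return found if found
  end
  nil
end
```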
Here we walk down each branch of the tree all the way to the end using recursion. This is known as
recursive depth-first search and is a great tool when you think that any path might have a good node
really far down, so you want to get really deep really fast. It's also the least code possible to find a
solution to our problem. Unfortunately, the simple implementation of depth-first search involves recursion, which
means we're limited to traversing graphs whose total depth is less than our runtime's stack frame limit.
If you've ever seen a stack overflow error, it's because your program nested so many calls that it
exhausted the stack and the runtime gave up. To demonstrate this, try running this
simple program in irb:
def go(n)
  puts n
  go n + 1
end

go 0
On my machine the last thing printed was 8247 and then I saw:

SystemStackError: stack level too deep

which means Ruby let me use 8,247 stack frames before giving up. If I were to try going 9,000 nodes deep
in a graph I'd get this error.
Now, if you wanted to do a depth-first search without recursion you could, but it's no longer the simplest code,
so I'll skip it here. Suffice it to say that you'd have to manually keep track of which nodes to
visit in which order rather than letting your programming language do it for you.
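Breadth-first search takes that idea, an explicit to-visit list, and explores level by level. A minimal sketch (again, `#solution?` and `#branches` are hypothetical helpers):

```ruby
# Explore level by level: pull nodes from the front of the queue,
# push newly-discovered nodes onto the back.
def breadth_first_search(start)
  queue = [start]
  until queue.empty?
    node = queue.shift
    return node if node.solution?
    queue.concat node.branches
  end
  nil
end
```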
The name breadth-first comes from the fact that it'll look at all the nodes at each level, from side to side,
before proceeding down to the next level. Notice that the array variable is named queue. This is because in a
breadth-first implementation you're always going to have a list that you put newly-discovered nodes onto the
end of and pull nodes to explore off of the front.
Breadth-first search is easy to reason about, you won't run out of stack space like when you used recursion
(although it's possible you'll run out of memory), and if your environment supports concurrency primitives
you might be able to run it in parallel quite easily. C# has a ConcurrentQueue that can help with this and
Clojure has multiple ways to iterate through a list in parallel. Ruby requires you to do more
synchronization work when threads append their newly-discovered nodes to the end of the queue, but
otherwise you get the parallelism cheaply.
In an 8-puzzle you've got a bunch of tiles in the wrong places and just one empty space that you can move
around until the tiles are in the right order. How do you know which move to make first? How do you know
when you're on the right track? How do you know if you're going in loops?
If we model each possible action as an edge in a graph and each potential puzzle state as a node, then we just
start at the beginning and begin exploring the graph. We'll stop once we've found the solution (or, if we
built our code poorly, we'll stop when we run out of memory or time).
Now, there are two ways we can set up our data for this problem. One is to generate all possible states
(nodes) that the puzzle can have and to then connect the adjacent states. We would then have a complete
graph we could traverse and we'd even be able to mark the solution ahead of time and know where it was
located in the graph. Unfortunately, the number of states is 9 factorial, or about 360,000. Generating that
many puzzle slide orientations and then iterating through each one would take, at worst, (9!)(9! - 1)/2
node-comparison operations (the formula for how many edges can exist between n nodes in a graph is n(n - 1)/2).
So let's not do that. Rather, let's start at the root node (the starting state) and then create branches from each
node as we go. We'll stop when we discover our solution, hopefully long before we examine 360,000
states.
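As a minimal sketch of the Puzzle class the following code builds on (the `Solution` ordering, with the blank tile first, is an assumption inferred from the worked examples later on):

```ruby
class Puzzle
  # The solved arrangement: the blank (0) in the top-left corner.
  # This ordering is an assumption, inferred from the examples below.
  Solution = [0, 1, 2,
              3, 4, 5,
              6, 7, 8]

  attr_reader :cells

  def initialize cells
    @cells = cells
  end

  def solution?
    @cells == Solution
  end
end
```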
What did we just do there? That is a Puzzle class where each instance knows whether it's a solution. The
cells/tiles of the puzzle are kept in a list.
Now let's construct a way to represent a state (a node on the solution graph). A state isn't just a
representation of puzzle tile positions but also the history of how that puzzle arrangement was reached from
the starting point. This is key: if we don't keep track of how we got to a solution node on the graph then
we'll never be able to report how to solve the puzzle. So we need to keep a list, as we go, of which actions
we've taken to arrive at the current node.
class Puzzle
  def zero_position
    @cells.index(0)
  end

  def swap swap_index
    new_cells = @cells.clone
    new_cells[zero_position] = new_cells[swap_index]
    new_cells[swap_index] = 0
    Puzzle.new new_cells
  end
end

Here we're extending the Puzzle class. `zero_position` tells us which cell has the '0'. `swap` tells the
puzzle: "give me you, but with the '0' cell replaced by the cell at some other location of my choice." This
is how we'll simulate moving a tile.

class State
  Directions = [:left, :right, :up, :down]

  attr_reader :puzzle, :path

  def initialize puzzle, path = []
    @puzzle, @path = puzzle, path
  end

  def solution?
    puzzle.solution?
  end

  def branches
    Directions.map do |dir|
      branch_toward dir
    end.compact.shuffle
  end

  private

  def branch_toward direction
    blank_position = puzzle.zero_position
    blankx = blank_position % 3
    blanky = (blank_position / 3).to_i
    cell = case direction
           when :left
             blank_position - 1 unless 0 == blankx
           when :right
             blank_position + 1 unless 2 == blankx
           when :up
             blank_position - 3 unless 0 == blanky
           when :down
             blank_position + 3 unless 2 == blanky
           end
    State.new puzzle.swap(cell), @path + [direction] if cell
  end
end
This State class knows about one particular arrangement of the puzzle and is able to determine next steps.
When we call State#branches we get a list of adjacent puzzle arrangements (anything reachable by moving
the empty space over by one square) and each of these new states includes the full list of steps necessary to
reach it.
That's the setup. Now that we have some problem-specific helpers we can use our breadth-first algorithm
from up above to start tackling this.
def search state
  state.branches.reject do |branch|
    # Important: don't revisit puzzles you've already seen!
    @visited.include? branch.puzzle.cells
  end.each do |branch|
    # The list of places we need to search is known as the 'frontier'.
    @frontier << branch
  end
end

require 'set'

def solve puzzle
  @visited = Set.new
  @frontier = []
  state = State.new puzzle
  loop {
    @visited << state.puzzle.cells
    break if state.solution?
    search state
    state = @frontier.shift
  }
  state
end
If we feed in a solvable puzzle we can see that this code works. Let's try one where the empty tile was
moved right and then down. The solution should be to move it up and then left:

p solve(Puzzle.new [1, 4, 2,
                    3, 0, 5,
                    6, 7, 8]).path
# => [:up, :left]

`solve` is going to return a State instance; we care about its #path.
So it works, but it's just kinda wandering around until it finds a solution. We gave it a problem that was
only 2 steps from a solution, so if we gave it something harder would it ever finish? And how long would the
solution path be?
Here's our code running with a puzzle whose optimal solution is 20 steps away:

p solve(Puzzle.new [7, 6, 2,
                    5, 3, 1,
                    0, 4, 8]).path
# => [:up, :up, :right, :down, :right, :down, :left, :left,
#     :up, :up, :right, :down, :down, :right, :up, :left,
#     :down, :left, :up, :up]
# Time: 27 seconds

I generated this puzzle by creating a solution state and running `state = state.branches.sample` in a big loop.
It works! Eventually. But 27 seconds is a bit slow. What if we were tackling the 15-puzzle instead? Rather
than the 9! (360K) options we would be searching through 16! (20 trillion) options. That would take almost
literally forever.
Uniform-cost search
As we walk the graph we're keeping a frontier: a list of states we're hoping to explore in the future.
Since we always add to the back of the list and take (shift) from the front, it's technically a FIFO queue
rather than just a list.
What if, rather than picking the next element from the queue to explore, we tried to pick the best one? Then
we wouldn't have to explore quite so many trillions of nodes in our state graph.
Uniform-cost search entails keeping track of how far any given node is from the root node and using that
as its cost. For our puzzle example that means that if it takes n steps to reach state s, then s has
a cost of n. In code: s.cost = steps_to_reach_from_start(s). A variant of this is called Dijkstra's
Algorithm.
There's one missing piece here though: we don't want to examine every item in the entire frontier queue
every time we want to pick the next lowest-cost element. What we need is a priority queue that
automatically sorts its members by some value, so looking up an element by cost is cheap and doesn't slow
down the rest of what we're trying to do.
class PriorityQueue
  def initialize &comparator
    @comparator = comparator
    @elements = []
  end

  def << element
    @elements << element
    sort!
  end

  def pop
    @elements.shift
  end

  private

  def sort!
    @elements = @elements.sort_by &@comparator
  end
end

class State
  def cost
    path.size
  end
end

require 'set'

def solve puzzle
  @visited = Set.new
  @frontier = PriorityQueue.new {|s| s.cost }
  state = State.new puzzle
  loop {
    @visited << state.puzzle.cells
    break if state.solution?
    search state
    state = @frontier.pop
  }
  state
end

This is a terrible implementation of a priority queue: the `#sort!` method iterates through every item every
time. What you want is a priority queue backed by a heap data structure. In Ruby you should use the
`PriorityQueue` gem, and on the JVM there's a good Java implementation.
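For comparison, here's what a heap-backed priority queue might look like (a sketch, not the gem's actual implementation; insert and pop become O(log n) instead of re-sorting everything):

```ruby
# A minimal binary min-heap priority queue.
# The block maps an element to its numeric priority (lowest pops first).
class HeapPriorityQueue
  def initialize &priority
    @priority = priority
    @heap = []
  end

  def << element
    @heap << element
    sift_up @heap.size - 1
  end

  def pop
    return nil if @heap.empty?
    top = @heap[0]
    last = @heap.pop
    unless @heap.empty?
      @heap[0] = last
      sift_down 0
    end
    top
  end

  private

  # Bubble a newly-added element up until its parent is no larger.
  def sift_up i
    while i > 0
      parent = (i - 1) / 2
      break if @priority.call(@heap[parent]) <= @priority.call(@heap[i])
      @heap[i], @heap[parent] = @heap[parent], @heap[i]
      i = parent
    end
  end

  # Push the root element down until both children are no smaller.
  def sift_down i
    loop do
      smallest = i
      [2 * i + 1, 2 * i + 2].each do |child|
        next if child >= @heap.size
        smallest = child if @priority.call(@heap[child]) < @priority.call(@heap[smallest])
      end
      break if smallest == i
      @heap[i], @heap[smallest] = @heap[smallest], @heap[i]
      i = smallest
    end
  end
end
```

It's a drop-in replacement for the sorting version above: `solve` would construct it with the same `{|s| s.cost }` block.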
Sidebar: you may be wondering why this gains us any advantage. Sure, we're now picking the best
node from the queue rather than whichever one was added first, but we still have to explore all of them,
right? Actually, no. Because we're sorting by the cost of the nodes we can be guaranteed that whenever we
find a solution it's the best one. There may be other paths to solutions in our graph but they are all
guaranteed to be of higher cost. So uniform-cost search lets us leave a vast section of the queue
unexplored.
What does that do to our performance? Well, if we re-run our above 20-step puzzle the time drops
considerably, from 27 seconds to 10 seconds (on my machine).
This is a big speedup and, for larger problems, can shave days off the calculation time. But there's much
more we can do.
A* Search
The uniform-cost search picks the best next state from the frontier. Let's enhance the code's understanding
of what makes something "best" by calculating not only the distance from the start to where we are but also
the distance from where we are to the goal.
Old cost function: steps_to_get_to(s)
New cost function: steps_to_get_to(s) + steps_to_goal_from(s)
But, uh, how do we know how far we are from the solution? If we knew how far away the solution was we'd
probably already have found it, right? Right. So rather than being exact, let's just pick a healthy estimate of
how far we are from a solution. One approximation would be: how many tiles are out of place? That would
at least differentiate almost-solution nodes from not-even-close ones. But we'd like to be a bit more precise.
So let's say that the distance cost between a given node and the solution node is the number of tile
movements that would be required if tiles could move through each other and go straight to their goal
positions. So a near-solution node might have a distance cost of 3 and a not-even-close node might have a
distance cost of 26. That should give us decent precision while also being fair. It's important that our
cost-to-get-to-goal function doesn't accidentally deprioritize good near-solution states.
To help us we'll calculate the Manhattan Distance between each tile and where it's supposed to be.
Manhattan Distance is the distance between two places if you have to travel along city blocks. Essentially,
you're adding up the short sides of a right triangle rather than shortcutting across the hypotenuse. The
formula is pretty simple:
class Puzzle
  def distance_to_goal
    # Zip the current puzzle with the solution and total up the distances
    # between each cell. The % and / stuff is just turning an integer
    # index into puzzle x,y coordinates.
    @cells.zip(Solution).inject(0) do |sum, (a, b)|
      sum += manhattan_distance a % 3, (a / 3).to_i,
                                b % 3, (b / 3).to_i
    end
  end

  private

  def manhattan_distance x1, y1, x2, y2
    (x1 - x2).abs + (y1 - y2).abs
  end
end

class State
  def cost
    steps_from_start + steps_to_goal
  end

  def steps_from_start
    path.size
  end

  def steps_to_goal
    puzzle.distance_to_goal
  end
end

require 'set'

def solve puzzle
  @visited = Set.new
  @frontier = PriorityQueue.new {|s| s.cost }
  state = State.new puzzle
  loop {
    @visited << state.puzzle.cells
    break if state.solution?
    search state
    state = @frontier.pop
  }
  state
end
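As a quick sanity check of the heuristic, here's the same calculation in self-contained form (this mirrors the Puzzle#distance_to_goal logic with a `Solution` of [0, 1, ..., 8], an assumption inferred from the worked examples), run against the earlier two-step puzzle:

```ruby
# Self-contained mirror of the distance_to_goal calculation.
Solution = [0, 1, 2, 3, 4, 5, 6, 7, 8]

def manhattan_distance x1, y1, x2, y2
  (x1 - x2).abs + (y1 - y2).abs
end

def distance_to_goal cells
  cells.zip(Solution).inject(0) do |sum, (a, b)|
    sum + manhattan_distance(a % 3, a / 3, b % 3, b / 3)
  end
end

p distance_to_goal [1, 4, 2,
                    3, 0, 5,
                    6, 7, 8]
# => 4 (tiles 1 and 4 are each one step away,
#       and the blank itself contributes 2)
```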
If you're following along at home (and using a real priority queue) you might think the code is broken
because it exits so fast. With a proper priority queue implementation this latest search took 0.07 seconds.
This A* search is able to quickly pick the best candidate to explore in any situation where the distance from
the current state to the goal state is knowable. In real-world pathfinding, e.g., you can use the geospatial
distance between two points. It doesn't work at all, however, in situations where you know the goal when
you see it but can't determine how close you are. A robot trying to find a door in unexplored territory would
not be able to use this; it would have to just keep bumbling around.
The full reference code for this is on GitHub, including a full implementation in Clojure.
Huge thanks to my reviewer Ashish Dixit, without whom this post would have been a typo-filled mess of
half-conveyed ideas.
A quick recap of the relative time and memory costs for these search algorithms:

uninformed depth-first:               {stack overflow error}
breadth-first w/o tracking `visited`: {out of memory error}
uninformed breadth-first:             27 seconds, 47,892 explored states
uniform-cost (Dijkstra's):            10 seconds, 51,963 explored states
A* search:                            0.07 seconds, 736 explored states