CS 4300 - AI
Nov. 1, 2016
Assignment A5: Monte Carlo Probabilistic Agent
1. Introduction
This assignment implements a Monte Carlo probabilistic agent that reuses the A*
path-finding algorithm from a previous assignment. The program replaces the knowledge base
(KB) that the last assignment used to prove logical statements and instead uses probability to
reason about its surroundings. The program implements three noteworthy functions:
CS4300_WP_estimates, CS4300_board_constraint, and CS4300_MC_agent. The agent will be run
on the 50 provided Wumpus boards using increasing amounts of probabilistic sampling. Scores
and back-trace data will be gathered to address the following questions:
• What are the advantages and disadvantages of using probability with Monte Carlo to
solve the Wumpus world?
• Will the agent see an improvement in score from taking 10, 15, or 20
estimation samples?
• Are there any boards that the agent never solves?
2. Method
1. CS4300_WP_estimates.m
This is the main difference from the last assignment. The function takes as parameters what the
agent knows about the breeze and stench percepts, the number of estimation samples to average,
and an additional parameter that I added indicating whether the wumpus is still alive. The output
is the estimated likelihood of a pit and of a wumpus in each cell. The pit estimates assume a 20%
prior chance of a pit in each cell.
breezes = -ones(4,4);
breezes(4,1) = 1;
stench = -ones(4,4);
stench(4,1) = 0;
wumpus_alive = 1;
[pts,Wumpus] = CS4300_WP_estimates(breezes,stench,1000,wumpus_alive);
pts =
0.2021 0.1967 0.1956 0.1953
0.1972 0.1999 0.2016 0.1980
0.5527 0.1969 0.1989 0.2119
0 0.5552 0.1948 0.1839
Wumpus =
0.0806 0.0800 0.0827 0.0720
0.0780 0.0738 0.0723 0.0717
0 0.0845 0.0685 0.0803
0 0 0.0741 0.0812
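The pit half of such an estimate can be sketched as rejection sampling: draw random boards, keep only those consistent with the known percepts, and average the kept pit layouts. This is a minimal sketch under stated assumptions, not the assignment's CS4300_WP_estimates. It handles pits and breezes only, and every variable name in it is hypothetical.

```matlab
% Sketch only: estimate pit probabilities by rejection sampling.
% All names are illustrative assumptions, not the assignment's code.
p_pit = 0.2;                     % prior pit probability per cell (from the text)
num_samples = 1000;              % number of accepted boards to average
breezes = -ones(4,4);            % -1 = unknown, 0 = no breeze, 1 = breeze
breezes(4,1) = 1;                % breeze sensed in the start cell

kernel = [0 1 0; 1 0 1; 0 1 0];  % mask selecting the four neighbors of a cell
known = breezes >= 0;            % cells whose breeze reading is known
pit_counts = zeros(4,4);
accepted = 0;
while accepted < num_samples
    pits = rand(4,4) < p_pit;    % draw a random pit layout
    pits(4,1) = false;           % the start cell never holds a pit
    % A cell is breezy exactly when at least one neighbor holds a pit.
    breezy = conv2(double(pits), kernel, 'same') > 0;
    if all(breezy(known) == (breezes(known) == 1))
        pit_counts = pit_counts + pits;   % keep only consistent boards
        accepted = accepted + 1;
    end
end
pts = pit_counts / num_samples;  % relative frequency = estimated probability
disp(pts);
```

With a breeze known only in the start cell, each of its two neighbors should come out near 0.2/0.36 ≈ 0.56, which agrees with the 0.5527 and 0.5552 in the sample output above.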
2. CS4300_board_constraint.m
A helper function that I provided that determines whether a given board is consistent with
both the breeze and the stench percepts known so far.
breezes = -ones(4,4);
breezes(4,1) = 0;
stench = -ones(4,4);
stench(4,1) = 1;
wumpus_alive = 1;
b = CS4300_gen_board(.2,wumpus_alive);
b =
0 0 0 0
2 1 0 0
0 0 0 0
0 3 0 0
board_meets_constraint = CS4300_board_constraint(breezes,stench,b);
board_meets_constraint =
1
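A consistency check of this kind can be sketched in a few lines. The version below is an illustration, not the actual CS4300_board_constraint. It assumes cell codes of 0 = empty, 1 = pit, 2 = gold, 3 = wumpus (a guess based on the board printed above) and may skip checks that the real helper performs.

```matlab
% Sketch only: test one board against known breeze and stench percepts.
% Assumed (hypothetical) cell codes: 0 = empty, 1 = pit, 2 = gold, 3 = wumpus.
b = [0 0 0 0; 2 1 0 0; 0 0 0 0; 0 3 0 0];   % board from the example above
breezes = -ones(4,4);  breezes(4,1) = 0;    % -1 = unknown
stench  = -ones(4,4);  stench(4,1)  = 1;

kernel = [0 1 0; 1 0 1; 0 1 0];             % the four neighbors of a cell
breezy = conv2(double(b == 1), kernel, 'same') > 0;   % next to a pit
smelly = conv2(double(b == 3), kernel, 'same') > 0;   % next to the wumpus

kb = breezes >= 0;                          % cells with a known breeze reading
ks = stench  >= 0;                          % cells with a known stench reading
ok = all(breezy(kb) == (breezes(kb) == 1)) && ...
     all(smelly(ks) == (stench(ks)  == 1));
disp(ok);
```

For this board the wumpus sits next to the start cell and no pit does, so the sketch accepts the board, in line with the result shown above.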
3. CS4300_MC_agent.m
The agent function (CS4300_MC_agent) is similar to the agent in assignment A4b. The main
difference is that it uses CS4300_WP_estimates to decide where to go when exploring new
areas of the board. I also added a parameter to this function for the number of estimation
samples that CS4300_WP_estimates should take.
percept = [0,1,0,0,0];
num_trials = 20;
a = CS4300_MC_agent(percept,num_trials);
One more note about the Monte Carlo agent: to avoid pushing its score another 50 points
negative, the agent only hunts the wumpus if it has sensed a stench.
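One way the estimates could drive exploration is to move toward the unvisited cell bordering visited territory that has the lowest combined risk. The sketch below illustrates that idea only. The risk formula (treating pit and wumpus as independent), the made-up probability values, and all names are assumptions rather than the logic inside CS4300_MC_agent.

```matlab
% Sketch only: pick the least risky unvisited cell next to a visited one.
% The probabilities below are made-up illustration values.
pts = 0.2 * ones(4,4);      pts(4,1) = 0;    pts(3,1) = 0.55;  pts(4,2) = 0.10;
Wumpus = 0.07 * ones(4,4);  Wumpus(4,1) = 0; Wumpus(3,1) = 0;  Wumpus(4,2) = 0.05;
visited = false(4,4);       visited(4,1) = true;

kernel = [0 1 0; 1 0 1; 0 1 0];              % the four neighbors of a cell
next_to_visited = conv2(double(visited), kernel, 'same') > 0;
frontier = next_to_visited & ~visited;       % candidate cells to explore

% Assumed risk: chance of a pit or a wumpus, treating the two as independent.
risk = 1 - (1 - pts) .* (1 - Wumpus);
risk(~frontier) = Inf;                       % only frontier cells compete
[~, idx] = min(risk(:));
[r, c] = ind2sub(size(risk), idx);
fprintf('next cell: row %d, col %d (risk %.3f)\n', r, c, risk(r,c));
```

The chosen cell could then be handed to the A* path finder as the goal.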
3. Verification of Program
1) CS4300_WP_estimates.m
a. If the agent knows nothing about the board, the probabilities for pits
should be around 0.2 in each cell and for the wumpus should be around
1/15 (or 0.067). If 1000 random boards are generated to estimate this, the
following is sample output:
breezes = -ones(4,4);
stench = -ones(4,4);
wumpus_alive = 1;
[pts,Wumpus] = CS4300_WP_estimates(breezes,stench,1000,wumpus_alive);
pts =
0.2400 0.1880 0.1860 0.1950
0.1890 0.2110 0.2210 0.1890
0.2010 0.1920 0.2210 0.2190
0 0.1990 0.2100 0.2250
Wumpus =
0.0710 0.0660 0.0830 0.0730
0.0510 0.0690 0.0650 0.0610
0.0740 0.0690 0.0710 0.0580
0 0.0700 0.0690 0.0500
mean(pts(:))
ans =
0.1817
mean(Wumpus(:))
ans =
0.0625
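As a rough cross-check on these means: the start cell is fixed at 0 in both matrices, so an average over all 16 entries should sit slightly below the per-cell values. Assuming 15 free cells with the priors stated above, the expected means are:

```latex
\[
\mathbb{E}\big[\operatorname{mean}(\mathrm{pts})\big] = \frac{15 \times 0.2}{16} = 0.1875,
\qquad
\mathbb{E}\big[\operatorname{mean}(\mathrm{Wumpus})\big] = \frac{15 \times \tfrac{1}{15}}{16} = \frac{1}{16} = 0.0625
\]
```

The reported means of 0.1817 and 0.0625 are in line with these values.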
b. More interesting is the example from the book on page 502, where a breeze is
sensed in both cells (1,2) and (2,1). According to the book, the pit
probabilities should be 0.31 in cells (1,3) and (3,1) and 0.86 in
cell (2,2).
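[Figure: board for the book example. Cell (1,1) is marked ok; breezes (B) are in cells (1,2)
and (2,1); P(p) = 0.31 in cells (1,3) and (3,1), and P(p) = 0.86 in cell (2,2).]
[Output: pts and Wumpus matrices for this example.]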
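The book's values can be reproduced exactly by enumerating the eight pit/no-pit configurations of the three cells that border the breezes and keeping the consistent ones. This is a small sketch assuming the 0.2 pit prior from Section 2 and book-style (column,row) cell labels. The variable names are illustrative.

```matlab
% Sketch only: exact posterior for the three frontier cells by enumeration.
% Cell order in each configuration: [(1,3) (2,2) (3,1)], book coordinates.
p = 0.2;                         % prior pit probability
post = zeros(1,3);               % accumulated weight of "pit here" per cell
total = 0;                       % accumulated weight of all consistent cases
for k = 0:7
    cfg = bitget(k, 1:3);        % one of the 8 pit/no-pit configurations
    % Breeze in (1,2) needs a pit in (1,3) or (2,2);
    % breeze in (2,1) needs a pit in (2,2) or (3,1).
    if (cfg(1) || cfg(2)) && (cfg(2) || cfg(3))
        w = prod(p.^cfg .* (1 - p).^(1 - cfg));   % prior weight of this case
        post = post + w * cfg;
        total = total + w;
    end
end
post = post / total;             % normalize over the consistent cases
fprintf('%.2f %.2f %.2f\n', post);   % prints 0.31 0.86 0.31
```

A Monte Carlo estimate for this board should approach these numbers as the sample count grows.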
2) CS4300_MC_agent.m
I stepped through roughly the first 10 boards from the provided A5_boards.mat. While the
program didn’t do anything unexpected, it was challenging at times to determine what the
agent should be doing, since its choices are based on varying probabilities. Continuing with
the example board from 1) b., the agent should never choose to enter cell (2,2), since its pit
probability is almost always going to be higher than that of the other two options, (3,1) and
(1,3). I ran this board, and in the runs that finished in a reasonable amount of time the agent
never died in cell (2,2). It turns out that this board was hard to generate randomly. In the
runs where I gave up waiting and paused execution, the board was mostly explored (the agent
had already moved past that point).
4. Data and Analysis
Each of boards 1 through 50 was run 5 times. The raw data has 1000 scores, 250 for
each sample size.
There was a significant improvement in score when increasing the number of random boards
sampled for the probability estimates. The biggest jump came from going from no samples
whatsoever to just 10.
I also ran the first 50 boards with 50 Monte Carlo samples, without outputting the same
statistics. The scores I viewed averaged higher than the scores above, but the improvement was
not as great. The higher the number of samples, the closer the estimates are to the expected
value.
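The diminishing return from extra samples fits the usual standard-error formula for a sampled proportion, sqrt(p(1-p)/N). The sketch below evaluates it for a cell with the 0.2 pit prior, assuming the accepted boards are independent draws. The variable names are illustrative.

```matlab
% Sketch only: standard error of a Monte Carlo probability estimate.
p = 0.2;                          % true probability being estimated
N = [10 15 20 50 1000];           % sample counts mentioned in this report
se = sqrt(p * (1 - p) ./ N);      % standard error of the sample proportion
fprintf('N = %4d   std. error = %.3f\n', [N; se]);
```

Under these assumptions the error is about 0.126 at 10 samples, 0.089 at 20, and 0.057 at 50, so each doubling of the sample count cuts the error by only about 30%.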
5. Interpretation
An advantage of the Monte Carlo agent is that it can be much faster than other techniques for
solving the wumpus world problem. Run time can also be its weakness: there is no telling
exactly how long a solution will take. The agent cannot be deterministic, since it relies on
chance, but it is a bet that most are probably willing to take.
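The unpredictable run time can be made concrete under one assumption: that the estimates come from rejection sampling, where random boards are discarded unless they match the known percepts. If a fraction q of random boards is consistent, the number of boards drawn per accepted sample is geometric, so collecting N accepted samples takes on average:

```latex
\[
\mathbb{E}[\text{boards drawn}] = \frac{N}{q}
\]
```

For a hypothetical q = 0.01, 20 accepted samples would take about 2000 random boards, and q can only shrink as more percepts become known.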
To reiterate, the agent saw a drastic improvement in score by taking more random samples
for its probability estimates. This allows me to consider this a successful strategy.
In the end, there were several boards the agent never solved (board 48 was one). Technically,
they were all solvable. However, the joint probability of success was 1/8 or 1/16 in some
simple cases. I am not very surprised by this, but it is interesting.
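A short calculation shows why unsolved boards are unsurprising at these odds. Assuming each run succeeds independently with the same probability, the chance of failing all 5 runs at a given sample size is:

```latex
\[
\left(1 - \tfrac{1}{8}\right)^{5} = \left(\tfrac{7}{8}\right)^{5} \approx 0.51,
\qquad
\left(1 - \tfrac{1}{16}\right)^{5} = \left(\tfrac{15}{16}\right)^{5} \approx 0.72
\]
```

So a board with a 1/8 success chance goes unsolved in about half of all 5-run batches.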
6. Critique
If I had more time (and a significant amount of computing power), I would be interested in
exploring higher numbers of samples on the same boards, and also the differences in scores and
timing between the Monte Carlo agent and the agent that uses the knowledge base with
Resolution Theorem Proving.
7. Log