Written Problems 8
- (From http://www-nlp.stanford.edu/~grenager/cs121//handouts/hw3.pdf)
Consider the classic game of Pac-Man (http://en.wikipedia.org/wiki/Pacman in case you are too young to remember and/or have been living under a rock for the last 20 years ;)). Our goal in this problem is to model the Pac-Man agent using an MDP. Assume we will only play a single board.
(a) What is the set of possible states s ∈ S? In other words, what does each s specify? How many possible states are there for the full game, with 250 board locations, 250 dots, 4 ghosts, and Pac-Man himself?
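As a sanity check on the counting in (a), here is a minimal sketch of one plausible (over)count, under the assumption that a state records only Pac-Man's location, each ghost's location, and which dots remain (ignoring ghost direction/mode, lives, and collision constraints):

```python
# Hypothetical upper bound on the Pac-Man state space for part (a).
# Assumption: a state = (Pac-Man position, 4 ghost positions, set of
# remaining dots); other game details are ignored for this sketch.

LOCATIONS = 250   # board locations (from the problem statement)
DOTS = 250        # dots, each either present or already eaten
GHOSTS = 4

pacman_positions = LOCATIONS
ghost_positions = LOCATIONS ** GHOSTS   # each ghost placed independently
dot_configurations = 2 ** DOTS          # each subset of dots may remain

upper_bound = pacman_positions * ghost_positions * dot_configurations
print(upper_bound)  # = 250**5 * 2**250; an overcount, since it allows
                    # overlapping positions and unreachable dot patterns
```

This is only a bound, not the exact answer the exercise asks for: a careful solution should state which state components it includes and which configurations are actually reachable.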
(b) What is the set of actions (assume we are controlling Pac-Man)?
(c) Give two different ways to specify the reward function. What are the advantages and disadvantages of each?
(d) Given that Pac-Man’s moves seem to be deterministic (i.e., he always goes where he wants to go), is the transition model (between states) deterministic? Stated differently, under what conditions would the transition model be deterministic?
- AIMA Ex. 17.17