Topics: Markov Decision Process


(definition)

In the context of a Markov decision process (MDP), a policy is a mapping from every one of the MDP's states to a specific decision.

In other words, a policy tells us which decision is taken in which state.

Policies are normally denoted by π, while the set of possible policies in a given MDP is denoted by Π.

We can denote the decision taken in a state s according to the policy π with π(s).
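
As a rough illustration (not part of the original definition), a deterministic policy can be represented in code as a plain lookup table from states to decisions. The state and action names below are made up for the example.

```python
from typing import Dict, Hashable

State = Hashable
Action = Hashable
Policy = Dict[State, Action]  # pi: maps every state to one decision

# Hypothetical states and decisions for a tiny three-state MDP.
policy: Policy = {
    "s0": "go_right",
    "s1": "go_right",
    "s2": "stay",
}

def decision(pi: Policy, state: State) -> Action:
    """Return pi(state): the decision the policy takes in the given state."""
    return pi[state]

print(decision(policy, "s1"))  # -> "go_right"
```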