Topics: Markov Decision Process
(definition)
In the context of a Markov decision process (MDP), a policy is a mapping from each of the MDP's states to a specific decision.
In other words, a policy tells us which decision is taken in each state.
Policies are normally denoted by π, while the set of possible policies in a given MDP is denoted by Π.
The decision taken in state s according to the policy π is denoted by π(s).
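As a minimal illustration, a deterministic policy can be represented as a plain mapping from states to decisions; the state names, actions, and helper function below are hypothetical and only sketch the idea.

```python
from typing import Dict

State = str
Action = str

# A policy assigns exactly one decision (action) to every state of the MDP.
# The states and actions here are made up for illustration.
policy: Dict[State, Action] = {
    "s0": "left",
    "s1": "right",
    "s2": "stay",
}

def decide(policy: Dict[State, Action], state: State) -> Action:
    """Return the decision pi(s) prescribed by the policy for the given state."""
    return policy[state]

print(decide(policy, "s1"))  # prints "right"
```

Here the dictionary plays the role of π, and looking up a state corresponds to evaluating π(s).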