Tuesday, January 25, 2011

Markov Decision Process

BKX (1995), published in Transportation Science, is about FILM based on probabilistic flows. Since the direct formulation is non-linear, the authors applied a Markov decision process (MDP) to re-formulate the model as a mixed-integer program.

An MDP is a discrete-time stochastic process. Wikipedia (http://en.wikipedia.org/wiki/Markov_decision_process) explains it as follows:

At each time step, the process is in some state s, and the decision maker may choose any action a that is available in state s. The process responds at the next time step by randomly moving into a new state s', and giving the decision maker a corresponding reward Ra(s,s').

The probability that the process chooses s' as its new state is influenced by the chosen action. Specifically, it is given by the state transition function Pa(s,s'). Thus, the next state s' depends on the current state s and the decision maker's action a. But given s and a, it is conditionally independent of all previous states and actions; in other words, the state transitions of an MDP possess the Markov property.

Markov decision processes are an extension of Markov chains; the difference is the addition of actions (allowing choice) and rewards (giving motivation). Conversely, if only one action exists for each state and all rewards are zero, a Markov decision process reduces to a Markov chain.
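The definition above can be sketched in a few lines of code. This is a minimal illustration with made-up states, actions, transition probabilities Pa(s,s'), and rewards Ra(s,s') — none of it comes from the BKX (1995) model — just to show that each step depends only on the current state and the chosen action:

```python
import random

# A tiny hypothetical MDP. All states, actions, and numbers are invented
# for illustration. P[s][a] lists (s', Pa(s, s')) pairs; R maps
# (s, a, s') to the reward Ra(s, s').
P = {
    "A": {"go": [("B", 0.8), ("A", 0.2)], "stay": [("A", 1.0)]},
    "B": {"go": [("A", 0.5), ("B", 0.5)], "stay": [("B", 1.0)]},
}
R = {
    ("A", "go", "B"): 1.0, ("A", "go", "A"): 0.0,
    ("A", "stay", "A"): 0.0,
    ("B", "go", "A"): 0.0, ("B", "go", "B"): 0.5,
    ("B", "stay", "B"): 0.2,
}

def step(state, action, rng=random):
    """Sample s' from Pa(s, .) and return (s', reward).

    The draw depends only on the current state and action, not on any
    earlier history -- the Markov property described above."""
    next_states, probs = zip(*P[state][action])
    s_next = rng.choices(next_states, weights=probs)[0]
    return s_next, R[(state, action, s_next)]
```

If each state had only one action and every reward were zero, `step` would just sample the next state from a fixed row of a transition matrix — exactly the reduction to a Markov chain mentioned above.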



This stochastic version of FILM, however, may not be able to account for consumer deviation, at least with the current structure of the formulation, because it only looks at the turning probabilities from each node to its incident nodes and d
