Too Long; Didn't Read
The value function is an efficient way to determine the value of being in a state. In a game of tic-tac-toe, getting 2 Xs in a row does not win the game, hence there is no reward for it, even though that state may lead to a win. The value of a state A is the sum, over all possible next states, of the probability of reaching each next state multiplied by the reward for reaching it. For example, from state A the player has a chance of winning the game by placing an X at the top of a row. State D has only 1 possible route, to state E, so the only outcome from D is to receive the reward.
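The probability-weighted sum described above can be sketched in a few lines of Python. This is only an illustration, not the article's own implementation: the function name `state_value`, the next-state labels, and all probabilities and rewards are hypothetical values chosen to make the arithmetic easy to follow.

```python
def state_value(transitions):
    """Return the sum of probability * reward over all next states.

    `transitions` maps a next-state name to a (probability, reward) pair.
    """
    return sum(prob * reward for prob, reward in transitions.values())


# Assumed numbers, for illustration only.
# State D has a single route, to state E, which yields the reward.
transitions_from_D = {"E": (1.0, 1.0)}

# State A (hypothetical): an even chance of reaching a winning next state
# or a next state with no reward.
transitions_from_A = {"win": (0.5, 1.0), "no_win": (0.5, 0.0)}

value_D = state_value(transitions_from_D)
value_A = state_value(transitions_from_A)
print(value_D, value_A)
```

Under these assumed numbers, state D is worth the full reward because its only outcome is the rewarded one, while state A is worth half of it because winning is only one of two equally likely outcomes.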