# Reward matrix, Dominated strategies, Optimal strategies

Please state all your assumptions and show all your work. Define your decision variables clearly. Briefly explain your constraints and objective functions. Define all units of measure (e.g., hours, $, $/hour, etc.). Explain what software package you used (e.g., LINDO, LINGO, EXCEL solver, etc.). For EXCEL solver, be sure to give a separate statement of the formulation /input. Use an equation editor for the equations. If you can not get one, then use subscripts to indicate indexing. Graphs should be detailed and easy to read.

A soldier can hide in one of five foxholes (1, 2, 3, 4, or 5) as shown below. A gunner has a single shot and may fire at any of the four spots A, B, C, or D. A shot will kill a soldier if the soldier is in a foxhole adjacent to the spot where the shot was fired. For example, a shot fired at spot B will kill the soldier if he is in foxhole 2 or 3, while a shot fired at spot D will kill the soldier if he is in foxhole 4 or 5. Suppose the gunner receives a reward of 1 if the soldier is killed and a reward of 0 if the soldier survives the shot.

[1] A [2] B [3] C [4] D [5]

a) Assuming this to be a zero-sum game, construct the reward matrix (payoff table).

b) Find and eliminate all dominated strategies.

c) We are given that an optimal strategy for the soldier is to hide 1/3 of the time in foxholes 1, 3, and 5. We are also told that for the gunner, an optimal strategy is to shoot 1/3 of the time at A, 1/3 of the time at D, and 1/3 of the time at B or C. Determine the value of the game to the gunner.

d) Suppose the soldier chooses the following non-optimal strategy: 1/2 of the time hide in foxhole 1; 1/4 of the time hide in foxhole 3, and 1/4 of the time, hide in foxhole 5. Find a strategy for the gunner that ensures that his expected reward will exceed the value of the game.

e) Write down each player's LP and verify that the strategies given in part (c) are optimal strategies.

© BrainMass Inc. brainmass.com October 25, 2018, 3:10 am ad1c9bdddfhttps://brainmass.com/math/matrices/reward-matrix-dominated-strategies-optimal-strategies-329354

#### Solution Preview

Please see the attached files for solution to given problems.

The only assumption in the problem is that it is ...

#### Solution Summary

Define your decision variables clearly. Briefly explain your constraints and objective functions. Define all units of measure (e.g., hours, $, $/hour, etc.).

examples of matrix and bimatrix games in game theory

CB600 Game Theory

CHAPTER ONE. GAME THEORY

1. Introduction

Game Theory is the mathematical modelling of situations of conflict and competition. These occur in many areas of life: we can think of competing firms in business or conflicts between countries to be resolved by diplomacy or by military means or simple contests of skill or chance between game-players. The common feature is that each side (called a player) has a number of possible alternative courses of action, and the actions taken by all the players will determine the final outcome for all players.

A game can be defined as = I, S, P, where

- I = {1, 2, ..., k} is the set of players,

- S = S1 S2 ... Sk is the set of all situations, where Si is the set of strategies of player i,

- P = S k, P = (P1, P2, ..., Pk) is the pay-off function, with Pi being the individual pay-off function of player i.

We elaborate on these concepts below.

Players are the participants of the game - companies, countries, generals, gamblers. The number of players influences the complexity of the game. There is not much we can do with one player. We shall mostly deal with two players, but will also look at the case of more than two players - this is more exciting as e.g. two players can cooperate against a third one.

Each player can choose from a number of strategies. These may be simple decisions (courses of action) or more complicated rules, detailing how each individual action from the opponent(s) should be answered. So, a simple strategy is "bid £1M for this contract" and a more complicated one is "I bid £1000 for the painting, if anyone bids a higher amount x, I bid 1.5x-500, repeat until won the bidding but stop at £5000". Once all players have made their choices, we arrive at a situation, also known as the outcome.

Payoffs are also known as rewards. Payoffs may be monetary gains or expressed in terms of utilities. Very briefly, the utility of a situation to a player is a measure of desirability. For example, the possible outcomes can be ranked by a player on a 1 to 10 scale; or we can say that winning a game has utility 1 and losing has utility 0. We do not go into details here, but you may read e.g. my CB586 notes or section 13.2 of Winston.

2. Games against Nature

The simplest examples in Game Theory involve only one genuine player. The player must make some decision aiS1 and an event ejS2 then occurs influencing the player's reward rij. For examples, the player may be a peasant planting some seed in the spring and the event is the weather until harvest-time. The player's payoff rij then is the profit made at harvest and is clearly dependent on both the type of seed sown and the weather. Another example is a gambler who makes a bet on the roulette; the event is the number on the wheel and the payoff is the money gained. If we assume that such an event is controlled by someone (Nature, God) then such decision theory problems can be viewed as games against nature, with Nature being the second player and the set of possible events being S2. (P2 is superfluous.) You may have encountered such problems previously (e.g. in CB586).

Consider the example of a newsagent who sells the monthly journal Obscure Games and Puzzles. Each month, the likely demand is for 2, 3 or 4 copies. The newsagent buys the journal from the publisher for £2 and sells it to customers for £3. Unsold copies are thrown away at the end of the month. How many copies should the newsagent order?

Let us introduce the concept of dominated decisions. A decision ai is dominated by another one ah, if for each event ej, rij rhj and for at least one event ek, rik < rhk. It can easily be shown that ordering 0 or 1 or more than 4 copies are all dominated decisions. Dominated decisions can then be eliminated from a player's set of strategies.

There is no unique answer to the newsagent's question. A number of so-called decision criteria are in use to provide guidance.

- The maximin criterion chooses the action ai with the largest value of minjS rij. According to this criterion, the newsagent should buy 2 copies.

- The maximax criterion chooses the action ai with the largest value of maxjS rij. According to this criterion, the newsagent should buy 4 copies.

- The expected value criterion chooses the action ai with the largest value of j pj rij, where pj is the probability of event ej occurring. According to this criterion, if all events are equally likely, the newsagent should buy 2 or 3 copies.

Other criteria are also used (see e.g. CB586 notes).

See Winston, section 13.1 for more details and further examples on this topic.

3. Matrix Games: Pure Strategies

In the following two sections we shall focus on a very simple class of games called matrix games. A game is called a matrix game, if

- there are only two players (1 and 2),

- there are only a finite number of strategies available to the two players, i.e. both S1 and S2 are finite sets, and

- for every situation S, the pay-offs of players 1 and 2 add up to a constant K, i.e. P1() + P2() = K (if K = 0, the game is called zero-sum, in general constant-sum).

The reason for calling such games matrix games is that the payoffs can be tabulated in a single matrix. Strategies of player 1 correspond to rows and those of player 2 to columns. Entries of the matrix are the corresponding payoffs of player 1. While player 1 will try to maximise the entry, player 2 will try to minimise it as this corresponds to maximising his/her gain. It is not necessary to tabulate player 2's payoffs, as they can be calculated using P2() = K-P1().

Another important feature of matrix games is that players are in total conflict. Every gain made by one of the players is at the expense of the other. Thus, cooperation between the players cannot occur.

The basic assumption about how such games are to be played was formulated as follows by János Neumann and Oscar Morgenstern. Each player chooses a strategy that enables him to do the best he can, given that his opponent knows the strategy he is following. (Note that it is possible to conceive that players do not know the strategies of others: we may look at this at the end, if there is time.)

In some cases, there is a very simple solution procedure to matrix games, called a solution in pure strategies. This solution will indicate which decision (row) player 1 and which decision (column) player 2 should choose. We calculate the minima for all rows and take the largest of them (denoted by v1). We also calculate the maxima for all columns and take the smallest of them (denoted by v2). If v1 = v2, then there exists a solution in pure strategies. The intersection(s) of the row(s) yielding v1 and the column(s) yielding v2 is/are called saddle point(s) of the matrix. (Also known as equilibrium points - if you have already learnt multi-variable optimisation you should be familiar with the concept of a saddle point in Calculus.) The corresponding entry (entries) will have value v (= v1 = v2), called the value of the game (to player 1). Otherwise it can be shown that v1 < v2 and in this case no solution exists in pure strategies; in this case we have to resort to other methods, some of which will be explained in the next section.

Consider the reward matrices and . The former has a saddle point, player 1 should choose row 3 and player 2 should choose column 2, the value of the game is 5. The latter does not have a saddle point and thus no solution in pure strategies.

Consider the example of two competing television networks. Between them, they have a guaranteed 10 million viewers. In a given time slot, they both plan to broadcast either a western or a soap opera or a comedy. Notice how the basic assumption is fulfilled here: neither of the networks knows until they make their own decision what the other one will broadcast but they know that it will be one of three genres and nothing else. The payoff to a network is the number of viewers; the table below shows P1 with P2 being 10-P1(). It can easily be checked that this game has a saddle point: network 1 should schedule a soap opera and network 2 should show a western. The value of this game is 4.5 to network 1 and 5.5 to network 2. (Observe that "comedy" is a dominated action for network 2.)

network 2

western soap opera comedy

network 1 western 3.5 1.5 6.0

soap opera 4.5 5.8 5.0

comedy 3.8 1.4 7.0

We can extend the above game as follows. For the sake of simplicity, let us consider that the networks have only the choice of western or soap opera. They must decide what to show in a slot this week and the same slot next week. Their decision for the second week can be influenced by the other network's choice for the current week. If a network shows the same genre twice, they lose 1m viewers to the other one. (If they both repeat genre, the losses cancel out.) Note that if they needed to decide for two weeks in advance, the solution would be the same as the one-week game above. Similarly, if there was no penalty for repeating genre, the networks would make the same decision in both weeks.

The strategies of each network are then as follows:

1. show western both weeks;

2. show western, if the other shows western, show western next week otherwise show soap;

3. show western, if the other shows western, show soap next week otherwise show western;

4. show western, then show soap;

5. show soap both weeks;

6. show soap, if the other shows western, show western next week otherwise show soap;

7. show soap, if the other shows western, show soap next week otherwise show western;

8. show soap, then show western.

The payoff table is shown below.

strategy of network 2

1 2 3 4 5 6 7 8

strategy of network 1 1 7.0 7.0 4.0 4.0 3.0 4.0 3.0 4.0

2 7.0 7.0 4.0 4.0 8.3 6.0 8.3 6.0

3 9.0 9.0 9.3 9.3 3.0 4.0 3.0 4.0

4 9.0 9.0 9.3 9.3 8.3 6.0 8.3 6.0

5 9.0 9.3 9.0 9.3 11.6 11.6 9.3 9.3

6 9.0 6.0 9.0 6.0 11.6 11.6 9.3 9.3

7 9.0 9.3 9.0 9.3 8.3 8.3 9.3 9.3

8 9.0 6.0 9.0 6.0 8.3 8.3 9.3 9.3

This game has a solution in pure strategies - network 1 shows show soap opera twice and network 2 shows western twice.

See Winston (section 14.1.) for further information on this topic.

View Full Posting Details