Extreme points pruning - Dynamic approach

3.2 Dynamic approach

3.2.3 Extreme points pruning

Number of facets and extreme points grows exponentially due to the depth of the tree. Not all the extreme points (and possibly facets) are however needed, some strategies are just dominated and can be pruned. In terms of those extreme points, let us consider the set of all extreme points of all facets, for which the best response of the follower in the computed node (just one action, not considering children) is the same. We will denote this set D. In terms of simplex, any linear combination of those points is reachable for the leader (because the best response of the follower will not change) as long as it doesn’t lower the follower’s utility, therefore if we find an extreme point E1∈Dsuch that there is a linear combination of two other points fromDthat offers the same or better utility to both players, E1 can be pruned. If we use follower’s and leader’s utility of extreme points fromDas the coordinates and plot those points into a graph along with their linear combinations, only points from the upper envelope can be preserved without loss of any possible solution to the whole problem (See Figure 3.1.). If all extreme points of an facet are pruned, that facet is pruned as well.

16CHAPTER 3. COMPUTING STRONG STACKELBERG EQUILIBRIUM IN SEQUENTIAL GAMES

Figure 3.1: Example of extreme points pruning

3.2. DYNAMIC APPROACH 17

Algorithm 2: Baseline algorithm for Solving the Extensive form Stackelberg games with perfect information and concurrent moves

1 function DynamicApproach(maxDepth, N odesByLevel);

Input :int maxDepth- maximal depth of the Tree

Input :N odesByLevel[][] - all nodes in the game-tree grouped by the their depth in the tree

Output:Value of the game

2 i←− maxDepth;

3 whilei≥0 do

4 N odesAtLevel←−N odesByLevel[i];

5 forallN ode n∈N odesAtLevel do

11 forallf ollowerAction f a∈ A do

12 FacetSetListF;

18 forallFacetCombination c ∈ C do

19 LP ←−createLPfromFacets(c);

20 P oints←− LP.transformToPolytope().getExtremePoints();

21 P oints.fillUtilityValues();

Example: Now we will show the simple example of how the LPs are created. Let us consider a game in node n, in which Al(n) = {a, b}, Af(n) = {c, d}. We will denote q(n, a, c) = s1, q(n, b, c) = s2, q(n, a, d) =s3 and q(n, b, d) = s4. There will be 7 facets and 10 extreme points in this example, as depicted in 3.1.

AsU_f values of a sub-tree correspond to the minimal utility the follower can get in that sub-tree,Uf(s1) =−1,Uf(s2) = 0,Uf(s3) =−2andUf(s4) =−4. Because the follower can play 2 actions, there will be just one best response constraint in every LP.

Now, we will consider actioncto be follower’s best response and create a linear program

18CHAPTER 3. COMPUTING STRONG STACKELBERG EQUILIBRIUM IN SEQUENTIAL GAMES

sub-tree Facet Extreme point leader’s utility value follower’s utility value

s₁ f₁ e₁ 0 1

Table 3.1: Example: utility values of extreme points along with their membership to facets and sub-trees

for each of 4 facet combinations from sub-trees s₁ and s₂. Starting with facets f₁ and f₃, their LP will have following form:

This will produce polytope with 3 extreme points, first for p(a) = c(e₁) = 1, second for p(a) = c(e2) = 1 and the third for p(b) = c(e4) = 1, (which correspond to leader playing one of his pure strategies). Therefore, newly generated facet f(f1, f3) will have 3 extreme points with utility values of both players equal to utility values of points e₁,e₂ and e₄ (for the summary of which extreme points and facets were created, see Table3.2).

Now we move to LP for combination of facets f₁ and f₄. The LP will have following form:

0≤p(a)≤1 0≤p(b)≤1 0≤c(e₁)≤1 0≤c(e₂)≤1

3.2. DYNAMIC APPROACH 19

The Polytope of this LP will have just 2 extreme points, again corresponding to pure strategies of the leader. So new facet f(f2, f3) will contain 2 extreme points.

The final combination of facets for action cas best response is the combination of facets f₂ andf₄.

20CHAPTER 3. COMPUTING STRONG STACKELBERG EQUILIBRIUM IN SEQUENTIAL GAMES

This will produce polytope with 3 extreme points, once again just copying utility values of extreme pointse3,e5 and e6 to new extreme points in facetf(f2, f4).

That is all for the actioncof the follower, so now we will move to LPs for actiondbeing the best response of the follower. We start by combining facetsf5 andf6, which will create a following LP:

This LP is clearly infeasible, which means that this combination of facets will never correspond to any best response of the follower. No facet or extreme points will be generated by this LP, so we move to the last combination of facets,f5 and f7:

This LP will yield 3 extreme points for the new facet f(f5, f7), one of which just copies utilities values of extreme point,e₁₀, but second one corresponds to values ofp(b) = 1, c(e₉) = 1/3, c(e10) = 2/3 which will create new extreme point e11 with Ul(e11) = 1/3 (−2) + 2/3 (−1) = −4/3 and U_f(e11) = 1/3 (−4) + 2/3 (2) = 0. The last extreme point e12

corresponds to p(a) = 2/3, p(b) = 1/3, c(e₈) = 2/3, c(e₁₀) = 1/3, therefore U_l(e₁₂) = 2/3 (2) + 1/3 (−1) = 1and Uf(e12) = 2/3 (−2) + 1/3 (2) =−2/3.

Now that all extreme points are calculated, begins the pruning phase. For action c, all the extreme points were just copied twice into new facets, therefore one copy of each will surely get pruned. Also, since extreme point e5 fares better (it has better leader’s utility value for the sam follower’s utility value) than pointse1 ande4, all copies of those two points will be pruned as well. Facetf(f₂, f₄) will be empty after the pruning and will be removed.

For action d, linear combination of copy of pointe10and e12 (with coefficients 3/4 and 1/4) fares better than the point e11, therefore e11 will be pruned from the facet f(f5, f7). The solution (4 facets, 6 extreme points) after the pruning is written in Table 3.3.

3.2. DYNAMIC APPROACH 21

follower’s action Facet leader’s utility value follower’s utility value

c f(f₁, f₃) 0 1

c f(f₁, f₃) -1 2

c f(f₁, f₃) 1 1

c f(f₁, f₄) 0 1

c f(f1, f4) -1 2

c f(f1, f4) 2 1

c f(f1, f4) 3 0

c f(f2, f3) 1 -1

c f(f2, f3) 1 1

c f(f2, f4) 1 -1

c f(f₂, f₄) 2 1

c f(f₂, f₄) 3 0

d f(f₅, f₇) 1 -2/3

d f(f₅, f₇) -4/3 0

d f(f₅, f₇) -1 2

Table 3.2: Example: facets and extreme points generated by the LPs

follower’s action Facet leader’s utility value follower’s utility value

c f(f1, f3) -1 2

c f(f1, f4) 2 1

c f(f1, f4) 3 0

c f(f2, f3) 1 -1

d f(f5, f7) 1 -2/3

d f(f₅, f₇) -1 2

Table 3.3: Example: Generated facets and extreme points after the pruning phase

22CHAPTER 3. COMPUTING STRONG STACKELBERG EQUILIBRIUM IN SEQUENTIAL GAMES

Figure 3.2: Example: visualization of pruning phase for follower’s action c

3.3 Computing the lower bound on the value of Stackelberg

In document ZADÁNÍ DIPLOMOVÉ PRÁCE (Stránka 30-37)