
DYNAMIC PRICING & LEARNING IN ELECTRICITY MARKETS
Alfredo Garcia (joint work with Enrique Campos and James Reitzes)
Systems & Information Engineering, University of Virginia


INTRODUCTION
Power markets in South America, Norway and New Zealand are predominantly hydro-based.
Pricing rules for available water under centralized dispatch were well developed (using stochastic dynamic programming techniques).
The opportunity cost of increasing current water usage, and thereby expanding current hydro generation, is that less water is available for future electricity production.
In the new market structure it is necessary to model the strategic use of water.
Price is, essentially, the control variable influencing the water levels and future profits available to hydro generators.


EXAMPLE
Two identical generators, each with reservoir and production capacity equal to 1. Demand equals 1. Both generators receive inflows according to the random variable w, where:

\[ w = \begin{cases} 1 & \text{with probability } q \\ 0 & \text{with probability } 1 - q \end{cases} \]

We posit a symmetric equilibrium. Let Vx,y denote firm i’s value function for the state (x, y), where x ∈ {0, 1} denotes firm i’s reservoir level, and y ∈ {0, 1} denotes its rival’s reservoir level. Letting β ∈ (0, 1) represent the discount factor, firm i’s value function for the state (0, 0) can be expressed as follows:

V0,0 = β [(1 − q)V0,0 + qV1,1] .    (1)

[Figure: state-transition diagram over the reservoir states (0,0), (0,1), (1,0) and (1,1). Edge labels recovered from the slide: (1-q)^2, q(1-q), 1/2(1-q), q and q^2.]

Next, consider the state (0, 1):

V0,1 = β [(1 − q)V0,0 + qV1,1] .    (2)

Next, consider the state (1, 0):

V1,0 = c∗ + β [(1 − q)V0,0 + qV1,1] .    (3)

Note that, by subtracting equation (1) (or (2)) from equation (3), it follows that:

V1,0 = c∗ + V0,0 = c∗ + V0,1.    (4)

Lastly, consider the “competitive” state, (1, 1). If firm i undercuts its rival by bidding ε below p1,1, it sells today and its value is as follows:

(p1,1 − ε) + β [(1 − q)V0,1 + qV1,1] .

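Alternatively, firm i can bid ε above p1,1, withhold its water and let its rival serve demand, which yields:

β [(1 − q)V1,0 + qV1,1] .

In equilibrium, firm i is indifferent between the two options:

p∗1,1 + β [(1 − q)V0,1 + qV1,1] = β [(1 − q)V1,0 + qV1,1] .    (5)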

Rearranging equation (5), we obtain:

p∗1,1 = β(1 − q)(V1,0 − V0,1).

Recognizing that V1,0 − V0,1 = c∗ from equation (4), we derive the following equation:

p∗1,1 = β(1 − q)c∗.
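As a quick numerical check of the example, the following Python sketch iterates equations (1) to (3) to a fixed point and then evaluates p∗1,1 = β(1 − q)(V1,0 − V0,1). It assumes, beyond what the slides state explicitly, that the equilibrium value of state (1, 1) equals the withholding value β[(1 − q)V1,0 + qV1,1]. The function name `solve_example` and the parameter values are illustrative only.

```python
def solve_example(beta, q, cap, tol=1e-12, max_iter=100_000):
    """Fixed-point iteration on the value functions of the two-generator example."""
    v00 = v01 = v10 = v11 = 0.0
    for _ in range(max_iter):
        # Continuation value shared by equations (1), (2) and (3).
        cont = beta * ((1 - q) * v00 + q * v11)
        new_v00 = cont            # equation (1)
        new_v01 = cont            # equation (2)
        new_v10 = cap + cont      # equation (3)
        # Assumption: V(1,1) is the withholding value at the indifference price.
        new_v11 = beta * ((1 - q) * v10 + q * v11)
        change = max(abs(new_v00 - v00), abs(new_v01 - v01),
                     abs(new_v10 - v10), abs(new_v11 - v11))
        v00, v01, v10, v11 = new_v00, new_v01, new_v10, new_v11
        if change < tol:
            break
    # Rearranged equation (5).
    price_11 = beta * (1 - q) * (v10 - v01)
    return {"V00": v00, "V01": v01, "V10": v10, "V11": v11, "p11": price_11}


# Illustrative parameters: beta = 0.9, q = 1/3, price cap c* = 1.
result = solve_example(beta=0.9, q=1 / 3, cap=1.0)
print(result)
```

With these illustrative parameters the computed price should agree with the closed form β(1 − q)c∗ up to the iteration tolerance.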

A GENERAL FORMULATION
Let Ki be a positive integer that denotes the storage capacity of the reservoir owned by player i ∈ {1, 2, 3, ..., n}. The state space is

\[ S = \prod_i \{0, 1, 2, \ldots, K_i\} \]

Player i’s production capacity in each period is denoted by xi (also a positive integer). We shall assume that players can only produce electricity whenever the water stored exceeds productive capacity. In each period, demand for electricity is denoted by D (a positive integer) and is assumed to be perfectly inelastic with respect to price. Bids are linear and can take any value in P = [0, c∗], where c∗ is the maximum bid allowed, or price cap, imposed by the regulator. Given bids bi by the players, the spot price for electricity p∗(b) (where b = (b1, b2, ..., bn) ∈ ΠiP) is set equal to the marginal generator’s bid.

Letting b[k] denote the k-th lowest price bid and [k] ∈ {1, 2, 3, ..., n} the associated player’s index, the spot price can be formally expressed as:

\[ p^*(b) = b_{[m^*]} \quad \text{where} \quad m^* = \arg\min \Big\{ k : \sum_{i=1}^{k} x_{[i]} \ge D \Big\} \]
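The marginal-bid rule can be sketched in Python as follows. This is a minimal illustration, not the authors' code: it sorts bids in ascending order, accumulates capacity until demand is covered, and returns the marginal bid. The function name `spot_price`, the tie-breaking by player index and the decision to ignore reservoir levels are all assumptions made for the sketch.

```python
def spot_price(bids, capacities, demand):
    """Return (price, m_star), where m_star is the 1-based rank of the marginal bid."""
    # Rank players by bid; sorted() is stable, so ties keep index order (an assumption).
    order = sorted(range(len(bids)), key=lambda i: bids[i])
    cumulative = 0
    for rank, player in enumerate(order, start=1):
        cumulative += capacities[player]
        if cumulative >= demand:
            # First rank at which cumulative capacity covers demand.
            return bids[player], rank
    raise ValueError("total capacity is insufficient to cover demand")


# Hypothetical bids for three players with unit capacity and demand D = 2.
price, m_star = spot_price(bids=[0.5, 0.2, 0.9], capacities=[1, 1, 1], demand=2)
print(price, m_star)
```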

Player i’s demand as a function of the bids and the state of the reservoirs s = (s1, s2, ..., sn) ∈ S in the system is:

\[ D_i(b; s) = \begin{cases} x_i & b_i < p^*(b) \text{ and } s_i \ge x^* \\ \tilde{x} & b_i = p^*(b) \text{ and } s_i \ge x_i \\ 0 & \text{otherwise} \end{cases} \]

where

\[ x^* = \sum_{i=1}^{m^*} x_{[i]} - D \]

Moreover, since marginal costs are assumed to be negligible, the immediate resulting payoffs are of the form:

ri(b; s) = p∗(b) · Di(b; s)

The evolution of the reservoir is governed by a first-order stochastic difference equation:

\[ s'_i = \min[\, s_i - D_i(b; s) + \xi_i,\; K_i \,] \]

where ξi is a (non-negative, integer-valued) random variable. To simplify notation in what follows, we shall refer to the probability of reaching state s′ from state s after players bid b as f(s′; b, s). Finally, we assume a discount factor β ∈ (0, 1).
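The reservoir dynamics can be simulated with a few lines of Python. The sketch below assumes a Bernoulli inflow (1 with probability q, 0 otherwise), borrowed from the two-generator example; the general formulation only requires a non-negative integer inflow. The names `next_level` and `sample_inflow` are hypothetical.

```python
import random


def next_level(level, dispatched, inflow, capacity):
    """One step of the reservoir equation: min(s - D + xi, K)."""
    return min(level - dispatched + inflow, capacity)


def sample_inflow(q, rng):
    """Assumed Bernoulli inflow: 1 with probability q, 0 otherwise."""
    return 1 if rng.random() < q else 0


# Simulate five periods without dispatch for a reservoir of capacity K = 2.
rng = random.Random(0)
level = 1
for _ in range(5):
    level = next_level(level, dispatched=0, inflow=sample_inflow(1 / 3, rng), capacity=2)
print(level)
```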


Stationary Markovian Pricing Strategies. For every player i, a pure Markovian pricing strategy is denoted by πi : S → P.

A Markovian strategy combination π = (π1, π2, ..., πn) is a vector of Markovian pricing strategies, one entry for each player. Π is the set of all (pure) Markovian strategy combinations.
Value function associated with strategy π:

\[ Q_i^{\pi}(s) = r_i(\pi(s); s) + \beta \sum_{s' \in S} Q_i^{\pi}(s') \cdot f\big(s'; \pi(s), s\big) \]
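For a fixed strategy combination, the value function is the fixed point of a linear recursion and can be approximated by repeated substitution. The Python sketch below assumes the per-state payoffs ri(π(s); s) and the transition probabilities f(s′; π(s), s) have already been tabulated; the dictionaries `reward` and `transition` and the two-state chain used to exercise the function are hypothetical.

```python
def evaluate_policy(states, reward, transition, beta, tol=1e-12, max_iter=100_000):
    """Approximate Q(s) = reward[s] + beta * sum_{s'} transition[s][s'] * Q(s')."""
    q = {s: 0.0 for s in states}
    for _ in range(max_iter):
        new_q = {
            s: reward[s] + beta * sum(prob * q[nxt] for nxt, prob in transition[s].items())
            for s in states
        }
        gap = max(abs(new_q[s] - q[s]) for s in states)
        q = new_q
        if gap < tol:
            break
    return q


# Hypothetical two-state chain: A pays 1 and moves to B, B pays 0 and moves to A.
states = ["A", "B"]
reward = {"A": 1.0, "B": 0.0}
transition = {"A": {"B": 1.0}, "B": {"A": 1.0}}
values = evaluate_policy(states, reward, transition, beta=0.5)
print(values)
```

Because β < 1, the recursion is a contraction, so the iteration converges from any starting point.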

A strategy combination π∗ is a stationary MPE iff, for every player i, every state s ∈ S and every Markovian pricing strategy πi:

\[ Q_i^{\pi^*}(s) \ge Q_i^{(\pi_i, \pi^*_{-i})}(s) \]

where (πi, π∗−i) is the strategy combination with player i bidding according to πi (instead of π∗i).


CHARACTERIZATION OF MPE
Let Mi ⊂ S be the set of states in which player i's stored water falls short of its production capacity:

Mi = {s ∈ S | si < xi}

In the following discussion we restrict our attention to stationary Markovian pricing strategies π that have the following property:

s ∈ Mi =⇒ πi(s) = c∗


Valuation
For s ∈ Mi^c (the complement of Mi) and b ∈ P we have:

\[ V_i^{\pi}(s; b) = p^*\big((b, \pi_{-i}(s))\big) \cdot D_i\big((b, \pi_{-i}(s)); s\big) + \beta \sum_{s' \in S} V_i^{\pi}(s') \cdot f\big(s'; (b, \pi_{-i}(s)), s\big) \qquad (1) \]

\[ V_i^{\pi}(s) = \sup_{b \in P} V_i^{\pi}(s; b) \qquad (2) \]

where (b, π−i(s)) stands for the strategy combination that equals π except at state s, where player i bids b. Equation (1) determines the value of bidding b today assuming players will play according to π−i. Equation (2) summarizes the value of today’s best decision.

To Release or Not to Release? (that is the question)
If s ∈ Mi^c and V_i^π(s; c∗) ≥ V_i^π(s; b) for any b ∈ P, then player i will opt to withhold capacity. In other words, c∗ is a solution to problem (2) and we say that player i is supra-marginal. Conversely, if

V_i^π(s; 0) ≥ V_i^π(s; b)

for any b ∈ P, player i finds that selling today at price p∗(π(s)) is optimal. In other words, bidding zero is a solution to problem (2) and we say that player i is infra-marginal.
The indifference price for player i, pi(s, π), is the price that equates the value obtained by releasing today with the value obtained by withholding.

\[ p_i(s, \pi) = \frac{1}{x_i}\,\beta \sum_{s' \in S} V_i^{\pi}(s') \cdot f\big(s'; (c^*, \pi_{-i}(s)), s\big) - \frac{1}{x_i}\,\beta \sum_{s' \in S} V_i^{\pi}(s') \cdot f\big(s'; (0, \pi_{-i}(s)), s\big) \]

However, pi(s, π) may not lie in the feasible price set P. Therefore, one must define a truncated version:

\[ \tilde{p}_i(s, \pi) = \begin{cases} 0 & \text{if } p_i(s, \pi) \le 0 \\ c^* & \text{if } p_i(s, \pi) \ge c^* \\ p_i(s, \pi) & \text{otherwise} \end{cases} \]
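The indifference price and its truncation can be sketched in Python as below. The sketch assumes the continuation values and the two transition distributions (after bidding c∗ and after bidding 0) are available as dictionaries; the function names are hypothetical. The numbers in the usage lines are illustrative values that satisfy equations (1) to (3) of the two-generator example for β = 0.9, q = 1/3 and c∗ = 1.

```python
def indifference_price(values, f_withhold, f_release, x_i, beta):
    """(beta / x_i) * (E[V | bid c*] - E[V | bid 0]), before truncation."""
    ev_withhold = sum(prob * values[s] for s, prob in f_withhold.items())
    ev_release = sum(prob * values[s] for s, prob in f_release.items())
    return beta * (ev_withhold - ev_release) / x_i


def truncate(price, cap):
    """Clamp a price to the feasible interval [0, cap]."""
    return min(max(price, 0.0), cap)


# State (1, 1) of the two-generator example, seen from firm i.
values = {(0, 0): 1.8, (0, 1): 1.8, (1, 0): 2.8, (1, 1): 2.4}
f_withhold = {(1, 0): 2 / 3, (1, 1): 1 / 3}  # rival sells, firm i keeps its water
f_release = {(0, 1): 2 / 3, (1, 1): 1 / 3}   # firm i sells, rival keeps its water
raw = indifference_price(values, f_withhold, f_release, x_i=1, beta=0.9)
print(truncate(raw, cap=1.0))
```

In this instance the untruncated price already lies inside [0, c∗], so the truncation leaves it unchanged.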

To complete our definition of indifference prices, we shall set the indifference price for the marginal bidder (or bidders) as follows:

\[ \tilde{p}_i(s, \pi) = \tilde{p}_{[m^*+1]}(s, \pi) - \delta \]

for some δ > 0, where [m∗ + 1] is the index associated with the first supra-marginal bidder (as defined above) under strategy combination π at the given state s.

Theorem 1: The strategy “always bid your indifference price” is a Markov Perfect Equilibrium, i.e. if π is of the form

\[ \pi_i(s) = \tilde{p}_i(s, \pi) \]

then π is a Markov Perfect Equilibrium.


LEARNING DYNAMIC EQUILIBRIUM
We start with a stationary policy

\[ \pi^0 : S \to \{0, \tfrac{c^*}{M}, \tfrac{2c^*}{M}, \ldots, c^*\}^n \]

that maps each state into the bids made by the players in the past. Note that the price interval [0, c∗] is discretized to the set P = {0, c∗/M, 2c∗/M, ..., c∗}, for some integer M.
Each player i has an initial estimate \tilde{V}_i^{(p, \pi^0_{-i})}(s) of the value function for each state s and price p ∈ P. Here (p, π^0_{-i}) denotes the policy where all other players use policy π^0, but player i uses price p when state s is visited, and π^0_i at every other state.


At stage k of the algorithm, each player i proceeds as follows.
(1) Using simulation, the player obtains a raw estimate \hat{V}_i^{(p, \pi^k_{-i})}(s) of the value function for each state s and price p ∈ P.

(2) The player corrects its estimate using previous estimates, i.e.

\[ \tilde{V}_i^{(p,\pi^{k}_{-i})}(s) = (1 - \alpha_k) \cdot \tilde{V}_i^{(p,\pi^{k-1}_{-i})}(s) + \alpha_k \cdot \hat{V}_i^{(p,\pi^{k-1}_{-i})}(s), \]

for all states s.

(3) The estimate of the indifference prices is computed by

\[ \hat{p}_i(s, \pi^k) = \left\lfloor \frac{1}{x_i} \cdot \left[ \tilde{V}_i^{(c^*,\pi^k_{-i})}(s) - \tilde{V}_i^{(0,\pi^k_{-i})}(s) \right] + p^*\big((0, \pi^k_{-i}(s))\big) \right\rfloor \]

for each state s, where ⌊x⌋ is the closest point in {0, c∗/M, 2c∗/M, ..., c∗} to x ∈ [0, c∗].
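Steps (2) and (3) of the learning procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the raw simulation estimates of step (1) are taken as given numbers, the step size α_k is passed in by the caller (the slides do not specify a schedule), and all function names are hypothetical.

```python
def price_grid(cap, m):
    """Discretized price set {0, cap/m, 2*cap/m, ..., cap}."""
    return [j * cap / m for j in range(m + 1)]


def smooth(previous, raw, alpha):
    """Step (2): blend the previous estimate with the new raw estimate."""
    return (1 - alpha) * previous + alpha * raw


def nearest_grid_point(x, grid):
    """Closest grid point to x; ties go to the lower point (an assumption)."""
    return min(grid, key=lambda g: abs(g - x))


def estimated_indifference_price(v_withhold, v_release, price_if_release, x_i, grid):
    """Step (3): value gap per unit of capacity plus today's price, snapped to the grid."""
    raw = (v_withhold - v_release) / x_i + price_if_release
    return nearest_grid_point(raw, grid)


# Hypothetical numbers: cap c* = 1, M = 4, unit capacity.
grid = price_grid(cap=1.0, m=4)
updated = smooth(previous=2.0, raw=3.0, alpha=0.25)
bid = estimated_indifference_price(v_withhold=2.4, v_release=2.0,
                                   price_if_release=0.3, x_i=1, grid=grid)
print(updated, bid)
```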

[Figure: Computational Tests: Example for n=2, K=2, q = 1/3. Four panels titled "Price for (1,1)", "Price for (1,2)", "Price for (2,1)" and "Price for (2,2)", each showing series for Player 1 and Player 2, with a vertical axis from 0 to 1.2 and a horizontal axis from 0 to 1000.]


Theorem 2: If π ≥ π′ implies p̃i(s, π) ≥ p̃i(s, π′) for every player i and any given state s, then d(π^k, Π∗) → 0 with probability 1.
