Multiagent Cooperative Search for Portfolio Selection

David C. Parkes

Computer and Information Science Department University of Pennsylvania Philadelphia, PA 19104 [email protected]

Bernardo A. Huberman Internet Ecologies Group Xerox Palo Alto Research Center Palo Alto, CA 94304 [email protected]

Abstract

We present a new multiagent model for the multiperiod portfolio selection problem. Individual agents each receive a share of initial wealth, and follow an investment strategy to adjust their portfolio as they observe movements of the market over time. The agents share their wealth at the end of the final investment period. We show that a system of diverse agents will outperform a single agent in a simple stochastic market environment. Furthermore, a cooperative multiagent system, with a simple communication mechanism of explicit hint exchange between agents, achieves a further increase in performance. Finally, we show that communication is redundant in a more realistic market that satisfies the constraints between volatility and return implied by the Capital Asset Pricing Model.

1 Introduction

Portfolios are an effective way of increasing expected long-term return while decreasing risk when investing in the stock market (Markowitz 1959). The portfolio selection problem, of choosing an optimal set of stocks in which to invest, has received considerable attention in both the financial (Campbell, Lo, & MacKinlay 1997; Cover 1991) and statistics literature (Samuelson 1969; Cover & Gluss 1986; Algoet & Cover 1988; Cover & Ordentlich 1996). We introduce a new multiagent model for portfolio selection that builds on a recent computationally-efficient portfolio selection strategy with a worst-case performance guarantee (Helmbold et al. 1998). We present a quantitative assessment of the performance of our multiagent model in simulated stock markets.

The multiagent model for portfolio selection assumes a system of bounded-rational cooperative agents that pool their initial wealth, manage a share of that investment each, and then pool their final wealth. The agents use a myopic algorithm to change their portfolio between investment periods, based on the current market prices and their current portfolio. We later allow the agents to communicate through an explicit exchange of the recent performance of individual portfolio selection strategies. An agent can switch to the portfolio strategy of the agent that has been performing best in the recent past. This simple mechanism of hint exchange has enabled exponential performance improvements in other cooperative problem solving domains (Huberman 1990; Clearwater, Huberman, & Hogg 1991).

We present the results of a quantitative assessment of the performance of our multiagent portfolio selection model in a simple stochastic market that show that: (a) a system of diverse agents will outperform a single agent; (b) a system of agents can further improve its performance through a simple communication mechanism of hint exchange between agents and strategy switching.
Finally, we show that hint exchange and strategy switching are redundant in a market model that satisfies the Capital Asset Pricing Model (CAPM). The CAPM is an equilibrium market model that places constraints on the volatility of stock dynamics and quantifies correlations between price movements of individual stocks. Intuitively, additional communication between a small group of agents with sophisticated portfolio selection strategies is less relevant in an efficient market with price dynamics that reflect implicit communication between many agents.

The rationality of agents within a real market is constrained by the amount of information that is available. In a large stock market, with many diverse stocks, there is simply too much information and too many strategies for any one investor to analyze. There are many stocks and other financial instruments, and prices are but one piece of a complex web of information that is available. Every agent in our simulation has access to all of the information that is available (the stock prices) and could (with knowledge of the strategies of other agents in the system) simulate a system of agents, and achieve the system performance without any communication. However, we constrain the rationality of our agents to reflect the constraints that are placed on investors in real markets. We assume simple agents that are unable to simulate the investment strategies of other agents, and can benefit from an exchange of information on the recent performance of the strategies of other agents, given the option of switching to another strategy.

2 The Portfolio Selection Problem

In this section we specify two different multiperiod portfolio selection problems: model-based and model-free portfolio selection. The problems differ in the assumptions that we make about the price dynamics and the information available to agents.

Consider a market of $N$ stocks. The change of price of stock $i$ in a single investment period is represented by its price relative, $x_i$, where $x_i$ is the ratio of closing price to opening price over the investment period, non-negative by definition. A portfolio over $N$ stocks is represented as a portfolio vector $w = (w_1, \ldots, w_N)$, where $w_i \geq 0$ and $\sum_{i=1}^{N} w_i = 1$. An agent that holds portfolio $w$ invests a fraction $w_i$ of its wealth in stock $i$. The wealth of an agent with portfolio $w$ is multiplied by its return on investment, $w \cdot x = \sum_{i=1}^{N} w_i x_i$, given price relatives $x$, and so increases or decreases according to whether this return is above or below one.

In multiperiod portfolio selection the price relatives in period $t$ are denoted $x^t$, and a portfolio in period $t$ is denoted $w^t$. The return on investment over $T$ investment periods for an agent that chooses a sequence of portfolios $\{w^T\} = w^1, \ldots, w^T$, given a sequence of stock price relatives $\{x^T\} = x^1, \ldots, x^T$, is $R = \prod_{t=1}^{T} w^t \cdot x^t$. The goal of multiperiod portfolio selection is to select a sequence of portfolios $\{w^T\}$ to maximize some measure of performance over final return on investment.

The portfolio selection problem is hard because it is an online problem: the future inputs are unknown, and an agent must choose a portfolio $w^t$ for investment period $t$ without knowledge of the price relatives $x^t$. The offline portfolio selection problem, with knowledge of the sequence of stock prices, is trivial (but unrealistic): the optimum portfolio selection strategy with hindsight switches all investment at the start of each period into the stock that shows the greatest return in that period.

There are three common types of analysis for online algorithms (Irani & Karlin 1997):
(1) Average-case analysis. This requires a statistical model of the inputs.
(2) Worst-case analysis. This is not informative for portfolio selection because any algorithm will have bad worst-case performance: consider stock prices chosen to make the stocks held in each investment period devalue.
(3) Competitive analysis. This measures the worst-case performance of the online algorithm with respect to an optimal offline algorithm.

In competitive analysis the performance of the optimal offline algorithm compensates for the difficulty of the input sequence, often allowing more informative analysis than direct worst-case analysis. We only ask that an algorithm performs well relative to the difficulty of the input. Given a particular performance measure, an online algorithm is strongly competitive with an offline algorithm if it achieves the maximum possible minimum ratio of performance with respect to the offline algorithm over all input sequences (Irani & Karlin 1997).

In model-based portfolio selection agents have access to a statistical model of stock price dynamics. This allows average-case analysis of the performance of different portfolio selection strategies, for example within a traditional expected utility framework. Agents can determine an optimal portfolio strategy through direct optimization.

In model-free portfolio selection agents have no statistical model of stock dynamics, and stock prices can be arbitrary sequences with no underlying statistics.

In model-free portfolio selection the optimality of a portfolio selection strategy is assessed within a framework of competitive analysis. In Section 2.2 we present a simple algorithm that has been developed for the model-free portfolio selection problem. The algorithm provides performance that is strictly competitive with a restricted class of off-line algorithms. In this paper we assume that individual agents do not have access to a statistical model of the stock market prices. However, we assess the performance of our multiagent portfolio selection systems on simulated stochastic models, and this enables us to perform economically significant average-case analysis of the performance of our portfolio selection systems.
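The definitions in this section can be made concrete with a minimal sketch. It computes the multiperiod return $R$ for a given sequence of portfolios, and the return of the hindsight optimum that moves all wealth into the best stock in each period. The function names and the small data set are illustrative assumptions, not part of the experiments reported in this paper.

```python
import numpy as np

def multiperiod_return(portfolios, price_relatives):
    """Return R = prod_t (w^t . x^t) for T periods and N stocks.

    Both arguments are (T, N) arrays; each row of `portfolios` is
    non-negative and sums to one.
    """
    w = np.asarray(portfolios, dtype=float)
    x = np.asarray(price_relatives, dtype=float)
    per_period = np.sum(w * x, axis=1)  # w^t . x^t for each period t
    return float(np.prod(per_period))

def hindsight_return(price_relatives):
    """Return of the offline optimum: all wealth in the best stock each period."""
    x = np.asarray(price_relatives, dtype=float)
    return float(np.prod(x.max(axis=1)))

# Illustrative (made-up) price relatives: two stocks over three periods.
x = np.array([[1.10, 0.95],
              [0.90, 1.05],
              [1.02, 1.00]])
w = np.full_like(x, 0.5)  # hold half of the wealth in each stock in every period

online = multiperiod_return(w, x)
offline = hindsight_return(x)
print(online, offline)
```

The offline value is an upper bound on what any online strategy can achieve on the same sequence, which is why competitive analysis compares against a restricted class of offline strategies rather than this unrestricted optimum.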

2.1 Model-based Portfolio selection

The model-based multiperiod portfolio selection problem chooses a sequence of portfolios $\{w^T\}$ to maximize the expected utility over end-period return on investment, given a statistical model of stock price dynamics. An optimal model-based portfolio selection strategy for an agent with utility function $U(R)$ over end-period return on investment solves:

$$\max_{\{w^T\}} E_{\{x^T\}} \left[ U\left( \prod_{t=1}^{T} w^t \cdot x^t \right) \right] \qquad (1)$$

where the maximization is taken over all possible sequences of portfolio vectors in [...]

[...] a stock with $\beta_i > 1$ has a higher expected return than the market, but also a higher risk, while a stock with $\beta_i < 1$ has a lower expected return, but also a lower risk. In our generative multivariate Normal distribution CAPM model, CAPM augments the geometric Brownian motion model of stock prices with quantified correlations between the price relatives of stocks. The price relatives $x = (x_1, \ldots, x_N)$ are drawn according to a multivariate Normal distribution, $X \sim N(\mu, \Sigma)$, with mean $\mu = (\mu_1, \ldots, \mu_N)$ and covariance matrix $\Sigma$ (Huang & Litzenberger 1988). We select the vector of mean returns, $\mu$, and the covariance matrix between price relatives across stocks, $\Sigma$, to generate a sequence of stock price returns and market portfolio returns (implied by stock price returns) with statistics that satisfy the linear relation between return and volatility. We assume that the risk-free rate of interest is zero.

5.2 Experimental Details

In our simulation of CAPM we do not explicitly model the effect of the investment actions of the agents in our multiagent portfolio selection system on stock prices. We assume that the investment actions of the group of agents within our model are small with respect to the total trading volume in the market, and that the effect of their trading actions on stock prices can be ignored. The prices continue to form exogenous inputs to our multiagent portfolio selection model. We generate stock prices offline, with statistics that satisfy a wider equilibrium market, and then test the performance of our investment model on the prices.

The price relatives of the stocks are generated from a multivariate Normal distribution, $X \sim N(\mu, \Sigma)$, with a vector of means $\mu = (\mu_1, \ldots, \mu_N)$, and an $N \times N$ covariance matrix $\Sigma$. We assign means $\mu_i$ to each stock from the same distribution as in the Simple market, $\mu_i \sim U(0.9995, 1.01)$. We then search for values in the covariance matrix that complete a multivariate Normal distribution that generates a sequence of $T$ price relatives with statistics that closely approximate the CAPM model (8, 9), such that the excess return of each stock over the market portfolio is proportional to its covariance with the market. The standard deviation of each price relative is constrained to be in the same range as for the Simple market, $\sigma_i \in [0.0, 0.2]$. This process is completed offline, and we compare the performance of all multiagent portfolio selection models on the same sequences of CAPM stock prices.

The multiagent systems are initialized with the same distributions of initial portfolios and learning rates as for the simulations on the Simple market. The switching parameters and performance window size are the same for every agent within a system, and optimized for the number of agents, with a switching probability of $p = 0.004$ and a window size of 200 typical.
We generate 2000 sequences of price relatives that satisfy CAPM, and test the performance of the multiagent portfolio selection systems over 2000 investment periods, for systems with between 1 and 800 agents.
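The sampling step described above can be sketched as follows. This is a minimal illustration under loudly stated assumptions: the covariance matrix is taken to be diagonal (uncorrelated stocks) instead of being found by the covariance search used in our experiments, the number of stocks is chosen arbitrarily, and an equal-weighted portfolio stands in for the market portfolio when estimating each stock's sample beta.

```python
import numpy as np

N, T = 5, 2000                      # N is an arbitrary choice for this sketch
rng = np.random.default_rng(0)

mu = rng.uniform(0.9995, 1.01, size=N)   # mean price relatives, mu_i ~ U(0.9995, 1.01)
sigma = rng.uniform(0.0, 0.2, size=N)    # standard deviations in [0.0, 0.2]
cov = np.diag(sigma ** 2)                # ASSUMPTION: no cross-stock covariance

# One sequence of T price-relative vectors, X ~ N(mu, cov).
x = rng.multivariate_normal(mu, cov, size=T)

# ASSUMPTION: equal weights as a proxy for the market portfolio.
weights = np.full(N, 1.0 / N)
market = x @ weights

# Sample beta of each stock: Cov(x_i, market) / Var(market).
betas = np.array([np.cov(x[:, i], market)[0, 1] for i in range(N)])
betas = betas / np.var(market, ddof=1)
print(betas)
```

By construction the weighted average of the sample betas equals one, which gives a quick sanity check on the estimate. A sequence that satisfies CAPM would additionally show mean excess returns that are approximately linear in these betas, which a diagonal covariance matrix does not guarantee.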

5.3 Results

Figure 12 shows the performance of each multiagent portfolio selection model in a CAPM market, averaged over 2000 trials. The best constant rebalanced portfolio strategy in this market yields an expected end-period log wealth of $\mathrm{Perf}_{\mathrm{BCRP}} = 12.3$, while the market achieves a performance of $\mathrm{Perf}_{\mathrm{Market}} = 5.31$. Figure 13 (b) compares the final wealth of groups of communicating and non-communicating agents in the CAPM market, for groups of 400 agents. While some of the results from the Simple market continue to hold in the CAPM market [1-3], a couple of the earlier results no longer hold [4, 5]. In particular, the main results of our quantitative analysis of the performance of multiagent portfolio selection in the CAPM market are:

[1'] A single adaptive agent continues to outperform a single non-adaptive agent.
[2'] A system of non-adaptive agents continues to outperform a single non-adaptive agent.
[3'] A system of adaptive agents continues to outperform a single adaptive agent in this market.
[4'] A system of communicating and adaptive agents now performs no better than a system of non-communicating agents (see footnote 3). For example, a system of 400 communicating and adaptive agents in CAPM underperforms a system of 400 non-communicating agents as often as it outperforms it, and achieves approximately the same average final wealth (Figure 13 b).
[5'] The market portfolio no longer performs better than the multiagent portfolio selection systems, and even performs worse than a single agent with a random non-adaptive constant rebalanced portfolio.

Figure 12: The performance of the non-adaptive, adaptive non-communicating, and adaptive and communicating agents as a function of the number of agents, in a CAPM market. $\mathrm{Perf}_{\mathrm{BCRP}} = 12.3$; $\mathrm{Perf}_{\mathrm{Market}} = 5.31$. [Plot: mean end-period log wealth against number of agents on a logarithmic axis from $10^0$ to $10^3$; curves: Non-adaptive, Adaptive, Communicating.]

5.4 Analysis

First, we see that the systems of adaptive agents continue to outperform the systems of non-adaptive agents and a single adaptive agent [1'],[2'],[3']. The two fundamental techniques of investing in diverse initial portfolios and allowing individual agents to adapt their portfolios remain complementary. However, we see the effects of introducing additional structure between the return and volatility of stocks: in particular, the market portfolio no longer outperforms the multiagent portfolio selection systems [5'], and the systems of communicating and adaptive agents do not outperform the systems of non-communicating agents [4'].

Footnote 3: There is only weak support for rejecting the null hypothesis that the non-communicating and communicating systems of agents have the same performance, with a minimum significance level of around 0.3 for systems with 50 or more agents.

To understand the difference in performance of the market portfolio and the communicating agents between the Simple market and the CAPM market, we again consider the typical structure of the best CRP. Recall that in the Simple market the best CRP is often a single stock buy-and-hold strategy. The distribution of the value of the maximum component of the best CRP across a random sample of 2000 CAPM market trials shows that there is typically one stock with around 50% of the investment, but we rarely see a single stock receiving more than 80% of the investment (Figure 10 a). This explains why the market portfolio, which can only achieve a performance that is competitive with the best single stock, does not perform as well in the CAPM market as it does in the Simple market [5'].

Paradoxically, although the CAPM market has more structure than the Simple market, the investment problem for an agent with a randomly initialized universal portfolio selection algorithm appears to be easier. In the Simple market model the best CRP is typically at a corner of the simplex of portfolio vectors with non-negative components that sum to one, and a long way from the initial location of the effective portfolio of a set of adaptive agents with random portfolios drawn from the Dirichlet$(1/N, \ldots, 1/N)$ distribution. By comparison, the average distance from the effective initial portfolio of a system of 200 agents to the best CRP is 0.50 in the CAPM market, compared to 0.81 in the non-CAPM (Simple) market (see Figure 10(b) and the initial distances in Figures 9 and 14).

We can see this effect clearly in Figure 14, which compares the rate of convergence of the effective portfolio of each multiagent system towards the best CRP, averaged over 2000 market trials. In the CAPM market the market portfolio is not very effective at selecting the best CRP, and selects a portfolio that is initially worse than that of a system of non-adaptive agents. The system of communicating agents initially selects a portfolio that is a better approximation to the best CRP than the system of non-communicating agents, but the system of non-communicating agents closes the gap towards the end of the period of investment. The adaptive agents in the CAPM market achieve a good performance without communication because their portfolios are initially closer to the best CRP, and most agents are able to perform well [4'].

Figure 13: Distribution of the ratio of the final wealth of 400 adaptive and communicating agents to 400 non-communicating agents, over 2000 market trials. (a) Simple Market Model. Communication improves the final wealth in 75% of the trials, with an average wealth 1.47 times greater. (b) CAPM Market. Communication improves the final wealth in 53% of the trials, with an average wealth 1.05 times greater. [Plots: frequency against Wealth(Communicating) / Wealth(Independent).]

Figure 14: The distance of the effective portfolio of each multiagent system to the best Constant Rebalanced Portfolio (CRP) computed with hindsight, averaged over 2000 market trials. Notice that the Market portfolio performs worse than the best CRP, and that the communicating agents converge at around the same rate as the non-communicating agents. [Plot: average distance to $w_{\mathrm{BCRP}}$ against investment period, 0 to 2000; curves: Non-adaptive, Adaptive, Communicating, Market.]

Another way to analyze the ability of a multiagent portfolio selection system to track the best CRP is to compare the correlation of the final wealth from each system with the wealth of the best CRP. Table 1 shows the correlation of final wealth with the best CRP for each multiagent system and market model. The results provide strong support for our earlier analysis of convergence of effective portfolios to the best CRP. The adaptive agents are able to achieve a correlation of 0.9414 in return on investment with the best CRP strategy in the CAPM market, compared to only 0.3472 in the Simple market. Conversely, the market portfolio achieves a correlation in return with the best CRP of 0.9529 in the Simple market, but only 0.3893 in the CAPM market. The closer proximity of the best CRP to the initial effective portfolio has a considerable effect on the performance of the system, making universal portfolio selection easier.

Table 1: Correlation of final wealth with the best CRP wealth over 2000 trials in the Simple and CAPM markets, for multiagent systems with 200 agents.

Market model   Market portfolio   Non-adaptive agents   Adaptive agents   Communicating and adaptive agents
Simple         0.9529             0.2081                0.3472            0.4282
CAPM           0.3893             0.5871                0.9414            0.9647

Finally, while the average number of agents that switch strategy and the average number of total strategy switches are the same in both markets, the average number of unique agents that are switched to is greater in the CAPM market than in the Simple market. For example, Figure 15 compares the average number of unique agents that are switched to by some agent in a system of 200 communicating agents in the CAPM market and the Simple market. In this example an average of 200 agents switch strategy at some point during the 2000 investment periods, and there are an average of 822 strategy switches over all investment periods, for both the Simple and the CAPM market. However, the average number of unique agents that are switched to in the Simple market is 40, compared to 62 in the CAPM market.
There are fewer agents switched to in the Simple market than in the CAPM market for the same number of total strategy switches. This shows that a smaller number of agents have the best strategy in at least one investment period in the Simple market. The agent with the best strategy in the CAPM market varies more over time because there are many more agents with similarly performing strategies. This provides direct evidence that strategy switching is more effective in the Simple market.

Figure 15: Distribution of the number of unique agents switched to in 835 trials of the Simple and CAPM markets. [Plot: frequency against number of unique agents switched to, 0 to 100; series: CAPM Market, Simple Market.]

From a multiagent perspective the ineffectiveness of communication in the CAPM market is an interesting example of how the geometry of a search space can influence the effectiveness of parallel cooperative search techniques. The CAPM market model is derived under assumptions that investors hold homogeneous beliefs about the future statistics of stock prices. Communication between investors is implicit in the simulated stock prices, and one stock is less likely to stochastically dominate all other stocks. This makes hint exchange and strategy switching redundant because any one agent achieves a reasonably good performance.
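The three switching statistics compared in this section (agents that switch, total switches, and unique agents switched to) can be tallied from a log of switch events, as in the hedged sketch below. The simulation that produces the log is a simplified stand-in and not the switching rule used in our experiments: it assumes that each agent's strategy has a fixed stream of per-period log returns, that recent performance is the sum of log returns over the last `window` periods, and that in each period every agent independently switches with probability `p` to the current best performer. All function and variable names are hypothetical.

```python
import numpy as np

def simulate_switch_log(log_returns, p, window, seed=0):
    """Return a list of (period, agent, target) switch events.

    log_returns: (T, A) array of per-period log returns, one column per
    agent's original strategy (ASSUMPTION: unchanged by switching).
    """
    rng = np.random.default_rng(seed)
    T, A = log_returns.shape
    # cumulative[t] holds the total log return over periods 0 .. t-1.
    cumulative = np.vstack([np.zeros(A), np.cumsum(log_returns, axis=0)])
    log = []
    for t in range(window, T):
        recent = cumulative[t] - cumulative[t - window]  # last `window` periods
        best = int(np.argmax(recent))
        switchers = np.flatnonzero(rng.random(A) < p)
        log.extend((t, int(a), best) for a in switchers if a != best)
    return log

def switch_statistics(log):
    """Return (agents that switched, total switches, unique agents switched to)."""
    agents = {a for _, a, _ in log}
    targets = {b for _, _, b in log}
    return len(agents), len(log), len(targets)

# Illustrative run on made-up returns for 20 strategies over 500 periods.
rng = np.random.default_rng(1)
demo_returns = rng.normal(0.003, 0.05, size=(500, 20))
demo_log = simulate_switch_log(demo_returns, p=0.05, window=50)
print(switch_statistics(demo_log))
```

When a few strategies dominate for long stretches, the third statistic stays small relative to the second, which is the pattern reported above for the Simple market.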

6 Mean-Variance Analysis

It is also interesting to compare the mean-variance efficiency of the market portfolio with the more sophisticated multiagent portfolio selection strategies. In a market that satisfies CAPM, such as the second set of simulated markets, all adequately diversified portfolios, including the market portfolio, will have approximately the same Sharpe Ratio (Sharpe 1970). The Sharpe Ratio for a portfolio is the ratio of excess expected return over the risk-free rate of return, to standard deviation in period-to-period return:

$$r_p = \frac{E_p - R_f}{\sigma_p} \qquad (10)$$

where $E_p$ is the expected per-period return from portfolio $p$, $R_f$ is the risk-free rate of return (zero in our CAPM model), and $\sigma_p$ is the standard deviation in per-period return from portfolio $p$. Equivalently, a plot of the statistics of all efficient portfolios falls onto the Capital Market Line, a linear relation between excess return and variability:

$$E_p = R_f + r \sigma_p \qquad (11)$$

where $r$ is the Sharpe ratio for all diversified portfolios. Figure 16 plots the expected per-period return ("Return") versus standard deviation in per-period return ("Variability") for each multiagent system, averaged over 2000 market trials. We plot one representative point for each type of portfolio selection model, because each multiagent portfolio selection model has almost identical return and variability for all group sizes greater than one. We also plot the return and variability characteristics for the best CRP computed offline. We see that the online portfolio selection models (non-adaptive, adaptive, communicating) all have statistics that fall approximately onto the Capital Market Line, a linear relationship between excess expected return and standard deviation in return. The Capital Market Line passes through $(0, R_f)$, where $R_f$, the return on a risk-free asset, is zero in this case. This supports one of the main results of CAPM, that in equilibrium there exists a linear relationship between variability and expected return for efficient portfolios, and that the market portfolio is itself efficient. The Sharpe ratio of the best CRP is greater than the Sharpe ratio of the online portfolio selection models, consistent with its superior performance.

Figure 16: Expected per-period return versus standard deviation in per-period return for the systems of non-adaptive agents, adaptive agents, adaptive and communicating agents, market portfolio, and best CRP computed offline, averaged over 2000 trials. Each multiagent model has almost identical return and variability for group sizes greater than one, so we plot one representative point for each system. The multiagent systems have statistics that fall approximately onto the Capital Market Line. [Plot: Return, 0 to 0.01, against Variability, 0 to 0.1; points: Non-adaptive, Adaptive, Communicating, Market, Best CRP.]

The differences between return and variability are still sufficient to explain the differences in performance between the market portfolio and the systems of adaptive agents. For example, the market portfolio has a per-period return distributed with $\mu = 0.0077$ and $\sigma = 0.0952$, while a system of 400 adaptive agents has a per-period return distributed with $\mu = 0.0064$ and $\sigma = 0.0694$. The distributions over log end-period return on investment for investment strategies with single period returns that are Normally distributed with these characteristics have means of 6.4 and 8.0 for the market and the adaptive agents respectively, and standard deviations of 4.3 and 3.1. Figure 17 plots the empirical distribution of final log wealth for the market portfolio and a system of 400 adaptive agents in the CAPM market. The distributions are a good fit with the performance that is predicted for an agent with Normally distributed single period returns with the measured mean and standard deviation. The adaptive agents are able to generate favorable return statistics through tracking the best CRP. They achieve a high correlation in return on investment with the best CRP and boost the performance at the tail of the wealth distribution over the market portfolio.
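The arithmetic behind these figures can be checked with a short sketch. It assumes independent per-period returns and uses a second-order approximation, $E[\log(1 + r)] \approx \mu - \sigma^2 / 2$ with standard deviation $\approx \sigma$ per period. This approximation is our illustrative assumption here and not necessarily the calculation used to produce the values quoted above, so the results agree only approximately. The function names are hypothetical.

```python
import math

def sharpe_ratio(mean_return, std_return, risk_free=0.0):
    """Sharpe ratio as in equation (10): (E_p - R_f) / sigma_p."""
    return (mean_return - risk_free) / std_return

def approx_log_wealth(mean_return, std_return, periods):
    """Approximate mean and standard deviation of end-period log wealth.

    ASSUMPTIONS: independent periods; E[log(1 + r)] ~ mu - sigma^2 / 2 and
    Std[log(1 + r)] ~ sigma in each period.
    """
    mean_log = periods * (mean_return - 0.5 * std_return ** 2)
    std_log = std_return * math.sqrt(periods)
    return mean_log, std_log

T = 2000
market = approx_log_wealth(0.0077, 0.0952, T)    # close to the quoted 6.4 and 4.3
adaptive = approx_log_wealth(0.0064, 0.0694, T)  # close to the quoted 8.0 and 3.1

print(market, adaptive)
print(sharpe_ratio(0.0077, 0.0952), sharpe_ratio(0.0064, 0.0694))
```

The sketch makes the mechanism explicit: the market portfolio has the higher per-period mean return, but its larger variance costs more in the $\sigma^2 / 2$ term, so its expected log wealth over 2000 periods is lower than that of the adaptive agents.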

7 Related Work

To the best of our knowledge this is the first work that considers the performance of a system of adaptive agents for the portfolio selection problem. Blum and Kalai describe a randomized approximation to Cover's UNIVERSAL algorithm that invests in a set of random portfolios (Blum & Kalai 1997). A system of non-adaptive agents implements this randomized approximation. However, Blum and Kalai do not consider a set of adaptive agents with random initial portfolios.

There has been previous work on using multiple heuristics to solve other search problems: sequential methods with possible restart (Selman, Levesque, & Mitchell 1992; Luby, Sinclair, & Zuckerman 1993; Johnson et al. 1989; Boese, Kahng, & Muddu 1994); parallel independent methods (Rao & Kumar 1992; Luby & Ertel 1993; Kauffman & Levin 1987; Kornfeld 1981; Huberman, Lukose, & Hogg 1997); and cooperative parallel multiagent search (Knight 1993; Aldous & Vazirani 1994; Hogg & Williams 1993; Clearwater, Huberman, & Hogg 1991). A general theory predicts superlinear speedup in the performance of individual agents when the search methods are diverse and the agents are able to utilize information found in other parts of the search space (Huberman 1990).

Figure 17: Distribution of final log wealth of the market portfolio and a system of 400 adaptive agents in a simulated CAPM market. [Plot: frequency against Log(Final Wealth); series: Market Portfolio, Adaptive Agents.]

Schaerf et al. study communication within a multiagent system for load-balancing, where agents compete for resources, initially choosing a strategy on the basis of local information alone (Schaerf, Shoham, & Tennenholtz 1995). They show that when agents also communicate their beliefs on resource loading to other agents in the system, the system-wide performance falls. The key difference between multiagent load balancing and multiagent portfolio selection is that the resources in load balancing are congestible, subject to capacity constraints: an uncongested resource with a light load that is used by a single agent will not remain uncongested when a large number of agents also select that resource. We assume that the investment actions of agents within our investment group have no effect on the price of stocks: the prices are exogenous to the system, and many agents can follow the same portfolio selection strategy without adversely affecting the value of that strategy. Schaerf et al. find that communication makes the performance of their multiagent system worse because local knowledge encourages diverse strategies and balanced resource-loading. In our stock market domain communication between the agents cannot decrease the performance of an existing strategy.

The agent-based computational economics (ACE) literature includes studies of the dynamics of prices generated endogenously through the actions of many simple interacting agents (LeBaron et al. 1997; Arthur, Durlauf, & Lane 1997; Epstein & Axtell 1996). The goal is to build a market from the "bottom up" in order to understand the connection between simple agent actions and macro price dynamics. Although the CAPM has rarely been the subject of research in experimental economics, Levy (Levy 1997) and Bossaerts et al. (Bossaerts, Kleiman, & Plott 1998) have recently reported results from CAPM experiments. Simple markets are created with the major properties of the CAPM model, and experiments are conducted to test the statistical predictions of CAPM. We assume exogenous prices and do not model the effect of the actions of our group of agents on market prices. In the simulated CAPM market we place our group of agents within the price dynamics of a larger equilibrated market.

8 Conclusions and Future Work

In this paper we have introduced a new multiagent model for portfolio selection that mixes parallel search with hint exchange. The model assumes a system of bounded-rational cooperative agents that pool their initial wealth, each manage a share, and then pool their final wealth. Although we assess the performance of our multiagent portfolio selection systems on stochastic stock market models, we assume that the agents do not have access to these models. We also assume that the investment actions of the system of agents are small with respect to the complete market, and we treat prices as exogenous variables. We measure performance as the expected end-period log return.

The results show that a group of adaptive agents that share initial wealth and invest according to diverse strategies will outperform both a group of non-adaptive agents and a single adaptive agent that invests in isolation, in a simple market model. This shows that two fundamental approaches for approximating the best constant rebalanced portfolio can be combined: individual agent learning and multiagent diversification. Furthermore, we demonstrated that a system of adaptive agents, where each agent reports the recent performance of its portfolio selection strategy and can probabilistically switch to the portfolio strategy that has recently been performing best, will outperform a system of non-communicating agents in a simple market model with no global structure to relate the expected return and volatility of each stock. We show that the performance of each portfolio selection system is directly related to its ability to select the best CRP, and furthermore that the statistical structure of the Simple market allows a market portfolio to perform well, makes individual-agent learning hard, and enables beneficial system-wide co-learning through communication and strategy switching.

When the market statistics have more structure, such as in the CAPM market, we showed that an adaptive multiagent system will still outperform a system of non-adaptive agents and a single adaptive agent. However, communication between the agents becomes redundant, and communicating agents do no better than non-communicating agents. The portfolio selection problem is easier in the CAPM market because the best CRP invests more evenly across all stocks, and is closer in vector space to the initial effective portfolio of a system of adaptive agents with random portfolios. The effect of this is twofold: the system of adaptive agents is able to achieve good performance without communication (and additional communication and strategy switching does not help), and the market portfolio performs badly, although the return characteristics from all models do fall approximately onto the Capital Market Line.

In future work we will investigate how the performance of our cooperative multiagent portfolio selection model scales with the number of stocks in the market space. We also propose further analysis of the micro- and macro-properties of the search algorithm that is implemented by the multiagent portfolio selection model, focusing at the micro-level on the occurrence and frequency of strategy switching between the agents, and at the macro-level on the efficiency of the search algorithm through aggregate portfolio space. More advanced models of hint exchange and strategy switching could be tested.

The portfolio selection problem is a general model that is applicable to many decision theoretic problems. These results are also applicable to economic approaches to hard computational problems, where it has been shown that a suitable portfolio of heuristics can improve the performance of programs for solving very hard problems (Huberman, Lukose, & Hogg 1997). In future work we will investigate the performance of a simple update rule that can discover a portfolio on the efficient frontier without explicitly modeling the performance distributions of individual heuristics. Individual problem solving agents that share good portfolio strategies will converge to a portfolio with the same optimal characteristics as the one we considered in this paper.

9 Appendix

In this Appendix we prove two simple theorems, and then establish a number of optimality properties for the best CRP in a stationary stochastic market. For completeness we repeat some of the notation below. The best constant rebalanced portfolio (CRP) is computed offline to maximize return on investment:

$$w_{BCRP} = \arg\max_{w} \lim_{T \to \infty} \prod_{t=1}^{T} w \cdot x^t \tag{12}$$

where $w = (w_1, \ldots, w_N)$ represents a constant rebalanced portfolio across $N$ stocks, with investment $w_i$ maintained in stock $i$ across all investment periods, $w_i \ge 0$, $\sum_{i=1}^{N} w_i = 1$; $x^t = (x^t_1, \ldots, x^t_N)$ represents the price relatives in period $t$, $x^t_i$ is the ratio of closing price to opening price of stock $i$ in period $t$, and $T$ is the number of investment periods.
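As an illustration of definition (12), the following minimal sketch computes the wealth of a CRP over a finite price sequence and finds the best CRP in hindsight by grid search. The finite horizon (in place of the limit), the restriction to two stocks, the grid resolution, and the helper names `crp_log_wealth` and `best_crp_two_stocks` are all assumptions made for the example and are not part of the model described in the paper.

```python
import math

def crp_log_wealth(w, price_relatives):
    """Log of final wealth from $1 invested in constant rebalanced portfolio w."""
    return sum(math.log(sum(wi * xi for wi, xi in zip(w, x)))
               for x in price_relatives)

def best_crp_two_stocks(price_relatives, grid=1000):
    """Grid search over w = (a, 1 - a); a hypothetical helper for N = 2 only."""
    candidates = [(k / grid, 1 - k / grid) for k in range(grid + 1)]
    return max(candidates, key=lambda w: crp_log_wealth(w, price_relatives))

# Toy sequence: stock 1 alternately doubles and halves, stock 2 stays flat.
xs = [(2.0, 1.0), (0.5, 1.0)] * 10

w_star = best_crp_two_stocks(xs)
best_wealth = math.exp(crp_log_wealth(w_star, xs))
buy_and_hold_wealth = math.exp(crp_log_wealth((1.0, 0.0), xs))
```

On this toy sequence neither stock gains anything when held on its own, while the evenly rebalanced portfolio grows by a factor of 1.125 every two periods, which is why the search selects an interior portfolio.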

Theorem 1. The portfolio that maximizes the expected single period log return in a market with non-negative, independent and identically distributed price relatives, lies on the efficient frontier.

Definition. The efficient frontier is the set of all portfolios that have a smaller variance in single period return on investment than all other portfolios with the same expected return on investment.

We first prove the following lemma:

Lemma 1. Given two portfolios with the same expected single period return, but different period-to-period variance, the portfolio with the smaller variance has the larger expected log return for a market with non-negative, symmetric, independent and identically distributed price relatives.

Proof. This follows from Jensen's inequality, which states that a concave function $f : \mathbb{R} \to \mathbb{R}$ is characterized by the condition that

$$\int f(x)\, dF \le f\left( \int x\, dF \right) \tag{13}$$

for any distribution $F : \mathbb{R} \to [0, 1]$. We assume that the distribution $F$, which represents the distribution over price relatives, is non-negative; then (substituting $f(x) = \log(x)$)

$$E[\log(X)] \le \log(E[X]) \tag{14}$$

Now, since $E[X] = \mu$, we have

$$E[\log(X)] \le \log(\mu) \tag{15}$$

for all $x$ drawn from $F$. When we also assume that the distribution $F$ is symmetric, then $E[\log(X)]$ is strictly monotonic in the standard deviation, $\sigma$, of the distribution (proof omitted). Given that: (1) $\lim_{\sigma \to 0} E[\log(X)] = \log(\mu)$; (2) $E[\log(X)] \le \log(\mu)$; (3) $E[\log(X)]$ is strictly monotonic in standard deviation, then $E[\log(X)]$ must be a strictly decreasing function in $\sigma$. This completes the proof. □

Therefore the portfolio that maximizes the expected single period log return in a market with non-negative, independent and identically distributed price relatives must lie on the efficient frontier, because there can be no portfolio with the same expected single period return but a smaller variance. □
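Lemma 1 can be checked numerically on the simplest symmetric case. The sketch below assumes a two-point distribution, with $X$ equal to $\mu - \sigma$ or $\mu + \sigma$ with equal probability, which has mean $\mu$ and standard deviation $\sigma$. This distribution, the function name `expected_log_symmetric`, and the numeric values are illustrative assumptions, not taken from the paper's market models.

```python
import math

def expected_log_symmetric(mu, sigma):
    """E[log X] when X is mu - sigma or mu + sigma, each with probability 1/2."""
    return 0.5 * (math.log(mu - sigma) + math.log(mu + sigma))

mu = 1.05                          # same expected single period return throughout
sigmas = [0.0, 0.1, 0.2, 0.4]      # increasing standard deviation
values = [expected_log_symmetric(mu, s) for s in sigmas]

# Jensen bound (15): every value is at most log(mu), with equality at sigma = 0.
jensen_bound = math.log(mu)
```

For this distribution $E[\log(X)] = \frac{1}{2}\log(\mu^2 - \sigma^2)$, so the values fall as $\sigma$ grows while the expected return stays fixed, in line with the lemma.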

Theorem 2. A system of agents with universal portfolio selection strategies, that share initial wealth, invest (without strategy switching), and pool final wealth, has a universal overall portfolio selection strategy.

Proof. Consider a group of $M$ agents, each with an equal share of an initial group wealth, assumed (without loss of generality) to be \$1. The final wealth of the group of agents after $T$ periods is given by

$$\sum_{m=1}^{M} \frac{1}{M} \left( \prod_{t=1}^{T} w_m^t \cdot x^t \right) \tag{16}$$

where $w_m^t$ is the portfolio of agent $m$ in period $t$. To prove that the effective portfolio of the system of agents is universal we need to prove (3) that:

$$\lim_{T \to \infty} \min_{\{x^T\}} \left[ \frac{\left( \sum_{m=1}^{M} \frac{1}{M} \prod_{t=1}^{T} w_m^t \cdot x^t \right)^{1/T}}{\left( \prod_{t=1}^{T} w_{BCRP}(\{x^T\}) \cdot x^t \right)^{1/T}} \right] = 1 \tag{17}$$

For any market sequence, $\{x^T\} = x^1, \ldots, x^T$, we require

$$\lim_{T \to \infty} \left[ \frac{\left( \sum_{m=1}^{M} \frac{1}{M} \prod_{t=1}^{T} w_m^t \cdot x^t \right)^{1/T}}{\left( \prod_{t=1}^{T} w_{BCRP}(\{x^T\}) \cdot x^t \right)^{1/T}} \right] \ge 1 \tag{18}$$

Let $\{w_l^T\} = w_l^1, \ldots, w_l^T$ denote the portfolio selection strategy of agent $l$, the agent that achieves the least return on investment of all the agents for a particular stock price sequence, $\{x^T\}$. The portfolio selection strategy of agent $l$ has a final return on investment less than or equal to the return on investment of all other agents $m \in \{1, \ldots, M\}$, $m \ne l$:

$$\prod_{t=1}^{T} w_l^t \cdot x^t \le \prod_{t=1}^{T} w_m^t \cdot x^t \tag{19}$$

Substituting the return on investment of agent $l$ for every agent in (18) gives a performance ratio that is no larger. However, we can prove that this performance ratio is itself strongly competitive:

$$\lim_{T \to \infty} \left[ \frac{\left( \sum_{m=1}^{M} \frac{1}{M} \prod_{t=1}^{T} w_l^t \cdot x^t \right)^{1/T}}{\left( \prod_{t=1}^{T} w_{BCRP}(\{x^T\}) \cdot x^t \right)^{1/T}} \right] \ge 1 \tag{20}$$

because

$$\sum_{m=1}^{M} \frac{1}{M} \left( \prod_{t=1}^{T} w_l^t \cdot x^t \right) = \prod_{t=1}^{T} w_l^t \cdot x^t \tag{21}$$

and the portfolio strategy of agent $l$ is itself universal and satisfies:

$$\lim_{T \to \infty} \left[ \frac{\left( \prod_{t=1}^{T} w_l^t \cdot x^t \right)^{1/T}}{\left( \prod_{t=1}^{T} w_{BCRP}(\{x^T\}) \cdot x^t \right)^{1/T}} \right] \ge 1 \tag{22}$$

Therefore (18) holds for any market sequence, $\{x^T\}$, and the system of $M$ adaptive but non-strategy-switching agents has a universal portfolio selection strategy. This completes our proof. □
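The bounding step in this proof, that the pooled wealth (16) is never below the wealth of the worst agent, can be illustrated with a short simulation. The sketch below is only a check of that inequality under stated assumptions: each agent holds a fixed random portfolio as a stand-in for its strategy (these are not universal strategies), price relatives are drawn uniformly from a hypothetical range, and the names `agent_wealth` and `random_portfolio` are introduced for the example.

```python
import random

def agent_wealth(portfolios, price_relatives):
    """Final wealth of one agent from $1, given its portfolio in each period."""
    wealth = 1.0
    for w, x in zip(portfolios, price_relatives):
        wealth *= sum(wi * xi for wi, xi in zip(w, x))
    return wealth

def random_portfolio(n):
    """A random point on the simplex (hypothetical stand-in for a strategy)."""
    raw = [random.random() for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

random.seed(0)
T, M, N = 50, 4, 3
xs = [[random.uniform(0.8, 1.25) for _ in range(N)] for _ in range(T)]

# Each agent keeps the same portfolio in every period (no strategy switching).
strategies = [[random_portfolio(N)] * T for _ in range(M)]
wealths = [agent_wealth(s, xs) for s in strategies]

pooled = sum(wealths) / M          # equation (16): equal shares of $1
worst = min(wealths)               # agent l in the proof
pooled_growth = pooled ** (1 / T)  # per-period growth of the system
worst_growth = worst ** (1 / T)    # per-period growth of the worst agent
```

Because the average of the agents' wealths cannot fall below their minimum, `pooled_growth` is at least `worst_growth` for any price sequence, which is the comparison the proof uses before appealing to the universality of agent $l$.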

Claim 1. A constant difference in wealth between investment strategy $S$ and the wealth of the best CRP implies that strategy $S$ achieves the same long term per-period growth rate.

Proof.

$$\lim_{T \to \infty} \left[ \log(R_{BCRP}(\{x^T\})) - \log(R_S(\{x^T\})) \right] = C \tag{23}$$

which implies

$$\lim_{T \to \infty} \frac{1}{T} \left[ \log(R_{BCRP}(\{x^T\})) - \log(R_S(\{x^T\})) \right] = 0 \tag{24}$$

which implies

$$\lim_{T \to \infty} \frac{R_S(\{x^T\})^{1/T}}{R_{BCRP}(\{x^T\})^{1/T}} = 1 \tag{25}$$

□
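The step from a constant gap to equal growth rates can be seen with a small calculation. If the gap in log wealth is exactly $C$, the ratio of per-period growth rates is $e^{-C/T}$. The sketch below evaluates that ratio for increasing $T$; the value of `C` and the function name `growth_rate_ratio` are hypothetical choices for the example.

```python
import math

def growth_rate_ratio(log_gap, T):
    """(R_S / R_BCRP)^(1/T) when log R_BCRP - log R_S equals log_gap."""
    return math.exp(-log_gap / T)

C = 2.0  # hypothetical constant gap in log wealth
horizons = (10, 100, 1000, 10000)
ratios = [growth_rate_ratio(C, T) for T in horizons]
```

The ratios increase towards 1 as the horizon grows, so a fixed gap in log wealth becomes negligible in per-period terms.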

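Claim 2. The best CRP, $w_{BCRP}$, also maximizes expected single period log return in a market with independent and identically distributed price relatives.

Proof.

$$
\begin{aligned}
w_{BCRP} &= \arg\max_{w} \lim_{T \to \infty} \prod_{t=1}^{T} w \cdot x^t \\
&= \arg\max_{w} \lim_{T \to \infty} \left( \prod_{t=1}^{T} w \cdot x^t \right)^{1/T} \\
&= \arg\max_{w} \lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} \log w \cdot x^t \\
&= \arg\max_{w} E_X \log w \cdot x
\end{aligned}
$$

□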

Claim 3. The CRP that maximizes expected single period log return also maximizes expected end-period log return, asymptotically for large numbers of investment periods, in a market with independent and identically distributed price relatives.

Proof. Let $\{x^T\} = x^1, \ldots, x^T$ denote a sequence of price relatives vectors, independent and identically distributed across investment periods.

$$
\begin{aligned}
w_{BCRP} &= \arg\max_{w} \lim_{T \to \infty} E_{\{X^T\}} \log \left( \prod_{t=1}^{T} w \cdot x^t \right) \\
&= \arg\max_{w} \lim_{T \to \infty} E_{\{X^T\}} \left( \sum_{t=1}^{T} \log w \cdot x^t \right) \\
&= \arg\max_{w} \lim_{T \to \infty} \sum_{t=1}^{T} E_X \log w \cdot x \\
&= \arg\max_{w} E_X \log w \cdot x
\end{aligned}
$$

□
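The key step in this derivation is that, for independent and identically distributed price relatives, the expected log of the product of per-period returns equals $T$ times the expected single period log return. The sketch below verifies this identity by exact enumeration for a small discrete market. The two-outcome market in `outcomes`, the chosen portfolio, and both function names are assumptions made for the example.

```python
import itertools
import math

# Hypothetical two-stock market: each period independently draws one outcome,
# given as (price relatives, probability).
outcomes = [((2.0, 1.0), 0.5), ((0.5, 1.0), 0.5)]

def single_period_expected_log(w):
    """E_X[log w.x] for one investment period."""
    return sum(p * math.log(sum(wi * xi for wi, xi in zip(w, x)))
               for x, p in outcomes)

def end_period_expected_log(w, T):
    """E[log prod_t w.x^t], enumerating every length-T outcome sequence."""
    total = 0.0
    for seq in itertools.product(outcomes, repeat=T):
        prob = math.prod(p for _, p in seq)
        log_wealth = sum(math.log(sum(wi * xi for wi, xi in zip(w, x)))
                         for x, _ in seq)
        total += prob * log_wealth
    return total

w = (0.5, 0.5)
T = 6
lhs = end_period_expected_log(w, T)
rhs = T * single_period_expected_log(w)
```

Since the two quantities differ only by the factor $T$, a portfolio that maximizes one also maximizes the other, for this finite horizon as well as in the limit.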

Claim 4. The CRP that maximizes expected single period log return also maximizes expected end-period log return for any number of investment periods in a market with independent and identically distributed price relatives.

Proof.

$$
\begin{aligned}
w_{BCRP} &= \arg\max_{w} E_{\{X^T\}} \log \left( \prod_{t=1}^{T} w \cdot x^t \right) \\
&= \arg\max_{w} E_{\{X^T\}} \left( \sum_{t=1}^{T} \log w \cdot x^t \right) \\
&= \arg\max_{w} \sum_{t=1}^{T} E_X \log w \cdot x \\
&= \arg\max_{w} E_X \log w \cdot x
\end{aligned}
$$

□

Claim 5. The effective portfolio of a system of agents is computed as the weighted average of the portfolios held by each agent in the system, where the weights are the wealth of each agent.

Proof. Consider a system of $M$ agents at the start of investment period $t$; agent $i$ has portfolio $w_i^t$ and wealth $\mathit{wealth}_i^t$. The total wealth at the end of period $t$, given price relatives $x^t$, is:

$$\mathit{wealth}^{t+1} = \sum_{i=1}^{M} \mathit{wealth}_i^t \left( w_i^t \cdot x^t \right) \tag{26}$$

The overall return on investment (dividing by total wealth at the start of period $t$) is

$$
\begin{aligned}
\frac{\mathit{wealth}^{t+1}}{\sum_{i=1}^{M} \mathit{wealth}_i^t} &= \sum_{i=1}^{M} \frac{\mathit{wealth}_i^t}{\sum_{j=1}^{M} \mathit{wealth}_j^t} \left( w_i^t \cdot x^t \right) \\
&= \left( \sum_{i=1}^{M} \frac{\mathit{wealth}_i^t}{\sum_{j=1}^{M} \mathit{wealth}_j^t}\, w_i^t \right) \cdot x^t
\end{aligned}
$$

Therefore the effective portfolio, $w_e^t$, of the system of agents is the weighted average of each portfolio, weighted by the wealth of each agent:

$$w_e^t = \sum_{i=1}^{M} \frac{\mathit{wealth}_i^t}{\sum_{j=1}^{M} \mathit{wealth}_j^t}\, w_i^t \tag{27}$$

□
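Claim 5 can be checked directly on a small example. The sketch below computes the wealth-weighted average portfolio of equation (27) and confirms that its single period return equals the overall return of the system from equation (26). The three agents, their wealths and portfolios, the price relatives, and the helper names `effective_portfolio` and `dot` are all hypothetical values chosen for illustration.

```python
def dot(w, x):
    """Single period return of portfolio w given price relatives x."""
    return sum(wi * xi for wi, xi in zip(w, x))

def effective_portfolio(wealths, portfolios):
    """Wealth-weighted average of the agents' portfolios, as in equation (27)."""
    total = sum(wealths)
    n_stocks = len(portfolios[0])
    return [sum(wl * w[j] for wl, w in zip(wealths, portfolios)) / total
            for j in range(n_stocks)]

# Hypothetical system of three agents investing in two stocks.
wealths = [2.0, 1.0, 1.0]
portfolios = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
x = [1.2, 0.9]

w_e = effective_portfolio(wealths, portfolios)

# Equation (26): total wealth at the end of the period, then overall return.
next_wealth = sum(wl * dot(w, x) for wl, w in zip(wealths, portfolios))
system_return = next_wealth / sum(wealths)
effective_return = dot(w_e, x)
```

Here the effective portfolio is (0.625, 0.375), and both `system_return` and `effective_return` evaluate to 1.0875 up to floating-point rounding.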

References

Aldous, D., and Vazirani, U. 1994. "Go with the winners" algorithms. In Proc. of the 35th Symp. on Found. of Comp. Sci., 492-501.
Algoet, P. H., and Cover, T. M. 1988. Asymptotic optimality and asymptotic equipartition properties of log-optimum investment. The Annals of Probability 16(2):876-898.
Algoet, P. 1992. Universal schemes for prediction, gambling and portfolio selection. The Annals of Probability 20(2):901-941.
Arthur, W. B.; Durlauf, S.; and Lane, D., eds. 1997. The Economy as an Evolving Complex System II. Addison-Wesley.
Bertsekas, D. P. 1987. Dynamic Programming: Deterministic and Stochastic Models. Prentice-Hall.
Black, F.; Jensen, M. C.; and Scholes, M. 1972. The Capital Asset Pricing Model: Some empirical tests. In Jensen, M. C., ed., Studies in the Theory of Capital Markets. Praeger, NY.
Blum, A., and Kalai, A. 1997. Universal portfolios with and without transaction costs. In Proceedings of the Tenth Annual Conference on Computational Learning Theory, 309-313.
Boese, K. D.; Kahng, A. B.; and Muddu, S. 1994. A new adaptive multi-start technique for combinatorial global optimizations. Operations Research Letters 16:101-113.
Borch, K. H. 1968. The Economics of Uncertainty. Princeton University Press.
Bossaerts, P.; Kleiman, D.; and Plott, C. 1998. Experimental tests of the CAPM as a model of equilibrium in financial markets. Technical Report Social Science Working Paper 1032, California Institute of Technology.
Campbell, J. Y.; Lo, A. W.; and MacKinlay, C. 1997. The Econometrics of Financial Markets. Princeton University Press.
Clearwater, S. H.; Huberman, B. A.; and Hogg, T. 1991. Cooperative solution of constraint satisfaction problems. Science 254:1181-1183.
Cover, T. M., and Gluss, D. H. 1986. Empirical Bayes stock market portfolios. Advances in Applied Mathematics 7:170-181.
Cover, T. M., and Ordentlich, E. 1996. Universal portfolios with side information. IEEE Transactions on Information Theory 42(2):348-363.
Cover, T. M. 1991. Universal portfolios. Mathematical Finance 1(1):1-29.
Dixit, A. K., and Pindyck, R. S. 1994. Investment under Uncertainty. Princeton University Press.
Epstein, J. M., and Axtell, R. 1996. Growing Artificial Societies: Social Science from the Bottom Up. MIT Press/Brookings.
Helmbold, D. P.; Schapire, R. E.; Singer, Y.; and Warmuth, M. K. 1995. A comparison of new and old algorithms for a mixture estimation problem. In Proceedings of the Eighth Annual Conference on Computational Learning Theory, 69-78.
Helmbold, D. P.; Schapire, R. E.; Singer, Y.; and Warmuth, M. K. 1998. On-line portfolio selection using multiplicative updates. Mathematical Finance 8(4):155-177.
Hogg, T., and Williams, C. P. 1993. Solving the really hard problems with cooperative search. In Proc. 11th National Conference on Artificial Intelligence (AAAI-93), 231-236.
Huang, C., and Litzenberger, R. H. 1988. Foundations for Financial Economics. North-Holland.
Huberman, B. A.; Lukose, R. M.; and Hogg, T. 1997. An economics approach to hard computational problems. Science 275:51-54.
Huberman, B. A. 1990. The performance of cooperative processes. Physica D 42:38-47.
Irani, S., and Karlin, A. R. 1997. Online computation. In Hochbaum, D. S., ed., Approximation Algorithms for NP-Hard Problems. PWS Publishing. chapter 13, 521-564.
Johnson, D. S.; Aragon, C. R.; McGeoch, L. A.; and Schevon, C. 1989. Optimization by simulated annealing: an experimental evaluation. Part I, graph partitioning. Operations Research 37:865-892.
Kauffman, S., and Levin, S. 1987. Toward a general theory of adaptive walks on rugged landscapes. Journal of Theoretical Biology 128:11-45.
Kivinen, J., and Warmuth, M. K. 1997. Additive versus exponentiated gradient updates for linear prediction. Journal of Information and Computation 132(1):1-64.
Knight, K. 1993. Are many reactive agents better than a few deliberative ones? In Proc. 13th International Joint Conference on Artificial Intelligence (IJCAI-93), 432-437.
Kornfeld, W. A. 1981. The use of parallelism to implement a heuristic search. In Proc. 7th International Joint Conference on Artificial Intelligence (IJCAI-81), 575-580.
LeBaron, B.; Arthur, W. B.; Holland, J. H.; Palmer, R.; and Tayler, P. 1997. Asset pricing under endogenous expectations in an artificial stock market. In Arthur et al. (1997).
Levy, H. 1997. Risk and return: An experimental analysis. International Economic Review 38:119-149.
Luby, M., and Ertel, W. 1993. Optimal parallelization of Las Vegas algorithms. Technical Report TR-93-041, International Computer Science Institute, Berkeley, CA.
Luby, M.; Sinclair, A.; and Zuckerman, D. 1993. Optimal speedup of Las Vegas algorithms. Technical Report TR-93-010, International Computer Science Institute, Berkeley, CA.
Markowitz, H. M. 1959. Portfolio Selection. Wiley, New York.
Merton, R. C. 1997. Continuous-Time Finance. Blackwell, MA.
Rao, V., and Kumar, V. 1992. On the efficiency of parallel backtracking. IEEE Trans. on Parallel and Dist. Systems.
Samuelson, P. A. 1969. Lifetime portfolio selection by dynamic stochastic programming. Review Econom. Statist. 51:239-246.
Schaerf, A.; Shoham, Y.; and Tennenholtz, M. 1995. Adaptive load balancing: A study in multiagent learning. Journal of Artificial Intelligence Research 2:475-500.
Selman, B.; Levesque, H.; and Mitchell, D. 1992. A new method for solving hard satisfiability problems. In Proc. 10th National Conference on Artificial Intelligence (AAAI-92), 440-446.
Sharpe, W. F. 1970. Portfolio Theory and Capital Markets. McGraw-Hill.