Communications in Statistics - Simulation and Computation

Journal: Communications in Statistics - Simulation and Computation
Manuscript ID: LSSP-2010-0019
Manuscript Type: Review Article
Date Submitted by the Author: 20-Jan-2010
Complete List of Authors: Macedo, Pedro (Universidade de Aveiro); Scotto, Manuel (Universidade de Aveiro); Silva, Elvira (Universidade do Porto)
Keywords: maximum entropy, robust regression, collinearity, outliers, linear regression

A GENERAL CLASS OF ESTIMATORS FOR THE LINEAR REGRESSION MODEL AFFECTED BY COLLINEARITY AND OUTLIERS

Abstract: In this paper, a general class of estimators for the linear regression model affected by outliers and collinearity is introduced and studied in some detail. This class of estimators combines the theory of light, maximum entropy and robust regression techniques. Our theoretical findings are illustrated through a Monte Carlo simulation study.
Note: The following files were submitted by the author for peer review, but cannot be converted to PDF. You must view these files (e.g. movies) online. Resubmission ID LSSP-2010-0019_Previous Manuscript ID LSSP-2009-0210.R1.zip
URL: http://mc.manuscriptcentral.com/lssp E-mail: [email protected]
REPLY TO REVIEWERS’ COMMENTS ON MANUSCRIPT ID LSSP-2009-0210.R1

We would like to start by expressing our gratitude to the referee, who offered, in this second revision, valuable comments, corrections and suggestions for improving the paper. As an introductory note, we would like to stress that the title of the paper has been slightly changed: “collinearity and/or outliers” has been replaced by “collinearity and outliers”.
IMPORTANT NOTE:
All changes made to the previous version of the paper in response to the referee’s comments are identified in red font.
Reviewer(s)' Comments to Author:
“paper page 2: The sentence "The principle of ME it is often used to obtain the best possible predictions on the availability of historical data" is a little unclear. Is the idea that some people are wondering if some historical data exist and they use ME to get a prediction of whether or not the data exist? Or is it to make predictions based on historical data? In either case it would be nice to give a reference to some work where ME is used in this way. (Also perhaps "ME it is often used" should be replaced by "ME is often used".).” The sentence has been rewritten to read:
“The principle of ME is often used for solving ill-posed and underdetermined inverse problems in physics, biology, medicine, communication engineering (e.g. image reconstruction in tomographies, satellite images, search engines), statistics (e.g. statistical inference), and economics (e.g. efficiency and productivity analysis). Examples of applications of the principle of ME can be found in Paris and Howitt (1998), Miller and Horn (1998), Lence and Miller (1998), Fraser (2000), Golan and Dose (2001), Campbell et al. (2008) and Vinod and López de Lacalle (2009) among others.”
“paper page 5: Perhaps "Given heed to this problem, Paris ..." should be "Giving heed to this problem, Paris ...” Done.
“paper page 5: Equation (2) defines H. But then what does the "min" mean? First of all, shouldn't it be "arg min" instead of "min"? (It might help the reader to state explicitly that the minimum is to be taken jointly over p, L, and u.) The equation asserts that the minimum is equal to $p'_{\beta} \ln p_{\beta} + L_{\beta} \ln L_{\beta} + u' u$. Is that really what the minimum equals? Perhaps equation (2) should merely define H. A second equation or sentence can define the MEL estimators.” Please see the answer to the next comment.
“paper page 6: Again in (4) I think it is "arg min" that's needed, not "min". Are the H's in (4) the same as each other? Are they the same as the H in (2)? In (2) H is a function of 3 arguments. In (4) both H's are written as functions of just one argument.” The referee is right: expressions (2) and (4) in the previous version of the paper might lead to confusion. This point is important and deserved to be highlighted more clearly; we have clarified it in this revised version of the paper. The source of the confusion is that we tried to use the same notation as Paris (2001, 2004). In the maximum entropy literature, the notation “H(.)” is used to define an entropy measure. However, the same notation “H(.)” was used by Paris (2001) to denote a function of the residuals. In this revised version of the paper, the notation has been modified and, in addition, we have tried to explain the meaning of both expressions more clearly.
“paper page 9: If different definitions of scale and affine equivariance are in use, then the authors need to specify what definitions they are using.”
We agree with the referee: this sentence was indeed not clear. We would like to point out that both we and Paris (2004) use the same scale and affine equivariance definitions, although Paris (2004) does not use the terms “scale” and “affine”. In view of the fact that the sentence does not add anything new, we decided to eliminate it from the paper.
“paper page 10: Shouldn't it be "were replaced" instead of "was replaced"?” The referee is right. The sentence has been rewritten to read:
“[…] the singular values in the diagonal matrix obtained from the decomposition were replaced with an appropriate [...]”.
“paper page 11: On the previous page we read that X is 40 x 3. On page 11 we see the expression "X_{1}^{-1}" which seems to imply that X_{1} is square. I do not see X_{1} defined before p. 11. What is X_{1}?”
The referee is right. In this revised version of the paper the definition has been corrected.
“paper page 15: I'm still troubled by the idea of comparing coefficients in a model without intercept to those in a model with intercept. In the response to the referee the authors write "In the Portland cement model, the comparison is not made directly with the estimates of the model without intercept, but can be made 'approximately'." They then display another way of writing the model showing the relationship between the coefficients in the models with and without intercept. (Why didn't they put this into the body of the paper?) I guess the coefficients are "approximately" equal because they differ by \beta_{0}/100 and that is supposed to be small. However, if \beta_{0} is 10^10 then \beta_{0}/100 is not small but their equation is still valid providing \beta_{1}, ... \beta_{4} are adjusted accordingly.”
In this revised version of the paper the example has been removed. In the first revision of the article we decided to keep it since this data set has been widely analyzed in several papers published in this journal (Communications in Statistics – Simulation and Computation).
A GENERAL CLASS OF ESTIMATORS FOR THE LINEAR REGRESSION MODEL AFFECTED BY COLLINEARITY AND OUTLIERS
Pedro Macedo and Manuel Scotto
Departamento de Matemática, Universidade de Aveiro
Campus Universitário de Santiago, 3810-193 Aveiro, Portugal
[email protected] and [email protected]

Elvira Silva
Faculdade de Economia, Universidade do Porto
Rua Dr. Roberto Frias, 4200-464 Porto, Portugal
[email protected]
Key Words: maximum entropy; robust regression; collinearity; outliers; linear regression. Mathematics Subject Classification: 94A17; 62J05.
Abstract

In this paper, a general class of estimators for the linear regression model affected by outliers and collinearity is introduced and studied in some detail. This class of estimators combines the theory of light, maximum entropy and robust regression techniques. Our theoretical findings are illustrated through a Monte Carlo simulation study.
Corresponding author: Pedro Macedo (E-mail address: [email protected]). Address: Universidade de Aveiro, Departamento de Matemática, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal.
1 Introduction
The Maximum Entropy (in short ME) formalism was first established by Jaynes (1957a,b) based on physics (Shannon entropy and statistical mechanics) and statistical inference. In a linear pure inverse problem, the ME principle provides the probability distribution for which the current state of knowledge is sufficient to determine the probability assignment. The principle of ME is often used for solving ill-posed and underdetermined inverse problems in physics, biology, medicine, communication engineering (e.g. image reconstruction in tomography, satellite images, search engines), statistics (e.g. statistical inference), and economics (e.g. efficiency and productivity analysis). Examples of applications of the principle of ME can be found in Paris and Howitt (1998), Miller and Horn (1998), Lence and Miller (1998), Fraser (2000), Golan and Dose (2001), Campbell et al. (2008) and Vinod and López de Lacalle (2009), among others. The works of Kullback and Leibler (1951), Kullback (1954, 1959), and Lindley (1956) were also fundamental in connecting the areas of ME and information theory with statistical inference. Furthermore, in recent years many authors, among which Levine and Tribus (1979), Skilling (1984), Donoho et al. (1992), Golan et al. (1996) and Gamboa and Gassiat (1997), have made efforts to render ME more understandable (and also much less controversial) to a wider audience; see Golan (2002, 2007) and the references therein.

The Generalized Maximum Entropy (hereafter GME) estimator proposed by Golan et al. (1996) contributed to the development of the ME econometrics literature in recent years. In view of the fact that, in econometrics and statistics, ill-posed and underdetermined real-world problems seem to be the rule rather than the exception, the GME estimator has acquired special importance in the toolkit of econometric techniques, by allowing econometric formulations free of restrictive and unnecessary assumptions. In particular, this estimator is widely used in linear regression models in which (a) the design matrix is ill-conditioned (collinearity), (b) the number of unknown parameters exceeds the number of observations, and (c) the sample is small. Golan et al. (1996) specify a range of ill-posed or underdetermined
problems where GME is used not only in linear models but also in nonlinear models. The GME estimator is based on Jaynes’ work, which uses the Shannon entropy measure, and selects the probabilities that are the most uniform while satisfying the observed information. More recently, Golan and Perloff (2002) introduced the GME-α class of estimators based on Rényi and Tsallis entropies. Golan and Perloff (2002) demonstrated that certain members of this class of estimators may outperform the GME estimator. However, the main weakness of the GME class of estimators is that support intervals (exogenous information not always available) for the parameters and error vectors are needed. Those supports are defined as closed and bounded intervals in which each parameter or error is restricted to lie. The optimal probability vectors selected along with these supports allow us to obtain point estimates for both the parameters and the errors of the linear regression model. Giving heed to the problem of the definition of support intervals, Paris (2001) developed the Maximum Entropy Leuven (in short MEL) estimator based on the Theory of Light (TL) of Feynman (1985) and the GME estimator. The MEL estimator is generated using the Shannon entropy measure and the Ordinary Least Squares (OLS) estimator. Based on the MEL estimator as well as on ME and robust regression techniques, we introduce a general class of estimators denoted by Maximum Entropy Robust Regression Group (in short MERG) estimators. Considering the same framework as in the MEL estimator, we define a general expression in which the Rényi and Tsallis entropies can also be applied. Furthermore, different estimators based on robust regression, namely the Least Trimmed Squares (LTS) estimator, the Least Absolute Deviations (LAD) estimator, and the Least Median of Squares (LMS) estimator, are considered (e.g. Rousseeuw (1984), Rousseeuw and Leroy (1987), Ellis and Morgenthaler (1992)). It is worth mentioning that several MERG estimators can be defined by merging other entropy expressions and/or other robust estimators.

At this point one question arises: why MERG estimators? Different reasons justify this approach. First, Paris (2001) proved that the MEL estimator is useful in dealing with the linear regression model affected by collinearity. Nonetheless, the results in Section 5 show
that the MERG estimators (which include the MEL as a special case) outperform some traditional competitors when considering linear regression models affected by outliers and collinearity. Secondly, the MERG estimators allow us to incorporate Rényi and Tsallis entropies, whereas the MEL estimator only uses the Shannon entropy measure. In doing so, the idea is to explore the results obtained by Golan and Perloff (2002), who illustrated that the GME-α class has lower mean squared error than the GME for some entropy values in a set of experiments involving small samples, outliers and collinearity in the linear regression model. Third, the MEL estimator only considers the OLS estimator in the objective function. With the MERG estimators, different methods widely used in robust regression can also be applied. For example, one advantage of the LAD estimator over the OLS estimator is its robustness in the presence of outliers in the response variable. Moreover, in the case without outliers in the explanatory or independent variables, the breakdown point of LAD is positive (e.g. Ellis and Morgenthaler (1992)). In contrast, however, the LAD estimator is very sensitive to the presence of outliers in the independent variables. Another weakness is the efficiency loss under Normal disturbances, though this estimator is more efficient than the OLS estimator in other cases. For a detailed analysis of the advantages and weaknesses of the LTS and LMS estimators when compared with the OLS estimator, see Rousseeuw and Leroy (1987), Huber (2004) and Maronna et al. (2006).

The rest of the paper is organized as follows: for completeness and the reader’s convenience, the MEL estimator is explained in some detail in Section 2. In Section 3, the MERG class of estimators is introduced. Section 4 is devoted to discussing some properties of this class of estimators. In Section 5, the results are illustrated through a Monte Carlo simulation study. Finally, in Section 6, some concluding remarks are given.
2 The Maximum Entropy Leuven estimator
Consider the general linear model

y = Xβ + u,    (1)

where y denotes a (N × 1) vector of noisy observations, β is a (K × 1) vector of unknown parameters, X is a known (N × K) matrix of explanatory variables and u is the (N × 1) vector of spherical disturbances, i.e., E[u|X] = 0 and E[uu′|X] = σ²I. One of the most important challenges when dealing with model (1) is the estimation of β under severe collinearity and/or outliers. It is well-known that collinearity is responsible for inflating the variance associated with the regression coefficients and may also affect the signs of the coefficients, as well as statistical inference. Furthermore, outliers also have a potential influence on the estimates provided by the OLS estimator. Since traditional estimators developed to solve collinearity problems (e.g. Ridge Regression (RR) (e.g. Hoerl and Kennard (1970) and Hoerl et al. (1975)), GME and Liu-type (e.g. Liu (1993) and Liu (2003)) estimators) usually depend on exogenous information, these techniques perform poorly in some practical situations. Addressing this problem, Paris (2001) introduced the MEL estimator, which is based on the GME estimator and the TL, as

argmin_{pβ, Lβ, r}  pβ′ ln pβ + Lβ ln Lβ + r′r    (2)

subject to

y = Xβ + r,
Lβ = β′β,
pβ = (β ⊙ β) / Lβ,
0 ≤ pβ ≤ 1,

where ⊙ indicates the element-by-element Hadamard product and r′r is the sum of squared residuals. Note that the MEL estimator considers the Shannon entropy measure in the objective function. As in the TL, where the probability of a photomultiplier being hit by a photon reflected from a sheet of glass equals the square of its amplitude, the probability of
βk is given by the square of Δ (i.e., the amplitude or normalized dimensionality), where

Δ = βk / √Lβ.    (3)
The MEL estimator is inspired by the TL, and its corresponding analogy with statistical inference has been discussed in Paris (2001, p. 3). By introducing Hölder norms, it follows easily from (3) that βk is normalized with p = 2 (i.e., the Euclidean norm). Paris (2001) proved that the MEL estimator avoids the main criticism usually made of the GME estimator, i.e., the requirement of subjective information to define support intervals for β and u, exogenous information which is, in general, not available. For further details concerning the choice of the supports see Golan et al. (1996) and Caputo and Paris (2008). More recently, Mishra (2004b) introduced the Modular Maximum Entropy Leuven (MMEL) estimator, in which βk is normalized by taking p = 1 (i.e., the absolute norm). Both Paris (2001) and Mishra (2004b) illustrated the good performance of their corresponding estimators under collinearity using a Monte Carlo simulation study. Moreover, both estimators outperform the OLS estimator under collinearity, and it seems that the MMEL estimator performs better than the MEL estimator in some cases. Mishra (2004a) introduced a new model, which falls into the class of the MEL family of estimators, for which it is also possible to minimize the sum of absolute residuals, depending on the choice of different parameters, rather than the sum of squared residuals, as in the MEL estimator.
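To make the structure of (2) concrete, the following is a minimal numerical sketch, not the authors’ actual implementation: after substituting the constraints Lβ = β′β, pβ = (β ⊙ β)/Lβ and r = y − Xβ, the MEL objective depends on β alone and can be minimized with a general-purpose optimizer. The function names and the toy data are illustrative assumptions.

```python
# Hypothetical sketch of the MEL estimator (2) after substituting the
# constraints, so the objective is a function of beta alone.
import numpy as np
from scipy.optimize import minimize

def mel_objective(beta, X, y):
    L = max(beta @ beta, 1e-12)          # L_beta = beta' beta (guarded near 0)
    p = beta**2 / L                      # probabilities, sum to 1 by construction
    r = y - X @ beta                     # residuals
    plogp = np.sum(np.where(p > 0, p * np.log(np.clip(p, 1e-300, None)), 0.0))
    return plogp + L * np.log(L) + r @ r # Shannon entropy terms + r'r

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
beta_true = np.array([0.3, 0.2, 0.1])
y = X @ beta_true + rng.normal(scale=0.1, size=40)

ols = np.linalg.lstsq(X, y, rcond=None)[0]   # OLS as a starting point
res = minimize(mel_objective, ols, args=(X, y), method="Nelder-Mead")
print(res.x)
```

Starting the search at the OLS solution is a pragmatic choice here; the paper itself uses a sequential quadratic programming routine (see the remark at the end of Section 3).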
3 Maximum Entropy Robust Regression Group estimators
In this section, we introduce the class of Maximum Entropy Robust Regression Group (MERG) estimators as

argmin_{pβ, Lβ, r}  Σ_{k=1}^{K} H1(pβk) + H2(Lβ) + Σ_{n=1}^{N} φ(rn)    (4)

subject to

yn = Σ_{k=1}^{K} xnk βk + rn,
Lβ = Σ_{k=1}^{K} βk²,
pβk = βk² / Lβ,

where 0 ≤ pβk ≤ 1 and φ(rn) is a function of the regression residuals. Functions
H1(pβk) and H2(Lβ) are both entropy measures (e.g. Shannon, Rényi or Tsallis entropies). However, since H2(Lβ) is included in the objective function only to prevent the overflow of the Lβ parameter, we define, for simplicity, the MERG estimators by considering H2(Lβ) = Lβ ln Lβ (using the Shannon entropy measure). It might be of interest, for example, to compare the MERG estimators with the regularization methods presented in Donoho et al. (1992), Tibshirani (1996), Gamboa and Gassiat (1997) and Hastie et al. (2009). The discussion provided in Golan (2002) is also important to better understand the connections among the different information-based theoretical methods. Following the TL, we always consider the Euclidean norm, as in the MEL estimator. By considering different specifications for the components in (4), several MERG estimators are obtained; see Table I below, where the subscripts “R” and “T” mean that the Rényi and Tsallis entropies, respectively, are used.
(Table I about here)
Table I does not contain the function H2(Lβ) because it is defined by the Shannon entropy measure for all MERG estimators. Note that the MERG1 estimator is the MEL estimator, or the MMEL estimator proposed by Mishra (2004b) if βk is normalized with p = 1. In this case,

Lβ = Σ_{k=1}^{K} |βk|   and   pβk = |βk| / Lβ.    (5)

Obviously, the OLS estimator is obtained by removing the entropy components. In MERG2, the sum in φ(rn) has to be adapted, since r²(n:N) represents the ordered squared residuals after
removing a proportion, say ρ, of the largest squared residuals. Note that the LTS estimator (Rousseeuw (1984)) falls into this group if the entropy components are removed. The LTS estimator can be obtained using

Σ_{n=1}^{h} r²(n:N),    (6)

for h = (1 − ρ)N + 1. The MERG3 group contains, as a special case, the LAD estimator (by removing the entropy components) and is basically the same as the case k1 = 1 in the model proposed by Mishra (2004a). Finally, keeping in mind that the goal of MERG4 is
to minimize the median of squared residuals (instead of the sum), the sum in φ(rn ) has to be removed. Note that the LMS estimator (Rousseeuw (1984)) is obtained by removing the entropy components.
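As an illustration, the trimmed residual component (6) can be sketched as follows (the helper name, ρ and the residual vector are chosen for illustration only):

```python
# Sketch of the trimmed component (6) used by the MERG2 group (and by the
# LTS estimator when the entropy terms are dropped): the squared residuals
# are ordered and a proportion rho of the largest ones is discarded.
import numpy as np

def lts_component(r, rho):
    """Sum of the h smallest squared residuals, h = (1 - rho) * N + 1."""
    r2 = np.sort(r**2)                   # ordered squared residuals r^2_(n:N)
    h = int((1.0 - rho) * len(r)) + 1
    return r2[:min(h, len(r))].sum()

r = np.array([0.05, 0.1, -0.2, 5.0])     # one gross outlier
print(lts_component(r, rho=0.5))         # the outlier's squared residual is trimmed
```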
Regarding the other groups of estimators, the difference lies in the entropy measures considered in (4). Moreover, as in the MERG2 group, the sum in φ(rn) has to be adapted in the MERG2R and MERG2T groups. Furthermore, as in the MERG4 group, in the MERG4R and MERG4T groups the sum in φ(rn) has to be removed. The choices of q and α in these MERG estimators depend on the characteristics of the Tsallis and Rényi entropies, as well as on the contribution of the higher and lower probability events to the entropy value (e.g. Golan and Perloff (2002)). For example, for q > 1 and α > 1, the events with higher probability contribute more to the value of the entropy than the lower probability events (for the Shannon measure, the lowest and highest probability events have a small contribution to the entropy value); see Golan and Perloff (2002, pp. 205-209) for further details.
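For reference, the three entropy families mentioned above can be sketched in their standard forms (the exact parameterization used in Table I is not reproduced here, so these definitions are assumptions): Shannon H(p) = −Σ p ln p, Rényi Hα(p) = ln(Σ p^α)/(1 − α), and Tsallis Hq(p) = (1 − Σ p^q)/(q − 1). Both generalized measures recover the Shannon entropy as α, q → 1.

```python
# Standard-form entropy measures that H1 in (4) may take (assumed forms).
import numpy as np

def shannon(p):
    p = p[p > 0]                               # convention: 0 ln 0 = 0
    return -np.sum(p * np.log(p))

def renyi(p, alpha):
    return np.log(np.sum(p**alpha)) / (1.0 - alpha)

def tsallis(p, q):
    return (1.0 - np.sum(p**q)) / (q - 1.0)

p = np.array([0.7, 0.2, 0.1])
print(shannon(p))
# For alpha, q close to 1 both generalized entropies approach the Shannon value.
print(renyi(p, 1.0001), tsallis(p, 1.0001))
```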
Other MERG estimators can be defined by merging other entropy expressions (e.g. Taneja (1989)) and/or other robust estimators (e.g. Rousseeuw and Leroy (1987) and Ryan (2009)). With this new general class of estimators we try to harness the potential of all the methods (partially) employed (the GME, GME-α, MEL, LAD, LTS and LMS estimators) and thus to create a group of estimators with high performance when dealing with regression analysis exhibiting collinearity and outliers, and with small sample sizes (for example, the GME estimator is
recognized as one method that provides good estimates based on small data sets).
Finally, an important remark: as for the MEL estimator, the MERG estimators do not possess a closed-form representation, and the solutions must be found numerically by means of nonlinear optimization techniques. In order to study the structure of these estimators, we used the Lagrangian functions with the corresponding Karush-Kuhn-Tucker (KKT) conditions. An optimization program based on the quasi-Newton sequential quadratic programming method has been employed in our simulation study. The function used solves a quadratic programming subproblem at each iteration, updates an estimate of the Hessian matrix of the Lagrangian function and, finally, performs a line search using a merit function. It is possible, however, to improve this optimization procedure and thereby probably improve the results presented in the simulation study. In some cases, when faced with convergence problems and feasible but non-optimal solutions, we also used an iterative method based on the Nelder-Mead direct search and the GAMS software with different NLP solvers.
4 Properties of MERG estimators
Since the MERG estimators are based on ME and robust regression techniques, their properties can be derived from the ones already established in the ME (GME and GME-α) and robust regression (LTS, LAD and LMS) literature. Keeping that in mind, we discuss equivariance, consistency and asymptotic normality for some MERG estimators.

In regression analysis, equivariance is a desirable property which allows the evaluation of estimate changes under data transformations, and it is usually divided into regression, scale and affine equivariance. It is well-known that the OLS, LAD, LMS, and LTS estimators are regression, scale and affine equivariant. However, it is important to note that in some cases this property is dropped in favor of other desirable properties. Following Paris (2004), we prove the scale and affine equivariance of the MERG estimators with the Shannon entropy
measure.

Proposition 4.1. The MERG estimators with the Shannon entropy measure are scale and affine equivariant.

Proof. See Appendix.
In proving consistency and asymptotic normality, we show that the MERG estimators in model (4) behave asymptotically as the OLS, LTS, LAD or LMS estimators, according to the model specified. Detailed proofs for the different components of the MERG estimators can be found in Golan et al. (1996) for the GME estimator (large sample properties), Golan and Perloff (2002) for the GME estimator with Rényi and Tsallis entropies, and Maronna et al. (2006) for M-estimators.

Proposition 4.2. The entropy components with the Shannon entropy measure in the objective function of model (4) tend to zero in probability as N → ∞.

Proof. See Appendix.
Proposition 4.2 is useful to derive the asymptotic properties of the MERG estimators by taking into account the properties of the OLS, LTS, LAD and LMS estimators. Note that MERG4, for example, is not asymptotically normal, since the LMS estimator itself is not asymptotically normal. In contrast, the MERG2 estimator is consistent and asymptotically normal, since by Proposition 4.2 the MERG2 estimator has the same asymptotic properties as the LTS estimator, which is consistent and asymptotically normal.
5 Simulation study
The aim of this section is to illustrate the theoretical findings presented in Sections 3 and 4 through a Monte Carlo simulation study. In particular, this simulation study intends to
illustrate the performance of MERG estimators when outliers and collinearity are present simultaneously in the linear regression model.
We consider a pseudo-random number generator based on Marsaglia’s Ziggurat algorithm to define a (40 × 3) matrix X from a Normal distribution with zero mean and unit standard deviation. Using the traditional singular value decomposition, the singular values in the diagonal matrix obtained from the decomposition were replaced with an appropriate expression that allows us to define a matrix X1 with a desired condition number specified a priori; in this case cond2 X1 = 2000. The expression cond2 represents the 2-norm condition number (i.e., the ratio of the largest singular value of X1 to the smallest singular value). It is worth mentioning that cond2 is used with respect to X1 instead of X1′X1, since (cond2 X1)² = cond2(X1′X1). Note that the ill-conditioning evaluation remains consistent (i.e., the condition number is very large).
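The construction just described can be sketched as follows. The exact replacement expression for the singular values is not given in the text, so here, as an assumption, they are replaced by a geometric sequence whose largest-to-smallest ratio equals the target condition number:

```python
# Hedged sketch: build a design matrix with a prescribed 2-norm condition
# number by rescaling the singular values of a random N(0, 1) matrix.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))                 # N(0, 1) entries

U, s, Vt = np.linalg.svd(X, full_matrices=False)
target = 2000.0                              # desired cond2(X1)
new_s = np.geomspace(target, 1.0, num=len(s))
X1 = U @ np.diag(new_s) @ Vt

print(np.linalg.cond(X1))                    # close to the target value
```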
The model (without intercept) is given by y = X1β + u, where β = (0.3, 0.2, 0.1)′ and u is a vector of N(0, 10⁻²) errors added to form the vectors of noisy observations y for each Monte Carlo trial. To create a small proportion of regression outliers, we replace the first two elements of y and the corresponding first two elements of the first column of X1 with the value 10. After this small change in X1 we obtained cond2 X1 ≈ 30. In this case, the two outliers in X1 mask the presence of collinearity! For the 100 trials performed, the mean squared error loss (MSEL), with SEL = (||β − β̂||₂)², is the measure used to evaluate the performance of the LTS, Ridge, Liu-type and some MERG estimators. Table II presents the
MSEL obtained by the previous estimators. Six other MERG estimators (with Rényi and Tsallis entropies) were tested in this simulation study, but are not reported in this table since their MSEL values remain between 0.0969 (the MSEL for MERG4) and 0.8674 (the MSEL for MERG1), both presented in Table II. The ridge parameter η used in the Ridge estimator is given by

η = σ̂²K / (β̂′β̂),    (7)

where

σ̂² = (y − ŷ)′(y − ŷ) / (N − K)    (8)

and β̂ is given by the OLS estimator (Hoerl et al. (1975) and Muniz and Kibria (2009)). The Liu-type estimators are defined as

β̂_{k,d} = (X′X + kI)⁻¹(X′Y − dβ̂).    (9)
The parameters k and d for the two Liu-type estimators are presented in Liu (2003). We consider the Liu-type1 estimator when β̂ is given by the OLS estimator and the Liu-type2 estimator when β̂ is given by the Ridge estimator. Note that other Ridge and Liu-type estimators can be obtained with other methods for estimating the parameters (for example, the Generalized Cross-Validation estimator proposed by Golub et al. (1979) to estimate the ridge parameter is also widely used in the literature).
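The estimators in (7)–(9) can be sketched as follows (a minimal NumPy sketch; the function names are ours, and the Liu (2003) rules for choosing k and d are not reproduced here):

```python
import numpy as np

def ridge_eta(X, y):
    """Hoerl-Kennard-Baldwin ridge parameter, eqs. (7)-(8)."""
    N, K = X.shape
    beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta_ols
    sigma2_hat = (resid @ resid) / (N - K)            # eq. (8)
    return K * sigma2_hat / (beta_ols @ beta_ols)     # eq. (7)

def liu_type(X, y, k, d, beta_hat):
    """Liu-type estimator, eq. (9): (X'X + kI)^(-1) (X'y - d*beta_hat)."""
    return np.linalg.solve(X.T @ X + k * np.eye(X.shape[1]),
                           X.T @ y - d * beta_hat)
```

Liu-type1 corresponds to passing the OLS estimate as `beta_hat`; Liu-type2 to passing the ridge estimate (X′X + ηI)⁻¹X′y instead.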
(Table II about here)
Two important conclusions emerge from this experiment. First, the MERG estimators perform well and rival the Liu-type1 estimator in this simulation study. Second, the outliers mask the presence of collinearity, and the careless use of robust estimators can be dangerous (the LTS estimator, for example, performs poorly in this experiment).
As a final note we present some additional results which could form the basis for further research. The same simulation study was carried out with cond₂ X1 = 5; note that, in this case, the linear regression model is affected only by outliers. The best results were obtained with the LTS estimator (MSEL = 0.0011), whereas the results from the twelve MERG estimators tested remain practically the same as those presented in Table II, with MSEL between 0.0608 and 0.8637 (the best results coming from the MERG estimators with Rényi and Tsallis entropies). We also tested the IRLS estimator (Iteratively Reweighted Least Squares, where the weights at each iteration are calculated by applying Tukey's biweight function to the residuals from the previous iteration), which yields MSEL = 0.5539. Small MSEL values for the MERG estimators are also achieved when the same simulation study is performed with cond₂ X1 = 2000, but without outliers. It seems that the MERG estimators also perform well in the presence of severe collinearity or outliers alone in the linear regression model. These issues need, indeed, further research with a more complete simulation study.
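The IRLS variant just described can be sketched as follows (a NumPy sketch under our own choices: the MAD rescaling of the residuals, the tuning constant c = 4.685 and the fixed iteration count are standard conventions that the text does not specify):

```python
import numpy as np

def tukey_biweight(u, c=4.685):
    """Tukey's biweight weight function: (1 - (u/c)^2)^2 for |u| <= c, else 0."""
    v = u / c
    w = (1.0 - v**2) ** 2
    w[np.abs(v) > 1.0] = 0.0
    return w

def irls_tukey(X, y, n_iter=50, c=4.685):
    """IRLS: at each iteration, reweight the observations with Tukey's
    biweight applied to the (robustly scaled) residuals of the previous
    iteration, then solve the weighted least-squares problem."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # OLS start
    for _ in range(n_iter):
        r = y - X @ beta
        s = np.median(np.abs(r - np.median(r))) / 0.6745 + 1e-12  # MAD scale
        w = tukey_biweight(r / s, c)
        Xw = X * w[:, None]
        beta = np.linalg.solve(X.T @ Xw, X.T @ (w * y))
    return beta
```

Because the biweight assigns weight zero to gross outliers, the fit is driven by the clean observations once the iterations stabilize.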
6 Conclusions
An extension of the MEL estimator from Paris (2001) has been introduced in this paper. Following a similar procedure, more estimators can be derived simply by merging other entropy measures with other robust estimators (for example, M-estimators, Generalized M-estimators, S-estimators, MM-estimators, among others). For instance, note that it is possible to show that the OLS, LTS, LAD and LMS estimators are particular cases of the S-estimator defined by
min_{β̂} S(r₁(β), …, r_N(β)),   (10)

considering S as a generic scale estimator. It follows that the dispersion S(r₁(β), …, r_N(β)) is given as the solution of

∑_{n=1}^{N} ψ(r_n / S) = Nk,   (11)
where ψ(·) is a function to be selected and k is a consistency constant. In this case, S is an M-estimator of scale. The S-estimators are regression, scale and affine equivariant as well as asymptotically normal (e.g. Rousseeuw and Leroy (1987)).
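As an illustration, the M-estimate of scale defined by (11) can be computed by bisection, since the left-hand side is non-increasing in S (a sketch under our own choices: a Huber-type bounded ψ(u) = min(u², c²) and user-supplied constants c and k; S-estimators more commonly use Tukey's biweight ρ):

```python
import numpy as np

def m_scale(r, psi, k, n_bisect=200):
    """Solve sum_n psi(r_n / S) = N * k for S, i.e. eq. (11).

    Works for any bounded, even psi with psi(0) = 0, because the
    left-hand side is non-increasing in S."""
    r = np.asarray(r, dtype=float)
    N = r.size
    lo, hi = 1e-12, np.max(np.abs(r)) * 1e6 + 1.0
    for _ in range(n_bisect):
        S = 0.5 * (lo + hi)
        if psi(r / S).sum() > N * k:
            lo = S        # S too small: average score still above k
        else:
            hi = S
    return 0.5 * (lo + hi)

# Huber-type score (our choice, for illustration only):
c = 1.345
huber_psi = lambda u: np.minimum(u**2, c**2)
```

With an effectively unbounded ψ(u) = u² and k = 1, the solution of (11) reduces to the root-mean-square of the residuals, which gives a convenient sanity check.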
In this paper, equivariance, consistency and asymptotic normality for some MERG estimators have also been addressed. These properties can be easily derived from the ME and robust regression literature.
This new general class of estimators rivals (and in some cases outperforms) some traditional competitors, although this assertion should be tempered with some caution, since more simulation studies and empirical applications are needed. However, two important features of the MERG estimators seem to be clear: they are easy to compute and no relevant prior information is needed to implement them.

Topics for future research include a detailed analysis of the properties of the MERG estimators in small samples, as well as the discussion of issues involving breakdown points, asymptotic efficiency and influence functions. It would also be interesting to develop MERG estimators within the context of heteroskedastic and autocorrelated errors.
Acknowledgments
We would like to start by expressing our gratitude to the referee. He/she offered extremely valuable perspectives on our work and effective suggestions for improvements. This research is (partially) supported by Programa Operacional "Ciência, Tecnologia, Inovação" of the Fundação para a Ciência e a Tecnologia (FCT), cofinanced by the European Community fund FEDER. The first author is also supported by the grant SFRH/BD/40821/2007 from FCT.
References

Campbell, R., Rogers, K., and Rezek, J. (2008). Efficient frontier estimation: a maximum entropy approach. Journal of Productivity Analysis, 30:213–221.

Caputo, M. R. and Paris, Q. (2008). Comparative statics of the generalized maximum entropy estimator of the general linear model. European Journal of Operational Research, 185:195–203.

Donoho, D. L., Johnstone, I. M., Hoch, J. C., and Stern, A. S. (1992). Maximum entropy and the nearly black object. Journal of the Royal Statistical Society, Series B, 54:41–81.

Ellis, S. P. and Morgenthaler, S. (1992). Leverage and breakdown in L1 regression. Journal of the American Statistical Association, 87:143–148.

Feynman, R. P. (1985). QED: The Strange Theory of Light and Matter. Princeton University Press, Princeton.

Fraser, I. (2000). An application of maximum entropy estimation: the demand for meat in the United Kingdom. Applied Economics, 32:45–59.

Gamboa, F. and Gassiat, E. (1997). Bayesian methods and maximum entropy for ill-posed inverse problems. The Annals of Statistics, 25:328–350.

Golan, A. (2002). Information and entropy econometrics - editor's view. Journal of Econometrics, 107:1–15.

Golan, A. (2007). Information and entropy econometrics - volume overview and synthesis. Journal of Econometrics, 138:379–387.

Golan, A. and Dose, V. (2001). A generalized information theoretical approach to tomographic reconstruction. Journal of Physics A, 34:1271–1283.

Golan, A., Judge, G., and Miller, D. (1996). Maximum Entropy Econometrics: Robust Estimation with Limited Data. John Wiley and Sons, Chichester.

Golan, A. and Perloff, J. M. (2002). Comparison of maximum entropy and higher-order entropy estimators. Journal of Econometrics, 107:195–211.

Golub, G. H., Heath, M., and Wahba, G. (1979). Generalized cross-validation as a method for choosing a good ridge parameter. Technometrics, 21:215–223.

Hastie, T., Tibshirani, R., and Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, New York, second edition.

Hoerl, A. E. and Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12:55–67.

Hoerl, A. E., Kennard, R. W., and Baldwin, K. F. (1975). Ridge regression: Some simulations. Communications in Statistics - Simulation and Computation, 4:105–123.

Huber, P. J. (2004). Robust Statistics. John Wiley and Sons, New Jersey.

Jaynes, E. T. (1957a). Information theory and statistical mechanics. Physical Review, 106:620–630.

Jaynes, E. T. (1957b). Information theory and statistical mechanics. II. Physical Review, 108:171–190.

Kullback, S. (1954). Certain inequalities in information theory and the Cramér-Rao inequality. Annals of Mathematical Statistics, 25:745–751.

Kullback, S. (1959). Information Theory and Statistics. John Wiley and Sons, New York.

Kullback, S. and Leibler, R. A. (1951). On information and sufficiency. Annals of Mathematical Statistics, 22:79–86.

Lence, S. H. and Miller, D. J. (1998). Estimation of multi-output production functions with incomplete data: A generalised maximum entropy approach. European Review of Agricultural Economics, 25:188–209.

Levine, R. D. and Tribus, M. (1979). The Maximum Entropy Formalism. MIT Press, Cambridge.

Lindley, D. V. (1956). On a measure of the information provided by an experiment. Annals of Mathematical Statistics, 27:986–1005.

Liu, K. (1993). A new class of biased estimate in linear regression. Communications in Statistics - Theory and Methods, 22:393–402.

Liu, K. (2003). Using Liu-type estimator to combat collinearity. Communications in Statistics - Theory and Methods, 32:1009–1020.

Maronna, R. A., Martin, R. D., and Yohai, V. J. (2006). Robust Statistics - Theory and Methods. John Wiley and Sons, Chichester.

Miller, G. and Horn, D. (1998). Probability density estimation using entropy maximization. Neural Computation, 10:1925–1938.

Mishra, S. (2004a). Estimation under multicollinearity: Application of restricted Liu and maximum entropy estimators to the Portland cement dataset. Available at SSRN: http://ssrn.com/abstract=559861.

Mishra, S. (2004b). Multicollinearity and maximum entropy leuven estimator. Economics Bulletin, 3:1–11.

Muniz, G. and Kibria, B. M. G. (2009). On some ridge regression estimators: An empirical comparisons. Communications in Statistics - Simulation and Computation, 38:621–630.

Paris, Q. (2001). Multicollinearity and maximum entropy estimators. Economics Bulletin, 3:1–9.

Paris, Q. (2004). Maximum entropy leuven estimators and multicollinearity. Rivista Statistica, 64:3–22.

Paris, Q. and Howitt, R. E. (1998). An analysis of ill-posed production problems using maximum entropy. American Journal of Agricultural Economics, 80:124–138.

Rousseeuw, P. J. (1984). Least median of squares regression. Journal of the American Statistical Association, 79:871–880.

Rousseeuw, P. J. and Leroy, A. M. (1987). Robust Regression and Outlier Detection. John Wiley and Sons, New York.

Ryan, T. P. (2009). Modern Regression Methods. John Wiley and Sons, New Jersey, second edition.

Skilling, J. (1984). Data analysis: The maximum entropy method. Nature, 309:748–749.

Taneja, I. J. (1989). On generalized information measures and their applications. Advances in Electronics and Electron Physics, 76:327–413.

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58:267–288.

Vinod, H. D. and López de Lacalle, J. (2009). Maximum entropy bootstrap for time series: The meboot R package. Journal of Statistical Software, 29:1–19.
Appendix

Proof of Proposition 4.1. For the MERG1 estimator the result trivially holds. Moreover, from the expression in (4) it follows that the MERG2 estimator takes the form

argmin_{p_β, L_β, r} { ∑_{k=1}^{K} p_{βk} ln p_{βk} + L_β ln L_β + ∑_{n=1}^{(1−ρ)N+1} r²_{(n:N)} }.   (12)
Since the MEL estimator is scale and affine equivariant, it follows that the MERG2 estimator is also scale and affine equivariant, because r²_{(n:N)} represents the ordered squared residuals after removing a proportion of the biggest squared residuals. On the other hand, the MERG3 estimator matches the LAD estimator (which is scale and affine equivariant) by removing the entropy components in the objective function

argmin_{p_β, L_β, r} { ∑_{k=1}^{K} p_{βk} ln p_{βk} + L_β ln L_β + ∑_{n=1}^{N} |r_n| }.   (13)
In contrast, by keeping the entropy components in (13), and following the same line of reasoning as in Paris (2004, pp. 11-14), we need to compare the solutions obtained from the Lagrangian functions with the usual KKT conditions in scaled and unscaled models. Since the four conditions of model (2) are still valid for the MERG3 estimator defined in (13), the Lagrangian function (L) and the corresponding KKT conditions are similar to those for the MEL optimization problem in Paris (2004). Formally, the main difference between both problems relies on the partial derivative of L with respect to the residual variable, given by

∂L/∂u = 2u − ξ = 0 [MEL]  and  ∂L/∂r = 1 − ξ = 0 [MERG3],   (14)

where ξ is the Lagrange multiplier in L for the first condition of model (4). Thus, the
equivariance proof for the MEL estimator is also valid for the MERG3 estimator. Similar arguments can be used for the MERG4 estimator if the entropy components in the expression

argmin_{p_β, L_β, r} { ∑_{k=1}^{K} p_{βk} ln p_{βk} + L_β ln L_β + median_n (r_n)² }   (15)
are not removed. We need again to compare the solutions obtained from the Lagrangian functions with the usual KKT conditions in scaled and unscaled models. The main difference between the MEL and MERG4 optimization problems is the partial derivative of L with respect to the residual variable,

∂L/∂u = 2u − ξ = 0 [MEL]  and  ∂L/∂r = 2r_n − ξ = 0 [MERG4].   (16)
The proof of the scale and affine equivariance for the MEL estimator is also valid for the MERG4 estimator. If the entropy components are removed in (15), the MERG4 estimator matches the LMS estimator, which is scale and affine equivariant since

median_n (θy − X(θβ))² = θ² median_n (y − Xβ)²   (17)

and

median_n (y − (XA)(A⁻¹β))² = median_n (y − Xβ)².   (18)
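The identities (17) and (18) are easy to verify numerically; a small sketch (the seed, dimensions and random instances are our own choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((15, 3))
y = rng.standard_normal(15)
beta = rng.standard_normal(3)
theta = 2.5
A = rng.standard_normal((3, 3))          # invertible with probability 1

# the LMS objective: median of the squared residuals
lms = lambda y, X, b: np.median((y - X @ b) ** 2)

# (17): scale equivariance of the LMS objective
assert np.isclose(lms(theta * y, X, theta * beta), theta**2 * lms(y, X, beta))
# (18): affine equivariance, with A^{-1} beta computed via a linear solve
assert np.isclose(lms(y, X @ A, np.linalg.solve(A, beta)), lms(y, X, beta))
```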
Proof of Proposition 4.2. The proof conducted by Paris (2004) for the MEL estimator is sufficient to prove this proposition. Knowing that the superscript "N" indicates the dependence on the sample size N, and applying the probability limit, one obtains

plim_{N→∞} (1/N) ( ∑_{k=1}^{K} p^N_{βk} ln p^N_{βk} + L^N_β ln L^N_β ) = 0.   (19)
Remark. The previous result is also valid for the MERG estimators with Rényi and Tsallis entropies (for α and q strictly positive) since

plim_{N→∞} (1/N) ( (1/(α−1)) log₂ ∑_{k=1}^{K} (p^N_{βk})^α + L^N_β ln L^N_β ) = 0   (20)

and

plim_{N→∞} (1/N) ( ( ∑_{k=1}^{K} (p^N_{βk})^q − 1 ) / (q−1) + L^N_β ln L^N_β ) = 0.   (21)
Table I: MERG estimators.

Group    | H1(p_βk)                        | φ(r)
---------|---------------------------------|------------------
MERG1    | ∑_k p_βk ln p_βk                | r_n²
MERG2    | ∑_k p_βk ln p_βk                | r²_(n:N)
MERG3    | ∑_k p_βk ln p_βk                | |r_n|
MERG4    | ∑_k p_βk ln p_βk                | median_n (r_n)²
MERG1R   | (1/(α−1)) log₂ ∑_k (p_βk)^α     | r_n²
MERG1T   | (∑_k (p_βk)^q − 1)/(q−1)        | r_n²
MERG2R   | (1/(α−1)) log₂ ∑_k (p_βk)^α     | r²_(n:N)
MERG2T   | (∑_k (p_βk)^q − 1)/(q−1)        | r²_(n:N)
MERG3R   | (1/(α−1)) log₂ ∑_k (p_βk)^α     | |r_n|
MERG3T   | (∑_k (p_βk)^q − 1)/(q−1)        | |r_n|
MERG4R   | (1/(α−1)) log₂ ∑_k (p_βk)^α     | median_n (r_n)²
MERG4T   | (∑_k (p_βk)^q − 1)/(q−1)        | median_n (r_n)²
Table II: MSEL for different estimators in the simulation study.

Estimator           | MSEL
--------------------|--------
LTS (ρ = 0.1)       | 14.6829
Ridge               | 2.3680
Liu-type1           | 0.6212
Liu-type2           | 10.5924
MERG1               | 0.8674
MERG2 (ρ = 0.1)     | 0.1070
MERG3               | 0.7418
MERG4               | 0.0969
MERG1T (q = 4)      | 0.8141
MERG3T (q = 4)      | 0.5762