
This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution and sharing with colleagues. Other uses, including reproduction and distribution, or selling or licensing copies, or posting to personal, institutional or third party websites are prohibited. In most cases authors are permitted to post their version of the article (e.g. in Word or Tex form) to their personal website or institutional repository. Authors requiring further information regarding Elsevier’s archiving and manuscript policies are encouraged to visit: http://www.elsevier.com/copyright

Author's personal copy

Environmental Modelling & Software 26 (2011) 469–481

Contents lists available at ScienceDirect

Environmental Modelling & Software journal homepage: www.elsevier.com/locate/envsoft

Development and application of a robotic chemical mass balance model for source apportionment of atmospheric particulate matter

Georgios Argyropoulos, Constantini Samara*

Environmental Pollution Control Laboratory, Department of Chemistry, Aristotle University, 541 24 Thessaloniki, Greece

Article info

Article history: Received 15 October 2009; received in revised form 12 October 2010; accepted 17 October 2010

Keywords: Airborne particulate matter; Chemical mass balance; Source apportionment; Source profiles

Abstract

An advanced computational procedure is presented for the source apportionment (SA) of airborne particulate matter (PM) using chemical mass balance (CMB) receptor modeling. The so-called “Robotic Chemical Mass Balance” model (RCMB) minimizes personal judgment by leading straightforwardly to the best-fit combination of the source profiles that are included in a set of input data. In contrast with previous CMB modeling software, RCMB applies an established least squares fitting method to every possible combination of the source profiles, without any human interference. Any successful applications of the fitting method are automatically ranked according to performance measures that are common in multiple linear regression (MLR). By maximizing an overall fitting index, the proposed computational procedure provides a unique solution to the conventional CMB problem, which cannot be questioned readily unless additional information becomes available about the study area. This explicit advantage of RCMB is illustrated by a comparison with the original CMB analysis of the Crows, California PM2.5 data from the San Joaquin Valley Air Quality Study (SJVAQS). © 2010 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel./fax: +30310 997747. E-mail address: [email protected] (C. Samara).
1364-8152/$ – see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.envsoft.2010.10.010

1. Introduction

1.1. Chemical mass balance models

In the field of atmospheric sciences, source apportionment (SA) models are used to estimate the contributions of individual sources to the concentration of atmospheric pollutants like particulate matter (PM), based on ambient data registered at monitoring sites. There are three main groups of SA techniques (Viana et al., 2008):
(a) basic numerical methods, which mainly offer a qualitative interpretation of monitoring data (e.g. the correlation of wind direction with levels of measured components to identify source locations);
(b) physical methods, useful in scenario studies that are based on emission inventories and dispersion models to simulate aerosol formation, emission, transport and deposition;
(c) chemical methods that rely on the statistical evaluation of PM chemical data acquired at receptor sites (receptor models).

Receptor modeling is founded upon the hypothesis of mass and chemical species conservation for the airborne PM collected at the receptor site, which allows a mass balance analysis to be used in order to identify and apportion sources of PM in the atmosphere. There is a wide range of receptor models, whose main difference is the degree of knowledge required about the pollution sources prior to their application. The two extremes of receptor modeling are multivariate models, which require little knowledge about pollution sources, and chemical mass balance (CMB) models, which require almost complete knowledge (Gordon, 1988; Henry et al., 1984; Hopke, 1991).

The main assumptions on which CMB models rely are (Henry et al., 1984; Thurston and Lioy, 1987):
(a) all the sources contributing significantly to a receptor site have been identified and have had their emissions chemically characterized;
(b) chemical species do not react with each other, i.e. they add linearly;
(c) compositions of source emissions are constant over the period of ambient and source sampling;
(d) the number of sources is less than the number of chemical species;
(e) the source compositions are linearly independent of each other;
(f) measurement uncertainties are random, uncorrelated and normally distributed.

If all the contributing sources are known, their chemical compositions remain constant during the sampling, and there is no interaction between their chemical species that would cause mass removal or formation, then the total airborne particulate mass concentration Cmass measured at the receptor site will be a linear sum of the contributions of the individual sources Sj (Henry et al., 1984):

$$C_{mass} = \sum_{j=1}^{n} S_j \tag{1}$$


Similarly, the mass concentration of an aerosol property i, Ci, will be (Henry et al., 1984):

$$C_i = \sum_{j=1}^{n} a_{ij} S_j \tag{2}$$

where aij is the mass fraction of source contribution j possessing property i. Writing equation (2) for every measured chemical species defines the CMB problem, in which the source contributions Sj are the unknowns to be estimated.

1.2. The ordinary least squares (OLS) solution of the CMB problem

If the number of chemical species m exceeds the number of contributing sources n, then multiple linear regression (MLR) can be applied to the overdetermined linear system defined by equations (2), in order to estimate a set of probable values for the source contributions Sj. Ordinary least squares (OLS) was historically one of the first fitting methods utilized for this purpose (Hopke, 1991). OLS extracts a set of probable values for the source contributions Sj by minimizing the sum χ² of the squared residuals:

$$\chi^2 = \sum_{i=1}^{m} \left( C_i - \sum_{j=1}^{n} a_{ij} S_j \right)^2 \tag{3}$$

The above function reaches a minimum when all its partial derivatives with respect to the source contributions Sj become zero. Setting these partial derivatives equal to zero produces the system of normal equations (Björck, 1996):

$$A^T A\, S = A^T C \tag{4}$$

where A is the m × n matrix of the mass fractions aij, A^T the transpose of A, C the m-dimensional vector of the ambient concentrations Ci of the chemical species, and S the n-dimensional vector of the source contributions Sj. The system of normal equations (4) can then be solved according to the following expression (Björck, 1996):

$$S = \left( A^T A \right)^{-1} A^T C \tag{5}$$

Equation (5) represents the OLS solution of the CMB problem.
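As a minimal sketch of equations (2), (4) and (5), the following Python snippet builds a small synthetic CMB problem and recovers the source contributions. The profile matrix, the contributions and the units are hypothetical values chosen only for illustration, and NumPy is assumed to be available; it is not the software described in this paper.

```python
import numpy as np

# Hypothetical source profiles: rows = 4 chemical species, columns = 2 sources.
A = np.array([[0.30, 0.02],
              [0.05, 0.40],
              [0.10, 0.10],
              [0.01, 0.20]])

S_true = np.array([10.0, 5.0])   # assumed source contributions
C = A @ S_true                   # noise-free ambient concentrations, equation (2)

# Equation (5), obtained by solving the normal equations (4).
# solve() is used instead of forming the inverse of A^T A explicitly.
S_normal = np.linalg.solve(A.T @ A, A.T @ C)

# The same least squares estimate from NumPy's SVD-based solver,
# which does not form A^T A at all.
S_lstsq, *_ = np.linalg.lstsq(A, C, rcond=None)

print(S_normal, S_lstsq)
```

Because the synthetic concentrations contain no noise, both routes return the assumed contributions up to rounding. With real data the two agree only as long as A^T·A is well conditioned.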

1.3. Perfect collinearity and multicollinearity

As already mentioned, one of the main assumptions on which CMB models rely is the linear independence of the source profiles (the fractional amount of each species in the emissions from each source type). This follows from the fact that the OLS method cannot be applied in the case of perfect collinearity (Meloun et al., 2002), i.e. if there is a non-trivial linear combination of the source profiles that is exactly equal to the zero vector:

$$\sum_{j=1}^{n} f_j A_j = 0 \tag{6}$$

where Aj is the m-dimensional vector defined by column j of matrix A, f a non-zero n-dimensional vector with elements fj, and 0 the m-dimensional zero vector. If equation (6) holds true, then the determinant of matrix A^T·A is zero, which means that the inverse in equation (5) does not exist. The reliability of least squares fitting is also questionable if there are linear combinations of the source profiles that are almost equal to the zero vector. The latter case, widely known as multicollinearity (Meloun et al., 2002), is a common problem in virtually every application of MLR. It can cause numerical errors in the inversion of matrix A^T·A, as well as instabilities in the source contribution estimates; the latter can often take negative values, which preclude any physical interpretation.

1.4. The ordinary weighted least squares (OWLS) solution of the CMB problem

The absence of multicollinearity among the source profiles only guarantees that the OLS estimate will not be subject to serious numerical errors or obvious mathematical artifacts. However, according to the Gauss-Markov theorem (Björck, 1996), for an OLS solution to also best represent the real source contributions, the measurement uncertainties of the chemical data acquired at the receptor site should be random, uncorrelated, and follow identical normal distributions, so as to satisfy the homogeneity of variance. The mass fractions aij that make up the sources’ chemical profiles should also be considered as errorless constants. Nevertheless, heterogeneity of variance is the rule rather than the exception for the measurement uncertainties of the chemical data acquired at the receptor site. For this reason, Friedlander (1973) considered the fitting method of ordinary weighted least squares (OWLS) a more appropriate solution of the CMB problem than OLS. OWLS extracts a set of probable values for the source contributions Sj, in which the heteroscedasticity of the receptor’s chemical data is reflected, by minimizing the following function:

$$\chi^2 = \sum_{i=1}^{m} \frac{\left( C_i - \sum_{j=1}^{n} a_{ij} S_j \right)^2}{\sigma_{C_i}^2} \tag{7}$$

where σCi is the typical error of the measurement of Ci. Setting all the partial derivatives of equation (7) with respect to the source contributions Sj equal to zero ultimately leads to the OWLS solution of the CMB problem (Björck, 1996):

$$S = \left( A^T W A \right)^{-1} A^T W C \tag{8}$$

where W is the m × m diagonal matrix with the inverse variances $\sigma_{C_i}^{-2}$ on the diagonal. The OWLS solution is equivalent to that of a corresponding OLS system (Björck, 1996):

$$S = \left( A_W^T A_W \right)^{-1} A_W^T C_W \tag{9}$$

where AW = G·A, CW = G·C and G^T·G = G·G^T = W. In the absence of perfect collinearity among the columns of AW, the Gauss-Markov theorem holds true for the equivalent OLS system, since the typical errors of the dimensionless concentrations CW,i are always equal to 1, regardless of the ambient data used for the fitting.
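The equivalence between equations (8) and (9), and the rank deficiency implied by equation (6), can be checked numerically. The sketch below assumes NumPy and uses hypothetical profiles, concentrations and measurement errors; the variable names are illustrative only.

```python
import numpy as np

# Hypothetical profiles (4 species x 2 sources) and ambient data.
A = np.array([[0.30, 0.02],
              [0.05, 0.40],
              [0.10, 0.10],
              [0.01, 0.20]])
C = np.array([3.20, 2.30, 1.55, 1.10])        # ambient concentrations C_i
sigma_C = np.array([0.30, 0.20, 0.15, 0.10])  # typical errors of C_i

# Equation (8): weights are the inverse variances of the concentrations.
W = np.diag(1.0 / sigma_C**2)
S_eq8 = np.linalg.solve(A.T @ W @ A, A.T @ W @ C)

# Equation (9): scale every row by 1/sigma_Ci, then solve an ordinary system.
G = np.diag(1.0 / sigma_C)                    # G @ G equals W
A_w, C_w = G @ A, G @ C
S_eq9 = np.linalg.solve(A_w.T @ A_w, A_w.T @ C_w)

# Perfect collinearity, equation (6): the second profile is twice the first,
# so the matrix has rank 1 and A^T A cannot be inverted.
A_collinear = np.column_stack([A[:, 0], 2.0 * A[:, 0]])
rank_collinear = np.linalg.matrix_rank(A_collinear)

print(S_eq8, S_eq9, rank_collinear)
```

Row scaling is usually preferable in practice, since it allows any ordinary least squares routine to be reused for the weighted problem.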

1.5. Singular value decomposition (SVD) of the weighted source profile matrix

In order to identify those sources which cannot be estimated accurately by an ordinary weighted least squares system, due to multicollinearity, Henry (1992) developed a criterion based on the singular value decomposition (SVD) of the weighted source profile matrix:

$$A_W = U D V^T \tag{10}$$

where U is an m × m orthogonal matrix, V an n × n orthogonal matrix, and D an m × n diagonal matrix with non-zero elements dj on the diagonal (the singular values of matrix AW).

The columns of matrix U are the eigenvectors of AW·AW^T, the columns of V are the eigenvectors of AW^T·AW, and dj² are the corresponding eigenvalues. OWLS estimates with standard errors less than some value, say σ, are only possible in the eligible space, which is defined by the eigenvectors for which the corresponding inverse singular values are less than σ. A source contribution Sj will be estimable to an accuracy of σ² only if its projection in the eligible space has unit length (Henry, 1992). The projection length of Sj in the eligible space can be determined by column j of matrix V, according to the following procedure: (a) the elements of column j that correspond to inverse singular values smaller than σ are squared; (b) their squares are then summed; (c) the square root of the sum is taken (Björck, 1996).

In fact, the requirement of unit projection length is a bit too strict and can be relaxed, so that an estimable source need only have a length of 0.95 or so in the eligible space. Sources that have moderate projections of length 0.3–0.95 cannot be precisely estimated, but they may participate in linear combinations which are estimable with error less than σ. Indeed, if the eligible space is L-dimensional and the estimable sources are E, then there must be (L − E) independent linear combinations of inestimable sources, which can be estimated with the desired accuracy (Henry, 1992).
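The projection-length procedure can be sketched with NumPy's SVD routine. This is an illustrative reading of the criterion, not the US EPA implementation: the function name, the weighted matrix and the thresholds are hypothetical, and the code assumes that the projection of source j is taken over the components of the eligible right singular vectors that belong to source j (with NumPy, `Vt` holds V^T, so these are the entries `Vt[k, j]` over the eligible k).

```python
import numpy as np

def projection_lengths(A_w, sigma):
    """Projection length of every source in the eligible space.

    A_w   : weighted source profile matrix (m species x n sources)
    sigma : largest acceptable standard error; a singular vector is eligible
            when its inverse singular value is smaller than sigma
    """
    _, d, Vt = np.linalg.svd(A_w, full_matrices=False)
    eligible = d > 1.0 / sigma            # same as 1/d < sigma, safe for d = 0
    # (a) square the eligible components, (b) sum them, (c) take the root
    return np.sqrt((Vt[eligible, :] ** 2).sum(axis=0))

# Hypothetical weighted profiles: sources 0 and 1 are nearly collinear,
# source 2 has its own marker species.
A_w = np.array([[1.0, 1.00, 0.0],
                [0.0, 0.01, 0.0],
                [0.0, 0.00, 1.0],
                [0.0, 0.00, 0.0]])

lengths_strict = projection_lengths(A_w, sigma=2.0)
lengths_loose = projection_lengths(A_w, sigma=1000.0)
print(lengths_strict, lengths_loose)
```

In this example the two nearly collinear sources have projection lengths of about 0.71 under the strict threshold, so neither is estimable on its own, whereas the third source has unit length. Under the loose threshold every singular vector is eligible and all lengths equal 1.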

1.6. The effective variance weighted least squares (EFWLS) solution of the CMB problem

A well-conditioned OWLS system does not by itself ensure that the solution it extracts for a CMB problem will be reliable, unless the mass fractions aij that make up the sources’ chemical profiles can be considered as errorless constants. In practice, however, they must be determined experimentally, just like the ambient concentrations Ci. Indeed, since compositions of source emissions actually vary substantially over time, their uncertainties will probably be greater than those of the ambient concentrations (Watson et al., 1984). Least squares fitting when all observables contain error is a common problem in regression analysis, and a number of methods exist to handle it. For the solution of the CMB problem, Watson et al. (1984) adopted the least squares fitting scheme of Britt and Luecke (1973), which is based on the assumption that all the uncertainties of the measurements at the receptor site and its sources are uncorrelated and normally distributed. The Britt and Luecke algorithm consists of an iterative procedure that may approximate a set of probable values for the source contributions Sj, in which all the measurement uncertainties are reflected, by minimizing likelihood function (11), in which the overbars indicate the “real” values of the mass fractions:

$$\chi^2 = \sum_{i=1}^{m} \frac{\left( C_i - \sum_{j=1}^{n} \bar{a}_{ij} S_j \right)^2}{\sigma_{C_i}^2} + \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{\left( a_{ij} - \bar{a}_{ij} \right)^2}{\sigma_{a_{ij}}^2} \tag{11}$$

where σaij are the typical errors of aij. All the estimates for the source contributions Sj are initially set equal to zero. The iteration loop then involves first determining the diagonal elements of the m × m diagonal matrix $(V_e^k)^{-1}$ that contains the effective weightings, according to the following relationship:

$$\left( v_{e,ii}^{k} \right)^{-1} = \left( \sigma_{C_i}^2 + \sum_{j=1}^{n} \sigma_{a_{ij}}^2 \left( S_j^k \right)^2 \right)^{-1} \tag{12}$$

where the superscript k indicates the iteration number. The new estimates for the “real” values of the mass fractions aij are afterwards calculated for each source’s chemical profile by:

$$A_j^{k+1} = A_j + S_j^k\, V_{A_j} \left( V_e^k \right)^{-1} \left\{ I - A^k \left[ A^{kT} \left( V_e^k \right)^{-1} A^k \right]^{-1} A^{kT} \left( V_e^k \right)^{-1} \right\} \left[ C - A S^k \right] \tag{13}$$

where $V_{A_j}$ is an m × m diagonal matrix with diagonal elements $V_{A_j,ii} = \sigma_{a_{ij}}^2$, $A^k$ is the matrix of the current estimates of the mass fractions, and I is the m × m identity matrix. Finally, the new estimates for the source contributions Sj are calculated by:

$$S^{k+1} = S^k + \left[ A^{kT} \left( V_e^k \right)^{-1} A^k \right]^{-1} A^{kT} \left( V_e^k \right)^{-1} \left[ C - A S^k \right] \tag{14}$$

The iteration loop is terminated when the new estimates differ from the previous ones by less than a threshold value. Its convergence for a given CMB problem cannot be guaranteed, however, because it depends on the degree of collinearity that may occur among the columns of the effectively weighted source profile matrix, defined by the matrix product Ge·A, where Ge is an m × m diagonal matrix for which $G_e G_e^T = G_e^T G_e = (V_e^k)^{-1}$. Since the iteration loop modifies matrix $(V_e^k)^{-1}$, it is no longer possible to decide “a priori” whether a system is collinear or not. SVD can still be applied to the matrix product Ge·A, in order to potentially uncover an ill-conditioned system, but only after a successful convergence (Watson et al., 1997).

The least squares fitting method of Britt and Luecke can be simplified substantially if the differences between the “true” values of the mass fractions and the experimental ones aij are considered negligible. This approximation, widely known as the effective variance weighted least squares (EFWLS) solution (Watson et al., 1984), is currently the official method suggested by the United States Environmental Protection Agency (US EPA) for CMB modeling; the complete version of the Britt and Luecke algorithm is provided only as an extra option by their most recently released CMB modeling software (CMB 8.2 model), designated mainly for research purposes (Coulter, 2004; Watson, 2004).
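A minimal sketch of the EFWLS approximation follows, assuming NumPy. It iterates equations (12) and (14) with the measured profiles held fixed (so equation (13) is skipped, as the approximation prescribes). The function name, the convergence test, the iteration cap and the synthetic data are assumptions made for this illustration and are not taken from the US EPA software.

```python
import numpy as np

def efwls(A, sigma_A, C, sigma_C, tol=1e-8, max_iter=100):
    """Effective variance weighted least squares with fixed profiles A.

    Returns the source contributions and the number of iterations used.
    """
    S = np.zeros(A.shape[1])                       # initial estimates are zero
    for k in range(max_iter):
        # Equation (12): effective variance of every fitting species.
        v_e = sigma_C**2 + (sigma_A**2) @ (S**2)
        AtW = A.T / v_e                            # A^T (V_e)^-1, V_e diagonal
        # Equation (14): update of the source contributions.
        S_new = S + np.linalg.solve(AtW @ A, AtW @ (C - A @ S))
        if np.max(np.abs(S_new - S)) <= tol * max(1.0, np.max(np.abs(S_new))):
            return S_new, k + 1
        S = S_new
    raise RuntimeError("EFWLS iteration did not converge")

# Hypothetical test problem: 4 species, 2 sources, noise-free concentrations.
A = np.array([[0.30, 0.02],
              [0.05, 0.40],
              [0.10, 0.10],
              [0.01, 0.20]])
sigma_A = 0.1 * A                                  # assumed 10 % profile errors
C = A @ np.array([10.0, 5.0])
sigma_C = 0.05 * C                                 # assumed 5 % ambient errors

S_hat, n_iter = efwls(A, sigma_A, C, sigma_C)
print(S_hat, n_iter)
```

When all profile uncertainties are zero the effective variances reduce to the ambient variances, and the procedure returns the OWLS solution of equation (8).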

1.7. Diagnostic criteria for the validation of an EFWLS solution of a CMB problem

The US EPA has established a standard set of diagnostic criteria for the validation of an EFWLS solution of a CMB problem, which are listed in Table 1 (Coulter, 2004; Watson, 2004). The most basic performance measures included in this set are the estimates for the source contributions Sj themselves; any negative values among them obviously reflect a mathematical artifact, precluding any physical interpretation. The two fit indices, R² and χ²red., are considered by the US EPA as primary performance measures of an EFWLS solution, along with the percent mass explained (%mass), although the latter can be misleading when the total mass concentration that has been measured for the PM of the ambient sample is small (Coulter, 2004; Watson, 2004).

The fraction FracEst of those source contributions that have acceptably large projection lengths in the eligible space is another primary performance measure, established by the US EPA to verify that the least squares system being solved is well conditioned (Coulter, 2004; Watson, 2004). The standard error σ defining the eligible space is set equal to a percentage of the ambient PM concentration, which is adjusted by the user of the US EPA CMB 8.2 model. The threshold value defining almost unit projection lengths in the eligible space is also adjusted by the user.

Each T-statistic ratio Tstatj is an additional performance measure, established by the US EPA as an indicator of whether source contribution j is below the detection limit or not. Low T-statistic values for several source contributions may be caused by collinearity among their profiles; the presence of collinearity could be uncovered in this case by the projections of the source contributions in a properly adjusted eligible space (Coulter, 2004; Watson, 2004). Finally, each ratio (Res/Uncer)i is also an additional performance measure, established by the US EPA, which specifies the number of uncertainty intervals by which the calculated and measured concentrations of species i differ. A negative value of (Res/Uncer)i shows that there is an insufficient contribution to species i, indicating, according to the US EPA, that there may be source profile(s) missing (Coulter, 2004; Watson, 2004).

Table 1. Diagnostic criteria of an EFWLS solution of a CMB problem.

Performance measure(s) | Target value(s) (US EPA)
$S_j$ | > 0
$R^2 = 1 - \dfrac{\chi^2}{\sum_{i=1}^{m} C_i^2 \left( v_{e,ii}^{k} \right)^{-1}}$ | ≥ 0.8
$\chi^2_{red.} = \dfrac{\chi^2}{m - n}$ | ≤ 4
$\%mass = \dfrac{\sum_{j=1}^{n} S_j}{C_{mass}} \cdot 100$ | 100% ± 20%
$FracEst = \dfrac{E}{n}$ | 1
$Tstat_j = \dfrac{S_j}{\sqrt{var(S_j)}}$ | ≥ 2
$(Res/Uncer)_i = \dfrac{(A^k S)_i - C_i}{\sqrt{\sigma_{C_i}^2 + \sum_{j=1}^{n} \left( S_j\, \sigma_{a_{ij}} \right)^2}}$ | $|(Res/Uncer)_i| \le 2$

An overall fitting index, named Fit Measure, has also been established by the US EPA. It is defined by the primary performance measures according to equation (15), if %mass is smaller than 100 (Coulter, 2004; Watson, 2004):

$$\text{Fit Measure} = \frac{wf_1 \cdot \dfrac{1}{\chi^2} + wf_2 \cdot R^2 + wf_3 \cdot \dfrac{\%mass}{100} + wf_4 \cdot FracEst}{wf_1 + wf_2 + wf_3 + wf_4} \tag{15}$$

where wf1, wf2, wf3, wf4 are adjustable weighting fractions (their default value is 1). If %mass is larger than 100, then its quotient must be reversed in equation (15), i.e. replaced by 100/%mass, in order to define Fit Measure (Coulter, 2004; Watson, 2004).
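Equation (15), including the reversed quotient for %mass above 100, can be written as a short function. This is a sketch of the formula as stated in the text; the function and argument names are hypothetical, and the chi-square value is passed in exactly as it appears in equation (15).

```python
def fit_measure(chi2, r2, pct_mass, frac_est, wf=(1.0, 1.0, 1.0, 1.0)):
    """Overall fitting index of equation (15).

    chi2     : chi-square of the fit (must be positive)
    r2       : R-square of the fit
    pct_mass : percent mass explained, in percent
    frac_est : fraction of estimable source contributions (FracEst)
    wf       : the four weighting fractions, all 1 by default
    """
    if chi2 <= 0 or pct_mass <= 0:
        raise ValueError("chi2 and pct_mass must be positive")
    # The mass quotient is reversed when %mass exceeds 100.
    mass_term = pct_mass / 100.0 if pct_mass <= 100.0 else 100.0 / pct_mass
    terms = (1.0 / chi2, r2, mass_term, frac_est)
    return sum(w * t for w, t in zip(wf, terms)) / sum(wf)

under = fit_measure(chi2=2.0, r2=0.9, pct_mass=80.0, frac_est=1.0)
over = fit_measure(chi2=1.0, r2=1.0, pct_mass=125.0, frac_est=1.0)
print(under, over)
```

With the default weights the first call averages 0.5, 0.9, 0.8 and 1, and the second averages 1, 1, 0.8 and 1. Over-explaining the mass by 25% is therefore penalized exactly as much as under-explaining it by 20%.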

1.8. Limitations of conventional CMB modeling

Although CMB modeling relies on the assumption that every source contributing significantly to the receptor site has been identified, it is most often applied without definite knowledge of which sources actually do. Even if the polluting sources are all known, CMB modeling also requires that their compositions remain constant over the period of ambient and source sampling, which is unlikely to occur (Javitz et al., 1988; Watson et al., 1984). Moreover, if the algorithm of Britt and Luecke or the EFWLS approximation has been adopted for the solution of the CMB problem, there is virtually no way to predict, before run-time, an occurrence of collinearity among the columns of the effectively weighted source profile matrix (Watson et al., 1997).

Due to these common violations of CMB assumptions, CMB modeling in practice generally involves first applying a fitting method (usually the EFWLS approximation) to a range of least squares systems, defined by different chemical species and/or source profiles, which are all considered equally probable for reflecting the true emissions at the receptor site. Afterwards, if any of these least squares systems actually converge to solutions that present acceptable performance measures, they are ranked according to an overall fitting index. The final decision about the fitting species and/or source profiles that best represent PM emissions at the receptor site may also depend on additional information about the study area, as well as personal judgment (Cheng and Hopke, 1986; Lowenthal et al., 1992; Watson et al., 1997; Watson, 2004).

That preliminary task of practical CMB modeling often complicates SA, because a typical set of input data gathered for routine environmental studies can potentially define a very wide range of “candidate” least squares systems (Samara et al., 2003; Samara, 2005), far exceeding the capabilities of existing CMB modeling software, as illustrated below. Equation (16) determines the total number PN,M of least squares systems that can possibly be defined by a set of input data which includes the ambient concentrations of M chemical species, along with N profiles (M > N) of the chemical abundances of these species in probable sources:

$$P_{N,M} = \sum_{J=1}^{N} \sum_{I=J+1}^{M} \frac{N!}{J!\,(N-J)!} \cdot \frac{M!}{I!\,(M-I)!} \tag{16}$$

According to equation (16), a typical set of input data, which includes the ambient concentrations of 23 chemical species along with 18 source profiles (like the one illustrated below as a test case), can define an astronomical total of 1,613,294,846,589 least squares systems for one ambient sample. Even under the additional condition that the degrees of freedom (I − J) of “candidate” least squares systems must be at least a threshold number, say K, there are still 585,711,642,651 least squares systems that possess 5 degrees of freedom or more, according to equation (17):

$$P_{N,K} = \sum_{J=1}^{N} \sum_{I=J+K}^{M} \frac{N!}{J!\,(N-J)!} \cdot \frac{M!}{I!\,(M-I)!} \tag{17}$$

In fact, however, any trial applications of a fitting method can be limited to an even narrower sub-range, involving only least squares systems that consist of specific fitting species, among which all the markers of probable sources have been included. The total number PN of least squares systems that belong to such a sub-range can be determined by equation (18):

$$P_N = \sum_{J=1}^{N} \frac{N!}{J!\,(N-J)!} = 2^N - 1 \tag{18}$$

Under the latter condition, the ambient concentrations of specific chemical species, along with 18 source profiles, can define 262,143 least squares systems. According to the US EPA protocol for applying and validating the CMB model, initial source contribution estimates that are obtained for an average ambient PM sample should be optimized separately for each ambient PM sample of the case study, by addition, deletion or substitution of source profiles, after examination of the performance measures (Watson, 2004). Nevertheless, the trial CMBs performed on one ambient sample normally include far fewer combinations of source profiles than the number given by equation (18), usually a few hundred or so for a routine environmental study (Chow et al., 1990). The US EPA CMB 8.2 model, in particular, operating in Best Fit Mode, is capable of ranking, according to the Fit Measure index, a maximum of only 10 least squares systems, whose fitting species and source profiles must have been manually selected by the CMB modeler, using 10 pairs of species and profiles selection arrays that are provided for this purpose by the model’s graphical user interface (Coulter, 2004).
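The three counts quoted for N = 18 profiles and M = 23 species can be reproduced directly from equations (16) to (18) with exact integer arithmetic. The function names below are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def count_systems(N, M, K=1):
    """Equations (16) and (17): J profiles combined with I fitting species,
    keeping only systems with at least K degrees of freedom (I - J >= K).
    K = 1 reproduces equation (16)."""
    return sum(comb(N, J) * comb(M, I)
               for J in range(1, N + 1)
               for I in range(J + K, M + 1))

def count_profile_subsets(N):
    """Equation (18): non-empty subsets of N profiles for fixed fitting species."""
    return sum(comb(N, J) for J in range(1, N + 1))

print(count_systems(18, 23))        # equation (16)
print(count_systems(18, 23, K=5))   # equation (17) with K = 5
print(count_profile_subsets(18))    # equation (18)
```

Only the last count, a few hundred thousand systems per ambient sample, is small enough to be enumerated exhaustively in routine work.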


It is apparent that the standard trial-and-error procedures of conventional CMB modeling, such as the manually driven Best Fit Mode of the US EPA CMB 8.2 model, are not only considerably laborious, but can also be strongly susceptible to personal prejudices about the study area, because they can never rule out the possibility that untested combinations of source profiles fit the ambient data better than the relatively few combinations that were tested.

2. Method and software description

2.1. Development of the RCMB computational procedure

Fig. 1 shows a logical diagram of RCMB. A fitting method, chosen among the ones available, is applied in turn to every one of the least squares systems that can be defined by an introduced set of input data according to equation (18). The fitting methods available in RCMB are the ordinary least squares, the ordinary weighted least squares, the complete version of the algorithm of Britt and Luecke, and the EFWLS approximation. Before the beginning of the computational procedure, the user of the software can also choose the diagnostic criteria according to which any successful applications of the fitting method will be validated during run-time. The diagnostic criteria available for the validation of possible fits are the absence of any negative values among the estimates for the source contributions, the values of the fit indices χ²red. and R², the value of %mass, the value of the fraction FracEst, and the values of the T-statistic ratios.

During run-time, if a trial application of the chosen fitting method is successful, then a text line reporting the estimates for the source contributions, the performance measures, and the value of the Fit Measure function is stored for that trial, provided that all the diagnostic criteria selected for the validation of fits have been met. After the chosen fitting method has been applied to every least squares system that can be defined by the introduced set of input data according to equation (18), any text lines that have been stored, reporting the validated solutions of that running session, are sorted according to the Fit Measure index of equation (15).

The explicit advantage of the proposed computational procedure is that the best-fit combination of source profiles, deriving straightforwardly from the maximization of Fit Measure, provides a unique solution to the conventional CMB problem, which minimizes personal judgment.

2.2. Description of software

The software package that implements RCMB consists of a Microsoft Excel Macro, written in Visual Basic 6.3, which implements the GUI, and an executable code that performs the least squares fitting calculations, compiled with the Free Pascal IDE for Win32 for i386, into which the TPMath Library of Jean Debord (2007) has been incorporated. The Macro is loaded by Excel from a Workbook, whose data sheets serve either for storing input data or for displaying any output data after a running session.

The main menu of the GUI is shown in Fig. 2a. For each ambient sample that is available for selection from the combo boxes of main menu, a line must have been included in the “Ambient Data” sheet of the Macro’s Workbook before the Macro is started, registering, at specific columns, the location and date of sampling, the size of PM, the mass concentration of PM, the ambient concentrations of measured chemical species, and the typical errors of these concentrations. Fig. 2b includes the sub menu “Probable Sources” (appearing when the command button of main menu, named “Select Sources”, is pressed), from which the probable sources of the ambient sample’s PM are determined. For each source that is displayed by the sub menu “Probable Sources”, a line must also have been included in the “Source Profiles” sheet of the Macro’s Workbook before the Macro is started, registering, at specific columns, the name, code and origin of the source, the size of PM, the mass fractions of measured chemical species, and the typical errors of these fractions.


Fig. 1. Logic Diagram of Robotic Chemical Mass Balance.
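The enumerate, fit, validate and rank loop of the RCMB procedure can be sketched in a few lines of Python. This is an illustration of the idea only, not the Pascal code of the RCMB package: it assumes NumPy, uses the OWLS fit as a stand-in for the selectable fitting method, applies an assumed subset of validation targets, and ranks with a simplified index that omits the FracEst term of equation (15). All names and data are hypothetical.

```python
from itertools import combinations
import numpy as np

def owls_fit(A, C, sigma_C):
    """OWLS fit of one least squares system; returns (S, chi2) or None."""
    g = 1.0 / sigma_C                          # row scaling, as in equation (9)
    A_w, C_w = A * g[:, None], C * g
    S, _, rank, _ = np.linalg.lstsq(A_w, C_w, rcond=None)
    if rank < A.shape[1]:                      # perfectly collinear profiles
        return None
    resid = C_w - A_w @ S
    return S, float(resid @ resid)

def robotic_cmb(A, C, sigma_C, C_mass, names):
    """Fit every non-empty subset of profiles, validate, and rank the fits."""
    m, N = A.shape
    total_ss = float(np.sum((C / sigma_C) ** 2))
    ranked = []
    for J in range(1, min(N, m - 1) + 1):      # keep at least 1 degree of freedom
        for cols in combinations(range(N), J):
            fit = owls_fit(A[:, list(cols)], C, sigma_C)
            if fit is None:
                continue
            S, chi2 = fit
            r2 = 1.0 - chi2 / total_ss
            chi2_red = chi2 / (m - J)
            pct_mass = 100.0 * float(S.sum()) / C_mass
            # Assumed targets: S_j > 0, R2 >= 0.8, chi2_red <= 4, 80-120 % mass.
            valid = (np.all(S > 0) and r2 >= 0.8 and chi2_red <= 4.0
                     and 80.0 <= pct_mass <= 120.0)
            if not valid:
                continue
            mass_term = pct_mass / 100.0 if pct_mass <= 100.0 else 100.0 / pct_mass
            inv_chi2 = 1.0 / chi2 if chi2 > 0 else float("inf")
            index = (inv_chi2 + r2 + mass_term) / 3.0   # simplified equation (15)
            ranked.append((index, tuple(names[c] for c in cols), S))
    ranked.sort(key=lambda row: row[0], reverse=True)
    return ranked

# Hypothetical sample: SRC_A and SRC_C emit, SRC_B does not.
A = np.array([[0.5, 0.0, 0.0],
              [0.3, 0.0, 0.1],
              [0.0, 0.4, 0.0],
              [0.0, 0.0, 0.6],
              [0.1, 0.1, 0.1]])
C = np.array([4.0, 2.8, 0.0, 2.4, 1.1])
sigma_C = np.full(5, 0.1)

ranking = robotic_cmb(A, C, sigma_C, C_mass=12.0, names=["SRC_A", "SRC_B", "SRC_C"])
best_index, best_profiles, best_S = ranking[0]
print(best_profiles, best_S)
```

In this small example only the combination of SRC_A and SRC_C passes validation: every other subset either yields a negative contribution or explains too little of the mass. The full enumeration grows as 2^N − 1, which is why the procedure is restricted to the sub-range of equation (18).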


Fig. 2. GUI of RCMB.

The fitting method of a running session is chosen by pressing the “Select Method” command button of main menu, which loads the sub menu “Select Method”, included in Fig. 2c. The check box that appears at the bottom of this sub menu enables Robotic Mode; if it has not been checked before the beginning of a running session, then the selected fitting method will only be applied to the single least squares system that includes all the selected source profiles.

Pressing the command button of main menu, named “Settings”, loads the “Settings” sub menu, also included in Fig. 2c, from which the eligible space of Henry’s criterion is adjusted, and the performance measures that are going to be employed for the validation of fits, during run-time, are checked out. Finally, the fitting species of a running session are picked out from the registered ones, using the “Fitting Species” sub menu,

Author's personal copy

G. Argyropoulos, C. Samara / Environmental Modelling & Software 26 (2011) 469e481

475

Fig. 3. Output of the RCMB.

included in Fig. 2d, which is loaded by pressing the “Select Species” command button of the main menu. As soon as the user of RCMB has selected the ambient sample, the probable sources, the fitting method, and the fitting species of a running session, the “Run” command button of the main menu is enabled, from which a running session is started. After the least squares fitting calculations have been completed, the output file that the executable code produces is imported by the Macro; if there are any lines reporting validated solutions, they are sorted according to the Fit Measure function and eventually displayed in the “Contributions” Sheet of the Macro’s Workbook (Fig. 3).

3. Test case

The RCMB model was applied to the Crows, California PM2.5 data from the San Joaquin Valley Air Quality Study (SJVAQS), available as a test case with the US EPA CMB 8.2 model (Coulter, 2004). This data set includes the PM2.5 concentrations of mass, organic carbon (OC) and elemental carbon (EC), nitrate (NO₃⁻), sulfate (SO₄²⁻), ammonium (NH₄⁺), soluble sodium (Na⁺) and potassium (K⁺), and elemental species (Al, Si, S, Cl, K, Ca, Ti, V, Cr, Mn, Fe, Ni, Cu, Zn, Br, and Pb), registered for 33 24-h aerosol samples collected at Crows Landing between June 1988 and June 1989 (Chow et al., 1990; Coulter, 2004). Eighteen source profiles, listed in Table 2, were selected by Chow et al. (1990) as input data for the CMB model, representing major source types of particulate emissions in the San Joaquin Valley (SJV). Source contributions to ambient PM2.5 were determined from the EFWLS solution of the CMB problem, using the USEPA/DRI CMB 7.0 model. Estimates of source contributions were calculated independently for each ambient PM2.5 sample, with manual addition, deletion, or substitution of source profiles after examination of the performance measures listed in Table 1. Sulfur was not included as a fitting species, since sulfate was also available. Copper and zinc were also excluded from the fitting species of all the CMBs performed for the SJVAQS, because brass shavings from sampling fittings had compromised their measurements (Chow et al., 1990).
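As a minimal sketch of the species selection just described, the snippet below builds a candidate list of fitting species from the chemical species named above and drops sulfur, copper, and zinc. The variable names and species labels are illustrative assumptions, and leaving PM2.5 mass out of the candidate list is also an assumption of this sketch; the exact set of fitting species used by Chow et al. (1990) is not reproduced here.

```python
# Chemical species reported for the Crows Landing PM2.5 samples.
# PM2.5 mass is left out of the candidate list (an assumption of this sketch).
measured_species = [
    "OC", "EC", "NO3-", "SO4=", "NH4+", "Na+", "K+",
    "Al", "Si", "S", "Cl", "K", "Ca", "Ti", "V", "Cr",
    "Mn", "Fe", "Ni", "Cu", "Zn", "Br", "Pb",
]

# Species excluded in the text, with the reason given there.
excluded = {
    "S": "sulfate was also available",
    "Cu": "measurements compromised by brass shavings",
    "Zn": "measurements compromised by brass shavings",
}

# Candidate fitting species: everything measured that was not excluded.
fitting_species = [s for s in measured_species if s not in excluded]

print(len(fitting_species), "candidate fitting species:", fitting_species)
```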

The average contributions that were derived for major source types are shown in Fig. 4a. Because the geological profiles (SOIL01, SOIL03, SOIL16, and SOIL17) were too collinear to be distinguished from one another by the EFWLS, only a single profile representing this source type was used in the CMB apportionments for all ambient samples. The same was also true for the vegetative burning profiles (BAMAJC and STAGBC), the motor vehicle exhaust profiles (MOVES2 and WHDIEC), and the oil combustion profiles (SFCRUC and CHCRUC).

As in the original CMB analysis, the RCMB receptor model was applied to the SA of each ambient PM2.5 sample, using the EFWLS fitting method, the same fitting species as those chosen by Chow et al. (1990), and all the source profiles listed in Table 2 (N = 18). The performance measures used for the automatic validation of successful convergences included the absence of any negative values among the source contribution estimates, the fit indices R² and χ²red, and the T-stat ratios. As in Chow et al. (1990), % mass was included among the performance measures of RCMB only for those ambient PM2.5 samples whose measured mass concentrations were above 10 μg/m3, since lower values were within a few percent precision intervals of the PM2.5 mass measurements. All the other performance measures, listed in Table 1, were inspected manually for registered least squares systems after the end of each running session.

Table 2
Source Profiles Applied in the SJVAQS (Chow et al., 1990).

Profile ID   Profile Mnemonic   Description of Source Profile
01           SOIL01             Stockton Agricultural (Peat) Soil
02           SOIL03             Fresno Paved Road Dust
15           SOIL16             Bakersfield Unpaved Road Dust (Residential)
16           SOIL17             Taft Unpaved Road Dust
17           BAMAJC             Wood smoke emissions, Bakersfield Cordwood Using Majestic Fireplace
35           STAGBC             Stockton Agricultural (Wheat) Burning
37           MOVES2             South Coast Motor Vehicle Emissions, MOVES-SS (NEAEWOB, WOT, TVMT)
29           WHDIEC             Wheeler High Station Diesel Truck Emissions
30           MOTIBC             Modesto Tire Fueled Power Plant Emissions
27           SFCRUC             Santa Fe Crude Oil Boiler Emissions
39           CHCRUC             Chevron Racetrack Crude Oil Boiler Emissions
42           SCRRFC             Stanislaus County Municipal Waste Fueled Power Plant Emissions
51           AMSUL              Ammonium Sulfate, Secondary Aerosol
54           AMNIT              Ammonium Nitrate, Secondary Aerosol
56           NANO3              Sodium Nitrate, Reacted Marine Aerosol
35           MARINE             Primary Marine Aerosol
61           LIME               Primary Construction Emissions (Limestone)
60           OC                 Secondary Organic Carbon

Fig. 4. Average source contributions to ambient PM2.5 at Crows Landing: (a) SJVAQS, (b) RCMB.

The output of RCMB is summarized for each ambient PM2.5 sample in Table 3. It is apparent that almost all the sets of input data listed in Table 3 define a plethora of least squares systems converging to solutions that meet common statistical criteria, which confirms the well-established fact (Cheng and Hopke, 1986) that two different solutions, both having acceptable performance measures, can often be found for the conventional CMB problem by two different people. Table 4 lists the source contribution estimates (μg/m3), the performance measures R², χ²red, and % mass, and the Fit Measure index of the best fit that was determined for each ambient PM2.5 sample by RCMB. It is apparent that source profiles which had not been resolved by the original CMB analysis due to collinearity, such as BAMAJC and STAGBC, were estimable by the preference of RCMB

Table 3
Summarization of the output of RCMB.

Sampling date   Failures of convergence   Negative estimate(s)   Low R²   High χ²   High or low %mass   Low T-Stat value(s)   Possible Fits
20/06/88        26604                     90474                  50041    94648     11                  351                   14
02/07/88        19299                     127343                 29068    84755     145                 1483                  50
14/07/88        9447                      197485                 8079     44144     59                  2927                  2
26/07/88        22534                     131458                 16667    88117     6                   3025                  336
07/08/88        28313                     113921                 32654    84124     146                 2943                  42
19/08/88        29145                     122686                 22126    86783     35                  1303                  65
25/08/88        23712                     153009                 1794     81254     92                  2215                  67
31/08/88        26226                     108371                 23204    102767    8                   1422                  145
06/09/88        23977                     125474                 23986    86260     41                  2345                  60
12/09/88        15393                     157746                 1825     83178     216                 3732                  53
18/10/88        27657                     114845                 22072    96066     23                  1457                  23
30/10/88        33434                     92837                  28560    106838    27                  399                   48
11/11/88        15779                     121004                 21458    102106    N/A                 1646                  150
17/11/88        2559                      102699                 116452   38646     N/A                 1772                  15
23/11/88        24057                     90142                  19711    127474    0                   679                   80
29/11/88        32508                     88232                  18742    122586    0                   47                    28
05/12/88        28378                     101358                 18657    112737    0                   828                   185
11/12/88        32806                     84878                  15775    127799    0                   783                   102
17/12/88        31989                     101186                 18523    109331    36                  767                   311
23/12/88        12138                     111562                 20573    115405    N/A                 2440                  25
29/12/88        19058                     109007                 18647    113666    54                  1649                  62
04/01/89        29828                     92670                  19337    119358    0                   880                   70
10/01/89        21724                     94830                  18995    124377    0                   2178                  39
16/01/89        32394                     92731                  17867    118081    0                   742                   328
22/01/89        35384                     72244                  16458    137529    0                   469                   59
28/01/89        31069                     80933                  13165    135573    0                   1210                  193
03/02/89        1806                      126134                 120163   13564     N/A                 464                   12
09/02/89        31621                     85784                  18999    123885    0                   1768                  86
15/02/89        33264                     88230                  17141    122827    0                   652                   29
21/02/89        30671                     90817                  19169    119865    0                   1594                  27
23/03/89        20734                     111295                 20587    109017    N/A                 447                   63
10/04/89        10355                     135386                 31377    81531     N/A                 3295                  199
10/05/89        10743                     114201                 64800    64427     N/A                 7913                  59
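Each row of Table 3 can be read as a tally over every non-empty combination of the N = 18 source profiles. The sketch below, with hypothetical helper names, counts those combinations and checks that the counts transcribed from two rows of Table 3 add up to that total. It only illustrates the bookkeeping and is not part of the RCMB code.

```python
from itertools import combinations


def count_nonempty_subsets(n_profiles):
    """Number of non-empty combinations of n_profiles source profiles."""
    return 2 ** n_profiles - 1


def iter_profile_subsets(profiles):
    """Yield every non-empty combination of the given profile names."""
    for size in range(1, len(profiles) + 1):
        yield from combinations(profiles, size)


TOTAL = count_nonempty_subsets(18)

# Counts transcribed from Table 3 in column order (failures of convergence,
# negative estimates, low R2, high chi2, %mass, low T-stat, possible fits).
# None stands for an N/A entry.
table3_rows = {
    "20/06/88": [26604, 90474, 50041, 94648, 11, 351, 14],
    "11/11/88": [15779, 121004, 21458, 102106, None, 1646, 150],
}

row_totals = {
    date: sum(count for count in counts if count is not None)
    for date, counts in table3_rows.items()
}

print("combinations of 18 profiles:", TOTAL)
for date, total in row_totals.items():
    print(date, "row total:", total, "matches:", total == TOTAL)
```

The generator is shown only for small inputs; walking all combinations of 18 profiles in pure Python is feasible but slow, and each combination would still need its own least squares fit.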


Table 4
Source contribution estimates (μg/m3) and performance measures values of the Best Fits of RCMB.a

Date SOIL01 SOIL03 SOIL16 SOIL17 BAMAJC STAGBC MOVES2 WHDIEC MOTIBC SFCRUC CHCRUC SCRRFC AMSUL AMNIT NANO3 MARINE OC LIME R² χ²red %mass Fit Measure
20/06/88 (0.00) (0.00) (0.00) (0.00) (0.00) (0.00) 2.61 (0.00) (0.00) (0.00) (0.00) 3.55 1.96 (0.00) 1.45 (0.00) (0.00) (0.00) 0.88 1.38 85 0.8168
02/07/88 (0.00) (0.00) 1.44 (0.00) 1.1 (0.00) 1.94 (0.00) (0.00) (0.00) (0.00) 1.23 2.04 (0.00) 0.91 (0.00) 1.94 (0.00) 0.97 0.9 77 0.9478
14/07/88 2.6 (0.00) (0.00) (0.00) (0.00) (0.00) (0.00) 1.77 (0.00) (0.00) (0.00) 4.59 (0.00) (0.00) (0.00) (0.00) (0.00) (0.00) 0.84 2.37 77 0.6753
26/07/88 (0.00) (0.00) 4.38 (0.00) (0.00) 2.51 3.38 (0.00) (0.00) 1.07 (0.00) 2.56 3.81 (0.00) 0.91 (0.00) (0.00) (0.00) 0.97 1.05 88 0.9364
07/08/88 (0.00) (0.00) 1.29 (0.00) (0.00) 1.8 1.64 (0.00) (0.00) (0.00) (0.00) (0.00) 3.49 (0.00) 0.73 (0.00) 1.57 (0.00) 0.93 1.61 76 0.7721
19/08/88 (0.00) (0.00) 3.61 (0.00) (0.00) 2.69 5.71 (0.00) (0.00) (0.00) (0.00) 2.45 2.88 (0.00) 1.34 (0.00) (0.00) (0.00) 0.97 0.95 78 0.9331
25/08/88 (0.00) (0.00) 8.71 (0.00) 4.21 (0.00) 3.61 (0.00) (0.00) (0.00) (0.00) (0.00) 2.83 (0.00) 1.03 (0.00) (0.00) (0.00) 0.98 0.84 77 0.9805
31/08/88 (0.00) (0.00) 4.94 (0.00) 1.69 (0.00) 4.34 (0.00) (0.00) 1.92 (0.00) 4.91 4.27 (0.00) 3.93 (0.00) (0.00) (0.00) 0.98 0.73 87 1.0749
06/09/88 (0.00) (0.00) 4.22 (0.00) (0.00) (0.00) 4.98 (0.00) (0.00) (0.00) (0.00) 4.25 1.61 (0.00) 1.11 (0.00) (0.00) (0.00) 0.95 1.32 83 0.8459
12/09/88 (0.00) (0.00) (0.00) (0.00) 3.26 (0.00) (0.00) (0.00) (0.00) (0.00) (0.00) 17.15 0.93 (0.00) 0.86 (0.00) 6.6 (0.00) 0.93 0.93 83 0.9465
18/10/88 (0.00) (0.00) (0.00) (0.00) (0.00) (0.00) 3.48 (0.00) (0.00) (0.00) (0.00) 15.73 2.59 2.86 (0.00) (0.00) (0.00) (0.00) 0.93 1.15 87 0.8887
30/10/88 (0.00) (0.00) 1.04 (0.00) 3.26 (0.00) 4.5 (0.00) (0.00) 0.42 (0.00) (0.00) 7.35 6.15 (0.00) (0.00) (0.00) (0.00) 0.97 0.87 78 0.9661
11/11/88 (0.00) (0.00) 0.92 (0.00) 2.06 (0.00) 2.08 (0.00) (0.00) (0.00) (0.00) (0.00) 1.01 1.28 1.02 (0.00) (0.00) (0.00) 0.97 0.63 92 1.1566
17/11/88 (0.00) (0.00) (0.00) (0.00) 0.71 (0.00) (0.00) (0.00) (0.00) (0.00) (0.00) (0.00) 0.83 (0.00) 1.17 (0.00) (0.00) (0.00) 0.91 0.48 70 1.2329
23/11/88 (0.00) (0.00) (0.00) (0.00) 0.85 (0.00) 1.66 (0.00) (0.00) 0.71 (0.00) 1.82 0.83 6.59 0.86 (0.00) (0.00) (0.00) 0.92 1.97 93 0.7853
29/11/88 (0.00) (0.00) (0.00) (0.00) 1.36 (0.00) 2.23 (0.00) (0.00) (0.00) 0.11 (0.00) 3.43 16.42 0.49 (0.00) (0.00) (0.00) 0.98 0.37 86 1.5165
05/12/88 (0.00) (0.00) 0.46 (0.00) 2.39 (0.00) 4.58 (0.00) (0.00) (0.00) (0.00) (0.00) 1.51 13.12 (0.00) (0.00) 5.7 (0.00) 0.96 0.9 98 1.0187
11/12/88 (0.00) 0.67 (0.00) (0.00) 4.63 (0.00) 3.82 (0.00) (0.00) 0.59 (0.00) (0.00) 3.58 28.87 (0.00) (0.00) 3.14 (0.00) 0.97 1.03 89 0.9431
17/12/88 (0.00) (0.00) 0.3 (0.00) 2.7 (0.00) 2.13 (0.00) (0.00) 0.31 (0.00) (0.00) 2.35 9.27 0.76 (0.00) 1.68 (0.00) 0.96 1.15 83 0.8874
23/12/88 (0.00) (0.00) (0.00) (0.00) 0.49 (0.00) 1.05 (0.00) (0.00) (0.00) (0.00) (0.00) 1.01 3.35 1.03 (0.00) (0.00) (0.00) 0.97 0.41 80 1.4047
29/12/88 (0.00) (0.00) (0.00) (0.00) 0.7 (0.00) 1.3 (0.00) (0.00) 0.29 (0.00) (0.00) 1.56 4.99 0.7 (0.00) (0.00) (0.00) 0.94 1.14 78 0.8653
04/01/89 (0.00) (0.00) (0.00) (0.00) 0.56 (0.00) 1.4 (0.00) (0.00) 0.28 (0.00) (0.00) 3.57 12.77 0.65 (0.00) (0.00) (0.00) 0.99 0.26 91 1.9333
10/01/89 (0.00) (0.00) (0.00) (0.00) 1.27 (0.00) 2.01 (0.00) (0.00) 0.44 (0.00) (0.00) 1.43 13.22 (0.00) (0.00) (0.00) (0.00) 0.94 1.12 84 0.8925
16/01/89 (0.00) (0.00) (0.00) 0.47 2.29 (0.00) 2.13 (0.00) (0.00) 0.75 (0.00) (0.00) 1.78 11.53 0.62 (0.00) 1.74 (0.00) 0.98 0.53 93 1.2641
22/01/89 (0.00) (0.00) 0.36 (0.00) 2.36 (0.00) 4.59 (0.00) (0.00) 0 0.31 (0.00) 3.61 31.47 (0.00) 0.27 (0.00) (0.00) 0.94 1.59 90 0.8222
28/01/89 (0.00) (0.00) 0.26 (0.00) 2.78 (0.00) 3.58 (0.00) (0.00) 0.77 0 (0.00) 6.17 39.64 0.78 (0.00) 2.99 (0.00) 0.98 0.58 89 1.1981
03/02/89 (0.00) (0.00) (0.00) (0.00) 0.41 (0.00) (0.00) 0.78 (0.00) (0.00) (0.00) (0.00) 0.54 0.85 0.45 (0.00) (0.00) (0.00) 0.96 0.21 72 2.1715
09/02/89 (0.00) 0.77 (0.00) (0.00) 1.34 (0.00) 1.71 (0.00) (0.00) (0.00) (0.00) (0.00) 3.46 18.95 0.63 (0.00) (0.00) (0.00) 0.99 0.22 91 2.1244
15/02/89 (0.00) (0.00) (0.00) (0.00) 2 (0.00) 1.49 (0.00) (0.00) (0.00) (0.00) (0.00) 2.93 18.18 0.54 (0.00) 1.77 (0.00) 0.99 0.23 93 2.1014
21/02/89 (0.00) (0.00) (0.00) (0.00) 1.42 (0.00) 1.48 (0.00) (0.00) (0.00) (0.00) (0.00) 2.38 12.89 0.51 (0.00) 1.57 (0.00) 0.98 0.44 105 1.4068
23/03/89 (0.00) 0.46 (0.00) (0.00) 0.57 (0.00) 1.73 (0.00) (0.00) (0.00) (0.00) (0.00) 1.34 2.07 0.93 (0.00) (0.00) (0.00) 0.98 0.43 89 1.3967
10/04/89 0.98 (0.00) (0.00) (0.00) 1.02 (0.00) 1.46 (0.00) (0.00) (0.00) (0.00) 3 1.3 (0.00) 1.08 (0.00) (0.00) (0.00) 0.99 0.23 90 2.0790
10/05/89 0.51 (0.00) (0.00) (0.00) 0.41 (0.00) (0.00) (0.00) 0.05 (0.00) (0.00) (0.00) 1.2 (0.00) 0.82 0.22 (0.00) (0.00) 0.98 0.28 70 1.7698

a Source mnemonics are explained in Table 2.
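Only fits that pass the automatic validation are candidates for the best fits listed in Table 4. A minimal sketch of such a screening step is given below. The function name, the argument names, and the order of the checks (which simply follows the column order of Table 3) are assumptions of this sketch; the target values themselves belong to Table 1 and are therefore passed in by the caller rather than hard-coded. The only rule taken directly from the text is that % mass is checked solely when the measured mass concentration exceeds 10 μg/m3.

```python
def classify_fit(converged, estimates, r2, chi2, pct_mass, t_stats,
                 measured_mass, *, r2_min, chi2_max, pct_mass_range, t_min,
                 mass_floor=10.0):
    """Return the Table 3 category of one least squares system.

    Thresholds are caller-supplied; mass_floor is the 10 ug/m3 limit below
    which the % mass criterion is not applied.
    """
    if not converged:
        return "failure of convergence"
    if any(value < 0 for value in estimates):
        return "negative estimate(s)"
    if r2 < r2_min:
        return "low R2"
    if chi2 > chi2_max:
        return "high chi2"
    if measured_mass > mass_floor:
        low, high = pct_mass_range
        if not low <= pct_mass <= high:
            return "high or low %mass"
    if any(t < t_min for t in t_stats):
        return "low T-stat value(s)"
    return "possible fit"


# Placeholder thresholds for illustration only; they are NOT taken from Table 1.
limits = dict(r2_min=0.8, chi2_max=4.0, pct_mass_range=(80.0, 120.0), t_min=2.0)

# Illustrative inputs (not data from the paper): same fit, two measured masses.
low_mass = classify_fit(True, [1.0, 2.0], 0.9, 1.0, 50.0, [3.0, 4.0], 5.0, **limits)
high_mass = classify_fit(True, [1.0, 2.0], 0.9, 1.0, 50.0, [3.0, 4.0], 20.0, **limits)
print(low_mass)
print(high_mass)
```

With these placeholder limits the low-mass sample is accepted because % mass is skipped, while the same fit is rejected once the measured mass exceeds the 10 μg/m3 floor.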


Table 5 Trial CMBs applied to the ambient PM2.5 sample of 02/07/88, using the US EPA CMB 8.2 model.


Table 6
Source contribution estimates (μg/m3) and performance measures values of the first 15 possible fits determined by RCMB for the ambient PM2.5 sample of 02/07/88.a

Rank SOIL01 SOIL03 SOIL16 SOIL17 BAMAJC STAGBC MOVES2 WHDIEC MOTIBC SFCRUC CHCRUC SCRRFC AMSUL AMNIT NANO3 MARINE OC LIME %mass χ²red R² Fit Measure
1 (0.00) (0.00) 1.44 (0.00) 1.1 (0.00) 1.94 (0.00) (0.00) (0.00) (0.00) 1.23 2.04 (0.00) 0.91 (0.00) 1.94 (0.00) 77 0.9 0.97 0.9478
2 (0.00) (0.00) 1.35 (0.00) (0.00) 0.43 2.03 (0.00) (0.00) (0.00) (0.00) 1.5 2 (0.00) 0.9 (0.00) 2.19 (0.00) 76 0.9 0.96 0.9448
3 1.35 (0.00) (0.00) (0.00) (0.00) 0.6 1.86 (0.00) (0.00) (0.00) (0.00) 1.74 1.98 (0.00) 0.9 (0.00) 2.1 (0.00) 77 1.01 0.96 0.9066
4b (0.00) (0.00) (0.00) (0.00) 1.24 (0.00) 1.67 (0.00) (0.00) (0.00) (0.00) 4.38 1.8 (0.00) 0.82 (0.00) 1.67 (0.00) 84 1.15 0.92 0.8797
5 1.47 (0.00) (0.00) (0.00) 1.4 (0.00) 1.76 (0.00) (0.00) (0.00) (0.00) 1.44 2.01 (0.00) 0.9 (0.00) 1.79 (0.00) 79 1.18 0.95 0.8633
6 (0.00) (0.00) (0.00) 1.28 (0.00) 0.43 1.97 (0.00) (0.00) (0.00) (0.00) 1.87 1.97 (0.00) 0.9 (0.00) 2.18 (0.00) 77 1.27 0.95 0.8352
7 (0.00) (0.00) (0.00) (0.00) (0.00) (0.00) 1.86 (0.00) (0.00) (0.00) (0.00) 4.88 1.78 (0.00) 0.81 (0.00) 2.06 (0.00) 83 1.31 0.9 0.8312
8 (0.00) (0.00) (0.00) 1.39 1.16 (0.00) 1.87 (0.00) (0.00) (0.00) (0.00) 1.55 2 (0.00) 0.9 (0.00) 1.92 (0.00) 79 1.32 0.95 0.8301
9 (0.00) (0.00) (0.00) (0.00) 1.46 (0.00) 2.25 (0.00) (0.00) (0.00) (0.00) 4.33 1.77 (0.00) 0.81 (0.00) (0.00) (0.00) 77 1.23 0.9 0.8297
10 2.1 (0.00) (0.00) (0.00) 1.63 (0.00) 1.83 (0.00) (0.00) (0.00) (0.00) (0.00) 2.2 (0.00) 0.93 (0.00) 1.79 (0.00) 76 1.57 0.94 0.7793
11 (0.00) (0.00) (0.00) (0.00) 1.25 (0.00) 1.7 (0.00) (0.00) (0.00) (0.00) 4.37 1.76 0.61 (0.00) (0.00) 1.65 (0.00) 83 1.7 0.88 0.7663
12 (0.00) (0.00) 1.37 (0.00) (0.00) (0.00) 2.08 (0.00) (0.00) (0.00) (0.00) 1.83 1.98 (0.00) 0.9 (0.00) 2.29 (0.00) 76 1.64 0.93 0.7657
13 1.05 (0.00) (0.00) (0.00) (0.00) (0.00) 1.96 (0.00) (0.00) (0.00) (0.00) 2.75 1.91 (0.00) 0.87 (0.00) 2.19 (0.00) 78 1.75 0.9 0.7524
14 (0.00) (0.00) (0.00) 1.23 (0.00) (0.00) 2.03 (0.00) (0.00) (0.00) (0.00) 2.32 1.94 (0.00) 0.89 (0.00) 2.26 (0.00) 78 1.78 0.91 0.7501
15c (0.00) (0.00) (0.00) 2.09 1.25 (0.00) 1.99 (0.00) (0.00) (0.00) (0.00) (0.00) 2.19 (0.00) 0.93 (0.00) 1.99 (0.00) 76 1.79 0.93 0.7499

a Source mnemonics are explained in Table 2.
b CMB of Table 5b.
c CMB of Table 5a.
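The ranking behind Table 6 amounts to sorting the validated fits by Fit Measure in decreasing order and taking the first one. The sketch below illustrates this with three rows transcribed from Table 6 (ranks 1, 4, and 15); the data structure and variable names are hypothetical, and the Fit Measure values are taken as given rather than recomputed.

```python
# (profile combination, Fit Measure) pairs transcribed from Table 6,
# deliberately listed out of order.
candidates = [
    # Rank 15: the trial CMB of Table 5a (SOIL17, no SCRRFC)
    (("SOIL17", "BAMAJC", "MOVES2", "AMSUL", "NANO3", "OC"), 0.7499),
    # Rank 1: the best fit reported for 02/07/88
    (("SOIL16", "BAMAJC", "MOVES2", "SCRRFC", "AMSUL", "NANO3", "OC"), 0.9478),
    # Rank 4: the trial CMB of Table 5b (SCRRFC instead of SOIL17)
    (("BAMAJC", "MOVES2", "SCRRFC", "AMSUL", "NANO3", "OC"), 0.8797),
]

# Sort by Fit Measure, largest first; the head of the list is the best fit.
ranked = sorted(candidates, key=lambda fit: fit[1], reverse=True)
best_profiles, best_fit_measure = ranked[0]

for position, (profiles, fit_measure) in enumerate(ranked, start=1):
    print(position, fit_measure, ", ".join(profiles))
```

Both manually built trial CMBs pass the screening, yet neither comes first once the candidates are ordered by Fit Measure.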

for the one that maximizes Fit Measure. Moreover, the temporal variation of these source contributions seems reasonable, since wood smoke emissions (BAMAJC) from fireplaces and woodstoves occurred more frequently during the cold period, while agricultural burnings (STAGBC) were common during the warm period, due to prescribed burns set by farmers (Chow et al., 1990). Fig. 4b shows the average contributions that were determined by the best fits of RCMB for major source types.

Another interesting finding of RCMB, in contrast with the original CMB analysis, is the significant contribution estimated for municipal waste incineration (SCRRFC), even though Chow et al. (1990) reported that contributions from such a source type were not detected in any ambient sample. Table 5a shows the output of the US EPA CMB 8.2 model for a trial CMB that was applied, following the standard trial-and-error procedures of conventional CMB modeling, to the ambient PM2.5 sample acquired at Crows Landing on 02/07/88. All the performance measures listed in Table 5a are within target values, so it could reasonably have been assumed that SCRRFC was not needed to explain the measured concentrations. In fact, however, as shown in Table 5b, a combination of source profiles that includes SCRRFC instead of SOIL17 explains the measured concentrations even better than that of Table 5a. Another CMB modeler could therefore assume, just as reasonably, that the contribution from SOIL17 was actually below the detection limit, considering that such a source category normally contains over 90% of its measured mass concentrations in the coarse fraction (Chow et al., 1990). Table 6 shows the first 15 possible fits that were determined by RCMB for the same ambient PM2.5 sample, in decreasing order of Fit Measure.
It is apparent that, by maximizing an overall fitting index, the proposed computational procedure has provided a unique solution to the conventional CMB problem, which cannot be questioned readily, since all the other CMBs, shown in Table 5, have a lower statistical rank.

4. Conclusions

From the test case illustrated above, it becomes evident that a typical set of input data gathered for CMB modeling can often define a plethora of least squares systems for which the algorithm of Britt and Luecke converges successfully to solutions that meet common statistical criteria. The test case also confirms the well-established fact (Cheng and Hopke, 1986) that two different solutions, both having acceptable performance measures, can often be found for the same CMB problem by two different people. RCMB minimizes personal judgment, because it leads straightforwardly to the best fit that can possibly be obtained from a set of input data, from a statistical point of view. Future work will involve uncertainty quantification of RCMB, by employing either the traditional Monte Carlo method established by Javitz et al. (1988) for conventional CMB modeling, or the computationally cost-effective polynomial chaos method recently proposed by Cheng and Sandu (2009). Nevertheless, RCMB, like any other receptor model, is explanatory rather than predictive; thus, it should not be considered a statistical black box.

References

Björck, A., 1996. Numerical Methods for Least Squares Problems. SIAM Books, Philadelphia.
Britt, H.I., Luecke, R.H., 1973. The estimation of parameters in nonlinear implicit models. Technometrics 15 (2), 233–247.
Cheng, M.D., Hopke, P.K., 1986. Investigation on the use of chemical mass balance receptor model: numerical computations. Chemometrics and Intelligent Laboratory Systems 1, 33–50.
Cheng, H., Sandu, A., 2009. Uncertainty quantification and apportionment in air quality models using the polynomial chaos method. Environmental Modelling & Software 24, 917–925.
Chow, J.C., Watson, J.G., Lowenthal, D.H., Pritchett, L.C., Richards, L.W., 1990. San Joaquin Air Quality Study, Phase 2: PM10 Modelling and Analysis. In: Receptor Modeling Source Apportionment, vol. I Final Report, DRI Document No. 8929.1F, JPA Contract #88-1.
Coulter, C.T., 2004. EPA-CMB8.2 Users Manual. Report No. EPA-452/R-04-011. U.S. Environmental Protection Agency, Research Triangle Park, N.C.
Friedlander, S.K., 1973. Chemical element balances and identification of air pollution sources. Environmental Science and Technology 7 (3), 235–240.
Gordon, G.E., 1988. Receptor models. Environmental Science and Technology 22 (10), 1132–1142.
Henry, R.C., 1992. Dealing with near collinearity in chemical mass balance receptor models. Atmospheric Environment 26A (5), 933–938.
Henry, R.C., Lewis, C.W., Hopke, P.K., Williamson, H.J., 1984. Review of receptor model fundamentals. Atmospheric Environment 18 (8), 1507–1515.
Hopke, P.K., 1991. An introduction to receptor modeling. Chemometrics and Intelligent Laboratory Systems 10, 21–43.
Javitz, H.S., Watson, J.G., Robinson, N., 1988. Performance of the chemical mass balance model with simulated local-scale aerosols. Atmospheric Environment 22 (10), 2309–2322.

Lowenthal, D.H., Chow, J.C., Watson, J.G., Neuroth, G.R., Robbins, R.B., Shafritz, B.P., Countess, R.J., 1992. The effects of collinearity on the ability to determine aerosol contributions from diesel and gasoline-powered vehicles using the chemical mass balance model. Atmospheric Environment 26A (13), 2341–2351.
Meloun, M., Militký, J., Hill, M., Brereton, R.G., 2002. Crucial problems in regression modeling and their solutions. Analyst 127, 433–450.
Samara, C., 2005. Chemical mass balance source apportionment of TSP in a lignite-burning area of Western Macedonia, Greece. Atmospheric Environment 39, 6430–6443.
Samara, C., Kouimtzis, Th., Tsitouridou, R., Kanias, G., Simeonov, V., 2003. Chemical mass balance source apportionment of PM10 in an industrialized urban area of Northern Greece. Atmospheric Environment 37, 41–54.
Thurston, G.D., Lioy, P.J., 1987. Receptor modeling and aerosol transport. Atmospheric Environment 21 (3), 687–698.
Viana, M., Kuhlbusch, T.A.J., Querol, X., Alastuey, A., Harrison, R.M., Hopke, P.K., Winiwarter, W., Vallius, M., Szidat, S., Prévôt, A.S.H., Hueglin, C., Bloemen, H., Wåhlin, P., Vecchi, R., Miranda, A.I., Kasper-Giebl, A., Maenhaut, W., Hitzenberger, R., 2008. Source apportionment of particulate matter in Europe: a review of methods and results. Journal of Aerosol Science 39, 827–849.
Watson, J.G., 2004. Protocol for Applying and Validating the CMB Model for PM2.5 and VOCs. Report No. EPA-451/R-04-001. U.S. Environmental Protection Agency, Research Triangle Park, N.C.
Watson, J.G., Robinson, N.F., Lewis, C., Coulter, T., 1997. Chemical Mass Balance Receptor Model-Version 8 User’s Manual. Document No. 1808.1D1. Desert Research Institute, Reno, NV.
Watson, J.G., Cooper, J.A., Huntzicker, J.J., 1984. The effective variance weighting for least squares calculations applied to the mass balance receptor model. Atmospheric Environment 18 (7), 1347–1355.