An Investigation into Expanding the CAUS Universe¹

Bryan Schar, James Lawrence, Star Ying, Jim Hartman
U.S. Census Bureau
Washington, DC 20233

Abstract

The Community Address Updating System (CAUS) conducts targeted block listings to improve the coverage of the American Community Survey’s sample frame. Blocks with certain address characteristics have traditionally been excluded from listing by the CAUS program because it was believed that data provided by other sources kept them current. However, results from the nationwide 2010 Census Address Canvassing operation suggested that some of these excluded blocks may benefit from being listed by the CAUS program. Therefore, a study was conducted to determine if the universe of CAUS blocks should be expanded. This paper will discuss the methods being used to determine if the CAUS universe should be expanded, some preliminary results of the investigation, and plans for future research.

1. Introduction

1.1. Master Address File – Description and Updating
The Master Address File (MAF) is a U.S. Census Bureau file that contains an up-to-date inventory of all known living quarters in the United States. It is the sole source of housing unit records for the American Community Survey’s (ACS) sampling frame, and the main source of group quarters (GQ) information for the ACS’s GQ sampling frame (Bates, 2011). Between Decennial Censuses, the U.S. Postal Service’s (USPS) Delivery Sequence File (DSF) is the primary source of city-style² address updates for the MAF. The Demographic Area Address Listing operation, of which the Community Address Updating System (CAUS) targeted census block listing operation is a part, is the primary source of non-city-style³ address updates (GEO, 2011b).

1.2. Delivery Sequence File
The DSF is the USPS’s list of delivery points in the country, with a delivery point being a single mailbox or other place at which mail is delivered. It is assumed that city-style delivery points on the DSF represent the location of the associated living quarters. The DSF also contains information about non-city-style delivery points; however, this information is not used by the Census Bureau to update the MAF for the following reasons:

- The non-standard format of certain types of non-city-style addresses on the MAF, such as physical description only addresses, makes it difficult to match them to non-city-style delivery points on the DSF. Thus, updating the MAF with non-city-style delivery points from the DSF increases the risk of adding duplicate living quarters to the MAF.
- There is a potential that non-city-style delivery points on the DSF may be duplicates of city-style delivery points in areas that have undergone a conversion from non-city-style to city-style.
- It cannot be assumed that non-city-style delivery points on the DSF represent the location of living quarters. For instance, a PO Box does not represent the location of the living quarters of the person renting it.
- The information provided by the DSF for many non-city-style delivery points cannot be used to geocode the address it represents (i.e., assign it to a census block).

Due to these concerns, non-city-style delivery points on the DSF are not used to update the MAF. Instead, CAUS targeted block listing operations are a major source of address updates to the MAF in areas with a high proportion of non-city-style addresses (GEO, 2011b).
¹ This report is released to inform interested parties of ongoing research and to encourage discussion of work in progress. The views expressed are those of the author and not necessarily those of the U.S. Census Bureau.
² Addresses that contain at least a house number and street name.
³ Examples of non-city-style addresses are PO Boxes, rural routes, and description only addresses.
1.3. CAUS Listings
A census block, or block for short, is the smallest geographic entity for which the Census Bureau collects and tabulates data. When a block is listed, a field representative canvasses that block in order to identify all living quarters present within the block. The CAUS branch is responsible for selecting the blocks that are listed through the CAUS program. When selecting blocks to list, the CAUS branch first considers the Address Characteristic Type (ACT) code assigned yearly by the U.S. Census Bureau’s Geography Division (GEO). This code indicates both the style of MAF addresses in a block and the percent of addresses that can be matched to delivery points on the DSF (GEO, 2011a). Note that a block’s ACT code may change from year to year based on changes in its MAF information. The CAUS branch classifies blocks into one of three groups based on their ACT code. Yes, or “Y”, blocks are considered to be in the CAUS universe and eligible for listing. Maybe, or “M”, blocks are considered to be in the margins of the CAUS universe. No, or “N”, blocks are considered to be outside the CAUS universe. Blocks in the “N” group contain either only city-style addresses, non-residential addresses, or no addresses. Blocks in the “M” group contain a mixture of city-style and non-city-style addresses where some of the addresses match to a DSF delivery point and at least 80 percent of the addresses are city-style. Blocks in the “Y” group contain either only non-city-style addresses, a mixture of city-style and non-city-style addresses where either none or all of the addresses match to a DSF delivery point, or a mixture of city-style and non-city-style addresses where some of the addresses match to a DSF delivery point and less than 80 percent of the addresses are city-style. Table 1 gives the ACT codes that belong to each group, and attachment A provides information about the types of addresses in each ACT code. Table 1: ACT Code Groups Used for CAUS Listings
CAUS Group   ACT Codes
Y            M1, ME, MF, MG, M3, N1, N2, N3, P1, P2, P3, R1, R2, R3
M            MA, MB, MC, MD
N            B1, B2, B3, C1, C2, C3, Z0
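The grouping in Table 1 amounts to a lookup from ACT code to CAUS group. The sketch below is only an illustration of that lookup; the names `CAUS_GROUP_BY_ACT`, `caus_group`, and `in_traditional_universe` are hypothetical and are not taken from any Census Bureau system.

```python
# ACT code membership copied from Table 1; all names here are illustrative.
CAUS_GROUP_BY_ACT = {
    **dict.fromkeys(
        ["M1", "ME", "MF", "MG", "M3", "N1", "N2", "N3",
         "P1", "P2", "P3", "R1", "R2", "R3"],
        "Y",
    ),
    **dict.fromkeys(["MA", "MB", "MC", "MD"], "M"),
    **dict.fromkeys(["B1", "B2", "B3", "C1", "C2", "C3", "Z0"], "N"),
}


def caus_group(act_code: str) -> str:
    """Return "Y", "M", or "N" for an ACT code listed in Table 1."""
    key = act_code.strip().upper()
    if key not in CAUS_GROUP_BY_ACT:
        raise ValueError(f"ACT code not in Table 1: {act_code!r}")
    return CAUS_GROUP_BY_ACT[key]


def in_traditional_universe(act_code: str) -> bool:
    """Only "Y" blocks are in the traditional CAUS universe."""
    return caus_group(act_code) == "Y"
```

Because a block's ACT code can change from year to year, a lookup like this would need to be applied to the code in effect at the time blocks are selected.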
1.4. Motivation for Research
Traditionally, blocks within the “M” and “N” groups were not considered for listing because it was believed that delivery point data provided by the USPS were sufficient to keep those blocks current. Results from the 2010 Census Address Canvassing operation (AdCan) conducted in 2009 provided listing data for blocks in these groups. These data allowed the CAUS branch to examine whether blocks that were traditionally excluded from the CAUS universe need to be listed through the CAUS program to pick up new units. In addition, research conducted by the U.S. Census Bureau’s Demographic Statistical Methods Division suggested there might be benefits to using an expanded definition of the CAUS universe based on the percent of MAF addresses that can be matched to a DSF delivery point (Kennel, 2010).

2. Research

2.1. Data Used
2.1.1. 2010 Address Canvassing

One source of data for this research was the 2010 AdCan results. For this operation, census workers around the nation looked for every place where people live, stay, or could live or stay. They compared what they saw on the ground to what was shown on the MAF. Based on their observations, they verified, updated, or deleted addresses already on the MAF, and added addresses that were missing from it (ALOIT, 2012). Only those 2010 AdCan unit records that met all the criteria below were considered for this study:

- The unit returned from the 2010 AdCan with an action code that indicated it was a new unit added by the operation.
- The unit was in the final 2010 Census universe, that is, it was classified as a housing unit in the 2010 Census.
- The unit was included in the 2012 ACS main⁴ housing unit frame universe.

Information from the 2012 main version of the ACS unit frame universe was used to filter (that is, include or exclude) these records, since the version of the MAF used in its creation was the first to include information about whether or not a unit was in the final 2010 Census universe.

2.1.2. CAUS Experimental Listings

Given the unique nature of the 2010 AdCan, the CAUS branch conducted experimental block listings of its own. One goal of these listings was to obtain a second source of data that could be used to determine if and where the CAUS universe should be expanded. When a block is listed by CAUS, census workers look for every place where people live, stay, or could live or stay in that block. They compare what they see on the ground to what is shown on the MAF. Based on their observations, they verify, update, or delete addresses already on the MAF, and add addresses that are missing from it. Blocks were selected for this experiment in March of 2010, which was approximately a year after the 2010 AdCan was conducted.

The first step in selecting blocks for this study was to assign each block to a category. The category a block was assigned to was based on which of the three CAUS ACT groups (“Y”, “M”, “N”) it belonged to and which of five possible housing unit based groups it belonged to. The definitions of the five possible housing unit based groups are detailed in Table 2 below. This resulted in each block being assigned to one of 15 categories as shown in Table 3.

Table 2: Housing Unit Groups for CAUS Experimental Listings
Housing Unit Group   Number of Pre-Listing Housing Units
1                    3 or fewer
2                    4-8
3                    9-15
4                    16-29
5                    30 or more
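The assignment of a block to one of the 15 categories can be sketched as a binning step on the pre-listing housing unit count (Table 2) crossed with the CAUS group. The function and constant names below are hypothetical; only the cut points come from Table 2.

```python
from bisect import bisect_left
from itertools import product

# Upper bounds of housing unit groups 1 through 4 (Table 2); group 5 is 30 or more.
GROUP_UPPER_BOUNDS = [3, 8, 15, 29]

CAUS_GROUPS = ("Y", "M", "N")
ALL_CATEGORIES = list(product(CAUS_GROUPS, range(1, 6)))  # 3 x 5 = 15 categories


def housing_unit_group(pre_listing_units: int) -> int:
    """Map a pre-listing housing unit count to housing unit group 1-5."""
    if pre_listing_units < 0:
        raise ValueError("housing unit count cannot be negative")
    # bisect_left returns the index of the first bound >= the count.
    return bisect_left(GROUP_UPPER_BOUNDS, pre_listing_units) + 1


def block_category(caus_group: str, pre_listing_units: int) -> tuple:
    """Return the (CAUS group, housing unit group) category of a block."""
    if caus_group not in CAUS_GROUPS:
        raise ValueError(f"unknown CAUS group: {caus_group!r}")
    return (caus_group, housing_unit_group(pre_listing_units))
```

For example, `block_category("M", 12)` places an “M” block with 12 pre-listing housing units in category `("M", 3)`.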
Groupings based on housing units were used because initial attempts to model the expected number of new housing units, or adds, that were found during 2010 AdCan listings showed that the number of pre-existing housing units in a block was a major predictor of the number of adds that would be found when listing. Since the distribution of pre-listing housing units per block has a strong positive skew, including this variable in the models resulted in approximately 90 percent of the blocks (those with less than 30 housing units) being clustered together with very low predicted adds and the remaining 10 percent (those with 30 or more housing units) of blocks having high predicted adds. However, the CAUS branch wanted to be able to better distinguish between blocks with lower predicted adds rather than have them treated as essentially equivalent. Therefore, it was hoped that transforming pre-existing units into a categorical variable would help to provide more gradation to predicted adds. The blocks within each of these groups with 100 or less pre-existing housing units were then sorted based on a selection score assigned to them. This score was derived from a generalized linear model that was developed using 2007 and 2008 block characteristic data to predict 2010 AdCan adds, and it indicated the expected number of new housing units, or adds, that would be obtained by listing the blocks. Only blocks with 100 or less pre-existing housing units were considered for this study due to the cost associated with listing blocks with a large number of pre-existing housing units. At the time of this study, approximately eight million of the 8.2 million tabulation blocks in the country had 100 or less pre-existing housing units. A systematic sample of these sorted blocks was then selected from within each group. This was done to allow for a study of how well the selection score ranked blocks by adds, and how well it estimated the number of adds across a range of scores. 
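A systematic sample from a score-sorted list can be sketched as follows. The paper does not describe the sampling interval or how the starting point was chosen, so the random start and fractional interval used here are assumptions, and `systematic_sample` is a hypothetical helper rather than the procedure the CAUS branch actually ran.

```python
import random


def systematic_sample(sorted_items, n, rng=None):
    """Pick n items at a fixed interval from an already sorted sequence.

    Assumes a random start within the first interval (not stated in the paper).
    """
    size = len(sorted_items)
    if not 0 < n <= size:
        raise ValueError("n must be between 1 and the number of items")
    rng = rng or random.Random()
    step = size / n                      # fractional sampling interval
    start = rng.random() * step          # random start in [0, step)
    # min() guards against a floating-point index equal to size.
    return [sorted_items[min(int(start + i * step), size - 1)] for i in range(n)]


# Hypothetical blocks: (block id, selection score), eligible if <= 100 units.
blocks = [("b%03d" % i, float(i)) for i in range(100)]
ranked = sorted(blocks, key=lambda block: block[1], reverse=True)
picked = systematic_sample(ranked, 10, random.Random(0))
```

Because the picks are spread evenly down the ranked list, the sample covers the full range of selection scores, which is what allows the score's ranking and estimation performance to be examined across that range.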
The number of blocks selected from each group was driven by the various goals and constraints of the study. First, the CAUS branch estimated that it could afford to list around 7,500 blocks for this study. Based on previous experience, the CAUS branch expected only about half of the blocks it selects and sends out for listing to be completed. Therefore, it wanted to send out approximately 15,000 blocks in total for this study. Also, at the time that blocks were selected, the main interest was determining whether the “M” blocks should be added to the CAUS universe. The status of the “N” blocks was of secondary interest. Therefore, approximately four times as many “M” blocks were selected as “N” blocks. In addition, the CAUS branch wanted to make sure that the listings resulted in a good number of new addresses being added to the MAF. Therefore, approximately half of the blocks it selected were “Y” blocks, and approximately three times as many blocks were selected from housing unit groups three, four, and five as compared to housing unit groups one and two. This resulted in 15,241 blocks being selected and sent out for listing during April 2010 through June 2010. Of the blocks sent out, 8,498 were actually listed in the field. Tables 3 and 4 provide more details of this study’s block distribution.

⁴ Two versions of the ACS unit frame universe are created for each ACS survey year, the main and the supplemental.

Table 3: Design of 2010 CAUS Experimental Listings
Pre-Listing   Block   Blocks in Group       Blocks Sent for Listing   Blocks Listed
CAUS Group    Size    Count       Percent   Count     Percent         Count     Percent
              Group               of Total            of Group                  of Sent
Y             1       252,893     3.2       892       0.4             439       49.2
Y             2       158,925     2.0       892       0.6             440       49.3
Y             3       87,656      1.1       1,863     2.1             977       52.4
Y             4       62,347      0.8       1,862     3.0             951       51.1
Y             5       46,037      0.6       1,562     3.4             817       52.3
M             1       9,659       0.1       787       8.2             438       55.7
M             2       62,596      0.8       792       1.3             437       55.2
M             3       79,397      1.0       1,766     2.2             994       56.3
M             4       88,030      1.1       1,768     2.0             978       55.3
M             5       102,720     1.3       1,465     1.4             783       53.5
N             1       3,648,644   45.7      169       0.0             129       76.3
N             2       871,375     10.9      169       0.0             128       75.7
N             3       850,954     10.7      392       0.1             289       73.7
N             4       934,728     11.7      392       0.0             299       76.3
N             5       736,974     9.2       470       0.1             399       84.9
Total                 7,992,935   100.0     15,241    0.2             8,498     55.8

Sources: U.S. Census Bureau, CAUS Targeting Database, February 2010; U.S. Census Bureau, CAUS Sample Control Output File, July 2010.

Table 4: Distribution of 2010 CAUS Experimental Listings by ACT Code
Pre-Listing   Pre-    Blocks in ACT         Blocks Sent for Listing   Blocks Listed
CAUS Group    Listing Count       Percent   Count     Percent         Count     Percent
              ACT Code            of Total            of Group                  of Sent
Y             M1      106,398     1.3       1,629     1.5             812       49.9
Y             ME      50,441      0.6       700       1.4             355       50.7
Y             MF      22,437      0.3       436       1.9             237       54.4
Y             MG      219,141     2.7       3,028     1.4             1,563     51.6
Y             M3      133         0.0       4         3.0             3         75.0
Y             N1      131,929     1.7       935       0.7             466       49.8
Y             N2      53          0.0       1         1.9             0         0.0
Y             P1      21,224      0.3       115       0.5             56        48.7
Y             R1      56,092      0.7       223       0.4             132       59.2
Y             R2      8           0.0       0         0.0             0
Y             R3      2           0.0       0         0.0             0
M             MA      87,325      1.1       1,587     1.8             851       53.6
M             MB      109,513     1.4       2,096     1.9             1,180     56.3
M             MC      74,685      0.9       1,376     1.8             769       55.9
M             MD      70,879      0.9       1,519     2.1             830       54.6
N             B1      457         0.0       0         0.0             0
N             B2      156         0.0       0         0.0             0
N             B3      235,951     3.0       12        0.0             10        83.3
N             C1      181,195     2.3       59        0.0             41        69.5
N             C2      942,867     11.8      397       0.0             305       76.8
N             C3      3,241,112   40.6      984       0.0             787       80.0
N             Z0      2,440,937   30.5      140       0.0             101       72.1
Total                 7,992,935   100.0     15,241    0.2             8,498     55.8
Sources: U.S. Census Bureau, Geography Reference File-Codes, July 2009; U.S. Census Bureau, CAUS Sample Control Output File, July 2010.
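The planning arithmetic and the completion rates behind Tables 3 and 4 can be checked with a short sketch. The counts are copied from Table 3; the function names are hypothetical, and rounding up in `blocks_to_send` is an assumption made for illustration.

```python
import math

# Blocks sent and listed by CAUS group, in housing unit group order 1-5 (Table 3).
SENT = {
    "Y": [892, 892, 1863, 1862, 1562],
    "M": [787, 792, 1766, 1768, 1465],
    "N": [169, 169, 392, 392, 470],
}
LISTED = {
    "Y": [439, 440, 977, 951, 817],
    "M": [438, 437, 994, 978, 783],
    "N": [129, 128, 289, 299, 399],
}


def blocks_to_send(target_listed, expected_completion):
    """Blocks to send out so the expected number listed meets the target."""
    return math.ceil(target_listed / expected_completion)


def completion_rates(sent, listed):
    """Percent of sent blocks that were listed, by CAUS group and overall."""
    rates = {g: 100.0 * sum(listed[g]) / sum(sent[g]) for g in sent}
    total_sent = sum(sum(counts) for counts in sent.values())
    total_listed = sum(sum(counts) for counts in listed.values())
    rates["Total"] = 100.0 * total_listed / total_sent
    return rates


planned = blocks_to_send(7500, 0.5)        # the roughly 15,000 blocks planned
rates = completion_rates(SENT, LISTED)     # rates["Total"] rounds to 55.8
```

The observed overall completion rate is slightly above the one-half assumed in planning, and it is noticeably higher for “N” blocks than for “Y” or “M” blocks.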
Only those unit records from these listings that met the criteria below were considered for the preliminary part of this study:

- The unit was accepted by GEO for inclusion on the MAF as a new address record.
- The unit was included in the 2011 ACS supplemental housing unit frame universe.

Information from the 2011 supplemental version of the ACS unit frame universe was used for these records since this is the first version of the ACS unit frame to include the addresses added by these listings.

Additional data used in this research come from the MAF extracts and geography files used by ACS to create its unit frame universes each year. These data sources contain such information as a unit’s DSF status, whether or not a unit is residential, and if the unit is considered to be within the ACS universe at the time of frame creation.

2.2. Decision Criteria
The primary function of CAUS listings is to serve as a complement to the updates provided by the DSF. This is why CAUS has traditionally only listed areas with a high proportion of pre-existing non-city-style addresses. However, if there were evidence that the DSF was not providing sufficient and/or timely updates to other areas, there could be a benefit from including them in the CAUS universe. With this in mind, the CAUS branch decided to base its decision on whether or not to include new areas in the CAUS universe on a measure of how well DSF updating captures the adds from the 2010 AdCan and the 2010 CAUS experimental listings. In particular, if an address that was added to the MAF by one of these operations could not be matched by the Census Bureau to a delivery point on the DSF within a reasonable amount of time, this would be considered a deficiency in the ability of DSF updating to capture that add.

There are two aspects to this measure. The first is the expected lack of DSF usability for capturing non-city-style adds. Though this deficiency is expected, since the Census Bureau does not use the DSF to add non-city-style addresses to the MAF, it still provides a measure of how much an area would benefit from being listed by the CAUS program. The second is the unexpected lack of DSF usability for capturing city-style adds. Thus, an area can be considered to have a poor DSF capture rate if most or all of its adds were non-city-style adds and/or were city-style adds that could not be matched to a delivery point on the DSF within a reasonable amount of time. Groups of blocks whose DSF match rates of adds are low enough, as measured by the percent of adds that can be matched to a delivery point on the DSF, may be considered for inclusion in the CAUS universe.
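The two aspects of the measure can be sketched as two percentages computed from the same counts. This is a minimal illustration that assumes non-city-style adds are treated as unmatched, so the same count of matched city-style adds is the numerator of both rates; `dsf_match_rates` is a hypothetical name.

```python
def dsf_match_rates(city_style_adds, non_city_style_adds, city_style_adds_on_dsf):
    """Return (percent of city-style adds matched, percent of total adds matched).

    Assumes only city-style adds can match a DSF delivery point.
    A rate is None when its denominator is zero.
    """
    if city_style_adds_on_dsf > city_style_adds:
        raise ValueError("matched city-style adds cannot exceed city-style adds")
    total_adds = city_style_adds + non_city_style_adds
    pct_city = 100.0 * city_style_adds_on_dsf / city_style_adds if city_style_adds else None
    pct_total = 100.0 * city_style_adds_on_dsf / total_adds if total_adds else None
    return pct_city, pct_total


# Counts reported later in the paper for AdCan adds in blocks with ACT code MA:
# 359,935 city-style adds, 83,086 non-city-style adds, 99,673 matched to the DSF.
pct_city, pct_total = dsf_match_rates(359_935, 83_086, 99_673)
```

The gap between the two rates reflects the expected deficiency (non-city-style adds), while the shortfall of the city-style rate from 100 percent reflects the unexpected one.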
What constitutes a reasonable amount of time to wait for an added city-style address to match to a DSF delivery point is still an open question. It takes approximately a year, from beginning to end, for a block’s information to be updated on the MAF through the CAUS program. This includes creating the frame from which to select the blocks, selecting blocks from this frame to list, having the selected blocks listed in the field, receiving the block updates from field, processing and editing this information, and finally applying the vetted updates to the MAF. During this same time period, the MAF is updated twice by the DSF (GEO, 2011b). If all the new city-style addresses obtained by listing a block through CAUS were also added by these two DSF updates, then there was no benefit, in terms of adds, gained by listing that block. However, what if it takes one year after the listing updates are applied for a city-style address to be captured using a DSF update? What about two years? One important factor in determining this criterion is cost. The Census Bureau will continue to use DSF updates to keep a block’s city-style address information on the MAF current regardless of whether or not that block is listed by the CAUS program. This means any costs associated with acquiring DSF updates are independent of the decision to list a block. Therefore, it will always be cheaper, in terms of dollars, to wait for city-style addresses to be added by DSF updates (when possible). However, other criteria, such as frame coverage, should be considered when making this assessment. Since the traditional CAUS universe only includes areas with high concentrations of non-city-style addresses, and the DSF is not used to add these types of addresses to the MAF, the cost, in terms of frame coverage, of not listing these areas is clear. 
The properties of non-city-style addresses are different enough from those of city-style addresses that it would be difficult to justify imputing estimates for the former using information from the latter. However, if the CAUS universe is expanded to include areas that contain high concentrations of city-style addresses that, for reasons yet unknown, are not being added to the MAF by the DSF updates, then it is not as clear-cut that the coverage issues caused by this could not be accounted for through methods other than block listings. This suggests future research to investigate if city-style addresses being added to the MAF by the DSF are representative of those that are not.

2.3. Limitations
2.3.1. Possible Overestimation of the AdCan Adds

When census field representatives were assigned to canvass blocks during AdCan, they were provided with a list of the MAF addresses that had been geocoded to the assigned blocks. They were instructed to add addresses they found in the block that did not appear on that list and to delete any addresses from the list that they could not find in the block. This means that some of the adds in AdCan may have represented an address that existed on the MAF before AdCan but was either not geocoded to a block or was geocoded to the wrong block. Every effort was made during data processing to identify these false adds, but some of them could have been missed (ALOIT 2012, 115). This could result in an overestimate of the adds in the AdCan data, because some of the adds in AdCan were corrections to the geocoding of existing units.

In addition, some city-style addresses added by AdCan actually represented housing units that already existed on the MAF as non-city-style addresses and had been converted to city-style as part of enhancements to an area’s emergency 911 system. Since census workers could have deleted the units’ associated non-city-style addresses, these would form another type of add-delete pair that could also have slipped through the add-delete matching done during data processing. This could result in an overestimate of the city-style adds in the AdCan data, because some of the adds in AdCan were just conversions of existing units’ addresses from non-city-style to city-style.

A complication introduced by the possible existence of false adds in the AdCan data is that they may behave differently, in terms of matching to DSF delivery points, than true adds. If this is the case, then inferences made about the DSF match rate of adds based on AdCan data may not be reflective of what would occur in regular CAUS listings.

2.3.2. ACT Code Representation in 2010 CAUS Experimental Listings Data

As mentioned in Section 2.1., blocks were selected for the 2010 CAUS experimental listings based on their ACT group and housing unit group. This was done because it was felt that blocks that belonged to the same ACT and block size groups were similar in terms of adds. In addition, preference was given to selecting blocks in the “Y” and “M” ACT groups and blocks with more pre-existing residential housing units. A consequence of this is that some ACT codes were not represented in this study, while others only had a limited number of blocks selected. These ACT codes tended to occur rarely in the universe of blocks and/or be in the “N” ACT group.
2.3.3. Sparsity of Adds in Data

Reliably estimating the expected DSF match rate of adds for a group of blocks requires that enough adds were found in enough of its blocks during AdCan and/or the 2010 CAUS experimental listings. This can present a challenge since only 16.0 percent of the blocks listed during AdCan contained adds, and 31.5 percent of the blocks listed during the 2010 CAUS experimental listings contained adds. Also, those blocks with adds in AdCan only contained about 4.2 new units on average, while those blocks with adds in the 2010 CAUS experimental listings contained about 5.4 new units on average.

2.4. Results to Date
An initial examination of the change over time in the DSF match rate of adds from both AdCan and the 2010 CAUS experimental listings was conducted. Figures 1 and 2 give two measures of the percent of adds matched to the DSF for the AdCan and the 2010 CAUS experimental listings, respectively. The square data points in each figure show the percent of city-style adds that have been matched to a delivery point across different vintages of the DSF. The diamond data points in each figure show the percent of total adds, both city-style and non-city-style, that have been matched to a delivery point across different vintages of the DSF. Note that, since non-city-style addresses are not matched to DSF delivery points (with only a handful of exceptions), the match rate for all addresses will always be less than or equal to the match rate for city-style addresses. The first DSF vintage shown on these figures, Spring 2008 for AdCan and Fall 2009 for the 2010 CAUS listings, is the one used as an input when creating the initial address lists for these operations.

As can be seen from Figures 1 and 2, the DSF match rate increases over time for both sets of data. However, the behavior of the percent of adds matched to the DSF for AdCan adds suggests the rate of increase may slow as time goes on. Further examination could be conducted to determine if there is a point where the marginal cost savings of waiting for an address to come in on the DSF is outweighed by the costs of excluding that unit from the sampling frame.

It is important to note that the match rates shown in Figures 1 and 2 are not cumulative, but represent how many adds were matched to a delivery point on each vintage of the DSF independent of whether or not they matched to a delivery point on a previous vintage. There are cases of delivery points that stop appearing on DSF updates. This occurs when the USPS no longer considers an address to be a valid delivery point for mail, for example, when a previously occupied housing unit has been demolished. This means that the overall DSF match rate of adds could decrease from one vintage of the DSF to the next, though this has not yet been observed in this research.
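The distinction between a per-vintage match rate and a cumulative one can be illustrated with a small sketch. The address identifiers and match sets below are invented for illustration only, and `match_rates_by_vintage` is a hypothetical helper, not part of any Census Bureau process.

```python
def match_rates_by_vintage(add_ids, matched_by_vintage):
    """Return (vintage, per-vintage percent, cumulative percent) for each vintage.

    matched_by_vintage is an ordered list of (vintage label, matched add ids).
    The per-vintage rate counts adds present on that vintage only; the
    cumulative rate counts adds matched on that vintage or any earlier one.
    """
    adds = set(add_ids)
    if not adds:
        raise ValueError("at least one add is required")
    ever_matched = set()
    rows = []
    for vintage, matched_ids in matched_by_vintage:
        on_this_vintage = adds & set(matched_ids)
        ever_matched |= on_this_vintage
        rows.append((
            vintage,
            100.0 * len(on_this_vintage) / len(adds),
            100.0 * len(ever_matched) / len(adds),
        ))
    return rows


# Invented example: add "a" drops off the DSF after Fall 2010.
adds = ["a", "b", "c", "d"]
vintages = [
    ("Fall 2009", []),
    ("Spring 2010", ["a"]),
    ("Fall 2010", ["a", "b"]),
    ("Spring 2011", ["b", "c"]),
]
rows = match_rates_by_vintage(adds, vintages)
```

In this invented example the per-vintage rate stays at 50 percent between Fall 2010 and Spring 2011 while the cumulative rate rises to 75 percent, which shows how a delivery point dropping off the DSF can hold down or lower the non-cumulative measure.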
Figure 1: DSF Match Rate of AdCan Adds Over Time

[Line chart: percent of AdCan adds matched to a DSF delivery point (vertical axis, 0.0% to 30.0%) by DSF vintage (horizontal axis), with one series for city-style adds and one for total adds.]

DSF Vintage    City-Style   Total
Spring 2008    0.0%         0.0%
Fall 2008      7.7%         6.0%
Spring 2009    13.6%        10.6%
Fall 2009      19.5%        15.2%
Spring 2010    21.9%        17.1%
Fall 2010      23.3%        18.2%
Spring 2011    24.1%        18.8%

Source: U.S. Census Bureau, Master Address File, July 2011.
Figure 2: DSF Match Rate of 2010 CAUS Experimental Listing Adds Over Time

[Line chart: percent of 2010 CAUS experimental listing adds matched to a DSF delivery point (vertical axis, 0.0% to 1.4%) by DSF vintage (Fall 2009, Spring 2010, Fall 2010, Spring 2011), with one series for city-style adds and one for total adds. Data labels shown: 0.0%, 0.0%, 0.3%, 0.4%, 0.8%, 1.2%.]

Source: U.S. Census Bureau, Master Address File, July 2011.
One thing that stands out from examining the DSF match rate of adds given in Figures 1 and 2 is how much higher the values are for the AdCan adds than for the 2010 CAUS experimental listing adds. For example, the adds from AdCan had an overall DSF match rate of 15.2 percent on the Fall 2009 DSF, which represents three post-listing DSF updates. The adds from the 2010 CAUS experimental listings have an overall DSF match rate of only 0.8 percent for an equivalent number of DSF updates. This could be due to the limitations of the AdCan data as discussed in Section 2.3. In addition, many of the blocks canvassed in AdCan had not been fully listed since the 2000 Census, and therefore, the adds from AdCan can represent up to ten years’ worth of changes. This means that, for some blocks, the DSF had up to twelve years to catch up to the ground truth given by AdCan. By comparison, the 2010 CAUS experimental listing was conducted only a year after AdCan. This means the DSF had only one and a half years to catch up to the ground truth given by the CAUS experimental listing.

This initial analysis suggests that a conservative method of estimating the DSF match rate of adds for the purposes of excluding and including blocks in the CAUS universe is to use the most recent version of the DSF available each time the CAUS universe is created. The underlying assumption of this method is that as long as an add eventually comes in on a DSF it is worth the wait. This is the rule that was used for the preliminary investigation.

Since the traditional CAUS universe is defined in terms of the ACT code, this was the first type of block grouping examined. Tables 5 and 6 give the DSF match rate of adds grouped by their pre-AdCan and pre-2010 CAUS experimental listing ACT codes, respectively. The pre-listing ACT codes of blocks, rather than the post-listing ACT codes, were examined because the pre-listing code is the one known at the time of selection. The values for the DSF match rate of adds in these tables are based on the most recent version of the DSF available to the Census Bureau, the Spring 2011 vintage. Attachments B and C give the DSF match rate of adds by pre-listing ACT code for all post-operation vintages of the DSF.
Table 5: DSF Match Rate of AdCan Adds by Pre-AdCan ACT Code

Pre-AdCan     Pre-    Blocks      Blocks      AdCan Adds                          Adds on Spring 2011 DSF
CAUS Group    AdCan   in ACT      with Adds   City-Style  Non-City-   Total       City-Style  Percent of      Percent of
              ACT                                         Style                   Adds on DSF City-Style Adds Total Adds
Y             M1      115,534     65,248      252,423     125,559     377,982     16,100      6.4             4.3
Y             ME      50,110      24,843      74,718      35,805      110,523     10,857      14.5            9.8
Y             MF      22,281      14,554      52,697      26,676      79,373      6,681       12.7            8.4
Y             MG      213,379     113,539     447,057     247,330     694,387     51,524      11.5            7.4
Y             M3      159         83          1,104       14          1,118       33          3.0             3.0
Y             N1      141,587     52,647      87,218      114,810     202,028     12,503      14.3            6.2
Y             N2      61          45          121         178         299         35          28.9            11.7
Y             N3      1           0           0           0           0           0
Y             P1      21,993      5,438       6,092       5,337       11,429      452         7.4             4.0
Y             R1      62,439      16,095      16,942      15,670      32,612      3,736       22.1            11.5
Y             R2      16          7           19          4           23          12          63.2            52.2
Y             R3      4           2           0           2           2           0                           0.0
Y             Total   627,564     292,501     938,391     571,385     1,509,776   101,933     10.9            6.8
M             MA      106,272     67,150      359,935     83,086      443,021     99,673      27.7            22.5
M             MB*     111,730     61,792      188,519     80,167      268,686     38,385      20.4            14.3
M             MC*     75,585      40,048      119,237     57,516      176,753     20,908      17.5            11.8
M             MD*     70,604      34,826      93,493      48,014      141,507     15,312      16.4            10.8
M             Total   364,191     203,816     761,184     268,783     1,029,967   174,278     22.9            16.9
N             B1      439         40          32          62          94          2           6.3             2.1
N             B2      168         9           29          0           29          19          65.5            65.5
N             B3      238,471     4,395       84,196      1,872       86,068      48,626      57.8            56.5
N             C1*     189,406     44,415      98,449      17,282      115,731     6,295       6.4             5.4
N             C2      1,021,573   315,568     1,123,949   146,779     1,270,728   283,202     25.2            22.3
N             C3      3,297,091   371,906     1,121,747   142,310     1,264,057   379,973     33.9            30.1
N             Z0      2,466,679   80,816      215,683     60,453      276,136     51,277      23.8            18.6
N             Total   7,213,827   817,149     2,644,085   368,758     3,012,843   769,394     29.1            25.5
Total                 8,205,582   1,313,466   4,343,660   1,208,926   5,552,586   1,045,605   24.1            18.8

* ACT code bolded in the original table.
Sources: U.S. Census Bureau, Geography Reference File-Codes, July 2008; U.S. Census Bureau, Master Address File, July 2011.
Looking at Table 5, many of the blocks currently outside the CAUS universe have a DSF match rate of total adds that makes them promising candidates for inclusion in the CAUS universe. In particular, blocks that had ACT codes of C1, MB, MC, and MD (which are bolded in Table 5) going into AdCan have an above average proportion of blocks with adds and below average DSF match rates for total adds. Other ACT codes had much smaller DSF match rates of adds than would be expected given the DSF coverage of existing units. For example, blocks with an ACT code of C3 contain only city-style addresses that can be matched to delivery points on the DSF. However, only 33.9 percent of the city-style AdCan adds and 30.1 percent of the total adds were matched to a delivery point on the Spring 2011 DSF.
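The screening described above can be reproduced as a sketch from the Table 5 counts for the ACT codes outside the traditional universe. The paper does not state how "average" was computed, so comparing each code against the all-block totals in Table 5 is an assumption, and the names below are hypothetical.

```python
# Per ACT code: (blocks in ACT, blocks with adds, total adds, city-style adds on DSF).
# Counts are copied from the "M" and "N" rows of Table 5.
TABLE5_OUTSIDE_UNIVERSE = {
    "MA": (106_272, 67_150, 443_021, 99_673),
    "MB": (111_730, 61_792, 268_686, 38_385),
    "MC": (75_585, 40_048, 176_753, 20_908),
    "MD": (70_604, 34_826, 141_507, 15_312),
    "B1": (439, 40, 94, 2),
    "B2": (168, 9, 29, 19),
    "B3": (238_471, 4_395, 86_068, 48_626),
    "C1": (189_406, 44_415, 115_731, 6_295),
    "C2": (1_021_573, 315_568, 1_270_728, 283_202),
    "C3": (3_297_091, 371_906, 1_264_057, 379_973),
    "Z0": (2_466_679, 80_816, 276_136, 51_277),
}
TABLE5_TOTAL = (8_205_582, 1_313_466, 5_552_586, 1_045_605)


def candidate_act_codes(rows, total):
    """ACT codes with an above-average share of blocks with adds and a
    below-average DSF match rate of total adds (averages taken from `total`)."""
    avg_share = total[1] / total[0]
    avg_match = total[3] / total[2]
    return [
        act
        for act, (blocks, with_adds, adds, on_dsf) in rows.items()
        if with_adds / blocks > avg_share and on_dsf / adds < avg_match
    ]


candidates = candidate_act_codes(TABLE5_OUTSIDE_UNIVERSE, TABLE5_TOTAL)
```

Under that assumption the screen returns MB, MC, MD, and C1. MA and C2 have above-average shares of blocks with adds but fail the match-rate condition, and the remaining codes have below-average shares of blocks with adds.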
Table 6: DSF Match Rate of 2010 CAUS Experimental Listings by Pre-Listing ACT Code

Pre-Listing   Pre-    Blocks      Blocks      Listing Adds                        Adds on Spring 2011 DSF
CAUS Group    Listing Listed      with Adds   City-Style  Non-City-   Total       City-Style  Percent of      Percent of
              ACT                                         Style                   Adds on DSF City-Style Adds All Adds
Y             M1      812         325         740         608         1,348       2           0.3             0.1
Y             ME      355         148         253         216         469         8           3.2             1.7
Y             MF      237         101         158         175         333         2           1.3             0.6
Y             MG      1,563       682         1,431       1,773       3,204       10          0.7             0.3
Y             M3      3           0           0           0           0           0
Y             N1      466         166         108         501         609         1           0.9             0.2
Y             P1      56          10          4           23          27          0           0.0             0.0
Y             R1      132         24          8           38          46          0           0.0             0.0
Y             Total   3,624       1,456       2,702       3,334       6,036       23          0.9             0.4
M             MA      851         293         4,102       346         4,448       44          1.1             1.0
M             MB      1,180       324         677         284         961         8           1.2             0.8
M             MC      769         233         578         273         851         6           1.0             0.7
M             MD      830         210         304         227         531         8           2.6             1.5
M             Total   3,630       1,060       5,661       1,130       6,791       66          1.2             1.0
N             B3      10          2           171         0           171         0           0.0             0.0
N             C1      41          1           14          0           14          0           0.0             0.0
N             C2      305         79          994         43          1,037       27          2.7             2.6
N             C3      787         74          448         15          463         5           1.1             1.1
N             Z0      101         7           5           22          27          1           20.0            3.7
N             Total   1,244       163         1,632       80          1,712       33          2.0             1.9
Total                 8,498       2,679       9,995       4,544       14,539      122         1.2             0.8
Sources: U.S. Census Bureau, Geography Reference File-Codes, July 2009; U.S. Census Bureau, Master Address File Transaction File, December 2010; U.S. Census Bureau, Master Address File, January 2011 and July 2011.
As mentioned previously, the overall DSF match rate of adds from the 2010 CAUS experimental listings is much lower than the overall DSF match rate of adds from AdCan. Table 6 shows that this is the case for each ACT code group as well as overall. This low level of DSF matching suggests there could be benefit from including all blocks in the CAUS universe. Examining the ACT groups that AdCan data suggested as candidates for inclusion in the CAUS universe, we see that only one of the 41 listed blocks in the C1 group had adds. This does not provide enough information to draw a conclusion about the post-AdCan DSF match rate of adds for this ACT group. All ACT code groups in the "M" CAUS group have a good number of blocks with adds and low DSF match rates of adds. This, together with the AdCan data, provides evidence that MB, MC, MD, and possibly MA should be considered for inclusion in the CAUS universe.

2.5. Future Work
Preliminary investigations showed that blocks with certain ACT codes may benefit from being listed through the CAUS program. Future research will look into using additional block characteristics to refine which blocks, if any, are added to the CAUS universe. Some possible characteristics of interest are geographic location, number of pre-existing units, types of pre-existing units, percent of pre-existing units that are city-style, and percent of pre-existing units that can be matched to a delivery point on the DSF. When conducting this examination, the sparseness of blocks with adds in the data will be taken into consideration. If these investigations suggest groups of blocks to be added to the CAUS universe, additional experimental listings will be conducted as a confirmation. Similar investigations will be conducted for other geographic types such as block groups and tracts.
The CAUS branch will also look into whether the criteria currently used to filter adds in this study should be changed. Some possible alternatives to the current criteria are: (1) using inclusion in or exclusion from the most recent version of the ACS housing unit frame universe to filter the data; (2) using the same criteria that are used to filter housing units for inclusion in or exclusion from the list of existing MAF addresses provided to Census field representatives when they conduct listings; and (3) using the same criteria that are used to filter housing units for inclusion in or exclusion from the addresses used when computing a block's ACT code.

The CAUS branch will also work with MAF stakeholders to determine how long it should wait for an address added by a listing to appear on the DSF when determining the DSF match rate of adds, and whether non-listing techniques can be used to compensate for the issues caused by groups of blocks with poor DSF match rates for city-style address adds.

3. References
ALOIT (Address List Operations Implementation Team). (2012). "2010 Census Address Canvassing Operational Assessment Report." http://2010.census.gov/2010census/pdf/2010%20Census%20Address%20Canvassing%20Operational%20Assessment.pdf, January 17, 2012. Washington D.C.: U.S. Census Bureau.

Bates, Lawrence. (2010). "Editing the MAF Extracts and Creating the Unit Frame Universe for the American Community Survey (2011 Supplemental Phase)." Internally distributed software specification published as DSSD 2012 American Community Survey Memorandum Series #ACS11-UC-4, December 29, 2011. Washington D.C.: U.S. Census Bureau.

Bates, Lawrence. (2011). "Editing the MAF Extracts and Creating the Unit Frame Universe for the American Community Survey (2012 Main Phase)." Internally distributed software specification published as DSSD 2012 American Community Survey Memorandum Series #ACS12-UC-1, June 2, 2011. Washington D.C.: U.S. Census Bureau.

GEO (Geography Division). (2011a). "Address Characteristic Type Software Requirement Specification." Internally distributed software specification, version 1.5, last modified February 3, 2011. Washington D.C.: U.S. Census Bureau.

GEO (Geography Division). (2011b). "Delivery Sequence File Refresh Software Requirements Specification." Internally distributed software specification, version 1.8, last modified July 28, 2011. Washington D.C.: U.S. Census Bureau.

Kennel, Timothy and Joel Martin. (2010). "Summary Report for Evaluating MAF Content Quality After the 2010 Decennial Address Canvassing (Doc. #2010-4.0-G-16)." Internally distributed research report, version 1.0, November 15, 2010. Washington D.C.: U.S. Census Bureau.
Attachment A

Address Characteristic Type (ACT) Code Definitions

The ACT code is a two-character variable that describes the type of Master Address File (MAF) addresses in the block, and indicates how many of the block's MAF addresses can be matched to delivery points on the US Postal Service's Delivery Sequence File (DSF).

Table A-1: ACT Code Definitions with CAUS Groups

| CAUS Group | ACT Code | Definition |
|---|---|---|
| Y | D1 | Description only, MAF description only addresses cannot be matched to DSF addresses |
| Y | M1 | City-style and noncity-style, no addresses matched to DSF |
| Y | ME | City-style and noncity-style, some addresses matched to DSF where the percent City-style is in [75, 80) |
| Y | MF | City-style and noncity-style, some addresses matched to DSF where the percent City-style is in [70, 75) |
| Y | MG | City-style and noncity-style, some addresses matched to DSF where the percent City-style is in (0, 70) |
| Y | M3 | City-style and noncity-style, all addresses matched to DSF |
| Y | N1 | Assorted noncity-style, no addresses matched to DSF |
| Y | N2 | Assorted noncity-style, some addresses matched to DSF |
| Y | N3 | Assorted noncity-style, all addresses matched to DSF |
| Y | P1 | PO Box, no addresses matched to DSF |
| Y | P2 | PO Box, some addresses matched to DSF |
| Y | P3 | PO Box, all addresses matched to DSF |
| Y | R1 | Rural Route, no addresses matched to DSF |
| Y | R2 | Rural Route, some addresses matched to DSF |
| Y | R3 | Rural Route, all addresses matched to DSF |
| M | MA | City-style and noncity-style, some addresses matched to DSF where the percent City-style is in [95, 100) |
| M | MB | City-style and noncity-style, some addresses matched to DSF where the percent City-style is in [90, 95) |
| M | MC | City-style and noncity-style, some addresses matched to DSF where the percent City-style is in [85, 90) |
| M | MD | City-style and noncity-style, some addresses matched to DSF where the percent City-style is in [80, 85) |
| N | B1 | Non-residential only, no addresses matched to DSF |
| N | B2 | Non-residential only, some addresses matched to DSF |
| N | B3 | Non-residential only, all addresses matched to DSF |
| N | C1 | City-style, no addresses matched to DSF |
| N | C2 | City-style, some addresses matched to DSF |
| N | C3 | City-style, all addresses matched to DSF |
| N | Z0 | No addresses |
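The percent city-style intervals that separate the MA through MG codes in Table A-1 can be expressed as a small lookup. The Python sketch below is only an illustration of those intervals for mixed blocks in which some, but not all, addresses match the DSF. It is not the ACT software described in GEO (2011a), and the names `M_SERIES_BANDS`, `CAUS_GROUP`, and `mixed_block_act` are hypothetical.

```python
# Inclusive lower bound of percent city-style -> ACT code, per Table A-1.
# Bands are listed from highest to lowest so the first match wins.
M_SERIES_BANDS = [
    (95.0, "MA"),
    (90.0, "MB"),
    (85.0, "MC"),
    (80.0, "MD"),
    (75.0, "ME"),
    (70.0, "MF"),
]

# CAUS group of each code, per Table A-1.
CAUS_GROUP = {
    "MA": "M", "MB": "M", "MC": "M", "MD": "M",
    "ME": "Y", "MF": "Y", "MG": "Y",
}


def mixed_block_act(pct_city_style):
    """Return the ACT code for a mixed block with a partial DSF match."""
    if not 0.0 < pct_city_style < 100.0:
        raise ValueError("percent city-style must be strictly between 0 and 100")
    for lower_bound, code in M_SERIES_BANDS:
        if pct_city_style >= lower_bound:
            return code
    return "MG"  # remaining interval (0, 70)


for pct in (97.5, 82.0, 40.0):
    code = mixed_block_act(pct)
    print(pct, code, CAUS_GROUP[code])
```

Under this reading of the table, a block moves from CAUS group Y to group M once its percent city-style reaches 80.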
Attachment B

This attachment provides detailed information about the DSF match rate of 2010 AdCan adds. Note that only AdCan adds which were both in the final 2010 Census universe and residential housing units within the 2012 ACS main frame were included in the counts below. The Spring 2008 DSF was used as an input into creating the AdCan frame, and the Spring 2010 DSF is the first DSF update to the MAF after AdCan adds were added to the MAF.

Table B-1: DSF Match Rate History of AdCan Adds by Pre-AdCan ACT Code

In the table below, "Adds" is the number of AdCan adds on the named DSF and "Pct" is the percent of AdCan adds.

| Pre-AdCan CAUS Group | Pre-AdCan ACT | Blocks with AdCan Adds | Total AdCan Adds | Spring 2008 DSF Adds | Fall 2008 DSF Adds | Fall 2008 DSF Pct | Spring 2009 DSF Adds | Spring 2009 DSF Pct | Fall 2009 DSF Adds | Fall 2009 DSF Pct | Spring 2010 DSF Adds | Spring 2010 DSF Pct | Fall 2010 DSF Adds | Fall 2010 DSF Pct | Spring 2011 DSF Adds | Spring 2011 DSF Pct | Any DSF Adds | Any DSF Pct |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Y | M1 | 65,248 | 377,982 | 0 | 4,025 | 1.1 | 8,036 | 2.1 | 11,102 | 2.9 | 12,401 | 3.3 | 14,216 | 3.8 | 16,101 | 4.3 | 16,193 | 4.3 |
| Y | ME | 24,843 | 110,523 | 0 | 3,152 | 2.9 | 5,871 | 5.3 | 8,224 | 7.4 | 9,375 | 8.5 | 10,270 | 9.3 | 10,857 | 9.8 | 10,900 | 9.9 |
| Y | MF | 14,554 | 79,373 | 0 | 1,913 | 2.4 | 3,481 | 4.4 | 4,955 | 6.2 | 5,601 | 7.1 | 6,143 | 7.7 | 6,681 | 8.4 | 6,711 | 8.5 |
| Y | MG | 113,539 | 694,387 | 0 | 14,555 | 2.1 | 26,179 | 3.8 | 37,871 | 5.5 | 42,646 | 6.1 | 47,881 | 6.9 | 51,527 | 7.4 | 51,775 | 7.5 |
| Y | M3 | 83 | 1,118 | 0 | 4 | 0.4 | 33 | 3.0 | 33 | 3.0 | 33 | 3.0 | 33 | 3.0 | 33 | 3.0 | 33 | 3.0 |
| Y | N1 | 52,647 | 202,028 | 0 | 3,285 | 1.6 | 6,520 | 3.2 | 9,288 | 4.6 | 10,128 | 5.0 | 11,537 | 5.7 | 12,503 | 6.2 | 12,535 | 6.2 |
| Y | N2 | 45 | 299 | 0 | 6 | 2.0 | 19 | 6.4 | 25 | 8.4 | 27 | 9.0 | 34 | 11.4 | 35 | 11.7 | 35 | 11.7 |
| Y | P1 | 5,438 | 11,429 | 0 | 108 | 0.9 | 192 | 1.7 | 292 | 2.6 | 352 | 3.1 | 426 | 3.7 | 452 | 4.0 | 455 | 4.0 |
| Y | R1 | 16,095 | 32,612 | 0 | 1,035 | 3.2 | 2,006 | 6.2 | 2,718 | 8.3 | 3,028 | 9.3 | 3,380 | 10.4 | 3,736 | 11.5 | 3,741 | 11.5 |
| Y | R2 | 7 | 23 | 0 | 11 | 47.8 | 11 | 47.8 | 12 | 52.2 | 12 | 52.2 | 12 | 52.2 | 12 | 52.2 | 12 | 52.2 |
| Y | R3 | 2 | 2 | 0 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 |
| Y | Total | 292,501 | 1,509,776 | 0 | 28,094 | 1.9 | 52,348 | 3.5 | 74,520 | 4.9 | 83,603 | 5.5 | 93,932 | 6.2 | 101,937 | 6.8 | 102,390 | 6.8 |
| M | MA | 67,150 | 443,021 | 0 | 33,323 | 7.5 | 59,597 | 13.5 | 83,373 | 18.8 | 91,715 | 20.7 | 96,638 | 21.8 | 99,673 | 22.5 | 100,079 | 22.6 |
| M | MB | 61,792 | 268,686 | 0 | 12,698 | 4.7 | 22,210 | 8.3 | 30,685 | 11.4 | 34,048 | 12.7 | 36,679 | 13.7 | 38,385 | 14.3 | 38,505 | 14.3 |
| M | MC | 40,048 | 176,753 | 0 | 6,365 | 3.6 | 11,530 | 6.5 | 16,221 | 9.2 | 18,144 | 10.3 | 19,790 | 11.2 | 20,908 | 11.8 | 20,984 | 11.9 |
| M | MD | 34,826 | 141,507 | 0 | 4,631 | 3.3 | 8,322 | 5.9 | 11,850 | 8.4 | 13,290 | 9.4 | 14,499 | 10.2 | 15,312 | 10.8 | 15,361 | 10.9 |
| M | Total | 203,816 | 1,029,967 | 0 | 57,017 | 5.5 | 101,659 | 9.9 | 142,129 | 13.8 | 157,197 | 15.3 | 167,606 | 16.3 | 174,278 | 16.9 | 174,929 | 17.0 |
| N | B1 | 40 | 94 | 0 | 1 | 1.1 | 1 | 1.1 | 1 | 1.1 | 2 | 2.1 | 2 | 2.1 | 2 | 2.1 | 2 | 2.1 |
| N | B2 | 9 | 29 | 0 | 0 | 0.0 | 0 | 0.0 | 19 | 65.5 | 19 | 65.5 | 19 | 65.5 | 19 | 65.5 | 19 | 65.5 |
| N | B3 | 4,395 | 86,068 | 0 | 12,159 | 14.1 | 27,654 | 32.1 | 39,799 | 46.2 | 46,228 | 53.7 | 48,174 | 56.0 | 48,626 | 56.5 | 48,709 | 56.6 |
| N | C1 | 44,415 | 115,731 | 0 | 1,085 | 0.9 | 2,580 | 2.2 | 3,983 | 3.4 | 4,740 | 4.1 | 5,439 | 4.7 | 6,295 | 5.4 | 6,533 | 5.6 |
| N | C2 | 315,568 | 1,270,728 | 0 | 91,104 | 7.2 | 158,096 | 12.4 | 225,620 | 17.8 | 254,302 | 20.0 | 273,247 | 21.5 | 283,202 | 22.3 | 283,996 | 22.3 |
| N | C3 | 371,906 | 1,264,057 | 0 | 127,460 | 10.1 | 220,121 | 17.4 | 317,509 | 25.1 | 355,199 | 28.1 | 372,513 | 29.5 | 379,973 | 30.1 | 380,791 | 30.1 |
| N | Z0 | 80,816 | 276,136 | 0 | 17,547 | 6.4 | 28,172 | 10.2 | 41,515 | 15.0 | 47,917 | 17.4 | 49,950 | 18.1 | 51,278 | 18.6 | 51,431 | 18.6 |
| N | Total | 817,149 | 3,012,843 | 0 | 249,356 | 8.3 | 436,624 | 14.5 | 628,446 | 20.9 | 708,407 | 23.5 | 749,344 | 24.9 | 769,395 | 25.5 | 771,481 | 25.6 |
| Total | | 1,313,466 | 5,552,586 | 0 | 334,467 | 6.0 | 590,631 | 10.6 | 845,095 | 15.2 | 949,207 | 17.1 | 1,010,882 | 18.2 | 1,045,610 | 18.8 | 1,048,800 | 18.9 |

Sources: U.S. Census Bureau, Geography Reference File-Codes, July 2008; U.S. Census Bureau, Master Address File, July 2011.
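One way to read the match rate history in Table B-1 is to look at the net change in matched adds between consecutive DSF vintages, which indicates how quickly AdCan adds showed up on the DSF. The Python sketch below does this for two rows copied from Table B-1. It assumes each column is a snapshot count for that DSF vintage, so the differences are net changes rather than counts of newly matched addresses. The function name `net_change_by_vintage` is illustrative only.

```python
VINTAGES = [
    "Spring 2008", "Fall 2008", "Spring 2009", "Fall 2009",
    "Spring 2010", "Fall 2010", "Spring 2011",
]

# AdCan adds found on each DSF vintage, copied from Table B-1
on_dsf = {
    "C3": [0, 127460, 220121, 317509, 355199, 372513, 379973],
    "MB": [0, 12698, 22210, 30685, 34048, 36679, 38385],
}
total_adds = {"C3": 1264057, "MB": 268686}


def net_change_by_vintage(counts):
    """Difference between each vintage's count and the previous vintage's."""
    return [later - earlier for earlier, later in zip(counts, counts[1:])]


for act, counts in on_dsf.items():
    changes = net_change_by_vintage(counts)
    for vintage, change in zip(VINTAGES[1:], changes):
        pct = round(100.0 * change / total_adds[act], 1)
        print(f"{act} {vintage}: {change:+,} adds ({pct} percent of all adds)")
```

For both rows the net gains shrink with each later vintage, which bears on the question of how long to wait before computing a match rate.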
Attachment C

This attachment provides detailed information about the DSF match rate of adds from the 2010 CAUS experimental listings. Note that only adds which were residential housing units within the universe of the 2011 ACS supplemental frame were included in the counts below. The Fall 2009 DSF was used as an input into creating the frame for these listings, and the Spring 2011 DSF is the first DSF update to the MAF after the adds from these listings were added to the MAF.

Table C-1: DSF Match Rate History of 2010 CAUS Experimental Listings by Pre-Listing ACT Code

In the table below, "Adds" is the number of listing adds on the named DSF and "Pct" is the percent of all adds.

| Pre-Listing CAUS Group | Pre-Listing ACT | Blocks with Adds | Total Adds | Fall 2009 DSF Adds | Fall 2009 DSF Pct | Spring 2010 DSF Adds | Spring 2010 DSF Pct | Fall 2010 DSF Adds | Fall 2010 DSF Pct | Spring 2011 DSF Adds | Spring 2011 DSF Pct | Any DSF Adds | Any DSF Pct |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Y | M1 | 325 | 1,348 | 0 | 0.0 | 0 | 0.0 | 1 | 0.1 | 2 | 0.1 | 2 | 0.1 |
| Y | ME | 148 | 469 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 8 | 1.7 | 8 | 1.7 |
| Y | MF | 101 | 333 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 2 | 0.6 | 2 | 0.6 |
| Y | MG | 682 | 3,204 | 0 | 0.0 | 0 | 0.0 | 4 | 0.1 | 10 | 0.3 | 10 | 0.3 |
| Y | N1 | 166 | 609 | 0 | 0.0 | 0 | 0.0 | 1 | 0.2 | 1 | 0.2 | 1 | 0.2 |
| Y | P1 | 10 | 27 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 |
| Y | R1 | 24 | 46 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 |
| Y | Total | 1,456 | 6,036 | 0 | 0.0 | 0 | 0.0 | 6 | 0.1 | 23 | 0.4 | 23 | 0.4 |
| M | MA | 293 | 4,448 | 0 | 0.0 | 0 | 0.0 | 4 | 0.1 | 44 | 1.0 | 44 | 1.0 |
| M | MB | 324 | 961 | 0 | 0.0 | 0 | 0.0 | 2 | 0.2 | 8 | 0.8 | 8 | 0.8 |
| M | MC | 233 | 851 | 0 | 0.0 | 0 | 0.0 | 3 | 0.4 | 6 | 0.7 | 6 | 0.7 |
| M | MD | 210 | 531 | 0 | 0.0 | 0 | 0.0 | 4 | 0.8 | 8 | 1.5 | 8 | 1.5 |
| M | Total | 1,060 | 6,791 | 0 | 0.0 | 0 | 0.0 | 13 | 0.2 | 66 | 1.0 | 66 | 1.0 |
| N | B3 | 2 | 171 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 |
| N | C1 | 1 | 14 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 |
| N | C2 | 79 | 1,037 | 0 | 0.0 | 0 | 0.0 | 18 | 1.7 | 27 | 2.6 | 27 | 2.6 |
| N | C3 | 74 | 463 | 0 | 0.0 | 0 | 0.0 | 1 | 0.2 | 5 | 1.1 | 5 | 1.1 |
| N | Z0 | 7 | 27 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 1 | 3.7 | 1 | 3.7 |
| N | Total | 163 | 1,712 | 0 | 0.0 | 0 | 0.0 | 19 | 1.1 | 33 | 1.9 | 33 | 1.9 |
| Total | | 2,679 | 14,539 | 0 | 0.0 | 0 | 0.0 | 38 | 0.3 | 122 | 0.8 | 122 | 0.8 |
Sources: U.S. Census Bureau, Geography Reference File-Codes, July 2009; U.S. Census Bureau, Master Address File Transaction File, December 2010; U.S. Census Bureau, Master Address File, January 2011 and July 2011.
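The comparison between the AdCan and CAUS experimental listing match rates can be reproduced from the group totals in Tables B-1 and C-1. The Python sketch below computes the Spring 2011 DSF match rate of total adds for each CAUS group from the counts in those tables. The variable and function names are illustrative assumptions.

```python
# (total adds, adds on the Spring 2011 DSF) by pre-operation CAUS group
adcan = {  # group totals copied from Table B-1
    "Y": (1509776, 101937),
    "M": (1029967, 174278),
    "N": (3012843, 769395),
}
caus = {  # group totals copied from Table C-1
    "Y": (6036, 23),
    "M": (6791, 66),
    "N": (1712, 33),
}


def pct_matched(total, matched):
    """Percent of adds found on the DSF, to one decimal place."""
    return round(100.0 * matched / total, 1)


comparison = {
    group: (pct_matched(*adcan[group]), pct_matched(*caus[group]))
    for group in adcan
}
for group, (adcan_pct, caus_pct) in comparison.items():
    print(f"Group {group}: AdCan {adcan_pct} percent, CAUS listings {caus_pct} percent")
```

This comparison does not adjust for elapsed time: per Attachments B and C, the Spring 2011 DSF is the first DSF update after the CAUS listing adds reached the MAF, whereas the first DSF update after the AdCan adds was the Spring 2010 DSF.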