Aug 25 -- Today, the Census Bureau released new 2010 demonstration data and performance metrics for the 2020 Census Demographic and Housing Characteristics File (DHC). A 30-day public feedback period begins today and will end on September 26, 2022. This is the final opportunity to provide feedback on the DHC before production settings are finalized. See below for feedback submission instructions. Note that the release of the final 2020 Census DHC data product is on track and remains slated for May 2023.

As with previous versions, these demonstration data apply the current iteration of the disclosure avoidance system (DAS) to 2010 Census data. This allows a side-by-side comparison of the impact of this version of the DAS against the published 2010 Census tables.

Based on feedback, the Census Bureau added additional tables and geographies to the proposed DHC data product. These updates are reflected in the demonstration data and updated 2020 Census Data Product Planning Crosswalk (v. 2022-08-25). Specifically:

-- We lowered the proposed lowest level of geography from state or county level to census tract level for 16 tables and 63 iterated tables (tables repeated by race and ethnicity).
-- We added 15 iterated tables for sex by single year of age at the census tract level.  

These updates resulted in changes to the table numbers. With these table changes, this updated DHC data product proposal more closely reflects its predecessor, the 2010 Census Summary File 1 (SF1). Note that this release includes both the person and housing tables; these were released separately in the previous round of demonstration data (v. 2022-03-16).

Based on feedback, we’ve added a “topics” column to the “DHC Crosswalk” tab, so you can more easily identify tables of interest. The full topic list is in the “README” tab. Although out of scope for the DHC demonstration data product, the Crosswalk reflects other important changes based on public feedback. For example:

-- We added census tracts to the proposed geography for the Demographic Profile.
-- We split the Detailed DHC data product into three data products.
-- We proposed additional geographies for the Detailed DHC-A and Detailed DHC-B data products.

For a list of all updates, see the “Change Log” tab.

While protecting the confidentiality of respondents’ information, these demonstration data reflect our efforts to meet an array of accuracy targets established by Census Bureau subject matter experts based on internal program requirements and feedback from data users. We work to meet these targets through a combination of adjustments; for example, changing the amount of privacy-loss budget (PLB) applied to sets of tabulations and making other algorithmic improvements to the DAS. We call this “tuning” the DAS.  

It’s also important to note that when we speak about the accuracy of data, we are referring to statistical accuracy. Consistent with the Office of Management and Budget’s Statistical Policy Directive #1 (as codified in Title III of the Foundations for Evidence-based Policymaking Act of 2018), statistical accuracy means that our publicly released data products meet established information quality guidelines while also protecting the confidentiality of respondents’ information and, when appropriate, providing information on limitations of the data that may assist data users in determining the suitability of the data for their purposes.

Recall that we discovered an error in the allocation of the privacy-loss budget after release of the first DHC demonstration data product in March 2022. Today’s data release reflects the correction to that error in addition to other improvements.  

As you analyze the data, please consider the following:

-- Block-level data may be noisy. They are intended to be fit for use when aggregated into larger, geographically contiguous entities, not as a unit of analysis on their own.
-- Be mindful of zero counts, that is, geographies with no people of a particular characteristic. It is important to include these cases when calculating metrics for accuracy or bias. For the Detailed Summary Metrics, if the published and differentially private counts for a particular table entry are both zero, we count that as no difference (0% change). If the published count is zero and the differentially private count is not, we count that as a 200% change (following the arc percentage change method used in economic statistics), no matter what the differentially private count is.
-- Keep in mind that comparing the demonstration data to the published 2010 data is imperfect because the 2010 data used the “swapping” method of disclosure avoidance. 
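
As a rough illustration of the zero-count rule above, here is a minimal Python sketch of the standard arc (midpoint) percentage change. The function name and the explicit handling of the both-zero case are assumptions for illustration, not the Census Bureau's actual metrics code. The sketch only shows why a published count of zero always yields 200%, whatever the differentially private count is.

```python
def arc_percent_change(published: float, dp: float) -> float:
    """Arc (midpoint) percentage change from the published count to the
    differentially private (DP) count.

    Hypothetical helper for illustration; counts are assumed non-negative.
    """
    if published < 0 or dp < 0:
        raise ValueError("counts are assumed to be non-negative")
    # Both counts zero: treated as no difference (0% change).
    if published == 0 and dp == 0:
        return 0.0
    # Divide the difference by the midpoint of the two counts.
    midpoint = (published + dp) / 2
    return 100.0 * (dp - published) / midpoint


# A published zero gives 200% regardless of the DP count.
print(arc_percent_change(0, 1))   # 200.0
print(arc_percent_change(0, 50))  # 200.0
print(arc_percent_change(0, 0))   # 0.0
```

Because the denominator is the midpoint rather than the published count, the measure stays defined when the published count is zero and is bounded between -200% and 200%.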

The following charts highlight some of the improvements since the March 2022 demonstration data release.

· Significant improvements in relationship to householder data: There is a significant improvement in accuracy for Relationship to Householder at all levels of geography. The Mean Absolute Error (MAE) for Householder at the county level, for example, has declined from 135.81 to 6.83. MAE is a measure of the “average” absolute value of the count difference for a particular statistic. See Figure 1 for improvements for other relationship categories. . . .
· Improvements in single year of age for the population 75 and older: Figure 2 compares the differentially private count to the published count for single years of age 75 - 99 and for the 5-year age groups 100 - 104, 105 - 109, and 110 and older. The black horizontal line represents zero difference between the differentially private count and the published count, i.e., perfect accuracy. As shown, the first DHC demonstration data product, represented in blue, had a large positive error for centenarians. The second demonstration data product, represented in orange, is much closer to zero, which shows the improvements in accuracy for this population. . . .
In addition to the improvements shown in the graphics above, this second demonstration data product has resolved implausible detailed group quarters types. For example, the second demonstration data product contains zero people living in military ships in Wyoming because there are no group quarters of that type in the state.
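
For anyone reproducing MAE figures like those cited above, the following is a minimal Python sketch of the calculation as described: the average absolute difference between the differentially private and published counts for one statistic across a set of geographies. The function name and the sample counts are made up for illustration and are not taken from the demonstration data.

```python
def mean_absolute_error(published, dp):
    """Average absolute difference between DP and published counts.

    Both arguments are equal-length sequences, one count per geography
    (e.g. one per county). Hypothetical helper for illustration.
    """
    if len(published) != len(dp):
        raise ValueError("published and dp must have the same length")
    if len(published) == 0:
        raise ValueError("at least one geography is required")
    total = sum(abs(d - p) for p, d in zip(published, dp))
    return total / len(published)


# Made-up counts for three geographies; the zero-count geography is kept
# in the calculation rather than dropped.
published_counts = [120, 0, 45]
dp_counts = [118, 3, 45]
print(mean_absolute_error(published_counts, dp_counts))
```

With these made-up numbers the absolute differences are 2, 3, and 0, so the MAE is 5/3, or about 1.67.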

We continue to explore the feasibility of improvements in other areas identified, notably:

-- Additional accuracy for the group quarters population, ages 18-25.
-- Additional accuracy for counties and tracts. We’re exploring the impact of shifting varying amounts of privacy-loss budget from the state and national level to counties and tracts.
-- Additional accuracy for school district data, including single year of age and family type, among others.

DHC tuning continues as we work to meet targets based on use cases. Your feedback has been instrumental to this effort. During this final round of evaluation and feedback, we ask that you identify tables and geographies that have differences that would impact your use of the data and provide acceptable targets when possible.

To meet our 2020 Census DHC production release date of May 2023, the comment period must end by September 26. Please submit comments to 2020DAS@census.gov using the subject “2020 Census Data Products.” If you do not want us to publish your feedback or if you want us to remove your identifying information, please indicate that in the email.

Reminder: Webinar on Wednesday, August 31. Join us for a webinar on Wednesday, August 31, at 3:00 p.m. ET to learn more about the demonstration data released today. https://www.census.gov/data/academy/webinars/2021/disclosure-avoidance-series.html
