The Significance of Data Disaggregation in Identifying and Addressing Diabetes Disparities

Introduction: The Persistent Challenge of Diabetes Disparities

Diabetes mellitus remains one of the most pressing public health challenges of the 21st century. According to the Centers for Disease Control and Prevention (CDC), over 37 million Americans have diabetes, and approximately 96 million adults have prediabetes. While the condition affects people from all walks of life, the burden is not evenly distributed. Racial and ethnic minorities, individuals with lower socioeconomic status, and people living in rural or underserved areas consistently experience higher prevalence rates, greater complications, and poorer outcomes. These disparities—measurable differences in health outcomes between population groups—are not accidental; they reflect deep-rooted structural inequities in healthcare access, social determinants of health, and systemic bias.

To effectively address these inequities, healthcare leaders, policymakers, and researchers must move beyond broad averages and examine the nuanced patterns hidden within aggregated statistics. This is where data disaggregation becomes indispensable. Disaggregation is the process of breaking down health data into finer subgroups—by race, ethnicity, age, sex, income, geography, education level, and other variables—to reveal disparities that would otherwise remain invisible. Without this granular lens, interventions risk being blanket solutions that fail to reach the populations most in need.

This article explores the critical role of data disaggregation in identifying and mitigating diabetes disparities. We will examine how disaggregation exposes hidden inequities, discuss methodologies for collecting and analyzing subgroup data, review successful case studies, and outline practical steps for integrating disaggregated data into public health strategies. By the end, it should be clear that data disaggregation is not merely a technical exercise but a moral imperative for achieving health equity.

Why Aggregated Data Masks Critical Disparities

At first glance, using aggregate data, such as the national average prevalence of diabetes, seems efficient. However, averages can be profoundly misleading. When data are pooled across diverse populations, high rates in some groups are offset by low rates in others, and the disparity disappears from view. For example, a city with a 10% diabetes prevalence overall might actually have a 5% rate in affluent white neighborhoods and a 20% rate in low-income Black communities. The aggregate figure obscures this four-fold difference, leading policymakers to believe the problem is uniform when it is anything but.
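The arithmetic behind this example can be checked in a few lines. The sketch below uses hypothetical population counts (not drawn from any real city), chosen so that the two neighborhoods pool to 10%; all names are illustrative.

```python
# Hypothetical counts chosen so that the pooled rate matches the 10% example.
neighborhoods = {
    "affluent": {"population": 20_000, "cases": 1_000},    # 5% prevalence
    "low_income": {"population": 10_000, "cases": 2_000},  # 20% prevalence
}


def prevalence(cases, population):
    """Return prevalence as a fraction of the population."""
    return cases / population


# Stratified (disaggregated) rates, one per neighborhood.
stratified = {
    name: prevalence(d["cases"], d["population"])
    for name, d in neighborhoods.items()
}

# Pooled (aggregate) rate: total cases over total population.
total_cases = sum(d["cases"] for d in neighborhoods.values())
total_population = sum(d["population"] for d in neighborhoods.values())
pooled = prevalence(total_cases, total_population)

# Ratio between the highest and lowest stratum-specific rates.
disparity_ratio = max(stratified.values()) / min(stratified.values())

print(f"pooled: {pooled:.1%}")
for name, rate in stratified.items():
    print(f"{name}: {rate:.1%}")
print(f"disparity ratio: {disparity_ratio:.1f}x")
```

The pooled rate is a population-weighted average, so it sits closer to the rate of the larger neighborhood and says nothing about the spread between the two.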

Consider a real-world illustration: the American Diabetes Association reports that American Indian/Alaska Native adults have the highest age-adjusted prevalence of diabetes (14.7%), followed by non-Hispanic Blacks (12.5%), Hispanics (11.7%), and non-Hispanic whites (7.5%). While these disaggregated rates are alarming, a national average would conceal the severe burden on Indigenous and Black communities. Moreover, even within broad racial categories, there is substantial variation. For instance, among Hispanic subgroups, Puerto Ricans have a diabetes prevalence nearly double that of Cubans. Without disaggregation by specific ethnicity, such differences go unnoticed.
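Age adjustment, as in the figures above, is commonly done by direct standardization: each group's age-specific rates are weighted by a shared standard population so that groups with different age structures can be compared fairly. The sketch below is a minimal illustration of that idea. The age bands, rates, population shares, and standard weights are invented for the example and are not CDC or ADA figures.

```python
# Hypothetical standard population weights (shares sum to 1).
STANDARD_WEIGHTS = {"18-44": 0.50, "45-64": 0.33, "65+": 0.17}

# Hypothetical age-specific prevalence, deliberately identical in both groups.
AGE_SPECIFIC_RATES = {"18-44": 0.03, "45-64": 0.12, "65+": 0.25}

# Hypothetical age structure of each group (shares sum to 1).
age_structure = {
    "younger_group": {"18-44": 0.60, "45-64": 0.30, "65+": 0.10},
    "older_group": {"18-44": 0.40, "45-64": 0.35, "65+": 0.25},
}


def weighted_rate(rates, weights):
    """Weighted average of age-specific rates using the given age shares."""
    return sum(rates[band] * weights[band] for band in rates)


# Crude rate: weights come from each group's own age structure.
crude = {
    group: weighted_rate(AGE_SPECIFIC_RATES, shares)
    for group, shares in age_structure.items()
}

# Age-adjusted rate: every group is weighted by the same standard population.
adjusted = {
    group: weighted_rate(AGE_SPECIFIC_RATES, STANDARD_WEIGHTS)
    for group in age_structure
}

for group in age_structure:
    print(f"{group}: crude={crude[group]:.3f} adjusted={adjusted[group]:.3f}")
```

Here the crude rates differ (about 8% versus about 12%) only because one group is older, while the adjusted rates are identical. When adjusted rates still differ between groups, as in the national figures, age structure alone cannot explain the gap.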

Aggregate data also hide disparities in diabetes-related complications, such as lower-limb amputations, kidney failure, and retinopathy. Research shows that Black patients with diabetes are 3–4 times more likely to undergo lower-limb amputations than white patients, even after controlling for disease severity and insurance status. Yet, a hospital that only tracks overall amputation rates may not identify this profound inequity. Disaggregation by race and ethnicity is the only way to pinpoint where interventions are most urgently needed.
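A statement such as "3 to 4 times more likely" is a risk ratio between two groups. As a rough sketch of how a hospital might compute one from its own stratified counts, the function below returns an unadjusted risk ratio with an approximate 95% confidence interval computed on the log scale. The counts are hypothetical, and the calculation does not control for disease severity or insurance status, which would require a regression model.

```python
import math


def risk_ratio(events_a, n_a, events_b, n_b, z=1.96):
    """Unadjusted risk ratio of group A vs. group B with a log-scale CI.

    Uses the standard large-sample standard error of log(RR):
    sqrt(1/events_a - 1/n_a + 1/events_b - 1/n_b).
    """
    rr = (events_a / n_a) / (events_b / n_b)
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical amputation counts among patients with diabetes.
rr, lower, upper = risk_ratio(events_a=45, n_a=3000, events_b=30, n_b=7000)
print(f"risk ratio: {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With these made-up counts the risk ratio is 3.5. An interval that excludes 1 suggests the difference is unlikely to be chance alone, but it says nothing about the cause.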

Diabetes disparities are not biologically predetermined; they are heavily shaped by social determinants of health (SDOH): income, education, housing, food security, transportation, and access to healthcare. For example, a person living in a food desert with limited access to fresh produce faces greater challenges in managing their diabetes than someone in a well-resourced neighborhood. Disaggregating data by zip code or census tract often reveals that diabetes prevalence and complication rates cluster in areas with high poverty and low access to primary care. This geographic lens is a powerful tool for resource allocation.

Key Dimensions of Data Disaggregation for Diabetes

Effective disaggregation requires collecting and analyzing data across multiple dimensions. While race and ethnicity are common starting points, they are far from sufficient. The following categories are especially relevant for understanding diabetes disparities:

Race, Ethnicity, and Ancestry

As noted above, disaggregating by racial and ethnic groups—and ideally, by detailed subgroups (e.g., Mexican, Puerto Rican, Chinese, Vietnamese)—reveals differential risk. Genetic factors, such as higher rates of insulin resistance in some populations, interact with social and environmental factors. Data systems must collect granular ethnicity data to enable meaningful analysis.

Geographic Location

Diabetes prevalence varies dramatically by region, state, and even neighborhood. The CDC’s Diabetes Surveillance System provides county-level estimates that show hotspots in the Southeast and Appalachia. Disaggregation by urban, suburban, and rural status also matters: rural residents face barriers like limited specialty care and longer travel distances.

Socioeconomic Status

Income and education level are among the strongest predictors of diabetes outcomes. People in the lowest income brackets are 2–3 times more likely to have diabetes than those at the top. Disaggregating by poverty level or educational attainment helps identify which socioeconomic segments need targeted support, such as subsidized diabetes self-management programs.

Age and Sex

Diabetes prevalence increases with age, but age-related patterns differ by sex. For example, women may experience greater complications from cardiovascular disease associated with diabetes. Disaggregating by age group (e.g., 18–44, 45–64, 65+) and sex enables tailored screening recommendations and treatment protocols.

Healthcare Access and Insurance Status

Uninsured and underinsured individuals are less likely to receive regular screening, consistent medication, and preventive care. Disaggregating diabetes data by insurance type (private, Medicare, Medicaid, uninsured) reveals how health system barriers contribute to disparities. For instance, people on Medicaid often face provider shortages and difficulty finding clinicians who accept their coverage, so having insurance does not guarantee access to care.

Language and Cultural Factors

Limited English proficiency (LEP) is associated with lower health literacy and poorer diabetes outcomes. Disaggregating by primary language spoken can guide the development of culturally and linguistically appropriate educational materials and interpreter services.

How Disaggregated Data Identifies Hidden Disparities

The power of disaggregation lies in its ability to surface patterns that aggregated data would obscure. Below are several concrete examples of how disaggregation has uncovered critical disparities and led to targeted action.

Example 1: Race Disaggregation in Screening Rates

A community health system in a diverse city aggregated its diabetes screening rates and found they were at 75% overall—seemingly acceptable. However, when data were disaggregated by race, Black and Hispanic patients had screening rates of only 58% and 61%, respectively. Further analysis by neighborhood showed that clinics in predominantly Black and Hispanic communities had fewer screening events and shorter operating hours. This prompted the health system to deploy mobile screening units and extend evening hours in those areas, raising screening rates by 20% within one year.

Example 2: Age and Income Intersection

Public health researchers in a western state analyzed diabetes hospitalizations using both age and income data. They found that low-income adults aged 45–64 had a hospitalization rate for diabetic ketoacidosis (DKA) that was three times higher than their higher-income peers in the same age range. This insight led to a targeted program providing free glucometers and telemedicine coaching for low-income middle-aged adults. DKA admissions dropped by 35% in the intervention group over 18 months.

Example 3: Language as a Barrier

A hospital system serving a large Vietnamese-speaking population noticed that diabetes-related emergency department visits were significantly higher among Vietnamese-speaking patients compared to English-speaking patients, even when controlling for medical complexity. Disaggregating by language revealed that translated diabetes self-management materials were rarely used because they were not culturally tailored. In response, the hospital hired bilingual community health workers and developed video-based education in Vietnamese, resulting in a 25% reduction in ED visits among that population.

Practical Steps for Implementing Data Disaggregation

Moving from intention to practice requires systematic changes in data collection, analysis, and use. The following steps provide a roadmap for healthcare organizations, public health departments, and community groups.

1. Standardize Collection of Demographic Data

To perform disaggregation, you first need high-quality, granular demographic data. This means moving beyond the typical “Race/Ethnicity” dropdown that lumps all Hispanics or all Asians into one category. Use detailed categories that reflect the local population—e.g., for Asian American patients, include subgroups like Chinese, Filipino, Indian, Vietnamese, Korean, Japanese. Collect data on additional variables such as income, education, primary language, and zip code. Ensure that data collection is done in a culturally sensitive manner; explain to patients why this information is important for improving care.

2. Build Analytical Capacity

Disaggregation requires statistical methods that can handle small sample sizes without compromising privacy. For very small subgroups, consider aggregating data over time or using Bayesian smoothing to generate stable estimates. Train data analysts in techniques such as stratified analysis, logistic regression, and multilevel modeling. Open-source tools like R and Python, combined with packages for small-area estimation, are invaluable.
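One simple form of the smoothing mentioned above is to shrink each subgroup's observed rate toward the overall rate, so that small subgroups are pulled in more than large ones. The sketch below uses the posterior mean of a Beta-Binomial model with a prior centered on the pooled rate. The counts are hypothetical, and the prior strength `PRIOR_N` is an assumed tuning value; a full analysis would estimate it from the data or fit a hierarchical model.

```python
def smoothed_rate(cases, n, prior_rate, prior_n):
    """Posterior mean under a Beta prior centered on prior_rate.

    The prior behaves like `prior_n` extra observations at `prior_rate`,
    so subgroups with small n are shrunk more strongly toward it.
    """
    return (cases + prior_rate * prior_n) / (n + prior_n)


# Hypothetical (cases, sample size) per subgroup.
subgroups = {
    "subgroup_large": (240, 2000),  # raw rate 12%, stable
    "subgroup_small": (6, 20),      # raw rate 30%, but only 20 people
}

total_cases = sum(cases for cases, _ in subgroups.values())
total_n = sum(n for _, n in subgroups.values())
overall = total_cases / total_n

PRIOR_N = 50  # assumed prior strength, not a recommended default

results = {}
for name, (cases, n) in subgroups.items():
    raw = cases / n
    smooth = smoothed_rate(cases, n, overall, PRIOR_N)
    results[name] = (raw, smooth)
    print(f"{name}: raw={raw:.3f} smoothed={smooth:.3f}")
```

The large subgroup's estimate barely moves, while the small subgroup's 30% is pulled well toward the overall rate. The trade-off is that shrinkage can also understate a real disparity in a small group, which is one reason to pair it with multi-year pooling and community input.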

3. Create Visualizations That Tell the Story

Data alone is not enough; it must be communicated effectively. Use heat maps (geographic disaggregation), grouped bar charts (race by outcome), and stratified tables to highlight disparities. Dashboards that allow users to filter by demographics can empower decision-makers. The CDC’s PLACES project is an excellent model of geographic disaggregation: it provides model-based estimates of diabetes prevalence and other chronic conditions at the county, place, census tract, and ZIP Code Tabulation Area levels.

4. Engage Communities in Interpretation

Data disaggregation should not happen in a vacuum. Involve community members and trusted leaders in interpreting findings and co-designing solutions. Their lived experience provides context that raw numbers cannot convey. For example, a community advisory board might explain that low screening rates in a certain zip code are due to lack of transportation, not lack of awareness.

5. Link Data to Action

Disaggregation is only valuable if it leads to change. Develop a clear process for turning identified disparities into actionable interventions. This may involve allocating resources (funding, staff, equipment) to underserved areas, revising clinical protocols (e.g., offering home-based care for housebound patients), or advocating for policy change (e.g., expanding Medicaid to cover diabetes education programs). Set measurable targets and track progress over time using the same disaggregated data.
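Tracking a target with disaggregated data can be as simple as recomputing the gap between groups each year and comparing it with the goal. The sketch below assumes a hypothetical metric (percent of patients with A1c under control), made-up yearly values for two groups, and an assumed target gap of 5 percentage points.

```python
# Hypothetical percent of patients with A1c under control, by year and group.
control_pct_by_year = {
    2021: {"group_a": 62, "group_b": 52},
    2022: {"group_a": 64, "group_b": 57},
    2023: {"group_a": 66, "group_b": 62},
}

TARGET_GAP_POINTS = 5  # assumed goal: gap of at most 5 percentage points

# Gap between the best- and worst-performing group, in percentage points.
gaps = {
    year: max(rates.values()) - min(rates.values())
    for year, rates in control_pct_by_year.items()
}

# First year in which the gap met the target, or None if it never did.
first_year_met = next(
    (year for year in sorted(gaps) if gaps[year] <= TARGET_GAP_POINTS),
    None,
)

for year in sorted(gaps):
    status = "met" if gaps[year] <= TARGET_GAP_POINTS else "not met"
    print(f"{year}: gap={gaps[year]} points, target {status}")
```

Reporting the gap alongside each group's own rate matters: a gap can also narrow because the better-off group got worse, which is not the intended outcome.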

Overcoming Barriers to Data Disaggregation

Despite its benefits, data disaggregation faces several challenges. Acknowledging and addressing these obstacles is crucial for sustained implementation.

Data Quality and Completeness

Many health systems have incomplete or inconsistent demographic data. Patients may be recorded as “unknown” or “other” for race/ethnicity. Invest in training for registration staff and use electronic health record prompts to collect data using validated, standardized questions. For income or education, which are more sensitive, use surrogate measures such as census-derived neighborhood characteristics (area-based measures) when individual data is unavailable.
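An area-based measure can be attached to patient records with a simple lookup. The sketch below assigns each patient a neighborhood poverty category from their zip code and then summarizes A1c control by category. The zip codes, poverty percentages, patient records, and cut points are hypothetical placeholders; in practice the lookup table would come from a census product such as the American Community Survey.

```python
from collections import defaultdict

# Hypothetical share of residents below the poverty line, by zip code.
ZIP_POVERTY_PCT = {"00001": 8.0, "00002": 24.5, "00003": 35.0}


def poverty_category(pct):
    """Map a neighborhood poverty percentage to an assumed category."""
    if pct is None:
        return "unknown"
    if pct < 10:
        return "<10%"
    if pct < 20:
        return "10-19.9%"
    if pct < 30:
        return "20-29.9%"
    return "30%+"


# Hypothetical patient records.
patients = [
    {"id": 1, "zip": "00001", "a1c_controlled": True},
    {"id": 2, "zip": "00002", "a1c_controlled": False},
    {"id": 3, "zip": "00003", "a1c_controlled": False},
    {"id": 4, "zip": "00003", "a1c_controlled": True},
    {"id": 5, "zip": "99999", "a1c_controlled": True},  # zip not in lookup
]

# category -> [number controlled, number of patients]
tallies = defaultdict(lambda: [0, 0])
for patient in patients:
    category = poverty_category(ZIP_POVERTY_PCT.get(patient["zip"]))
    tallies[category][0] += int(patient["a1c_controlled"])
    tallies[category][1] += 1

control_rate = {
    category: controlled / total
    for category, (controlled, total) in tallies.items()
}
print(control_rate)
```

Keeping an explicit "unknown" category makes missing geography visible instead of silently dropping those patients. Note that an area-based measure describes the neighborhood, not the individual, so it should be read as context and not as a patient's own income.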

Sample Size and Privacy Concerns

When data are broken down into many small subgroups, numbers for some cells may be too small to report without risking identification of individuals. In such cases, aggregate over time (e.g., combine multiple years of data) or use broader categories (e.g., combine several small Asian subgroups). Ensure compliance with HIPAA and other privacy regulations, and apply a small-cell suppression rule: a common convention is to withhold cells with counts below 10, though the exact threshold varies by agency.
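The two tactics described here, suppressing small cells and rolling small subgroups into a broader category, can be sketched as follows. The subgroup counts and the roll-up mapping are hypothetical, and the threshold of 10 follows the convention mentioned above, not any specific regulation.

```python
SUPPRESSION_THRESHOLD = 10  # assumed convention; check your agency's rule


def suppress_small_cells(counts, threshold=SUPPRESSION_THRESHOLD):
    """Replace counts below the threshold with None (not publishable)."""
    return {
        group: (count if count >= threshold else None)
        for group, count in counts.items()
    }


def roll_up(counts, mapping):
    """Combine subgroups into broader categories before suppression.

    `mapping` sends a subgroup name to its broader category; subgroups
    not listed in the mapping are kept under their own name.
    """
    combined = {}
    for group, count in counts.items():
        target = mapping.get(group, group)
        combined[target] = combined.get(target, 0) + count
    return combined


# Hypothetical counts of patients with a diabetes-related ED visit.
counts = {"Chinese": 42, "Filipino": 17, "Hmong": 4, "Laotian": 6}

suppressed = suppress_small_cells(counts)

# Hypothetical roll-up chosen with community input.
rolled = roll_up(
    counts,
    {"Hmong": "Hmong or Laotian", "Laotian": "Hmong or Laotian"},
)
rolled_suppressed = suppress_small_cells(rolled)

print(suppressed)
print(rolled_suppressed)
```

Rolling up trades detail for publishability, so the choice of which subgroups to combine is best made with the affected communities. If row or column totals are also published, suppressing a single cell may not be enough, because the hidden value can sometimes be recovered by subtraction.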

Resistance to Acknowledging Disparities

Some organizations may be reluctant to highlight disparities because they fear reputational damage or legal exposure. However, transparency is a cornerstone of health equity. Framing disparities as a systemic issue (not a failure of individual providers) and reinforcing the mission of equitable care can help overcome resistance. Many successful initiatives have publicly shared their disaggregated data as a commitment to improvement.

Resource Constraints

Meaningful disaggregation requires time, expertise, and technology. For smaller organizations, partnerships with academic institutions or public health agencies can provide analytical support. Additionally, using state-level data (e.g., state health department surveys) can supplement internal data. The investment is justified by the potential to reduce costly, preventable complications and improve population health.

Case Studies: Success Stories in Diabetes Data Disaggregation

Case Study 1: The NYC Health Department A1c Registry

New York City’s Department of Health and Mental Hygiene created an A1c registry for people with diabetes, linking lab results with neighborhood demographics. By disaggregating by borough and neighborhood poverty level, the department identified that the highest A1c averages were concentrated in the South Bronx and central Brooklyn—areas with high poverty and large Black and Hispanic populations. The department then deployed community health workers to those neighborhoods, offering diabetes self-management classes and coordinated care. After three years, the average A1c in these neighborhoods decreased by 0.5 percentage points, surpassing improvements in wealthier areas.

Case Study 2: Native American Diabetes Prevention in Oklahoma

In Oklahoma, the Indian Health Service used data disaggregated by tribe to identify the Cherokee Nation as having a particularly high diabetes rate compared to other tribal groups in the state. Partnering with tribal leaders, they launched a culturally tailored “Diabetes Prevention Program” that incorporated Cherokee language, traditional foods, and community support. A disaggregated evaluation showed that participants lost an average of 5% of body weight, and the program reduced new diabetes cases by 28% over two years. The success hinged on data that pinpointed which tribal community needed intervention most.

Case Study 3: Kaiser Permanente’s Race-Stratified Quality Metrics

Kaiser Permanente, a large integrated healthcare system, began stratifying its quality metrics (e.g., diabetes control, retinal exams, foot exams) by race, ethnicity, and language in the mid-2000s. Initially, the data showed that Black and Hispanic members were significantly less likely to meet diabetes control targets. In response, Kaiser implemented system-wide improvements, including outreach calls in Spanish, culturally adapted nutrition counseling, and ensuring high-performing clinics were evenly distributed across neighborhoods. Over a decade, the racial gap in diabetes control narrowed from 10% to 3%. The key was that regular, publicly reported disaggregated data held leaders accountable.

Policy Implications: How Disaggregated Data Can Shape Legislation

Data disaggregation is not only a tool for healthcare providers; it also informs high-impact policy decisions. For example, the Diabetes Prevention and Control Act at the federal level could be better targeted if it required disaggregated reporting by high-risk subgroups. Similarly, state Medicaid programs can use disaggregated data to create value-based payment models that reward success in reducing disparities. The National Diabetes Statistics Report from the CDC already provides some subgroup data, but more granularity (e.g., by Asian subgroups) would allow states to allocate funding more effectively.

Policy can also address data infrastructure. The 21st Century Cures Act and regional health information exchanges (HIEs) can incentivize the collection of standardized demographic and social determinants of health data, including race, ethnicity, and language. Some states, like California and Washington, have passed laws requiring disaggregation of health data for Asian American and Native Hawaiian/Pacific Islander populations. These legislative actions create a virtuous cycle: better data leads to better interventions, which in turn generate evidence that supports the case for continued investment.

Conclusion: From Data to Equity

Diabetes disparities are a stark example of how systemic inequities manifest in health outcomes. Yet, these disparities are not inevitable. Data disaggregation is a powerful lens that reveals the precise contours of inequity, enabling targeted, effective actions. It moves us beyond the fiction of uniformity and toward personalized, community-responsive care.

But data alone is not sufficient. It must be paired with political will, community engagement, and sustained resources. Healthcare organizations that embrace disaggregation—and commit to acting on what they find—will be better positioned to reduce diabetes complications, improve quality of life, and save lives across demographic groups. The path to health equity begins with seeing clearly. Data disaggregation gives us that vision.

If your organization has not yet integrated disaggregated data into diabetes care, start small. Pick one dimension—race, ethnicity, or zip code—and examine one key metric, such as A1c control or emergency department utilization. You may be surprised by what you discover. And once you see the disparity, you can begin to address it. Every step toward granularity is a step toward justice.