Health systems in many low- and middle-income countries now routinely collect large amounts of data. Especially relevant are new datasets containing information captured during utilization episodes, often available from the settlement of claims or, more generally, from health management information systems. These data are a potentially powerful source of real-time information on the health of the population, but they are often not exploited as much as they could be to inform policymaking. Several characteristics set these data apart from more traditional health information sources. For example, they are generally not population-representative: they capture limited information from only a subset of those who have coverage, or are limited to those who utilize services. They are also often ‘big data’: large, complex, and relatively sparse, making them more amenable to predictive than to causal analytics.
Participants will learn how to apply core concepts of data analytics to healthcare claims databases for health system policy analysis. This session is designed to provide an understanding of the management, analysis, and interpretation of diverse healthcare data, with a specific focus on utilization episodes that are routinely recorded to settle claims. Participants will be exposed to a broad range of topics: fundamental analytic concepts, covered in depth; the different methodologies used to collect data; techniques for analyzing such data appropriately; and guidance on presenting the results.
Topics covered will include assessing length of stay for inpatient episodes, calculating rates of potentially preventable hospitalizations for ambulatory care sensitive conditions, understanding utilization patterns for high-need, high-cost patients, and analyzing rates of C-section deliveries, as well as examples of using healthcare claims data to assess the impact of air pollution and heatwaves on utilization. The target audience is health researchers, policymakers, and healthcare data analysts. An optional second day, targeted at those interested in learning to use statistical software such as Stata, will provide a ‘hands-on’ introduction to coding and analysis using a synthetic claims database; it will be held the following day at the World Bank’s office in Bangkok. Participation in the optional second day will be limited to 15 individuals, and a sign-up sheet will be circulated on the first day.
The overall objective of this session is to introduce participants to examples and techniques for analyzing large healthcare claims databases in ways that can help inform policymaking. By the end of the sessions, participants will have developed skills to analyze claims data, learned to identify key health system performance metrics, and gained insight into using environmental data to assess the impact of exposures such as air pollution and heatwaves on healthcare utilization.