TY - JOUR
T1 - Replicating a COVID-19 study in a national England database to assess the generalisability of research with regional electronic health record data
AU - on behalf of the CVD-COVID-UK/COVID-IMPACT Consortium
AU - Williams, Richard
AU - Jenkins, David
AU - Bolton, Thomas
AU - Heald, Adrian
AU - Mizani, Mehrdad
AU - Sperrin, Matthew
AU - Peek, Niels
AU - Boyle, Jon
AU - Proudfoot, Alastair
AU - Constantine, Andrew
AU - Jones, Dan
AU - Rathod, Krishnaraj
AU - Ahmed, Nida
AU - Fitzgerald, Richard
AU - O’Connell, Dan
AU - Herz, Naomi
AU - Arafin, Rony
AU - Babu-Narayan, Sonya
AU - Karim, Zainab
AU - Shelton, Jon
AU - Slapkova, Martina
AU - Hinchliffe, Rosie
AU - Johnson, Shane
AU - Toms, Renin
AU - Townson, Julia
AU - Birney, Ewan
AU - Gerstung, Moritz
AU - Brown, Katherine
AU - Zuckerman, Benjamin
AU - Wong, Ernest
AU - Braithwaite, Tasanee
AU - Stevenson, Anna
AU - Jackson, Annette
AU - Sudlow, Cathie
AU - Chalmers, Fionna
AU - Lewis, Jadene
AU - Farrell, James
AU - Austin, Jemma
AU - Nolan, John
AU - McAllister, Kate
AU - Murdock, Lars
AU - Morrice, Lynn
AU - Webb, Melissa
AU - Forsyth, Ross
AU - Priedon, Rouven
AU - Khan, Samaira
AU - Petersen, Steffen
AU - Welshman, Zach
N1 - © Author(s) (or their employer(s)) 2025. Re-use permitted under CC BY. Published by BMJ Group.
PY - 2025/4/23
Y1 - 2025/4/23
N2 - Objectives: To assess the degree to which a study can be replicated between a regional and a national database of electronic health record data in the UK. The original study examined the risk factors associated with hospitalisation following COVID-19 infection in people with diabetes. Design: A replication of a retrospective cohort study. Setting: Observational electronic health record data from primary and secondary care sources in the UK. The original study used data from a large, urbanised region (Greater Manchester Care Record, Greater Manchester, UK; 2.8 million patients). This replication study used a national database covering the whole of England, UK (NHS England’s Secure Data Environment service for England, accessed via the BHF Data Science Centre’s CVD-COVID-UK/COVID-IMPACT Consortium; 54 million patients). Participants: Individuals with a diagnosis of type 1 diabetes or type 2 diabetes prior to a positive COVID-19 test result. The matched controls (3:1) were individuals who had a positive COVID-19 test result but did not have a diagnosis of diabetes on the date of that test result. Matching was done on age at COVID-19 diagnosis, sex and approximate date of COVID-19 test. Primary and secondary outcome measures: Hospitalisation within 28 days of a positive COVID-19 test. Results: Many of the effect sizes did not differ to a statistically significant degree between the regional and national studies, but some did. Where effect sizes were statistically significant in the regional study, they remained significant in the national study, and the effect was in the same direction and of similar magnitude. Conclusions: There is some evidence that the findings from studies in smaller regional datasets can be extrapolated to a larger, national setting. However, there were some differences, and therefore replication studies remain an essential part of healthcare research.
KW - Adult
KW - Aged
KW - COVID-19/epidemiology
KW - Databases, Factual
KW - Diabetes Mellitus, Type 1/epidemiology
KW - Diabetes Mellitus, Type 2/epidemiology
KW - Electronic Health Records/statistics & numerical data
KW - England/epidemiology
KW - Female
KW - Hospitalization/statistics & numerical data
KW - Humans
KW - Male
KW - Middle Aged
KW - Retrospective Studies
KW - Risk Factors
KW - SARS-CoV-2
UR - https://www.scopus.com/pages/publications/105003982049
U2 - 10.1136/bmjopen-2024-093080
DO - 10.1136/bmjopen-2024-093080
M3 - Article
C2 - 40268487
AN - SCOPUS:105003982049
SN - 2044-6055
VL - 15
SP - e093080
JO - BMJ Open
JF - BMJ Open
IS - 4
M1 - e093080
ER -