BA22 - Predictive Modeling for CVD in a Multi-ethnic Cohort of Women

This page provides study documentation for BA22. For a description of the specimen results, see Specimen Results Description (open to the public). Datasets of the specimen results are included in the existing WHI datasets located in the WHI Data section of this site (sign-in and a completed Data Distribution Agreement are required; see details on the Data site).

Investigator Names and Contact Information

Nancy Cook, ScD, Brigham and Women’s Hospital and Harvard Medical School


Risk prediction models for cardiovascular disease have been developed primarily among white men and women, with little validation in multi-ethnic populations. The Women’s Health Initiative (WHI) Observational Study (OS) provides an excellent opportunity to examine the fit of current models as well as to determine whether these can be improved, particularly within under-represented subpopulations. The overall aim of the current proposal is to validate existing, and explore new, predictive models for 10-year risk of cardiovascular disease (CVD), including myocardial infarction (MI), stroke, and CVD mortality, in this multi-ethnic population-based cohort of women. An efficient case-cohort sample of 4000 women will be used, including 2000 cases, with over-sampling of under-represented minority racial/ethnic groups, and a subcohort of approximately 2000 women frequency matched to the cases by race/ethnicity and 10-year age groups. A strength of the nested case-cohort design is the ability to use the same subcohort sample as a reference for more than one outcome. As such, we will fit models for total CVD, including coronary heart disease (CHD) (MI and CHD death), stroke, and CVD mortality, as well as models for CHD and stroke separately.
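As an illustration only, the subcohort sampling idea described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the study's actual sampling procedure: the field names (id, race, age_group), the function draw_stratified_subcohort, the per-stratum sample sizes, and the toy cohort are all hypothetical. The sketch draws a random subcohort within race/ethnicity-by-age strata and attaches an inverse sampling fraction weight to each selected woman. Because the draw does not depend on any outcome, the same weighted subcohort could serve as the reference group for more than one endpoint.

```python
import random
from collections import defaultdict


def draw_stratified_subcohort(cohort, n_per_stratum, seed=0):
    """Draw a random subcohort within strata and return sampling weights.

    cohort: list of dicts with hypothetical keys 'id', 'race', 'age_group'.
    n_per_stratum: dict mapping (race, age_group) -> number of women to sample.
    Returns {id: weight}, where weight = stratum size / number sampled.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible

    # Group participant ids by stratum.
    strata = defaultdict(list)
    for person in cohort:
        strata[(person["race"], person["age_group"])].append(person["id"])

    weights = {}
    for key, ids in strata.items():
        n = min(n_per_stratum.get(key, 0), len(ids))
        if n == 0:
            continue  # nothing requested from this stratum
        for pid in rng.sample(ids, n):
            weights[pid] = len(ids) / n  # inverse of the sampling fraction
    return weights


# Toy cohort (fabricated for illustration): 60 women in stratum A, 40 in stratum B.
cohort = [
    {"id": i, "race": "A" if i < 60 else "B", "age_group": "50-59"}
    for i in range(100)
]
weights = draw_stratified_subcohort(
    cohort, {("A", "50-59"): 10, ("B", "50-59"): 20}
)
```

In this toy example the smaller stratum is sampled at a higher fraction, so its members carry smaller weights; the weights within each stratum add back up to the stratum size.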

Both the Framingham Heart Study risk scores and the Reynolds Risk Score from the Women’s Health Study were developed in predominantly white cohorts. How well these apply to minority women remains to be determined. This proposal will validate current models in this sample representative of a general diverse population. Risk scores will be assessed both overall and separately in subgroups defined by race/ethnicity, including Caucasians, Black/African-Americans, Hispanics, and Asian/Pacific Islanders. We will also fit new models in subpopulations separately, particularly Caucasian and African-American women.

Current risk prediction models, including the Framingham risk score, contain blood-based biomarkers but do not include measures of adiposity or physical activity. While these were examined in the development of the Reynolds risk score in the Women’s Health Study, that study included only self-reports of height and weight, and may thus have underestimated the contribution of these factors. In addition, waist and hip circumference, which may be more predictive, were not consistently measured until later in follow-up in the Framingham Heart Study. Data from the WHI will be used to explore simplified models without blood-based biomarkers and to compare them with models that use such biomarkers. We will also explore the contribution of some newly proposed biomarkers for risk prediction beyond those already included in the Framingham and Reynolds models. These include lipoprotein-associated phospholipase A2 (Lp-PLA2), tissue plasminogen activator (tPA) antigen, amino-terminal pro-B-type natriuretic peptide (proBNP), and white blood cell count. The Lp-PLA2 assays will include measures of both mass and activity, and assay kits will be provided by diaDexus at no cost to the project. White blood cell count is already available in the WHI database, and will thus also incur no additional cost.

Specific aims:
1)    To validate the Framingham risk scores for CHD and CVD and the Reynolds risk score for CVD (Models A and B) in a diverse population of American women. The variables to be included are the traditional risk factors of blood pressure, smoking, diabetes, total cholesterol, and HDL cholesterol, as well as newer markers, including high-sensitivity C-reactive protein (hsCRP), apolipoproteins A-I (ApoA1) and B-100 (ApoB), lipoprotein (a) (Lp(a)), and glycated hemoglobin A1C (HbA1c) (among diabetics only).
2)    To develop and compare risk prediction models using lifestyle variables and other noninvasive measures only, using traditional blood-based biomarkers, and using novel biomarkers for CVD. Model building will proceed progressively, and models will be assessed and compared using conventional as well as newly developed measures of model fit. Specifically, the components of this aim are:
2a) To develop a predictive model for CVD using lifestyle variables, including adiposity, physical activity, and family history, which are not currently in the Framingham risk score, along with other noninvasive measures including smoking, alcohol use, diabetes, and blood pressure. Available measures of adiposity include weight, height, BMI, waist and hip circumference, and the waist/hip ratio. Physical activity measures include episodes of activity per week, METS score, and resting heart rate. Family history of MI and stroke, including age at occurrence of MI, has been assessed in both parents and siblings.
2b) To determine whether blood-based biomarkers, including total and high-density lipoprotein (HDL) cholesterol, add predictive ability to models with lifestyle variables only. Other markers included in the Reynolds risk algorithms will also be assessed, including hsCRP, ApoA1, ApoB, Lp(a), and HbA1c (among diabetics only).
2c) To determine the additional clinical utility of promising novel biomarkers for CVD risk prediction, including lipoprotein-associated phospholipase A2 (Lp-PLA2) mass and activity, tissue plasminogen activator (tPA) antigen, amino-terminal pro-B-type natriuretic peptide (proBNP), and white blood cell count.
3)    To validate the established models as well as to develop new models within race/ethnicity-specific subgroups, particularly among Whites and Black/African-Americans.
4)    To explore models, and differences between models, for the separate endpoints of CHD and stroke in the total case-cohort sample.
5)    To refine methods for assessment and comparison of models for risk prediction, including examining the statistical properties of clinical reclassification and its summary measures. This includes simulations to assess the distribution of the Hosmer-Lemeshow statistic and the net reclassification improvement (NRI) in reclassified categories, as well as an examination of the effect of category definition on the derived statistics. The performance of the integrated discrimination improvement (IDI), as well as the change in the c-statistic, will also be examined.
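
For readers unfamiliar with the summary measures named in aim 5, the following Python sketch shows how a categorical NRI, the IDI, and the c-statistic can be computed for a binary outcome. It is a simplified illustration under stated assumptions, not the study's analysis code: the function names, the risk cutoffs of 5%, 10%, and 20%, and the four-person toy data are hypothetical, and the sketch ignores censoring, time-to-event structure, and case-cohort sampling weights, all of which a real analysis of this design would need to handle.

```python
from bisect import bisect_right


def risk_category(p, cutoffs):
    """Index of the risk category that predicted risk p falls into."""
    return bisect_right(cutoffs, p)


def nri(old, new, events, cutoffs):
    """Categorical net reclassification improvement (binary outcome).

    NRI = [P(up | event) - P(down | event)]
        + [P(down | nonevent) - P(up | nonevent)]
    """
    up_e = down_e = up_n = down_n = 0
    n_events = sum(events)
    n_nonevents = len(events) - n_events
    for p_old, p_new, y in zip(old, new, events):
        move = risk_category(p_new, cutoffs) - risk_category(p_old, cutoffs)
        if y:
            up_e += move > 0
            down_e += move < 0
        else:
            up_n += move > 0
            down_n += move < 0
    return (up_e - down_e) / n_events + (down_n - up_n) / n_nonevents


def idi(old, new, events):
    """Integrated discrimination improvement: mean change in predicted
    risk among events minus mean change among nonevents."""
    change = [p_new - p_old for p_old, p_new in zip(old, new)]
    ev = [d for d, y in zip(change, events) if y]
    ne = [d for d, y in zip(change, events) if not y]
    return sum(ev) / len(ev) - sum(ne) / len(ne)


def c_statistic(risk, events):
    """Share of event/nonevent pairs in which the event has the higher
    predicted risk; tied pairs count one half."""
    ev = [p for p, y in zip(risk, events) if y]
    ne = [p for p, y in zip(risk, events) if not y]
    concordant = sum((e > n) + 0.5 * (e == n) for e in ev for n in ne)
    return concordant / (len(ev) * len(ne))


# Toy data (fabricated for illustration): two events, two nonevents.
events = [1, 1, 0, 0]
old = [0.08, 0.15, 0.12, 0.03]   # predicted risks from a baseline model
new = [0.12, 0.15, 0.07, 0.03]   # predicted risks from an expanded model
cutoffs = [0.05, 0.10, 0.20]     # hypothetical category boundaries

nri_value = nri(old, new, events, cutoffs)
idi_value = idi(old, new, events)
delta_c = c_statistic(new, events) - c_statistic(old, events)
```

Changing the entries of cutoffs and recomputing nri_value gives a direct view of how category definition affects the derived statistic, which is one of the questions aim 5 raises.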


See Publications: 1272, 1496, 1745. "WHI publications by study" lists published WHI papers that have been generated by ancillary studies. A complete list of WHI papers is available in the Bibliography section of this website.