Background: Global trends in cardiovascular disease (CVD) exhibit considerable interregional and interethnic differences, which in turn affect long-term CVD risk across diverse populations. An in-depth understanding of the interplay between ethnicity, socioeconomic status, and CVD risk factors and mortality in a contemporary population is crucial to informing health policy and resource allocation aimed at mitigating long-term CVD risk. Generating bespoke, large-scale, reliable data with sufficient numbers of events is expensive and time-consuming, but this obstacle can be circumvented by using and linking data routinely collected in electronic health records (EHRs).

Objective: We aim to characterize the burden of CVD risk factors across different ethnicities, age groups, and socioeconomic groups, and to study CVD incidence and mortality through EHR linkage in London.

Methods: The proposed study will begin as a cross-sectional observational study and extend into prospective CVD ascertainment through longitudinal follow-up using linked data. The government-funded National Health Service (NHS) Health Check program provides an opportunity for the systematic collation of CVD risk factors on a large scale. NHS Health Check data on approximately 200,000 individuals will be extracted from consenting general practices across London that use the Egton Medical Information Systems (EMIS) EHR software. Data will be analyzed using appropriate statistical techniques to (1) determine the cross-sectional burden of CVD risk factors and their prospective association with CVD outcomes, (2) validate existing prediction tools in diverse populations, and (3) develop bespoke risk prediction tools across diverse ethnic groups.

Results: Enrollment began in January 2019 and is ongoing, with initial results to be published in mid-2021.

Conclusions: There is an urgent need for more real-life population health studies based on analyses of routine health data available in EHRs.
Findings from our study will help quantify, on a large scale, the contemporary burden of CVD risk factors by geography and ethnicity in a large multiethnic urban population. Such a detailed understanding of the burden of CVD risk (especially its interethnic and sociodemographic variations) and its determinants, including heredity, environment, diet, lifestyle, and socioeconomic factors, in a large population sample will enable the development of tailored and dynamic (continuously learning from new data) risk prediction tools for diverse ethnic groups, and thereby enable the personalized provision of prevention strategies and care. We anticipate that this systematic approach of linking routinely collected EHR data to study CVD can be applied in other settings as EHRs are implemented worldwide.