Model Card: Health Check Risk Model

Version: v1.4.2
Release date: 23 Apr 2026

1 Model overview

The Health Check Risk Model is a software-based clinical decision support tool designed to estimate personal health risks. It analyzes a combination of lifestyle, demographic, and medical history data to provide evidence-based insights.

Model name: Health Check Risk Model
Model type: Consists of multiple logistic regression models
Developer: XUND
Intended users: Laypersons (general public)
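
As a rough illustration of what "multiple logistic regression models" can mean in practice, the sketch below evaluates one logistic model per health topic on the same set of user features. The coefficient values, feature names, and the function names `predict_probability` and `assess_all` are hypothetical and chosen only for this example; they are not the model's actual parameters or interface.

```python
import math

# Hypothetical coefficients for illustration only; not the model's real parameters.
EXAMPLE_MODELS = {
    "cardiovascular": {"intercept": -5.0, "age": 0.05, "smoker": 0.7, "bmi": 0.03},
    "musculoskeletal": {"intercept": -4.0, "age": 0.03, "smoker": 0.1, "bmi": 0.05},
}


def predict_probability(coefficients, features):
    """Logistic regression: sigmoid of the intercept plus the weighted feature sum."""
    z = coefficients["intercept"]
    for name, value in features.items():
        # Features without a coefficient in this model contribute nothing.
        z += coefficients.get(name, 0.0) * value
    return 1.0 / (1.0 + math.exp(-z))


def assess_all(models, features):
    """Run every per-topic model on the same user features."""
    return {topic: predict_probability(c, features) for topic, c in models.items()}


user = {"age": 55, "smoker": 1, "bmi": 27.0}
risks = assess_all(EXAMPLE_MODELS, user)
for topic, probability in risks.items():
    print(f"{topic}: {probability:.3f}")
```

Keeping one small model per topic means each topic can be validated and updated separately, which matches the per-topic reporting used later in this card.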

2 Intended use

2.1 Purpose and benefits

The model is intended to assess a user’s personal risk related to one or multiple disease groups. By analyzing demographic information, lifestyle factors, medical history, and family history, it performs an evidence-based assessment and provides recommendations on how to address identified risks. This empowers users to identify and initiate relevant preventive actions.

2.2 Out of scope cases

Not a diagnosis: The model does not provide a medical diagnosis; it only assesses the probability of a future event.
Emergency use: Not intended for assessing acute or life-threatening symptoms.
Age restriction: Designed and validated for adults (18+) only; not intended for use by minors.

3 Model inputs and outputs

3.1 Model inputs

The model processes the user’s responses to a structured health questionnaire, focusing on five key areas:

Demographics: Basic information such as age, biological sex, and ethnicity.
Lifestyle factors: Habits including smoking status, alcohol consumption, physical activity levels, and dietary patterns.
Medical history: Information regarding pre-existing conditions, past diagnoses, and current medications.
Family history: Known medical conditions among immediate family members to identify potential genetic predispositions.
Physical measurements: Data such as Body Mass Index (BMI) or other relevant anthropometric markers.
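
The input areas listed above could be represented as a single structured record. The following is a minimal sketch, assuming hypothetical field names and units; the actual questionnaire schema is not specified in this card. The age check mirrors the adult-only (18+) scope stated in section 2.2.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QuestionnaireResponse:
    """Hypothetical container for the questionnaire input areas."""
    # Demographics
    age: int
    biological_sex: str
    ethnicity: Optional[str] = None
    # Lifestyle factors
    smoker: bool = False
    alcohol_units_per_week: float = 0.0
    active_minutes_per_week: float = 0.0
    dietary_pattern: Optional[str] = None
    # Medical history
    conditions: List[str] = field(default_factory=list)
    medications: List[str] = field(default_factory=list)
    # Family history
    family_conditions: List[str] = field(default_factory=list)
    # Physical measurements
    height_cm: Optional[float] = None
    weight_kg: Optional[float] = None

    def __post_init__(self):
        # The model is designed for adults only.
        if self.age < 18:
            raise ValueError("The model is intended for adults (18+).")

    @property
    def bmi(self) -> Optional[float]:
        """Body Mass Index in kg/m^2, or None if measurements are missing."""
        if not self.height_cm or not self.weight_kg:
            return None
        height_m = self.height_cm / 100
        return self.weight_kg / (height_m * height_m)


response = QuestionnaireResponse(age=45, biological_sex="female",
                                 height_cm=170, weight_kg=72.25)
print(round(response.bmi, 1))
```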

3.2 Model outputs

Probability score: A raw statistical value representing the likelihood of a specific condition occurring within the defined time horizon.
Risk stratification: A categorized risk level (e.g., Low, Elevated, High) based on how the user’s score compares to a validated reference population.
Evidence-based guidance: Clinical insights and recommendations for preventive actions based on the identified risk profile.
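
One plausible way to turn a probability score into a risk level is to rank it against the scores of a reference population. The sketch below assumes percentile cutoffs of 75% and 90%; these cutoffs, like the function names, are illustrative assumptions and not the thresholds used by the model.

```python
from bisect import bisect_right


def percentile_rank(score, reference_scores):
    """Fraction of the reference population with a score at or below `score`."""
    ordered = sorted(reference_scores)
    return bisect_right(ordered, score) / len(ordered)


def stratify(score, reference_scores, elevated_cutoff=0.75, high_cutoff=0.90):
    """Map a probability score to Low / Elevated / High (cutoffs are assumed)."""
    rank = percentile_rank(score, reference_scores)
    if rank >= high_cutoff:
        return "High"
    if rank >= elevated_cutoff:
        return "Elevated"
    return "Low"


# Synthetic reference population with scores 0.01, 0.02, ..., 1.00.
reference = [i / 100 for i in range(1, 101)]
for score in (0.505, 0.805, 0.955):
    print(score, stratify(score, reference))
```

Ranking against a reference population makes the category relative: the same probability can map to different levels for different topics, depending on how common the condition is.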

4 Training data & diversity

The model is trained on high-quality, longitudinal health data to ensure clinical validity.

Data sources: Primary evidence is derived from a comprehensive biomedical biobank supplemented by international medical ontologies. The dataset contains the following types of data:
- Biomarker data
- Healthcare records
- Questionnaire data
- Physical measurements
- Demographic and lifestyle data
- Environmental data
Data characteristics: The model is trained on subsets of a comprehensive biomedical biobank, drawing on a pool of over 500,000 anonymized participants, which provides a robust statistical foundation for multivariate risk modeling.
Bias mitigation: We employ biased cohort sampling to promote diversity across age, sex, and pre-existing conditions during the training phase.
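
The card does not describe how the biased cohort sampling is carried out. One common reading is that underrepresented strata are deliberately oversampled relative to their share of the source data. The sketch below illustrates that reading by drawing an equal number of participants from each stratum; the function name, the stratum definition, and the per-stratum quota are assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_balanced_cohort(participants, stratum_key, per_stratum, seed=0):
    """Draw up to `per_stratum` participants from each stratum.

    Small strata end up overrepresented relative to the source data,
    which is the intentional "bias" in this style of sampling.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for participant in participants:
        strata[stratum_key(participant)].append(participant)

    cohort = []
    for key in sorted(strata):
        members = strata[key]
        take = min(per_stratum, len(members))
        cohort.extend(rng.sample(members, take))
    return cohort


# Synthetic, imbalanced source data: 90 participants in one age band, 10 in another.
source = [{"id": i, "age_band": "40-49"} for i in range(90)]
source += [{"id": 90 + i, "age_band": "60-69"} for i in range(10)]

cohort = sample_balanced_cohort(source, lambda p: p["age_band"], per_stratum=10)
print(len(cohort))
```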

5 Evaluation & performance

5.1 Performance metrics

We use the Concordance Index (C-index) to measure the model’s ability to differentiate between high-risk and low-risk individuals.

Health topic	C-index (v1.4.2)	Clinical interpretation
Cardiovascular health	0.721	Highly reliable
Mental health	0.721	Highly reliable
Tumors	0.642	Reliable
Musculoskeletal	0.667	Reliable
Dermatology	0.602	Emerging / Informative
Women's health	0.550	Emerging / Informative
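
For readers unfamiliar with the C-index, the sketch below computes it in Harrell's pairwise formulation: among all comparable pairs, it measures the fraction in which the individual who experienced the event earlier also received the higher risk score. A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering; for purely binary outcomes the measure reduces to the area under the ROC curve. The data in the example are synthetic, and this is an illustrative implementation, not the evaluation code behind the table above.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (simple O(n^2) version).

    times:       follow-up time per individual
    events:      1 if the event was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when i had an observed event before j's time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("No comparable pairs in the data.")
    return concordant / comparable


times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
print(concordance_index(times, events, [0.9, 0.7, 0.5, 0.1]))  # perfectly ordered scores
print(concordance_index(times, events, [0.9, 0.1, 0.5, 0.7]))  # partially ordered scores
```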

5.2 Benchmarking

In all validation cycles, the model was required to outperform random chance (a C-index of 0.5) and to match or exceed the accuracy of medical experts.

6 Limitations & safety considerations

While the model performs well on several health topics, users should be aware of the following limitations inherited from the primary training dataset:

Limited ethnic diversity: The primary training data (UK Biobank) consists predominantly of individuals of European descent. Consequently, the model may have lower predictive accuracy for underrepresented ethnic groups. We are actively working on incorporating more diverse global datasets to mitigate this.
Age range focus: Training data primarily covers the 40–69 age range. While the logic remains sound, the model's absolute risk scores are most representative for this demographic.
Healthy volunteer bias: Participants in the UK Biobank tend to be healthier than the general population, which may influence baseline risk estimates.
Geographic context: The model is calibrated against a UK-based cohort; relative risk relationships are expected to generalize well, but absolute risk may vary in different healthcare systems.

7 Ethical & responsible use

XUND is committed to developing AI that is transparent, safe, and beneficial. The Health Check Risk Model is governed by the following ethical principles:

Human-in-the-loop: The model's feature selection and risk logic are guided by medical experts. AI is used to enhance, not replace, human medical expertise.
Bias monitoring: We recognize the limitations in ethnic and demographic diversity within medical datasets. We perform regular subgroup evaluations to monitor for performance gaps and are committed to transparently communicating these limitations to our users.
Data minimization: The model is designed to provide risk assessments using the minimum necessary data points, ensuring user privacy while maintaining clinical utility.
Proactive safety: The implementation of "Critical Marker" logic (v1.3.2) ensures that users showing high-risk signals are explicitly guided toward professional medical consultation.
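
The details of the "Critical Marker" logic are not given in this card. As a hedged sketch of the general idea, the example below treats critical markers as a rule layer applied on top of the statistical risk level: if any marker is present, the user is directed to professional consultation regardless of the computed level. The marker names are placeholders, and `build_guidance` is a hypothetical function, not part of the product.

```python
# Placeholder marker names; the real critical markers are not listed in this card.
CRITICAL_MARKERS = {"example_marker_a", "example_marker_b"}


def build_guidance(risk_level, reported_markers):
    """Combine the statistical risk level with a rule-based safety override."""
    triggered = sorted(set(reported_markers) & CRITICAL_MARKERS)
    return {
        "risk_level": risk_level,
        # Any critical marker forces a consultation recommendation,
        # even when the statistical risk level is low.
        "consult_professional": bool(triggered),
        "triggered_markers": triggered,
    }


print(build_guidance("Low", ["example_marker_a", "unrelated_answer"]))
print(build_guidance("Low", []))
```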