Model Card: Health Check Risk Model
Version: v1.4.2
Release date: 23 Apr 2026
1 Model overview
The Health Check Risk Model is a software-based clinical decision support tool designed to estimate personal health risks. It analyzes a combination of lifestyle, demographic, and medical history data to provide evidence-based insights.
- Model name: Health Check Risk Model
- Model type: Consists of multiple logistic regression models
- Developer: XUND
- Intended users: Laypersons (general public)
2 Intended use
2.1 Purpose and benefits
The model is intended to assess a user’s personal risk for one or more disease groups. By analyzing demographic information, lifestyle factors, medical history, and family history, it performs an evidence-based assessment and provides recommendations on how to address identified risks. This helps users identify and initiate relevant preventive actions.
2.2 Out of scope cases
- Not a diagnosis: The model does not provide a medical diagnosis; it only assesses the probability of a future event.
- Emergency use: Not intended for assessing acute or life-threatening symptoms.
- Age restriction: Designed and validated for adults (18+) only; not intended for users under 18.
3 Model inputs and outputs
3.1 Model inputs
The model processes the user’s responses to a structured health questionnaire, focusing on five key areas:
- Demographics: Basic information such as age, biological sex, and ethnicity.
- Lifestyle factors: Habits including smoking status, alcohol consumption, physical activity levels, and dietary patterns.
- Medical history: Information regarding pre-existing conditions, past diagnoses, and current medications.
- Family history: Known medical conditions among immediate family members to identify potential genetic predispositions.
- Physical measurements: Data such as Body Mass Index (BMI) or other relevant anthropometric markers.
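As an illustration of how the five input areas might be represented, here is a minimal sketch. The class name, field names, and units are hypothetical and are not taken from the actual questionnaire; only the adult age limit (18+) comes from this model card.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MIN_AGE = 18  # the model card states the model is designed for adults (18+)


@dataclass
class QuestionnaireResponse:
    """Hypothetical container for the five input areas (names are illustrative)."""
    # Demographics
    age: int
    biological_sex: str
    ethnicity: Optional[str] = None
    # Lifestyle factors
    smoker: bool = False
    alcohol_units_per_week: float = 0.0
    active_minutes_per_week: float = 0.0
    # Medical history and family history
    pre_existing_conditions: List[str] = field(default_factory=list)
    family_conditions: List[str] = field(default_factory=list)
    # Physical measurements
    height_m: Optional[float] = None
    weight_kg: Optional[float] = None

    def bmi(self) -> Optional[float]:
        """Body Mass Index in kg/m^2, or None if a measurement is missing."""
        if self.height_m is None or self.weight_kg is None:
            return None
        return self.weight_kg / self.height_m ** 2


def validate(response: QuestionnaireResponse) -> List[str]:
    """Return a list of problems; an empty list means the input is in scope."""
    problems = []
    if response.age < MIN_AGE:
        problems.append("model is intended for adults (18+) only")
    if response.height_m is not None and response.height_m <= 0:
        problems.append("height must be positive")
    if response.weight_kg is not None and response.weight_kg <= 0:
        problems.append("weight must be positive")
    return problems
```

Rejecting out-of-scope input before scoring keeps the age restriction enforceable in code rather than only in documentation.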
3.2 Model outputs
- Probability score: A raw statistical value representing the likelihood of a specific condition occurring within the defined time horizon.
- Risk stratification: A categorized risk level (e.g., Low, Elevated, High) based on how the user's score compares to a validated reference population.
- Evidence-based guidance: Clinical insights and recommendations for preventive actions based on the identified risk profile.
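The first two outputs can be sketched as follows: a logistic regression turns weighted inputs into a probability, and the probability is then placed within a reference population to obtain a category. This is a minimal sketch only. The function names, the percentile-based comparison, and the cutoffs (0.75 and 0.90) are assumptions for illustration; the model card does not state how the real thresholds are defined.

```python
import math
from bisect import bisect_right
from typing import Dict, Sequence


def logistic_probability(features: Dict[str, float],
                         weights: Dict[str, float],
                         intercept: float) -> float:
    """Standard logistic regression: sigmoid of a weighted sum of features."""
    z = intercept + sum(weights[name] * value for name, value in features.items())
    # Numerically stable sigmoid for large positive or negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def percentile_in_reference(score: float, reference_scores: Sequence[float]) -> float:
    """Fraction of the reference population with a score at or below `score`."""
    ordered = sorted(reference_scores)
    return bisect_right(ordered, score) / len(ordered)


def stratify(percentile: float,
             elevated_cutoff: float = 0.75,   # assumed cutoff, not from the card
             high_cutoff: float = 0.90) -> str:  # assumed cutoff, not from the card
    """Map a reference-population percentile to a risk category."""
    if percentile >= high_cutoff:
        return "High"
    if percentile >= elevated_cutoff:
        return "Elevated"
    return "Low"
```

For example, `stratify(percentile_in_reference(p, reference))` would return one of the three category labels for a probability `p`, given a list of reference scores.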
4 Training data & diversity
The model is trained on high-quality, longitudinal health data to ensure clinical validity.
- Data sources: Primary evidence is derived from a comprehensive biomedical biobank supplemented by international medical ontologies. The dataset contains the following types of data:
  - Biomarker data
  - Healthcare records
  - Questionnaire data
  - Physical measurements
  - Demographic and lifestyle data
  - Environmental data
- Data characteristics: The model is trained on subsets of a comprehensive biomedical biobank, drawing on a pool of over 500,000 anonymized participants, which provides a robust statistical foundation for multivariate risk modeling.
- Bias mitigation: We employ biased cohort sampling to promote diversity across age, sex, and pre-existing conditions during the training phase.
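One possible reading of the sampling step above is that the training cohort is drawn unevenly on purpose, so that smaller strata are not swamped by larger ones. The sketch below shows that idea as equal-sized draws per stratum. The function name, the choice of equal-sized draws, and the stratum definition are assumptions; the actual sampling scheme is not described in this model card.

```python
import random
from collections import defaultdict
from typing import Callable, Hashable, List, Sequence, TypeVar

T = TypeVar("T")


def sample_per_stratum(records: Sequence[T],
                       stratum_of: Callable[[T], Hashable],
                       per_stratum: int,
                       seed: int = 0) -> List[T]:
    """Draw up to `per_stratum` records from every stratum (hypothetical scheme).

    Strata with fewer members than `per_stratum` contribute all their members.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    groups = defaultdict(list)
    for record in records:
        groups[stratum_of(record)].append(record)

    sample: List[T] = []
    for key in sorted(groups, key=repr):  # deterministic stratum order
        members = groups[key]
        k = min(per_stratum, len(members))
        sample.extend(rng.sample(members, k))
    return sample
```

A stratum could be, for instance, a tuple of age band, sex, and a pre-existing-condition flag, matching the three dimensions named in the bias mitigation bullet.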
5 Evaluation & performance
5.1 Performance metrics
We use the Concordance Index (C-index) to measure the model’s ability to rank higher-risk individuals above lower-risk ones. A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking.
| Health topic | C-index (v1.4.2) | Clinical interpretation |
| Cardiovascular health | 0.721 | Highly reliable |
| Mental health | 0.721 | Highly reliable |
| Tumors | 0.642 | Reliable |
| Musculoskeletal | 0.667 | Reliable |
| Dermatology | 0.602 | Emerging / Informative |
| Women's health | 0.550 | Emerging / Informative |
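To make the metric concrete, here is a minimal sketch of the C-index for binary outcomes, where it is the fraction of (case, non-case) pairs in which the case received the higher score, with ties counted as half. This assumes a simple binary outcome per person. If the evaluation uses time-to-event data with censoring, a censoring-aware variant (such as Harrell's C) would be needed instead; the model card does not say which is used.

```python
from typing import Sequence


def concordance_index(risk_scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """C-index for binary outcomes (1 = event occurred, 0 = no event)."""
    if len(risk_scores) != len(outcomes):
        raise ValueError("risk_scores and outcomes must have the same length")

    cases = [s for s, y in zip(risk_scores, outcomes) if y == 1]
    controls = [s for s, y in zip(risk_scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0   # correctly ranked pair
            elif case_score == control_score:
                concordant += 0.5   # tie counts as half
    return concordant / (len(cases) * len(controls))
```

The pairwise loop is quadratic in the number of participants, which is fine for illustration but would be replaced by a rank-based computation on large cohorts.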
5.2 Benchmarking
In all validation cycles, the model's performance was required to outperform random chance and meet or exceed the accuracy of medical experts.
6 Limitations & safety considerations
While the model is highly performant, users should be aware of the following limitations inherited from the primary training datasets:
- Limited ethnic diversity: The primary training data (UK Biobank) consists predominantly of individuals of European descent. Consequently, the model may have lower predictive accuracy for underrepresented ethnic groups. We are actively working on incorporating more diverse global datasets to mitigate this.
- Age range focus: Training data primarily covers the 40–69 age range. The model can be applied to other adult ages, but its absolute risk scores are most representative for this age group.
- Healthy volunteer bias: Participants in the UK Biobank tend to be healthier than the general population, which may influence baseline risk estimates.
- Geographic context: The model is calibrated against a UK-based cohort; relative risk relationships are expected to generalize well, but absolute risk may vary in different healthcare systems.
7 Ethical & responsible use
XUND is committed to developing AI that is transparent, safe, and beneficial. The Health Check Risk Model is governed by the following ethical principles:
- Human-in-the-loop: The model's feature selection and risk logic are guided by medical experts. AI is used to enhance, not replace, human medical expertise.
- Bias monitoring: We recognize the limitations in ethnic and demographic diversity within medical datasets. We perform regular subgroup evaluations to monitor for performance gaps and are committed to transparently communicating these limitations to our users.
- Data minimization: The model is designed to provide risk assessments using the minimum necessary data points, ensuring user privacy while maintaining clinical utility.
- Proactive safety: The implementation of "Critical Marker" logic (v1.3.2) ensures that users showing high-risk signals are explicitly guided toward professional medical consultation.
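The subgroup evaluations mentioned under bias monitoring could look roughly like the sketch below: compute the C-index overall and per subgroup, then flag subgroups that fall noticeably behind. The function names, the row layout, the binary-outcome assumption, and the 0.05 gap threshold are all illustrative assumptions, not XUND's actual monitoring procedure.

```python
from collections import defaultdict
from itertools import product
from typing import Dict, Iterable, Optional, Sequence, Tuple


def pairwise_concordance(scores: Sequence[float],
                         outcomes: Sequence[int]) -> Optional[float]:
    """Binary-outcome C-index; None if the group lacks cases or non-cases."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        return None
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c, k in product(cases, controls))
    return wins / (len(cases) * len(controls))


def subgroup_report(rows: Iterable[Tuple[str, float, int]],
                    max_gap: float = 0.05) -> Dict[str, dict]:
    """rows are (subgroup, risk_score, outcome); max_gap is an assumed threshold."""
    rows = list(rows)
    overall = pairwise_concordance([s for _, s, _ in rows], [y for _, _, y in rows])

    by_group = defaultdict(lambda: ([], []))
    for group, score, outcome in rows:
        by_group[group][0].append(score)
        by_group[group][1].append(outcome)

    report = {}
    for group, (scores, outcomes) in by_group.items():
        value = pairwise_concordance(scores, outcomes)
        # Flag a subgroup whose C-index trails the overall value by more than max_gap.
        flagged = (value is not None and overall is not None
                   and overall - value > max_gap)
        report[group] = {"c_index": value, "flagged": flagged}
    return report
```

Subgroups too small to contain both cases and non-cases get a `c_index` of `None` rather than a misleading number, which is itself a signal that more data is needed for that group.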