Model Card Template

Example: Clinical Note Summarization v2.1

Model Details

| Field | Value |
|-------|-------|
| Model Name | clinical-summary-v2.1 |
| Base Model | Claude 3.5 Sonnet (claude-3-5-sonnet-20241022) |
| Fine-tuned | No (prompt-engineered) |
| Version | 2.1.0 |
| Release Date | 2025-10-15 |
| Owner | AI Platform Team |
| Contact | ai-platform@company.com |

Intended Use

Primary use case: Summarizing clinical notes for a physician review dashboard.

Intended users: Clinical staff reviewing patient histories.

Out of scope:

- Direct patient communication
- Clinical decision making without physician review
- Pediatric notes (not validated)
- Non-English notes

Training/Prompt Data

Prompt developed using 500 de-identified clinical notes
Validated on 200 held-out notes with physician review
No fine-tuning performed; uses prompt engineering only

Performance Metrics

| Metric | Value | Measurement Date |
|--------|-------|------------------|
| Physician approval rate | 94.2% | 2025-10-01 |
| Factual accuracy (spot check) | 98.1% | 2025-10-01 |
| Hallucination rate | 1.2% | 2025-10-01 |
| Mean processing time | 2.3s | 2025-10-01 |
| P95 latency | 4.1s | 2025-10-01 |
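The two latency rows can be reproduced from raw timing samples. The sketch below is one way to do it, assuming latencies are collected in seconds and that P95 uses the nearest-rank definition; the card does not say which percentile method was used, and the function name `latency_summary` is hypothetical.

```python
import statistics


def latency_summary(samples_s):
    """Return the mean and nearest-rank P95 of latency samples (seconds).

    Assumption: nearest-rank percentile; other definitions interpolate
    between samples and can give slightly different values.
    """
    if not samples_s:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_s)
    n = len(ordered)
    # Nearest-rank: smallest 1-indexed rank r with r >= 0.95 * n,
    # computed in integers to avoid floating-point rounding.
    rank = (95 * n + 99) // 100
    return {"mean_s": statistics.fmean(ordered), "p95_s": ordered[rank - 1]}
```

Reporting the sample count and measurement window next to these values makes the table easier to audit later.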

Limitations

May miss nuanced clinical context requiring specialist knowledge
Abbreviation expansion occasionally incorrect for rare terms
Long notes (>10 pages) may have reduced summary quality
Not validated for: surgical notes, radiology reports, pathology

Ethical Considerations

All outputs must be reviewed by a licensed clinician
PHI handling compliant with HIPAA
Bias testing performed across demographic groups (see appendix)
No disparate impact detected in summary quality by patient demographics

Monitoring

Daily evaluation on 100 production samples
Weekly physician review of flagged cases
Monthly bias audit
Alerts: >5% drop in approval rate, >3% hallucination rate
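The alert thresholds above could be checked against each daily sample roughly as follows. This is a minimal sketch, not the team's actual monitoring code. It assumes the ">5% drop" is an absolute drop (percentage points) from the 94.2% approval rate in the metrics table, which the card does not specify. The names `AlertThresholds` and `check_alerts` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertThresholds:
    baseline_approval: float = 0.942  # approval rate from the metrics table
    max_approval_drop: float = 0.05   # assumed absolute drop, not relative
    max_hallucination: float = 0.03   # alert above a 3% hallucination rate


def check_alerts(approved, hallucinated, total, thresholds=AlertThresholds()):
    """Return the names of alerts triggered by one evaluation batch."""
    if total <= 0:
        raise ValueError("total must be positive")
    approval_rate = approved / total
    hallucination_rate = hallucinated / total
    alerts = []
    if thresholds.baseline_approval - approval_rate > thresholds.max_approval_drop:
        alerts.append("approval_drop")
    if hallucination_rate > thresholds.max_hallucination:
        alerts.append("hallucination_rate")
    return alerts
```

With 100 samples per day, a single extra hallucinated summary moves the rate by a full percentage point, so a rolling window over several days may give steadier alerts than a single batch.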

Version History

| Version | Date | Changes |
|---------|------|---------|
| 2.1.0 | 2025-10-15 | Improved medication list extraction |
| 2.0.0 | 2025-08-01 | Switched to Claude 3.5, new prompt |
| 1.2.0 | 2025-05-01 | Added allergy highlighting |

Model Card Template (Blank)

# Model Card: [Model Name] {.unnumbered}

## Model Details
| Field | Value |
|-------|-------|
| Model Name | |
| Base Model | |
| Version | |
| Release Date | |
| Owner | |
| Contact | |

## Intended Use
**Primary use case:** 
**Intended users:** 
**Out of scope:** 

## Training/Prompt Data
[Description of data used for training or prompt development]

## Performance Metrics
| Metric | Value | Measurement Date |
|--------|-------|------------------|
| | | |

## Limitations
[Known limitations and edge cases]

## Ethical Considerations
[Bias testing, fairness considerations, required human oversight]

## Monitoring
[Ongoing evaluation cadence, alert thresholds]

## Version History
| Version | Date | Changes |
|---------|------|---------|
| | | |

Model Card Best Practices

What to include:

- Specific, measurable metrics with dates
- Explicit scope boundaries (what it’s NOT for)
- Known failure modes
- Required human oversight
- Monitoring and alerting setup

What to avoid:

- Vague claims (“high accuracy”)
- Missing limitations
- No versioning
- Static documents that aren’t updated
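Some of these checks can be automated. The sketch below assumes a card written from the blank template, with `##` section headings. It reports missing sections and flags vague phrases. The function name `lint_model_card` is hypothetical, and apart from "high accuracy" the phrase list is an illustrative assumption, not part of the template.

```python
import re

# Section headings taken from the blank template.
REQUIRED_SECTIONS = (
    "Model Details",
    "Intended Use",
    "Training/Prompt Data",
    "Performance Metrics",
    "Limitations",
    "Ethical Considerations",
    "Monitoring",
    "Version History",
)

# Only "high accuracy" comes from the guidance above; the rest are examples.
VAGUE_PHRASES = ("high accuracy", "state of the art", "very reliable")


def lint_model_card(markdown_text):
    """Return a list of problems found in a Markdown model card."""
    # Collect level-2 headings ("## Title"); deeper headings are ignored.
    headings = {
        match.group(1).strip()
        for match in re.finditer(r"^##[ \t]+(.+?)[ \t]*$", markdown_text, re.MULTILINE)
    }
    problems = [f"missing section: {name}" for name in REQUIRED_SECTIONS if name not in headings]
    lowered = markdown_text.lower()
    problems += [f"vague claim: {phrase!r}" for phrase in VAGUE_PHRASES if phrase in lowered]
    return problems
```

Running a check like this in CI when the card changes helps catch a dropped section, though it cannot tell whether the limitations listed are accurate or current.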
