EXP-001 / question
Technical question
Can an underwriting model know when to defer uncertain decisions to human review?
EXP-001 / method
Method and workflow
- Validate the input dataset before schema inference and model training.
- Train a scikit-learn model that outputs underwriting approval probabilities.
- Evaluate probability quality with corrected Expected Calibration Error (ECE), Brier score, ROC-AUC, PR-AUC, and baseline metrics.
- Generate abstention policies that trade off coverage, auto-decision quality, and human review volume.
- Report slice-level review rates, error rates, false approval/rejection rates, and calibration diagnostics.
- Expose results through pipeline artifacts and a Streamlit review dashboard.
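As a rough illustration of the probability-quality step in the workflow above, the Brier score can be computed by hand and compared against a naive baseline. This is a minimal standard-library sketch, not the project's code: the function names `brier_score` and `base_rate_brier`, the base-rate baseline, and the toy labels and probabilities are all assumptions made for illustration.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def base_rate_brier(y_true):
    """Brier score of a hypothetical baseline that always predicts the observed approval rate."""
    rate = sum(y_true) / len(y_true)
    return brier_score(y_true, [rate] * len(y_true))


# Hypothetical labels (1 = approved) and model probabilities, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.8, 0.3]

model_brier = brier_score(y_true, y_prob)
baseline_brier = base_rate_brier(y_true)
print(f"model={model_brier:.3f} baseline={baseline_brier:.3f}")
```

A lower Brier score is better, so a model that does not beat the base-rate baseline is adding little probabilistic information. In a scikit-learn pipeline, `sklearn.metrics.brier_score_loss` would typically replace the hand-written function.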
EXP-001 / evidence
Evidence of work
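Expected Calibration Error (ECE) groups predictions into probability bins and compares each bin's mean predicted approval probability against its observed approval rate; reliability diagrams show the same comparison visually for inspection.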
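The calibration comparison described above can be sketched as a plain equal-width-bin ECE. This is an assumption-laden illustration: the project's "corrected" variant is not specified here, so the function below, including its name `expected_calibration_error` and the `n_bins` parameter, shows only the standard binned estimate.

```python
def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Weighted mean gap between predicted and observed approval rate per bin.

    Assumes equal-width bins on [0, 1]; this is the standard estimate,
    not necessarily the corrected variant the project reports.
    """
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        # A probability of exactly 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((y, p))
    total = len(y_true)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        observed = sum(y for y, _ in members) / len(members)
        predicted = sum(p for _, p in members) / len(members)
        ece += (len(members) / total) * abs(observed - predicted)
    return ece


# Hypothetical, deliberately overconfident predictions.
overconfident = expected_calibration_error([1, 0, 1, 0], [0.9, 0.9, 0.1, 0.1], n_bins=2)
```

Each bin's (predicted, observed) pair is also one point on a reliability diagram, so the same loop could feed a plot.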
Coverage curves and policy variants show when the system can automate and when it should defer to review.
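One way to picture a coverage curve is to sweep the width of a review band around 0.5 and record how much is auto-decided and how accurate those auto-decisions are. The sketch below assumes a symmetric two-threshold policy; the names `evaluate_policy` and `coverage_curve`, the thresholds, and the toy data are hypothetical and may differ from the policy variants the project actually generates.

```python
def evaluate_policy(y_true, y_prob, lower, upper):
    """Auto-reject below `lower`, auto-approve at or above `upper`, defer otherwise."""
    auto = correct = 0
    for y, p in zip(y_true, y_prob):
        if p >= upper:
            decision = 1
        elif p < lower:
            decision = 0
        else:
            continue  # deferred to human review
        auto += 1
        correct += int(decision == y)
    total = len(y_true)
    return {
        "coverage": auto / total,
        "review_rate": (total - auto) / total,
        "auto_accuracy": correct / auto if auto else None,
    }


def coverage_curve(y_true, y_prob, half_widths):
    """Evaluate symmetric review bands [0.5 - w, 0.5 + w) for each half-width w."""
    return [(w, evaluate_policy(y_true, y_prob, 0.5 - w, 0.5 + w)) for w in half_widths]


# Hypothetical labels (1 = approved) and probabilities, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.95, 0.05, 0.70, 0.45, 0.55, 0.30, 0.85, 0.15]

curve = coverage_curve(y_true, y_prob, [0.0, 0.1, 0.3])
for width, stats in curve:
    print(width, stats)
```

On this toy data, widening the band from 0.0 to 0.1 removes the two borderline cases from automation, which lowers coverage and raises auto-decision accuracy; that trade-off is what a coverage curve makes visible.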
Slice artifacts summarize review rate, auto-decision behavior, error rates, and calibration for each applicant band.
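A slice summary of this kind can be sketched by grouping cases by band and counting deferrals and wrong auto-decisions. Everything specific here is an assumption: the band labels, the `(band, y_true, y_prob)` record layout, the 0.4/0.6 thresholds, and the choice to express false approval and false rejection rates as a share of auto-decided cases in the slice.

```python
from collections import defaultdict


def slice_report(records, lower=0.4, upper=0.6):
    """Summarize review and error rates per band.

    `records` is an iterable of (band, y_true, y_prob) tuples, where y_true is
    1 for an approval and 0 for a rejection. Error rates are taken over the
    auto-decided cases in each slice (an illustrative convention).
    """
    counts = defaultdict(lambda: {"n": 0, "review": 0, "false_approval": 0, "false_rejection": 0})
    for band, y, p in records:
        c = counts[band]
        c["n"] += 1
        if lower <= p < upper:
            c["review"] += 1            # deferred to human review
        elif p >= upper and y == 0:
            c["false_approval"] += 1    # auto-approved, should have been rejected
        elif p < lower and y == 1:
            c["false_rejection"] += 1   # auto-rejected, should have been approved
    report = {}
    for band, c in counts.items():
        auto = c["n"] - c["review"]
        report[band] = {
            "review_rate": c["review"] / c["n"],
            "false_approval_rate": c["false_approval"] / auto if auto else None,
            "false_rejection_rate": c["false_rejection"] / auto if auto else None,
        }
    return report


# Hypothetical records for two made-up applicant bands.
records = [
    ("low", 1, 0.90), ("low", 0, 0.70), ("low", 0, 0.50), ("low", 1, 0.20),
    ("high", 1, 0.95), ("high", 0, 0.10), ("high", 1, 0.55), ("high", 0, 0.45),
]
report = slice_report(records)
```

Comparing `review_rate` and the two error rates across bands is a quick way to spot slices where the policy defers far more often, or errs in one direction more than another.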
EXP-001 / stack
Technical stack
EXP-001 / limitations
Limitations and honesty check
- The project is a portfolio demo, not a production underwriting system.
- The included dataset and policies do not certify fairness, compliance, or deployability.
- Real lending use would require governance, monitoring, audit trails, security controls, and legal review.
EXP-001 / next
Next improvements
- Add drift monitoring and model-card metadata.
- Add deeper fairness and subgroup analysis.
- Add a FastAPI scoring endpoint and Docker deployment path.
- Connect reviewer feedback to policy and calibration monitoring.