EXP-002 / Risk Modeling / Published

Financial Fraud Risk Engine

A production-minded fraud-risk workflow with synthetic data generation, validation, model evaluation, cost-sensitive thresholding, policy artifacts, reason codes, SHAP explanations, and dashboard triage.

EXP-002 / question

Technical question

How can fraud-risk scores become analyst-review decisions instead of just model outputs?

EXP-002 / method

Method and workflow

  1. Generate synthetic fraud data with overlap, label noise, and class imbalance.
  2. Prepare train/test transaction datasets through a reusable data-prep workflow.
  3. Train a scikit-learn fraud-risk pipeline and evaluate ranking, calibration, and classification metrics.
  4. Search thresholds with cost, recall, precision, and review-capacity tradeoffs.
  5. Score new CSV files with fraud probability, binary flags, and analyst-friendly reason codes.
  6. Support interactive review through a Streamlit dashboard with SHAP-based explanations.

Pipeline stages: transactions → validation → fraud model → threshold search → policy artifacts → reason codes → dashboard
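
As a rough illustration of steps 1 to 3, the sketch below generates an imbalanced, noisy synthetic dataset, fits a scikit-learn pipeline, and computes one ranking metric, one precision-recall metric, and one calibration metric. The dataset shape, the 3% fraud rate, the 2% label-noise rate, and the choice of logistic regression are all assumptions made for this example; the project's actual generator and model may differ.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, brier_score_loss, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for transactions (assumed parameters):
# about 3% positives, overlapping classes, and 2% randomly reassigned labels.
X, y = make_classification(
    n_samples=5000,
    n_features=8,
    n_informative=5,
    n_redundant=1,
    n_clusters_per_class=1,
    weights=[0.97, 0.03],
    class_sep=0.8,
    flip_y=0.02,
    random_state=0,
)

# Stratify so the rare class appears in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)
proba = model.predict_proba(X_test)[:, 1]  # estimated fraud probability

metrics = {
    "roc_auc": roc_auc_score(y_test, proba),                      # ranking
    "average_precision": average_precision_score(y_test, proba),  # precision-recall
    "brier": brier_score_loss(y_test, proba),                     # calibration
}
```

With a rare positive class, average precision and the Brier score tend to be more informative than accuracy, which is why the sketch reports them alongside ROC AUC.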

EXP-002 / evidence

Evidence of work

Threshold policy

Policy artifacts compare cost-optimized, high-recall, high-precision, balanced, and review-capacity strategies.
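
A minimal sketch of how two of these strategies might be derived from the same threshold grid. The function names, the cost values (`fn_cost`, `fp_cost`), and the toy scores are assumptions for illustration, not the project's actual policy code; the high-recall, high-precision, and balanced strategies could be selected from the same rows with different criteria.

```python
import numpy as np

def confusion_at(y_true, proba, threshold):
    """Count true positives, false positives, and false negatives at a threshold."""
    flagged = proba >= threshold
    tp = int(np.sum(flagged & (y_true == 1)))
    fp = int(np.sum(flagged & (y_true == 0)))
    fn = int(np.sum(~flagged & (y_true == 1)))
    return tp, fp, fn

def search_thresholds(y_true, proba, fn_cost=100.0, fp_cost=5.0, review_capacity=None):
    """Evaluate a grid of thresholds and pick one per policy (hypothetical helper)."""
    y_true = np.asarray(y_true)
    proba = np.asarray(proba, dtype=float)
    rows = []
    for t in np.linspace(0.01, 0.99, 99):
        tp, fp, fn = confusion_at(y_true, proba, t)
        flagged = tp + fp
        rows.append({
            "threshold": float(t),
            "cost": fn_cost * fn + fp_cost * fp,  # missed fraud vs. wasted review
            "recall": tp / (tp + fn) if tp + fn else 0.0,
            "precision": tp / flagged if flagged else 0.0,
            "flagged": flagged,
        })
    # Cost-optimized: the threshold with the lowest total expected cost.
    policies = {"cost_optimized": min(rows, key=lambda r: r["cost"])}
    # Review-capacity: the best recall among thresholds analysts can keep up with.
    if review_capacity is not None:
        feasible = [r for r in rows if r["flagged"] <= review_capacity]
        if feasible:
            policies["review_capacity"] = max(
                feasible, key=lambda r: (r["recall"], -r["threshold"])
            )
    return policies

# Toy labels and scores, invented for this example.
y = np.array([0, 0, 0, 0, 1, 0, 1, 1])
p = np.array([0.05, 0.10, 0.20, 0.40, 0.35, 0.60, 0.70, 0.90])
policies = search_thresholds(y, p, fn_cost=100, fp_cost=5, review_capacity=2)
```

On this toy data the cost-optimized policy catches all three fraud cases at the price of two false positives, while the capacity-limited policy flags only the two highest scores and misses one case. That is the kind of tradeoff a policy artifact would record.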

Explainability

Reason codes and SHAP views turn model scores into inspectable analyst review signals.
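
One simple way to produce reason codes, sketched here for a linear model: multiply each coefficient by the standardized feature value and report the largest positive contributions as short messages. This is a plain linear attribution used as a stand-in, not the SHAP computation itself, and the feature names, coefficients, and message text are all hypothetical.

```python
import numpy as np

# Hypothetical mapping from feature name to analyst-facing message.
REASON_TEXT = {
    "amount_zscore": "Amount is unusually high for this account",
    "txn_count_1h": "Many transactions in the last hour",
    "new_device": "Transaction came from a new device",
    "account_age_days": "Account is relatively new",
}

def reason_codes(feature_names, coefficients, standardized_row, top_k=2):
    """Return messages for the top_k features that push the score upward."""
    contributions = (
        np.asarray(coefficients, dtype=float)
        * np.asarray(standardized_row, dtype=float)
    )
    order = np.argsort(-contributions)  # largest contribution first
    codes = []
    for i in order[:top_k]:
        if contributions[i] <= 0:
            break  # remaining features do not raise the risk score
        name = feature_names[i]
        codes.append(REASON_TEXT.get(name, name))
    return codes

names = ["amount_zscore", "txn_count_1h", "new_device", "account_age_days"]
coefs = [1.2, 0.8, 0.5, -0.6]   # assumed model coefficients
row = [2.5, 0.1, 1.0, -0.5]     # one standardized transaction
codes = reason_codes(names, coefs, row)
# codes == ["Amount is unusually high for this account",
#           "Transaction came from a new device"]
```

The messages describe which inputs moved the score, not why the transaction happened, so they remain review aids rather than causal claims.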

Validation

Training and scoring fail early when required columns are missing or probability contracts are violated.
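
A minimal sketch of fail-early checks of this kind, using pandas. The column names and the exact contract (no missing values, every value within [0, 1]) are assumptions for illustration; the project's real schema may be stricter.

```python
import pandas as pd

# Hypothetical required schema for a scoring input file.
REQUIRED_COLUMNS = ["transaction_id", "amount", "merchant_category"]

def validate_input_columns(df, required=REQUIRED_COLUMNS):
    """Raise before any modeling work if a required column is absent."""
    missing = [c for c in required if c not in df.columns]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")
    return df

def validate_probabilities(proba):
    """Raise if scores are missing or fall outside the [0, 1] interval."""
    s = pd.Series(proba, dtype="float64")
    if s.isna().any():
        raise ValueError("Probabilities contain missing values")
    if ((s < 0.0) | (s > 1.0)).any():
        raise ValueError("Probabilities must lie in [0, 1]")
    return s

# Usage: a frame without 'merchant_category' is rejected immediately.
bad = pd.DataFrame({"transaction_id": [1], "amount": [12.5]})
try:
    validate_input_columns(bad)
    column_error = None
except ValueError as exc:
    column_error = str(exc)
```

Raising a `ValueError` with the offending column names keeps the failure close to its cause, instead of surfacing later as an opaque error inside the model pipeline.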

EXP-002 / stack

Technical stack

Python, pandas, NumPy, scikit-learn, SciPy, SHAP, Streamlit, matplotlib, joblib, unittest, GitHub Actions

EXP-002 / limitations

Limitations and honesty check

  • The data is synthetic and does not represent real banking or payment-network behavior.
  • Reason codes are analyst-facing summaries, not causal explanations.
  • A real fraud platform would need monitoring, access control, audit logs, feedback loops, and compliance review.

EXP-002 / next

Next improvements

  • Add drift simulation and monitoring examples.
  • Add time-based validation and backtesting.
  • Add FastAPI batch scoring and model metadata endpoints.
  • Add reviewer feedback loops and alerting examples.