Movie Recommendation System | honardoust.codes

EXP-005 / question

Technical question

When does a hybrid recommender actually beat simple recommendation baselines?

EXP-005 / method

Generate deterministic synthetic movies and user ratings.
Create train/test splits and user-item matrices.
Build content-based recommendations with TF-IDF and cosine similarity.
Build collaborative recommendations with corrected TruncatedSVD reconstruction.
Compare hybrid blends across alpha values from 0.0 to 1.0.
Report baselines and structured per-user recommendation outputs with short reasons.

movies → ratings → split → content model → SVD model → hybrid sweep → recommendations
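
The content and collaborative steps above can be sketched roughly as follows. This is a minimal illustration on a made-up four-movie catalogue, not the repository's code: the `genres` strings, the `ratings` matrix, and the `content_scores` helper are all hypothetical, and the source does not say what the "corrected" reconstruction changes, so the sketch uses a plain `TruncatedSVD` fit followed by `inverse_transform`.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy catalogue: one genre string per movie.
genres = [
    "action adventure",
    "action thriller",
    "romance drama",
    "romance comedy",
]

# Content model: TF-IDF over genre text, then item-item cosine similarity.
tfidf = TfidfVectorizer().fit_transform(genres)
item_sim = cosine_similarity(tfidf)

# Hypothetical user-item training matrix (rows = users, 0 = unrated).
ratings = np.array([
    [5.0, 4.0, 0.0, 0.0],
    [0.0, 5.0, 1.0, 0.0],
    [1.0, 0.0, 5.0, 4.0],
])


def content_scores(user_row, item_sim):
    """Similarity-weighted average of the user's own ratings, per movie."""
    rated = user_row > 0
    weights = item_sim[:, rated]
    denom = weights.sum(axis=1)
    denom[denom == 0] = 1.0  # movies with no similar rated item score 0
    return weights @ user_row[rated] / denom


# Collaborative model: low-rank reconstruction of the rating matrix.
# Note: this plain version treats unrated cells as zeros.
svd = TruncatedSVD(n_components=2, random_state=0)
user_factors = svd.fit_transform(ratings)
reconstructed = svd.inverse_transform(user_factors)

user = 0
content = content_scores(ratings[user], item_sim)
collab = reconstructed[user]
```

Both `content` and `collab` hold one score per movie for the chosen user, which is the shape a later blending step needs. In practice the two score vectors would likely need rescaling to a common range before they are mixed.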

EXP-005 / evidence

Baselines

Bayesian-average, popularity, average-rating, positive-count, and random baselines keep the evaluation honest: a model only counts as better if it beats them.
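
As one illustration, a Bayesian-average baseline might look like the sketch below. The source does not give its formula, so this assumes the common form (m * C + sum of ratings) / (m + n), where C is the global mean rating, n is the movie's rating count, and m is a prior weight. The function name, the `prior_weight` default, and the toy ratings are hypothetical.

```python
from collections import defaultdict


def bayesian_average(ratings, prior_weight=5.0):
    """Shrink each movie's mean rating toward the global mean.

    ratings: iterable of (movie_id, rating) pairs.
    prior_weight: assumed number of "virtual" ratings at the global mean.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for movie_id, rating in ratings:
        totals[movie_id] += rating
        counts[movie_id] += 1
    global_mean = sum(totals.values()) / sum(counts.values())
    return {
        movie_id: (prior_weight * global_mean + totals[movie_id])
        / (prior_weight + counts[movie_id])
        for movie_id in totals
    }


# Hypothetical training ratings: m1 has a single 5.0, m2 has four high ratings.
train = [
    ("m1", 5.0),
    ("m2", 5.0), ("m2", 5.0), ("m2", 5.0), ("m2", 4.0),
    ("m3", 2.0),
]
scores = bayesian_average(train)
ranked = sorted(scores, key=scores.get, reverse=True)
```

With these toy numbers, `m2` ranks above `m1` even though `m1` has the higher raw mean, because a single rating is pulled harder toward the global mean than four ratings are.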

Alpha sweep

Hybrid weights are measured instead of guessed. On the current synthetic dataset, the sweep favors the content-only end of the blend.
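
A sweep of this kind could be sketched as below. Several things here are assumptions, not taken from the project: alpha is taken to weight the content score (so alpha = 1.0 is content-only), both score sets are assumed to be already scaled to a comparable range, and precision@k stands in for whatever metric the repository actually reports. The scores and the `relevant` set are invented and do not reproduce the project's results.

```python
def blend(content, collab, alpha):
    """Linear blend: alpha weights content, (1 - alpha) weights collaborative."""
    return {m: alpha * content[m] + (1.0 - alpha) * collab[m] for m in content}


def precision_at_k(scores, relevant, k):
    """Fraction of the top-k scored movies that are in the relevant set."""
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    return sum(1 for m in top if m in relevant) / k


# Hypothetical per-movie scores for one user, both assumed to lie in [0, 1].
content = {"m1": 0.9, "m2": 0.2, "m3": 0.7, "m4": 0.1}
collab = {"m1": 0.3, "m2": 0.8, "m3": 0.4, "m4": 0.6}
relevant = {"m1", "m3"}  # hypothetical held-out positives

# Sweep alpha from 0.0 to 1.0 in steps of 0.1.
alphas = [i / 10 for i in range(11)]
sweep = [
    (alpha, precision_at_k(blend(content, collab, alpha), relevant, k=2))
    for alpha in alphas
]

# max() keeps the first alpha that reaches the best precision.
best_alpha, best_precision = max(sweep, key=lambda pair: pair[1])
```

Keeping the whole `sweep` list, not only the winner, makes it easy to plot metric against alpha and to see whether the best value sits at an endpoint, which is the situation where a single-model baseline is as good as the hybrid.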

Outputs

Recommendation CSVs include rank, movie ID, title, genres, score, and reason fields.
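
A writer for output of this shape might look like the following sketch. The column names in `FIELDS` mirror the fields listed above, but their exact spelling, the score formatting, and the `write_recommendations` helper are assumptions. The sample rows are invented.

```python
import csv
import io

# Assumed column names, based on the fields listed in the text.
FIELDS = ["rank", "movie_id", "title", "genres", "score", "reason"]


def write_recommendations(rows, handle):
    """Sort rows by score (descending), assign ranks, and write them as CSV."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    ranked = sorted(rows, key=lambda row: row["score"], reverse=True)
    for rank, row in enumerate(ranked, start=1):
        writer.writerow({**row, "rank": rank, "score": f"{row['score']:.3f}"})


# Hypothetical recommendations for one user.
rows = [
    {"movie_id": 1, "title": "Movie A", "genres": "Action|Adventure",
     "score": 0.61, "reason": "similar genres to movies you rated highly"},
    {"movie_id": 2, "title": "Movie B", "genres": "Drama",
     "score": 0.87, "reason": "liked by users with similar ratings"},
]

buffer = io.StringIO()  # stands in for an open file
write_recommendations(rows, buffer)
lines = buffer.getvalue().splitlines()
```

Writing to any file-like handle keeps the function easy to test with `io.StringIO`, while real runs can pass a file opened with `newline=""` as the `csv` module recommends.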

EXP-005 / stack

Python, pandas, NumPy, SciPy, scikit-learn, matplotlib, unittest, GitHub Actions


EXP-005 / limitations

The dataset is synthetic and should not be interpreted as real user preference behavior.
The current best model can be a simple baseline, which is intentionally documented rather than hidden.
Production recommenders would need real interaction logs, ranking experiments, cold-start handling, and online evaluation.

EXP-005 / next