Handwritten Digit Classification: LR, SVM, and SVD
Overview
Course project (MAT167, Applied Linear Algebra). We compare classic classifiers (logistic regression, linear SVM) against a linear-algebraic approach (SVD) for handwritten digit recognition. Below are a brief dataset introduction and the confusion matrices, rendered directly for a quick skim.
Dataset
- USPS digits (10 classes), 16x16 grayscale images.
- Pre-split train/test; flattened to 256-dim vectors; z-score standardized.
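The flattening and z-score standardization step can be sketched as follows. This is a minimal illustration, not the project's actual pipeline; the array names and the synthetic data are stand-ins for the USPS arrays, and the key point is that the test set is standardized with statistics computed from the training set only.

```python
import numpy as np

# Synthetic stand-ins for the USPS arrays (shapes mimic 16x16 images
# flattened to 256-dim vectors; names are assumptions).
rng = np.random.default_rng(0)
X_train = rng.random((100, 256))
X_test = rng.random((20, 256))

# z-score standardize using training statistics only, so no test
# information leaks into the preprocessing.
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0) + 1e-8  # guard against zero-variance pixels
X_train_std = (X_train - mu) / sigma
X_test_std = (X_test - mu) / sigma
```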

SVD (Main Method)
- For each digit class, compute a rank-k SVD basis U_k on that class’s training images.
- For a test image x, compute reconstruction error ||x − U_k U_k^T x|| for each class and predict the class with minimal error.
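The two steps above can be sketched with NumPy. This is a hedged illustration of the described method, not the project's exact code; function names and the choice of `np.linalg.svd` are assumptions, but the logic follows the section directly: fit a rank-k basis U_k per class, then predict by minimal reconstruction error.

```python
import numpy as np

def fit_svd_bases(X_train, y_train, k=17):
    """For each class, keep the top-k left singular vectors of its images."""
    bases = {}
    for c in np.unique(y_train):
        A = X_train[y_train == c].T          # columns are that class's images
        U, _, _ = np.linalg.svd(A, full_matrices=False)
        bases[c] = U[:, :k]                  # rank-k basis U_k
    return bases

def predict_svd(X_test, bases):
    """Predict the class whose rank-k subspace reconstructs x best."""
    preds = []
    for x in X_test:
        # reconstruction error ||x - U_k U_k^T x|| for each class
        errs = {c: np.linalg.norm(x - U @ (U.T @ x)) for c, U in bases.items()}
        preds.append(min(errs, key=errs.get))
    return np.array(preds)
```

Note the projection is computed as `U @ (U.T @ x)` rather than forming the 256×256 matrix `U @ U.T`, which keeps the per-image cost at O(nk) instead of O(n²).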


Confusion Matrices


Results Summary
- Logistic Regression (one-vs-rest) accuracy: ~0.943
- Support Vector Machine (linear kernel) accuracy: ~0.931
- SVD-based classifier (k=17, reconstruction error): ~0.966
Metrics shown are from the current run; small variation is expected across seeds/hyperparameters.
Takeaways
- The SVD-based classifier yields the best accuracy on this USPS split (~0.966), ahead of both logistic regression and the linear SVM.
- SVD additionally offers compact, interpretable low-rank structure per class; the rank k controls the trade-off between model size and reconstruction fidelity.
- Regularization, scaling, and rank choice materially affect outcomes.
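Since the rank choice materially affects the outcome, a simple sweep over k is a natural sanity check. The sketch below is self-contained and hedged: it re-implements the per-class subspace idea on synthetic stand-in data (the helper name, the k grid, and the data are all assumptions, not the project's tuned setup).

```python
import numpy as np

def subspace_accuracy(X_tr, y_tr, X_te, y_te, k):
    """Accuracy of the rank-k per-class reconstruction-error classifier."""
    bases = {}
    for c in np.unique(y_tr):
        U, _, _ = np.linalg.svd(X_tr[y_tr == c].T, full_matrices=False)
        bases[c] = U[:, :k]
    correct = 0
    for x, y in zip(X_te, y_te):
        errs = {c: np.linalg.norm(x - U @ (U.T @ x)) for c, U in bases.items()}
        correct += (min(errs, key=errs.get) == y)
    return correct / len(y_te)

# Synthetic stand-in data: two classes living in distinct 5-dim subspaces.
rng = np.random.default_rng(1)
B0, B1 = rng.random((256, 5)), rng.random((256, 5))
X = np.vstack([(B0 @ rng.random((5, 40))).T, (B1 @ rng.random((5, 40))).T])
y = np.array([0] * 40 + [1] * 40)

# Illustrative sweep; on real USPS data one would hold out a validation split.
accs = {k: subspace_accuracy(X, y, X, y, k) for k in (1, 3, 5)}
```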