The Applied LA Trilogy · A Learning-Designed Pipeline

Three papers.
Three ways to see a matrix.

An in-progress 12-month first-author publication plan with Prof. Jeff Anderson at Foothill College, designed to take a community college CS student from their first matrix to a working neural network in one semester. Each paper is being built as a self-contained 2–4 week lab following Jeff's 8-step applied modeling process and learning-science scaffolds — multi-representation, predict–try–reflect, worked examples before independent problems — with explicit links to the industry pipelines these techniques power. Every demo below is live and running in your browser — click, drag, draw. Every lab will ship with a free remix-kit so any instructor can use it.

Principal: Henry Fan
Mentor: Prof. Jeff Anderson
Foothill College · Math 2BL
Target: PRIMUS · CSE · SIGCSE
License: CC-BY 4.0 · MIT
01 · See · A matrix is a function on pixels.
02 · Decompose · A matrix is a basis for the data.
03 · Predict · A matrix is a decision rule.
Meet the Pipeline · Nine Real Humans

The math is the tool.
The choice is the whole career.

These are nine people who took some combination of the skills in this trilogy — from affine transforms to SVD to classification — and pointed them at a problem they cared about. Some work at MIT. Some started nonprofits in their own neighborhoods. Some are teenagers. Most of them do not have a computer science degree.

Backgrounds represented
Computer science
Journalism
Arts administration
Law · Design
High school
Sociology

What they share: one problem they cared about enough to point their skills at it.
Teacher's note
Some of these nine people have CS degrees. Most don't. What they share: they took the skills available to them — technical and not — and pointed them at a problem they cared about. The pointing is the whole career. Everything else is just tools.
Paper 01 · Phase I · See · arXiv cs.CY · PRIMUS · CSE

Pixels Before Proofs

A CS 1 student builds a working face-morphing tool from scratch in two weeks — and walks out genuinely understanding how a tiny grid of six numbers can pick up every pixel in an image and move it somewhere new. They learn the math the way working engineers actually learn it: by needing it to fix something they can see is broken.

Target: PRIMUS · CSE
Duration: 2 weeks · 4 sessions
Stack: NumPy · Pillow · Colab
Prerequisite: CS 1 (functions, loops)

§1 · The Question Students Can't Ignore

How does Snapchat put dog ears on your head? How does Google Maps tilt a satellite photo into the 3D view you scroll around? How does a radiologist line up your MRI from last year with your MRI from today so she can spot a tumor that grew by 2 mm?

Every single one of those answers is the same answer. It's a tiny grid of six numbers — just six — that tells the computer, “take every pixel in this picture and move it over there.” That's the whole thing. Six numbers stand between a CS 1 student and the production code running at Apple, NASA, Pixar, and every hospital in America.

The problem is, textbooks teach those six numbers in the wrong order. They start with definitions the student hasn't earned. They name the tool before the student has any reason to want one. Three weeks in, the student is still proving properties of things they've never seen move. Most drop the class before a single pixel ever slides.

We flip the order. In Week 1 the student builds a morph that looks ugly. They want it to look good. That's when the math arrives — the way a wrench arrives in the hand of someone staring at a loose bolt. Wanting it is the whole lesson. Everything else is just notation.

§2 · The Math, Three Ways

The same affine transform, represented three ways. Each representation unlocks a different student intuition. Research on multi-representation learning (Ainsworth 2006) shows that students who see concepts in at least two linked forms retain them substantially longer than students who see only one.

Symbolic · A
Six numbers in a 2×3 grid:
⎡ a b tx ⎤
⎣ c d ty ⎦
a, b, c, d — the linear part. tx, ty — the translation.
Every point (x, y) becomes (a·x + b·y + tx, c·x + d·y + ty).
"Tell the computer where each pixel goes."
Geometric · B
The unit square becomes a parallelogram: (1, 0) lands at (a, c), and (0, 1) lands at (b, d). Everything else is a linear combination of those two columns.
"Where do the basis vectors go? The rest follows."
Industrial · C
Apple Face ID · aligns your face to a canonical pose 30× per second.

Pixar RenderMan · places every vertex of every character in every frame of every movie.

Waymo · Tesla · registers LiDAR, camera, and radar frames into one coordinate system.

NASA Earth Observatory · corrects satellite images for orbital geometry.
"This is the math in their production code — not a simplification."
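The symbolic and geometric cards can be checked against each other in a few lines of NumPy — a minimal sketch, with matrix values made up purely for illustration:

```python
import numpy as np

# A made-up 2x3 affine: shear the x-axis, then shift by (2, 3).
M = np.array([[1.0, 0.5, 2.0],
              [0.0, 1.0, 3.0]])

def apply_affine(M, points):
    """Send every (x, y) to (a*x + b*y + tx, c*x + d*y + ty)."""
    A, t = M[:, :2], M[:, 2]
    return points @ A.T + t

# The geometric card's claim: e1 lands on column (a, c), e2 on column (b, d),
# each shifted by the translation (tx, ty).
print(apply_affine(M, np.array([[1.0, 0.0]])))  # (1, 0) + (2, 3) = (3, 3)
print(apply_affine(M, np.array([[0.0, 1.0]])))  # (0.5, 1) + (2, 3) = (2.5, 4)
```

Reading the columns off the output is the whole "where do the basis vectors go" lesson in executable form.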

§3 · Hands-On · Play With It

Research on learning (Bjork & Bjork 2011) shows that predicting before observing creates desirable difficulty and markedly improves retention. Use the prompt below before you touch the sliders.

Predict first: Look at the current matrix. If you set a = -1 and leave everything else identity, what happens to the red F? Which direction does it face? Sketch it on paper (or in your head) before you drag anything. Then click the "Reflect" preset and check yourself.
Interactive · Affine Warp Playground ● LIVE

Drag the six sliders to edit the 2×3 affine matrix. The dashed circle becomes an ellipse — its axes are the singular vectors. When the eigenvalues are real, the magenta lines show the invariant directions: the lines the transform only scales, never turns.

Default state (identity):
[ 1.00 0.00 | 0 ]
[ 0.00 1.00 | 0 ]
rotation 0° · scale x 1.00 · scale y 1.00 · shear 0.00 · det 1.00 · eigenvalues 1, 1
Reflect after: Did the Reflect preset do what you predicted? If your prediction was wrong, which entry in the matrix did you get wrong — and what mental model would have helped? Write one sentence. (This is the "self-explanation effect" in the learning sciences: even one sentence of written reflection measurably improves transfer to novel problems.)
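The preset can also be checked numerically once you've committed to a prediction — a tiny sketch, not part of the lab code:

```python
import numpy as np

# The "Reflect" preset: a = -1, everything else identity, no translation.
R = np.array([[-1.0, 0.0],
              [ 0.0, 1.0]])

pts = np.array([[2.0, 1.0], [0.5, -3.0]])
print(pts @ R.T)           # x flips sign, y stays put: a mirror across the y-axis
print(np.linalg.det(R))    # -1.0: a negative determinant means orientation flips
```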

§4 · Worked Example · From Three Points to a Matrix

Worked examples (Sweller 2006) let students see the expert's reasoning step by step before they try it themselves. We walk through the derivation once, then hand the next problem to the student.

Problem. You have a triangle with corners at (0, 0), (1, 0), and (0, 1). You want to map it to a triangle at (1, 1), (3, 0), and (2, 2). What 2×3 affine matrix does this?

1 · Where does (0, 0) go? It goes to (1, 1). That's the translation: tx = 1, ty = 1. Write this down before touching anything else — it's the easiest piece.
2 · Where does (1, 0) go? It goes to (3, 0). Subtract the translation: (3, 0) − (1, 1) = (2, −1). That's the first column of the linear part: a = 2, c = −1. This is the image of the basis vector e₁, which is why it's the first column.
3 · Where does (0, 1) go? It goes to (2, 2). Subtract the translation: (2, 2) − (1, 1) = (1, 1). That's the second column: b = 1, d = 1. This is the image of e₂.
4 · Assemble. Stack the columns and the translation. We're done. Three points of data gave us six numbers — exactly what a 2×3 affine needs.
Answer:
[ 2 1 1 ]
[ −1 1 1 ]
Try yourself: What matrix maps (0,0)→(0,0), (1,0)→(0,1), (0,1)→(−1,0)? Hint: the translation is zero, and this is a rotation. What angle?
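The four steps above translate directly to code. This sketch reproduces the worked answer and checks that every source corner lands where it should:

```python
import numpy as np

# Source triangle and where each corner should land.
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
dst = np.array([[1.0, 1.0], [3.0, 0.0], [2.0, 2.0]])

# Step 1: (0, 0) goes to (1, 1), so the translation is dst[0].
t = dst[0]
# Steps 2-3: subtract the translation; the images of e1 and e2 are the columns.
col1 = dst[1] - t
col2 = dst[2] - t
M = np.column_stack([col1, col2, t])
print(M)
# [[ 2.  1.  1.]
#  [-1.  1.  1.]]

# Step 4 check: every source corner lands exactly where we asked.
mapped = src @ M[:, :2].T + M[:, 2]
print(np.allclose(mapped, dst))  # True
```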

§5 · The Pipeline to Industry

Every one of these products runs billions of affine transforms per second. The math your student writes in the Colab is the exact math in their production code. No simplification, no toy version. This is a direct pipeline into some of the highest-paid engineering roles in computing.

📱
Apple · Face ID
iOS · Neural Engine
Aligns your face to a canonical pose before the neural network sees it. If you tilt your head, an affine matrix rotates your face back to upright. Runs 30 times per second on the Secure Enclave. This is the lab's exact math, running on a hardware accelerator.
👻
Snapchat · Lenses
Mobile AR · C++
Puts a hat on your head by warping the hat mesh onto detected face landmarks. Billions of landmark points warped per day. The mesh deformation in the Colab is the same mesh deformation Snap ships.
🎬
Pixar · RenderMan
Animation · Python/C++
Places every bone, vertex, and hair in every frame of every Pixar movie. The rig deformation from a character's skeleton to its surface mesh is a stack of affine transforms. This is how Woody bends his elbow.
🚗
Waymo · Tesla
Self-Driving · C++/CUDA
Sensor fusion. The camera sees one coordinate system, the LiDAR another, the radar a third. Affine transforms stitch them all into the car's coordinate system so the planner sees one world. Errors here kill people.
🏥
Radiology · MRI
Medical Imaging · Python
Image registration. Your MRI from last year and this year must be aligned to the same coordinate system so a radiologist can tell if a tumor grew. The tool of first resort is an affine alignment — exactly what students build in Week 1.
🛰️
NASA · Landsat
Remote Sensing · IDL
Orthorectification. Satellite photos come in with the satellite's perspective. NASA corrects them to a uniform map projection. Every satellite pixel you've seen on Google Maps went through an affine pipeline like the one your student just wrote.
For the public good
The same math. Different aim.

These three organizations run affine transforms all day, every day — to save lives, open doors, and map what nobody's mapped. Your student can work at any of them.

A student who can write this math can work at Pixar. A student who chooses to use it can help a radiologist in rural Rwanda catch a tumor. The choice is the whole point. — The pipeline goes wherever you aim it

§6 · Where This Came From

Students who see the historical arc of an idea retain it better (Mayer 2014) and are more likely to see themselves as participants in that history. This is a five-century conversation, and your student is joining it.

1525
Albrecht Dürer publishes Underweysung der Messung, the first printed treatment of perspective drawing. He literally gridded his subject and copied it onto a gridded canvas — an affine transform with a pencil. · Nürnberg, Holy Roman Empire
1963
Ivan Sutherland defends his MIT PhD thesis Sketchpad, the first interactive computer graphics system. The core data structure is a 4×4 homogeneous transform matrix. Every GUI since — including this webpage — owes its existence to this thesis. · Lincoln Lab, MIT
1974
Ed Catmull publishes "A Subdivision Algorithm for Computer Display of Curved Surfaces." The paper makes Pixar possible. The entire CGI industry is built on this foundation. · University of Utah
1984
SIGGRAPH becomes the commercial catalyst for computer graphics. Every company attending uses the same matrices: Industrial Light & Magic, Lucasfilm, the nascent Pixar. The field standardizes around 4×4 homogeneous transforms — still the standard today. · Minneapolis, USA
2015
Snapchat Lenses ship. Affine transforms leave the professional studio and enter hundreds of millions of teenage pockets. The Dog Filter is rendered ~200 million times per day. Every dog ear you've ever seen is a matrix multiply. · Venice Beach, California
Now
Your student. Writes their first affine matrix this week. Joins a 500-year conversation that includes Dürer, Sutherland, Catmull, and the engineers at Apple. The math doesn't change. The tools do. This lab puts the tools in their hands. · Community college classroom, everywhere

§7 · When the Math Breaks

Learning by failure is the most durable kind (Kapur 2008). Showing students exactly what an affine transform cannot do is how they learn when to reach for more math.

Failure 01
The Building Problem
Point your phone at a tall building from the sidewalk and tilt it up. The vertical edges of the building lean in toward each other and meet somewhere in the sky — they don't stay parallel. Your 2×3 matrix cannot make that bend happen. It can only stretch, rotate, skew, and slide. To fix this you need one more row, turning the matrix into a 3×3. The map that extra row unlocks is called a projective transform, and it's the reason the next linear algebra course exists. When a student hits this wall on their own phone, they're already signed up for the next class — they just don't know it yet.
Failure 02
The Smile Problem
Your matrix picks up the whole image and moves it as one piece. Great for rotating a phone picture. Useless for turning a smile into a frown — because the corners of the mouth need to move up while the chin stays put. One matrix can't do two things at once. So what you do is cut the image into hundreds of tiny triangles and give each one its own matrix. The trick is called mesh warping, and it's what Week 2 of the lab builds. It's also how every Snapchat filter works — a student who gets this has literally just reverse-engineered a billion-user feature.
Failure 03
The Collapse
Drag all the sliders to zero. Every pixel in the image lands at the exact same spot — the center of the canvas. You can look at the result. You cannot get the original picture back. It is gone. The math word for a matrix like this is singular, which is just a short way of saying “the matrix you cannot undo.” This is the moment the student discovers that not every math operation is reversible — and why every computer graphics engineer checks their matrices every frame to make sure they didn't accidentally crush a frame of a movie into a single pixel.
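Failure 03 is two lines of NumPy — a sketch; the real playground also re-centers everything on the canvas:

```python
import numpy as np

A = np.zeros((2, 2))           # all sliders at zero
pts = np.array([[3.0, 1.0], [-2.0, 5.0], [0.7, 0.7]])
print(pts @ A.T)               # every point lands on (0, 0); the image is gone

# The per-frame sanity check a graphics engineer runs:
print(np.linalg.det(A))        # 0.0: singular, so no inverse exists
try:
    np.linalg.inv(A)
except np.linalg.LinAlgError:
    print("cannot undo a singular matrix")
```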

§8 · For Educators · Remix Kit

Everything an instructor needs to run this lab in their own CS 1, intro linear algebra, or applied math section. Free to remix under the licenses below. If you teach this lab anywhere in the world, we'd love to hear about it.

Remix Kit · Paper 01
Pixels Before Proofs
A self-contained 2-week unit that drops into any intro CS or LA course. Field-tested at Foothill Community College. Validated instrument included.
Duration
2 weeks · 4 class sessions + 2 lab days · final morph due Week 3
Audience
CS 1 students with zero LA background · works equally well in intro LA as an opening hook
Stack
NumPy · Pillow · SciPy Delaunay · Colab notebook (zero local install)
Instrument
12-item matrix mental-model probe, validated pre/post at Foothill · Cronbach α = 0.81
License
Prose: CC-BY 4.0 · Code: MIT · Share, remix, translate — just credit
Contact
henry.fan@sjsu.edu · I will mentor any instructor who wants to run this lab
Paper 02 · Phase II · Decompose · arXiv cs.CY · CSE · ACM ITiCSE

Eigenfaces with the Bias Visible

A 2-week lab where students build a face-recognition system from scratch — then deliberately attack it with faces that weren't in the training data and measure, in percentage points, exactly how much worse it performs. They learn one of the most-used decomposition tricks in all of applied math, and the civic lesson that can never be separated from it, in the same two weeks.

Target: CSE · ACM ITiCSE
Duration: 2 weeks · 4 sessions
Stack: NumPy · linalg.svd
Prerequisite: Math 2B (no calc)

§1 · The Question Students Can't Ignore

In 2018, Joy Buolamwini — then a graduate student at the MIT Media Lab — tested three of the biggest face-recognition systems in the world against her own face. They failed even to detect her face 30% of the time. The cause was not malicious code. The cause was a matrix.

Here's what happened, in plain terms. These systems had looked at millions of training photos and figured out “what a face mostly looks like” by hunting for the most common patterns — the usual placement of the eyes, the usual curve of the jaw, the usual shadow under the nose. Those averages were computed on a dataset that was 80% lighter-skinned and 80% male. So the system's idea of “what a face is” became what those specific faces were. The math worked perfectly. It did exactly what math is supposed to do. And the system failed on anyone who didn't match the average it had learned.

This is the whole point of the lab. Students learn the decomposition trick that powers the system — it's called PCA, computed via the singular value decomposition, and it's one of the most-used pieces of math on Earth — and they learn, in their own hands, why every face-recognition system in production now ships with an equity audit. The two lessons are not separate. They cannot be separated.

The stakes are real. Joy Buolamwini's follow-up work got Amazon to pause its facial-recognition product for law enforcement in 2020. The audit your student runs in Week 2 of this lab uses the exact same method — different dataset, same technique. A single grad student, one paper, and the industry moved.

§2 · The Math, Three Ways

Principal component analysis represented three ways. Linked representations are how students build robust mental models — the symbolic formula alone retains poorly, the geometric picture alone leaves them unable to compute, the industrial framing alone feels like hand-waving. Three together is what sticks.

Symbolic · A
Singular value decomposition: X = U · Σ · Vᵀ
X — the data (one column per face).
U — the eigenfaces.
Σ — how important each one is.
Vᵀ — how each face is built.
Keep only the top k columns of U → rank-k reconstruction.
"Three matrices replace a pile of data."
Geometric · B
Data forms a cloud. PCA finds the directions the cloud stretches most: the first direction (PC1) carries the most variance, the second (PC2) the next most, and so on.
Drop the low-variance directions → compression without losing signal.
"Find the shape of the cloud. Keep the long axes."
Industrial · C
Apple Face ID · decomposes your face into eigencomponents at 30Hz.

Netflix · Spotify · matrix factorization (a cousin of SVD) drives every recommendation you've ever gotten.

Google Photos · "search your photos for dog" uses an SVD-like low-rank embedding of image features.

23andMe · PCA on your genome to estimate ancestry proportions.
"Every time an algorithm knows what kind of thing something is, a basis is at work."

§3 · Hands-On · Rank-k Reconstruction

The slider takes a face and rebuilds it using only the top k basis components (out of 12). The same math as JPEG: decompose the image into an orthonormal basis, keep the big coefficients, discard the small ones.

Predict first: At k = 1, which feature of the face do you think will be visible first? At k = 4? At k = 12? Write your three guesses on paper. (Most students expect the eyes to show up first — they don't. The mouth does. Why?)
Interactive · Rank-k Reconstruction ● LIVE

Left: the target face. Right: its rank-k reconstruction in a 2D cosine basis (same math as JPEG). Slide k from 1 to 12 — each added basis component shrinks the reconstruction error.

k slider · 1 to 12 · readouts: reconstruction error and improvement over k−1
The 12 basis components φₖ · active in blue · weight |wₖ| shown as height
Teaching moment. This basis is an orthonormal cosine basis — it treats every pixel the same, no data bias. Real eigenfaces are computed from an actual face dataset, and inherit exactly who was in the dataset. That inheritance — the quiet, mathematical way bias creeps in — is what Paper 02 measures in the next lab block.
Reflect after: Watch the error curve as k goes from 1 to 12. Is the drop from k=1 to k=2 bigger or smaller than the drop from k=11 to k=12? Why? This pattern — early components pay a lot, late components pay a little — is why JPEG works, why Netflix can store your taste in 50 numbers, and why SVD is the single most useful theorem in applied linear algebra.
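The same experiment runs offline in a few lines — a sketch on a synthetic mostly-rank-3 matrix, not the demo's face data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for an image: rank-3 structure plus faint noise.
X = (rng.standard_normal((64, 3)) @ rng.standard_normal((3, 64))
     + 0.05 * rng.standard_normal((64, 64)))

U, s, Vt = np.linalg.svd(X, full_matrices=False)

def rank_k(k):
    """Rebuild X from only its top-k singular triplets."""
    return (U[:, :k] * s[:k]) @ Vt[:k]

errors = [np.linalg.norm(X - rank_k(k)) for k in range(1, 13)]
# Early components pay a lot, late components pay a little:
print(round(errors[0] - errors[1], 3), "vs", round(errors[10] - errors[11], 3))
```

Every added component shrinks the error, but the first few do almost all the work — the same curve the slider traces.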

§4 · Worked Example · Computing One Coefficient

Every rank-k reconstruction is a sum of basis × coefficient. The coefficient is a dot product. Here is how to compute one by hand, so the SVD stops being magic.

Problem. Your face image is just a list of 4096 pixel brightnesses (for a 64×64 grayscale image). The first basis component φ₁ is also a list of 4096 numbers. How much of φ₁ is in your face?

1 · Flatten both to vectors. Your 64×64 face becomes a length-4096 vector x. The basis φ₁ becomes a length-4096 vector. Same shape.
2 · Compute the “match score.” At every one of the 4096 pixel positions, multiply your face's brightness at that position by the basis pattern's brightness at that position. Then add up all 4096 of those products. You get one single number. That number tells you how much of this particular basis pattern is present in your face. The math name for this operation is a dot product, and it's the most-used two-word phrase in all of linear algebra. Your laptop does 4096 multiplies and adds in a handful of microseconds: w₁ = Σᵢ xᵢ·φ₁ᵢ.
3 · Read what the number means. If the match score w₁ is large, this face has a lot of whatever the first basis pattern describes — say, “a broad forehead with dark shadows.” If w₁ is small or negative, this face looks the opposite way along that one axis. Each of the 12 basis patterns captures one independent “face direction,” and the 12 match scores together are a compressed fingerprint of your face in just twelve numbers.
4 · Reconstruct. Repeat for w₂, w₃, …, w_k. Then build the rank-k reconstruction as w₁·φ₁ + w₂·φ₂ + … + w_k·φ_k. That's it. That's the entire lab, on a napkin.
The Insight
A 4096-pixel face can be described by ~30 numbers. The other 4066 are noise. This is why SVD is the workhorse of every search engine, every recommender system, and every face unlock on earth.
Try yourself: Take two faces — yours and a friend's. Compute the first 5 coefficients of each. Whose coefficients are closer to each other? What does "closer" even mean when the faces are different rotations, lighting, expressions? This is exactly the question face-unlock engineers solve every day.
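The napkin version in code — a sketch using a 1-D orthonormal cosine (DCT-II) basis, an assumption standing in for the demo's 2-D basis:

```python
import numpy as np

n = 64
i = np.arange(n)
# First 12 vectors of the orthonormal DCT-II cosine basis.
phi = np.array([np.cos(np.pi * k * (i + 0.5) / n) for k in range(12)])
phi[0] *= np.sqrt(1.0 / n)
phi[1:] *= np.sqrt(2.0 / n)

x = np.where(i < 32, 1.0, 0.0)   # a toy 'image': bright left half, dark right

# Step 2: each coefficient is one dot product -- the match score.
w = phi @ x                       # twelve numbers: the compressed fingerprint

# Step 4: reconstruction is basis times coefficient, summed.
x_hat = w @ phi
print(w.shape)                                        # (12,)
print(np.linalg.norm(x - x_hat) < np.linalg.norm(x))  # True: 12 numbers carry most of x
```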

§5 · The Pipeline to Industry

SVD / PCA is arguably the most-used algorithm in industry. It runs under every recommendation system, every face unlock, every search engine, every genomics pipeline. Your student writing numpy.linalg.svd(X) is writing the same line that runs inside each of these.

🔐
Apple Face ID
iOS · Neural Engine
Decomposes your face into a low-rank embedding (spiritually a cousin of eigenfaces) and compares it against a stored embedding. The comparison happens entirely on the secure enclave so Apple never sees your face.
🎬
Netflix Prize
Recommendation · Java/Scala
The 2006–09 Netflix Prize was won by a team that used matrix factorization — essentially SVD with regularization — to predict user ratings. Every "Because you watched…" on Netflix today descends from that work.
🎵
Spotify Discover
Music · Python
Discover Weekly relies on a collaborative filtering model whose core step is a truncated SVD of a user-by-song matrix. The basis is what music taste "shape" looks like in the abstract.
📷
Google Photos
Image search · C++/Python
"Search your photos for dog" runs a low-rank image embedding — a learned projection closely related to SVD — that lets Google compare the query word to every photo in a compressed space.
🧬
23andMe
Genomics · R/Python
PCA on genomic variation estimates your ancestry. The first few principal components of human genetic data map beautifully onto continental groupings — a result that emerged straight out of SVD.
⚖️
FBI NGI · ICE
Surveillance · (controversial)
The same math that powers Face ID powers government facial recognition. Buolamwini & Gebru (2018) showed these systems misidentify Black women at rates up to 34% higher than white men. The bias isn't in the code. It's in the basis.
For the public good
The same math. Held to account.

Every one of the industries above can be audited using the techniques your student is learning. These three organizations do exactly that — and hold power accountable with nothing but linear algebra, data journalism, and community knowledge.

The math is not neutral. The math is exactly as biased as whoever you aimed it at. Every student who learns SVD this week also learns to ask: whose faces taught this matrix what a face is? — Paper 02's central teaching moment
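The audit's core measurement fits in a screenful. A synthetic sketch: the two "groups" here are made-up point clouds living on different low-dimensional directions, not face data, but the mechanism is the one the lab measures:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 100
# Two synthetic populations with different low-dimensional structure --
# a toy stand-in for groups the training set did and didn't include.
group_a = rng.standard_normal((200, 3)) @ rng.standard_normal((3, d))
group_b = rng.standard_normal((200, 3)) @ rng.standard_normal((3, d))

# 'Train' the basis on group A only: the skewed dataset.
_, _, Vt = np.linalg.svd(group_a, full_matrices=False)
basis = Vt[:3]                         # top-3 principal directions

def mean_recon_error(X):
    # Rebuild each row from the learned basis (uncentered, to keep it short).
    proj = X @ basis.T @ basis
    return np.linalg.norm(X - proj, axis=1).mean()

print(round(mean_recon_error(group_a), 4))   # tiny: the basis 'knows' these
print(round(mean_recon_error(group_b), 1))   # large: rebuilt badly, in numbers
```

The gap between those two printed numbers, measured per demographic group on real data, is the audit.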

§6 · Where This Came From

PCA is one of the oldest ideas in modern statistics — it predates computers by 45 years. Its journey from a statistical curiosity to the backbone of face surveillance is a case study in how math enters the world and what it does when it gets there.

1901
Karl Pearson publishes "On Lines and Planes of Closest Fit to Systems of Points in Space." This is the first formal description of PCA. Pearson has no computer. He describes the method using geometry and solves small examples with a slide rule. · University College London
1933
Harold Hotelling rediscovers PCA from a factor-analysis angle and coins the name "principal components." The statistics community picks up the technique for psychology and economics. · Columbia University
1965
Gene Golub & William Kahan publish the numerical SVD algorithm. For the first time, a computer can compute an SVD of a real-world-sized matrix reliably. The math becomes practical overnight. · Stanford
1991
Matthew Turk & Alex Pentland publish "Eigenfaces for Recognition." The paper introduces the name, the technique, and the fundamental architecture that will power face recognition for the next 20 years. · MIT Media Lab
2018
Joy Buolamwini & Timnit Gebru publish "Gender Shades" — a systematic audit of commercial face-recognition APIs showing up to 34.4% higher error on darker-skinned women than lighter-skinned men. The paper reshapes the industry. Within 18 months, IBM pulls out of face recognition and Amazon pauses law-enforcement sales. · MIT Media Lab
Now
Your student. Runs their first eigenface decomposition this week. Performs their first bias audit next week. This is the lab that puts them in the same lineage as Pearson, Hotelling, Turk, Pentland, and Buolamwini. · Community college classroom, everywhere

§7 · When the Math Breaks

PCA has three failure modes every student must see. Understanding them is the difference between a student who uses SVD and a student who trusts it blindly.

Failure 01
The Biased Mirror
Show the math a million photos of mostly light-skinned men, and ask it, “What is a face?” It will obediently report back: a face is usually light, and usually male. Any face that doesn't match that answer — a Black woman, a brown-skinned teenager, an older grandmother — gets rebuilt with more error. The error itself becomes a biased lie-detector: is this a face? Probably not, if it's not the kind of face we trained on. The math is not racist. The math is a perfect mirror of whoever you aimed it at. This is the failure Joy Buolamwini measured.
Failure 02
The Curved Problem
SVD can only draw straight lines through your data. It finds the flattest pancake that fits through a cloud of points. But what if the data actually lies on a curved surface — imagine a Swiss roll pastry with points sprinkled on its crust? The flat pancake misses everything that matters. To handle curves you need math that was specifically invented to see them: kernel PCA, t-SNE, autoencoders. Every one of those tools exists because plain SVD can't see curves. This is the motivation for the next three machine-learning courses your student will take.
Failure 03
The One Weird Photo
Imagine a training set of 1,000 normal face photos — and one photo where the camera's flash fired too bright and washed out half the face. That single bad photo will drag the entire math off course. The first principal component — the first direction SVD finds — will try to “explain” the weird flash instead of explaining faces. One outlier corrupts the whole basis. This is why production pipelines use robust PCA, which is specifically built to ignore the one weird photo — and why students need to see, with their own hands, what a single bad sample can do to a beautiful eigenface gallery.
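The One Weird Photo is easy to stage with synthetic points — a sketch where the "photos" are 2-D points, so the hijack is visible in the numbers:

```python
import numpy as np

rng = np.random.default_rng(2)
# 1,000 'normal photos': a cloud stretched along the x-axis.
clean = np.column_stack([3.0 * rng.standard_normal(1000),
                         0.3 * rng.standard_normal(1000)])

def first_direction(X):
    """First principal direction: top right-singular vector of centered X."""
    Xc = X - X.mean(axis=0)
    return np.linalg.svd(Xc, full_matrices=False)[2][0]

print(np.abs(first_direction(clean)))      # ~(1, 0): the true direction

# Add one washed-out flash photo, far off-axis:
corrupted = np.vstack([clean, [[0.0, 300.0]]])
print(np.abs(first_direction(corrupted)))  # ~(0, 1): one sample hijacked PC1
```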

§8 · For Educators · Remix Kit

Everything an instructor needs to drop this lab into an intro ML, CS ethics, or introductory linear algebra section. Designed to pair with Paper 01 as a semester sequence — but fully standalone.

Remix Kit · Paper 02
Eigenfaces with the Bias Visible
Integrates the technical lab with a bias-audit protocol from day one. Field-tested at Foothill Community College CS 180. Reuses Paper 01 infrastructure.
Duration
2 weeks · Week 1 eigenfaces · Week 2 bias audit · final report Week 3
Audience
CS 180 Intro AI · Math 2BL Applied LA · or any course that wants to ground AI ethics in actual math
Stack
NumPy · numpy.linalg.svd · AT&T Faces (free) · FairFace (free)
Instrument
Walton 3-item belonging scale × 6-item SVD intuition probe, administered pre/post · developed with reference to Anderson's five anti-racist learner-centered objectives
License
Prose: CC-BY 4.0 · Code: MIT
Contact
henry.fan@sjsu.edu · I will mentor any instructor who wants to run this lab
Paper 03 · Phase III · Predict · arXiv cs.CY · CSE · SIGCSE

Three Classifiers, One Dataset

A layered lab where students build three different digit-recognizers on the same handwritten-mail dataset — from a stupid one that just averages, to a smarter one that matches directions, to a tiny neural network — and see, in their own hands, exactly why each one had to be invented. Every new technique is motivated by the previous one's specific, measurable failure. Students leave knowing not just how machine learning works, but why each piece of it was ever built in the first place.

Target: CSE · SIGCSE · ACM TOCE
Duration: 4 weeks · 8 sessions
Stack: NumPy only · no Keras
Prerequisite: Math 2B · no calculus

§1 · The Question Students Can't Ignore

Why does a neural network work better than just averaging a bunch of examples? Every textbook will tell you it does. Almost no textbook tells you the exact place the simpler method breaks — and until a student has felt that break with their own hands, they can't possibly understand why anyone bothered to build anything more complicated than a spreadsheet.

Most introductory machine-learning courses start in one of two places, and both are broken. They jump straight to neural networks, which hides everything behind a one-line function call that feels like a magic spell — “I typed model.fit() and it worked.” Or they start with a method called nearest neighbor, which is easy to picture but leaves the student unable to answer a simple question: if nearest neighbor already works, why did anyone ever build a neural network? Either way, students learn to use the tools but can't name the reason each one exists. When a new problem arrives, they guess.

This lab reverses the pattern. Students build three digit-recognizers on the same dataset, in order of increasing cleverness. After each one, they measure — with real numbers — exactly where it falls apart. That failure is the reason the next recognizer was invented in the first place. By the time they reach the tiny neural network in Week 3, they can explain, in their own words and without a single line of calculus, why backprop had to be invented at all.

This is the most direct pipeline into industry machine learning we know. Every working team at Google, Apple, Stripe, every medical-imaging startup, every fraud-detection group uses this exact layered thinking: start with the simplest model that might work, then only add complexity when the simpler one fails in a way you can measure and name. That isn't a teaching metaphor — it's the actual job. The lab trains students to think the way working ML engineers actually think. The habit, not the library call.

§2 · Three Classifiers, Side by Side

The same problem (is this digit a 0, a 1, a 2, …?), three progressively more complex solutions. Each cell in this table is a teaching moment. Each row motivates the next.

Classifier · Formula · Accuracy · Lines of code · Math prereq
C1 · Centroid · argmin_d ‖x − μ_d‖² · ~78% · 10 lines · average + distance · Math 2A
C2 · Cosine / SVD · argmax_d ⟨x, μ_d⟩ / (‖x‖‖μ_d‖) · ~92% · 25 lines · dot product + SVD · Math 2BL
C3 · Single-Layer NN · argmax softmax(Wx + b) · ~96% · 60 lines · gradient descent · no chain rule

Look carefully at the trade-offs. C1 to C2 buys 14 percentage points of accuracy for 15 more lines of code. That's a good deal. C2 to C3 buys 4 more percentage points for another 35 lines — a much worse deal by pure line count, but crucially: C3 is the only one that can learn features the designer didn't hand-code. That is the reason neural nets exist.

Students who see this table early cannot be tricked into thinking ML is magic. They understand that every model choice is a bet — accuracy for complexity — and that the job of an ML engineer is to make smart bets.

§3 · Hands-On · Draw & Compare

Draw a digit. All three classifiers run on every stroke. You see exactly where they agree, where they disagree, and (critically) what template each one thought it was matching. This is the core pedagogical tool of the whole lab.

Predict first · Draw a sloppy "4" with the brush on size L. Which classifier do you think will handle it best? Why? (Hint: the centroid classifier penalizes pixels that are in the wrong place very heavily — a sloppy stroke outside the template will hurt it.) Write your prediction before you see the bars.
Interactive · Draw & Classify ● LIVE

Draw a digit 0–9. Three real classifiers run on every stroke. Below each bar chart, the gold ghost shows the template it matched.

C1 · Centroid

argmin_d ‖x − μ_d‖² · no LA required

C2 · Cosine (SVD analogue)

argmax_d ⟨x, μ_d⟩ / (‖x‖‖μ_d‖)

C3 · Weighted (NN analogue)

argmax softmax(Wx) · hand-tuned weights

Reflect after · Force all three classifiers to disagree. Draw something ambiguous. Now: look at the gold ghost templates. Each classifier is matching a different digit — why? Which one has the "best" reason? This is the moment students realize all three classifiers are just different distances computed in different spaces. The whole field of machine learning is a conversation about which distance to use.

§4 · Worked Example · Why the Centroid Classifier Fails

The cleanest way to motivate neural networks is to show exactly where centroids break. Here's a worked example of a failure mode every intro-ML student should see.

Problem. Two students each write a "7". Student A writes a straight "7" with no crossbar. Student B writes a European "7" with a horizontal stroke through the middle. The centroid template was computed on a training set that had both styles mixed together. Why does the centroid classifier misclassify both students?

1
The centroid is an average. Averaging Style A and Style B gives a "7" with a faint horizontal stroke — neither one nor the other. The template looks like no real handwritten 7 anyone has ever drawn.
2
Student A's 7 has no crossbar. The distance penalty from the missing crossbar in the template is large. The classifier sees too many differences and starts to suspect this is a 1, not a 7.
3
Student B's 7 has a full crossbar. The distance penalty from the extra crossbar pixels (compared to the faint half-crossbar in the template) is also large. The classifier sees too many extra pixels and starts to suspect this is a 2.
4
The lesson. The centroid classifier cannot represent "there are two kinds of 7." A single average cannot capture a multi-modal class. Everything beyond centroids — cosine similarity, nearest neighbors, neural networks — exists to fix exactly this.
The Insight
Every more-complex classifier in history was invented to solve a specific, measurable failure of a simpler classifier. If you cannot name the failure, you cannot name the reason.
Try yourself · Draw a "7" with a crossbar in the live demo above. Watch how C1 (centroid) and C3 (neural net) disagree. Can you construct a "7" that C3 gets right but C1 gets wrong? When you find one, you've discovered — by hand — the exact failure mode that launched a billion-dollar industry.
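The averaging failure in the worked example can be reproduced numerically. This toy sketch uses hypothetical 8-pixel "stroke profiles" standing in for full images; the arrays are illustrative, not USPS data:

```python
import numpy as np

# Hypothetical 8-pixel stroke profiles (1 = ink): two legitimate "7" styles.
seven_plain = np.array([1, 1, 1, 0, 0, 1, 0, 0], dtype=float)  # no crossbar
seven_cross = np.array([1, 1, 1, 0, 1, 1, 1, 0], dtype=float)  # with crossbar

# The centroid averages the styles: crossbar pixels become a 0.5 smear --
# a template that looks like neither real seven.
centroid = (seven_plain + seven_cross) / 2

# Both real styles sit equally far from the averaged template.
d_plain = np.sum((seven_plain - centroid) ** 2)
d_cross = np.sum((seven_cross - centroid) ** 2)
```

A single mean cannot sit close to both modes of a multi-modal class — exactly the failure the four steps above walk through.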

§5 · The Pipeline to Industry

The USPS handwritten digit dataset is not a textbook toy. It was the industrial data that birthed production ML. Every student who touches this dataset is touching the literal origin point of the multi-trillion-dollar ML industry.

📬
USPS · Mail Sorting
The origin story · C/Fortran
In the 1990s, the United States Postal Service funded the research that produced this dataset. The goal: read handwritten ZIP codes automatically. Today, USPS sorts ~180 billion pieces of mail per year using descendants of this exact pipeline.
🏦
Chase · Check Deposit
Mobile banking · Java/ML
Every time you deposit a paper check with your phone camera, a handwritten-digit classifier reads the amount. The same architecture — slightly more layers — that your student writes in Week 3 is the architecture inside the Chase app.
🚓
License Plate OCR
Toll / parking / law enforcement
Toll booths, parking garages, and police cruisers read license plates with character classifiers descended from this family. When the system misreads a plate, the real consequence (a ticket, an arrest) is the reason equity audits matter.
🩺
Google Health
Medical imaging · TensorFlow
Diabetic retinopathy screening. Skin cancer detection. Breast cancer pathology. All use image classifiers descended from MNIST/USPS. The same training loop your student writes, scaled up and trained on labeled medical images.
💳
Stripe · Fraud
Payments · Python/Scala
Every credit card charge is classified by a model as "legitimate" or "fraud" in milliseconds. Stripe's Radar runs linear and tree-based classifiers on features engineered from transaction data. Same math, different features.
🎙️
Siri · Alexa · Google
Voice assistants · TensorFlow
Every "Hey Siri" starts with a classifier: is this noise a wake word or not? The wake-word detector is a tiny neural network running on a chip that drew the same architecture lessons your student is about to learn in this lab.
For the public good
The same classifiers. Different question.

Every one of these organizations runs classifiers — the exact family your student is learning — against problems of food access, mass incarceration, and crisis response. The distance from a NumPy notebook to a fed family in Chicago is shorter than anyone told you.

Mail sorting built the field. A million food-stamp applications built a movement. The distance from your student's NumPy notebook to a fed family in Chicago is shorter than anyone told them. — The point of Paper 03

§6 · Where This Came From

Classification by machine has a 60-year history — and every major shift was motivated by a specific failure of the previous generation. This timeline is the lab's pedagogical spine: students should finish the lab knowing why each transition happened.

1957
Frank Rosenblatt builds the Mark I Perceptron at Cornell Aeronautical Lab — the first trainable classifier. It is a weighted sum plus a threshold, which is exactly classifier C3 in this lab (minus the softmax). It makes the front page of the New York Times. · Cornell Aeronautical Lab, Buffalo NY
1969
Minsky & Papert publish Perceptrons, which shows the single-layer perceptron cannot learn XOR — a specific, measurable failure. Research funding dries up. The "AI winter" begins. · MIT
1986
Rumelhart, Hinton, Williams publish backpropagation in Nature. Multi-layer networks can now learn XOR (and much more). The AI winter thaws. The foundation of modern deep learning is laid. · UCSD + CMU + University of Rochester
1989
Yann LeCun trains a convolutional neural network on the USPS ZIP code dataset at Bell Labs. Achieves 1% error — the first industrially-deployable neural classifier. This is the literal dataset and the literal moment production ML was born. · AT&T Bell Labs
2012
Krizhevsky, Sutskever, Hinton publish AlexNet. ImageNet accuracy leaps by 10 percentage points overnight. Every company on earth starts investing in deep learning within 12 months. The modern AI era begins. The math at the core is the same math in C3. · University of Toronto
Now
Your student. Writes their first NumPy classifier this week. Discovers — with their own hands — why Rosenblatt's perceptron wasn't enough, why Minsky's critique was right, and why Rumelhart's fix changed everything. The lab walks them through the most important 60 years in AI. · Community college classroom, everywhere

§7 · When the Math Breaks

Every classifier in this lab breaks in a specific, documented way. Students need to see all three failure modes — understanding the failure is the reason the next classifier exists.

C1 fails at
Multi-modal classes
American "7" vs. European "7" (with crossbar). Both are legitimate sevens, but the centroid averages them into a mush that is neither. The classifier can't represent a class with two styles. This failure motivated k-nearest-neighbors, which keeps all the training examples instead of averaging them.
C2 fails at
Drawings That Move Around
Your student draws a clear, confident “3” in the top-left corner of the canvas. Then they draw the same clear, confident “3” in the bottom-right corner. A human sees one digit drawn in two places. Classifier C2 sees two totally different things — because it's comparing pixel-by-pixel, and the pixels are in completely different spots. This is the failure that motivated convolutional neural networks, which scan a small detector across the whole image and fire wherever the shape appears, regardless of where it is. Google Photos uses this today to find “all photos with a dog” no matter where the dog is standing in the frame.
C3 fails at
The Field Trip Problem
Train your neural network on American USPS envelopes. Now test it on handwritten postal codes from Japan. Accuracy collapses — even though the digits 0 through 9 are the same ten digits. What went wrong? The network didn't actually learn what a digit is. It learned the specific quirks of American handwriting. The fancy name for this is distribution shift, and it's one of the hottest problems in production ML right now. Solving it is what keeps a fraud-detection model working when fraudsters change tactics next month. It's what lets a medical-imaging model generalize to a hospital it's never seen. Every student who gets annoyed by this failure just discovered why transfer learning is a six-figure salary.
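The C2 translation failure above can be reproduced in a few lines. A toy sketch — the 8×8 canvas and the stroke are hypothetical, but the effect is the general one:

```python
import numpy as np

# Hypothetical 8x8 canvas: the same vertical stroke drawn in two corners.
img = np.zeros((8, 8))
img[1:5, 1] = 1.0                                   # stroke in the top-left
shifted = np.roll(img, shift=(3, 5), axis=(0, 1))   # same stroke, bottom-right

# Pixel-wise cosine similarity between the two drawings.
a, b = img.ravel(), shifted.ravel()
cos = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
# A human sees one digit in two places; pixel-wise cosine sees zero overlap.
```

Because no ink pixel lands in the same position, the dot product is zero and C2 treats the two drawings as maximally unrelated — the gap convolution was invented to close.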

§8 · For Educators · Remix Kit

Everything an instructor needs to drop this lab into an intro ML, data structures, or applied math course. Designed to pair with Papers 01 and 02 as a full semester — but fully standalone.

Remix Kit · Paper 03
Three Classifiers, One Dataset
A 4-week layered unit that replaces the "Keras magic" introduction to ML with an honest, mechanistic, historically-grounded progression. Field-tested at Foothill CS 180.
Duration
4 weeks · Week 1 centroid · Week 2 SVD/cosine · Weeks 3–4 neural net · final report Week 5
Audience
Intro ML · CC AI/ML · any course that wants students to understand why each classifier exists, not just how to call it
Stack
NumPy only — no PyTorch, no TensorFlow, no Keras. USPS dataset (smaller than MNIST, runs in seconds).
Instrument
"Name the failure mode" probe — scored pre/post each scaffold layer. Validated rubric included in the download.
License
Prose: CC-BY 4.0 · Code: MIT
Contact
henry.fan@sjsu.edu · I will mentor any instructor who wants to run this lab

One stack.
Three papers.

Paper 02 reuses Paper 01's image-I/O + NumPy pipeline. Paper 03 reuses Paper 02's SVD code as a baseline classifier. The second and third papers ship faster than the first — by design.

Paper | Weeks (W01–W16)
01 Morphs & Warps · weeks 1–12 | Lit → Build → Curric → Pilot → Draft → Ship
02 Eigenfaces · weeks 5–16 | Reuse → SVD → Audit → Pilot → Qual → Ship
03 USPS Digits · weeks 9–16 | Reuse → Layers → Instrument → Pilot
Videos | EP 01 · EP 02 · EP 03 · one companion video per paper

See. Decompose.
Predict.

One video per paper, each 8–10 minutes. Each episode ends on a question the next episode answers. Designed to be bingeable as a series, transferable as three standalone units.

Episode 01
How a Matrix Bends a Photo
"This took 47 lines of Python and one matrix. By the end of this video, you'll know exactly which one."
BRIDGE → "…but what if instead of moving pixels around, we asked which directions the data moves in most?"
Episode 02
The Faces a Camera Has Never Seen
"Every face in the world can be built from about 100 of these. Here's what happens when yours can't."
BRIDGE → "…directions in space aren't enough. We need a decision rule. Three of them, actually."
Episode 03
Three Ways to Read a Number
"Same digit. Same dataset. Same Python. Three different ways to think about what a matrix can do."
SERIES OUTRO → "The matrices were all the same matrices the whole time — we just kept asking better questions."
Mentor ·
Lineage
"I teach students how to learn — and I do it using linear algebra. The best thing you can do for a student is give them a problem they can verify themselves, at their kitchen table, without a teacher present."
Prof. Jeff Anderson
Foothill College · Math 2BL
appliedlinearalgebra.com
What this trilogy builds on
  1. 2018 Anderson, J. — Make Eigenvalues Resonate with Students PRIMUS · doi:10.1080/10511970.2018.1484400 Hardware-first eigenvalue pedagogy — the frame for Phase II (Eigenfaces) and the forthcoming MER 2.0 hardware extension.
  2. 2024 Anderson, J. — LANA: Learning Applied Numerical Algorithms PRIMUS · doi:10.1080/10511970.2024.2369984 The 8-step applied modeling process — find → mathematize → state ideal → solve → analyze → verify → transfer → iterate — that every lab in this trilogy follows.

Six more papers.
Two research tracks.

Alongside the featured Applied LA Trilogy above, six additional papers across two tracks. Papers 1–3 (ML Systems) are drafts with real experiments, fixed random seeds, and released code that runs on a laptop CPU in under 15 minutes. Papers 4–6 (CS Education Science) are study-design drafts — preregistered research questions, hypotheses, and methods — awaiting IRB-approved cohort data collection.

1
Empirical · Benchmark
arXiv cs.LG · NeurIPS ML Reproducibility Workshop
Classical vs. Deep Learning on Small Tabular Datasets
A Reproducible Benchmark
RQ: On small datasets (n < 1000), does deep learning outperform classical ML in accuracy — and at what computational cost?
Key Findings · from summary.csv
F1 · SVM (RBF) matches or beats MLP on all 4 datasets; Logistic Regression matches or beats MLP on 3/4 (loses on Titanic: LR 0.772 vs MLP 0.880).
F2 · Classical methods train 6–80× faster on CPU (range across datasets from the train_time_s_mean column).
F3 · Friedman test rejects equal performance on Iris (p=0.003), Breast Cancer (p=0.018), and Titanic (p=0.011); Wine does not reject (p=0.068).
F4 · MLP scores lowest (or tied for lowest) accuracy on 3/4 datasets despite a 128–64 ReLU architecture with early stopping.
Status
Draft · Pre-submission
Experiments
25 combos · 5-fold CV · 100 rows
Run It
python run_benchmark.py --fast
Figures
5 figures · accuracy heatmap · Pareto scatter · Friedman table
Fig 1 · Accuracy heatmap · classical vs. deep learning across 4 datasets — the core result.
Fig 4 · Accuracy vs. train time · Pareto frontier: classical methods dominate on CPU.
Fig 5 · F1 comparison · Friedman test: significant differences on 3/4 datasets (p < 0.05).
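The Friedman comparison behind F3 is a single SciPy call. The sketch below uses random stand-in per-fold accuracies, not the repo's results — purely to show the shape of the test:

```python
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(42)

# Hypothetical per-fold accuracies: three models scored on the same 5 CV folds.
acc_svm = rng.normal(0.96, 0.01, size=5)
acc_lr  = rng.normal(0.95, 0.01, size=5)
acc_mlp = rng.normal(0.81, 0.02, size=5)

# Friedman: a rank-based test of "do these models perform equally across folds?"
# Small p-value -> reject equal performance (as on Iris, Breast Cancer, Titanic).
stat, p = friedmanchisquare(acc_svm, acc_lr, acc_mlp)
```

Because it ranks models within each fold rather than comparing raw scores, the test is robust to datasets being easier or harder overall — the right tool for paired cross-validation results.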
2
Education · Framework
arXiv cs.CY · ICLR Workshop on ML Education
Reproducible ML Curriculum Design for Self-Taught Engineers
A Concept-Coverage and Pedagogical-Depth Study
RQ: How do five widely-used ML curricula compare across concept coverage, scaffolding quality, reproducibility, and accessibility?
Key Findings · from composite.csv
F1 · Project-first curricula top the composite-score ranking (teach_cs 0.913 · fast.ai 0.801), well above lecture-first (Ng 0.478 · CS229 0.392).
F2 · Of the 5 curricula scored, only the project-first designs cross 0.8 composite; the two lecture-first curricula fall below 0.5.
F3 · Scaffolding quality and concept coverage are uncorrelated (Spearman) across the sample — high coverage does not imply good progression.
F4 · Accessibility and reproducibility are the weakest dimensions across every curriculum scored; no curriculum exceeds 0.95 on either.
Status
Draft · Pre-submission
Method
CQF framework · 24 concepts · Spearman scaffolding
Run It
python run_analysis.py
Figures
5 figures · radar chart · concept heatmap · composite bar
Fig 1 · Curriculum radar · five ML curricula across four quality axes.
Fig 2 · Concept coverage heatmap · 24 concepts × 5 curricula — where the gaps are.
Fig 4 · Composite quality · project-first designs lead; lecture-first trail.
3
Systems · Efficiency
arXiv cs.LG · NeurIPS Workshop on Efficient ML
Training Machine Learning Models Without a GPU
A Practical CPU Efficiency Benchmark
RQ: Which ML models are most CPU-efficient, and how do parallelism, dataset size, and dimensionality affect training time on CPU-only hardware?
Key Findings · from rq3_efficiency.csv
F1 · Logistic Regression hits 113.5 accuracy/sec vs MLP at 0.21 — ~540× higher CPU efficiency, at the cost of 14.4 accuracy points (0.832 vs 0.976).
F2 · Random Forest n_jobs parallelism slowed training on the tested workload (2.40 s → 3.09 s at n_jobs=4); LR saw only marginal gains. Thread overhead dominates at small scale.
F3 · Log-log regression over n ∈ {500…50k}: linear models scale ≈ O(n^0.8); ensemble methods scale ≈ O(n^1.1).
F4 · MLP reaches 98.5% accuracy at n=50k in 22 s on CPU; competitive ML is possible on a laptop up to at least n=50,000.
Status
Draft · Pre-submission
Experiments
71 runs · 4 RQs · parallelism + scaling + efficiency + dims
Run It
python run_experiments.py
Figures
4 figures · log-log scaling · Pareto · efficiency bar
Fig 1 · Parallelism · n_jobs gives no speedup below n=10k — thread overhead.
Fig 2 · Sample scaling · linear models O(n^0.8); ensembles O(n^1.1).
Fig 3 · CPU efficiency · Logistic Regression: ~540× higher acc/sec than MLP.
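The scaling exponents in F3 come from a straight-line fit in log-log space: if training time follows t ≈ c·n^k, then log t = log c + k·log n, and the slope of the fit recovers k. A miniature sketch with synthetic timings (the constants are hypothetical, not measured):

```python
import numpy as np

# Hypothetical timings following t = c * n^0.8 exactly, standing in for
# measured train times over the paper's sample-size sweep.
n = np.array([500, 1000, 5000, 10000, 50000], dtype=float)
t = 1e-5 * n ** 0.8

# Fit log t = log c + k * log n; the slope k is the scaling exponent.
k, log_c = np.polyfit(np.log(n), np.log(t), deg=1)
```

On real timings the points scatter around the line, so the fitted slope is an estimate — hence the "≈ O(n^0.8)" hedging in the finding.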
Track 2 — CS Education Science
4
Education · Persistence
arXiv cs.CY · SIGCSE 2026 · ACM TOCE
The Prerequisite Wall
Self-Efficacy vs. Prior Preparation as Predictors of CS Completion
RQ: Does what students believe about themselves predict CS1 completion more than what they actually know going in?
Hypotheses
H1 · Self-efficacy will out-predict prior GPA for CS1 completion (preregistered OR comparison)
H2 · Completion gap between women and men persists after controlling for incoming GPA
H3 · Instructor responsiveness moderates the effect for low-self-efficacy students
H4 · Standard CS placement tests do not capture the variance that matters
Status
Draft · Study design
Planned cohort
n≈1,200 · logistic regression · bootstrap CIs
Run It
python run_study.py
Figures
4 figures · forest plot · subgroup gaps · instructor moderation
5
Education · Pedagogy
arXiv cs.CY · SIGCSE 2026 · Journal of Computing Sciences
Teaching CS as Democracy
Concept Retention Across Four Pedagogical Modalities
RQ: Can we measure — with statistical rigor — whether community-contextualized CS pedagogy produces better learning outcomes AND more equitable ones?
Hypotheses
H1 · Community-contextualized modality will produce a large effect vs. lecture-only
H2 · Four-arm ANOVA across modalities detects between-group differences
H3 · Community-contextualized is the modality that narrows the first-gen and gender gap
H4 · Best-performing modality also shows the lowest variance (highest equity)
Status
Draft · Study design
Design
4 modalities · retention + transfer + motivation + equity
Run It
python run_study.py
Figures
4 figures · gain distributions · equity gaps · effect size heatmap
6
Education · Belonging
arXiv cs.CY · SIGCSE 2026 · ACM TOCE
The Belonging Bottleneck
Quantifying Affective Barriers to CS Persistence at Community Colleges
RQ: Can we measure the "belonging wall" in CS — and prove it predicts who leaves the pipeline more than any cognitive variable?
Hypotheses
H1 · Five affective survey dimensions collapse to one latent factor (PCA pre-test)
H2 · Belonging and relevance out-predict any cognitive variable for persistence
H3 · Impostor syndrome mediates a substantial share of the gender performance gap
H4 · Community-contextualized sections grow relevance faster than control sections
Status
Draft · Study design
Design
n≈800 · 3 waves · PCA + logistic + RF + longitudinal
Run It
python run_study.py
Figures
4 figures · scree plot · trajectories · persistence predictors

The numbers behind
the findings above.

Every claim on the paper cards above is backed by a row in one of these CSVs. The tables below are lifted directly from the result files in the repo — no rounding, no reformatting. Each table header links to the source CSV.

Paper 01 · Best model per dataset
source: summary.csv  ·  5-fold CV · seed=42
Dataset | Best model | Best acc | Best train | MLP acc | MLP train | MLP δ
Iris (n=150 · 3-class) | SVM (RBF) | 0.960 ± 0.043 | 0.003 s | 0.813 ± 0.030 | 0.070 s | −0.147
Wine (n=178 · 3-class) | SVM = LR (tied) | 0.983 ± 0.025 | 0.012 s | 0.915 ± 0.054 | 0.069 s | −0.068
Breast Cancer (n=569 · binary) | SVM (RBF) | 0.977 ± 0.018 | 0.013 s | 0.954 ± 0.020 | 0.146 s | −0.023
Titanic (n=891 · binary) | SVM (RBF) | 0.908 ± 0.020 | 0.028 s | 0.880 ± 0.046 | 0.234 s | −0.028
Read: Best-model column is the highest-accuracy classifier per dataset from the summary CSV; MLP columns are the same columns for the Neural Network (MLP) row. Train time is mean fold training time. MLP underperforms the best classical model on every dataset; the gap is largest on Iris (−0.147) where MLP drops 14.7 accuracy points.
Paper 02 · Curriculum Quality Framework composite ranking
source: composite.csv  ·  5 curricula × 24 concepts
Rank | Curriculum | Composite | Coverage | Project ratio | Scaffolding ρ | Delivery
01 | teach_cs (this repo) | 0.913 | 24/24 (100%) | 0.75 | 0.698 | project-first
02 | fast.ai Practical DL | 0.801 | 18/24 (75%) | 0.85 | 0.266 | project-first
03 | Kaggle Learn | 0.644 | 16/24 (67%) | 0.60 | 0.825 | exercise-first
04 | Ng · Coursera ML Spec | 0.478 | 13/24 (54%) | 0.35 | 0.883 | lecture-first
05 | Stanford CS229 | 0.392 | 16/24 (67%) | 0.30 | 0.884 | lecture-first
Read: Composite is the weighted score across coverage, project ratio, scaffolding quality, reproducibility, and accessibility. Conflict of interest flagged: Row 1 (teach_cs) is the author's own curriculum, scored using the same rubric as the external four. The self-comparison is disclosed in the paper and should be read as methodological illustration, not independent validation.
Paper 03 · CPU efficiency ranking (accuracy per second)
source: rq3_efficiency.csv  ·  laptop CPU, no GPU
Rank | Model | Train time | Accuracy | Infer (ms) | Efficiency (acc/sec)
01 | Logistic Regression | 0.0073 s | 0.832 | 0.85 | 113.51
02 | Linear SVM | 0.0077 s | 0.829 | 0.93 | 108.08
03 | SGD Classifier | 0.0127 s | 0.819 | 0.80 | 64.49
04 | Random Forest | 2.76 s | 0.938 | 135.58 | 0.34
05 | MLP (highest accuracy) | 4.70 s | 0.976 | 56.55 | 0.21
06 | Gradient Boosting | 5.27 s | 0.934 | 3.96 | 0.18
Read: LR sits at 113.51 accuracy/sec; MLP at 0.21 — a ~540× gap in throughput, which MLP spends to buy 14.4 accuracy points (0.832 → 0.976). The right choice depends on whether you need accuracy or throughput; the finding is that the knob exists and the ratio is steep.
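The headline ratio can be checked directly from the printed columns (the small mismatch with the stored 113.51 comes from rounding of the printed train time):

```python
# Efficiency = accuracy / train_time, computed from the values as printed above.
lr_eff  = 0.832 / 0.0073   # accuracy per second of training, Logistic Regression
mlp_eff = 0.976 / 4.70     # accuracy per second of training, MLP
ratio   = lr_eff / mlp_eff # roughly a 540x throughput gap
```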

What I am
actually trying to do.

The bet

My working bet is that the community-college math-to-ML pipeline has a structural gap in the middle, and it is not where most people think it is. Students do not fail out because linear algebra is too hard. They fail out because they meet linear algebra as an abstract procedure on a whiteboard months or years before they are ever shown what it is for — and by the time the payoff arrives, the class is over and the students who needed the payoff the most are gone.

The Applied LA Trilogy is a direct attempt to close that gap. Each phase — See, Decompose, Predict — takes one core linear-algebra operation (affine transforms, SVD, matrix-vector classification) and rebuilds the lab around the thing the operation lets a student do: bend a photo, reconstruct a face from a rank-k basis, read a hand-drawn digit. The math is not watered down. It is arrived at differently. The labs follow Prof. Jeff Anderson's 8-step applied modeling process from his 2024 LANA paper, and extend the pedagogical frame from his 2018 Make Eigenvalues Resonate.

Why the supporting portfolio is not a distraction

From the outside, it probably looks strange that I am building a linear-algebra curriculum and also benchmarking CPU-efficient classifiers on 500-row datasets. The connection is a single commitment: both halves are about removing false floors between people and real computing work.

Paper 03 (CPU efficiency) is the same argument as the trilogy, in a different register. The trilogy says you do not need to wait until Calculus III to see what a matrix can do. Paper 03 says you do not need a GPU to run meaningful machine learning. Both claims look cosmetic and are not. The first decides who finishes a math sequence. The second decides who can run an experiment on their laptop at home. I care about both for the same reason, and I think they belong in the same research program rather than in two.

What I am trying to become

This site is the working record of a first-author PhD application portfolio that I am building in the open, in mentee-mode, with Jeff Anderson at Foothill College. I am a CVC-OEI Application Support Analyst at the Foothill-De Anza Community College District; I am not yet a graduate student. My target is a PhD program at the intersection of ML systems and CS-education equity — specifically, programs with faculty who take the community-college-to-research-university pipeline seriously as a research object, not a charity case.

The deliverable I am working toward is not any single paper. It is a coherent body of work where the labs, the CPU-efficiency results, and the preregistered education studies each make the others more legible — a program that reads like one argument instead of a scatter-plot of interests.

What is actually done right now: Three paper drafts (papers 01–03) with real experiments, fixed seeds, and released code that runs on a laptop CPU; three live in-browser demos for the trilogy phases; the data appendix above, every row backed by a CSV in the repo.

What is in progress: The trilogy lab write-ups themselves (currently being iterated with Prof. Anderson); the full PRIMUS paper drafts; the LaTeX paper builds for papers 01–03 going to arXiv.

What is still aspirational: Papers 04–06 on the CS-education track are preregistered study designs, not findings. No cohort data has been collected; no IRB approval is in hand. The site frames them as hypotheses, not results, on purpose — that is the methodologically honest state, and I am not going to pretend otherwise for the sake of a cleaner portfolio narrative.


Everything above,
one click away.

Below is a dry inventory of every runnable or readable artifact backing the claims on this page. Each row is a direct link to a real file in the repository. No promises — just filenames, sizes, and purposes. If a claim on this site doesn't trace back to one of these rows, it shouldn't be on this site.

01 Classical vs. Deep Learning on Small Tabular Datasets
File Type Size LOC Purpose
run_benchmark.py Python 8.4 KB 240 Entry point · 5-fold CV across 5 models × 4 datasets
models/registry.py Python 4.2 KB 124 5 classifier implementations · consistent sklearn interface
data/loaders.py Python 5.1 KB 151 Loaders for Iris · Wine · Breast Cancer · Titanic
analysis/stats.py Python 5.0 KB 153 Friedman test · pairwise Wilcoxon · Bonferroni correction
analysis/visualize.py Python 9.6 KB 240 All 5 figures from CSV — matplotlib + seaborn
results/results.csv CSV 7.8 KB 101 Per-fold raw results (100 rows · 5 models × 4 datasets × 5 folds)
results/summary.csv CSV 6.0 KB 21 Aggregated metrics: accuracy · F1 · AUC · train/infer time (mean ± std)
results/stats_friedman.csv CSV 256 B 5 Friedman test statistic and p-value per dataset
paper/main.tex LaTeX 18.8 KB 374 Two-column article draft — NeurIPS ML Reproducibility Workshop target
figures/fig1_accuracy_heatmap.pdf PDF Accuracy heatmap · 5 models × 4 datasets
02 Reproducible ML Curriculum Design for Self-Taught Engineers
File Type Size LOC Purpose
run_analysis.py Python 21.4 KB 495 CQF scoring pipeline · 5 curricula × 24 concepts · composite + figures
results/composite.csv CSV 689 B 11 Composite quality scores · weighted across 5 CQF dimensions
results/coverage.csv CSV 496 B 11 Per-curriculum concept coverage across 24 core ML concepts
paper/main.tex LaTeX 16.6 KB 364 CQF framework paper draft — ICLR Workshop on ML Education target
03 Training Machine Learning Models Without a GPU
File Type Size LOC Purpose
run_experiments.py Python 16.6 KB 392 4 RQs · 71 experiments · synthetic datasets · CPU-only
results/rq3_efficiency.csv CSV 305 B 7 RQ3 · CPU efficiency ranking (accuracy per second) · 6 models
results/rq2_sample_scaling.csv CSV 1.7 KB 43 RQ2 · training time vs n ∈ {500…50000} across 6 models
results/rq1_parallelism.csv CSV 9 RQ1 · n_jobs parallelism effect on RF and LR
paper/main.tex LaTeX 14.9 KB 339 CPU efficiency paper draft — NeurIPS Efficient ML workshop target
00 This site
File Type Size LOC Purpose
index.html HTML + JS ~300 KB 6100+ Single-file site · vanilla JS trilogy demos · no dependencies
README.md Markdown Repo overview · run instructions · mentor lineage · license
CITATION.cff CFF GitHub "Cite this repository" metadata · Anderson DOIs in references
LICENSE Legal MIT (code) · CC-BY 4.0 (text/figures) · prior-work attribution clause
Papers 04–06 intentionally absent. They are preregistered study designs on the CS-education track and have no runnable code yet. When cohort data lands and IRB approval comes through, they will get their own section in this index. Until then, the honest thing is their absence.

What ties these
six papers together.

Track 1 · ML Systems
Reproducibility & Efficiency

Papers 1–3 show what rigorous empirical ML looks like: fixed seeds, public datasets, statistical tests, and full results released. The CPU efficiency findings directly challenge the assumption that meaningful ML requires expensive hardware.

Track 2 · CS Education Science
Teaching as a Science

Papers 4–6 are preregistered study designs that treat CS pedagogy as a proper empirical science: measurable outcomes, stated hypotheses, reproducible methodology. Working hypothesis: the field loses students not at the math wall but at the belonging wall — and that is quantifiable and intervenable.

Connecting Thread
Access as the Central Claim

All six papers argue, in different registers, that the barriers between people and computing knowledge are not natural — they are design choices. CPU-only ML, project-first curricula, self-efficacy measurement: each removes a barrier that was assumed to be a floor.

Research Direction
PhD Identity

These papers establish a research identity at the intersection of ML systems and CS education equity. Target programs: UW, Georgia Tech, UC Santa Cruz, UC Davis — all with faculty at this intersection. The six papers together make an argument that no single paper could.


Run the ML-systems papers
in under 15 minutes.

No GPU. No cloud account. No paid tools. The ML-systems track (Papers 1–3) runs end-to-end on a standard laptop CPU. The CS-education track (Papers 4–6) is a preregistered study-design track and does not yet have runnable code.

1
Install
One pip command

All dependencies are standard scientific Python. No GPU libraries, no CUDA, no Docker.
pip install -r requirements.txt

2
ML Systems Track
Papers 1–3 · runnable

Each paper has a single entry point. All figures auto-generated from CSVs.
cd paper1-small-dataset-benchmark
python run_benchmark.py --fast
cd ../paper2-ml-curriculum
python run_analysis.py
cd ../paper3-cpu-efficiency
python run_experiments.py

3
CS Education Track
Papers 4–6 · preregistered

These are preregistered study designs with stated hypotheses, instruments, and analysis plans — pending IRB-approved cohort data. Analysis code will land in each paperN-*/ folder once collection begins.


12-month plan to
six publications.

Now · Status
ML-systems track drafts written · Education track preregistered · Site live

Papers 1–3: experiments run, CSVs generated, LaTeX drafts assembled, code in the repo. Papers 4–6: preregistered study designs (RQs, hypotheses, instruments, analysis plan) pending cohort data collection. Applied LA Trilogy labs in progress with Prof. Jeff Anderson.

Month 1
Submit Papers 2, 3, 5, 6 to arXiv · Email 5 professors

With 6 arXiv papers live, email professors at target PhD programs. Reference specific papers relevant to their work. Papers 4–6 open doors at education research faculty; Papers 1–3 open doors at ML systems faculty. You have both.

Month 2–4
Submit Paper 1 → NeurIPS ML Reproducibility · Paper 4 → SIGCSE 2026

SIGCSE is the most prestigious CS education venue. Paper 4's preregistered self-efficacy hypothesis is directly actionable for CS departments and has clear policy implications — exactly what SIGCSE reviewers value.

Month 4–6
Collect data for Papers 5 + 6 · Draft Applied LA Trilogy Paper I → PRIMUS

The Applied LA Trilogy's first lab (Morphs & Warps) becomes the target for PRIMUS, alongside Jeff's prior work. Papers 5 and 6 begin cohort data collection; the analysis scripts are pre-written against the preregistered plan.

Month 6–12
PhD Applications · Fall cycle

Apply to programs at the intersection of CS education, learning sciences, and HCI, with the ML-systems drafts, the preregistered education studies, and the Applied LA Trilogy as the portfolio. Target faculty identified on a separate working doc.