Research / Modeling Bench
Applied LA · Hands-On Pedagogy · CS Ed × AI
3-Paper Research Program · v0.1

The Modeling Bench

Three first-author papers at the intersection of applied linear algebra, hands-on lab pedagogy, and CS education in the age of AI. Henry is the principal author and driver of all three papers; Prof. Jeff Anderson (Foothill College) is his research mentor and second author on each. This is a mentee-led research program, not a funded academic lab — every direction below is anchored to a written invitation in Anderson's published work or the Math 2BL Spring 2026 deliverables document.

Principal author
Henry Fan
Mentor
Prof. Jeff Anderson
Started
2026-04-10
Target venues
PRIMUS · College Math Journal · J. Computing in Higher Ed

The lab exists to ship three publishable papers within twelve months while building the technical, narrative, and visual fluency needed to lead a CS-Education / Learning-Sciences PhD. The unit of work is the paper. Everything in this site exists to remove friction between an idea and a submitted manuscript.

Operating Principle

Every week, two things must move forward: (a) a measurable artifact in at least one paper (data, code, hardware photo, draft section), and (b) a five-minute mentor-ready update. If neither moved, the week is a regression — diagnose why before doing anything else.

Constraints we accept

  • Henry has a full-time CVC-OEI Application Support Analyst role at FHDA. Research is the second shift.
  • Jeff's time is precious. Mentor touchpoints are async-first, asks-only, and never longer than 10 minutes of his reading time.
  • No grant funding. Hardware purchases must be justified against a $200/quarter cap.
§01 Mentor & Lineage

All three papers are deliberate extensions of work Jeff has already published or has explicitly flagged as open in his Math 2BL deliverables doc. We are not pitching cold — we are stepping into invitations he has already issued in writing.

Anchor

Make Eigenvalues Resonate (MER)

Source

Anderson (2018), PRIMUS, doi.org/10.1080/10511970.2018.1484400. Math 2BL S26 deliverables doc explicitly states: "Project To-Do List: Under development (maybe you can help me develop this in spring 2026)?"

How we extend it

Paper 01 — open-hardware kit + computer-vision tracking + 3-DOF extension.

Anchor

Neural Networks Towards Machine Learning

Source

Math 2BL S26 deliverables doc, p.14: "Underdevelopment. Come back later." Project does not yet exist as either curriculum or paper.

How we extend it

Paper 02 — build the missing project from scratch with an SVD-first lens, validated against USPS digits.

Anchor

LANA + Anti-Racist Learner-Centered Objectives

Source

Anderson (2024), PRIMUS, doi.org/10.1080/10511970.2024.2369984; Anderson (2024) blog post on five anti-racist learner-centered objectives.

How we extend it

Paper 03 — design and evaluate an AI modeling co-tutor scoped to Jeff's 8-step modeling process and audited against his anti-racist objectives.

What Jeff cares about

Read this list before every mentor email.

Hands-on verification

Every claim must be checkable by a student at a kitchen table. If a student cannot verify it without a teacher, it doesn't ship.

Open access

Curriculum, code, hardware specs, and data should be free to remix. No paywalled supplements.

Transferable skills over content coverage

"I teach students how to learn, and I do it using linear algebra." Frame contributions in those terms.

Anti-racist, learner-centered framing

Jeff has five published learner-centered objectives. Every paper should make explicit which objective it serves.

The "$25B eigenvector" tone

Jeff loves papers that make abstract LA concretely consequential — the way Bryan & Leise framed PageRank. Match that storytelling register.

The 8-step modeling process

Find → mathematize → state ideal model → solve → analyze → verify → transfer → iterate. Use these section labels in drafts so Jeff can review fluently.

Three independent contributions; one shared aesthetic. Each paper has a working title, abstract draft, methodology, risks, and a publishable-vs-mediocre line.

Paper 01 · Hardware · Curriculum · PRIMUS

MER 2.0 — An Open-Hardware, Computer-Vision Lab Kit for Coupled-Oscillator Modeling in Lower-Division Linear Algebra

Abstract (draft v0)

[DRAFT — pilot data section marked as forthcoming.] Anderson (2018) introduced the Make Eigenvalues Resonate (MER) project as a hands-on bridge from introductory linear algebra to vibrations analysis using a spring-coupled pair of pendula. Eight years later, the project still depends on ad-hoc hardware and bespoke video analysis, which limits adoption outside Anderson's own classroom. We present MER 2.0: a fully reproducible, sub-$80 hardware bill of materials, an open-source OpenCV tracking pipeline that runs from any phone-recorded video, and a three-mass extension that lets students see — and verify by measurement — the eigenstructure of a system with multiple natural frequencies. A single-section pilot in Math 2BL at Foothill College is planned for spring 2026; pilot results (kit assembly, eigenvalue-vs-measurement agreement, and self-reported gains against Anderson's five anti-racist learner-centered objectives) will be reported in the full draft. We argue that low-cost reproducible hardware is the binding constraint on the spread of modeling-rich linear algebra labs, and offer MER 2.0 as a template for hardening other projects in the Math 2BL family.

Research questions
  • RQ1: Can a sub-$80 BOM reproduce — within tolerable error — the same eigenvalue-vs-measurement comparisons that the original MER apparatus achieved?
  • RQ2: Does adding a third mass (and therefore a third visible eigenmode) qualitatively improve students' intuition that eigenvalues are physical things, not just symbols?
  • RQ3: What is the smallest hardware + software footprint that lets a community-college student verify their own model without a teacher present?
Methodology

Hybrid (engineering + classroom). (i) Design and bench-test a v2 hardware kit. (ii) Build the OpenCV pipeline as a single Python script + a Colab notebook with zero local install. (iii) Derive the 2-mass and 3-mass equations of motion, diagonalize, and validate predictions against tracked motion. (iv) Run a single-section pilot in Math 2BL Spring 2026 with pre/post survey on Anderson's learner-centered objectives. (v) Publish kit, code, and curriculum under CC-BY + MIT.
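The tracking step in (ii) is small enough to sketch now. A minimal single-marker color-blob version, assuming one brightly colored marker per mass (the HSV bounds, file names, and single-marker scope are illustrative — the real pipeline must track two or three markers per frame):

```python
import csv
import numpy as np

def blob_centroid(mask):
    """Centroid (x, y) of nonzero pixels in a binary mask, or None if empty."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    return float(xs.mean()), float(ys.mean())

def track_marker(video_path, hsv_lo, hsv_hi, out_csv="positions.csv"):
    """Track one colored marker through a phone video; write (t, x, y) to CSV.
    hsv_lo / hsv_hi are assumed HSV bounds for the marker color."""
    import cv2  # imported here so blob_centroid stays NumPy-only
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if metadata is missing
    rows, frame_idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        mask = cv2.inRange(hsv, np.array(hsv_lo), np.array(hsv_hi))
        c = blob_centroid(mask)
        if c is not None:
            rows.append((frame_idx / fps, c[0], c[1]))
        frame_idx += 1
    cap.release()
    with open(out_csv, "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(("t", "x", "y"))
        w.writerows(rows)
    return rows
```

The pixel-to-meter scale conversion against the known reference in frame, and the fitted-frequency JSON, layer on top of this CSV.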

Required math & CS

Eigenvalue/eigenvector decomposition; small-angle linearization; coupled ODEs; OpenCV (BGS, optical flow, or color-blob tracking); basic statistics (RMSE, paired t-test); LaTeX.

Step-by-step execution plan
  1. Wk 1–2: Reread Anderson (2018) end-to-end. Recreate the original 2-mass derivation by hand. Annotate every figure.
  2. Wk 3–4: Bench prototype: dowel + fishing line + neoprene springs + 3D-printed mass holders. Phone-camera recording on a tripod against a known scale. Cost ledger updated.
  3. Wk 5–6: OpenCV pipeline. Single Python script that ingests an .mp4 and outputs CSV of (t, x1, x2, x3) plus a fitted-frequency JSON. Colab port.
  4. Wk 7: Derive 3-mass case symbolically (SymPy). Diagonalize. Compare to measured frequencies. Iterate hardware until RMSE < 5%.
  5. Wk 8: Lock the BOM. Photograph the kit. Write the curriculum sheet (student-facing 8-step modeling worksheet).
  6. Wk 9: Pilot in one Math 2BL section. Pre/post survey. Collect at least one student-built kit photo per group.
  7. Wk 10: Draft v1 of paper following PRIMUS structure (Motivation → Theory → Activity → Student Work → Reflection).
  8. Wk 11: Mentor-review pass with Jeff. Revisions.
  9. Wk 12: Submit to PRIMUS.
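The Wk 7 derivation will live in SymPy; the numeric sketch below previews the same eigenfrequency computation, under assumed parameters (pendulum length, spring-to-mass ratio, and the path-graph coupling pattern are illustrative, not the locked BOM values):

```python
import numpy as np

# Assumed parameters -- illustrative, not the locked BOM values.
g, L = 9.81, 0.40        # gravity (m/s^2), pendulum length (m)
k_over_m = 2.5           # spring constant / mass (1/s^2)

# Linearized small-angle model for three spring-coupled pendula:
# theta'' = A @ theta, where K is the path-graph coupling Laplacian.
K = np.array([[1, -1, 0], [-1, 2, -1], [0, -1, 1]], dtype=float)
A = -(g / L) * np.eye(3) - k_over_m * K

lam, V = np.linalg.eigh(A)       # A is symmetric: real eigenpairs
omegas = np.sqrt(-lam)           # natural angular frequencies (rad/s)
for w, v in zip(omegas, V.T):
    print(f"f = {w / (2 * np.pi):.3f} Hz, mode shape = {np.round(v, 3)}")
```

Because A is a scalar shift of K, the three mode shapes are the Laplacian's own eigenvectors: all-in-phase, end masses out of phase, and middle-versus-ends — exactly the shapes students should see in the tracked footage.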
Risks & failure modes
  • Hardware drift. Cheap springs have nonlinear and temperature-sensitive constants — RMSE may explode. Mitigation: characterize each spring before assembly; report uncertainty bounds.
  • Tracking failures. Phone video at 30 fps may alias the higher modes. Mitigation: 60 fps minimum; verify Nyquist at design time.
  • IRB / consent. Pre/post survey of students requires Foothill IRB exemption. Start the paperwork in Wk 1, not Wk 8.
  • Scope creep. Resist the urge to also rewrite the LANA paper. Three-mass case is the ceiling.
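The Nyquist check in the tracking-failure bullet can be encoded once and run against every predicted mode table at design time. A trivial sketch (the 4× margin is an assumed comfort factor for noisy phone footage, not a derived bound; 2× is the bare Nyquist limit):

```python
def fps_ok(fps, natural_freqs_hz, margin=4.0):
    """Design-time sampling check: require fps >= margin x the highest
    predicted mode frequency. margin=2 is the bare Nyquist limit;
    margin=4 is an assumed comfort factor for noisy phone footage."""
    return fps >= margin * max(natural_freqs_hz)
```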
Publishable vs mediocre

Mediocre: Here is a cheaper version of MER, and students liked it.

Publishable: Here is the smallest verifiable footprint at which a student can check eigenvalue theory against the physical world without a teacher; here is the data; here is the BOM you can buy from Amazon today.

The money figure

Three-mass system phase-space animation overlaid with predicted vs. measured trajectories — same plot, different colors, RMSE annotated.

Kanban
Backlog
  • Hardware BOM v0
  • OpenCV pipeline v0
  • Wk 1 reread
  • IRB exemption form
In Progress
In Review
Done
Paper 02 · Curriculum · Empirical · PRIMUS

From Matrices to Networks — A Linear-Algebra-First, Hands-On Introduction to Neural Networks for Community College Students

Abstract (draft v0)

[DRAFT — artifact and pilot sections marked as forthcoming.] The Math 2BL curriculum at Foothill College lists a 'Neural Networks Towards Machine Learning' project that has been marked under development for several years. We propose the first complete version of this project. Rather than treating neural networks as a separate paradigm bolted onto a linear algebra course, we frame a single-hidden-layer network as a sequence of matrix operations whose learned weight matrices have geometrically interpretable singular values. Students train a network on the USPS handwritten digit dataset, then perform an SVD on each weight matrix and visualize the top singular vectors as eigen-strokes that the network has learned to detect. We argue this inversion — train first, decompose second — gives community college students an honest, mechanistic account of what a neural network is doing without requiring multivariable calculus. The laboratory worksheet, Colab notebook, and eight-step modeling rubric are under development; a small pilot study of student conceptual gains is planned for a Math 2BL section and will be reported in the full draft once the activity is stable.

Research questions
  • RQ1: Can a single-hidden-layer NN trained on USPS digits be unpacked using only Math 2B-level linear algebra (SVD, eigen-decomposition, basis change)?
  • RQ2: Do students who follow the train-then-decompose path develop more correct mental models of 'what the network learned' than students who follow a calculus-first backprop path?
  • RQ3: What is the minimum-viable hands-on activity that gives a community college student a mechanistically honest first encounter with a neural network?
Methodology

Hybrid. (i) Build the activity end-to-end as a Colab notebook (NumPy only — no PyTorch/TensorFlow for v1, to keep the linear algebra unobstructed). (ii) Train on USPS, achieve a baseline accuracy >90%. (iii) Compute and visualize the SVD of W₁ as 16×16 grayscale eigen-strokes. (iv) Build a paired worksheet that walks a student through the modeling process using Anderson's 8 steps. (v) Pilot with 6–10 Math 2BL students; compare against a control reading group that does the standard backprop tutorial. (vi) Score conceptual gains with a custom 10-item instrument we develop and pilot.
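Step (i)'s NumPy-only loop is small enough to sketch in full. The version below hedges on details the pilot will fix: layer width, learning rate, and the tanh hidden activation are placeholders, and USPS loading is omitted.

```python
import numpy as np

def train_mlp(X, y, hidden=64, classes=10, lr=0.5, epochs=200, seed=0):
    """Single-hidden-layer network trained with hand-written NumPy gradients.
    Returns (W1, b1, W2, b2); W1 is the matrix we later feed to the SVD."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.1, (hidden, classes)); b2 = np.zeros(classes)
    Y = np.eye(classes)[y]                       # one-hot labels
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                 # hidden layer
        Z = H @ W2 + b2                          # logits
        P = np.exp(Z - Z.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)        # row-wise softmax
        dZ = (P - Y) / n                         # softmax cross-entropy gradient
        dH = dZ @ W2.T
        dPre = dH * (1 - H ** 2)                 # tanh derivative
        W2 -= lr * (H.T @ dZ); b2 -= lr * dZ.sum(axis=0)
        W1 -= lr * (X.T @ dPre); b1 -= lr * dPre.sum(axis=0)
    return W1, b1, W2, b2

def predict(X, W1, b1, W2, b2):
    """Class labels from the trained parameters."""
    return (np.tanh(X @ W1 + b1) @ W2 + b2).argmax(axis=1)
```

Every update is an explicit matrix product — nothing is hidden in an autograd engine — which is exactly the pedagogical point of forbidding PyTorch in v1.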

Required math & CS

SVD, eigenvalue/eigenvector, basis change, matrix-vector products as linear maps, gradient descent at the level of 'minimize loss by walking downhill,' NumPy, Colab, basic data viz (matplotlib). Critically, no chain rule prerequisite.

Step-by-step execution plan
  1. Wk 1–2: Read Strang Linear Algebra and Learning from Data §I.8 + Nielsen Ch. 1–3. Side-by-side notes on which steps require calculus.
  2. Wk 3: Build the NumPy-only training loop. Achieve >90% USPS test accuracy.
  3. Wk 4: Compute SVD of trained W₁. Visualize top-16 left singular vectors as 16×16 images. Confirm they look like strokes/curves, not noise.
  4. Wk 5: Draft the 8-step modeling worksheet. Each step maps to a Math 2B concept.
  5. Wk 6: Build the conceptual instrument (10 items, mix of multiple-choice and short-answer). Pilot it on yourself + 2 friends.
  6. Wk 7–8: Run the pilot in Jeff's class. Collect pre/post.
  7. Wk 9: Score, analyze, write results section.
  8. Wk 10–11: Draft full paper. Mentor pass.
  9. Wk 12: Submit to PRIMUS.
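The Wk 4 reveal — SVD of the trained first-layer weights, reshaped to images — is mechanically simple. A sketch, assuming W₁ has one row per input pixel as in the 16×16 USPS encoding:

```python
import numpy as np

def eigen_strokes(W1, k=16, side=16):
    """Top-k left singular vectors of W1, reshaped to side x side images.
    Assumes W1 has shape (side*side, hidden), one row per input pixel."""
    U, S, Vt = np.linalg.svd(W1, full_matrices=False)   # S is descending
    imgs = U[:, :k].T.reshape(k, side, side)
    return imgs, S[:k]

# Display sketch (matplotlib assumed available in Colab):
#   imgs, S = eigen_strokes(W1)
#   fig, axes = plt.subplots(4, 4)
#   for ax, img in zip(axes.flat, imgs):
#       ax.imshow(img, cmap="gray"); ax.axis("off")
```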
Risks & failure modes
  • The eigen-strokes don't look like strokes. If the SVD of W₁ produces noise, the central pedagogical claim collapses. Mitigation: prototype this in Wk 1 before committing. Fallback: use a deeper net and SVD the activations of the first layer instead of the weights.
  • n is too small for statistics. A community-college pilot will give n < 30. Treat this as design-based research, not RCT. Frame as a case study per PRIMUS norms.
  • Reviewers ask 'why no PyTorch'. Have a one-sentence answer: NumPy keeps the linear algebra visible; PyTorch's autograd hides exactly the matrix structure we are trying to teach.
Publishable vs mediocre

Mediocre: Here is a Colab that trains an MLP on MNIST.

Publishable: Here is the first activity that lets a precalculus-only student look inside a trained network and see linear algebra primitives staring back.

The money figure

Top-16 left singular vectors of trained W₁, displayed as a 4×4 grid of 16×16 grayscale images that visibly look like strokes and curves.

Kanban
Backlog
  • NumPy training loop
  • SVD viz prototype
  • 10-item instrument
In Progress
In Review
Done
Paper 03 · HCI · CS Ed × AI · Design-Based Research · J. Computing in Higher Ed / PRIMUS

Anti-Racist AI Tutoring in the Applied Modeling Classroom — A Design and Evaluation Framework

Abstract (draft v0)

[DRAFT — pilot and findings sections marked as forthcoming.] Large language models have entered college mathematics classrooms faster than instructors have been able to evaluate them. Most off-the-shelf LLM tutors optimize for final-answer correctness, which directly undermines the goals of an authentic mathematical-modeling course where the verification, transfer, and iteration steps are the entire point. We describe a design-based research study in which we constrain a frontier LLM to act as a modeling co-tutor scoped to Anderson's eight-step applied modeling process and audited against his five published anti-racist learner-centered objectives. We present (i) the system architecture — a thin web app with prompt scaffolding and refusal heuristics — and (ii) the audit framework: a rubric mapping each learner-centered objective to LLM behaviors that satisfy or violate it. A two-arm pilot with Math 2BL students on the LANA and MER projects is planned; student artifacts will be coded against the audit rubric and compared to an unconstrained baseline in the full draft. The design question we are pursuing is not 'will it cheat for students,' but 'which student-led practices does the tutor make easier?'

Research questions
  • RQ1: Can a frontier LLM be prompted to refuse final-answer requests and instead elicit student work in each of Anderson's eight modeling steps?
  • RQ2: Do students using the constrained tutor produce artifacts that better satisfy Anderson's five anti-racist learner-centered objectives than students using an unconstrained baseline?
  • RQ3: What categories of LLM failure are most dangerous in an authentic modeling classroom, and how should they be flagged in real time?
Methodology

Design-based research. (i) Build a scoped LLM tutor as a single-page web app calling the Anthropic / OpenAI API with a system prompt that encodes the 8 steps and refusal heuristics. (ii) Develop an audit rubric: a 5×3 matrix where rows are Anderson's learner-centered objectives and columns are supports / neutral / violates. (iii) Two-arm pilot with Math 2BL volunteers — constrained tutor vs. baseline LLM — on a shared LANA or MER task. (iv) Code the resulting student artifacts against the audit rubric. (v) Semi-structured interviews with 4–6 students about what the tutor made easier and what it foreclosed.
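The refusal behavior in (i) will live mostly in the system prompt, but the per-turn logging tag (which modeling step, whether a deflection fired) can be a plain function feeding the artifact coding in (iv). A minimal sketch — the keyword lists are purely illustrative placeholders for whatever the Wk 3 question bank reveals, not the deployed heuristics:

```python
# Assumed keyword heuristics; the deployed tutor relies on the system
# prompt, with this serving only as a cheap pre-filter and logging tag.
STEPS = ["find", "mathematize", "state ideal model", "solve",
         "analyze", "verify", "transfer", "iterate"]

ANSWER_SEEKING = ("what is the answer", "just tell me", "final answer",
                  "solve it for me", "give me the solution")

def tag_turn(message: str) -> dict:
    """Tag a student turn with the modeling step it targets and whether
    the scoped tutor should deflect a final-answer request."""
    text = message.lower()
    step = next((s for s in STEPS if s in text), None)
    refuse = any(p in text for p in ANSWER_SEEKING)
    return {"step": step, "refuse": refuse}
```

Every logged turn then carries the same two fields the audit rubric cares about, so the transcript coding in (iv) starts from structured data rather than raw chat logs.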

Required math & CS

LLM API integration (Anthropic SDK or OpenAI SDK); prompt engineering; design-based research methodology; qualitative coding (deductive, rubric-driven); enough linear algebra to evaluate the LANA/MER artifacts students produce. Web stack: vanilla JS or React + a serverless function for the API key, deployed on GitHub Pages + a tiny Cloudflare Worker.

Step-by-step execution plan
  1. Wk 1–2: Operationalize Anderson's five learner-centered objectives into a 5×3 audit rubric. Get Jeff's sign-off in one async pass.
  2. Wk 3: Build the system prompt v0. Test against 20 hand-written student questions drawn from MER + LANA.
  3. Wk 4–5: Build the SPA. Logging baked in: every turn captured + tagged with which step of the 8-step process the student was in.
  4. Wk 6: Pilot dry run with one friendly Math 2BL student. Iterate the prompt.
  5. Wk 7–8: Recruit pilot. Run two-arm study. IRB exemption needed early.
  6. Wk 9: Code the artifacts against the rubric. Inter-rater reliability with Jeff on a 20% sample.
  7. Wk 10–11: Draft. Consider J. Computing in Higher Ed, PRIMUS, or Mathematical Thinking and Learning as targets.
  8. Wk 12: Submit.
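The Wk 9 inter-rater check needs an agreement statistic; Cohen's kappa is the conventional choice for two raters and fits in a few lines. (Unweighted kappa is an assumption here — a weighted variant may suit the ordinal supports/neutral/violates scale better.)

```python
import numpy as np

def cohens_kappa(a, b):
    """Cohen's kappa for two raters' categorical codes on the same items."""
    a, b = np.asarray(a), np.asarray(b)
    cats = np.union1d(a, b)
    po = np.mean(a == b)                                   # observed agreement
    pe = sum(np.mean(a == c) * np.mean(b == c) for c in cats)  # chance agreement
    if pe == 1:
        return 1.0  # degenerate case: both raters used a single category
    return (po - pe) / (1 - pe)
```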
Risks & failure modes
  • The tutor refuses too much and is unusable. Mitigation: pilot the prompt against student questions in Wk 3, not Wk 7. Set a refusal-rate ceiling.
  • n is small & selection-biased. Frame as design-based research, not causal. Lean on the artifact analysis, not on hypothesis tests.
  • API key leakage. Never put the API key in client-side code. Use a Cloudflare Worker proxy with a per-IP rate limit.
  • Model deprecation mid-study. Pin the model snapshot. Document the exact ID in the methods section.
Publishable vs mediocre

Mediocre: We prompted ChatGPT to be a math tutor and students liked it.

Publishable: We took a published, named, anti-racist learner-centered framework, operationalized it into a behavioral audit rubric, scoped a frontier LLM against an 8-step modeling process, and produced evidence that constraint quality matters more than model quality.

The money figure

Side-by-side conversation transcripts — same student question to baseline LLM vs. constrained tutor — with the 5×3 rubric color-filling cell-by-cell as the conversation progresses.

Kanban
Backlog
  • Audit rubric v0
  • System prompt v0
  • SPA scaffold
  • IRB exemption
In Progress
In Review
Done
Why these three and not others

A longer list of seven candidates (image morphs/warps, Eigenfaces, P-block 2.0, Computer Graphics & 3D Animations, Cubic Splines, GPS Least-Squares, an Anderson-curriculum effectiveness study) is preserved in long-list-of-candidates.md for future quarters. The three above were chosen because each is anchored to a specific written invitation from Jeff (a published paper to extend, an “under development” project to build, or a published framework to operationalize), and because together they span hardware, curriculum, and HCI — the trifecta a CS Ed / Learning Sciences / HCI PhD application needs.

Eight protected hours per week, distributed so no single paper starves and no single day eats Henry alive. The day-of-week assignments are templates, not laws.

Monday
Deep Work A
06:00–07:30 · 1.5h
  • Active drafting on the paper at the front of the queue this week
  • One LaTeX commit minimum
  • End with a one-sentence 'what's next'
Tuesday
Experiment / Code
06:00–07:30 · 1.5h
  • Write or run the next experiment
  • Commit raw output to experiments/<date>/
  • Update the experiment ledger
Wednesday
Read & Annotate
06:00–07:00 · 1h
  • One paper, one pass, one page of notes
  • Save into reading/<first-author-year>.md
Thursday
Deep Work B
06:00–07:30 · 1.5h
  • Active drafting on the second paper in queue
  • Same rules as Monday
Friday
Mentor Touchpoint Prep
06:00–06:45 · 0.75h
  • Write the weekly mentor email (do not send yet)
  • Sit with it overnight; trim by 30%
Saturday
Long Block
08:00–10:00 · 2h
  • Biggest single thing of the week (hardware build, multi-step experiment, full draft pass)
  • Send the trimmed mentor email at end of session
Sunday
Review & Plan
19:00–19:45 · 0.75h
  • Weekly review (template)
  • Update kanbans on each paper
  • Set the queue order for next week
Always-on
Slack Time
~1h floating
  • Reading on commute
  • Voice memos for video scripts
  • Capture into Second Brain

Mentor Protocol

Keep Jeff's load under 10 min/week.

Async first, sync rare

One email per week, max one in-person meeting per month. Mon 8:00–8:40 PM PST is the standing call slot.

Asks-only emails

Each email contains exactly one decision needed from Jeff, a TL;DR, and links — never an open-ended 'thoughts?'

Default to drafts

If you would normally ask 'should I…?', instead send a working draft and ask 'anything I should change before I commit?'

Mentor budget

If you have used >30 minutes of Jeff's time this month, the next ask must be a status update, not a question.

Always include your proposed answer

Lets Jeff reply with a single character if he agrees.

A persistent, paper-tagged journal. Entries live in localStorage on this device — they survive page reloads but should be exported to git monthly via Export JSON.

    §05 Video Pipeline

    One companion video per paper, each in the 6–9 minute range. The video is not a marketing afterthought — it is a forcing function for clarity. If you cannot animate the core idea, you do not yet understand it.

    P1
    MER 2.0
    Core visual insight

    An eigenvector is a shape the system likes to vibrate in. The shape is physical, not symbolic.

    Tools
    Manim (math) · Blender (hardware shots) · DaVinci Resolve (edit) · Audacity (VO) · Affinity Designer (thumbnails)
    Animation ideas
    1. Phase-space animation of a 2-mass system: random initial condition, then projected onto each eigenmode, side by side.
    2. Slow zoom from real pendulum footage → tracked dots → derived (x1,x2) trajectory → diagonalized (q1,q2) trajectory.
    3. The diagonalization step animated as a literal rotation of the coordinate axes.
    4. Transition: 2-mass → 3-mass, showing a third mode 'appearing' as a new direction in state space.
    5. Final: BOM laid out flat with prices, pan to a student running the Colab.
    P2
    Matrices → Networks
    Core visual insight

    The trained network is just a stack of matrices, and the SVD shows you what each layer 'looked for.'

    Tools
    Manim · 3Blue1Brown-style typography · NumPy + matplotlib captures · OBS for screen captures · DaVinci Resolve
    Animation ideas
    1. A USPS digit getting flattened into a vector, then matrix-multiplied — show the matrix lighting up cell by cell.
    2. Training animated as the loss surface dimming, with W₁ updating each frame.
    3. The reveal: SVD of W₁, top-16 left singular vectors arranged as 16×16 images, sorted by σ — strokes emerge.
    4. Two networks side by side: one trained on USPS, one on Fashion-MNIST. Eigen-strokes look totally different. Visceral.
    5. Closing: a precalculus student walks through one full SVD slide.
    P3
    AI Modeling Tutor
    Core visual insight

    An AI tutor's value is what it refuses, not what it answers.

    Tools
    OBS (screen capture) · After Effects or Manim for the rubric viz · DaVinci Resolve · Affinity Designer for lower-thirds
    Animation ideas
    1. Split-screen: same student question to baseline ChatGPT vs. the constrained tutor. Show the conversation diverge.
    2. The 5×3 audit rubric drawn live, color-filling cell by cell as the conversation progresses.
    3. '8 steps' as a horizontal track at the bottom, lighting up in real time as the student moves through the modeling process.
    4. A failure case: the tutor giving a too-confident wrong answer, with the audit rubric instantly going red. Honest reporting.
    5. Closing voiceover from a real student about what the tutor made easier.

    Standard 7-min script structure

    0:00–0:20 · Hook: A single image or claim that would make a Math 2B student stop scrolling. No throat-clearing.
    0:20–1:00 · The question: Whose problem? Why unsolved? Why care in 30 seconds?
    1:00–2:30 · Minimum LA: Smallest concept set. ONE Manim animation of the core idea.
    2:30–4:30 · The contribution: Hardware / code / framework, shown working. Real footage, real numbers.
    4:30–6:00 · What the data says: Pilot results, with honest uncertainty. Show the failure cases.
    6:00–6:30 · What this changes: One sentence about the bigger frame: classroom practice, equity, AI in education.
    6:30–7:00 · Call to action: Link to the paper, the GitHub, the Colab, an invitation to remix.

    House rules — translating LA to visuals

    Always show the basis

    Vectors are nothing without an explicit coordinate system on screen. Animate the basis change, do not just write 'let v = Px.'

    Color = direction, brightness = magnitude

    Lock this convention across all three videos so viewers transfer intuition between them.

    Diagonalization is a rotation

    Whenever a matrix is diagonalized, animate the rotation. Never let P⁻¹AP be a static formula.

    Eigenvectors are arrows that survive

    Animate by transforming a circle of vectors, then highlighting only the ones that did not change direction.

    SVD is two rotations and a stretch

    Always animate UΣVᵀ in this geometric sequence, never as three matrices being multiplied.

    One whiteboard, one camera, one voice per video

    Resist the temptation to over-edit. Production value comes from clarity of thought, not transitions.

    “The first commit before sundown. The lab is real the moment it has a git history.”

    Modeling Bench · Operating Rule