Research Agenda
Four hypotheses about the structure of introductory CS.
Seymour and Hunter's Talking About Leaving Revisited (2019) reports a finding that is usually quoted as a slogan and almost never treated as an empirical claim: students who leave STEM are not the ones who fail the coursework. Across 5,100+ interviews at 27 institutions, the measured departure predictors were teaching quality, weed-out culture, help-seeking suppression, and eroded belonging — not GPA or prior preparation. The four research questions below are the ones that finding forces you to ask if you take it seriously at a community college, where 45% of CS undergraduates begin and where the original study was never run. Each question resolves to a hypothesis with an observable quantity, a data source, and a way to be wrong.
Hypotheses, not slogans. Each question below states (a) a construct, (b) an operationalization (how I would measure it), (c) a falsifiable prediction, and (d) the null I would accept as a disconfirming outcome. The projects that address each question are detailed on the projects page; the measurement procedures are worked out in the methods appendix.
Target venues: SIGCSE · ICER · EDM · LAK · Learning @ Scale
Question 1 · Project P1
Do structural features of course design predict help-seeking suppression in LMS behavior?
Construct. Help-seeking suppression — the behavioral pattern in which students who are objectively stuck (long edit-compile-fail cycles, rising error rates, declining progress) do not post in the discussion forum, attend office hours, or reach out on Slack. Seymour & Hunter predict it is structurally caused; the Karabenick / Newman help-seeking literature predicts it tracks feature-level design choices.
Operationalization. A behavioral index built from LMS and discussion-forum logs: stuck-ratio = (long idle stretches within an edit-fail cycle) / (total idle stretches). Course-design features coded from syllabus, assignment prompts, and office-hours listings using the Tool-2 rubric (validated against P2). DFW outcomes from institutional records.
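The stuck-ratio as defined above reduces to a simple computation over idle-stretch events. A minimal sketch, assuming a hypothetical log shape of (duration, in-edit-fail-cycle) pairs and a 10-minute "long" threshold — both illustrative choices, not the study's actual schema:

```python
def stuck_ratio(idle_stretches, long_threshold_s=600):
    """stuck-ratio = (long idle stretches within an edit-fail cycle)
                     / (total idle stretches).

    idle_stretches: list of (duration_seconds, in_edit_fail_cycle) pairs
    extracted from LMS/IDE logs. The threshold and event shape are
    assumptions for illustration.
    """
    if not idle_stretches:
        return 0.0
    long_stuck = sum(
        1 for duration, in_cycle in idle_stretches
        if in_cycle and duration >= long_threshold_s
    )
    return long_stuck / len(idle_stretches)
```

One long stuck stretch out of four idle stretches, for example, yields a stuck-ratio of 0.25 regardless of how long the non-cycle stretches run.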
Hypothesis H1. In a sample of at least 6 CC CS course sections with ≥ 1200 enrolled students, course-level pedagogical debt score (composite of the four Tool-2 dimensions) will positively predict course-level help-seeking suppression rate, controlling for enrollment, instructor, term, and student prior-grade covariates, with β ≥ 0.2 and p < 0.05.
Disconfirming null. If the regression returns |β| < 0.1 or p ≥ 0.1 on a pre-registered specification, I accept that the structural-feature hypothesis is not supported at this level of measurement and report the negative result.
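For concreteness, the H1 specification reduces to an ordinary least-squares fit on section-level covariates. The sketch below runs on synthetic data with illustrative variable names and effect sizes; the pre-registered analysis would add instructor and term fixed effects as dummy columns and report proper inference (e.g. via statsmodels or R):

```python
import numpy as np

# Synthetic section-level data standing in for the real sample; all names
# and effect sizes here are illustrative assumptions, not study results.
rng = np.random.default_rng(0)
n = 12                                    # course sections
debt = rng.uniform(0.0, 1.0, n)           # Tool-2 composite pedagogical debt
enroll = rng.integers(80, 200, n)         # enrollment covariate
suppression = 0.3 * debt + 0.001 * enroll + rng.normal(0.0, 0.02, n)

# Design matrix: intercept, debt score, enrollment. Instructor, term, and
# prior-grade controls would enter as additional columns in the
# pre-registered specification.
X = np.column_stack([np.ones(n), debt, enroll])
beta, *_ = np.linalg.lstsq(X, suppression, rcond=None)
# beta[1] is the coefficient of interest for H1 (constructed here to be ~0.3)
```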
Methods: learning analytics · LMS log mining · regression · pre-registration
Question 2 · Project P4
Does MVC-distance in an instructor-annotated curriculum graph predict DFW rates and persistence?
Construct. MVC-distance — the gap between a real curriculum and its greedy Minimum Viable Curriculum on the typed-dependency graph of learning objectives (see Tool 3). The research claim is that curricula farther from the greedy MVC carry "redundant load" that falls disproportionately on students with the least academic slack.
Operationalization. Ten introductory CS course sequences at California community colleges, each independently annotated by two instructors, blind to each other's annotations, against a fixed 60-item objectives vocabulary. MVC-distance = (number of course modules in current sequence) − (greedy-MVC module count on the annotated graph). Outcomes = section-level DFW rate and one-year persistence from institutional data.
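The greedy-MVC count is a standard greedy set-cover computation over the annotated module-to-objectives map. A minimal sketch — it omits the typed-dependency (prerequisite-ordering) constraints of Tool 3, and the module/objective names are hypothetical:

```python
def greedy_mvc(modules, objectives):
    """Greedy cover of the objectives vocabulary by course modules.

    modules: dict mapping module name -> set of objectives it teaches.
    objectives: set of objectives the sequence must cover.
    Repeatedly picks the module teaching the most still-uncovered
    objectives (classic greedy set-cover approximation).
    """
    uncovered = set(objectives)
    chosen = []
    while uncovered:
        best = max(modules, key=lambda m: len(modules[m] & uncovered))
        if not modules[best] & uncovered:
            break  # remaining objectives are taught by no module
        chosen.append(best)
        uncovered -= modules[best]
    return chosen

def mvc_distance(modules, objectives):
    """Modules in the current sequence minus modules in the greedy MVC."""
    return len(modules) - len(greedy_mvc(modules, objectives))
```

A three-module sequence whose objectives a two-module greedy cover already reaches has MVC-distance 1 — one module of "redundant load."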
Hypothesis H2. Across the 10 sequences, Spearman correlation between MVC-distance and DFW rate ≥ 0.45, and inter-annotator Cohen's κ on objective labels ≥ 0.65.
Disconfirming null. If |ρ| < 0.2, or if κ falls below 0.55 (indicating the typed-dependency grammar is not reliably coded even by experts), I publish the negative result and retire MVC-distance as a candidate metric.
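The two reliability quantities thresholded in H2 (and again in H4) are standard. A pure-Python sketch for concreteness; a real analysis would use sklearn.metrics.cohen_kappa_score and scipy.stats.spearmanr:

```python
def cohen_kappa(a, b):
    """Cohen's kappa for two annotators' labels over the same items."""
    n = len(a)
    labels = set(a) | set(b)
    p_o = sum(x == y for x, y in zip(a, b)) / n       # observed agreement
    p_e = sum((a.count(c) / n) * (b.count(c) / n) for c in labels)
    return (p_o - p_e) / (1 - p_e)                    # chance-corrected

def spearman_rho(x, y):
    """Spearman rank correlation (average ranks for ties)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        i = 0
        while i < len(order):
            j = i
            while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
                j += 1
            avg = (i + j) / 2 + 1                     # 1-based average rank
            for k in range(i, j + 1):
                r[order[k]] = avg
            i = j + 1
        return r
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```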
Methods: curriculum graph annotation · inter-rater reliability · institutional data matching
Question 3 · Project P3
Do the Seymour & Hunter departure categories replicate at community colleges, and what do they miss?
Construct. Structural departure — the claim that STEM attrition at 2-year institutions is driven, as it is at 4-year institutions, by teaching quality, weed-out culture, eroded belonging, and help-seeking suppression, rather than by aptitude. The CC-specific twist is what Seymour & Hunter could not measure: transfer-path fracture, financial-stop-out, caregiving labor, and the interaction of those with course structure.
Operationalization. Semi-structured interviews with 24–30 students who enrolled in an intro CS course at Foothill or De Anza in the 2024–2025 academic year and did not re-enroll in the subsequent CS course. Transcripts double-coded against the S&H taxonomy plus a pre-registered set of CC-specific candidate categories (transfer-stop, caregiving-load, advising-gap). IRB submission Spring 2026, fieldwork Summer 2026.
Hypothesis H3. ≥ 70% of coded departure episodes map onto at least one S&H category; additionally, ≥ 20% of episodes require at least one CC-specific category to be fully described. Inter-coder κ on the combined codebook ≥ 0.70.
Disconfirming null. If the S&H taxonomy accounts for <50% of episodes, or if the CC-specific extensions are not reliably coded, I revise the theoretical frame before publication rather than squeezing the data to fit.
Methods: semi-structured interviews · grounded theory · double coding · IRB-reviewed study
Question 4 · Project P2
Can rule-based text features reach expert-level reliability for detecting pedagogical debt in syllabi?
Construct. Pedagogical debt — a composite of motivational framing, scaffolding visibility, verification structure, and belonging signals, operationalized in Tool 2. The research claim is that the construct is measurable from syllabus text alone at reliability high enough to be useful to instructors.
Operationalization. A corpus of 120 introductory CS syllabi from California community colleges (IRB-cleared, public-domain where possible). Three trained annotators score each syllabus against a rubric derived from the Tool-2 rules. Rule-based Tool-2 scores compared against the mean human score per dimension.
Hypothesis H4. (a) Inter-annotator Cohen's κ ≥ 0.65 on each of the four dimensions. (b) Spearman correlation between the Tool-2 rule-based score and the mean human score ≥ 0.70 on each dimension. (c) Tool-2 scores, entered as a single composite, predict end-of-term student belonging scores (Walton & Cohen instrument) in a 6-section subsample with β ≥ 0.15, p < 0.05.
Disconfirming null. If any of the three reliability thresholds fail, the tool is not validated; I report which dimensions failed, what their rules looked like, and what the failure tells us about the measurability of the underlying construct.
Methods: annotation study · NLP rule evaluation · belonging-instrument administration
What this program owes, and what it adds
What it owes. Seymour & Hunter (1997, 2019) for the finding this research program rests on. Harel (1998) for the necessity principle and the construct of intellectual need that motivates the pedagogical debt rubric. Walton & Cohen (2007, 2011) and Walton & Brady (2020) for the belonging-uncertainty framework. Margolis & Fisher (2002) and Margolis (2008) for the documentation of how CS culture suppresses belonging in non-default students. Karabenick, Newman, and Ryan for the help-seeking literature. Anderson for the Math 2BL model of necessity-first instructional design that translates Harel from mathematics to computing. This program does not propose a new theory; it proposes a measurement stack that makes the existing theories testable in a community-college setting.
What it adds. Three things. First, the typed-dependency grammar and MVC-distance metric (Q2) are a new operationalization of curriculum structure that is computable, instructor-auditable, and falsifiable. Second, the pedagogical debt rubric (Q4) translates Harel's necessity principle and the belonging literature into rule-level features that can be evaluated against expert annotation — a thing that, to my knowledge, has not been done at the syllabus-corpus level. Third, the research site itself: the Seymour & Hunter replication at California community colleges (Q3) is the first study I'm aware of that attempts to replicate TAL's structural-departure finding in a population where half of future STEM majors actually start.
What would make it wrong. If the instructor-annotated graphs in Q2 do not converge (κ < 0.55), the typed-dependency grammar is too unreliable for the program to stand. If the Tool-2 rules in Q4 do not correlate with human annotation (ρ < 0.5), the pedagogical debt construct is not measurable from text alone and the entire behavioral pipeline has to be re-grounded. If the Q3 interview study finds that S&H categories account for fewer than half of departure episodes, the original theoretical frame is not the right frame for the population and the program's connecting thread has to be rewritten. Each of these is a live possibility, and each is built into the project designs to be measured rather than avoided.
Roadmap
Now — Spring 2026
IRB Protocol + Corpus Construction
Draft IRB materials for P3 interview study (Foothill / De Anza). Assemble public-syllabus seed corpus for P2 (target n = 120; collection continues through summer). Finalize annotation schema for both Q4 and Q2. Submit PhD program applications.
Summer 2026
P3: Departure Interviews
Conduct 24–30 semi-structured interviews with students who left STEM at Foothill. Code against the S&H taxonomy. Begin grounded theory analysis.
Fall 2026
P2: Annotation Study + P1: LMS Analysis
Recruit expert annotators for P2 and begin evaluating the Tool-2 rule-based scores against their annotations. In parallel, analyze LMS data for P1 help-seeking feature extraction.
Spring 2027
First Submissions
Submit P3 (interview study) to ICER 2027. Submit P2 (SyllabusAudit) to Learning @ Scale 2027. Pilot P4 instructor annotation study.
PhD Program
Dissertation Research
Integrate P1–P5 into a coherent dissertation on structural predictors of help-seeking and STEM departure at community colleges.
Research and Teaching as One Practice
These research questions did not arise from a literature review. They arose from watching students leave — from financial aid offices, counseling appointments, tutoring sessions, and learning communities where I saw the same structural patterns from every institutional vantage point. The research formalizes what I observed. The teaching practice attempts to fix it.
Every course I design is a potential research site: the Build a Computer from Scratch project creates a natural laboratory for studying help-seeking behavior (Q1), the effects of physical computing on belonging (Q3), and whether constructionist curriculum design measurably reduces the structural departure patterns that Seymour and Hunter documented (Q3). The build journal entries are qualitative data. The milestone completion patterns are quantitative data. The three-track system is a testable belonging intervention (P5). The research and the teaching are the same activity, observed from different angles.
See the full curriculum site for the enacted version of this research agenda.
Last updated: April 2026