What CS336 Actually Says
Stanford's CS336 course has published a CLAUDE.md file that spells out what AI agents may and may not do on the course's assignments. The rules are specific:
AI agents MAY:
- Explain concepts when students are stuck
- Point to relevant lecture materials and documentation
- Review code and suggest improvements via dialog (not a PR)
- Help debug by asking guiding questions
- Suggest sanity checks, profiler investigations, and assertions
AI agents CANNOT:
- Write any Python or pseudocode
- Complete TODO sections in assignment code
- Edit code in the student repo
- Give solutions or direct implementations
- Refactor large portions of code into finished solutions
The guiding philosophy: AI agents should be teaching assistants, not ghost writers. When students ask for fixes, the agent must ask guiding questions and suggest next steps — never implement them.
The "Ask Guiding Questions" Pattern
Here's what a good AI agent interaction looks like under CS336 rules:
Student: "My causal mask seems wrong and training blows up. Please tell me what my mistake is."
Agent: "My role is to help guide you to understanding, not to give you the answers directly. What have you tried so far?"
Student: "I have tried running a single attention layer, but it still does not work."
Agent: "Check three things: whether the mask is applied before softmax, whether it broadcasts to the score tensor shape you expect, and whether masked positions become a very negative value rather than zero. A good sanity test is a toy sequence of length 3 where you print the attention scores before and after masking."
The agent guides toward understanding rather than delivering a solution. Every response preserves the learning process.
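For readers who want to see what the suggested sanity test could look like once the student writes it, here is a minimal sketch. It is not taken from the CS336 materials: it uses plain Python lists instead of a tensor library, the score values are arbitrary, the function names are hypothetical, and it assumes "a very negative value" means negative infinity.

```python
import math

NEG_INF = float("-inf")  # assumption: the "very negative value" used for masked slots


def causal_mask(scores, fill=NEG_INF):
    """Return a copy of `scores` where query i cannot see key j when j > i."""
    n = len(scores)
    return [
        [scores[i][j] if j <= i else fill for j in range(n)]
        for i in range(n)
    ]


def softmax(row):
    """Numerically stable softmax over a single row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


# Toy attention scores for a sequence of length 3 (arbitrary values).
scores = [
    [0.5, 1.0, -0.3],
    [0.2, 0.1, 0.7],
    [-0.4, 0.9, 0.3],
]

# Correct order: mask first, then softmax.
masked = causal_mask(scores)
weights = [softmax(row) for row in masked]

# Common mistake: filling masked slots with zero still leaks weight to the future.
leaky = [softmax(row) for row in causal_mask(scores, fill=0.0)]

print("scores before masking:")
for row in scores:
    print(row)

print("scores after masking:")
for row in masked:
    print(row)

print("attention weights (masked with -inf):")
for row in weights:
    print([round(w, 3) for w in row])

print("attention weights (masked with 0.0, incorrect):")
for row in leaky:
    print([round(w, 3) for w in row])
```

In the correct version every weight above the diagonal is exactly zero and each row still sums to one. In the zero-filled version the first row assigns nonzero weight to positions 1 and 2, which is the kind of leak the agent's third question is meant to surface.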
Why This Matters Beyond Academia
CS336's guidelines aren't just about academic integrity — they describe a real engineering problem that every product team faces: how do you use AI tools without letting them own the solution?
In production environments, the same tension shows up when:
- Junior developers prompt AI to scaffold entire features without understanding the architecture
- Senior engineers use AI as a rubber stamp to ship code they haven't reviewed
- Teams measure AI adoption by lines of AI-generated code rather than learning outcomes
The CS336 model suggests a better framework: AI should augment understanding, not replace it. When a developer asks AI to solve a problem they don't understand, they're not being productive — they're deferring the learning cost to the next bug.
What This Means for Builder Teams
If you're leading a team that uses AI coding tools (Cursor, Copilot, Claude Code, etc.), CS336's framework is worth stealing:
The Practical Takeaway
Stanford CS336 is essentially saying: AI tools are valuable, but only when they make you a better engineer, not when they make you dependent on them.
The same logic applies in production. If you're using AI to ship faster but your team doesn't understand what shipped, you're not moving faster — you're trading short-term velocity for long-term fragility.
The CS336 model of AI-as-teaching-assistant might actually be the right default for most engineering teams too. Use AI to learn, not to replace the learning.