Every operator I talk to has the same complaint about AI: "I keep re-explaining my business every session." They open Claude or ChatGPT, spend the first ten minutes pasting background, correcting assumptions, and re-teaching context they already taught last Tuesday. Then they get 45 minutes of actual work before the context window fills up and the model starts forgetting things. The next day they do it again. Context engineering is the practice that fixes this — and it's the single highest-leverage AI skill most operators have never heard of.
I run four ventures. Ecommerce brands on Amazon, an agency (Velocity Sellers), cohort programs, and advisory. Across all of them, I use Claude Code, Codex, and a handful of automations that run unattended. The difference between my setup in January and my setup now isn't a better model. It's better context. My agents start every session already knowing my brand voice, my naming conventions, my SOPs, my client roster structure, and the specific things I care about. That didn't happen by accident. I engineered it.
This post is the guide I wish someone had handed me before I wasted three months prompt-engineering my way into a ceiling.
What Is Context Engineering?
Context engineering is the practice of designing what information an AI model receives, how that information is structured, and when it enters the context window. It treats the model's input not as a single prompt you type, but as a layered system — instructions, memory, retrieved documents, tool outputs, and conversation history — that you deliberately architect so the model always has what it needs and nothing it doesn't.
If prompt engineering is choosing the right words to say to the model, context engineering is building the room the model works in. The furniture, the reference shelf, the whiteboard with yesterday's decisions still on it. Most operators spend all their time on the words and never build the room.
The distinction matters because the prompt is maybe 5% of what hits the context window in any serious workflow. The other 95% is system instructions, retrieved files, tool results, prior conversation turns, and injected memory. If that 95% is wrong — bloated, stale, irrelevant, poorly structured — no prompt will save you.
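To make the "layered system" idea concrete, here is a minimal sketch that models the context as an ordered list of layers and measures how much of it the typed prompt actually is. The `Layer` class, the layer names, and the character-count proxy for size are all illustrative assumptions, not how any particular tool assembles its input.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str  # hypothetical labels: "identity", "retrieved", "history", "prompt"
    text: str  # the content that would be sent to the model


def assemble(layers: list[Layer]) -> str:
    """Join the non-empty layers in order, each under a labelled header."""
    return "\n\n".join(
        f"## {layer.name}\n{layer.text}" for layer in layers if layer.text.strip()
    )


def share(layers: list[Layer], name: str) -> float:
    """Fraction of total characters contributed by the named layer."""
    total = sum(len(layer.text) for layer in layers)
    if total == 0:
        return 0.0
    return sum(len(layer.text) for layer in layers if layer.name == name) / total


if __name__ == "__main__":
    # Placeholder content sized so the prompt is 5 of 100 characters.
    layers = [
        Layer("identity", "i" * 20),
        Layer("retrieved", "r" * 50),
        Layer("history", "h" * 25),
        Layer("prompt", "p" * 5),
    ]
    print(f"prompt share: {share(layers, 'prompt'):.0%}")
```

Running the same measurement on a real session is a quick way to see which layer is crowding out the others.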
Why Prompt Engineering Hit a Ceiling
I spent months building elaborate prompts. Multi-paragraph system instructions. Chain-of-thought scaffolding. Temperature tuning. The results were good — for about one session.
The problem is that prompt engineering is stateless. You craft a beautiful prompt, it works, and then the session ends and everything evaporates. Tomorrow you need the same prompt plus whatever you learned today. The prompt gets longer. The context window gets tighter. The model starts dropping instructions because you're feeding it 8,000 tokens of system prompt before it even sees the task.
Here's what I tracked over two months:
- Average time re-establishing context per session: 8-12 minutes
- Sessions per day: 4-6
- Weekly time lost to re-explanation: 3-5 hours
- Prompt drift (instructions that contradicted each other across versions): happened every 2 weeks
Context engineering fixes all of these because it moves knowledge out of ephemeral prompts and into persistent, structured files that load automatically. The prompt becomes small, focused, and task-specific. The context does the heavy lifting.
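The weekly figure in that list follows from the first two numbers. A quick sketch of the arithmetic, assuming a five-day working week (my assumption; the tracking above doesn't state it):

```python
def weekly_hours(minutes_per_session: float, sessions_per_day: float,
                 working_days: int = 5) -> float:
    """Hours per week spent re-establishing context."""
    return minutes_per_session * sessions_per_day * working_days / 60


if __name__ == "__main__":
    low = weekly_hours(8, 4)    # best case: 8 minutes, 4 sessions a day
    high = weekly_hours(12, 6)  # worst case: 12 minutes, 6 sessions a day
    print(f"{low:.1f} to {high:.1f} hours per week")  # prints "2.7 to 6.0 hours per week"
```

The 3-5 hours I tracked sits inside that range, which is what you'd expect if most days land between the two extremes.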
The Four Layers of Context Engineering for Operators
I think about context in four layers, from most persistent to most ephemeral. Each layer solves a different problem.
Layer 1: Identity Files (CLAUDE.md, System Instructions)
These are the files that define who the AI is working for and how it should behave. In Claude Code, this is your CLAUDE.md file. In other tools, it's your custom instructions or system prompt.
My root CLAUDE.md is 87 lines. Not 400. Not 1,200. Eighty-seven. It contains:
- Who I am and what I run (2 sentences)
- Voice rules (direct, first-person, no corporate jargon — 4 bullet points)
- Project structure pointers (where to find things, not copies of the things)
- Tool preferences (which MCP servers to use for what)
- Things that always apply (never commit secrets, always run tests, match existing code style)
The most common mistake I see operators make is stuffing their identity file with everything they know. Code style guides, full SOPs, example outputs, client lists. This bloats the context window on every single session, even sessions where none of that information is relevant. The identity file should be a map, not the territory.
Here's a real excerpt from my setup:
# Voice
- First person. Practitioner tone.
- No: leverage, utilize, holistic, game-changer, in today's landscape
- Default to specific numbers over vague claims
- When referencing code, include file_path:line_number
# Project structure
- Blog posts: src/blog/posts/
- Skills: .claude/commands/
- Automations: scripts/
- For brand voice examples, read src/blog/posts/email-to-vault-claude-agent-build-log.md
Notice: I point to a file for brand voice examples instead of pasting examples inline. The AI reads that file only when it needs voice reference, not on every session.
Layer 2: Skill Files (Reusable Workflows)
Skills are the compound interest of context engineering. A skill is a structured set of instructions for a repeatable task — stored as a file, loaded on demand, never occupying context when you don't need it.
In Claude Code, mine live in .claude/commands/ as markdown files, which makes each one available as a slash command. I have 14 of them. Here are the ones I use most:
- /blog: the full pipeline for writing and publishing a blog post (topic selection, keyword research, writing, frontmatter, git push)
- /audit-listing: runs a structured audit of an Amazon listing with scored criteria
- /meeting-tasks: processes a Fathom transcript into attributed Todoist tasks
- /weekly-review: generates my weekly review from git logs, Todoist completions, and calendar data
Each skill file is 40-120 lines of markdown instructions. The key insight: skills encode decisions I've already made. I don't re-decide the blog frontmatter format every time I write a post. I don't re-decide how to score a listing audit. Those decisions are baked into the skill, and the AI follows them consistently without me re-explaining.
This is what compounding looks like in practice. Every time I solve a workflow problem, I capture the solution as a skill. Six months from now, I have 30 skills and the AI can do 30 things without me explaining any of them.
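If it helps to see "decisions baked into a file" as a structure, here is a small sketch that renders a skill file from the decisions you've already made. The section names (Steps, Output format, Decisions already made) and the sample content are hypothetical; they are not the layout of my actual skill files.

```python
def render_skill(title: str, steps: list[str], output_format: str,
                 decisions: list[str]) -> str:
    """Render a markdown skill file from decisions that are already settled."""
    lines = [f"# {title}", "", "## Steps"]
    lines += [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    lines += ["", "## Output format", output_format, "", "## Decisions already made"]
    lines += [f"- {decision}" for decision in decisions]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical content, for illustration only.
    print(render_skill(
        title="Audit listing",
        steps=["Read the listing", "Score each criterion", "Summarize the gaps"],
        output_format="A markdown table with one row per criterion.",
        decisions=["Score every criterion from 1 to 5", "Lead with the lowest scores"],
    ))
```

The point of the structure is the last section: anything listed there is something the AI no longer has to ask about and you no longer have to re-decide.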
Layer 3: Retrieved Context (Files, Search Results, Tool Outputs)
This is the layer most people skip. Your AI needs access to your actual data — not pasted into the prompt, but retrievable on demand.
For me, this means:
- MCP servers connected to Gmail, Todoist, Fathom, GitHub, and my notes vault
- File reads from my project structure (the AI reads what it needs, when it needs it)
- Web search for current information the model doesn't have in training
The operator pattern I've landed on: tell the AI where to look, not what to read. Instead of pasting a client brief into the prompt, I write "read the client brief at clients/acme/brief.md." The AI pulls exactly what it needs and nothing else. The context window stays lean.
This matters more than most operators realize. A Claude Opus context window is 200K tokens. That sounds enormous until you paste in three client briefs, a product catalog, and a competitor analysis. Now you're at 80K tokens of background before the AI has done a single thing, and the actual task instructions are buried so deep the model starts hallucinating because it can't attend to everything equally.
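A rough way to check this before you paste: estimate the token cost of the text and compare it to the window. The sketch below assumes about 4 characters per token, which is only a rule of thumb for English text; real tokenizers vary, so treat the result as an order-of-magnitude check.

```python
import math

CHARS_PER_TOKEN = 4      # rough heuristic, not a real tokenizer
WINDOW_TOKENS = 200_000  # the window size quoted above


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def window_share(texts: list[str], window: int = WINDOW_TOKENS) -> float:
    """Fraction of the window the given texts would occupy."""
    return sum(estimate_tokens(t) for t in texts) / window


if __name__ == "__main__":
    pasted = ["x" * 320_000]  # stand-in for briefs, catalog, and analysis pasted inline
    pointer = ["read the client brief at clients/acme/brief.md"]
    print(f"pasted:  {estimate_tokens(pasted[0])} tokens, {window_share(pasted):.0%} of window")
    print(f"pointer: {estimate_tokens(pointer[0])} tokens")
```

The pointer still costs tokens once the AI reads the file, but it only pays for the files the task actually needs.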
Layer 4: Conversation History (The Session Itself)
This is the most ephemeral layer and the one operators have least control over. But there are two things you can do:
First, plan and execute in separate sessions. I learned this the hard way. If you research, debate, and build all in one conversation, the context window fills with exploratory dead ends the model keeps referencing. Instead: one session to plan (output a structured plan document), then a fresh session to execute (input only the plan). The execution session starts clean with exactly the context it needs.
Second, compress proactively. When a conversation is getting long and I'm about to switch sub-tasks, I ask the model to summarize what we've decided so far into a structured note. I save that note. If the session gets compacted or I start fresh, I feed back only the summary, not the full conversation.
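Saving the summary in the same shape every time makes it easy to feed back later. A minimal sketch, assuming a notes directory of your choosing and a two-section layout (decided, open questions) that I made up for illustration:

```python
import re
from datetime import date
from pathlib import Path


def write_session_note(directory, topic, decisions, open_questions, today=None):
    """Write a dated summary note and return its path."""
    today = today or date.today()
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    body = [f"# {topic} ({today.isoformat()})", "", "## Decided"]
    body += [f"- {d}" for d in decisions]
    body += ["", "## Open questions"]
    body += [f"- {q}" for q in open_questions]
    path = Path(directory) / f"{today.isoformat()}-{slug}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(body) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:  # throwaway directory for the demo
        note = write_session_note(tmp, "Listing audit", ["Score out of 5"], ["Which ASINs first?"])
        print(note.read_text(encoding="utf-8"))
```

The next session then starts with one instruction: read the note at that path.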
Building a Context System That Compounds
The four layers above aren't a one-time setup. They're a system that gets better every week if you maintain it. Here's my maintenance rhythm:
Weekly (15 minutes): Review what I re-explained this week. If I explained the same thing to the AI twice, it should be in a file — either the identity layer or a skill. I add it.
After every new workflow: If I built something that worked, I capture the working version as a skill file. Not the prompt I started with — the prompt that actually worked after iteration. This is critical. Most operators save their first draft. I save the version that survived contact with reality.
Monthly (30 minutes): Audit the identity file. Remove anything stale. Check that pointers still point to real files. If the file has grown past 100 lines, trim it. The identity file should get shorter over time as knowledge moves into skills and retrieved files.
Never: I never put time-sensitive information in persistent context files. No "this week's priority" in CLAUDE.md. No "current client list" hardcoded. Anything that changes goes in a file the AI reads on demand, not in a file that loads every session.
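Two of the monthly checks, the 100-line limit and dead pointers, are mechanical enough to script. This is a sketch under assumptions: it treats any token containing a slash as a path relative to the project root, which will also catch things like URLs or "and/or", so read its output as a list of candidates to check by hand.

```python
import re
from pathlib import Path

MAX_LINES = 100
# Heuristic: one or more "segment/" parts, optionally followed by a file name.
POINTER = re.compile(r"(?:[\w.-]+/)+[\w.-]*")


def audit_identity_file(identity_path, root):
    """Return a list of problems found in the identity file."""
    text = Path(identity_path).read_text(encoding="utf-8")
    problems = []
    line_count = len(text.splitlines())
    if line_count > MAX_LINES:
        problems.append(f"{line_count} lines (limit {MAX_LINES})")
    for pointer in sorted(set(POINTER.findall(text))):
        if not (Path(root) / pointer).exists():
            problems.append(f"dead pointer: {pointer}")
    return problems


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:  # throwaway project for the demo
        root = Path(tmp)
        (root / "scripts").mkdir()
        identity = root / "CLAUDE.md"
        identity.write_text("- Automations: scripts/\n- Skills: .claude/commands/\n", encoding="utf-8")
        for problem in audit_identity_file(identity, root):
            print(problem)
```

An empty result doesn't mean the file is good, only that it's short and its pointers resolve. Staleness still needs a human read.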
Context Engineering vs. Prompt Engineering: What Actually Changes
Here's the concrete difference in my workflow:
| | Prompt Engineering | Context Engineering |
|---|---|---|
| Session start time | 8-12 min re-explaining | 0 min — context loads automatically |
| Consistency across sessions | Varies with prompt drift | Identical — same files, same behavior |
| Knowledge retention | Zero — everything ephemeral | Permanent — files persist across sessions |
| Scaling to new tasks | Write a new mega-prompt | Write a 60-line skill file |
| Team handoff | "Here's my prompt doc" (100 pages) | "Clone the repo, run the skill" |
| Monthly time investment | ~15 hrs re-explaining | ~2 hrs maintaining context files |
The time savings alone justified the switch. But the real payoff is consistency. When I run my /blog skill, the output matches my voice, follows my frontmatter format, and hits my keyword structure every time. Not because I wrote a better prompt — because the context is engineered to make the right output the easy output.
The Five Mistakes Operators Make With Context
1. Treating the context window like a suitcase. They cram everything in "just in case." Every past conversation, every reference document, every possible instruction. The model drowns. Context engineering means being ruthless about what does NOT go in.
2. Never externalizing decisions. They make the same decision every session — how to format output, which tone to use, what the deliverable structure looks like — instead of making it once and saving it to a file. If you've decided something, encode it. Don't re-decide it.
3. Copying instead of pointing. They paste code snippets, example outputs, and full documents into their system prompt. These go stale immediately. Point to the source file. Let the AI read the current version.
4. Ignoring retrieval. They don't connect their AI to their actual data. No MCP servers, no file access, no tool integration. The AI works from whatever the operator manually pastes in, which means the AI only knows what the operator remembers to share.
5. Building for today instead of compounding. They solve each task as a one-off prompt instead of asking "will I do this again?" If yes, the solution should become a skill file that's reusable forever. This is the difference between linear and exponential returns from AI.
How to Start: The 90-Minute Context Engineering Sprint
You don't need a weekend to set this up. Here's what I'd do in 90 minutes if I were starting from zero:
Minutes 1-20: Write your identity file. Open your CLAUDE.md (or custom instructions). Write: who you are (2 sentences), what you won't tolerate in output (your "no" list), where your important files live (pointers, not copies), and your default tool preferences. Keep it under 100 lines.
Minutes 21-50: Build your first two skills. Think about the two tasks you do most often with AI. Write each one as a markdown file with clear steps, expected output format, and any decisions you've already made. Save them where your AI tool can find them.
Minutes 51-70: Connect one data source. Pick the one external system your AI would benefit most from reading — your email, your project management tool, your notes, your calendar. Set up the connection (MCP server, API integration, or even a simple file export). The goal is one retrieval path that doesn't require you to copy-paste.
Minutes 71-90: Run a test session. Open a fresh session with your new context. Run one of your skills. Note what the AI got right without you explaining it. Note what it still got wrong — that tells you what's missing from your context files. Add it.
That's your foundation. Everything after this is maintenance and growth — adding a new skill when you find a repeating workflow, trimming your identity file when it gets bloated, connecting another data source when you realize you're still copy-pasting from it.
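If you'd rather start the sprint from a skeleton than a blank page, a short script can lay down the files for the first two steps. This is a sketch, not my actual setup: the template headings are placeholders I invented, the paths follow the Claude Code layout described above, and it never overwrites a file that already exists.

```python
from pathlib import Path

# Placeholder headings only; fill in the content during the sprint.
IDENTITY_TEMPLATE = """\
# Who I am
(2 sentences)

# Never do this
- (your "no" list)

# Where things live
- (pointers, not copies)

# Tool preferences
- (which tool for what)
"""


def scaffold(root, skill_names):
    """Create CLAUDE.md and empty skill stubs; return the paths that were created."""
    root = Path(root)
    commands = root / ".claude" / "commands"
    commands.mkdir(parents=True, exist_ok=True)
    created = []
    identity = root / "CLAUDE.md"
    if not identity.exists():
        identity.write_text(IDENTITY_TEMPLATE, encoding="utf-8")
        created.append(identity)
    for name in skill_names:
        stub = commands / f"{name}.md"
        if not stub.exists():
            stub.write_text(f"# {name}\n\n(steps, output format, decisions)\n", encoding="utf-8")
            created.append(stub)
    return created


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:  # throwaway directory for the demo
        for path in scaffold(tmp, ["first-skill", "second-skill"]):
            print(path.relative_to(tmp))
```

In a real project you would pass your repo root instead of a temporary directory and name the stubs after your two most frequent tasks.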
FAQ
Is context engineering only for Claude Code?
No. The principles apply to any AI tool. ChatGPT has custom instructions and GPTs. Cursor has rules files. Codex has AGENTS.md. The implementations differ but the practice is identical: persistent identity, reusable skills, retrieved context, and managed conversation history. I happen to use Claude Code because the CLAUDE.md + skill file + MCP server stack gives me the most control, but the mental model works everywhere.
How is context engineering different from RAG?
RAG (retrieval-augmented generation) is one technique within context engineering — it's the "retrieval" part of Layer 3. Context engineering is the whole system: identity files, skill files, retrieval, and conversation management. You can do context engineering without RAG (using only file reads and tool outputs), and you can do RAG without context engineering (just bolting a vector database onto a chatbot). The operators who get the most from AI do both.
How much time does this actually save?
In my operation, the shift from prompt engineering to context engineering saved roughly 13 hours per month in re-explanation time alone. That's before counting the consistency gains (fewer rework cycles because the AI follows the same format every time) and the compounding effect of the skill library (new workflows that would have taken 45 minutes to set up now take 5 because the skill already exists). For an operator running 4-6 AI sessions daily, the payback period on a 90-minute setup sprint is about three days.
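The payback estimate is simple division, and it's worth running with your own numbers. A sketch using the figures from this post, counting only re-explanation time saved:

```python
def payback_days(setup_minutes: float, minutes_per_session: float,
                 sessions_per_day: float) -> float:
    """Working days until the setup time is repaid by skipped re-explanation."""
    return setup_minutes / (minutes_per_session * sessions_per_day)


if __name__ == "__main__":
    slowest = payback_days(90, 8, 4)   # 8 minutes saved, 4 sessions a day
    fastest = payback_days(90, 12, 6)  # 12 minutes saved, 6 sessions a day
    print(f"payback: {fastest:.2f} to {slowest:.2f} days")
    print(f"net monthly saving: {15 - 2} hours")  # ~15 hrs re-explaining minus ~2 hrs maintenance
```

"About three days" is the conservative end of that range. If your sessions skew longer or more frequent, the sprint pays for itself sooner.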
Do I need to know how to code?
No. My CLAUDE.md is plain text. My skill files are markdown. The only "code" is the file structure — putting files in the right folders with the right names. If you can create a folder and write a bullet-point list, you can do context engineering. The more technical pieces (MCP server connections, automation scripts) are optional power-ups, not prerequisites.
What's the biggest mistake to avoid?
Writing a 500-line CLAUDE.md. I've seen operators with identity files that read like employee handbooks: pages of rules, examples, edge cases, style guides. The model can't attend to 500 instructions equally. In my experience it follows the early ones reliably and progressively drops the later ones. Keep your identity file short and move specialized knowledge into skill files that load only when relevant.
Three Actions to Take This Week
Context engineering is the skill that turns AI from a tool you use into a system that compounds. Here's where to start:
- Write your identity file this afternoon. Under 100 lines. Who you are, what you won't tolerate, where your files live. This single file eliminates the "re-explain my business" problem permanently.
- Capture your most-repeated workflow as a skill file. Whatever you do with AI most often, write it down as step-by-step instructions in a markdown file. You just made that workflow permanently reusable. Context engineering turns one good session into every future session.
- Adopt the "point, don't paste" rule. Every time you're about to paste a document into a prompt, stop. Save it as a file and tell the AI where to read it instead. Your context window stays clean, your documents stay current, and context engineering starts working for you instead of against you.
The operators who will win the next two years aren't the ones with the best prompts. They're the ones who built systems that make the right context show up at the right time, every time, without thinking about it. That's context engineering. Start building yours.