<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://mattstockton.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://mattstockton.com/" rel="alternate" type="text/html" /><updated>2026-03-02T17:48:30+00:00</updated><id>https://mattstockton.com/feed.xml</id><title type="html">Matt Stockton</title><subtitle>Practical AI implementation and software engineering insights from real projects.</subtitle><author><name>Matt Stockton</name><email>mattstockton@gmail.com</email></author><entry><title type="html">Automating the Path from Frontier Models to Fine-Tuned Models</title><link href="https://mattstockton.com/2026/03/02/automating-the-path-from-frontier-models-to-fine-tuned-models.html" rel="alternate" type="text/html" title="Automating the Path from Frontier Models to Fine-Tuned Models" /><published>2026-03-02T00:00:00+00:00</published><updated>2026-03-02T00:00:00+00:00</updated><id>https://mattstockton.com/2026/03/02/automating-the-path-from-frontier-models-to-fine-tuned-models</id><content type="html" xml:base="https://mattstockton.com/2026/03/02/automating-the-path-from-frontier-models-to-fine-tuned-models.html"><![CDATA[<p>I saw a post from <a href="https://x.com/virattt/status/2027809465789980896">Virat</a> at <a href="https://x.com/findatasets">findatasets</a> recently about using GPT 5.2 to parse 8-K filings. It works well but it’s expensive at scale. His plan was to accumulate examples and fine-tune an open model to bring cost down. <a href="https://x.com/mstockton/status/2027854211992764430">I replied</a> that this loop - frontier model generates outputs, outputs become training data, training data fine-tunes a cheaper model - could be almost fully automated.</p>

<p>I haven’t done fine-tuning in a while. Frontier models just work for most of what I’ve needed, so I haven’t had a reason to. But Virat’s post got me thinking about this pattern more carefully, and it seems like something worth exploring.</p>

<h2 id="structured-extraction-and-evaluation">Structured Extraction and Evaluation</h2>

<p>The reason this pattern seems especially interesting for structured extraction is that it’s easier to evaluate. If you’re pulling known fields out of semi-structured documents - dates, entities, financial figures - there’s a clearer definition of “correct” than with freeform text. Some checks can be automated (does this parse as a valid date? does this value appear in the source document?), and people can review the rest.</p>
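<p>As a rough sketch of what the automated side of those checks might look like - the field names and rules here are made up for illustration:</p>

```python
from datetime import datetime

def check_extraction(record, source_text):
    """Run cheap automated checks on one extracted record.

    Returns a list of failure reasons; an empty list means the record
    passed and only needs human spot-checking.
    """
    failures = []

    # Does the extracted date parse as a valid ISO date?
    try:
        datetime.strptime(record["filing_date"], "%Y-%m-%d")
    except (KeyError, ValueError):
        failures.append("filing_date missing or not a valid YYYY-MM-DD date")

    # Does the extracted figure appear verbatim in the source document?
    amount = record.get("amount")
    if not amount or amount not in source_text:
        failures.append("amount does not appear in the source text")

    return failures

source = "On 2026-01-15 the company reported revenue of $12.4 million."
good = {"filing_date": "2026-01-15", "amount": "$12.4 million"}
bad = {"filing_date": "January 15", "amount": "$99 million"}

print(check_extraction(good, source))  # []
print(check_extraction(bad, source))   # two failure reasons
```

<p>Checks like these won’t catch everything, but they triage the output: records that fail go straight to a person, and records that pass only need spot-checking.</p>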

<p>If you’re using a frontier model for extraction at any real volume, you should already be evaluating its outputs - having people review results, label quality, build an evaluation set. That evaluation work produces labeled (input, output) pairs. And if you orchestrate the system correctly, those labeled pairs can double as training data for a cheaper model.</p>

<h2 id="what-the-retraining-loop-would-look-like">What the Retraining Loop Would Look Like</h2>

<p>I think the workflow would look something like this:</p>

<ol>
  <li>
    <p><strong>Run the frontier model on your extraction task.</strong></p>
  </li>
  <li>
    <p><strong>Evaluate outputs.</strong> People review and label results. Automated checks can help with the obvious stuff (schema validation, format checks, cross-referencing source documents). This is ongoing work, not a one-time thing.</p>
  </li>
  <li>
    <p><strong>Evaluated outputs become your training dataset.</strong> Every (input, output) pair that’s been reviewed and labeled goes into your training set.</p>
  </li>
  <li>
    <p><strong>Fine-tune a smaller, cheaper model on this data.</strong> Once you have enough high-quality examples, fine-tune an open model like Llama or Mistral.</p>
  </li>
  <li>
    <p><strong>Evaluate the fine-tuned model out of sample.</strong> Check how it performs on examples it wasn’t trained on. This tells you whether it’s actually good enough to deploy.</p>
  </li>
  <li>
    <p><strong>Deploy the fine-tuned model for bulk traffic.</strong> If it meets your quality bar, route extraction volume through the cheaper model. Use the frontier model as a fallback for cases where the fine-tuned model’s confidence is low. You’d also want a way for people to flag bad outputs, so those can feed back into your training data.</p>
  </li>
  <li>
    <p><strong>Continue accumulating data and periodically retrain.</strong> The frontier model fallback path keeps generating new examples for people to evaluate. Each cycle grows the training set.</p>
  </li>
</ol>
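<p>The deploy-with-fallback routing in the steps above is mostly plumbing. Here’s a minimal sketch of that logic, with the model calls stubbed out as plain callables - the threshold value and function names are my own assumptions, not any particular API:</p>

```python
def route_extraction(document, cheap_model, frontier_model, review_queue,
                     threshold=0.9):
    """Try the fine-tuned model first; fall back to the frontier model
    on low confidence, and queue the fallback example so a person can
    label it and grow the training set."""
    result, confidence = cheap_model(document)
    if confidence >= threshold:
        return result, "fine-tuned"

    # Low confidence: pay for the frontier model, and keep the
    # (input, output) pair for later human review.
    result = frontier_model(document)
    review_queue.append((document, result))
    return result, "frontier"

# Stub "models" for illustration.
def cheap_model(doc):
    return ({"ticker": "ACME"}, 0.95) if "ACME" in doc else (None, 0.4)

def frontier_model(doc):
    return {"ticker": "UNKNOWN"}

queue = []
print(route_extraction("8-K filed by ACME Corp", cheap_model, frontier_model, queue))
print(route_extraction("an unusual filing", cheap_model, frontier_model, queue))
print(len(queue))  # one example queued for human review
```

<p>The useful property is that the fallback path is also the data-collection path: every document the cheap model can’t handle becomes a candidate training example for the next retraining cycle.</p>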

<p>If you set up the right hooks - a pipeline for people to label outputs, a process for checking fine-tuned model performance out of sample, off-ramps for routing traffic to cheaper models - the retraining loop falls out of the orchestration. You’re connecting evaluation work you should already be doing to a fine-tuning pipeline.</p>

<h2 id="this-is-just-mlops">This Is Just MLOps</h2>

<p>The thing that struck me about this pattern is that it’s not new. It’s the same retraining loop that classical ML teams have been running for years. Collect labeled data, train a model, deploy it, monitor performance, collect more data, retrain.</p>

<p>In traditional ML, labeling was always the bottleneck. You’d hire annotators to label documents from scratch, and it was tedious, expensive, and slow.</p>

<p>Now frontier models are good enough that they can do the initial extraction at high quality, and you can use techniques like LLM-as-judge to get reasonable confidence that the outputs are correct. You still need humans in the loop reviewing labels, but you’re spot-checking outputs that are already mostly right rather than creating labels from zero. That makes the human review much less costly, which is a big part of why this pattern feels more viable now than it would have a couple years ago.</p>
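<p>“LLM-as-judge” here just means prompting a second model to grade the first model’s output. A minimal sketch of what such a prompt might look like - the wording and fields are illustrative, not any particular library’s format:</p>

```python
import json

def build_judge_prompt(source_text, extraction):
    """Assemble a grading prompt for a judge model, asking for a
    machine-parseable verdict."""
    return (
        "You are reviewing a structured extraction for correctness.\n\n"
        f"Source document:\n{source_text}\n\n"
        f"Extracted fields:\n{json.dumps(extraction, indent=2)}\n\n"
        "For each field, check whether it is supported by the source. "
        'Reply with JSON only: {"verdict": "pass" or "fail", "reasons": []}'
    )

prompt = build_judge_prompt(
    "On 2026-01-15 the company reported revenue of $12.4 million.",
    {"filing_date": "2026-01-15", "amount": "$12.4 million"},
)
print(prompt)
```

<p>The judge’s “pass” records can flow into the training set with light spot-checking, while “fail” records go to a person - which is exactly the triage that makes human review cheaper than labeling from scratch.</p>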

<p>I’d imagine teams with classical ML experience have an advantage here. They already know how to build data pipelines, version datasets, run A/B tests between model versions, and monitor for drift. If you’ve ever built a retraining pipeline for a classification model or an NER system, you probably already know most of what this requires.</p>

<h2 id="where-i-think-this-does-and-doesnt-apply">Where I Think This Does and Doesn’t Apply</h2>

<p>I haven’t validated all of this myself, but based on what I’ve read, it seems like the pattern fits well when:</p>

<ul>
  <li>There’s a clear definition of “correct” - structured extraction, classification, entity recognition</li>
  <li>Volume is high enough to justify the investment (thousands of examples)</li>
  <li>The domain is relatively stable - document formats don’t change weekly</li>
  <li>Cost matters at scale - you’re spending real money on frontier model API calls</li>
</ul>

<p>And it probably fits less well when:</p>

<ul>
  <li>Outputs are subjective or hard to evaluate - summarization, creative writing, open-ended Q&amp;A</li>
  <li>The domain shifts frequently, invalidating your training data</li>
  <li>You need the frontier model’s breadth and general reasoning, not narrow pattern-following</li>
  <li>Volume is low - if you’re processing a hundred documents a month, just keep using the frontier model</li>
</ul>

<p>From what I’ve read, fine-tuning starts making sense around a few hundred to a few thousand high-quality examples.</p>

<h2 id="fine-tuning-has-gotten-easier">Fine-Tuning Has Gotten Easier</h2>

<p>The last time I experimented with fine-tuning was through the <a href="https://platform.openai.com/docs/guides/fine-tuning">OpenAI fine-tuning API</a>, and it was mostly hello-world level stuff. I didn’t have a real use case to push it further.</p>

<p>The tooling seems like it’s gotten a lot better since then, and there are more options now. <a href="https://github.com/unslothai/unsloth">Unsloth</a> is one I’ve seen mentioned frequently for running fine-tuning on a single GPU. I have some projects on the horizon where this kind of loop might be worth experimenting with, so I want to spend more time here.</p>
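<p>For context on what the fine-tuning input actually looks like: most chat fine-tuning pipelines, the OpenAI API among them, take training data as JSONL with one conversation per line. Turning reviewed (input, output) pairs into that shape is only a few lines - the prompt text here is illustrative:</p>

```python
import json

def pairs_to_jsonl(pairs, system_prompt, path):
    """Write reviewed (input, output) pairs as chat-style JSONL training
    examples, one {"messages": [...]} conversation per line."""
    with open(path, "w") as f:
        for document, extraction in pairs:
            example = {
                "messages": [
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": document},
                    {"role": "assistant", "content": json.dumps(extraction)},
                ]
            }
            f.write(json.dumps(example) + "\n")

pairs = [
    ("On 2026-01-15 the company reported revenue of $12.4 million.",
     {"filing_date": "2026-01-15", "amount": "$12.4 million"}),
]
pairs_to_jsonl(pairs, "Extract filing_date and amount as JSON.", "train.jsonl")
print(open("train.jsonl").read())
```

<p>The point is that the gap between “labeled evaluation data” and “training file” is tiny - it’s a format conversion, not a new pipeline.</p>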

<h2 id="final-thoughts">Final Thoughts</h2>

<p>The human evaluation part of this doesn’t go away, and I don’t want to downplay that. Someone needs to define what “correct” means, review outputs, and make judgment calls about when the fine-tuned model is good enough. But the orchestration around it - the pipeline from evaluated outputs to training data to fine-tuned model to deployment with off-ramps - that part seems like it can be set up once and mostly run itself.</p>

<p>I’m curious whether teams are actually doing this today, and what their experience has been. If you’re running frontier models on structured extraction at volume and you’re already evaluating outputs, it seems like you’re close to having what you need. I want to spend more time exploring this and see what it takes to get a working version of this loop running end to end.</p>]]></content><author><name>Matt Stockton</name><email>mattstockton@gmail.com</email></author><category term="Software Engineering" /><category term="ai-tools" /><category term="llm" /><category term="fine-tuning" /><category term="mlops" /><category term="software-engineering" /><summary type="html"><![CDATA[A tweet about fine-tuning on frontier model outputs got me thinking about how much of that loop could be automated.]]></summary></entry><entry><title type="html">What Is An Agent?</title><link href="https://mattstockton.com/2026/02/21/what-is-an-agent.html" rel="alternate" type="text/html" title="What Is An Agent?" /><published>2026-02-21T00:00:00+00:00</published><updated>2026-02-21T00:00:00+00:00</updated><id>https://mattstockton.com/2026/02/21/what-is-an-agent</id><content type="html" xml:base="https://mattstockton.com/2026/02/21/what-is-an-agent.html"><![CDATA[<p>This is my attempt to describe what an agent is and why it’s so incredible, yet simple. I’m not going to edit this and I’m not going to run this through an LLM. I’m also going to try to do this in five minutes or less. Let me know how I did, and what I missed that you think is important.</p>

<ul>
  <li>An agent is simply an LLM that can call tools</li>
  <li>A tool is a computer application or a piece of code. For example, opening a file is a tool. Searching for a word in a file is a combination of tools.</li>
  <li>Command-line tools have existed on computers for a very long time.</li>
  <li>These tools are composable and can solve almost any problem as it relates to files that are on the computer.</li>
  <li>As an example, this ‘tool’:</li>
</ul>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cat notes/*.txt | tr '[:upper:]' '[:lower:]' | grep -oE '\b[a-z]{4,}\b' | sort | uniq -c | sort -nr | head
</code></pre></div></div>

<ul>
  <li>Takes all your text notes, lowercases the text, pulls out words of four or more letters, counts them, and then shows the most common ones.</li>
  <li>That pipeline may make no sense to you if you’re not an experienced engineer, but the models know exactly how to ‘do this’ - meaning generate that text and run it.</li>
  <li>Given a written instruction like ‘Find the most common topics in my notes,’ the model has been tuned well enough to generate the above tool call.</li>
  <li>Any computer application on your computer is a tool. Excel? It’s a tool. Slack? It’s a tool. Your web browser? Same.</li>
  <li>So if you blur your eyes, an agent is something that can simply control your computer. Pretty much any aspect of it.</li>
  <li>So what if a tool doesn’t exist that does what you need the agent to do?</li>
  <li>The agent has a tool that lets it write computer code. This tool is one of the best tools that it has because the foundation labs have spent a lot of time making sure this tool works well.</li>
  <li>And computer code can solve almost any problem.</li>
  <li>So you can have an instruction like: ‘Write me some code that analyzes this image and gives me the exact coordinates of where the red balloon is.’</li>
  <li>Coding agents like Claude Code will see this and then try to write you some code that does that. It will often use other tools in the code itself. So it might download some libraries that it can use or some other techniques that it knows to be able to compose some software to solve the above problem.</li>
  <li>Using a tool like Claude Code, eventually you’ll be able to build some code that solves the problem you’re talking about.</li>
  <li>So what do you have now?</li>
  <li>You have a new tool called the Find the Red Balloon tool.</li>
  <li>And now you can give that tool to the agent so it can just use it next time.</li>
  <li>Basically, you use an agent to build a tool that you can hand to another agent and it can use that tool whenever it needs to.</li>
  <li>So now you can just say, “Find the red balloon in the image.” And the agent will use the tool to do that.</li>
  <li>If the tool works, then it’s going to get it right. You can build deterministic tools that the agent can use. Even though LLMs are, at their core, non-deterministic, you can bake in lots of determinism.</li>
  <li>This is the magic, but also the simplicity of agents.</li>
  <li>It’s just an LLM using a computer.</li>
  <li>But it is incredibly flexible, incredibly generalizable, and incredibly composable. So basically you can solve almost any problem.</li>
  <li>The other important thing here is the file system.</li>
  <li>Most systems rely on data to provide any value.</li>
  <li>It turns out if you put data in the right place in a file system, meaning folders on a computer, and you organize that well, the agent can just use tools to find what it needs.</li>
  <li>So basically, agents come down to tools and file systems.</li>
  <li>But those things can be assembled in so many different ways that you can solve incredibly hard problems that truly weren’t solvable before.</li>
</ul>
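<p>To make the “an agent is an LLM that can call tools” idea concrete, here’s a toy sketch of the scaffolding in Python - no LLM involved, and all the names are made up; the registry and dispatch are the whole point:</p>

```python
from collections import Counter

# A toy registry: each tool is a named function plus a description
# the model can read when deciding what to call.
TOOLS = {}

def tool(name, description):
    def register(fn):
        TOOLS[name] = {"description": description, "fn": fn}
        return fn
    return register

@tool("count_words", "Show the most common words in a piece of text")
def count_words(text, top_n=3):
    words = [w.lower() for w in text.split() if len(w) >= 4]
    return Counter(words).most_common(top_n)

def dispatch(name, **kwargs):
    """What the agent loop does when the model emits a tool call."""
    return TOOLS[name]["fn"](**kwargs)

# Once a tool is registered, "find the common topics in my notes"
# becomes a single call the model can make by name:
print(dispatch("count_words", text="notes about agents and tools and agents"))
```

<p>Real agent frameworks add schemas, permissions, and a conversation loop around this, but the core shape is the same: a table of named, deterministic functions the model can invoke.</p>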

<p>Hope you found this interesting. There are obviously many other details around what agents are. But I really wanted to capture the core essence as I see it in a way that’s accessible to folks. Let me know how I did and how you think about it.</p>]]></content><author><name>Matt Stockton</name><email>mattstockton@gmail.com</email></author><category term="Software Engineering" /><category term="ai-tools" /><category term="agents" /><category term="llm" /><category term="software-engineering" /><summary type="html"><![CDATA[My attempt to describe what an agent is and why it's so incredible, yet simple.]]></summary></entry><entry><title type="html">We Are Here</title><link href="https://mattstockton.com/2026/02/17/we-are-here.html" rel="alternate" type="text/html" title="We Are Here" /><published>2026-02-17T00:00:00+00:00</published><updated>2026-02-17T00:00:00+00:00</updated><id>https://mattstockton.com/2026/02/17/we-are-here</id><content type="html" xml:base="https://mattstockton.com/2026/02/17/we-are-here.html"><![CDATA[<p>This post is different than my other posts. I’ve found myself trying to write down all of the thoughts I’ve had about what AI is doing, particularly for how we build software. So much has changed and things are moving so fast yet it almost feels like there’s no time to even reflect on it all. And there are so many angles to take and perspectives to have. Meanwhile, things are changing so fast that even those perspectives shift rapidly. So this is my attempt to just get some words down. Not in a narrative forum or with a story or really any coherency, but just a list of thoughts I’ve been having and experiences I’ve been having as it relates to building software.</p>

<p>So why am I writing this down? Partly as a means of reflection. Partly as a way for people who are also thinking about this to feel seen. I don’t think there’s any particular reason or need to feel seen, but just acknowledging that I think there are more and more people who are having these types of thoughts about software, some of which are thrilling and some of which are discomforting. And I’m writing mine down. It’s also for folks who might not have had the time to explore these tools or understand just where we are at. If you read the following and you’re writing software and this all feels just very alien to you, I think it’s worth your time to explore - and I think there is an urgency to explore. My candid advice is that you need to do it now actually.</p>

<p>None of this is meant as a brag or anything of that nature. I’m just stating things as they are happening to me and what I’m seeing. Like the title says: We Are Here.</p>

<p>If you distill it all down to the core essence, it’s that things have changed in software so dramatically over the last six months that it’s truly a completely different thing. If you had shown me this list three years ago, it would have been completely incomprehensible. I would have told you that there’s no way these things are true.</p>

<p>So here’s the list and I’m not going to use AI to modify this or to make it sound better. Or anything like that. These are the raw notes.</p>

<ul>
  <li>I have not written a single line of code myself for at least the past four months. Zero.</li>
  <li>I am the most productive I have ever been in my career. And I’m astounded by what I can accomplish on an almost daily basis.</li>
  <li>I’ve never had more fun building.</li>
  <li>My productivity in terms of what I can build has likely 10x’ed compared to two years ago.</li>
  <li>Some things I can build in minutes, where it literally would have taken hours or days before.</li>
  <li>I’ve never felt busier with work, but most of the times that aspect is energizing. I actually have a hard time putting it down, which is a feeling I haven’t had in a number of years. The last time I felt like this was probably when iOS came out and you could build iPhone apps in the late 2000s.</li>
  <li>I rarely use my keyboard anymore, particularly at my home office. It’s just me speaking to my computer using Mac Whisper.</li>
  <li>I find myself reading code less and less. Yes, I am still reading it and I’m not just vibe coding, but I’m finding other ways to ensure the system works as expected without reading all the details. There’s never been a better time to have these tools and still know what good looks like from previous experience.</li>
  <li>I am building software that self-updates. Meaning that it emits information as it runs, and then is able to look at that information after it runs to make improvements to itself.</li>
  <li>I often run “pre-determined commands” (e.g. skills or slash commands) that accomplish ‘operational’ work which, two years ago, would have taken me several hours.</li>
  <li>I can multi-task on extremely complex projects in disparate areas without feeling overwhelmed. In fact, sometimes this feels almost necessary with how tools like Claude Code and planning mode work. It’s like the code compiling days all over again.</li>
  <li>I can record myself on a run talking through something I want to do, whether it be a document I want to produce or even a large code change that I want to make. When I get back to my desk, I can use this transcript with a predetermined command. And it almost always is able to one-shot the changes correctly or get very close.</li>
  <li>I spend a lot of my time answering questions that an AI asks me. In fact, one of the most valuable ways to make sure these tools produce what I want them to produce is to allow them to ask me questions exhaustively.</li>
  <li>The tools feel like a superpower now. And they continue to get rapidly better. There are truly step changes that have happened in the last couple months, and there is no sign that this is going to stop.</li>
  <li>I am still surprised by software engineers who don’t see it, don’t get it, or are not doing it. I’m obviously deep in the rabbit hole, but it is so incredibly obvious to me that things have changed forever. Classical software engineering is over.</li>
  <li>There is still an immense need for software engineering talent and specifically systems thinking. There’s never been a better time to apply systems thinking than right now. It feels like a cheat code to have been able to build software classically for the last 20 years and then be able to use these tools now.</li>
  <li>There are still lots of quirks with how to use these tools. Many of the people in this space that I have high regard for are using the tools similarly - but there’s still a lot of differentiation. The only way to figure this out is to get your hands dirty and do the thing.</li>
  <li>I’m finding myself to be more reliant on AI to do specific things, and less reliant on it to do other things over time. As a specific example, planning mode is absolutely critical now for these tools, and I will spend a ton of time thinking through the plans and trying to build them without AI’s help as a first cut. Because that first step the model takes and the direction it starts heading is enormously important. So it’s worth the time to think critically here.</li>
  <li>So many things in this industry have changed so rapidly, particularly over the last six months. I intellectually believe it’s only going to accelerate but I still don’t fully think I’ve internalized what that means.</li>
  <li>I have no idea where this is all going and definitely have my stretches of anxiety about it all. But I am here for it and I’m going to lean into it and I’m going to try to help others that want to do that too.</li>
</ul>

<p>One more that’s useful to add – and this is the thing that I struggle with the most. I actually feel more and more behind every day. I know I am not, but honestly that is how fast things are moving – and it’s only getting faster.</p>]]></content><author><name>Matt Stockton</name><email>mattstockton@gmail.com</email></author><category term="Software Engineering" /><category term="ai-tools" /><category term="software-engineering" /><category term="claude-code" /><category term="reflection" /><summary type="html"><![CDATA[Raw, unedited notes on what it feels like to build software right now.]]></summary></entry><entry><title type="html">If You Want to Play Games, You Have to Make Games</title><link href="https://mattstockton.com/2026/02/16/if-you-want-to-play-games-you-have-to-make-games.html" rel="alternate" type="text/html" title="If You Want to Play Games, You Have to Make Games" /><published>2026-02-16T00:00:00+00:00</published><updated>2026-02-16T00:00:00+00:00</updated><id>https://mattstockton.com/2026/02/16/if-you-want-to-play-games-you-have-to-make-games</id><content type="html" xml:base="https://mattstockton.com/2026/02/16/if-you-want-to-play-games-you-have-to-make-games.html"><![CDATA[<p>This post is going to be different than my normal posts in a couple ways. Let me give you the TLDR first in case you want that:</p>

<ul>
  <li>Me and my kids built some games that you can play on the web. They spoke to my computer to build them, and <a href="https://docs.anthropic.com/en/docs/claude-code/overview">Claude Code</a> built them.</li>
  <li>You can go play them now at <a href="https://confettigalaxy.com/">confettigalaxy.com</a>.</li>
  <li>I was stunned at how well this works. I am stunned almost every day by the things I am able to do with AI.</li>
  <li>It made me think more deeply about AI, education, and the skills that matter to be successful in society - something that, upon reflection, I’ve been avoiding thinking about.</li>
</ul>

<figure class=""><img src="/docs/assets/images/confetti-galaxy/landing.png" alt="The Confetti Galaxy landing page" /></figure>

<h2 id="the-story">The Story</h2>

<p>My oldest daughter loves to play this online game platform. It’s web-based and has all these different options to play. Some of the games are fun and educational, but some of them are just mindless. I get it, it’s a fun way to use your time and it’s interactive. But there’s been a bunch of friction about her using it and wanting to use it all the time, and a lot of arguing about when she can use it.</p>

<p>I’ve been meaning to think more deeply about what my true view is on AI as it relates to kids. There’s no question that the entire way kids learn is going to have to fundamentally change. There’s also no question in my mind that the current educational systems will always be behind in figuring out how to integrate these new capabilities into learning experiences. AI is accelerating so fast at this point that the gap is just going to continue to get wider.</p>

<p>As someone who’s incredibly deep in this rabbit hole, I took a step back and tried to ask myself - why haven’t I thought more about this? I think it’s because I don’t have a good solution, and I’m worried about how this change takes shape, particularly for younger people. So it’s avoidance.</p>

<p>AI has made a step change in capabilities in the last three months. Things that have taken me weeks in the past now take me hours or even minutes. And it keeps compounding as you build systems to better utilize these tools.</p>

<p>So taking all of that into account - yesterday I had an idea. I told my daughter that if she wanted to play games, then she had to make games. To an eight-year-old who has no idea what AI is, that concept didn’t mean anything. So I had to show her what I meant. I opened up Claude Code with her. I know how to orchestrate this tool very well at this point because I’m in it every day, so we weren’t starting from scratch and I knew the scaffolding we had to set up. I worked with her and then with my other daughter to build games. I used the system in a way that had it ask us questions about what we were trying to achieve.</p>

<p>It was super interesting because kids are so creative. They have ideas and oftentimes they have trouble describing them exactly, but as you keep asking them questions, they can really clarify their thinking. We worked through this for a while. I set up a little Q&amp;A workflow using a Claude Code skill, but I also had my kids speak to the computer using Mac Whisper and tell their own ideas about the games. Then we had it build the games. I helped with some of the polish and I knew how to make them a little bit more interactive, but most of these games were the kids’ ideas with me asking them questions about what they wanted them to do.</p>

<p>Within an hour they were playing their own games. Then I told them that their friends can play their games too. I spent a little bit more time with Claude Code working on a plan to deploy these games so that anyone can play them. And here they are on <a href="https://confettigalaxy.com/">confettigalaxy.com</a>.</p>

<p>This was so fun that I built my own game afterwards. I built a game called Go Out For The Pros, which is a game I used to play with my dad at the playground. I described the experience using the skill workflow - what we used to do, how it worked - and it built the game. It’s honestly exactly how I remembered it. Absolutely magical.</p>

<figure class=""><img src="/docs/assets/images/confetti-galaxy/pros.png" alt="Go Out For The Pros - a football catching game set at a playground" /></figure>

<p>My daughter wanted to build a dolphin swimming game where you swim around and collect candy. She described the whole thing herself and we built it together.</p>

<figure class=""><img src="/docs/assets/images/confetti-galaxy/dolphin.png" alt="An underwater dolphin game" /></figure>

<h2 id="the-jumble-of-thoughts">The Jumble of Thoughts</h2>

<p>After this exercise, a lot of thoughts came to the surface that have been lingering for a while. I haven’t fully clarified them all, but it felt useful to just write them down. Here they are in their raw form.</p>

<h3 id="education-is-going-to-change">Education is going to change</h3>

<p>There’s a lot of uncertainty about how AI changes education. Educational systems will always be behind the leading edge, and the gap is only going to get wider. But I think the failure mode is not leaning into it. If you want your kids to understand AI and how it fits into society, I think you have to experiment and do that yourself right now.</p>

<h3 id="the-skills-that-matter-arent-new">The skills that matter aren’t new</h3>

<p>A lot of the skills you need to use AI well aren’t new skills. They’re skills that need to be amplified to take full advantage of this new capability. The ones that come to mind most for me:</p>

<ul>
  <li><strong>Agency.</strong> Do you believe that you can create something yourself, and will you be assertive in taking that initiative? This skill has always been rare and incredibly useful. People who know what agency can do and are assertive about it are in an incredible position to take advantage of these new tools.</li>
  <li><strong>Curiosity.</strong> Are you willing to think about things and make connections across various topics? Are you learning about things well outside of technology that can impact your viewpoints? Curiosity and making connections has never been more valuable.</li>
  <li><strong>Clear thinking and communication.</strong> Can you think through what you’re trying to do and communicate that with clarity? There is a vast difference in outcomes from using these AI tools based on your specificity and your clarity of thought.</li>
  <li><strong>Willingness to iterate.</strong> Do you treat what you build as something that can improve over time? Do you try to nudge things forward incrementally? Utilizing these tools effectively truly requires a feedback loop and iteration, and that’s where you get the compounding. You have to be able to analyze what you’re getting back and figure out how to improve it.</li>
  <li><strong>Comfort with discomfort.</strong> Can you push through the discomfort of trying something new? Even when it feels odd, can you deal with the change and adapt? Things are moving fast and the people who have the mental models to push through that discomfort instead of retreating from it are going to be in a much better position.</li>
</ul>

<p>There are other skills that are useful for AI, but those are the top ones that come to mind. None of them are technical in nature. People have been trying to develop these types of skills for years, even before AI. If you want your kids to be able to thrive in what’s coming with this wave of AI, you need to figure out how to help them learn these skills.</p>

<h3 id="these-skills-are-learnable">These skills are learnable</h3>

<p>Building the games yesterday showed me these skills are learnable with the right environment and the right person helping. My probing questions forced the kids to think more clearly about what they wanted, and they rose to it. And one of the big advantages of learning these skills with these tools is the feedback loop is so tight. You describe something, it builds it, and you can see the progress immediately. The kids were amazed by how fast we could build these games and it gave them a ton of energy to think about what else they could do.</p>

<h3 id="these-tools-are-absolutely-remarkable">These tools are absolutely remarkable</h3>

<p>I intellectually know at this point that these things are extremely capable. My mental model has always been to try to throw your most ambitious project at these tools because they will continue to surprise you. But even with that knowledge, I am continually surprised.</p>

<p>This is a portal my kids can use instead of the previous games they were playing. We can build new games together in about 10 to 15 minutes. They can create their own adventure and then play it. We can build games they want to play instead of the mindless games they’ve been playing. And it forces them to utilize the skills I talked about above.</p>

<p>I think people still truly underestimate how remarkable these tools are. They can do stunning things. As someone who’s deep in this rabbit hole every day, I am stunned on a daily basis. I really think people need to see this for themselves. Dig in, try these things. It’s worth carving off time to do so because it is absolutely astounding what they can do.</p>

<h3 id="equity">Equity</h3>

<p>Education is going to drastically shift given how these tools can be integrated. Some people are starting to figure this out and take action on it. But the folks who have figured it out already have the means, and the technology and the environment to use it are readily available to them. If you look at something like <a href="https://alpha.school/">Alpha School</a>, the results speak for themselves - the improvements in testing outcomes they’re seeing by integrating AI into how kids learn are significant. I think it’s pointing in the right direction. But it’s not accessible to everyone yet, and won’t be for some time.</p>

<p>I’ll be honest - I haven’t spent enough time thinking through how this all plays out at scale. But the opportunity is real. AI enables individually tuned lesson plans in a way that was never before possible, and research on personalized instruction consistently shows it’s more effective than one-size-fits-all approaches. If we don’t figure this out, the disparity in skills and understanding could be orders of magnitude greater than anything we’ve ever seen before. How do we democratize access to what I’m talking about in this post in a way that works? How do we ensure as a society that this is available to everyone? I don’t have the answer, but it’s a question we need to be asking.</p>

<h2 id="where-i-landed">Where I Landed</h2>

<p>I know this is a jumble of thoughts and I haven’t fully worked them out. But they felt important enough to at least begin clarifying my thinking on them. The thread running through all of this for me is acknowledging how much the world is going to change for our youth because of this technology. There are going to be a lot of decisions to make and things to change so that we take advantage of these remarkable tools in a way that’s fair, equitable, and helps people thrive in society and the economy. I don’t have the answers to all of this. Short term, it’s on me to help my kids adapt, and the earlier I can do that the better off they are. But I’ve committed myself to spending more time thinking about this and figuring out what my role in it is beyond my own family.</p>

<p>On a lighter note, definitely check out the games at <a href="https://confettigalaxy.com/">confettigalaxy.com</a> because they are pretty fun. And if you’re curious about how you can build these games with your kids, I’ll do a follow-up post that is more technical in nature to show exactly what I did.</p>]]></content><author><name>Matt Stockton</name><email>mattstockton@gmail.com</email></author><category term="Personal Reflections" /><category term="AI" /><category term="Claude Code" /><category term="Education" /><category term="Parenting" /><summary type="html"><![CDATA[I had my kids build their own games using Claude Code, and it got me thinking about AI, education, and equity.]]></summary></entry><entry><title type="html">I Published My Portfolio Analysis Workflow - Try It Yourself</title><link href="https://mattstockton.com/2026/02/12/portfolio-analysis-skill-for-claude-code.html" rel="alternate" type="text/html" title="I Published My Portfolio Analysis Workflow - Try It Yourself" /><published>2026-02-12T00:00:00+00:00</published><updated>2026-02-12T00:00:00+00:00</updated><id>https://mattstockton.com/2026/02/12/portfolio-analysis-skill-for-claude-code</id><content type="html" xml:base="https://mattstockton.com/2026/02/12/portfolio-analysis-skill-for-claude-code.html"><![CDATA[<p>I recently wrote about <a href="/2026/02/10/building-a-portfolio-optimization-plan-with-claude-code.html">using Claude Code to build a portfolio optimization plan</a> - I fed it our brokerage CSV exports, described our goals, and iterated over several sessions until it produced a phased action plan with tax impact analysis and fund recommendations. That post ended with a section called “If You Want to Try This” with tips on how to do it yourself from scratch. After it went up, a few people asked if I could just share the workflow so they didn’t have to start from zero.</p>

<p>So I did: <a href="https://github.com/MattStockton/portfolio-analysis">MattStockton/portfolio-analysis</a>.</p>

<p>If you haven’t used <a href="https://docs.anthropic.com/en/docs/claude-code/overview">Claude Code</a> skills before - a skill is basically a set of instructions you give Claude so it knows how to do something specific. It’s not code or a traditional app. It’s markdown files that describe a workflow, and Claude follows them when you ask it to do that thing. Because it’s just text, it’s not locked to Claude Code either. You could drop these files into a Claude project on claude.ai, use them with another AI tool, or just read them and adapt the approach yourself.</p>

<h2 id="what-i-changed-to-make-it-reusable">What I Changed to Make It Reusable</h2>

<p>My original project was built around my specific situation. The parsers only handled the brokerage formats I happened to use. The fund classifications were built up one session at a time as Claude encountered my specific holdings. My tax calculations were hardcoded to my bracket. My allocation targets were baked into the project context. All of that worked great for me but wasn’t useful to anyone else.</p>

<p>To generalize it, I had Claude help me work through each of those pieces. The parsing logic now reads whatever headers are in your file and writes a parser on the fly instead of expecting my specific column layouts. I pulled the fund classifications into a reference database of 190+ funds across eight brokerages, with keyword matching for anything not in the database. Tax calculations got their own reference file covering federal and state brackets. And the allocation targets are now a set of seven templates (or custom) you can pick from or ignore entirely and define your own.</p>
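<p>As a rough illustration of that fallback, here is a minimal sketch of database-first classification with keyword matching. The categories, keywords, and fund names below are made up for the example, not the skill’s actual reference data:</p>

```python
# Hypothetical sketch: look a fund up in a reference database first,
# then fall back to keyword matching on the fund name. The rules and
# categories here are illustrative, not the skill's actual data.
KEYWORD_RULES = [
    ("international", "International Equity"),
    ("bond", "Bonds"),
    ("small cap", "US Small Cap"),
]

def classify(fund_name, database):
    if fund_name in database:
        return database[fund_name]
    name = fund_name.lower()
    for keyword, category in KEYWORD_RULES:
        if keyword in name:
            return category
    return "Unclassified"

db = {"VTSAX": "US Total Market"}
print(classify("VTSAX", db))                         # database hit
print(classify("Fidelity International Index", db))  # keyword fallback
```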

<p>The other big thing was the workflow itself. In my original project, I wrote a long goal prompt and then iterated with Claude over several sessions. That worked because I was willing to put in the time and I knew what I was looking for. For someone picking this up cold, that’s a lot to figure out. So the skill breaks it into six steps with structured questions at each one - you’re picking from options instead of writing freeform prompts.</p>

<p>The goal was to get people 80% of what I got without having to do all the upfront work I did. The other 20% comes from the back-and-forth - clarifying your specific constraints, pushing back on recommendations, iterating until the plan fits your situation.</p>

<h2 id="what-you-get-out-of-it">What You Get Out of It</h2>

<p>You provide CSV exports from your brokerage accounts and the skill handles parsing them - Fidelity, Schwab, Vanguard, E*TRADE, Merrill, or whatever else. It reads the headers and figures out the format.</p>

<p>From there it classifies your holdings, runs a gap analysis against your targets, checks for tax-inefficient placements (like bonds in taxable accounts or cash sitting in Roth space), and looks at whether your recurring contributions are closing gaps or making them worse.</p>

<p>Then it puts together a phased action plan. Free moves in retirement accounts first, then low-tax fixes, then contribution changes, then long-term hold-and-dilute strategies. Each recommendation includes the specific fund, amount, account, tax cost, and rationale.</p>

<h2 id="try-it">Try It</h2>

<p>You need CSV exports from your brokerage accounts - most brokerages have a “Download” or “Export” button on the positions page. When you export, look for options to include extra fields like cost basis, Morningstar category, expense ratio, and fund ratings. The skill can figure a lot of this out from the fund symbol alone, but having it in the export gives it better data to work with. The <a href="https://github.com/MattStockton/portfolio-analysis">repo README</a> has setup instructions. I built it as a Claude Code skill, but you don’t need Claude Code specifically. The skill files are just markdown - if you upload them to claude.ai, ChatGPT, or any other AI tool that lets you provide reference files, it should pick up the workflow and do its best with it.</p>

<p><strong>Privacy:</strong> The skill is just markdown files - it doesn’t collect or send anything anywhere. Your brokerage data goes to your LLM provider as part of the conversation, same as anything else you share in these tools.</p>

<p><strong>Limitations:</strong> US tax system only. Tax brackets are based on 2025 law. The default templates lean toward index funds, but the skill supports active and blended approaches. Not financial advice.</p>

<h2 id="last-thing">Last Thing</h2>

<p>Your situation is different from mine and you’ll probably want to modify things. The skill is just markdown files - you can change the instructions, add your own constraints, or take it somewhere I didn’t. How far you get depends on how much effort you put into the back-and-forth.</p>

<p>Beyond the portfolio stuff, I think the more interesting idea is that this pattern works for any domain. If you’ve built up a good workflow with an AI tool over multiple sessions, you can probably extract it into a skill that other people can pick up and run with.</p>

<p>If you try it, let me know how it goes. Open an issue on the <a href="https://github.com/MattStockton/portfolio-analysis">repo</a> if something doesn’t work or if your brokerage format trips it up.</p>]]></content><author><name>Matt Stockton</name><email>mattstockton@gmail.com</email></author><category term="Finance &amp; Investing" /><category term="personal-finance" /><category term="ai-tools" /><category term="claude-code" /><category term="portfolio-analysis" /><category term="workflow" /><summary type="html"><![CDATA[People asked me to share the portfolio optimization workflow from my last post. I turned it into a Claude Code skill you can install and try with your own brokerage data.]]></summary></entry><entry><title type="html">How I Built a Portfolio Optimization Plan with Claude Code</title><link href="https://mattstockton.com/2026/02/10/building-a-portfolio-optimization-plan-with-claude-code.html" rel="alternate" type="text/html" title="How I Built a Portfolio Optimization Plan with Claude Code" /><published>2026-02-10T00:00:00+00:00</published><updated>2026-02-10T00:00:00+00:00</updated><id>https://mattstockton.com/2026/02/10/building-a-portfolio-optimization-plan-with-claude-code</id><content type="html" xml:base="https://mattstockton.com/2026/02/10/building-a-portfolio-optimization-plan-with-claude-code.html"><![CDATA[<p>I saw the below tweet recently and I did exactly this. Over multiple sessions with Claude Code, I fed it our brokerage CSVs, described our goals and constraints, and worked with it to produce a portfolio optimization plan - phased actions, tax impact analysis, fund recommendations, before-and-after allocation grids. The kind of plan you’d pay a financial advisor to put together, except I built it with an AI.</p>

<p><a href="https://x.com/buccocapital/status/2021290232205676944"><img src="/docs/assets/images/ai-financial-advisor/buccocapital-tweet.png" width="500px" /></a></p>

<p>A year ago, I wrote about <a href="/2025/02/07/buildling-a-personal-finance-tool-with-ai.html">building a personal finance tool in a day with AI</a>. That project required software development - I used AI to help me build an application with a UI, data pipelines, and visualization components. Models have gotten flexible enough that you don’t need to build an app anymore. An agent like Claude Code can read your files, write scripts, run them, and iterate on the output directly.</p>

<figure>
<img src="/docs/assets/images/ai-financial-advisor/workflow-infographic.png" />
<figcaption><em>Overview of the workflow - from raw data and goal prompt through iteration to the final plan</em></figcaption>
</figure>

<h2 id="setting-up-the-workspace">Setting Up the Workspace</h2>

<p>I used <a href="https://docs.anthropic.com/en/docs/claude-code/overview">Claude Code</a>, which runs in a terminal and can read and write files on your machine. I pointed it at a directory with all my financial data. Portfolio analysis is a good fit for this because there’s a lot of messy data in different formats that the model can just dig through directly.</p>

<p>I started with <strong>three inputs</strong>:</p>
<ul>
  <li><strong>CSV exports</strong> from my various financial accounts - holdings, amounts, cost basis, fund fees, fund ratings, and a ton of other data the model could use</li>
  <li><strong>A supplementary text file</strong> - there was other information I knew the model would need but no good way to export it, so I typed up a scratchpad with a bulleted list of things like bank balances, 529s, employer retirement account details, and automated investment settings. Not structured, but it had the raw numbers.</li>
  <li><strong>A goal prompt</strong> describing what I wanted</li>
</ul>

<p>The goal prompt was the most important piece. I spent a lot of time on it and wrote it mostly by hand before having the AI review it - I wanted to force myself to think through all of our constraints and goals. I included:</p>

<ul>
  <li><strong>What data exists and where</strong> - a narrative of what files are in the directory and what the model should be considering</li>
  <li><strong>The desired output format</strong> - I brainstormed the kinds of things I wanted to see: allocation tables, style-box grids, holdings charts, different ways to slice the data. And ultimately, a concrete plan for how to modify the portfolio to get to a target allocation.</li>
  <li><strong>Our preferences</strong> - I want the portfolio to be tax efficient, I’d rather find the lowest fee ETFs that are good than chase performance, and I don’t want to overcomplicate things. I’m well-versed in this stuff but complexity isn’t the goal.</li>
  <li><strong>Known constraints</strong> - everyone has their own nuances. For us, we’d decided not to touch the allocations or ongoing contributions for a specific account, so I called that out as a constraint.</li>
  <li><strong>What we suspected was wrong</strong> - I had a view on things that could be better, so I included it, but I caveated it with “if you have a different view, I want to hear it.” I didn’t want to oversteer the model toward a specific outcome. If you have insight that can nudge it in the right direction, it’s worth providing - just make sure you give it room to look outside your ideas too.</li>
  <li><strong>Strawman proposals</strong> - I included our own rough ideas for target allocations and investment changes. Even if they’re wrong, it gives the model a starting point to react to rather than building from scratch.</li>
  <li><strong>“Ask me questions exhaustively”</strong> - I put this at the end of the prompt. It gets the AI to ask clarifying questions before jumping straight to conclusions, which makes the first plan way better.</li>
</ul>

<p>I also built up a <code class="language-plaintext highlighter-rouge">CLAUDE.md</code> file - project instructions that Claude reads at the start of every session. I had Claude set it up initially after asking me some questions, but after a couple sessions I started having Claude update it with any learnings from each session. Over time it became the project’s memory - data quirks, strategy decisions, target allocations. When I came back days later, Claude picked up right where we left off.</p>

<h2 id="parsing-the-mess">Parsing the Mess</h2>

<p>Once I had the goal prompt and data ready, I handed over the wheel. Claude (Opus 4.6) worked for a while on its own - writing Python scripts to parse all the CSVs, classify every holding into target categories, handle encoding issues, deduplicate overlapping exports, and deal with non-standard line items that would otherwise be misclassified. It iterated through problems as it hit them and ended up in a really good place. The baseline it produced - a full gap analysis across every account and category - was impressive enough to start working from immediately.</p>
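<p>The core move in that parsing work - reading whatever headers are present and mapping them onto canonical fields - can be sketched in a few lines. This is an illustrative reconstruction with hypothetical header variants, not the actual scripts Claude wrote:</p>

```python
import csv
import io

# Map each canonical field to the header variants a brokerage export
# might use. These variants are hypothetical examples.
HEADER_MAP = {
    "symbol": {"symbol", "ticker"},
    "value": {"current value", "market value"},
    "cost_basis": {"cost basis", "cost basis total"},
}

def parse_holdings(text):
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {}
        for canonical, variants in HEADER_MAP.items():
            for header, value in raw.items():
                if header and header.strip().lower() in variants:
                    row[canonical] = value.strip().lstrip("$")
        rows.append(row)
    return rows

sample = "Ticker,Market Value,Cost Basis\nVTI,$12000,$9000\n"
print(parse_holdings(sample))
# [{'symbol': 'VTI', 'value': '12000', 'cost_basis': '9000'}]
```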

<h2 id="iterating-as-thought-partners">Iterating as Thought Partners</h2>

<p>The initial plan identified overweight categories, found tax-location violations, recommended specific fund swaps with expense ratio comparisons and Morningstar ratings, and calculated one-time tax costs versus ongoing savings. It even recommended tax-loss harvesting partner funds for every new position.</p>

<p>From there, it was like working with an analyst who has all the data but hasn’t lived with the accounts. I’d clarify something or reframe a constraint, and Claude would update every calculation, table, and recommendation in the plan. A few examples:</p>

<p><strong>I refined our international allocation with Claude.</strong> The initial plan had a single “international” bucket, but I wanted more nuance - specific targets across value vs. growth, small vs. large. I worked back and forth with Claude to figure out what that breakdown should look like and which funds would get us there. That kind of allocation decision is hard to think through alone because every change affects the rest of the portfolio.</p>

<p><strong>I asked Claude to self-critique</strong> - “rate this plan 1-10 and tell me what you’d change.” It gave itself a 7.5 and identified seven improvements. The biggest was hiding in plain sight: we had a large overweight position sitting in tax-free retirement accounts, where selling costs literally zero in taxes. The original plan had left it untouched. It also caught that its own retirement account deployment was putting money into a category that was already overweight. I approved the changes but asked for a phased approach rather than all-at-once.</p>

<p><strong>I reframed which accounts count toward targets.</strong> We decided to treat my wife’s employer retirement account as static for now - we wouldn’t touch it or change its allocations. This is an artificial constraint, but it simplifies the plan, and once we lock in the other changes we can always run the process again with new constraints if we want to modify that account later. The initial analysis left it out entirely. But it still holds real money in specific categories, and that affects what the rest of the portfolio needs to do.</p>

<p>Once I pointed this out, Claude built a combined framework quickly: define targets for the full portfolio, subtract what the static account provides, and optimize the rest to fill the gaps. Some overweight categories got worse, some underweight ones became more urgent, and the recommended purchases changed.</p>
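<p>The arithmetic behind that framework is simple, and a sketch makes it concrete. All numbers here are invented for illustration:</p>

```python
# Combined-framework sketch: set targets for the whole portfolio,
# subtract what the static (untouched) account already provides, and
# the remainder is what the managed accounts need to hold. All numbers
# are made up for illustration.
full_targets = {"US Equity": 0.60, "International": 0.30, "Bonds": 0.10}
static_account = {"US Equity": 40_000, "Bonds": 10_000}
managed_total = 150_000

total = managed_total + sum(static_account.values())  # 200,000 overall

managed_targets = {
    category: weight * total - static_account.get(category, 0)
    for category, weight in full_targets.items()
}
print(managed_targets)
# {'US Equity': 80000.0, 'International': 60000.0, 'Bonds': 10000.0}
```

<p>With the static account counted, the managed accounts need less US equity and more international than the headline percentages alone would suggest - which is exactly why the recommended purchases changed.</p>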

<h2 id="what-it-produced">What It Produced</h2>

<p>I ended up with a ten-section plan document with appendices - and several sections went beyond what I asked for:</p>

<ul>
  <li><strong>Complete portfolio snapshot</strong> - Every account, every holding, with cost basis, embedded gains, expense ratios, and tax treatment. All parsed automatically from the raw CSV exports.</li>
  <li><strong>Gap analysis across ten fund categories</strong> - Current allocation versus targets, with the dollar gap and priority level for each. Included a visual bar chart showing the drift and a nine-box style grid (value/blend/growth vs large/mid/small) showing the portfolio tilt before and after.</li>
  <li><strong>Combined portfolio framework</strong> - A table showing how an account we’d decided not to touch still changes the effective targets for the rest of the portfolio.</li>
  <li><strong>Four-phase sequenced action plan</strong> - Concrete steps organized from “execute this week” through “ongoing quarterly.” Each action specified the fund, the dollar amount, the account, the tax cost, and the rationale. Separate actions for retirement accounts (zero tax) versus taxable (calculated tax impact).</li>
  <li><strong>Tax impact summary with payback periods</strong> - A table for every recommended trade showing the gain, tax cost, annual tax savings, and the breakeven timeline. A separate table for positions explicitly <em>not</em> recommended to sell - calculating the tax cost if you did sell and explaining why the math didn’t work.</li>
  <li><strong>Fund recommendations with deep research</strong> - An active buy list, a stop-buying list, and a hold-and-dilute list. Claude compared candidates on factor loadings, expense ratios, Morningstar ratings, AUM, and whether they screen out low-quality companies. Each recommendation included a tax-loss harvesting partner: a similar-but-not-identical fund from a different provider, so if a position drops you can capture the loss and immediately buy the partner to maintain exposure without triggering wash sale rules.</li>
  <li><strong>Tax location matrix</strong> - Which asset type belongs in which account type (pre-tax, Roth, taxable) and why, with a list of current violations and the action to fix each one.</li>
  <li><strong>Retirement contribution structure optimization</strong> - Claude looked at my wife’s employer retirement contributions unprompted and found that switching one component from after-tax to pre-tax treatment would save thousands annually, with the ability to convert at much lower rates during early retirement.</li>
  <li><strong>Natural dilution timeline</strong> - Projections for how long each overweight category takes to reach target through new money allocation alone, accounting for recurring contributions and dividend flows.</li>
</ul>
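<p>The payback-period math in that tax table reduces to a one-time cost divided by an annual saving. A sketch with invented numbers and an assumed 15% long-term capital gains rate:</p>

```python
# Payback period for a taxable fund swap: one-time capital-gains tax
# cost vs. annual expense-ratio savings. All inputs are made up, and
# the 15% rate is an assumed long-term capital gains bracket.
position_value = 50_000
embedded_gain = 20_000
ltcg_rate = 0.15
old_expense_ratio = 0.0065
new_expense_ratio = 0.0003

tax_cost = embedded_gain * ltcg_rate                                       # $3,000 once
annual_savings = position_value * (old_expense_ratio - new_expense_ratio)  # $310/yr
payback_years = tax_cost / annual_savings

print(f"payback: {payback_years:.1f} years")  # payback: 9.7 years
```

<p>A payback this long is exactly the kind of trade that lands in the “not recommended to sell” table.</p>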

<figure>
<img src="/docs/assets/images/ai-financial-advisor/allocation-shift.png" width="450px" />
<figcaption><em>Anonymized excerpt from the plan - before/after allocation shift across categories</em></figcaption>
</figure>

<figure>
<img src="/docs/assets/images/ai-financial-advisor/style-grid.png" width="400px" />
<figcaption><em>Anonymized excerpt from the plan - nine-box style grid showing portfolio tilt</em></figcaption>
</figure>

<figure>
<img src="/docs/assets/images/ai-financial-advisor/tax-impact-analysis.png" width="400px" />
<figcaption><em>Anonymized excerpt from the plan - tax impact and payback period for a recommended trade</em></figcaption>
</figure>

<p>To be clear, this wasn’t a science experiment. I’m actively using this plan and I’m almost done executing it. I had Claude remix the full document into a shorter checklist so I can scan it and check things off as I go.</p>

<h2 id="if-you-want-to-try-this">If You Want to Try This</h2>

<ul>
  <li><strong>Export everything you can into files.</strong> Download CSV holdings from every brokerage account. Create a text file for anything without an export - bank account balances, employer retirement plan details (holdings, contribution amounts, fund options), automated investment settings, any pending changes you’re considering.</li>
  <li><strong>Write a detailed goal prompt.</strong> State your tax situation, risk tolerance, and time horizon. List your constraints - accounts you don’t want to touch, positions you can’t sell, contribution limits. Include what you already suspect is wrong. Add a strawman target allocation if you have one - it gives the AI something concrete to react to. Ask for a sequenced action plan, not general advice. And end with <strong>“ask me questions exhaustively”</strong> - the AI will ask clarifying questions instead of filling in the blanks, so you’re not leaving anything ambiguous about what you actually want.</li>
  <li><strong>Iterate like you would with a human advisor.</strong> The first plan will be solid but will miss things only you know about your accounts. Each clarification is fast; the AI updates every calculation in the plan instantly. Ask it to rate its own plan 1-10 - it’ll find things it missed.</li>
  <li><strong>Build project context that persists.</strong> If you’re using Claude Code, a <code class="language-plaintext highlighter-rouge">CLAUDE.md</code> file in your project directory carries across sessions. Add key decisions, data quirks, and what you’ve figured out as you go. Every future session builds on everything that came before.</li>
</ul>

<h2 id="the-work-compounds">The Work Compounds</h2>

<p>This blog post was written with the help of Claude Code. I pointed it at the same project directory, told it to read the session history and plan documents, and asked it to write a post about the process. The work you do with these tools compounds - you can distill it into other artifacts or remix it with other content. The prompt that kicked off this session:</p>

<p><img src="/docs/assets/images/ai-financial-advisor/blog-post-prompt.png" width="900px" />
<br /></p>

<p>Because the project context was already in the filesystem - the optimization plan, the session summaries, the strategy decisions - Claude had everything it needed to draft the post. I iterated on it from there, but the starting point was strong because all the context was already there from previous sessions.</p>

<p>A year ago this would have been a week of spreadsheet work or a few thousand dollars to a financial advisor. Instead it was a few evenings of back-and-forth with an AI, and the plan it produced is the one we’re executing.</p>]]></content><author><name>Matt Stockton</name><email>mattstockton@gmail.com</email></author><category term="Finance &amp; Investing" /><category term="personal-finance" /><category term="ai-tools" /><category term="claude-code" /><category term="portfolio-analysis" /><category term="workflow" /><summary type="html"><![CDATA[I fed our brokerage data into Claude Code and iterated over multiple sessions to produce a portfolio optimization plan - phased actions, tax impact analysis, fund recommendations. From messy CSVs to an actionable plan we're now executing.]]></summary></entry><entry><title type="html">What I Tell People Getting Started with Claude Code</title><link href="https://mattstockton.com/2026/01/29/what-i-tell-people-getting-started-with-claude-code.html" rel="alternate" type="text/html" title="What I Tell People Getting Started with Claude Code" /><published>2026-01-29T00:00:00+00:00</published><updated>2026-01-29T00:00:00+00:00</updated><id>https://mattstockton.com/2026/01/29/what-i-tell-people-getting-started-with-claude-code</id><content type="html" xml:base="https://mattstockton.com/2026/01/29/what-i-tell-people-getting-started-with-claude-code.html"><![CDATA[<p>I recently spent an hour walking a friend through how I use <a href="https://www.anthropic.com/claude-code">Claude Code</a>. He’s not a developer - he manages a portfolio of 90+ companies and does a lot of knowledge work: newsletters, performance reports, meeting notes. He’d heard me talk about these tools and wanted to see what was possible.</p>

<p>It was a good conversation. Pairing on this together - actually trying things instead of just describing them - worked better than I expected. He could see things in action and ask questions as we went. Afterwards I wanted to write down some of the takeaways, some of which I’ve <a href="/2026/01/07/claude-code-for-non-technical-work.html">written about before</a>.</p>

<p>Most of these are habits rather than features - ways of working with the tool that make everything else easier.</p>

<h2 id="plan-mode">Plan Mode</h2>

<p>Claude Code has a “plan mode” - a thinking mode before execution. When you put Claude in plan mode, it won’t take any actions. It thinks through what you’re asking for and asks clarifying questions. Type <code class="language-plaintext highlighter-rouge">/plan</code> or press shift-tab before describing what you want.</p>

<p>I told my friend: always be in plan mode for anything non-trivial. Plan mode lets you explore without committing to anything. You can describe a vague idea, let Claude ask questions, refine your thinking, and only then decide whether to proceed. If you jump straight to execution, Claude might start creating files or making changes before you’ve fully figured out what you want. When planning is complete, Claude will ask if you want to clear the context and execute. Clear the context - it prevents the model from getting confused by all the back-and-forth exploration that happened during planning. You keep the plan, lose the noise.</p>

<p>You’re never committing to anything until you’ve seen Claude’s proposed approach and explicitly approved it.</p>

<h2 id="the-interrogation-pattern">The Interrogation Pattern</h2>

<p>You don’t need 100% clarity on what you want before you start. You can use Claude to pull it out of you.</p>

<p>Tell Claude to “ask me questions exhaustively” or “use AskUserQuestion exhaustively to understand what I want.” Claude will keep asking clarifying questions until you tell it to stop. It pulls information out of you that you didn’t know you needed to provide. It asks about edge cases you hadn’t considered. It catches assumptions you didn’t realize you were making.</p>

<p>You don’t need to be precise upfront. You don’t need to know the right terminology or anticipate what Claude needs to know. Just describe your goal and let Claude figure out what questions to ask. It helps to know how to instruct the model, but you don’t need to be perfect at it - you can iterate, and answering questions is easier than crafting the perfect prompt.</p>

<h2 id="work-logging-and-compounding">Work Logging and Compounding</h2>

<p>Claude Code loves files. It can read them, search them, reference them later. So store information about your work in files - work logs, commit messages, meeting notes, project summaries. Be verbose. You can always trim it down later, but you can’t recover context you didn’t capture.</p>

<p>When Claude can read what you’ve done before, it produces better outputs. You’re not starting from zero every session. I can ask Claude to look at my git commits and summarize what I worked on last week. I can point it at meeting notes and have it draft a follow-up. None of this happens if you’re using Claude as a chat interface that forgets everything between sessions.</p>

<p>Automate the capture where you can. You can get Claude Code to log its own work by setting up good CLAUDE.md files and skills - and you can use plan mode and the interrogation pattern to build those. Tell Claude what you want to track, let it ask questions, and have it create the instructions for itself. You’re not going to get it right the first time. But as you figure out patterns for storing what you’ve learned, things just keep getting better. That’s how I put it to my friend: “things just magically get better” as context accumulates.</p>

<h2 id="git-repository-backing">Git Repository Backing</h2>

<p>This is probably the biggest technical hurdle for folks who aren’t technical. But I do think it’s necessary, and it’s achievable - Claude Code can help you get it set up.</p>

<p>A git repository is just a folder where changes are tracked over time. Every time you save a checkpoint (called a “commit”), git remembers what changed and lets you add a note about why.</p>

<p>Files need history - not just for version control, but so the system itself can reference what changed and when. I have a skill that looks at uncommitted changes, figures out what I did based on the changes and session context, adds an entry to my work log, and creates a detailed git commit. When I need to know what I worked on, I ask Claude to look at the git commits since a certain date and give me a summary. It looks at the changes in each commit and tells me what I did.</p>

<p>History gives Claude something to work with beyond just the current state of your files.</p>

<h2 id="the-slot-machine-mentality">The Slot Machine Mentality</h2>

<p>Bias toward action and iteration rather than reviewing everything upfront. Work happens fast enough now that you can throw things away - if something isn’t working or heads in the wrong direction, scrap it and start over.</p>

<p>I’ve gotten more comfortable not reading every detail of what Claude plans to do. I just let it execute. Because first, it’s probably right. And second, if it’s not right - pull the slot machine again. It’s lower cost to iterate than to review everything upfront.</p>

<p>You don’t need to understand every line of what Claude produces. You need to understand the output and whether it matches what you wanted. If it doesn’t, try again. Describe what’s wrong and let Claude fix it. If a session goes off the rails, type <code class="language-plaintext highlighter-rouge">/rewind</code> to backtrack to an earlier point in the conversation.</p>

<p>Plan mode, git, and <code class="language-plaintext highlighter-rouge">/rewind</code> all let you back out of mistakes. Don’t let the desire for certainty slow you down.</p>

<h2 id="building-skills-through-doing">Building Skills Through Doing</h2>

<p>Skills are saved instructions that tell Claude how to do a specific task. They’re markdown files that describe what Claude should do, what questions to ask, and how to format the output. I have skills for summarizing meetings, committing changes with work log updates, turning transcripts into blog posts. Once you have a skill, you type a command and Claude follows those instructions.</p>

<p>I wouldn’t try to create skills before you’ve done the task manually at least once. Work through a real task with Claude - post a transcript, describe a document you need, iterate until you get something you like. Then tell Claude to turn that conversation into a skill you can run next time.</p>

<p>You don’t even need to read the skill Claude creates. Just trust that it captured what worked. Next time you have a similar task, invoke the skill. If the situation is slightly different, just tell Claude - it adapts. You figure out what works through doing, then codify it.</p>

<h2 id="getting-started">Getting Started</h2>

<p>People are often hesitant to start because they’re not sure how. But there’s not much downside here. You can explore without committing, revert when things go wrong, start without knowing exactly what you want. If something breaks, you try again.</p>

<p>The skills that matter most for using Claude Code are non-technical: curiosity, persistence, clear thinking, confidence. It helps to understand concepts like git and file organization, but Claude Code can help you learn those too - if you keep trying things. After our session, my friend said what helped most was actually seeing what this looks like in practice. That’s hard to get from reading. You have to try it.</p>

<p>If you’re already using Claude Code and have patterns that work for you, I’d like to hear about them.</p>]]></content><author><name>Matt Stockton</name><email>mattstockton@gmail.com</email></author><category term="Getting Started with AI" /><category term="claude-code" /><category term="workflow" /><category term="productivity" /><category term="ai-tools" /><category term="knowledge-work" /><summary type="html"><![CDATA[Plan mode, work logging, the interrogation pattern - habits that make Claude Code useful whether you write code or not.]]></summary></entry><entry><title type="html">The Biggest AI Opportunity Isn’t Better Models</title><link href="https://mattstockton.com/2026/01/10/the-biggest-ai-opportunity-isnt-better-models.html" rel="alternate" type="text/html" title="The Biggest AI Opportunity Isn’t Better Models" /><published>2026-01-10T00:00:00+00:00</published><updated>2026-01-10T00:00:00+00:00</updated><id>https://mattstockton.com/2026/01/10/the-biggest-ai-opportunity-isnt-better-models</id><content type="html" xml:base="https://mattstockton.com/2026/01/10/the-biggest-ai-opportunity-isnt-better-models.html"><![CDATA[<p>Six months ago, an old colleague, Martin, reached out to me after reading some of my writing about AI. He’s an expert in zeolites - microporous crystalline minerals used in catalysis and filtration. Only about 250 zeolite structures have been synthesized, but millions are theoretically possible. We sat down and he showed me his website, <a href="http://www.hypotheticalzeolites.net/">Hypothetical Zeolites</a>, and software he’d built to computationally generate and test potential new structures. People from industry were using it. He’d been at this for years. I was fascinated by this domain I had no idea existed - the next day I spent an hour on ChatGPT voice going back and forth learning more about zeolites. It made me realize just how many different things you can be an expert on in this world.</p>

<p>He had ideas for improving his platform, so I nudged him to start experimenting with AI tools. <a href="/2025/06/14/ai-coding-tools-journey.html">Claude Code</a> existed but was still nascent - not yet widely adopted. He started using various AI tools and had some success with them. Now, six months later, thinking about what he wanted to build - ways to make his research faster - I keep imagining what would happen if he got proficient with today’s tools. Really proficient, not just dabbling. He might be able to move 10x faster on problems he’s been chipping away at for years.</p>

<p>That got me thinking. How many other people like Martin are out there? By “like Martin” I mean: deep expertise in a narrow field, probably building tools or workflows to support their work, but not deep in the rabbit hole of understanding the best ways to use leading edge AI tools.</p>

<p>Here’s what happens when someone like that figures it out. Andrew Hall, a political economist at Stanford, recently <a href="https://x.com/ahall_research/status/2007603340939800664">shared on Twitter</a> how he used Claude Code to replicate and extend one of his old papers on vote-by-mail and election turnout. It downloaded his original repo, translated Stata code to Python, crawled the web for updated election and census data, ran new analyses through 2024, created tables and figures, performed a lit review, wrote a new paper, and pushed everything to GitHub. The whole thing took about an hour. His take: “This is an insane paradigm shift in how empirical work is done.” What he did is impressive - and once you know what these tools can do, it’s not surprising at all.</p>

<h2 id="my-hypothesis">My Hypothesis</h2>

<p>Martin’s work is unique, but this pattern isn’t. There are specialists in every field - rare diseases, educational methods, obscure industrial processes. They’ve built tools and workflows for years. Many are using AI tools, but there’s a gap between that and knowing what the leading edge can actually do.</p>

<p>AI coding tools can now write working software from a description of what you want. That’s what Andrew Hall did - and it’s what Martin could be doing too.</p>

<p>The tools exist. The experts exist. But the experts don’t know what the tools can do, and the people who know the tools don’t understand what the experts are working on.</p>

<h2 id="if-this-is-true-what-do-you-do-about-it">If This Is True, What Do You Do About It?</h2>

<p>So how do you fix this? My first thought was scale - build a platform, write guides, create content that reaches lots of people.</p>

<p>But I don’t think that’s how this works. You need to show people the capabilities in the context of what they’re already doing - that’s what creates the “aha” moment. A generic guide or tutorial just can’t do that.</p>

<p>I keep hearing this narrative: if AI progress stopped today, we’d have 10 years of adoption work ahead of us just to integrate what already exists. I actually think that’s true based on what I’ve seen. It’s frustrating - we have these superpowers available and it’s going to take a while for most people to use them. That’s probably always going to be true. But is there a way to shortcut this for people like Martin?</p>

<p>I think there is. Find people like Martin - people doing work where I look at it and think “I have no idea what this is, but it seems important.” People where I can see a pattern for how they could use these tools more effectively. Sit down with them, understand what they’re trying to do, and figure out what’s possible.</p>

<h2 id="two-asks">Two Asks</h2>

<p><strong>If you’re a specialist doing deep work in a narrow field:</strong> I want to hear what you’re working on. What’s tedious? What would you build if you could? I’m not selling anything - I just want to understand where you’re stuck and show you what might be possible.</p>

<p><strong>If you’re already deep in AI like me:</strong> Think about finding a Martin in your network. The most useful thing you can do right now probably isn’t building another demo. It’s sitting down with someone who knows their field cold and showing them what these tools can do for their specific work.</p>

<p>If this sounds like you, you have a view on this hypothesis, you know someone like Martin, or you know someone who’s already in the weeds using these tools to accelerate their work - send me a message on <a href="https://www.linkedin.com/in/mattstockton/">LinkedIn</a> or shoot me an email. I’d love to hear more examples.</p>]]></content><author><name>Matt Stockton</name><email>mattstockton@gmail.com</email></author><category term="AI Strategy &amp; Leadership" /><category term="ai-adoption" /><category term="domain-experts" /><category term="claude-code" /><category term="ai-tools" /><category term="strategy" /><summary type="html"><![CDATA[There are specialists in every field who could massively accelerate their work with AI tools - but they don't know what's possible. I want to find them.]]></summary></entry><entry><title type="html">A Data Engineering Lesson for AI Agent Builders</title><link href="https://mattstockton.com/2026/01/09/a-data-engineering-lesson-for-ai-agent-builders.html" rel="alternate" type="text/html" title="A Data Engineering Lesson for AI Agent Builders" /><published>2026-01-09T00:00:00+00:00</published><updated>2026-01-09T00:00:00+00:00</updated><id>https://mattstockton.com/2026/01/09/a-data-engineering-lesson-for-ai-agent-builders</id><content type="html" xml:base="https://mattstockton.com/2026/01/09/a-data-engineering-lesson-for-ai-agent-builders.html"><![CDATA[<p>Nobody can agree on what an agent is, but most definitions share a common thread: it’s software that uses an LLM to take actions, not just generate text. It can read files, call APIs, run commands, and make decisions based on what it finds. I wrote more about this in my post on <a href="/2025/12/29/why-tool-calling-and-file-system-access-matter.html">tool calling and file system access</a>.</p>

<p>Foundation Capital’s article on <a href="https://foundationcapital.com/context-graphs-ais-trillion-dollar-opportunity/">context graphs</a> has been circulating widely, and <a href="https://podcasts.apple.com/us/podcast/context-graphs-ais-next-big-idea/id1680633614?i=1000743886766">NLW covered it</a> on his podcast. The core idea: as agents orchestrate decisions across companies, they can capture the reasoning and context that led to specific outcomes. This information currently lives in Slack threads, people’s heads, or nowhere. Agents are in the execution path. They can record it.</p>

<p>One piece of their argument that I think deserves more attention: don’t over-constrain the format of what you capture. Let the agent figure out what belongs in the trace. This connected to something I learned years ago in data engineering.</p>

<h2 id="the-web-scraping-lesson">The Web Scraping Lesson</h2>

<p>At <a href="https://ryancaldbeck.medium.com/announcing-the-launch-of-helio-b06458a27af">CircleUp</a>, we were building an authoritative data set for understanding consumer packaged goods - what products existed, what brands were out there, where they were sold, how they changed over time. A core part of this was pulling in unstructured data from various sources. One example: scrapers that collected product information from brand websites and retailers - prices, descriptions, SKU numbers, ingredients. We cared about this data over time, not just a snapshot. We wanted to track how products changed month to month.</p>

<p>The MVP version of these scrapers transformed data inline. We’d find the HTML elements for the attributes we wanted, extract those values, and store them in a database.</p>

<p>Then someone asked about ingredient changes. We could add ingredient tracking going forward, but what about the historical data? Products that had reformulated - we couldn’t see what their ingredients used to be. We’d thrown away everything except the fields we thought we needed.</p>

<p>We caught this early and fixed it. The fix: store the entire HTML page, even if you only need three fields right now. Storage is cheap. You can’t recreate the ability to extract new information from historical data. The lesson stuck.</p>

<p>Once you learn this, your brain gets tuned to ask: what should I be storing that I’m not? API responses where you only need one field? Store the full payload. Event streams you’re filtering? Keep the raw stream somewhere. You can always add structure later. You can’t go back and capture what you didn’t store.</p>
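<p>A minimal sketch of that pattern in Python - store the whole page, and treat extraction as a function you can rerun. The file layout, field names, and regexes are illustrative stand-ins for whatever parser you actually use:</p>

```python
import re, tempfile, time
from pathlib import Path

def save_raw(raw_dir: Path, product_id: str, html: str) -> Path:
    """Persist the entire page before any extraction - storage is cheap."""
    raw_dir.mkdir(parents=True, exist_ok=True)
    path = raw_dir / f"{product_id}-{int(time.time())}.html"
    path.write_text(html, encoding="utf-8")
    return path

def extract(html: str, fields: dict) -> dict:
    """Extraction is a pure function of the raw page, so new fields
    can be pulled from historical pages just by rerunning it."""
    return {name: (m.group(1) if (m := re.search(pat, html)) else None)
            for name, pat in fields.items()}

page = '<div class="price">$4.99</div><div class="ingredients">oats, honey</div>'
raw_dir = Path(tempfile.mkdtemp())
save_raw(raw_dir, "granola-123", page)

# Day one: we only cared about price.
v1 = extract(page, {"price": r'class="price">([^<]+)'})

# Months later: add ingredients and rerun over every stored page.
fields = {"price": r'class="price">([^<]+)',
          "ingredients": r'class="ingredients">([^<]+)'}
history = [extract(p.read_text(), fields) for p in raw_dir.glob("*.html")]
```

<p>The inline-transform MVP would have thrown the page away after the first <code class="language-plaintext highlighter-rouge">extract</code>; storing it is what makes the second one possible.</p>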

<h2 id="decision-traces-are-the-same-problem">Decision Traces Are the Same Problem</h2>

<p>Agent reasoning works the same way. Consider an agent handling subscription renewals. A customer asks for a discount. The agent checks their support ticket history - three escalations in the past quarter, two unresolved for over a week. It looks at usage patterns - engagement dropped 40% after the last outage. It reviews similar customers who churned and spots warning signs. Based on all this, it recommends a 15% discount to retain the account.</p>

<p>Most systems would only capture the outcome - a field in the CRM that says “discount: 15%”. The reasoning - the support history it reviewed, the usage patterns it analyzed, the churn signals it identified, the similar cases it compared against - gets thrown away.</p>

<p>Systems of record capture state. They don’t capture decision lineage. The Foundation Capital article calls the accumulated decision lineage a “context graph” - traces stitched together over time so precedent becomes searchable.</p>
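<p>A sketch of what capturing both sides might look like - the structured outcome for the system of record, and the verbose trace stored raw next to it. The schema is deliberately thin and the names are made up:</p>

```python
import json, tempfile, uuid
from datetime import datetime, timezone
from pathlib import Path

def record_decision(log_dir: Path, outcome: dict, trace: str) -> Path:
    """Write the structured outcome and the full reasoning trace together.
    The outcome feeds your system of record; the trace is append-only raw data."""
    log_dir.mkdir(parents=True, exist_ok=True)
    entry = {
        "id": str(uuid.uuid4()),
        "at": datetime.now(timezone.utc).isoformat(),
        "outcome": outcome,   # e.g. {"discount": 0.15}
        "trace": trace,       # verbose, unconstrained reasoning
    }
    path = log_dir / f"{entry['id']}.json"
    path.write_text(json.dumps(entry, indent=2))
    return path

p = record_decision(
    Path(tempfile.mkdtemp()),
    {"discount": 0.15},
    "Three escalations last quarter, two unresolved over a week. "
    "Engagement down 40% since the last outage. Similar accounts churned "
    "within 60 days. Recommending 15% to retain; uncertain whether 10% would suffice.",
)
```

<p>The CRM still gets its “discount: 15%” field - the difference is that the reasoning now exists somewhere too.</p>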

<p><img src="/docs/assets/images/decision-traces/store_raw_outputs.png" alt="Store the Raw Trace, Structure It Later" /></p>

<p>The connection to my <a href="/2026/01/03/four-building-blocks-for-document-generation-agents.html">document generation agents post</a> is direct. I called it “context tracing” - asking the model to generate a decision log alongside its output. What was considered? What was rejected? Why did it rate this evidence as strong vs. weak?</p>

<p>But I prescribed a specific format for those traces because I was targeting specific document types. Reading about context graphs made me reconsider: should I loosen those constraints?</p>

<h2 id="why-unstructured-traces-are-better">Why Unstructured Traces Are Better</h2>

<p>The natural approach is to define a schema for decision traces upfront. But if you lock in a structure too early, you limit what you can extract later.</p>

<p>Models are good at turning unstructured data into structured data - assuming the information exists in the unstructured source. If it’s not there, no post-processing will create it.</p>

<p>Verbose, unstructured traces let you change your mind. You can extract different structures later for different purposes. You can ask the model to create different views on the same data.</p>

<p>You can also discover patterns you didn’t anticipate. If you’re capturing verbose reasoning and later notice your agents consistently make similar exceptions for a certain type of customer, that’s a pattern you can codify - or at least understand. Structured traces with predefined fields would never surface it.</p>

<p>This is the data engineering pattern: store raw, transform on read. Store the full HTML. Later run Spark or DuckDB to extract whatever structure you need. If your transformation logic has bugs, fix it and rerun. The raw data is your source of truth.</p>

<p>With decision traces, the transformation uses an LLM instead of traditional ETL. More expensive per run, but negligible compared to not having the data at all.</p>
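<p>Concretely, transform-on-read over traces might look like this - a view-builder that maps raw traces through whatever extractor you care about today. In production the extractor would be an LLM call; the keyword check below is just a stand-in so the sketch runs:</p>

```python
import json, tempfile
from pathlib import Path

def build_view(trace_dir: Path, extract) -> list:
    """Derive a structured view from raw traces on read. A different
    schema is just a different `extract` rerun over the same files."""
    rows = []
    for path in sorted(trace_dir.glob("*.json")):
        entry = json.loads(path.read_text())
        rows.append({"id": entry["id"], **extract(entry["trace"])})
    return rows

def churn_signals(trace: str) -> dict:
    """Stand-in for an LLM extraction prompt."""
    return {"mentions_outage": "outage" in trace.lower()}

# One raw trace on disk, stored verbatim at decision time.
trace_dir = Path(tempfile.mkdtemp())
(trace_dir / "a.json").write_text(json.dumps(
    {"id": "a", "trace": "Engagement dropped 40% after the last outage."}))

rows = build_view(trace_dir, churn_signals)
```

<p>If the extraction logic turns out to be wrong, you fix it and rerun - the raw traces remain the source of truth.</p>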

<h2 id="context-engineering-for-tracing">Context Engineering for Tracing</h2>

<p>Unstructured doesn’t mean effortless. You still have to invest in the prompts and context that tell the agent how to trace its reasoning.</p>

<p>You can’t just say “capture a decision trace.” You have to define what that means - but in terms that don’t constrain what the model can include. The instructions should describe what kinds of reasoning to document, not a rigid schema.</p>

<p>Something like: “For each decision, document what information you considered, what alternatives you evaluated, what made you choose this option over others, and any uncertainty you have about your choice.” Specific enough to be useful. Doesn’t prescribe fields or formats.</p>
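<p>One way to keep that guidance reusable is to hold it as a block of prompt text appended to whatever system prompt the agent already uses. The wording here is a sketch, not a tested prompt:</p>

```python
TRACE_INSTRUCTIONS = """\
For each decision you make, also write a decision trace. Document:
- what information you considered
- what alternatives you evaluated
- why you chose this option over the others
- any uncertainty you have about the choice
Write it as free-form prose. Include anything that influenced the
decision, even if it seems minor. Do not use a fixed schema.
"""

def with_tracing(system_prompt: str) -> str:
    """Append the tracing instructions to an agent's system prompt."""
    return system_prompt.rstrip() + "\n\n" + TRACE_INSTRUCTIONS

prompt = with_tracing("You handle subscription renewal requests.")
```

<p>Notice it names the kinds of reasoning to document without prescribing fields or formats.</p>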

<p>The decision trace is almost as important as the decision itself. Once you’ve collected enough traces, you can use an LLM to distill learnings from them. Ask it what patterns it notices. Take a contrarian approach: “What should we be doing differently based on these traces?” Having that information gives you more to work with as you iterate. The traces feed back into improving the agent.</p>

<h2 id="the-one-way-door">The One-Way Door</h2>

<p>Some decisions are hard or impossible to reverse. Choosing not to capture decision traces is one of them. Each agent execution generates reasoning that, if not recorded, is sealed away.</p>

<p>Most people building agent systems focus on getting to the right decision. What’s the correct discount rate? Did the agent extract the right claims? That matters. But if you’re only optimizing for the immediate output, you’re walking through one-way doors with every execution.</p>

<p>The clearest path to keep this door two-way: capture unconstrained, verbose decision traces. Give real thought to how you instruct the agent to do this. It’s a core part of your system design.</p>

<p>Store the raw trace. Structure it later.</p>]]></content><author><name>Matt Stockton</name><email>mattstockton@gmail.com</email></author><category term="Building with LLMs" /><category term="agentic-workflows" /><category term="observability" /><category term="data-engineering" /><category term="decision-traces" /><category term="ai-agents" /><summary type="html"><![CDATA[Agent decisions are one-way doors. A lesson I learned building data infrastructure applies directly: store the raw reasoning, add structure later.]]></summary></entry><entry><title type="html">How I Use Claude Code for Non-Technical Work</title><link href="https://mattstockton.com/2026/01/07/claude-code-for-non-technical-work.html" rel="alternate" type="text/html" title="How I Use Claude Code for Non-Technical Work" /><published>2026-01-07T00:00:00+00:00</published><updated>2026-01-07T00:00:00+00:00</updated><id>https://mattstockton.com/2026/01/07/claude-code-for-non-technical-work</id><content type="html" xml:base="https://mattstockton.com/2026/01/07/claude-code-for-non-technical-work.html"><![CDATA[<p><a href="https://www.anthropic.com/claude-code">Claude Code</a> isn’t just for writing software. I’ve been using it to run my consulting business since mid-2025 - client documentation, meeting notes, project tracking, email drafts, data analysis. It’s become the primary tool I use for knowledge work. I wrote about this in September when I first set it up as a <a href="/2025/09/19/how-claude-code-became-my-knowledge-management-system.html">knowledge management system</a>, and I’ve continued to refine how I use it since then.</p>

<p>People keep asking how to use Claude Code for non-technical work. Below is a practical reference you can scan and pick from - patterns I’ve found useful. Everyone uses it differently, and you’ll find your own approach once you start.</p>

<p><img src="/docs/assets/images/claude-code-non-technical/claude_code.png" alt="Claude Code for Non-Technical Work" /></p>

<h2 id="setting-up-your-foundation">Setting Up Your Foundation</h2>

<p><strong>Organize your work into folders.</strong> For me, folders represent different clients. I also have a folder for my own consulting business and a personal folder for things that don’t fit elsewhere. Underneath each client folder, I have subfolders for meeting agendas, meeting summaries, and individual projects. Use whatever structure matches how you think.</p>

<p><strong>Use git to track changes.</strong> You can throw things away when Claude goes off track - it has a <code class="language-plaintext highlighter-rouge">/rewind</code> feature, but commit history gives you more flexibility. Claude is also good at reading git history and using <code class="language-plaintext highlighter-rouge">git diff</code> to understand changes. When I’m working on something, I’ll often ask it to look at the history of a specific file, and it uses that to understand the current problem. Commit messages can describe completed units of work, and Claude uses those messages when making decisions later. If you’re non-technical, don’t let git intimidate you - it’s not that complicated, and you can ask Claude Code to help you set up a repository. You don’t need to know git commands to benefit from version tracking.</p>

<p><strong>Bootstrap your CLAUDE.md by talking.</strong> When you’re starting fresh, record yourself describing how you intend to use the repository. Use Apple Voice Memos or whatever you have. Be verbose - talk about project tracking, to-do lists, document templates, whatever you’re trying to accomplish. Don’t worry about structure.</p>

<p>Once you have a transcript, put Claude Code in plan mode and ask it to review the transcript, then use its AskUserQuestion tool exhaustively to clarify how you want the system to work. Let it ask questions until you’re satisfied it understands your intentions - you’ll probably have to tell it to stop, because it will keep asking if you told it to be exhaustive. Then have it generate a CLAUDE.md file from that conversation. You’ll have something workable to start with.</p>

<p>This voice-to-transcript pattern is one of the most useful techniques I’ve found for working with Claude Code. I use it constantly - not just for bootstrapping CLAUDE.md files, but for meeting summaries, project kickoffs, brain dumps, even drafting emails. Talking is faster than typing, and you capture nuance and context you’d otherwise leave out. Record yourself, transcribe it, then let Claude turn that raw input into structured output. It’s become a core part of how I work.</p>

<p><strong>Create nested CLAUDE.md files for subfolders.</strong> If your folder structure has meaning - like folders representing different clients - you want specific instructions for each one. Client-specific stakeholders, communication preferences, key projects. You can use the same voice-transcribe-plan process to create these.</p>

<h2 id="session-management">Session Management</h2>

<p><strong>Start new sessions for different units of work.</strong> If you’re starting a new project and want to create a requirements document, use a fresh Claude session. Don’t let unrelated work bleed together unless you want Claude to have that shared context.</p>

<p><strong>Never clear sessions - always exit.</strong> You can resume sessions later by typing <code class="language-plaintext highlighter-rouge">claude --resume</code>, which shows your conversation history. I regularly return to previous conversations to continue work, summarize what I did, or pull out information that only exists in that context window. If you clear a session, that context is gone.</p>

<p><strong>Name sessions you’ll return to.</strong> Use <code class="language-plaintext highlighter-rouge">/rename</code> to give sessions meaningful names. When you resume, you can use the name directly instead of scrolling through a list.</p>

<p><strong>Use /rewind when things go off track.</strong> If Claude starts deviating from what you want, type <code class="language-plaintext highlighter-rouge">/rewind</code> and choose which message to restart from. It rewinds both the conversation and any file changes. This means you don’t have to overthink every message - you can always back up.</p>

<p><strong>Always use plan mode first.</strong> Before having Claude execute anything, put it in plan mode and let it ask questions about what you’re trying to accomplish. It’s faster than going back and forth later.</p>

<p><strong>Avoid long sessions that hit context limits.</strong> If you compartmentalize your work into focused sessions, you won’t have to worry about automatic compaction. I rarely self-compact - I just exit and start fresh when the work changes.</p>

<h2 id="building-systems-that-compound">Building Systems That Compound</h2>

<p><strong>Put a todo.md in each project folder.</strong> I have instructions in my CLAUDE.md describing how I want the to-do system structured - in progress, backlog, and completed with dates. The completed section includes sub-bullets describing what was actually done. Claude can read these files and use them later. I can ask “give me a summary of everything I did last month” and it pulls from my to-dos and git history.</p>

<p><strong>Create a /command for committing changes.</strong> I have a <code class="language-plaintext highlighter-rouge">/commit-changes</code> command that looks at uncommitted files, infers from the changes and session context what happened, and generates a commit message in my preferred format. After working on something, I just type <code class="language-plaintext highlighter-rouge">/commit-changes</code> and it handles the rest.</p>

<p><strong>Maintain a work log.</strong> This is one of the most useful things I do. I keep a worklog.md at the top level of each client folder - not per-project, but one log for everything related to that client. It captures meta information about work I’ve done: dates, category, subject, summary, related files, and key points. My <code class="language-plaintext highlighter-rouge">/commit-changes</code> command appends to this file automatically.</p>

<p>If you capture what you did in a structured way, Claude can use that information later. Need to write a status update? Claude reads the work log. Want to remember why you made a decision three months ago? It’s in the log. The work log becomes a running record that Claude can query, summarize, and reference.</p>

<p><strong>Update your CLAUDE.md from sessions.</strong> At the end of a session where I’ve done something I might want to repeat, I ask Claude to review the conversation and suggest updates to my system instructions. I have a <code class="language-plaintext highlighter-rouge">/update-from-session</code> command that uses AskUserQuestion to clarify what specifically should be captured. Be selective - you don’t want to pollute your CLAUDE.md with noise.</p>

<p><strong>Turn useful sessions into /commands.</strong> Sometimes the work is significant enough that it deserves its own command rather than just updating CLAUDE.md. For example, I have a <code class="language-plaintext highlighter-rouge">/meeting-summary</code> command - I record myself describing what happened in a meeting, and it turns that into a structured summary. I’ve even created a <code class="language-plaintext highlighter-rouge">/command-from-conversation</code> command that helps turn an interactive session into a reusable command.</p>

<p><strong>Put templates in separate files.</strong> If you have specific formats for things like project descriptions or meeting notes, create separate markdown files for those templates. Reference them from your CLAUDE.md or /commands. This keeps your instruction files from getting too large and lets Claude load templates on demand.</p>

<h2 id="working-with-files">Working with Files</h2>

<p><strong>Use @mention when you need specific files.</strong> Claude is generally good at finding relevant files on its own, but if you want to guarantee it reads certain files, type @ and it will autocomplete. I use this frequently when referencing project descriptions or previous meeting notes.</p>

<p><strong>Let Claude write code, even for non-technical work.</strong> If you’re analyzing a CSV or processing text, Claude will often write Python to solve the problem. You don’t need to fully understand the code - just approve it and check the results. If something works well, tell Claude to save it as a reusable script.</p>
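<p>For a sense of what this looks like, here’s the kind of throwaway script Claude might write when you ask a question about a spreadsheet - the column names and numbers are invented for the example:</p>

```python
import csv, io
from collections import defaultdict

def totals_by_category(csv_text: str) -> dict:
    """Sum an 'amount' column grouped by 'category'."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["category"]] += float(row["amount"])
    return dict(totals)

data = "category,amount\ntravel,120.50\nsoftware,49.99\ntravel,80.00\n"
summary = totals_by_category(data)
```

<p>You approve it, check that the totals look right, and if it’s something you’ll need again, tell Claude to save it as a script.</p>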

<p><strong>Consider a monorepo.</strong> I put all my non-technical work into one repository. I haven’t yet found a pattern I like for sharing /commands across repos, and having everything in one place makes cross-project work easier. You can mount other folders if needed, but I haven’t found a strong need to.</p>

<h2 id="the-compounding-effect">The Compounding Effect</h2>

<p>The underlying idea behind all of this: build systems where information compounds. Every document you create, every commit message you write, every /command you build becomes context for future work. Work logs, git history, CLAUDE.md files - all of it is accessible to Claude later.</p>

<p>When Claude can read your previous meeting summaries, your project history, and your documented preferences, it produces outputs that actually match what you want. Chat interfaces start fresh every time - this doesn’t.</p>

<p>One thing I’ve learned: whenever you discover something through a Claude Code session about how you want the system to work, store it somewhere - your CLAUDE.md, a /command, the work log, a commit message. If it stays only in the conversation, it’s lost when the session ends. If you capture it, Claude can use it forever.</p>

<h2 id="things-worth-exploring">Things Worth Exploring</h2>

<p>I’m not using these much yet, but they’re capabilities I’m aware of and have experimented with. I just haven’t needed to pull them into my workflow:</p>

<p><strong>MCP integrations.</strong> Model Context Protocol lets Claude interface with external systems - your Postgres database, Notion, Gmail. The tradeoff is that MCP connections often use a lot of tokens and can pollute your context window. An alternative is having Claude write code to interface with these systems directly, which saves on context - I prefer this route when needed.</p>

<p><strong>Prompt completion.</strong> Claude Code now suggests what you might want to do next. Press enter to run the suggestion or tab to edit it. I’ve been surprised by how relevant the suggestions sometimes are, but I generally prefer to describe what I want myself.</p>

<p><strong>Skills.</strong> These are packaged, portable instructions for specific workflows. Anthropic has a skills directory, and people are building and sharing their own. I haven’t built custom skills yet since /commands work well for me, but I might convert some of my more elaborate workflows to skills eventually.</p>

<p><strong>Desktop and mobile apps.</strong> You can move sessions between CLI and mobile. People are doing real work from their phones now. I think this will be a great UX for more async work.</p>

<p><strong>Permission auto-accept.</strong> You can configure Claude Code to automatically approve certain commands instead of asking each time. Useful if you’re running multiple sessions and the permission prompts slow you down.</p>

<p><strong>Hooks.</strong> You can attach instructions to specific events - for example, triggering actions after certain files are written.</p>

<p><strong>Subagents.</strong> Claude can spin up subagents with their own context windows that run in parallel and return results without polluting your main context. I’ve seen Claude do this on its own.</p>

<p><strong>Plugins.</strong> These bundle commands, skills, hooks, and MCP servers into installable packages. People have built specialized plugins for different workflows.</p>

<h2 id="resources">Resources</h2>

<p>There’s a lot of content out there about Claude Code, but these are links I think are worth your time if you want to dig in further:</p>

<ul>
  <li><a href="https://adocomplete.com/advent-of-claude-2025/">Advent of Claude 2025</a> - 30 tips from Anthropic’s developer relations</li>
  <li><a href="https://www.lennysnewsletter.com/p/everyone-should-be-using-claude-code">Lenny’s article on Claude Code for non-programmers</a> - Covers installation and use cases sourced from <a href="https://x.com/lennysan/status/1960417604948123663">this Twitter thread</a></li>
  <li><a href="https://anthropic.skilljar.com/claude-code-in-action">Anthropic’s Claude Code course</a> - 15 lectures in about an hour, covers 80% of what you need</li>
  <li><a href="https://x.com/bcherny/status/2007179832300581177">Boris Cherny’s thread</a> - The creator of Claude Code on how he uses it (more technical)</li>
  <li><a href="https://www.claude.com/product/claude-code">Claude Code documentation</a> - The official docs with getting started guides and examples</li>
</ul>

<h2 id="getting-started">Getting Started</h2>

<p>The patterns I’ve described here took months to develop, but you don’t need all of them to get value. Start with one folder, one CLAUDE.md file, and one problem you want to solve. Record yourself describing what you’re trying to do, let Claude ask you questions, and see what happens.</p>

<p>What I’ve found is that these tools reward ambition. Every time I think “Claude probably can’t handle this,” I’m wrong. The system I’ve built now would have seemed implausible when I started - but it emerged naturally from trying things, capturing what worked, and letting it compound.</p>

<p>I’m curious what’s working for you, or what problems you’re trying to solve. The patterns that work for consulting might look different for other kinds of knowledge work. But the underlying approach - build systems that compound, capture what you learn, let Claude read your history - that transfers everywhere.</p>

<p>If you’ve been waiting to try this, stop waiting. Pick a real problem and point Claude Code at it. You’ll figure out your own patterns faster than you expect.</p>]]></content><author><name>Matt Stockton</name><email>mattstockton@gmail.com</email></author><category term="Getting Started with AI" /><category term="claude-code" /><category term="workflow" /><category term="knowledge-management" /><category term="productivity" /><category term="ai-tools" /><summary type="html"><![CDATA[I've been using Claude Code to run my consulting business since mid-2025. This post covers what's working - from voice-to-transcript workflows to building systems where information compounds over time.]]></summary></entry></feed>