Karpathy’s Practical LLM Insights
I distilled two hours of practical wisdom from Andrej Karpathy on using large language models effectively — real tools, clear explanations, and valuable tips you can apply immediately.
Success with LLMs isn’t about picking the right API; it’s about structuring problems in a way the model can handle. In this podcast recap, I break down John Berryman’s insights on better prompts, structured workflows, and human-AI collaboration that lead to real results.
If you’re technical and trying to get your head around AI, this guide cuts through the noise. I’ve put together a mix of foundational concepts, hands-on resources, and my own takeaways to help you get started quickly. The best way to learn AI is by diving in — this post gives you a roadmap.
OpenAI’s Deep Research generates structured reports with citations in minutes, making research faster and more efficient. I tested it for portfolio analysis, using meta-prompting to refine my queries. Here’s how it streamlined my decision-making and how you can use it too.
Most AI projects fail because they start with the tool instead of the problem. Report generation flips that — integrating AI into real workflows where domain experts already know what good looks like. This post breaks down why it’s the best first step for AI adoption.
Many AI teams skip the most critical step — actually reviewing their model’s outputs. If you don’t analyze errors and define correctness, you risk building something that looks right but isn’t. This post breaks down why structured evaluation is the difference between AI experiments and real-world impact.
I use AI tools daily to streamline thinking, organize projects, and refine ideas. This post breaks down my workflow — voice input, meta-prompting, structured summaries — and how AI helps turn rough thoughts into something useful. If you’re looking to make AI more practical, here’s how I do it.
I built a personal finance tool in a day using AI to automate portfolio analysis — extracting data, checking allocations, and identifying better fund options. This post (and video) walks through how I used AI tools to speed up development and even add features with voice commands.
DeepSeek R1 is a transparent reasoning model that exposes its thought process through thinking tokens. I tested it locally and recorded a demo showing how it works. This post breaks down why its approach to chain-of-thought reasoning is so interesting and how it might change how we use AI.
I recorded a video showing how I use LLMs to summarize podcast transcripts into concise, structured insights. This post covers the workflow, key takeaways, and how tools like the LLMOps database can help builders working with AI. If you’re summarizing long-form content, here’s how to do it efficiently.
Repo Prompt automates injecting local files into model contexts, making it useful for coding and structured workflows. In this video, I demo how it helps with prompting for o1 pro, using meta-prompting and structured inputs to improve responses. If you’re working with o1 pro, this tool is worth exploring.
I used OpenAI’s o1 pro to build a self-guided learning app — generating a PRD, writing code, and creating a working prototype in no time. This post walks through the process, showing how AI can streamline development and make building more accessible than ever.
I use LLMs to summarize podcasts into structured takeaways, making it easier to revisit key insights. This post walks through my workflow — meta-prompting, refining outputs, and generating summaries. If you want to extract useful details from long-form content, here’s how I do it.
I used o1 pro to challenge widely accepted maxims and generate new ones with fresh perspectives. This experiment highlights how AI can help unpack assumptions, rethink conventional wisdom, and surface insights that might otherwise go unnoticed. Here’s what I found.
I tested ChatGPT’s memory with prompts designed to uncover hidden narratives, sparking questions about how it generalizes insights. This post explores the experiment, potential uses for memory embeddings, and how o1 pro helped brainstorm new ways to analyze patterns in stored interactions.
I took OpenAI’s Reasoning with o1 course, a short but detailed look at how o1 handles chain-of-thought reasoning, tool calling, and complex workflows. This post breaks down key takeaways, from coding improvements to policy refinement, and why this model stands out for structured problem-solving.
I experimented with OpenAI’s o1 model to generate portfolio strategies inspired by Verdad’s Persistent Alpha concept. Using a structured prompt, o1 produced eight systematic strategies worth exploring. This experiment showed me how powerful o1 can be for tackling complex investment ideas.
Andrej Karpathy used o1 pro to create a full book from scratch, shaping the content himself through structured prompts. This process isn’t just about getting answers — it’s a powerful way to clarify thinking, refine ideas, and explore topics in ways that weren’t possible before.
Ethan Mollick’s post cuts through the noise on AI and prompt engineering: just start using it. Treat AI like a forgetful but skilled coworker, be clear in your requests, and iterate. The best way to learn is through hands-on practice — don’t overthink it, just dive in.
Steven Johnson turned his 400-page book into an interactive detective game using a 400-word prompt and an LLM. His essay highlights how long-context models are transforming how we analyze, retrieve, and interact with information — moving beyond search to reasoning and creative problem-solving.
I tested Gemini Flash 2.0’s screen-sharing feature by analyzing a portfolio report in real time. It felt like having someone guide me through the data, reducing friction and making learning more natural. The insights weren’t perfect, but this kind of interaction is a big step forward.
Gemini 2.0 Flash introduces real-time screen and video interaction, making AI feel more like a true assistant. A radiologist’s demo shows its potential for detailed analysis, but the real takeaway is how these tools are evolving to fit into real workflows.
Large language models predict the next word based on context, trained on massive text datasets. 3Blue1Brown’s video explains this in under 10 minutes with great visuals. Understanding these basics helps you use LLMs more effectively — watching the video is a great place to start.
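To make “predict the next word” concrete, here is a toy sketch using raw word frequencies (a bigram counter). This is an illustration of the objective, not how LLMs actually work: real models learn these probabilities with neural networks over subword tokens.

```python
from collections import Counter, defaultdict

# Tiny training "corpus" for the demo.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each context word (a bigram model).
next_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_counts[prev][nxt] += 1

def predict_next(word):
    """Return the word most frequently seen after `word` in the corpus."""
    return next_counts[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" more often than any other word
```

An LLM does the same kind of thing with a context of thousands of tokens instead of one word, which is why more context usually means better predictions.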
I used an LLM to refine financial strategies, iterating through simple prompts to generate clear, actionable insights. This experiment reinforced how easily these tools can help process complex ideas — no overthinking required, just a conversational approach to learning.
I used AI to analyze my portfolio for tax efficiency, starting with a casual voice prompt and refining my strategy with Claude. In 30 minutes, I identified simple, actionable changes that could save money long-term — an easy, effective way to make better financial decisions.
Stanley Druckenmiller’s British Pound trade is a masterclass in high-conviction decision-making. He saw a clear mismatch, acted decisively, and concentrated his bet. The lesson applies beyond investing — when the odds are in your favor, focus and bold action can drive the biggest outcomes.
A recent podcast with Max Buckley highlighted key ways to improve RAG systems — adding context to chunks, using multiple embeddings, and implementing pre-submission checks for better input quality. A holistic approach across the pipeline ensures more precise and reliable AI-generated responses.
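One of those ideas, adding context to chunks, can be sketched in a few lines. The function name and prefix format here are illustrative, not taken from the episode: the point is that a retrieved chunk should carry enough source information to stand alone.

```python
# Sketch of "contextualized chunking" for RAG: prepend document-level
# context to each chunk before embedding/indexing, so a chunk retrieved
# in isolation still tells the model where it came from.
def contextualize_chunks(doc_title, chunks):
    """Prefix each raw chunk with its source document's title."""
    return [f"[Source: {doc_title}] {chunk}" for chunk in chunks]

chunks = contextualize_chunks(
    "2023 Annual Report - Acme Corp",
    ["Revenue grew 12% year over year.", "Headcount remained flat."],
)
print(chunks[0])
```

Production systems often go further (section headings, LLM-generated chunk summaries), but the principle is the same: enrich the chunk before you index it.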
Gwern Branwen’s take on writing, curiosity, and burnout left me thinking about how we shape AI — and how it shapes us. From following rabbit holes to recognizing when it’s time to switch gears, this episode was a reminder to stay curious and intentional about what I create.
Latent Space’s conversation with Erik Schluntz from Anthropic covered practical lessons on building LLM agents — from structuring prompts with XML to designing tools like a UX problem. The key takeaway: effective agents aren’t just about capabilities, but about guiding LLMs to reason and act well.
I used Claude AI to refine a list of life skills that would be valuable to learn at 21. Voice input made the process faster, and experimenting with writing styles helped shape clearer, more actionable insights. The result was a solid list — simple, direct, and worth revisiting.
I used Claude to break down complex ideas from a podcast, refining explanations and generating useful examples. The process took 10 minutes and turned passive listening into active learning. AI makes it easy to iterate — just ask, refine, and keep going until it clicks.
I used Gemini to turn rough, disorganized notes from a podcast into a structured, useful summary. By guiding the model with key points I cared about, I got a refined list of prompting techniques with minimal effort — another example of how LLMs can adapt to different learning styles.
Anthropic’s classification demo showed me a different way to approach text classification — structured XML prompts, retrieval-augmented examples, and clear evaluation with PromptFoo. It’s practical, well-documented, and has me rethinking when to reach for an LLM over traditional ML tools.
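A rough sketch of that XML-tagged prompt style (the tag names here are illustrative, not copied from Anthropic’s demo): wrapping labels, examples, and the input in explicit tags keeps the model from confusing instructions with data.

```python
# Build a classification prompt with XML-style structure: labels,
# few-shot examples, and the input each live in their own tags.
def build_classification_prompt(text, labels, examples):
    example_block = "\n".join(
        f"<example><text>{t}</text><label>{l}</label></example>"
        for t, l in examples
    )
    return (
        f"<labels>{', '.join(labels)}</labels>\n"
        f"<examples>\n{example_block}\n</examples>\n"
        f"<input>{text}</input>\n"
        "Respond with exactly one label."
    )

prompt = build_classification_prompt(
    "My package never arrived",
    ["billing", "shipping", "other"],
    [("I was charged twice", "billing"), ("Where is my order?", "shipping")],
)
print(prompt)
```

The retrieval-augmented twist in the demo is to fetch the few-shot examples dynamically, picking ones most similar to the input instead of hard-coding them.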
Anthropic’s Cookbook is one of the best hands-on resources for learning LLM techniques. The examples — like structuring citations, prompt formatting, and integrating external tools — are practical and widely applicable. If you’re serious about improving your LLM skills, it’s worth your time.
AI is evolving fast, and these are the ideas shaping how I think about it — from self-guided learning to the importance of UX in adoption. There’s still so much to build, and the best way to form a perspective is to dive in and experiment. What are your takes?
Hamel Husain’s post on LLM evaluations is the most practical guide I’ve seen. The key takeaway? Work with a domain expert — they define success, refine inputs, and keep your system grounded in reality. Everything else, from synthetic inputs to structured feedback, is built on that foundation.
I used an LLM to distill insights from a Twitter thread on AI workflow hacks — turning a noisy flood of replies into a clear, structured top 10 list. The process itself became a meta-experiment in using AI to cut through information overload efficiently.
Anthropic’s updated prompt workbench makes testing, refining, and deploying prompts much smoother. It blends automation with flexibility — letting you generate test cases, tweak prompts, and evaluate results all in one place. A well-designed tool for serious prompt engineering.
Google’s prompt tuning playbook is packed with insights on writing better prompts. Key takeaways: guide the model to the right ‘universe’ of information, favor clear zero-shot instructions, and remember that LLMs excel where answers are hard to make but easy to check.
Jason Liu’s TwiML podcast covered high-signal lessons on RAG pipelines — why structured evaluations beat stochastic testing, how chain-of-thought prompting outperforms multi-step prompts, and why structured reports often deliver more value than chatbots. A must-listen for LLM builders.
Clear communication is critical for AI teams. Simple, precise language helps bridge gaps between technical and non-technical members, improving collaboration. Learning to write clearly — whether for teammates or LLMs — is just as valuable as learning new AI techniques.
Matt Cynamon’s discussion on building AI tools at USV reinforced a key idea — just start. Experimentation, small specialized tools, and human-AI collaboration are shaping how we build. The best insights come from diving in, iterating, and learning along the way.
I built a Halloween Candy Calculator using an LLM in minutes, not hours — proving that AI enables rapid prototyping, playful creativity, and even a little family-friendly learning. The best way to understand LLMs? Just start building, even if it’s something as silly as tracking candy raids.
If you’re serious about building with LLMs, read ‘What We’ve Learned From A Year of Building with LLMs.’ It’s packed with lessons on prompting, retrieval, evaluations, and real-world deployment. My notes summarize key takeaways, but the full paper is worth your time.
If you’re using LLMs in production, you need systematic evaluations. Anthropic’s notebooks and tools like PromptFoo make it easy to test prompts, compare models, and iterate quickly. LLMs are powerful, but without structured testing, you won’t know if they’re working as expected.
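The pattern these tools wrap can be sketched in a few lines. This is a hypothetical harness, not PromptFoo’s actual API: `fake_model` stands in for a real LLM call, and the assertion style is illustrative.

```python
# Bare-bones systematic evaluation: run each test case through the model
# and record pass/fail, instead of eyeballing outputs one at a time.
def fake_model(prompt):
    # Hypothetical stand-in for an LLM call; returns canned answers.
    return "Paris" if "capital of France" in prompt else "I don't know"

test_cases = [
    {"prompt": "What is the capital of France?", "must_contain": "Paris"},
    {"prompt": "What is the capital of Mars?", "must_contain": "don't know"},
]

def run_evals(model, cases):
    """Return a pass/fail result for each test case."""
    return [case["must_contain"] in model(case["prompt"]) for case in cases]

results = run_evals(fake_model, test_cases)
print(f"{sum(results)}/{len(results)} passed")
```

Real eval suites layer on model-graded checks, regression tracking, and side-by-side model comparisons, but every one of them bottoms out in this loop: fixed inputs, explicit expectations, automated checks.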
A rapid-fire list of AI insights, tools, and concepts that have caught my attention lately — from practical prompting tips to the latest breakthroughs in multimodal models. If you’re looking for a nudge to dive deeper into this space, there’s something here for you.
LLMs have drastically lowered the cost of prototyping custom software. Using Gumloop, I built an AI-powered podcast summarizer in under an hour — automating transcription, summarization, and PDF generation. It’s never been easier to experiment and build useful tools.
NotebookLM lets you turn documents into custom podcasts in minutes. I tested it by generating an audio overview of nuclear energy — and it worked great. Learning on your own terms has never been easier, and tools like this are making it even more accessible.
RAG sounds complex, but it’s really just using a custom search engine to feed better context to an LLM. A great place to start is Santiago’s step-by-step tutorial — under 100 lines of code. If you’re curious, just dive in. The hardest part is getting started.
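As a rough illustration of that “custom search engine feeding an LLM” framing (not Santiago’s actual tutorial code), here is a minimal retrieve-then-prompt sketch; the word-overlap retriever stands in for real embedding search.

```python
import re

# Minimal RAG loop: find the most relevant document, then stuff it into
# the prompt as context. The prompt would go to an actual LLM call.
docs = [
    "The Eiffel Tower is in Paris and is 330 meters tall.",
    "Python is a programming language created by Guido van Rossum.",
    "The Great Wall of China is over 13,000 miles long.",
]

def words(text):
    """Lowercase word set with punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents):
    """Return the document sharing the most words with the query."""
    return max(documents, key=lambda d: len(words(query) & words(d)))

def build_prompt(query, documents):
    context = retrieve(query, documents)
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("How tall is the Eiffel Tower?", docs)
print(prompt)
```

Swap the word-overlap scoring for embedding similarity over a chunked corpus and you have the skeleton of a production RAG pipeline.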
An 8-year-old built a working chatbot app in 45 minutes. The barriers to building with AI are gone — if you can write clearly, you can create software. The only real challenge left is getting started. Watch the video, try it yourself, and see what’s possible.
I used LLMs to simulate a conversation between Warren Buffett and Joel Greenblatt analyzing a company’s annual report. The results were fascinating — and just a glimpse of what’s possible with creative prompting. Full inputs and outputs included.
Verdad used LLMs to analyze thousands of Japanese company reports, categorizing their valuation plans and linking them to stock performance. This kind of structured analysis — translation, summarization, and categorization — is exactly where LLMs shine. We’re just getting started.
Alex Bartling’s team is using LLMs not just for AI-driven note-taking, but to refine their product based on user feedback — turning unstructured input into structured UI improvements. This is the future of building: constantly iterating with AI as a collaborator.
LLMs can now process video, unlocking entirely new categories of automation. I tested this by turning an iPhone walkthrough of an EV charger install into a detailed project quote in under 10 minutes. The potential for real-world applications is massive.
I used Gemini’s long context window to analyze company annual reports — processing hundreds of pages in minutes to rank small-cap stocks. This kind of deep document analysis was nearly impossible before, but LLMs are changing what’s feasible at scale.
I tested Flux, an AI image model, and documented my workflow for experimenting with new AI tools. The takeaway? The best way to learn AI is to start playing with it. Small experiments compound, and those who explore now will have an edge later.
I modified a demo that narrates your webcam in real time — using my own cloned voice and Mister T. It’s a fun example of how little effort it takes to build with AI now. The real question: what non-toy applications can we create with the same underlying tech?
Custom GPTs are a fast, low-effort way to embed tailored LLM functionality for your team. I built one for leadership decision-making in minutes — adding custom prompts, conversation starters, and book references. It’s a great middle-ground between generic AI chat and full custom software.
LLMs can now analyze images and generate detailed narratives with minimal input. I tested this by having Claude Sonnet 3.5 interpret everyday scenes and historical documents — producing rich, context-aware stories. The potential for real-world applications is enormous.
I used Claude to extract and visualize my blood test history from unstructured PDFs — transforming raw medical data into an interactive tool in minutes. This experiment highlights how LLMs can bridge the gap between personal data and useful insights.
I built a playable word game in 5 minutes using Claude’s Artifacts feature — no coding required. LLMs are making software development more accessible, reducing the time to prototype, and enabling new types of small, custom applications that wouldn’t have been built before.
AI tools like Udio can generate full songs with lyrics and music in minutes. I tested it by making a country song about my book character, Warren the Worm. The results were surprisingly good — and a reminder that AI creativity goes far beyond text generation.
I used Claude 3.5 to build an interactive dashboard that maps stock returns to economic indicators like GDP and inflation. Claude handled API calls, data processing, and visualization, letting me iterate quickly. In under an hour, I had a working prototype with real insights.
I built a working image classifier in 20 minutes using Claude 3.5 and Artifacts — handling everything from Flask API setup to model integration and Docker deployment. This experiment showed just how much LLMs can accelerate development and simplify complex workflows.
I previously wrote about the traits of a good leader in a startup. I spent some time reflecting on the key traits I value when hiring new teammates. This isn’t a comprehensive list, just some of the key traits that I care about.
I like to read books. I’m always looking for ways to discover new titles. I mainly use Goodreads and Twitter as a discovery mechanism, and I use Goodreads to track my reading queue. I love when I find lists of books that others are reading, particularly when I discover a book I hadn’t come across yet. It’s been a while, but I shared my top books...
I want to build my Prompt Engineering skills. As I highlighted in a previous post, I’m convinced that LLMs will be a huge part of the future of work. Understanding how to use them most effectively will be an incredibly valuable skill. There are many published resources on Prompt Engineering out there. It’s difficult to filter the signal from the...
I finished reading Shane Parrish’s new book Clear Thinking. It has quite a few nuggets of wisdom that I’d like to reference in the future. I’m trying to get better at applying information I learn in books. Perhaps taking reflective notes is a good way to do this, so I am trying it out. Here’s to hoping I can apply a few of the following ideas in...
What is this? I used ChatGPT to generate a children’s book about a worm learning a valuable life lesson. This was just a fun way to learn more about the technology. If you want to learn how I made it, check out this post.
I love Ethan Mollick’s ‘Jagged Frontier’ concept with LLMs. We don’t exactly know what LLMs are great at, and what they are bad at — yet. To figure this out, we should experiment with LLMs as much as we can — and ‘bring’ them to tasks we do on a regular basis. With that as inspiration, I wanted to see if I could author a children’s book using Ch...
I recently finished reading Co-Intelligence by Ethan Mollick. I was very eager to read this book, as Ethan continually posts fantastic LLM content on Twitter. The content is a really good mix — links to technical papers, links to new tools, insights he’s had through his work, etc. — overall just a fantastic way to stay informed in this incredibl...
I’ve spent most of my career in small-ish VC-backed startup companies, with between 10 and 150 people. Things tend to move and change fast in small companies. Your success is far from guaranteed, and the ‘rules of the game’ can change frequently – A year can happen in a week. It’s important to have leaders who can steer the ship through the fog,...
I’ve recently built a desire to write more, and wanted to dig into why I am feeling this way. What are my reasons and goals? A big part of this desire is rooted in discovering content from people like David Perell, Dan Shipper, Paul Millerd, and Nat Eliason. Following their journeys — and the journeys of people they talk to, write about, and int...
There is an endless amount of stuff in the world to learn, and not enough time to learn it. The demands of modern work nudge us to build skills in a very specific domain - typically the domain someone else is giving us money to be proficient in. Success at work is important, but it’s not sufficient to be happy and successful in life. There are m...
Explaining how things work to different audiences is a topic that interests me. Can you put yourself in the mind-space of the person you’re speaking with — and explain a concept — so that they truly receive what you’re saying? I’m also curious about the view that true value in LLMs is through getting people to use them in the problems they encou...
Last month, during a visit to the Apple Store, I experienced an unexpected nudge towards being present — a concept I encounter frequently in books I’ve read. The idea of living the present moment conceptually resonates with me. I make attempts to integrate it into my daily life — but applying consistently often proves challenging. It’s easy to d...
I want to be a great dad. It’s not easy. Being a dad is the biggest privilege and opportunity in my life. As a parent, you have huge responsibility in the actions you take — and do not take — to help your kids thrive. It’s easy for me to feel like I am failing. Being a dad is the hardest thing I’ve ever done. It’s filled with a ton of joy, awe,...
Everything competes for your attention. The most recent thing seems like the most important. Without a framework for what you pay attention to, what happens? Things happen to you. Flip it. Build a framework. Make you happen to things.
At the beginning of 2013, I made a resolution to learn more things outside of Technology. In the past, I’ve been a semi-active reader, finishing about 15 books a year. Most of these books were non-Fiction, Business or Psychology-oriented, mainstream (Gladwell, Friedman, etc.). I wanted to branch out. I opened a Goodreads account and started addi...
Why default-resiliency is not the best option
‘What should we tell them?’ How about the truth.
Thoughts on ‘Product strategy means saying NO’
Thoughts on being malleable instead of magnetic
Thoughts on doing ‘things about the thing’ instead of the thing itself
… can be a dangerous attitude to have when trying to solve a problem.
Yesterday, I called Fidelity to get help with my account. Before I was connected to a human, I was asked to enter my username, and then my password using the phone keypad. I did a double-take when I started entering my password. Clearly, my password isn’t all numbers — Are they really just storing my password encrypted rather than storing it has...
I often cringe when I hear people say they are ‘getting out of the building’ to test their product idea. At the core, ‘getting out of the building’ is a proxy for finding or validating tangible problems that people are trying to solve. It’s about finding and validating real pain points that people are trying to alleviate. It is not, however, abo...
When you work at a big company, your role is specialized. On a day-to-day basis, you don’t have to venture far from your ‘comfort zone’ of core skills to accomplish your tasks (I know there are exceptions — I’m making a generalization). When you work at a small company, your role can still be specialized, but you have to cover a much broader are...
There is not a linear relationship between the complexity of a product’s features and the product’s value to an end user. The graph of complexity versus value often looks like this:
I am a new student of the Business Model Generation. I’m working to understand and apply the tools and techniques outlined by key influencers such as Steve Blank and Alexander Osterwalder. The Business Model Generation book outlines the difference between a ‘Design Attitude’ and a ‘Decision Attitude’, and how crucial the difference is to the suc...
I’ve always appreciated a good non-fiction book. My preferred reading ‘categories’ are behavioral economics (e.g. Ariely, Thaler, Dubner, Gladwell), analyses of critical thinking, particularly in medicine (Gawande, Groopman), and timeless investment strategy (Lynch, Graham, Buffett, Fisher). I read somewhat passively (maybe 1-3 books per month),...
People have told me that it isn’t easy to keep up a blog. Now I know what they mean! Somehow, I’ve gone an entire month without posting anything — time seems to have flown by! I will try to be better at this, and post here at least 1 time a week (2 may have been a bit of an ambitious start).
Ok, I admit it…this may qualify as the nerdiest / lamest name for a blog post…ever…in the history of the blogosphere…but hear me out on this one, because I’m going to make a point that might help you focus your energies in work and life to achieve more with the same amount of effort.
Over the last 5 years since I began working full-time, I have developed a strong interest in investing in the stock market. I started investing in mutual funds, and gradually started investing in individual stocks. I even helped to form a (now defunct) Investment Club in order to learn more about investing in stocks with other young professional...
The bloggers over at Newly Corporate are asking “What 1 or 2 CORE traits get you noticed at work or help you succeed in your day-to-day operations” Here is the blog post
Every day, a software project dies. Some die a slow, painful, expensive death. Others die a quick, not painless, and relatively embarrassing death. As Software Engineers, we never want our own projects to die. As individual contributors, the livelihood of our projects is often outside the realm of our control. At the same time, there ar