Why GenAI Pilots Actually Fail
The '95% failure' headline is misleading, but the organizational barriers behind pilot failures are real and worth understanding.
Building Production AI Evaluations: A Systematic, Domain-Driven Approach
Teresa Torres went from zero AI experience to shipping production evaluations by combining domain expertise with systematic thinking - her journey shows how to build reliable AI systems without extensive...
GPT-5 Prompting is Different - Here's What Works
GPT-5 requires different prompting techniques than GPT-4o and o3, but once you understand what it expects, the results are worth the adjustment.
Six Insights from Andrew Ng on Building Faster with AI
Key takeaways from Andrew Ng's Y Combinator talk on adapting to the rapidly evolving AI development landscape.
Why Successful LLM Products Start with Spreadsheets
Philip Carter's process on Vanishing Gradients shows the value-added work most people skip - actually looking at your data and fixing what's broken.
AI Coding Tools Amplify What You Already Know
Alex MacCaw nails why AI tools are rocket fuel for experienced engineers - plus the six prompting principles that work beyond coding.
Why AI Adoption is Really About Organizational Learning
After updating AI workshop materials that became obsolete in 3 months, I realized the real challenge isn't keeping up with tools - it's building organizations that can learn and adapt...
AI Coding Tools: Why I Use Claude Code
After trying RepoPrompt, Cline, and others, I found the AI coding tool that finally clicked. Here's what I learned about choosing tools, avoiding the shiny object trap, and why the...
This Was Written With ChatGPT
Some people dismiss writing if it was touched by AI. But that reaction often misses the point. This post explores why the tool itself isn't the problem—and why thoughtful use...
Transactions as Language
Stripe's approach to fraud detection made me rethink how we use transformers. What looks like a row in a table might actually be a sequence. And once you model it...
Building Better from Day One: Notes on Click by Jake Knapp and John Zeratsky
Click offers a simple, practical framework for starting projects the right way. In this post, I share my notes on why the Foundation Sprint works, how it helps teams focus...
Incentives and the AI Divide
The only way to get good at using AI is by doing the work. But unless teams are set up with the right expectations and incentives, that learning won't happen....
The Courage to Hit Publish
Everyone has a strange, specific mix of experiences that's completely their own. Writing is how I've learned to make sense of mine—and how you might start uncovering yours.
Better Communication Makes You Better at AI
Lessons on communication from Wes Kao that apply just as much to working with people as they do to prompting LLMs. Clear, tactical, and instantly useful — these are the...
What Actually Matters When You're Building AI Products
A few takeaways from Hamel Husain's guide to improving AI products — from avoiding the tools-first trap to empowering domain experts and focusing on fast, iterative learning. One of the...
Karpathy's Practical LLM Insights
I distilled two hours of practical wisdom from Andrej Karpathy on using large language models effectively — real tools, clear explanations, and valuable tips you can apply immediately.
Practical Takeaways on Building with LLMs from John Berryman
Success with LLMs isn't about picking the right API; it's about structuring problems in a way that works. In this podcast recap, I break down John Berryman's insights on better...
Getting Started with AI & LLMs (For Technical Folks)
If you're technical and trying to get your head around AI, this guide cuts through the noise. I've put together a mix of foundational concepts, hands-on resources, and my own...
Using OpenAI's Deep Research
OpenAI's Deep Research generates structured reports with citations in minutes, making research faster and more efficient. I tested it for portfolio analysis, using meta-prompting to refine my queries. Here's how...
Report Generation - A Smart Way To Start With AI
Most AI projects fail because they start with the tool instead of the problem. Report generation flips that — integrating AI into real workflows where domain experts already know what...
The Most Overlooked Step in Building Reliable AI Systems
Many AI teams skip the most critical step — actually reviewing their model's outputs. If you don't analyze errors and define correctness, you risk building something that looks right but...
A Brain Dump on How I Use AI Tools
I use AI tools daily to streamline thinking, organize projects, and refine ideas. This post breaks down my workflow (voice input, meta-prompting, structured summaries) and how AI helps...
Building a Personal Finance Tool in a Day with AI
I built a personal finance tool in a day using AI to automate portfolio analysis — extracting data, checking allocations, and identifying better fund options. This post (and video) walks...
DeepSeek R1: A Reasoning Model with Thinking Tokens
DeepSeek R1 is a transparent reasoning model that exposes its thought process through thinking tokens. I tested it locally and recorded a demo showing how it works. This post breaks...
Summarizing with LLMs: Video Walkthrough
I recorded a video showing how I use LLMs to summarize podcast transcripts into concise, structured insights. This post covers the workflow, key takeaways, and how tools like the LLMOps...
Repo Prompt and prompting for o1 pro
Repo Prompt automates injecting local files into model contexts, making it useful for coding and structured workflows. In this video, I demo how it helps with prompting for o1 pro,...
Building a Self-Guided Learning App with O1 Pro
I used OpenAI's o1 pro to build a self-guided learning app — generating a PRD, writing code, and creating a working prototype in no time. This post walks through the...
Summarizing Podcasts with LLMs (Using o1 Pro)
I use LLMs to summarize podcasts into structured takeaways, making it easier to revisit key insights. This post walks through my workflow — meta-prompting, refining outputs, and generating summaries. If...
Common Maxims - LLM Remix
I used o1 pro to challenge widely accepted maxims and generate new ones with fresh perspectives. This experiment highlights how AI can help unpack assumptions, rethink conventional wisdom, and surface...
A Thought Experiment with ChatGPT's Memory
I tested ChatGPT's memory with prompts designed to uncover hidden narratives, sparking questions about how it generalizes insights. This post explores the experiment, potential uses for memory embeddings, and how...
Reasoning with o1 - Course Breakdown
I took OpenAI's *Reasoning with o1* course, a short but detailed look at how o1 handles chain-of-thought reasoning, tool calling, and complex workflows. This post breaks down key takeaways, from...
Trying Out O1 for Portfolio Construction
I experimented with OpenAI's O1 model to generate portfolio strategies inspired by Verdad's *Persistent Alpha* concept. Using a structured prompt, O1 produced eight systematic strategies worth exploring. This experiment showed...
Building a Book That Didn't Exist
Andrej Karpathy used O1 Pro to create a full book from scratch, shaping the content himself through structured prompts. This process isn't just about getting answers — it's a powerful...
Using AI Without Overthinking It – From Ethan Mollick
Ethan Mollick's post cuts through the noise on AI and prompt engineering: just start using it. Treat AI like a forgetful but skilled coworker, be clear in your requests, and...
Building a Game From a 400-Page Book in 400 Words
Steven Johnson turned his 400-page book into an interactive detective game using a 400-word prompt and an LLM. His essay highlights how long-context models are transforming how we analyze, retrieve,...
Trying Gemini 2.0 Flash's Screen Interaction
I tested Gemini 2.0 Flash's screen-sharing feature by analyzing a portfolio report in real time. It felt like having someone guide me through the data, reducing friction and making learning...
Gemini 2.0 Flash: A Shift in How We Interact with AI
Gemini 2.0 Flash introduces real-time screen and video interaction, making AI feel more like a true assistant. A radiologist's demo shows its potential for detailed analysis, but the real takeaway...
How Large Language Models Work at a High Level
Large language models predict the next word based on context, trained on massive text datasets. 3Blue1Brown's video explains this in under 10 minutes with great visuals. Understanding these basics helps...
Distilling Financial Advice With LLMs
I used an LLM to refine financial strategies, iterating through simple prompts to generate clear, actionable insights. This experiment reinforced how easily these tools can help process complex ideas —...
Using AI to Make My Portfolio More Tax-Efficient
I used AI to analyze my portfolio for tax efficiency, starting with a casual voice prompt and refining my strategy with Claude. In 30 minutes, I identified simple, actionable changes...
High-Conviction Bets
Stanley Druckenmiller's British Pound trade is a masterclass in high-conviction decision-making. He saw a clear mismatch, acted decisively, and concentrated his bet. The lesson applies beyond investing — when the...
Improving Retrieval Augmented Generation Systems
A recent podcast with Max Buckley highlighted key ways to improve RAG systems — adding context to chunks, using multiple embeddings, and implementing pre-submission checks for better input quality. A...
Shaping Tools and Ourselves: Reflections from Gwern Branwen
Gwern Branwen's take on writing, curiosity, and burnout left me thinking about how we shape AI — and how it shapes us. From following rabbit holes to recognizing when it's...
Building LLM Agents: Insights from Anthropic on Latent Space
Latent Space's conversation with Erik Schluntz from Anthropic covered practical lessons on building LLM agents — from structuring prompts with XML to designing tools like a UX problem. The key...
Cheat Codes for Life: Experimenting with LLMs
I used Claude AI to refine a list of life skills that would be valuable to learn at 21. Voice input made the process faster, and experimenting with writing styles...
How I Use LLMs for Self-Guided Learning
I used Claude to break down complex ideas from a podcast, refining explanations and generating useful examples. The process took 10 minutes and turned passive listening into active learning. AI...
Learning with LLMs: Using Quick Notes to Guide Better Summaries
I used Gemini to turn rough, disorganized notes from a podcast into a structured, useful summary. By guiding the model with key points I cared about, I got a refined...
Anthropic Demo for LLM Text Classification
Anthropic's classification demo showed me a different way to approach text classification — structured XML prompts, retrieval-augmented examples, and clear evaluation with PromptFoo. It's practical, well-documented, and has me rethinking...
Why You Should Spend a Day with Anthropic's Cookbook
Anthropic's Cookbook is one of the best hands-on resources for learning LLM techniques. The examples — like structuring citations, prompt formatting, and integrating external tools — are practical and widely...
My Takes On AI
AI is evolving fast, and these are the ideas shaping how I think about it — from self-guided learning to the importance of UX in adoption. There's still so much...
Evaluating LLM Applications the Right Way
Hamel Husain's post on LLM evaluations is the most practical guide I've seen. The key takeaway? Work with a domain expert — they define success, refine inputs, and keep your...
How I Used LLMs to Summarize Workflow Hacks from a Twitter Thread
I used an LLM to distill insights from a Twitter thread on AI workflow hacks — turning a noisy flood of replies into a clear, structured top 10 list. The...
Thoughts on Anthropic's Updated Prompt Engineering Workbench
Anthropic's updated prompt workbench makes testing, refining, and deploying prompts much smoother. It blends automation with flexibility — letting you generate test cases, tweak prompts, and evaluate results all in...
Notes From Google's Prompt Tuning Playbook
Google's prompt tuning playbook is packed with insights on writing better prompts. Key takeaways: guide the model to the right 'universe' of information, favor clear zero-shot instructions, and remember that...
Key Takeaways from Jason Liu's Podcast on RAG Pipelines
Jason Liu's TwiML podcast covered high-signal lessons on RAG pipelines — why structured evaluations beat stochastic testing, how chain-of-thought prompting outperforms multi-step prompts, and why structured reports often deliver more...
Bringing Clear Communication to AI-Driven Teams
Clear communication is critical for AI teams. Simple, precise language helps bridge gaps between technical and non-technical members, improving collaboration. Learning to write clearly — whether for teammates or LLMs —...
Building with AI: Insights from Matt Cynamon and USV
Matt Cynamon's discussion on building AI tools at USV reinforced a key idea — just start. Experimentation, small specialized tools, and human-AI collaboration are shaping how we build. The best...
LLMs - Halloween Candy Edition
I built a Halloween Candy Calculator using an LLM in minutes, not hours — proving that AI enables rapid prototyping, playful creativity, and even a little family-friendly learning. The best...
A Year Of Building With LLMs - The Paper You Need To Read
If you're serious about building with LLMs, read 'What We've Learned From A Year of Building with LLMs.' It's packed with lessons on prompting, retrieval, evaluations, and real-world deployment. My...
Testing Your Prompts - Writing LLM Evals
If you're using LLMs in production, you need systematic evaluations. Anthropic's notebooks and tools like PromptFoo make it easy to test prompts, compare models, and iterate quickly. LLMs are powerful,...
A Firehose of LLM / AI Tidbits
A rapid-fire list of AI insights, tools, and concepts that have caught my attention lately — from practical prompting tips to the latest breakthroughs in multimodal models. If you're looking...
Building Small-Scale Custom Software with LLMs and Gumloop
LLMs have drastically lowered the cost of prototyping custom software. Using Gumloop, I built an AI-powered podcast summarizer in under an hour — automating transcription, summarization, and PDF generation. It's...
Custom Podcasts with NotebookLM
NotebookLM lets you turn documents into custom podcasts in minutes. I tested it by generating an audio overview of nuclear energy — and it worked great. Learning on your own...
You Can Build a RAG System Too
RAG sounds complex, but it's really just using a custom search engine to feed better context to an LLM. A great place to start is Santiago's step-by-step tutorial — under...
Hello - You Can Build Now
An 8-year-old built a working chatbot app in 45 minutes. The barriers to building with AI are gone — if you can write clearly, you can create software. The only...
Analyzing Companies with LLMs - A Simulated Investor Conversation
I used LLMs to simulate a conversation between Warren Buffett and Joel Greenblatt analyzing a company's annual report. The results were fascinating — and just a glimpse of what's possible...
Investing With Language Models
Verdad used LLMs to analyze thousands of Japanese company reports, categorizing their valuation plans and linking them to stock performance. This kind of structured analysis — translation, summarization, and categorization...
Inviting AI to the Table - Building Software Edition
Alex Bartling's team is using LLMs not just for AI-driven note-taking, but to refine their product based on user feedback — turning unstructured input into structured UI improvements. This is...
Videos Within Multi-Modal Models - A Whole New World of Opportunity
LLMs can now process video, unlocking entirely new categories of automation. I tested this by turning an iPhone walkthrough of an EV charger install into a detailed project quote in...
Using Google Gemini's Long Context Window to Analyze Small Cap Stock Annual Reports
I used Gemini's long context window to analyze company annual reports — processing hundreds of pages in minutes to rank small-cap stocks. This kind of deep document analysis was nearly...
Learning About AI Is Easier Than You Think. Just Get Started
I tested Flux, an AI image model, and documented my workflow for experimenting with new AI tools. The takeaway? The best way to learn AI is to start playing with...
Narrating Your Webcam with Your Own Voice and Mister T
I modified a demo that narrates your webcam in real time — using my own cloned voice and Mister T. It's a fun example of how little effort it takes...
Building Custom GPTs
Custom GPTs are a fast, low-effort way to embed tailored LLM functionality for your team. I built one for leadership decision-making in minutes — adding custom prompts, conversation starters, and...
Multi-Modal LLM Capabilities
LLMs can now analyze images and generate detailed narratives with minimal input. I tested this by having Claude 3.5 Sonnet interpret everyday scenes and historical documents — producing rich, context-aware...
Exploring My Unstructured Medical Data from PDFs with Claude
I used Claude to extract and visualize my blood test history from unstructured PDFs — transforming raw medical data into an interactive tool in minutes. This experiment highlights how LLMs...
Using LLMs to Build Software - A Working Game in 5 Minutes
I built a playable word game in 5 minutes using Claude's Artifacts feature — no coding required. LLMs are making software development more accessible, reducing the time to prototype, and...
AI Beyond Text Generation: A Song About a Worm
AI tools like Udio can generate full songs with lyrics and music in minutes. I tested it by making a country song about my book character, Warren the Worm. The...
Building an Economic Indicators Dashboard with Claude 3.5
I used Claude 3.5 to build an interactive dashboard that maps stock returns to economic indicators like GDP and inflation. Claude handled API calls, data processing, and visualization, letting me...
Creating an Image Classifier with Claude 3.5 and Artifacts
I built a working image classifier in 20 minutes using Claude 3.5 and Artifacts — handling everything from Flask API setup to model integration and Docker deployment. This experiment showed...
What I look for when hiring teammates
I previously wrote about the traits of a good leader in a startup. I spent some time reflecting on the key traits I value when hiring new teammates. This isn’t...
Summarizing and creating a reading list with ChatGPT
I like to read books. I’m always looking for ways to discover new titles. I mainly use Goodreads and Twitter as a discovery mechanism, and I use Goodreads to track...
Prompt Engineering Quickstart
I want to build my Prompt Engineering skills. As I highlighted in a previous post, I’m convinced that LLMs will be a huge part of the future of work. Understanding...
Clear Thinking: Take-Aways
I finished reading Shane Parrish’s new book Clear Thinking. It has quite a few nuggets of wisdom that I’d like to reference in the future. I’m trying to get better...
Warren The Worm: A Children's Book by ChatGPT and me
What is this? I used ChatGPT to generate a children’s book about a worm learning a valuable life lesson. This was just a fun way to learn more about the...
Making 'Warren The Worm': A Children's Book by ChatGPT and me
I love Ethan Mollick’s ‘Jagged Frontier’ concept with LLMs. We don’t exactly know what LLMs are great at, and what they are bad at — yet. To figure this out,...
Co-Intelligence: Take-Aways
I recently finished reading Co-Intelligence by Ethan Mollick. I was very eager to read this book, as Ethan continually posts fantastic LLM content on Twitter. The content is a really...
Leadership in startup companies
I’ve spent most of my career in small-ish VC-backed startup companies, with between 10 and 150 people. Things tend to move and change fast in small companies. Your success is...
Why I want to write
I’ve recently built a desire to write more, and wanted to dig into why I am feeling this way. What are my reasons and goals? A big part of this...
Minimum Viable Competence
There is an endless amount of stuff in the world to learn, and not enough time to learn it. The demands of modern work nudge us to build skills in...
Explaining LLMs to non-technical people
Explaining how things work to different audiences is a topic that interests me. Can you put yourself in the mind-space of the person you’re speaking with — and explain a...
Being present
Last month, during a visit to the Apple Store, I experienced an unexpected nudge towards being present — a concept I encounter frequently in books I’ve read. The idea of...
Being a good dad
I want to be a great dad. It’s not easy. Being a dad is the biggest privilege and opportunity in my life. As a parent, you have huge responsibility in...
A small thought on focus
Everything competes for your attention. The most recent thing seems like the most important. Without a framework for what you pay attention to, what happens? Things happen to you. Flip...
Top Books From 2013
At the beginning of 2013, I made a resolution to learn more things outside of Technology. In the past, I’ve been a semi-active reader, finishing about 15 books a year....
Stubbornness and Resiliency
Why default-resiliency is not the best option
The Truth and Shackleton
‘What should we tell them?’ How about the truth.
Product smells and sniff tests
Thoughts on ‘Product strategy means saying NO’
How to hack taking advice
Thoughts on being malleable instead of magnetic
Artificial Progress
Thoughts on doing ‘things about the thing’ instead of the thing itself
I've Done This Before
… can be a dangerous attitude to have when trying to solve a problem.
My bank password is 'sort-of' hashed
Yesterday, I called Fidelity to get help with my account. Before I was connected to a human, I was asked to enter my username, and then my password using the...
Seeking affirmation of your product idea can be dangerous
I often cringe when I hear people say they are ‘getting out of the building’ to test their product idea. At the core, ‘getting out of the building’ is a...
Confusing what you're good at with what needs to be done
When you work at a big company, your role is specialized. On a day-to-day basis, you don’t have to venture far from your ‘comfort zone’ of core skills to accomplish...
The Complexity Versus Value Trap
There is not a linear relationship between the complexity of a product’s features and the product’s value to an end user. The graph of complexity versus value often looks like...
A Student of the Business Model Generation
I am a new student of the Business Model Generation. I’m working to understand and apply the tools and techniques outlined by key influencers such as Steve Blank and Alexander...
Reflections on reading 'business books' as an Engineer
I’ve always appreciated a good non-fiction book. My preferred reading ‘categories’ are behavioral economics (e.g. Ariely, Thaler, Dubner, Gladwell), analyses of critical thinking, particularly in medicine (Gawande, Groopman), and timeless...
Where did this month go? And don't optimize!
People have told me that it isn’t easy to keep up a blog. Now I know what they mean! Somehow, I’ve gone an entire month without posting anything — time...
Applying Amdahl's Law to Your Life
Ok, I admit it…this may qualify as the nerdiest / lamest name for a blog post…ever…in the history of the blogosphere…but hear me out on this one, because I’m going...
Learning from the Financial Crisis
Over the last 5 years since I began working full-time, I have developed a strong interest in investing in the stock market. I started investing in mutual funds, and gradually...
What's at the CORE
The bloggers over at Newly Corporate are asking, “What 1 or 2 CORE traits get you noticed at work or help you succeed in your day-to-day operations?” Here is the...
Habits of a Rockstar Software Engineer
Every day, a software project dies. Some die a slow, painful, expensive death. Others die a quick, not painless, and relatively embarrassing death. As Software Engineers, we never want our...