Evaluating New Technology
Evaluating new technology is about separating signal from hype. The best product leaders develop a personal relationship with emerging tools—using them directly rather than reading about them—while maintaining healthy skepticism about vendor promises. The goal is to understand what's genuinely possible today versus what's marketing.
The Guide
5 key steps synthesized from 22 experts.
Use the technology yourself before deciding
You can't evaluate what you haven't used. Create personal projects that force you to use new tools beyond demos and tutorials. Solve a real problem. Hit the edges. The only way to develop intuition is through direct experience, not through reading think pieces or watching demos.
Featured guest perspectives
"Really try and use these tools yourself... we learn a lot about how our own workflow can change, and that's going to tell you so much more about how you're going to change your organization's workflow than if you're reading a bunch of think pieces on LinkedIn."— Dhanji R. Prasanna
"I try to use as many different AI products, including not Airtable, as I can... I try to invent little, almost like side projects of my own, to have a real reason to use these products."— Howie Liu
Update your priors constantly
What couldn't work last year might work today. The hardest part of evaluating fast-moving tech is letting go of conclusions from previous experiments. Actively fight the 'scar tissue' from past failures. The technology you dismissed six months ago may have fundamentally changed.
Featured guest perspectives
"My impression of it from trying it a few months ago, that prior needs to be updated. And it's hard to do that, right? You have to do something almost counterintuitive and against the grain to say, 'No, no, ignore what you learned about what this can or cannot do.'"— Aparna Chennapragada
"The truth is the value is changing every day. And so you need to be adaptable and look at what the value is today and plan for what the value will be tomorrow."— Dhanji R. Prasanna
Think 'build AND buy' not 'build vs. buy'
The choice isn't binary. Buy tools for the 90% of functionality that's standard, then build the 10% that makes you unique. Building everything from scratch wastes engineering time. Buying everything locks you into someone else's vision. The sweet spot is strategic combination.
Featured guest perspectives
"Build and buy as opposed to build versus buy... If it's only build versus buy, then you've already made the decision that you can only do one or the other... Buy tools to handle 90% of the standard functionality and build the 'cool' 10% that is unique to your business."— Austin Hay
"I think the key decision is whether you want to build or buy... It's usually not a zero one, it's usually both. You build some and you buy some."— Ronny Kohavi
Be skeptical of 'one-click' solutions
When vendors promise turnkey AI that just works, be suspicious. Real enterprise deployment is messy—data is scattered, systems don't talk to each other, edge cases abound. Prefer partners who talk about building pipelines that improve over time rather than magic that works out of the box.
Featured guest perspectives
"When someone comes up to me and says, 'We have this one click agent, it's going to be deployed in your system.' I would almost be skeptical because it's just not possible... I would rather go with a company that says, 'We're going to build this pipeline for you,' and that will learn over time."— Aishwarya Naresh Reganti + Kiriti Badam
"AI guardrails do not work... When these guardrail providers say, 'We catch everything,' that's a complete lie."— Sander Schulhoff
Design for swappability
The AI stack is evolving rapidly. Today's best model is tomorrow's commodity. Build abstraction layers that let you swap components without rebuilding everything. Bet on platforms that support model diversity. Avoid lock-in that will hurt you when something better arrives.
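The abstraction-layer idea can be sketched in a few lines. The adapter classes and `App` below are hypothetical and not tied to any real SDK; this is a minimal sketch of how a narrow interface keeps the provider choice a one-line change:

```python
from typing import Protocol


class TextModel(Protocol):
    """Minimal interface every provider adapter must satisfy."""
    def complete(self, prompt: str) -> str: ...


class OpenAIAdapter:
    """Illustrative stand-in; a real adapter would call the provider's SDK."""
    def complete(self, prompt: str) -> str:
        return f"[openai] {prompt}"


class AnthropicAdapter:
    """Illustrative stand-in for a second provider."""
    def complete(self, prompt: str) -> str:
        return f"[anthropic] {prompt}"


class App:
    """Application code depends only on the TextModel interface,
    never on a concrete provider."""
    def __init__(self, model: TextModel) -> None:
        self.model = model

    def summarize(self, text: str) -> str:
        return self.model.complete(f"Summarize: {text}")


app = App(model=OpenAIAdapter())
print(app.summarize("quarterly report"))  # [openai] Summarize: quarterly report

# Swap the underlying model without touching App:
app.model = AnthropicAdapter()
print(app.summarize("quarterly report"))  # [anthropic] Summarize: quarterly report
```

The design choice is that `App` never imports a provider; when something better arrives, only the adapter and one constructor argument change.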
Featured guest perspectives
"You really need to bet on a platform or some app server type layer that allows you to swap things in and out and not really be beholden to any one technology or any one tool because the reality is the whole thing is going to change."— Asha Sharma
"Before React wins, there's a competing technology called Web Component from Google... we're betting on that technology. And then we realize because it's so new, it's just so unstable... Then we have to restart the company, rebuild the whole thing."— Ivan Zhao
Common Mistakes
- Evaluating technology based on demos and marketing rather than hands-on usage
- Letting old conclusions about what doesn't work persist past their expiration date
- Treating build vs. buy as binary instead of combining approaches strategically
- Believing vendor promises about turnkey solutions that will 'just work'
- Locking into specific tools without planning for the ecosystem to evolve
Signs You're Doing It Well
- You can articulate from personal experience what a technology does well and where it breaks
- Your team regularly re-evaluates previous technology decisions as the landscape changes
- You have a clear framework for deciding what to build internally vs. buy externally
- Vendors compete for your business based on substance rather than hype
- Your architecture can accommodate new tools without major rewrites
All Guest Perspectives
Deep dive into what all 22 guests shared about evaluating new technology.
Aishwarya Naresh Reganti + Kiriti Badam
"When someone comes up to me and says, 'We have this one click agent, it's going to be deployed in your system.' ... I would almost be skeptical because it's just not possible. And that's not because the models aren't there, but because enterprise data and infrastructure is very messy... I would rather go with a company that says, 'We're going to build this pipeline for you,' and that will learn over time."
- Evaluate AI vendors based on their ability to build a learning pipeline rather than a static 'one-click' agent.
- Assess the readiness of your internal data layer before attempting to deploy autonomous agents.
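The "pipeline that learns over time" preference can be sketched as a feedback loop: human corrections are stored and replayed as few-shot examples on later calls. `base_model` and the in-memory list below are illustrative stand-ins for a real model call and a real feedback store.

```python
class FeedbackPipeline:
    """Sketch of a pipeline that improves with use: corrections made by
    reviewers are saved and fed back to the model as examples."""

    def __init__(self, base_model):
        self.base_model = base_model        # any callable: prompt -> text
        self.corrections = []               # (input, corrected_output) pairs

    def run(self, item: str) -> str:
        # Prepend the most recent corrections as few-shot examples.
        examples = "\n".join(f"{i} -> {o}" for i, o in self.corrections[-5:])
        return self.base_model(f"Examples:\n{examples}\nInput: {item}")

    def record_correction(self, item: str, corrected: str) -> None:
        """Called when a human fixes a bad output; the fix becomes training signal."""
        self.corrections.append((item, corrected))
```

Contrast this with a "one-click" agent: there is no path for messy enterprise data and edge cases to feed back into behavior.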
Aparna Chennapragada
"The models couldn't do some things one year ago. I mean, image generation was full of spellings or reasoning. You just couldn't have deeper and smarter answers. You couldn't do data analysis. So my impression of it from change, trying it a few months ago, that prior needs to be updated. And it's hard to do that, right? You have to do something almost counterintuitive and against the grain to say, 'No, no, ignore what you learned about what this can or cannot do.' The baby just grew up to be a 15-year-old in a month."
- Actively work to ignore 'scar tissue' from previous technical limitations
- Regularly re-test assumptions about what AI can and cannot do every few months
- Demand more from current technology rather than relying on old benchmarks
Asha Sharma
"you really need to bet on a platform or some app server type layer that allows you to swap things in and out and not really be beholden to anything, any one technology or any one tool because the reality is the whole thing is going to change."
- Bet on platforms that support model diversity and easy swapping of components
- Prioritize tools that offer high observability and evaluation capabilities
Austin Hay
"I have this adage I always say, which is tools are just meant to solve problems. And the problem set for marketing technologists and business technologists is you focus on the tools."
- Always define the problem and the people involved before selecting a system or tool.
- Avoid 'tool bias'—picking a tool just because you've used it before.
"It's B and B as opposed to BVB. So, build and buy as opposed to build versus buy. People all the time just think the second that you're talking about implementing a tool or procuring a solution, it's, Hey, I want to build this thing or I want to buy this really expensive thing. Build versus buy is a very narrowly constricting decision tree. If it's only build versus buy, then you've already made the decision that you can only do one or the other... Build and buy means that both of you can win"
- Buy tools to handle 90% of the standard functionality and build the 'cool' 10% that is unique to your business.
- Use a financial model to show that building on top of a third-party tool often yields higher ROI than building from scratch.
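That kind of financial model can be a few lines of arithmetic. Every number below (engineering cost per feature, vendor fee, maintenance rate) is invented for illustration; the sketch only shows the shape of the comparison, not real costs:

```python
def annual_cost(build_fraction, eng_cost_per_feature=50_000,
                vendor_fee=120_000, n_features=20, maintenance_rate=0.3):
    """Rough annual cost of covering n_features when you build some
    fraction in-house and buy the rest. All inputs are illustrative."""
    built = build_fraction * n_features
    # In-house features carry ongoing maintenance on top of build cost.
    build_cost = built * eng_cost_per_feature * (1 + maintenance_rate)
    # Vendor fee applies unless you build literally everything.
    buy_cost = vendor_fee if build_fraction < 1 else 0
    return build_cost + buy_cost


for frac in (0.0, 0.1, 1.0):
    print(f"build {frac:.0%}: ${annual_cost(frac):,.0f}")
# build 0%: $120,000
# build 10%: $250,000
# build 100%: $1,300,000
```

Even with generous assumptions, building everything dwarfs the other options; the "build 10%" line buys differentiation, not cost savings, which is the point of build and buy.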
Brandon Chu
"if you're not technical, really lean into it and build something simple, learn how to build something simple for yourself, demystify the technology. That experience will take you far. I love telling people that literally don't even know what HTML is... you could build a clone of Twitter using a tutorial on Rails"
- Build a simple clone of a popular app using a tutorial to understand basic architecture
- Focus on demystifying how data flows rather than mastering the code
Camille Fournier
"GraphQL... is one of the things that is both popular and thought relatively poorly of by most of the senior people that I know... if you're seriously thinking about it and you're not Facebook, you may really want to make sure you know what problem you're trying to solve because the impression that I have... is that GraphQL is kind of trying to promise front-end engineers that they don't really have to collaborate with backend engineers."
- Identify the specific problem a new framework is intended to solve before adopting it
- Be wary of frameworks backed by VC-funded startups whose goal is adoption over utility
Bret Taylor
"I do still think studying computer science is a different answer than learning to code... computer science is a wonderful major to learn systems thinking."
- Prioritize understanding 'systems thinking' and complexity over learning specific syntax
Christine Itwaru
"We had someone at one point handle whatever tools connected to Pendo and make sure that those systems are set up for maximum outcomes for the product manager. So, Pendo's connected to Salesforce. We're connected to Looker. We're connected to all these different."
- Audit the product manager's tool stack to ensure maximum data integration
- Connect product analytics tools (like Pendo) to CRM systems (like Salesforce) to provide a complete customer picture
Dan Shipper
"My first thing that I open is o3. I'm a ChatGPT boy... it has memory. And I just love that... I think Claude Opus is... Claude Code, everyone inside Every, that's basically what we use... Gemini... It's incredibly powerful and it's incredibly cheap, which is great."
- Use ChatGPT (o3) for tasks requiring long-term memory and personal style consistency
- Use Claude Code for autonomous engineering tasks and Gemini for high-volume, low-cost API applications
Dhanji R. Prasanna
"The savings and costs that there might be in replacing a vendor tool by something you build in-house is probably not worth it in the mental bandwidth that you've lost and the amount of the team's technical focus that's being taken away. ... I would always come back to what is the reason we're doing this? Why does it matter to us and to our customers?"
- Evaluate build vs. buy decisions based on 'mental bandwidth' lost rather than just direct dollar costs
- Question if a process is even necessary before deciding to automate it or buy a tool for it
"I would say really try and use these tools yourself. ... we learn a lot about how our own workflow can change, and that's going to tell you so much more about how are you going to change your organization's workflow than if you're reading a bunch of think pieces on LinkedIn."
- Solve a specific, personal, real-world problem with a new tool to understand its true strengths and weaknesses
- Encourage the executive team to use the product daily to drive authentic adoption
Howie Liu
"I try to use as many different AI products, including not Airtable, as I can... I try to invent little, almost like side projects of my own, to have a real reason to use these products."
- Invent personal side projects to force deep usage of new AI tools
- Study the 'prior art' of AI products to understand emerging UX patterns and form factors
Ivan Zhao
"Before React wins, there's a competing technology called Web Component from Google... we're betting on that technology. And then we realize because it's so new, it's just so unstable. It don't know where the bug come from. It's from your source code or from the underlying libraries? Then we have to restart the company, rebuild the whole thing."
- Distinguish between 'orthodox' technology foundations and experimental ones.
- Be willing to 'throw away the code' and rebuild on a more stable foundation if the current stack is hindering progress.
Manik Gupta
"One of the things that we used to talk about all the time at Google on Maps was how would we design a navigation product when people are in self-driving cars?... It's going to take years, but it is just such a different paradigm. It's like computers talking to computers, algorithms talking to other algorithms. Then there's a human in the mix in terms of serving the human at the end, but it's like the human is not initiating that much."
- Analyze how a new technology shifts the user from an 'initiator' to a 'passenger' or 'consumer'.
- Consider the 'computer-to-computer' interaction layer when designing for future tech stacks.
Naomi Ionita
"The modern growth stack... is the evolution of what you do with the data. So these are the workflows that the data enables to drive the business forward for product growth and revenue teams like I used to run. It's the modern replacement for infrastructure that teams like mine built or bought."
- Evaluate tools like Hightouch or Census (Reverse ETL) to break down data silos
- Look for tools that offer 'hard ROI' through either cost reduction or revenue generation
"Eppo, which offers experimentation for the modern data stack. So unlike Optimizely, which focused on more kind of click through metrics, Eppo ties directly to the metrics in your data warehouse. So tying an experiment result to things like subscriptions or revenue or margins, really like board level metrics that you're trying to move."
- Prioritize experimentation tools that integrate with the data warehouse (e.g., Snowflake, Redshift)
Ronny Kohavi
"I think the key decision is whether you want to build or buy. ... It's usually not a zero one, it's usually both. You build some and you buy some, and it's a question of do you build 10% or do you build in 90%? I think for people starting, the third party products that are available today are pretty good."
- Start with third-party experimentation vendors to avoid the high initial cost of building a platform
- Evaluate the build vs. buy decision based on the maturity of the organization's experimentation culture
Ryan J. Salva
"This product in particular, I probably spent more time with legal than any other products that I've ever kind of been responsible for. ... It is also privacy and security champions. It is, frankly, developers, like the people who are using it, listening to them. ... We're actually at a place now where we're able to partner with the Azure Department of a Responsible AI, and they've created some really extraordinary models that help detect I'll call it sentiment for lack of a better word, but basically when there is something that is patently offensive."
- Engage legal and privacy teams early when dealing with novel technology like LLMs
- Use secondary AI models to monitor and filter the outputs of primary generative models for safety
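The secondary-model pattern is a thin gate between the generator and the user. `moderate` below is a keyword stand-in for a real safety model or moderation endpoint, and the term list is invented; this is a sketch of the wiring, not a working filter:

```python
def moderate(text: str) -> bool:
    """Stand-in for a secondary safety model. A real system would call
    a moderation model here; returns True if the text is acceptable."""
    blocked_terms = {"offensive-term"}  # illustrative placeholder list
    return not any(term in text.lower() for term in blocked_terms)


def generate_with_filter(prompt, generate, fallback="I can't help with that."):
    """Run the primary generator, then gate its output through the
    secondary check before anything reaches the user."""
    draft = generate(prompt)
    return draft if moderate(draft) else fallback
```

The design choice is that the filter sits outside the primary model, so either side can be swapped or upgraded independently.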
Sander Schulhoff
"AI guardrails do not work. I'm going to say that one more time. Guardrails do not work. If someone is determined enough to trick GPT-5, they're going to deal with that guardrail. No problem. When these guardrail providers say, 'We catch everything,' that's a complete lie."
- Do not rely on third-party guardrails as a primary security layer
- Be skeptical of '100% catch rate' claims from AI security vendors
"AI red teaming works too well. It's very easy to build these systems and they always work against all platforms... these automated red teaming systems are not showing anything novel. It's plainly obvious to anyone that knows what they're talking about that these models can be tricked into saying whatever very easily."
- Recognize that automated red teaming results against off-the-shelf models are expected and not necessarily a unique flaw in your implementation
Yuriy Timen
"Media Mix Modeling is now making a comeback... the company that's leading the charge of bringing the Media Mix Modeling methodology of the traditional advertising era and ushering it into the digital world is a company called the Recast."
- Consider MMM tools like Recast if spending over $100k/month across 3+ channels.
- Use incrementality testing (randomized control experiments) to determine true causality in ad spend.
Andrew Wilkinson
"I finally got one that actually works. It's called the Matic Vacuum... basically, I think it's like former Google Engineers basically built like a mini Waymo car. So it has machine vision and it will avoid absolutely everything."
- Prioritize hardware built by engineers with backgrounds in autonomous vehicles or high-end computer vision.
- Look for 'machine vision' as a key differentiator in robotics to ensure reliability.
Hila Qu
"From infrastructure perspective, on data tool, my first tool usually, one is some sort of data hub segment, right? This next one is some sort of a product analytics tool. Think about Amplitude. I know PostHog is actually a pretty popular one... The third piece I think that's pretty essential, I counted in the infra, is some sort of a lifecycle marketing tool."
- Implement a data hub like Segment to allow for flexible tool integration
- Use lifecycle marketing tools that trigger based on in-product behavior rather than just email opens
"The success of this is you identify the gaps and eventually you want to establish something called the data dictionary... The data dictionary will include, here are all the key actions, what's the event name for each of those, and what are the property and things like that."
- Perform a data instrumentation audit to identify gaps
- Create a centralized data dictionary for the entire team to reference
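A data dictionary can start as a plain mapping that instrumentation code validates against. The event names and properties below are invented examples, not any real product's schema:

```python
# Shared reference: every tracked event, what it means, and its allowed properties.
DATA_DICTIONARY = {
    "signup_completed": {
        "description": "User finished the signup flow",
        "properties": ["plan", "referrer"],
    },
    "report_exported": {
        "description": "User exported a report",
        "properties": ["format", "row_count"],
    },
}


def validate_event(name: str, properties: dict) -> list:
    """Return a list of problems with an incoming analytics event,
    checked against the shared data dictionary."""
    if name not in DATA_DICTIONARY:
        return [f"unknown event: {name}"]
    allowed = set(DATA_DICTIONARY[name]["properties"])
    return [f"unexpected property: {p}" for p in properties if p not in allowed]


print(validate_event("signup_completed", {"plan": "pro"}))   # []
print(validate_event("signup_completed", {"coupon": "X"}))   # ['unexpected property: coupon']
```

Running the validator in CI or at ingestion time is one way to surface the instrumentation gaps the audit is meant to find.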
Jeanne Grosser
"I think the calculus on build versus buy is changing... because this whole space is so nascent, often your own esoteric context, your content, your workflow is really key to unlocking the power of the agent. And so I think there's value in experimenting with your own internal agent development."
- Experiment with building internal agents for specific workflows before procuring fragmented AI tools.
Tobi Lutke
"Monorepo, now for companies, it's a very much one of those door A, door B kind of things. It's a very consequential choice that is incorrect to go say yes to at a certain size, and then it becomes very correct in my mind to say yes to, but at that point it's an enormous amount of effort. So, it's a kind of thing that actually is something I'm uniquely positioned to be involved with because it's actually a business strategy thing as well."
- Evaluate technical infrastructure choices (like monorepos) as business strategy investments
- Recognize that technical choices that were 'incorrect' at one size may become 'correct' at another
- Use leadership influence to compress the time spent on contentious technical change management
Install This Skill
Add this skill to Claude Code, Cursor, or any AI coding assistant that supports Agent Skills.
Download the skill
Download SKILL.md
Add to your project
Create a folder in your project root and add the skill file:
.claude/skills/evaluating-new-technology/SKILL.md
Start using it
Claude will automatically detect and use the skill when relevant. You can also invoke it directly:
Help me with evaluating new technology
Related Skills
Other AI & Technology skills you might find useful.
AI Product Strategy
AI strategy should focus on using algorithms to scale human expertise and judgment rather than just...
Building with LLMs
Using LLMs for text-to-SQL can democratize data access and reduce the burden on data analysts for ad...
Platform Strategy
Platform and ecosystem success comes from identifying 'gardening' opportunities—projects with inhere...
Vibe Coding
The guest repeatedly uses this term to describe a new mode of development where non-engineers (desig...