AI Drug Discovery in 2026: What’s Actually Working (And What Isn’t)

Why Drug Discovery Needed AI in the First Place

Let me start with the brutal calculus of modern drug discovery. By widely cited estimates, it costs around $2.6 billion to bring a single drug to market. The timeline is 10 to 15 years. The odds that any given compound, even one that has reached Phase I, ever makes it to patients are roughly 10%. Those numbers haven’t budged much in three decades, despite exponential improvements in our ability to measure, sequence, and model biology.

The bottleneck was always the same: you can design millions of molecules in silico, but you can only synthesize and test thousands in the lab. Computational approaches existed long before “AI” became a marketing term—molecular docking, QSAR models, physics-based simulations. They were useful, but they plateaued. The predictions were better than guessing, but not better than medicinal chemists with 20 years of experience and a lucky streak.

What changed post-AlphaFold was fundamental. For the first time, we could predict the 3D structure of proteins (the actual physical shape that determines whether a molecule can bind) without needing to solve it experimentally. That sounds incremental. It isn’t. If you don’t know the shape, every prediction about whether a molecule will work is a shot in the dark.

The second shift came from generative models. Instead of predicting whether a molecule someone already proposed would work, you could now generate molecules de novo—from scratch—with specific properties in mind. Design the outcome, then design the molecule to hit it. That inverts the entire discovery process.

These aren’t hype cycles. These are legitimate phase transitions in how chemistry and biology can be modeled. And the companies that understood that early are now seeing real results in the clinic.

The Drug Discovery Pipeline: Where AI Actually Fits

If you want to understand why AI works for some parts of drug discovery and not others, you have to understand the actual pipeline. There are really four phases where AI can intervene, and they matter differently.

Target Identification: This is deciding what protein or biological pathway you want to modulate. Historically, this took years—literature review, target validation assays, sometimes animal models. AI can compress this by sifting through tens of thousands of genes and correlating them with disease biology, genetic data, and tissue expression. Companies like Berg Health and Insitro have shown that ML-driven target discovery can reveal overlooked pathways in everything from fibrosis to aging. But here’s the catch: you still have to validate that the target is druggable, that modulating it won’t kill you, and that there’s actually a patient population that benefits. AI doesn’t replace that judgment.

Lead Discovery: This is where most of the hype lives, and it’s actually where things are working best. Once you have a target, you need to find molecules that will interact with it. Historically, this meant screening thousands of existing compounds or synthesizing and testing new ones. AI can generate and prioritize millions of candidates virtually before a single molecule is made. Virtual screening—using physics-based models to predict binding—has been around for decades. Generative chemistry—building new molecules atom by atom with desired properties—is the new lever. I’ll dive deeper into this below.

ADMET Prediction and Optimization: ADMET stands for Absorption, Distribution, Metabolism, Excretion, and Toxicity. These are the properties that determine whether a molecule you found in the lab will actually work in the body. A drug could bind perfectly to its target but be toxic at the dose needed, or be metabolized too quickly to reach the site of action. Predicting these properties used to rely on lookup tables, physicochemical rules, and educated guesses. Graph neural networks and attention-based models have gotten genuinely good at this. Companies like Recursion are building their entire platform on the idea that you can predict these properties early and cheaply, killing bad compounds before expensive synthesis.
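To make the older “physicochemical rules” baseline concrete, here is a minimal sketch of one well-known example, Lipinski’s rule of five, used as a cheap early filter for oral drug-likeness. The post doesn’t say which rules any particular company uses, so treat this as an illustration only. The compound names and descriptor values are hypothetical, and the “allow at most one violation” cutoff is a common convention rather than anything from the text. A learned ADMET model would replace these fixed thresholds with predictions, but the role in the pipeline is the same: discard weak candidates before synthesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed physicochemical descriptors (values here are hypothetical)."""
    name: str
    mol_weight: float        # g/mol
    logp: float              # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(d: Descriptors) -> list[str]:
    """Return the Lipinski rule-of-five criteria that this compound breaks."""
    violations = []
    if d.mol_weight > 500:
        violations.append("mol_weight > 500")
    if d.logp > 5:
        violations.append("logP > 5")
    if d.h_bond_donors > 5:
        violations.append("h_bond_donors > 5")
    if d.h_bond_acceptors > 10:
        violations.append("h_bond_acceptors > 10")
    return violations


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Common convention: tolerate at most one violation."""
    return len(rule_of_five_violations(d)) <= max_violations


# Hypothetical candidates, not real compounds.
candidates = [
    Descriptors("cmpd_A", 342.4, 2.1, 2, 5),
    Descriptors("cmpd_B", 612.8, 6.3, 3, 9),
    Descriptors("cmpd_C", 488.0, 5.4, 1, 7),
]

survivors = [c.name for c in candidates if passes_rule_of_five(c)]
print(survivors)
```

Here cmpd_B breaks two criteria and is dropped, while cmpd_C breaks only one and survives. That kind of blunt cutoff is exactly what learned models are meant to improve on.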

Clinical Trial Design and Patient Stratification: This is where I think the next wave of value emerges. Not in the molecule design, but in finding the patients who will actually benefit. AI can analyze genetic data, biomarkers, clinical endpoints from previous trials, and patient populations to identify which subsets are most likely to respond. This is critical for rare diseases and precision medicine, where your patient population might be tiny. Shorter trials mean faster data, which means faster iteration, which means real acceleration.

The error most investors make is treating these phases as equal. They’re not. Lead discovery is maybe 30% of the problem. It’s the visible part. The hard parts are target validation, manufacturing, clinical design, and finding a market.

AlphaFold 3 and Protein Structure: What Changed

Let me be direct: AlphaFold 2, released by DeepMind in 2020, solved a decades-old problem in biology. It predicted the 3D structure of individual proteins with unprecedented accuracy. That alone justified the hype. But there were limitations. It predicted static structures. It didn’t handle protein-protein interactions well. It struggled with small molecules binding to proteins—the exact problem you need solved for drug discovery.

AlphaFold 3, released in May 2024, changed that. It now models how proteins interact with ligands, including your small molecule drugs. It models protein-protein interactions. It handles DNA and RNA. The accuracy is striking, and it’s available free to academics through a web server.

What does this actually mean for drug discovery? Two things. First, you no longer have to guess at the binding pocket. You can see it. You can model your molecule in that pocket and predict whether it will fit and bind. This is huge for structure-based drug design. Second, you can predict off-target binding—the interactions you don’t want your drug to have. That’s often where toxicity comes from.

The limiting factor now isn’t structure prediction. It’s time and compute. For a typical lead optimization campaign, you might need to run thousands of AlphaFold predictions. That’s possible, but it requires infrastructure and money. Smaller companies and academic labs are still at a disadvantage here, which is why we’re seeing consolidation around companies that have invested in protein folding infrastructure.

What AlphaFold 3 doesn’t solve: whether binding to that pocket will actually modulate the disease biology in a human being. That still requires validation. Structure is necessary, not sufficient.

Generative Chemistry: Designing Molecules From Scratch

This is where the most interesting work is happening, and I want to be careful to separate the real progress from the marketing.

Generative models for molecular design use transformers and reinforcement learning to build molecules atom by atom with specific constraints. You give the model the properties you want: “I want something that binds to this protein, has a molecular weight under 400, is orally bioavailable, and isn’t toxic.” The model then generates candidates that satisfy those criteria. You run those through AlphaFold to check binding. You rank by predicted ADMET properties. You synthesize the top candidates.
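The generate, filter, rank, synthesize loop described above can be sketched as a simple funnel. This is a toy illustration under stated assumptions, not any company’s actual platform: the candidate IDs, the predicted scores, the toxicity cutoff, and the scoring weights are all hypothetical. The only number taken from the text is the molecular weight limit of 400. In a real pipeline each score would come from a separate model (a structure predictor for binding, ADMET models for the rest).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A generated molecule with model-predicted properties (all hypothetical)."""
    candidate_id: str
    mol_weight: float            # g/mol
    binding_score: float         # 0..1, higher means stronger predicted binding
    oral_bioavailability: float  # 0..1, higher is better
    tox_risk: float              # 0..1, higher is worse


def meets_hard_constraints(c: Candidate, max_mw: float = 400.0,
                           max_tox: float = 0.3) -> bool:
    """Hard filters: under the weight limit and below an assumed toxicity cutoff."""
    return c.mol_weight < max_mw and c.tox_risk <= max_tox


def composite_score(c: Candidate, w_bind: float = 0.6, w_oral: float = 0.4) -> float:
    """Weighted sum of the soft objectives; the weights are illustrative."""
    return w_bind * c.binding_score + w_oral * c.oral_bioavailability


def select_for_synthesis(pool: list[Candidate], top_k: int) -> list[Candidate]:
    """Drop infeasible candidates, rank the rest, keep the best top_k."""
    feasible = [c for c in pool if meets_hard_constraints(c)]
    ranked = sorted(feasible, key=composite_score, reverse=True)
    return ranked[:top_k]


pool = [
    Candidate("gen_001", 385.2, 0.91, 0.40, 0.10),
    Candidate("gen_002", 412.7, 0.95, 0.80, 0.05),  # too heavy
    Candidate("gen_003", 350.9, 0.70, 0.90, 0.20),
    Candidate("gen_004", 298.4, 0.88, 0.75, 0.45),  # predicted toxicity too high
    Candidate("gen_005", 371.0, 0.82, 0.85, 0.15),
]

shortlist = select_for_synthesis(pool, top_k=2)
print([c.candidate_id for c in shortlist])
```

Note that gen_002 has the best predicted binding in the pool and still never reaches the chemist, because it fails a hard constraint. The funnel is only as good as the predictions feeding it, which is why the wet lab step at the end is not optional.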

This is not hypothetical. Exscientia, Recursion, and Inscore have all published data showing that compounds designed this way work in vitro and in vivo. In 2023, Inscore published a Phase IIa trial for a generatively designed compound for pulmonary hypertension. The molecule was designed from scratch using their platform. It worked. That’s a watershed moment.

The catch is this: generative models are often trained on existing chemical space—what’s already been made and tested. They tend to generate variations on known chemotypes, which is useful for lead optimization but less revolutionary than it sounds. The companies doing truly novel chemistry are combining generative models with reinforcement learning to explore chemotypes that haven’t been synthesized before. That’s riskier—you’re betting on new properties you haven’t validated—but it’s also where you get genuine novelty.

I’ve looked at dozens of pitches built on generative chemistry. The pattern I see is this: the models are good at finding the local optima—making a good molecule better. They’re less good at finding true novelty or at predicting properties that are genuinely under-represented in training data. And they absolutely don’t replace medicinal chemistry judgment. What they do is compress the time from “I have an idea” to “I have a testable candidate” from six months to six weeks.

AI for Clinical Trial Design and Patient Stratification

Here’s where most people miss the real opportunity. The molecule design part is maybe 20% of the value chain. The real compression comes from knowing who your patients are before you run the trial.

Most Phase II and Phase III trials are massively under-powered for subgroup analysis. You recruit 300 patients who are somewhat sick with your disease and give them your drug. Some get better, some don’t, and mostly nobody knows why. You publish the overall effect size and move on. But what if 20% of those patients had a genetic variant, a biomarker, or a clinical feature that perfectly predicted response? Then your actual responder population is 60 of 300, and you’ve failed the trial because the responders’ signal gets averaged into the noise from the non-responders.
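A back-of-the-envelope calculation shows how badly dilution hurts. The sketch below takes the 60 responders out of 300 enrolled from the example above and adds assumptions of my own: a standardized effect size of 0.8 in responders and zero in everyone else, a two-arm trial, a two-sided alpha of 0.05, 80% power, and the standard normal-approximation sample size formula. It also ignores the extra variance a mixed population adds, so read it as an order-of-magnitude illustration rather than a trial design.

```python
import math
from statistics import NormalDist


def patients_per_arm(effect_size: float, alpha: float = 0.05,
                     power: float = 0.80) -> int:
    """Normal-approximation sample size per arm for a two-arm comparison of means.

    n = 2 * (z_{1 - alpha/2} + z_{power})^2 / effect_size^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


responder_fraction = 60 / 300    # from the example in the text
effect_in_responders = 0.8       # assumed standardized effect size
# If non-responders see no benefit, the all-comers effect is simply diluted.
diluted_effect = responder_fraction * effect_in_responders

n_enriched = patients_per_arm(effect_in_responders)
n_all_comers = patients_per_arm(diluted_effect)

print(f"enriched trial:   {n_enriched} patients per arm")
print(f"all-comers trial: {n_all_comers} patients per arm")
print(f"ratio: {n_all_comers / n_enriched:.1f}x")
```

Under these assumptions the enriched trial needs about 25 patients per arm and the all-comers trial needs just over 600. Required sample size scales with one over the square of the effect size, so diluting the effect fivefold costs roughly 25 times the patients.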

AI can predict these subgroups before the trial even starts. You run a retrospective analysis on patient data (genotypes, clinical phenotypes, electronic health records) and identify which patient clusters are most likely to respond. Then you design the trial to enrich for those patients. Shorter trials. Fewer patients. Better odds of a clean readout at each phase.

I’ve seen this work in oncology (where biomarker-driven trials are standard) and we’re starting to see it in neurology and immunology. The limiting factor isn’t the ML anymore. It’s the quality and accessibility of patient data, and the regulatory clarity around running enriched trials.

This is also where there’s real risk. If you over-fit your enrichment criteria to historical data, you’ll enroll the perfect patients in your trial and then fail in the broader population. That’s a multi-hundred-million-dollar mistake. So the companies doing this right are validation-heavy. They validate their patient stratification models on held-out data before committing to trial design.
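The held-out check can be sketched in a few lines. Everything here is synthetic and hypothetical: a simulated cohort with a single biomarker whose value raises the probability of response, a deliberately naive enrichment rule (pick the biomarker cutoff with the best response rate in the training half), and a plain 50/50 split. Real stratification models are far richer, but the discipline is the same: fit on one set of patients, then measure on patients the model never saw.

```python
import random


def simulate_patients(n: int, rng: random.Random) -> list[tuple[float, bool]]:
    """Synthetic cohort of (biomarker, responded); response odds rise with biomarker."""
    patients = []
    for _ in range(n):
        biomarker = rng.random()
        p_response = 0.1 + 0.5 * biomarker
        patients.append((biomarker, rng.random() < p_response))
    return patients


def response_rate(patients: list[tuple[float, bool]], threshold: float) -> float:
    """Response rate among patients whose biomarker is at or above the threshold."""
    selected = [responded for biomarker, responded in patients if biomarker >= threshold]
    return sum(selected) / len(selected) if selected else 0.0


def fit_threshold(train: list[tuple[float, bool]], grid: list[float],
                  min_selected: int = 20) -> float:
    """Pick the cutoff with the best training response rate, given enough patients."""
    eligible = [t for t in grid
                if sum(biomarker >= t for biomarker, _ in train) >= min_selected]
    return max(eligible, key=lambda t: response_rate(train, t))


rng = random.Random(42)
cohort = simulate_patients(400, rng)
# Patients are generated independently, so a simple slice is a fair split.
train, held_out = cohort[:200], cohort[200:]

grid = [i / 20 for i in range(20)]          # candidate cutoffs 0.00 .. 0.95
threshold = fit_threshold(train, grid)

train_rate = response_rate(train, threshold)
held_out_rate = response_rate(held_out, threshold)
optimism = train_rate - held_out_rate

print(f"chosen cutoff:          {threshold:.2f}")
print(f"training response rate: {train_rate:.2f}")
print(f"held-out response rate: {held_out_rate:.2f}")
print(f"optimism (train - held-out): {optimism:+.2f}")
```

The number to watch is the gap between the training and held-out response rates. Because the cutoff was chosen to look good on the training half, the training rate tends to flatter it, and the held-out rate is the more honest estimate of what the enrolled population will do.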

Real Companies, Real Results

Let me give you concrete examples, because concrete is better than abstract.

Inscore and Pulmonary Hypertension: In January 2023, Inscore announced Phase IIa results for ISC-12, a compound designed using generative chemistry for pulmonary hypertension. The compound hit its primary endpoint. That’s huge because it’s the first time a fully generatively designed molecule has made it through a Phase IIa readout. The timeline from target to IND to Phase IIa was roughly four years. For context, the median traditional program takes seven to nine years to reach that stage. They compressed it. Did AI do all the work? No. But it was essential.

Recursion: Recursion is building an AI-first drug discovery engine at scale. They’re using generative models for lead discovery, graph neural networks for ADMET prediction, and they’re systematizing the entire workflow. In 2024, they announced their first in-human trial results for REC-6287 in neurodevelopmental disease. The company exists because AI made their cost structure feasible—otherwise, doing this many campaigns would be ruinously expensive.

Isomorphic Labs: DeepMind’s drug discovery spinout. They’ve been public about using AlphaFold for structure-based design and working with Eli Lilly on multiple programs. The interesting thing about Isomorphic is they’re not claiming they can replace medicinal chemistry. They’re claiming they can make medicinal chemists radically more productive. That’s realistic. I see more of this than I see the “AI replaces chemists” narrative.

Exscientia: They’ve been doing generative chemistry since before it was fashionable, and in late 2024 they combined with Recursion. They’ve got multiple compounds in IND-enabling studies, and they’re partnering with major pharma (Roche, Sanofi, Exelixis) to run campaigns. The partnerships matter: large pharma is credible, and they wouldn’t bet money on this if there weren’t real signal.

These are not failures. These are proof that the concept works. They’re also capital-intensive, which means they’ll likely consolidate around three to five major players and a handful of specialized boutiques.

What AI Still Can’t Do in Drug Discovery

I want to be equally clear about the frontier of what doesn’t work.

Predicting efficacy in disease context: AI is good at predicting whether a molecule will bind to a protein. It’s getting good at predicting whether it will be toxic at a given dose. It’s genuinely bad at predicting whether modulating that protein will actually treat the disease in humans. That’s because disease biology is complex—there are feedback loops, compensatory mechanisms, and variables that only emerge when you test in a living system. You can’t model away that requirement.

Manufacturing and formulation: AI can suggest a synthetic route for a molecule. But taking a route from paper to production scale, avoiding side reactions, optimizing yield, and troubleshooting real-world issues—that’s still chemists and engineers. I know companies that designed a beautiful molecule computationally and then spent three years trying to make it in volume. The molecule was the easy part.

Regulatory prediction: Can AI predict whether the FDA will approve a drug? The honest answer is no, not really. You can predict whether your clinical data is clean and your trial was well-run. But regulatory decisions involve scientific judgment, precedent, and institutional factors that are genuinely hard to model. The companies claiming they can predict drug approval are overselling what their models do.

De novo innovation in biology: AI can optimize within known chemical space. It can find better versions of known strategies. But it doesn’t have intuitions about biology. It doesn’t have the kind of embodied understanding that comes from working with cells and organisms for years. Some of the most interesting recent drug targets—like tau protein in Alzheimer’s, or LRRK2 in Parkinson’s—came from human insight and luck, not from AI screening databases.

This isn’t a knock on AI. It’s a recognition that drug discovery is still fundamentally empirical. You need wet lab validation. You need to test in organisms. You need careful observation of how real patients respond. AI compresses the parts of the pipeline that are amenable to computation. The wet lab parts remain wet.

The Investment Landscape: Where I’m Looking

When I look at drug discovery startups now, I’m looking for three things.

First: is the founder a scientist or engineer who has actually worked in drug discovery? Not someone who read about it. Someone who has felt the pain of the process. The best companies are founded by people who spent years in a medicinal chemistry lab or running clinical trials, got frustrated by the inefficiency, and built a tool to fix it.

Second: do they have real partnerships with pharma or evidence that biopharma companies want their tool? The graveyard of biotech is full of companies with beautiful platforms that nobody actually used. If you’ve got a partnership with a major pharma company or a contract research organization, that’s evidence the tool actually solves someone’s problem.

Third: are they solving a compressed problem—can they show that their approach saves time, money, or increases probability of success? Not just “here’s a neat ML model.” Show me the unit economics. Show me that using your platform instead of traditional methods gets you to a decision point faster and cheaper.

I’m also watching closely for companies focused on patient stratification and clinical trial design. That’s where I think the next wave of value is. The lead discovery problem is getting solved. The clinical problem is not.

What’s Coming in 2026 and 2027

A few things I’m confident about in the near term.

More integration of AI into large pharma workflows: The big pharmaceutical companies are not replacing their discovery capabilities. But they are building AI-native pathways for specific therapeutic areas. Pfizer, Merck, GSK—they all have significant AI groups now. This means the question isn’t “will AI discover drugs?” but “which companies will be early enough and committed enough to build real competitive advantage?”

Consolidation in the pure-play AI drug discovery space: There’s not room for 20 AI drug discovery platforms. I’d expect to see five to seven players remain relevant in five years. Some will be acquired by pharma companies. Some will go public. Some will fold.

Better foundation models for chemistry: The current generative models are good, but they’re not yet at the level of transfer learning you see in NLP or vision. We’ll see better pre-training on chemical data, better fine-tuning, and better ways to incorporate domain knowledge. This will make the models more accessible to smaller companies.

Regulatory clarity on AI-designed drugs: So far only a handful of AI-designed compounds have reached human trials. By 2027, we’ll have more. The FDA will have seen real data and will have developed clear expectations around what validation is needed for computationally designed drugs. This clarity will unlock capital and accelerate more companies into the clinic.

Real progress on protein engineering with AI: Not drug discovery, but related. Companies are using AI to design proteins from scratch—for diagnostics, for therapeutics, for manufacturing. This is earlier stage than small molecules, but the progress is fast. We could see real therapeutic proteins designed by AI in clinical trials within five years.

Conclusion: The Realistic Picture

If you’re looking at AI drug discovery, here’s what I actually believe. AI has largely solved static protein structure prediction. It’s genuinely good at optimizing molecules once you have a promising starting point. It’s useful for ADMET prediction and early toxicology. It will be transformative for patient stratification and trial design. It has not solved efficacy prediction, manufacturing, or regulatory outcomes.

The companies winning right now are those that understand these boundaries and treat AI as one tool among many—not as a silver bullet. They’re also moving fast and iterating on real feedback from pharma partners, not just iterating on benchmark papers.

The opportunity is real. The timeline is compressed. But it’s still drug discovery. It still takes capital, discipline, and a willingness to fail.

