McMaster AI model searches 46 billion compounds for antibiotic, expanding drug discovery beyond lab limits

The computer expands the search, but the lab narrows the claim.

AI can explore billions of possibilities, but experimental biology remains the final authority on whether a drug actually works.

At McMaster University, a researcher has turned the logic of drug discovery inside out. Rather than asking which known compounds might fight bacteria, an AI system called SyntheMol-RL asks which compounds could be built, and which of those are worth building first. It navigates a space of 46 billion synthesizable molecules, a search no physical laboratory could replicate. The result is not a medicine but a wider horizon, arrived at before a single flask has been filled.

Antibiotic resistance is outpacing discovery — bacteria evolve faster than researchers can find new drugs, and familiar chemical libraries keep yielding the same dead ends.
SyntheMol-RL breaks the physical ceiling by searching 46 billion synthesizable compounds, a scale thousands of times beyond the few million that robotic lab screening can reach.
The system doesn't optimize for a single goal — it uses reinforcement learning to balance antibacterial potency, synthesizability, low toxicity, and novelty simultaneously, discarding the vast majority of candidates rapidly.
A promising molecule has emerged from the search, but it now faces the full gauntlet: synthesis, bacterial testing, toxicity studies, animal trials, and eventual clinical evaluation.
The bottleneck hasn't vanished — it has shifted, moving the hardest unknowns from the computational front end back to the laboratory bench where biology has final say.

A researcher at McMaster University has built an AI system capable of searching 46 billion possible chemical compounds — a scale no physical laboratory could approach. SyntheMol-RL was designed to find a new antibiotic candidate not by testing what already exists on shelves, but by asking which molecules could plausibly be made, and which of those deserve to be made first.

Traditional drug discovery is bounded by the physical world. Even with robotics and automation, a laboratory might screen a few million compounds — a vanishingly small fraction of what chemistry permits. Every change to a molecular structure shifts whether a compound kills bacteria, harms patients, resists degradation, or invites resistance. That combinatorial vastness has long remained invisible to researchers, who were limited to whatever happened to be within reach.
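As a rough check on that gap, the arithmetic can be sketched in Python. The figure of 3 million used for "a few million" is an assumption chosen for illustration; only the 46 billion comes from the reporting.

```python
# Back-of-the-envelope comparison of the search space to a physical screen.
SEARCH_SPACE = 46_000_000_000   # synthesizable molecules, as reported
LAB_SCREEN = 3_000_000          # ASSUMPTION: one reading of "a few million"


def scale_ratio(space: int, screened: int) -> float:
    """How many times larger the search space is than the screened set."""
    return space / screened


def coverage_percent(space: int, screened: int) -> float:
    """Share of the search space a physical screen would cover, in percent."""
    return 100 * screened / space


if __name__ == "__main__":
    print(f"ratio: {scale_ratio(SEARCH_SPACE, LAB_SCREEN):,.0f}x")
    print(f"coverage: {coverage_percent(SEARCH_SPACE, LAB_SCREEN):.4f}%")
```

Under that assumption the space is roughly fifteen thousand times larger than the screen, and the screen covers well under a hundredth of a percent of it.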

SyntheMol-RL reframes the question entirely. Constrained to molecules buildable from known reactions and available building blocks, it explores its enormous search space by scoring candidates across competing criteria at once: predicted antibacterial activity, synthesizability, toxicity, and novelty. Reinforcement learning allows the system to navigate toward candidates that balance all four — discarding the overwhelming majority quickly, leaving a shortlist worth the expense of physical testing.
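To make "balancing all four" concrete, here is a minimal Python sketch of multi-criteria scoring. It is an illustration only, not SyntheMol-RL's scoring function: the candidate names, the numbers, and the choice to multiply the criteria together are all assumptions. Multiplying is one simple way to ensure that a molecule failing any single criterion sinks to the bottom.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    activity: float          # predicted antibacterial activity, 0..1 (higher is better)
    synthesizability: float  # predicted ease of synthesis, 0..1 (higher is better)
    toxicity: float          # predicted toxicity, 0..1 (lower is better)
    novelty: float           # distance from known structures, 0..1 (higher is better)


def score(c: Candidate) -> float:
    """Product of the four criteria, with toxicity inverted so higher is better."""
    return c.activity * c.synthesizability * (1 - c.toxicity) * c.novelty


def shortlist(candidates: list[Candidate], k: int) -> list[Candidate]:
    """Return the k best-scoring candidates, best first."""
    return sorted(candidates, key=score, reverse=True)[:k]


# HYPOTHETICAL candidates; none of these values come from the study.
CANDIDATES = [
    Candidate("potent_but_toxic", 0.95, 0.80, 0.90, 0.50),
    Candidate("novel_but_unmakeable", 0.70, 0.05, 0.20, 0.95),
    Candidate("balanced", 0.80, 0.80, 0.10, 0.70),
    Candidate("weak", 0.20, 0.90, 0.10, 0.30),
]

if __name__ == "__main__":
    for c in shortlist(CANDIDATES, 2):
        print(c.name, round(score(c), 4))
```

In this toy set the balanced candidate ranks first, while the highly potent but toxic one falls behind even the weak one.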

The stakes are real. Antibiotic discovery has become a grinding problem. Familiar chemical collections have been searched so thoroughly that researchers keep rediscovering weak or toxic structures. Bacteria evolve resistance, and new drugs must clear an extraordinary bar — killing pathogens without harming patients, reaching infection sites, and avoiding resemblance to compounds bacteria have already learned to defeat.

What SyntheMol-RL has produced is a candidate, not a cure. That molecule must still be synthesized, tested against real bacteria and human cells, studied for resistance patterns, and carried through animal trials and clinical evaluation. The word 'candidate' marks a beginning. What the system genuinely offers is a less blind beginning — a first step drawn from a far larger and more chemically diverse space than any physical library could contain. The laboratory retains final authority. But the search that leads to it may no longer be limited to whatever happens to be in reach.

A researcher at McMaster University has built an artificial intelligence system that can search through 46 billion possible chemical compounds—a scale that no physical laboratory could ever hope to match. The system, called SyntheMol-RL, was designed to find a new antibiotic candidate by exploring a vast landscape of molecules that could theoretically be synthesized, then narrowing the possibilities down to a handful worth actually making and testing in the lab.

This matters because traditional drug discovery has always been constrained by the physical world. A chemist working in a laboratory can test hundreds of thousands of compounds, maybe a few million if robotics and automation are involved. But that is still a tiny fraction of the molecules that could exist. Each small change to a molecular structure—a different atom here, a different bond there—can alter whether the compound kills bacteria, whether it poisons the patient, whether it dissolves in water, whether bacteria can evolve resistance to it. Multiply those variables by all the available building blocks and reaction pathways, and the theoretical space of possible molecules becomes almost incomprehensibly large. For decades, researchers have been limited to testing whatever compounds happened to sit on their shelves or could be quickly synthesized. The rest remained invisible.

SyntheMol-RL changes that equation by asking a different question. Instead of "Which of the compounds we already have might work?" it asks "Which compounds could we make, and which of those are worth making first?" The system does not imagine molecules at random. It is constrained to search only through chemical structures that could plausibly be built using known reactions and available building blocks—the kind of practical limitation that keeps a computational model tethered to reality. Within that constraint, it explores a space 46 billion molecules large, scoring each one not on a single criterion but on multiple competing factors: predicted antibacterial activity, likelihood of being synthesizable, predicted toxicity, novelty. This is where reinforcement learning comes in. Rather than simply ranking molecules, the system learns to navigate toward candidates that balance all these criteria at once. A molecule that kills bacteria in a dish but also damages human cells is not useful. A structure that looks novel but cannot be reliably synthesized is a dead end. The model's job is to discard the overwhelming majority of possibilities quickly, leaving researchers with a shortlist of candidates that actually deserve the time and expense of physical testing.
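One way to picture a learning loop of this kind is a toy epsilon-greedy search, sketched below in Python. This is not SyntheMol-RL's algorithm, which the reporting does not detail. The block names, the single two-slot reaction, and the quality numbers are all hypothetical, and a simple lookup table stands in for the property predictors. The search only ever combines blocks allowed in each slot, so every proposal is buildable by construction.

```python
import random

# HYPOTHETICAL building blocks; the numbers stand in for predicted quality.
BLOCK_QUALITY = {"A1": 0.2, "A2": 0.9, "A3": 0.4, "B1": 0.1, "B2": 0.3, "B3": 0.8}
LEFT = ["A1", "A2", "A3"]    # blocks allowed in the first slot of the reaction
RIGHT = ["B1", "B2", "B3"]   # blocks allowed in the second slot


def reward(left: str, right: str) -> float:
    """Toy reward: mean quality of the two blocks (a stand-in for a scored molecule)."""
    return (BLOCK_QUALITY[left] + BLOCK_QUALITY[right]) / 2


def pick(options, values, epsilon, rng):
    """Epsilon-greedy choice: explore at random, otherwise take the best estimate."""
    if rng.random() < epsilon:
        return rng.choice(options)
    return max(options, key=lambda b: values[b])


def search(steps: int = 3000, epsilon: float = 0.2, seed: int = 0):
    """Learn which block to prefer in each slot from the rewards of past proposals."""
    rng = random.Random(seed)
    values = {b: 0.0 for b in LEFT + RIGHT}   # running mean reward per block
    counts = {b: 0 for b in LEFT + RIGHT}
    for _ in range(steps):
        left = pick(LEFT, values, epsilon, rng)
        right = pick(RIGHT, values, epsilon, rng)
        r = reward(left, right)
        for b in (left, right):
            counts[b] += 1
            values[b] += (r - values[b]) / counts[b]   # incremental mean update
    return max(LEFT, key=lambda b: values[b]), max(RIGHT, key=lambda b: values[b])


if __name__ == "__main__":
    print(search())
```

The point of the sketch is the shape of the loop: propose a buildable combination, score it, and shift future proposals toward the blocks that have paid off, rather than scoring every combination exhaustively.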

Antibiotics are a particularly hard case. Discovery has become a grinding, expensive problem. Bacteria evolve resistance to drugs. Many familiar chemical collections have been searched so many times that researchers keep finding the same kinds of molecules: some weak, some toxic, some merely rediscoveries of structures that are already known. A new antibiotic has to kill or suppress bacteria without harming the patient, reach the site of infection, survive long enough to work, and avoid resembling drugs that bacteria have already learned to resist. Even a successful antibiotic may be deliberately held in reserve, used sparingly so it remains effective against resistant strains. The World Health Organization has called antimicrobial resistance a major global public-health threat. New discovery methods do not solve that problem on their own, but they can widen the pool of candidates from which future drugs might emerge.

It is important to be precise about what SyntheMol-RL has and has not delivered. Its output is not an approved antibiotic. It is not even proof that a medicine will emerge. The system has identified a candidate: a molecule that looks promising in computational models. That candidate still has to be synthesized in a real laboratory, purified, tested against actual bacteria, tested against human cells to ensure it does not cause harm, examined for resistance patterns, checked for stability, and studied in increasingly complex biological systems. If it survives those steps, it still faces animal studies, formulation work, dosing studies, manufacturing challenges, and clinical trials in human patients. The word "candidate" does a lot of work here. It marks a beginning, not an ending.

What makes this approach genuinely useful is the tension it creates between computation and experiment. The computer expands the search space, asking what could be made and what might work. The laboratory then asks humbler, more grounded questions: can this molecule actually be synthesized, does it dissolve, does it kill the bacteria we care about, does it spare human cells, does it retain activity in realistic biological conditions? That is not a weakness of the system. It is the point. The best use of AI in drug discovery is not to replace experimental biology but to make the first experimental steps less blind. If researchers can begin with candidates drawn from a much larger and more diverse chemical space than any physical library could contain, the odds of finding unusual molecular scaffolds—structures that bacteria have not yet learned to resist—may improve. The bottleneck in antibiotic discovery has not disappeared. It has simply moved. The hard work still returns to the bench, where real molecules meet real bacteria. But the search that leads to those molecules may no longer be limited to whatever happens to be in reach.

Notable Quotes

A candidate is a beginning, not an ending

— Source material on the status of AI-identified molecules in drug development

The best use of AI in this setting is not to remove experimental biology, but to make the first experimental steps less blind

— Source material on the relationship between computational and laboratory drug discovery

The Hearth Conversation: Another angle on the story

Inventor

Why does 46 billion matter? Why not just say "a lot of compounds"?

Model

Because the number tells you something about the relationship between what a computer can explore and what a lab can test. A physical screen might handle a few million. Forty-six billion is not just bigger—it is a different category of search. It means the model can look in places no wet lab would ever bother to go.

Inventor

But the candidate still has to be made in the lab, right? So doesn't the computer just move the work around?

Model

It does move the work, but in a useful direction. Instead of testing millions of compounds to find a few that might work, you test a few compounds that a model has already filtered from billions. The computer does the cheap elimination. The lab does the expensive validation.

Inventor

What stops the AI from just proposing molecules that look good on paper but are impossible to actually synthesize?

Model

That is the constraint built into SyntheMol-RL. It does not search through all imaginable structures. It searches only through molecules that could be built using known reactions and available building blocks. It is still a huge space, but it is a space tethered to chemistry that actually works.

Inventor

So this solves the antibiotic resistance problem?

Model

No. It helps. Resistance is a biological problem—bacteria evolve, they share genes, they adapt. A new discovery method can widen the pool of candidates, which might mean finding scaffolds bacteria have not yet learned to resist. But the method itself does not stop resistance from happening.

Inventor

When will this antibiotic candidate actually be available as a medicine?

Model

That is years away, if it happens at all. The candidate has to be synthesized, tested against bacteria, tested for toxicity, studied in animals, and then studied in human trials. Many candidates fail at each stage. The computer found a promising starting point. The lab still has to do the work.

Inventor

What is the real innovation here—the AI, or the idea of constraining the search to synthesizable molecules?

Model

Both matter, but the constraint is what keeps the innovation grounded. Any system can imagine molecules. The trick is imagining molecules that chemists can actually make. That is what makes this different from earlier generative models that produced beautiful structures no one could synthesize.

Read the original at Space Daily.

