Synthegy: LLM-Based Framework for Retrosynthesis and Reaction Mechanism Elucidation

A study led by Philippe Schwaller at EPFL developed Synthegy, a framework that uses large language models (LLMs) as reasoning engines for chemistry. Instead of generating chemical structures directly, the models evaluate and guide traditional computational tools. The framework combines established search algorithms with AI capable of interpreting chemical strategies expressed in natural language.

How Synthegy Works

Synthegy begins with a target molecule and a plain-language instruction from the user, for example forming a specific ring early or avoiding unnecessary protecting groups. Traditional retrosynthesis software then generates numerous potential synthetic routes. Each route is translated into text and analyzed by an LLM.
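As a rough illustration of the "route to text" step, the sketch below renders a candidate route as numbered lines and wraps it in a prompt together with the user's instruction. The `Step` structure, the function names, the prompt wording, and the use of SMILES strings are all assumptions made for this example; the source does not describe Synthegy's actual text format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One reaction in a route (hypothetical structure for illustration)."""
    reactants: List[str]  # reactant SMILES strings
    product: str          # product SMILES string


def route_to_text(target: str, steps: List[Step]) -> str:
    """Render a route as numbered plain-text lines."""
    lines = [f"Target: {target}"]
    for i, step in enumerate(steps, start=1):
        lines.append(f"Step {i}: {' + '.join(step.reactants)} -> {step.product}")
    return "\n".join(lines)


def build_prompt(instruction: str, route_text: str) -> str:
    """Combine the user's instruction and the route text (assumed wording)."""
    return (
        "Instruction from the chemist:\n"
        f"{instruction}\n\n"
        "Candidate route:\n"
        f"{route_text}\n\n"
        "Rate how well this route follows the instruction and explain why."
    )


# Toy one-step route: salicylic acid + acetic anhydride -> aspirin.
aspirin = "CC(=O)Oc1ccccc1C(=O)O"
route = [Step(reactants=["O=C(O)c1ccccc1O", "CC(=O)OC(C)=O"], product=aspirin)]
prompt = build_prompt("Avoid unnecessary protecting groups.",
                      route_to_text(aspirin, route))
print(prompt)
```

The resulting string is what would be handed to a language model; the call to the model itself is left out because the source does not say which model or API is used.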

The LLM evaluates how well each pathway aligns with the user's goals, assigns a score, and explains its reasoning, which lets researchers rank and filter candidate routes. For reaction mechanisms, Synthegy breaks a reaction into elementary electron movements and explores multiple candidate sequences. The LLM assesses each step, steering the search toward chemically plausible mechanisms.
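The idea of an evaluator steering a search over elementary steps can be sketched as a small beam search. This is a generic technique chosen for illustration, not the search algorithm the study reports. The step labels are placeholders, and `toy_score` is a hand-written stand-in for the LLM's plausibility judgment; every name here is hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

State = Tuple[str, ...]  # sequence of elementary-step labels taken so far

LABELS = ("attack", "transfer", "leave")  # placeholder step labels


def guided_search(
    start: State,
    expand: Callable[[State], Iterable[State]],
    score: Callable[[State], float],
    is_goal: Callable[[State], bool],
    beam_width: int = 2,
    max_depth: int = 5,
) -> List[State]:
    """Beam search: keep only the best-scoring partial sequences per depth."""
    beam, found = [start], []
    for _ in range(max_depth):
        candidates = [nxt for state in beam for nxt in expand(state)]
        if not candidates:
            break
        # Highest score first; ties keep their generation order.
        candidates.sort(key=score, reverse=True)
        beam = []
        for cand in candidates[:beam_width]:
            (found if is_goal(cand) else beam).append(cand)
        if not beam:
            break
    return found


def expand(state: State) -> Iterable[State]:
    """Propose every one-step extension of a partial sequence."""
    return [state + (label,) for label in LABELS]


def toy_score(state: State) -> float:
    """Stand-in for an LLM score: penalize repeats, favor starting with 'attack'."""
    repeats = len(state) - len(set(state))
    return -repeats + (1.0 if state and state[0] == "attack" else 0.0)


def is_goal(state: State) -> bool:
    """Toy goal: each placeholder step used exactly once."""
    return len(state) == 3 and set(state) == set(LABELS)


result = guided_search((), expand, toy_score, is_goal)
print(result)
```

In a real system the `score` callable would wrap a model call that reads a textual description of the partial mechanism. The beam width controls how many alternatives survive each round, trading coverage for the number of model calls.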

Validation and Performance

In a double-blind expert study, 36 chemists provided 368 valid evaluations. Their judgments aligned with Synthegy's assessments 71.2% of the time on average. The framework can detect unnecessary protecting steps, assess reaction feasibility, and prioritize efficient pathways.
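For readers unfamiliar with agreement metrics, the sketch below shows one simple way such a percentage could be computed: the fraction of evaluations in which the expert's choice matches the framework's. The data are made up for the example, and the source does not specify how the study averaged its results, so this is an assumption about the general shape of the calculation, not a reproduction of it.

```python
from typing import List, Tuple


def agreement_rate(pairs: List[Tuple[str, str]]) -> float:
    """Fraction of (expert_choice, model_choice) pairs that match."""
    if not pairs:
        raise ValueError("no evaluations provided")
    matches = sum(1 for expert, model in pairs if expert == model)
    return matches / len(pairs)


# Toy data: each tuple is (route preferred by expert, route preferred by model).
toy_evaluations = [("A", "A"), ("B", "B"), ("A", "B"), ("B", "B"), ("A", "A")]
rate = agreement_rate(toy_evaluations)
print(f"{rate:.1%}")  # prints 80.0%
```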

Significance for Chemistry

Synthegy shows that LLMs can analyze chemistry across multiple levels: interpreting functional groups, assessing individual reactions, and evaluating complete synthetic pathways. Larger and more advanced models demonstrate the strongest performance; smaller models show limited capability.

The work redefines how AI can support chemistry by positioning LLMs as evaluators rather than generators, allowing chemists to express goals in plain language and receive strategically relevant solutions. The technology could accelerate drug discovery, improve reaction design, and make advanced computational tools more accessible.

Publication Details

The study was published in Matter on April 24, 2026 (DOI: 10.1016/j.matt.2026.102812). Other contributors include the National Centre of Competence in Research Catalysis (NCCR Catalysis) and b12 Labs.