Updated on June 4, 2026 · Published by PatSnap Open Intelligence Team
You’ve sketched a scaffold in ChemDraw and now need to know if synthesizing analogs will step on active patents. The typical path—toggling between PubChem for structure confirmation, SureChEMBL for prior art, then asking legal to check status—burns an afternoon and still leaves gaps. A Model Context Protocol (MCP) connector collapses that into one query inside your AI.
You’re refining a lead compound and need to understand the patent landscape before committing synthesis time. The molecule looks clean in PubChem, but you need to know: are similar structures already claimed? Which jurisdictions matter? What’s actually still active versus expired?
Your current workflow spreads across disconnected tools. PubChem confirms the structure exists but doesn’t link to patents. SureChEMBL surfaces patent mentions but doesn’t integrate legal status or similarity thresholds. By the time you’ve copied SMILES strings between three browser tabs, cross-referenced publication numbers, and emailed a screenshot to IP counsel, you’ve lost half a day, and you still don’t know if that 0.75 Tanimoto match in an EP application is something your team should care about.
This guide uses PatSnap Chemical Molecular MCP, built by PatSnap, which indexes patents across 174 jurisdictions and 282M+ compounds. The MCP lets you run structure searches, extract all compounds from a patent, and check similarity scores in plain language, live, inside Claude or any compatible AI. Results appear in your conversation grounded in patent numbers, legal status, and assignee names, with no tab-switching.
This guide covers:
- How to run exact, substructure, and similarity searches using SMILES or InChIKey in a single AI query
- How to extract and compare all structures from a patent to map competitor chemical space
- How to interpret Tanimoto scores and legal status to prioritize which analogs deserve deeper review
- Why the connector routes your question to the right search type without forcing you to learn query syntax
How to Run Structure Searches Directly in Your AI
Paste a SMILES string into your AI and ask for similar patents. The MCP interprets your query, runs the search against the patent database, and returns matches with Tanimoto similarity scores, patent numbers, assignees, and legal status. You specify the similarity threshold (typically 0.7+ for close analogs, 0.4+ for scaffold exploration), and the system ranks results by structural overlap.
Exact match finds the identical molecule—useful for checking if a competitor disclosed your exact structure.
Substructure search finds all molecules containing your fragment—critical when you’re working with a privileged scaffold and need to see every published variant.
Similarity search uses Tanimoto coefficients to rank analogs. This is how you discover that a competitor’s 2019 application covers a methyl-shifted version at 0.82 similarity.
When you ask about a known drug, the output shows how crowded that chemical space is. A proton pump inhibitor search illustrates this: omeprazole returns over 40,000 similar structures at modest thresholds, but the top-ranked exact matches reveal something more useful for freedom-to-operate work.
Example output (omeprazole similarity search at 0.4+ Tanimoto): Esomeprazole 1.00 similarity in CA2535983A1 (AstraZeneca, INACTIVE); omeprazole benzimidazolide anion 1.00 similarity in US7576219B2 (Zydus Lifesciences, ACTIVE); lansoprazole 0.72 similarity in EP0174726B1 (Takeda, INACTIVE); pantoprazole and rabeprazole ~0.68 similarity, all expired 1989–1999.
This tells you the entire first-generation PPI landscape is open. The only active patent in the top matches is a process claim, not composition of matter. If you’re designing in this space, you’re navigating expired art. Note, though, that vonoprazan, a potassium-competitive acid blocker, uses a different scaffold entirely and wouldn’t appear in this similarity set.
For your own compounds, look at the top 10–20 matches. If active patents cluster above 0.75 similarity with your target scaffold, flag those for IP review. If the nearest active claims sit at 0.5 similarity or lower, you’re likely in open space for composition-of-matter purposes, though you’d still verify that synthesis routes and formulations don’t infringe process or method claims.
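To make the ranking step concrete, here is a minimal sketch of a threshold-filtered similarity ranking. It assumes each molecule is already reduced to a fingerprint stored as a set of "on" bit indices; the fingerprints and the names `analog_A` to `analog_C` are hypothetical toy data, and nothing here reflects how the connector computes its scores internally. The Tanimoto coefficient is the number of shared bits divided by the number of bits set in either fingerprint.

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared bits / bits set in either fingerprint."""
    if not a and not b:
        return 0.0  # convention chosen for this sketch when both are empty
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(
    query: Fingerprint, library: Dict[str, Fingerprint], threshold: float
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy fingerprints, not real molecules.
query = frozenset({1, 2, 3, 4, 5, 6, 7, 8})
library = {
    "analog_A": frozenset({1, 2, 3, 4, 5, 6, 7, 8}),         # same bits as query
    "analog_B": frozenset({1, 2, 3, 4, 5, 6, 9, 10}),        # 6 shared of 10 total
    "analog_C": frozenset({1, 2, 11, 12, 13, 14, 15, 16}),   # 2 shared of 14 total
}

ranked = rank_by_similarity(query, library, threshold=0.4)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

Raising `threshold` toward 0.7 narrows the list to close analogs; lowering it toward 0.4 widens it for scaffold exploration, mirroring the thresholds discussed in this section.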
How to Extract All Chemical Structures from a Competitor Patent
When you find a relevant patent, you need to see every disclosed structure, not just the ones highlighted in the abstract. A Markush claim might cover hundreds of theoretical analogs, but the examples show which compounds the inventors actually made and tested. The MCP pulls all chemical records from a patent by document number: claimed structures, example compounds, intermediates, and comparators.
Ask your AI: “Extract all structures from US9856232” (or any patent number). The tool returns SMILES, InChIKeys, and notes which appear in claims versus examples. This maps the competitor’s actual chemical space in one query instead of manually parsing tables and image-based structures from a PDF.
Example output (US9856232B1, King Saud University dihydropyrimidinone derivatives): 136 total chemical records, 23 novel compounds disclosed, 18 claims with 2 independent. Claim 1 is a Markush genus; Claim 14 covers synthesis method. Representative novel compound: 4-phenyl-5-[4-(piperidin-1-yl)benzoyl]-3,4-dihydropyrimidin-2-one appears in Claims 1, 2, 3.
The count tells you scope: 136 structures means this patent explores broad SAR, not just a single lead. The 23 “novel” compounds are the ones the inventors synthesized and characterized, so these are your comparators. The split between genus claims and specific examples shows how much room the patent leaves open. If your analog sits inside the Markush definition of Claim 1 but wasn’t explicitly disclosed in the examples, you’re in a gray zone that needs legal interpretation.
For your workflow, extract structures from the 3–5 most relevant competitor patents, then run similarity searches on each disclosed compound against your internal library. This shows overlap before you invest in a design cycle. If a competitor’s Example 7 sits at 0.88 similarity to your lead and their patent is active in your target jurisdiction, you pivot or file continuation applications early.
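The overlap check described above can be sketched as a pairwise comparison between your internal leads and the compounds extracted from a competitor patent. This is an illustration only: the fingerprints are hypothetical sets of bit indices, the names (`lead_1`, `example_7`, and so on) are invented, and the `flag_at=0.75` cutoff is an assumption borrowed from the rule of thumb earlier in this guide.

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Shared bits divided by bits set in either fingerprint."""
    shared = len(a & b)
    total = len(a) + len(b) - shared
    return shared / total if total else 0.0


def overlap_report(
    internal: Dict[str, Fingerprint],
    competitor: Dict[str, Fingerprint],
    flag_at: float = 0.75,  # assumed cutoff; tune to your own risk tolerance
) -> List[Tuple[str, str, float]]:
    """For each internal lead, report its closest competitor compound if flagged."""
    report = []
    for lead_name, lead_fp in internal.items():
        # Find the single most similar compound disclosed in the patent.
        closest_name, closest_score = max(
            ((name, tanimoto(lead_fp, fp)) for name, fp in competitor.items()),
            key=lambda pair: pair[1],
        )
        if closest_score >= flag_at:
            report.append((lead_name, closest_name, closest_score))
    return sorted(report, key=lambda row: row[2], reverse=True)


# Hypothetical toy data standing in for extracted patent compounds.
competitor = {
    "example_7": frozenset(range(1, 9)),     # bits 1..8
    "example_12": frozenset(range(20, 28)),  # bits 20..27
}
internal = {
    "lead_1": frozenset({1, 2, 3, 4, 5, 6, 7, 9}),  # shares 7 bits with example_7
    "lead_2": frozenset(range(30, 38)),             # shares nothing
}

report = overlap_report(internal, competitor)
for lead, match, score in report:
    print(f"{lead} vs {match}: {score:.2f}")
```

Only the closest match per lead is reported, which keeps the output short enough to scan in a design review; a fuller version might list every pair above the cutoff.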
How to Interpret Similarity Scores and Legal Status Together
A Tanimoto score quantifies structural overlap: 1.0 is identical, 0.7+ indicates close analogs (typically same scaffold with substitution changes), and 0.4–0.6 suggests related but distinct chemotypes. For patent purposes, anything above 0.7 deserves scrutiny, but the score alone doesn’t determine infringement: you need legal status, claim language, and jurisdiction.
The MCP returns both: each hit shows similarity, patent number, assignee, and status (ACTIVE, INACTIVE, PENDING).
ACTIVE means the patent is in force and could block commercialization in that jurisdiction.
INACTIVE means expired, lapsed, or abandoned—no longer enforceable.
PENDING means the application hasn’t issued; the claims might still narrow during prosecution.
When omeprazole’s search returns esomeprazole at 1.00 similarity but INACTIVE status, that’s useful: the structures match at the fingerprint level (esomeprazole is the S-enantiomer of omeprazole), but the patent can’t block you. When a 0.72-similar lansoprazole patent also shows INACTIVE with a 1989 priority date, you know the scaffold has been in the public domain for decades. The single ACTIVE result, a process claim at 1.00 similarity, matters only if you’re using that specific synthesis route. For composition-of-matter freedom to operate, the landscape looks clear.
For your own searches, flag any ACTIVE patents above 0.7 similarity for IP counsel review. Document INACTIVE high-similarity hits as prior art: they strengthen your freedom-to-operate position by showing the structure was disclosed and abandoned. Ignore low-similarity matches (<0.5) unless you’re mapping a broad therapeutic class; these rarely indicate infringement risk for small molecules.
Cross-reference jurisdictions: a patent active in the US but expired in the EU changes your commercialization strategy. The connector shows legal status per document across 174 jurisdictions, so you see regional gaps in competitor coverage. If your target market is outside the patent’s geographic scope, high structural similarity becomes less urgent.
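The triage rules in this section can be written down as a small decision function. The sketch below is one possible encoding, not a feature of the connector: the `Hit` record layout is an assumption, the three rows reuse the omeprazole example output shown earlier in this guide, and the labels for PENDING hits and for the 0.5 to 0.7 band are placeholders because this guide does not prescribe a handling for them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Hit:
    """One search result; the field layout is an assumption for illustration."""
    compound: str
    similarity: float
    patent: str
    assignee: str
    status: str  # "ACTIVE", "INACTIVE", or "PENDING"


def triage(hit: Hit, review_at: float = 0.7, ignore_below: float = 0.5) -> str:
    """Map a hit to a next action using the similarity and status rules above."""
    if hit.similarity < ignore_below:
        return "low priority"
    if hit.similarity >= review_at:
        if hit.status == "ACTIVE":
            return "flag for IP counsel"
        if hit.status == "INACTIVE":
            return "document as prior art"
        return "watch prosecution"  # PENDING: placeholder label, not from the guide
    return "judgment call"  # 0.5 to 0.7 band: placeholder label, not from the guide


# Rows taken from the omeprazole example output in this guide.
hits: List[Hit] = [
    Hit("esomeprazole", 1.00, "CA2535983A1", "AstraZeneca", "INACTIVE"),
    Hit("omeprazole benzimidazolide anion", 1.00, "US7576219B2",
        "Zydus Lifesciences", "ACTIVE"),
    Hit("lansoprazole", 0.72, "EP0174726B1", "Takeda", "INACTIVE"),
]

actions = [(h.patent, triage(h)) for h in hits]
for patent, action in actions:
    print(f"{patent}: {action}")
```

A flag from this function only says "send to counsel"; as noted above, whether an ACTIVE hit actually matters still depends on claim type (here, a process claim) and jurisdiction.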
Why This MCP for Structure-Based Patent Search
You need to answer “is this molecule clear to synthesize?” without leaving your AI workspace, and you need the search to interpret your intent—exact match versus scaffold exploration—without forcing you to learn query syntax. The Chemical Molecular MCP routes your natural language question to the right search type (exact, substructure, or similarity), runs it against the patent database, and returns results with legal status in the same conversation.
You ask in plain language; the tool decides search type. “Is this SMILES patented?” triggers exact match. “Show me patents with similar structures” triggers similarity search with a default threshold, which you can adjust (“…at 0.6 or higher”). “Find all patents containing this benzimidazole core” triggers substructure search. You don’t need to know which search mode to invoke—describe what you want to know, and the connector handles routing.
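As a rough mental model of this routing, the toy function below maps the three example questions to a search type by keyword. It is purely illustrative: the keyword lists and the function name `guess_search_type` are invented here, and the actual connector presumably relies on the AI model's interpretation of the request rather than on keyword matching.

```python
import re


def guess_search_type(question: str) -> str:
    """Toy keyword router; NOT how the connector is implemented."""
    # Split into whole words so that, for example, "score" never matches "core".
    words = set(re.findall(r"[a-z]+", question.lower()))
    if words & {"containing", "core", "substructure", "fragment"}:
        return "substructure"
    if words & {"similar", "similarity", "analog", "analogs"}:
        return "similarity"
    return "exact"  # default when no hint of scaffold or similarity intent


examples = [
    "Is this SMILES patented?",
    "Show me patents with similar structures",
    "Find all patents containing this benzimidazole core",
]
routes = [guess_search_type(q) for q in examples]
for question, route in zip(examples, routes):
    print(f"{route:>12}  <-  {question}")
```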
Legal status appears with every result. You see ACTIVE/INACTIVE/PENDING next to each patent number, so you immediately know which hits require legal review versus which are prior art only. This removes the second lookup step and keeps the entire analysis in one thread.
Bulk structure extraction from patents. When you find a relevant competitor filing, you extract all disclosed compounds in one query instead of manually copying SMILES from tables or image-based structures. This maps their chemical space in minutes and feeds directly into your next similarity search to check overlap with your own library.
Tanimoto scoring integrated into results. You don’t calculate similarity separately: every match includes the score, ranked from most to least similar. This prioritizes review: you look at 0.9+ matches first, then decide how far down the similarity ladder you need to go based on your risk tolerance and patent strategy.
For medicinal chemists who live in ChemDraw and need IP context during design reviews, this collapses the “is this patented?” question into a 30-second query instead of a cross-tool research session. You ask, the database answers, you move to the next analog decision with patent numbers and status documented in the same conversation your team can reference later.
Try Structure-Based Patent Search Yourself
You can explore the database in a browser or connect it to your AI—both access the same patent and compound data.
Path A — Try in browser, no setup
PatSnap Eureka runs the same structure searches in a browser. Paste a SMILES string, choose similarity threshold, and see patent results with legal status. No install, no API key.
Best if you’re still evaluating or need to share results with team members who don’t use AI assistants.
For deeper chemical search capabilities across patents and literature, explore PatSnap’s chemical intelligence platform.
Path B — Add the MCP to your AI
- Get your API key at open.patsnap.com (10,000 free credits, no credit card).
- Find PatSnap Chemical Molecular MCP in the marketplace and copy the connection URL.
- Ask your AI: “Help me add this MCP to my config: [paste URL with API key]”—your AI handles the file path, JSON, and restart.
- Run a query to see it work: “Search patents for structures similar to omeprazole at 0.7 Tanimoto”. If you see results with patent numbers and similarity scores, you’re set.
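If you are curious what the "JSON" in step three might look like, the sketch below builds one plausible config entry. Everything in it is an assumption: the layout follows the `mcpServers` convention used by Claude Desktop's `claude_desktop_config.json`, the `mcp-remote` bridge is one common way to attach a URL-based server, and the server name and URL are placeholders. The real connection URL comes from the marketplace, and other AI clients use different config formats.

```python
import json
from typing import Any, Dict


def build_mcp_entry(server_name: str, connection_url: str) -> Dict[str, Any]:
    """Build a Claude Desktop style config entry (assumed layout, see lead-in)."""
    return {
        "mcpServers": {
            server_name: {
                # Assumption: bridge a remote MCP URL through the mcp-remote package.
                "command": "npx",
                "args": ["-y", "mcp-remote", connection_url],
            }
        }
    }


# Placeholder values only; substitute the URL copied from the marketplace.
PLACEHOLDER_URL = "https://example.invalid/mcp?apikey=YOUR_API_KEY"
entry = build_mcp_entry("patsnap-chemical", PLACEHOLDER_URL)
print(json.dumps(entry, indent=2))
```

Treat the API key like a password: anyone with the full URL can spend your credits, so keep the config file out of shared repositories.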
Conclusion
Structure-based patent searching removes the friction of switching between chemical databases, patent platforms, and legal status lookups. When you can paste a SMILES string and immediately see active patents, expired prior art, and Tanimoto-ranked analogs in one response, the question “is this scaffold clear?” becomes answerable in the moment—during a design review, not days later after a literature deep-dive. The difference is workflow velocity: you iterate on molecular design with patent context already integrated, so freedom-to-operate concerns surface early when pivoting is still inexpensive.
Note: Information based on publicly available sources as of 2026. Product features may change. Contact PatSnap for current capabilities.
FAQ
What is SMILES patent search?
SMILES patent search uses Simplified Molecular Input Line Entry System notation, a text string representing molecular structure, to find patents disclosing the same or similar compounds. You input a SMILES string (e.g., CC(=O)OC1=CC=CC=C1C(=O)O for aspirin), and the search returns patents containing that exact molecule, substructures that include it, or similar analogs ranked by Tanimoto coefficient. This lets you check patent coverage for a compound without needing the patent number or chemical name. Learn more about molecular representations in the ChEMBL database documentation.
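To show that a SMILES string really is just text encoding atoms and bonds, the sketch below counts heavy (non-hydrogen) atoms in the aspirin string from this answer. It is a deliberately minimal tokenizer covering only the unbracketed organic-subset atoms; bracket atoms such as `[Na+]` are rejected, and a real workflow would use a cheminformatics toolkit instead. The function name `heavy_atom_counts` is made up for this example.

```python
import re
from collections import Counter

# Two-letter halogens first, then single-letter aliphatic and aromatic atoms.
_ATOM = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def heavy_atom_counts(smiles: str) -> Counter:
    """Count non-hydrogen atoms in a simple SMILES string (no bracket atoms)."""
    if "[" in smiles:
        raise ValueError("bracket atoms are not handled by this sketch")
    # Aromatic atoms are lowercase in SMILES; fold them into the element symbol.
    return Counter(token.capitalize() for token in _ATOM.findall(smiles))


aspirin = "CC(=O)OC1=CC=CC=C1C(=O)O"
counts = heavy_atom_counts(aspirin)
print(dict(counts))
```

For aspirin this gives nine carbons and four oxygens, consistent with its molecular formula C9H8O4 (hydrogens are implicit in SMILES and are not counted here).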
How do I search patents by chemical structure in Claude or other AI assistants?
Connect the MCP to your AI environment using the setup URL and your API key. Once linked, ask in plain language: “Search patents for this SMILES” or “Find similar structures to [paste SMILES] at 0.7 similarity”. The MCP interprets your request, queries the database, and returns patent numbers, assignees, legal status, and Tanimoto scores directly in the conversation. Your AI handles the connection, so you don’t write code or navigate separate interfaces.
Can molecular similarity search replace a freedom-to-operate analysis?
Similarity search identifies structurally related patents and ranks them by overlap, which surfaces potential prior art and active claims quickly. It doesn’t replace a full freedom-to-operate (FTO) analysis because FTO requires claim construction, infringement assessment, and jurisdictional strategy, which is work for IP counsel. Use similarity search to triage: flag high-similarity active patents for legal review, document expired hits as prior art, and rule out low-similarity results early. This focuses legal spend on the patents that actually matter. For comprehensive IP guidance, consult WIPO Patent Information Services.
How do I start with structure-based patent search if I’ve never used an MCP?
Sign up for a free API key at open.patsnap.com (10,000 credits, no credit card). If you use Claude, Cursor, or another MCP-compatible AI, find the Chemical Molecular MCP in the marketplace and paste the connection URL into your AI with the key; your assistant will configure it. To explore searches first without setup, try PatSnap Eureka in a browser: same database, no installation required. For more patent search strategies, visit the PatSnap resources blog.