PatSnap platform upgrade: semantic search

Building the perfect patent search query is like learning a foreign language. The new vocabulary is made up of codes and classifications, acronyms and abbreviations.

Instead of new verb conjugations, you must master the art of syntax. A misplaced comma, colon, or bracket can change the results of your search.

Why does building a patent search query have to be so hard?

The answer is that it doesn’t have to be. We’re pleased to tell you we have just launched a major update to our Semantic Search functionality which makes it easier than ever before to start finding relevant patents.

What makes Semantic patent search so powerful?

When trying to understand whether an idea or invention is truly novel, keyword-based patent searching means that the researcher must rely on his or her experience to identify the most appropriate search terms. The database or databases being searched will usually only return documents which match the selected keywords or keyword combinations. Even a simple concept such as a sweatshirt can be described in different ways: “jersey”, “pullover” or even “sweater” might describe what is essentially the same thing. Two different researchers are therefore likely to choose different search terms to describe the same concept or invention, meaning that their searches would return different results.

Often the most useful search criterion is not a collection of carefully chosen keywords, but a complete patent document or invention disclosure, which gives a full and detailed description of an idea and its novelty using natural language. When the complete text is analysed using the appropriate semantic analysis algorithms, the key concepts can emerge and be used to find other patent documents which are most similar to the original document.

How does the new Semantic Search functionality work?

Instead of building a query in the usual way, the researcher can describe their concept or invention with a few paragraphs—or simply copy and paste a patent number or invention disclosure into the search box.

When deciding which patents to return to the user, the algorithm takes into account the entire body of text, ignoring transitional phrases and indefinite articles, and reducing words which include prefixes or suffixes to their stems. It then builds relationships between the words to capture the essential meaning of the text and the links between different terms. Finally, it compares the text with the hundreds of millions of documents in our database, so that only the most similar results are returned.
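As a rough illustration of the clean-up step described above, here is a minimal Python sketch. It is not PatSnap's implementation: the stop-word list, the suffix list and the function names are all assumptions made for this example, and a production system would use far richer rules.

```python
import re

# Hypothetical stop-word list (articles, connectives, transitional words).
# A real system would use a much larger list.
STOP_WORDS = {"a", "an", "the", "of", "and", "or", "in", "to", "for",
              "is", "with", "however", "therefore"}

# Hypothetical suffix list, checked in this order. Established stemmers
# apply many more rules than this.
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalise(text):
    """Lower-case the text, drop stop words and stem what remains."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stem(token) for token in tokens if token not in STOP_WORDS]


print(normalise("Heating the fabrics"))
print(normalise("A heated fabric"))
```

Both phrases reduce to the same two stems, which is the point of the step: differently worded descriptions of one idea start to look alike before any comparison takes place.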

So when a user requests a set of similar documents through semantic search, the algorithm quickly calculates a similarity score for every document in the dataset and then returns the most relevant ones to the user: up to the top 1,000, and only those that meet the minimum similarity score.
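The score-then-filter idea can be sketched in a few lines of Python. This is an illustration only: it swaps in plain word-count cosine similarity as a stand-in for the real semantic scoring, and the function names, the sample documents and the `min_score` default are assumptions. Only the cap of 1,000 results comes from the description above.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, documents, top_n=1000, min_score=0.1):
    """Score every document, drop weak matches, return the best top_n."""
    query_vector = Counter(query.lower().split())
    scored = [
        (doc_id, cosine_similarity(query_vector, Counter(text.lower().split())))
        for doc_id, text in documents.items()
    ]
    # min_score is a hypothetical threshold chosen for this example.
    kept = [(doc_id, score) for doc_id, score in scored if score >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_n]


# Made-up sample documents, for illustration only.
documents = {
    "A": "knitted wool pullover",
    "B": "steel bridge cable",
    "C": "wool pullover with hood",
}
results = most_similar("knitted wool pullover", documents)
print([doc_id for doc_id, _ in results])
```

In this toy dataset document B shares no words with the query, so it falls below the threshold and is left out, while A ranks ahead of C.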

The goal is to find the most similar documents much in the same way that a reasonable human being would find them if they were to read every single document in the dataset and rank them according to how similar each appears, overall. 

To try it out, simply log in and choose the “Semantic” option—then start searching in plain English.

Our dedicated R&D team is working hard to make it easy to find answers to your innovation questions. If you’d like to learn more about our new Semantic Search feature, please contact your account manager, or request a demo. 
