Similarity Search

Introduction 

Seed Tag

Irrelevant Tag(s)

Using Similarity Search

Apply a Similarity Search in Custom Analysis

Analyze and Evaluate Results of Similarity Search

 

Introduction

PatentSight’s Similarity Search Workflow 

 

Based on a homogeneous set of patents (seed patents or seed tag) defined by the user, the Similarity Search finds patents that belong to the same technology field as the seed patents.

The Similarity Search is a useful tool for patent researchers as well as for users who have at least some expertise in a technology field. It supports the following tasks:

  • Definition of a technology field (Prior art search, FTO search, patent monitoring etc.)
  • Verification of a defined technology field (Cross-check, sensitivity analysis etc.)

Please be aware that the quality of the results strongly depends on the homogeneity of the seed patents, the accuracy of the classification systems and the citation coverage in the respective technology fields. The quality of the results may therefore differ from one technology field to another.

The Similarity Search is not designed to search for several technology fields at once. For example, it is not designed to find enterprises active in the same set of technology fields. We therefore do not recommend using the whole portfolio of a company as a seed tag; in most cases this will not lead to satisfactory results.

To define several technology fields, the Similarity Search needs to be run several times: individually for each technology field.

Important: Always review the results you obtain from a Similarity Search!

 

The Similarity Search supports you in defining a technology field by searching for similar patents. The search builds on technology classes (IPC, CPC and F-Terms) and patent citations.

Depending on the use case, the defined technology field(s) can be used as a final result set or as a starting point for further analyses using the Custom Analysis.

 

To find similar patents, the Similarity Search first calculates a theoretical “ideal” patent based on the seed tag. This “ideal” patent serves as the center, and other patents surround it at various distances. The distance from this center is expressed by the Similarity Score, with 1 being most similar (close to the center) and 0 not being similar at all.

If the seed tag is not homogeneous and includes, e.g., patents belonging to technology A and technology B, the Similarity Search will place this central “ideal” patent somewhere between the two technologies. As a result, the search may miss both technology A and technology B and instead find patents belonging to a field it interprets as lying in between.
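The effect of a mixed seed tag can be illustrated with a small sketch. The source does not describe how PatentSight actually computes the “ideal” patent or the Similarity Score, so everything below is an assumption made for illustration only: patents are represented as hypothetical weight vectors over three technology classes, the “ideal” patent is taken to be their component-wise mean, and cosine similarity stands in for the score (for non-negative vectors it also ranges from 0 to 1).

```python
import math

def centroid(vectors):
    """Component-wise mean of equally long feature vectors (assumed 'ideal' patent)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine_similarity(a, b):
    """Cosine similarity; stays within [0, 1] when all components are non-negative."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical feature layout: [weight in class A, weight in class B, weight in class C]
homogeneous_seed = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [1.0, 0.0, 0.0]]
mixed_seed = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # one patent from A, one from B

pure_a_patent = [1.0, 0.0, 0.0]    # clearly belongs to technology A
in_between_patent = [0.5, 0.5, 0.0]  # sits between A and B

ideal_homogeneous = centroid(homogeneous_seed)
ideal_mixed = centroid(mixed_seed)

score_a_homogeneous = cosine_similarity(ideal_homogeneous, pure_a_patent)
score_a_mixed = cosine_similarity(ideal_mixed, pure_a_patent)
score_between_mixed = cosine_similarity(ideal_mixed, in_between_patent)

print(f"pure A vs. homogeneous seed: {score_a_homogeneous:.3f}")
print(f"pure A vs. mixed seed:       {score_a_mixed:.3f}")
print(f"in-between vs. mixed seed:   {score_between_mixed:.3f}")
```

In this toy setup the pure technology A patent scores close to 1 against the homogeneous seed but noticeably lower against the mixed seed, while the in-between patent becomes the best match for the mixed seed. This mirrors the behaviour described above, without claiming to reproduce the actual scoring.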

 

Seed Tag

You need to create a seed tag of relevant patents, i.e. patents that you know are relevant in the technology field in which you want to search for similar patents. The tag should contain at least 10 patents. The patents should be homogeneous in technology. We do not recommend tagging the entire portfolio of an owner, as this may lead to a tag containing patents from various technologies and therefore to poor Similarity Search results.

Irrelevant Tag(s)

Optionally, you can create one or more tags of irrelevant patents, i.e. patents that you know are irrelevant in the technology field in which you want to perform the Similarity Search. These may be, e.g., patents that belong to the technology field of your Similarity Search (e.g., “wind turbine blades”) but cover a specific detail you are not interested in (e.g., “wind turbine blade transportation equipment”). Each tag should contain at least 10 patents. If you want to tag several irrelevant technology fields, we recommend creating a separate tag for each technology field.

 


Using Similarity Search

 

The Similarity Search Assistant, which guides you step by step through the tool, can be activated and deactivated.

 

Similarity Search Start Menu

 

First Stage: Choose seed patents

 

Second Stage (optional): Choose irrelevant patents

 

 

Third Stage (optional): Review sample patents

 

 

Fourth Stage: Determine technology field scope and finalize Similarity Search

 

 


Apply a Similarity Search in Custom Analysis

After a successful Similarity Search, your search area will contain the tag field(s) generated by the Similarity Search.

 

By default, the filter includes inactive patents and other IP rights.


 

Analysis Example 1: Seed Tag Review

 

Analysis Example 2: Macro Level

 

Analysis Example 3: Micro Level

 


Analyze and Evaluate Results of Similarity Search

Determining the size of the technology field is the fourth stage of the Similarity Search and is presented to you in this overview.

 

The Recall-Precision-Graph 

 

 

PatentSight suggests a division into three fields:

 

“Narrow” has, in this case, a default cutoff value of 0.80 and focuses strongly on the relevant technology field. It includes very few irrelevant patents but also a smaller share of the relevant patents than the other fields.

“Medium” has, in this case, a default cutoff value of 0.70 and includes a large share of relevant patents but also more irrelevant patents.

“Broad” has, in this case, a default cutoff value of 0.40 and includes even more irrelevant patents. It may also include patents from other, though related, technology fields.

However, you can adjust these thresholds to suit your needs.
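As a minimal sketch of how such cutoff values partition the results, the snippet below maps a Similarity Score to the fields it would fall into. It assumes that the fields are cumulative (a patent that qualifies for the narrow field also belongs to the medium and broad fields), which the text suggests but does not state outright. The names `DEFAULT_CUTOFFS` and `fields_for_score` are hypothetical and not part of PatentSight.

```python
# Default cutoff values quoted in the text; adjust them to suit your needs.
DEFAULT_CUTOFFS = {"Narrow": 0.80, "Medium": 0.70, "Broad": 0.40}

def fields_for_score(score, cutoffs=DEFAULT_CUTOFFS):
    """Return every field whose cutoff the score reaches (assumed cumulative fields)."""
    return [name for name, cutoff in cutoffs.items() if score >= cutoff]

print(fields_for_score(0.85))  # reaches all three cutoffs
print(fields_for_score(0.75))  # misses the narrow cutoff only
print(fields_for_score(0.30))  # below every cutoff

# A stricter, user-defined set of thresholds
strict_cutoffs = {"Narrow": 0.90, "Medium": 0.80, "Broad": 0.60}
print(fields_for_score(0.85, strict_cutoffs))
```

Raising the cutoffs, as in the last call, shrinks each field: a score of 0.85 no longer qualifies for the narrow field under the stricter thresholds.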

 

 

 

By default, PatentSight suggests a division into three fields: Narrow (0.8), Medium (0.7) and Broad (0.4).

In general, the choice of cutoff value depends entirely on the technology field being analyzed. As a rough guideline, precision and recall should both be above 80% for the narrow field. However, this may change for a different technology field or set of seed patents.
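To make the trade-off concrete, the following sketch computes precision and recall at a given cutoff using the standard definitions: precision is the share of selected patents that are relevant, and recall is the share of all relevant patents that are selected. How PatentSight derives the values shown in its graph is not described here, so this is an illustration only. The reviewed sample, its scores and its relevance labels are invented.

```python
def precision_recall(scored_patents, cutoff):
    """Precision and recall for patents whose score is at or above the cutoff.

    scored_patents: list of (similarity_score, is_relevant) pairs.
    """
    selected = [relevant for score, relevant in scored_patents if score >= cutoff]
    true_positives = sum(selected)
    total_relevant = sum(relevant for _, relevant in scored_patents)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / total_relevant if total_relevant else 0.0
    return precision, recall

# Hypothetical manually reviewed sample: (Similarity Score, relevant?)
reviewed_sample = [
    (0.95, True), (0.90, True), (0.85, True), (0.82, False),
    (0.75, True), (0.60, False), (0.45, True), (0.30, False),
]

for cutoff in (0.80, 0.70, 0.40):
    precision, recall = precision_recall(reviewed_sample, cutoff)
    print(f"cutoff {cutoff:.2f}: precision {precision:.0%}, recall {recall:.0%}")
```

In this invented sample, lowering the cutoff raises recall, while precision at the lowest cutoff ends up below its value at the highest one. Precision can stay flat or even rise between neighbouring cutoffs. This is the kind of pattern the Recall-Precision-Graph lets you inspect for your own technology field.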