Full Text Search

Last updated: April 2021

 

Introduction

Full Text Search in Visual Search Mode

Full Text Search in Syntax Mode

How to Enter Keywords Into the Full Text Filter Element

Search Syntax of the Full Text Search

Boolean Operators

Proximity Operators

Wildcards

Escape Characters

Subqueries

Other

Stop Words

Punctuation Characters

Upper Case and Lower Case Letters

Full Text Search Examples

Example 1: Search Within an Example Document

Example 2: Combining Several Full Text Filter Elements

 


 

 

Introduction

PatentSight's full text search runs on the English-language versions of all documents in our database, regardless of their original language. Where an original English text is not available, we use a machine-translated version. This allows you to search patent documents from all over the world using English keywords.

Note: The PatentSight full text search is designed for searching for English keywords. Searching for keywords in other languages is currently not supported.

Full Text Search in Visual Search Mode

In Visual Search Mode, you can add the full text search filter element to the search filter by clicking on "Full Text" displayed on top of the search filter.

You can select which segment(s) of the patent documents your keywords should be searched in by ticking one or a combination of the four options "Title", "Abstract", "Claims", and "Description".

 

Full Text Search in Syntax Mode

In Syntax Mode, instead of clicking on the available filter elements, you can type your entire search query directly into the search filter.

To define which segment(s) of the patent documents your keywords should be searched in, you can type in any of the following search queries:

Long Version                            Short Version
Title=( )                               T=( )
Abstract=( )                            A=( )
Claims=( )                              C=( )
Description=( )                         D=( )
TitleAbstracts=( )                      TA=( )
TitleClaims=( )                         TC=( )
TitleDescription=( )                    TD=( )
TitleAbstractClaims=( )                 TAC=( )
TitleAbstractDescription=( )            TAD=( )
TitleClaimsDescription=( )              TCD=( )
TitleAbstractClaimsDescription=( )      TACD=( )
AbstractClaims=( )                      AC=( )
AbstractDescription=( )                 AD=( )
AbstractClaimsDescription=( )           ACD=( )
ClaimsDescription=( )                   CD=( )
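
The short versions in the table above always list the segment letters in the order T, A, C, D. As a rough illustration of that pattern, the following Python sketch assembles a short prefix from a set of selected segments. The function names `field_prefix` and `build_query` are hypothetical helpers for this example and are not part of PatentSight.

```python
# Segment names and their one-letter codes, in the order used by the table above.
SEGMENTS = [("Title", "T"), ("Abstract", "A"), ("Claims", "C"), ("Description", "D")]


def field_prefix(selected):
    """Return the short prefix (e.g. "TC") for the selected segments.

    Hypothetical helper: it only reproduces the naming pattern of the table.
    """
    selected = set(selected)
    known = {name for name, _ in SEGMENTS}
    if not selected or not selected <= known:
        raise ValueError(f"expected a non-empty subset of {sorted(known)}")
    return "".join(code for name, code in SEGMENTS if name in selected)


def build_query(selected, keywords):
    """Wrap the keywords in the Syntax Mode form PREFIX=( ... )."""
    return f"{field_prefix(selected)}=({keywords})"


print(build_query({"Claims", "Title"}, "steel AND alloy"))  # TC=(steel AND alloy)
```

The order in which the segments are passed in does not matter, because the prefix is always assembled in the fixed T, A, C, D order.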

 

How to Enter Keywords Into the Full Text Filter Element

When in Visual Search Mode, the keywords you want to search for should be entered into the white textbox of the Full Text filter element. 

When in Syntax Mode, the keywords you want to search for should be entered between the parentheses that appear after the equal sign of the Full Text filter element.

PatentSight's full text search offers a range of operators which you can use to search for combinations of keywords or for alternative spellings of the same keyword.

In the PatentSight search filter, these operators will always appear in green font. Details and examples regarding each set of operators are outlined below.

 


 

Search Syntax of the Full Text Search

 

Boolean Operators


Syntax: AND
Alternative syntax: & or &&
Description: Logical AND operator (conjunction). Finds documents containing all search terms separated by the operator.
Example: steel AND alloy
Finds all documents that contain both the words "steel" and "alloy".

Syntax: OR
Alternative syntax: | or || or , (comma)
Description: Logical OR operator (disjunction). Finds documents containing any of the search terms separated by the operator.
Example: steel OR alloy
Finds all documents that contain either the word "steel" or the word "alloy".

Syntax: AND NOT, NOT
Alternative syntax: !
Description: Logical NOT operator (negation). Excludes documents containing the negated search term from the result set.
Example: steel AND NOT alloy
Finds all documents that contain the word "steel" but do not contain the word "alloy".
Example: NOT alloy
Finds all documents that do not contain the word "alloy".
Note: NOT without a preceding AND is only used at the beginning of a query.

Important:
When you select more than one segment of the patent documents (e.g., title and claims) and use Boolean operators, the selected segments are treated as one text body.
E.g., the search TC=(steel AND alloy) returns documents that contain:

  • both "steel" and "alloy" in the title
  • both "steel" and "alloy" in the claims
  • only "steel" in the title and only "alloy" in the claims
  • only "steel" in the claims and only "alloy" in the title
  • both "steel" and "alloy" in both title and claims
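
The "one text body" behavior described above can be pictured with a small Python sketch. It is only a model of the rule and not PatentSight's implementation: the tokenization is simplified, and the function names and the sample document are invented for this example.

```python
import re


def words(text):
    """Lower-cased set of words in a text (simplified tokenization, an assumption)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def boolean_and(document, segments, terms):
    """True if every term occurs somewhere in the combined selected segments."""
    combined = set()
    for segment in segments:
        combined |= words(document.get(segment, ""))
    return all(term.lower() in combined for term in terms)


# Invented sample document: "steel" only in the title, "alloy" only in the claims.
doc = {"title": "Corrosion-resistant steel pipe", "claims": "A pipe made of an alloy."}

print(boolean_and(doc, ["title", "claims"], ["steel", "alloy"]))  # True
print(boolean_and(doc, ["title"], ["steel", "alloy"]))            # False
```

The first call corresponds to TC=(steel AND alloy) and matches because the two segments are pooled before the AND is evaluated. The second call corresponds to T=(steel AND alloy) and does not match.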

     

    Proximity Operators

    Proximity operators allow you to define the maximum or exact number of word jumps between the words you are searching for, e.g., "steel" in the vicinity of "alloy".

    If the number of word jumps (n) is not defined in the syntax, it defaults to 5. Depending on the operator used, this means either at most or exactly 4 words (= 5 - 1) between the keywords you are searching for.

    Syntax: NEAR[n]
    Alternative syntax: W[n], ~[n], [n]D
    Description: Unordered proximity operator. Returns all documents that contain the searched terms within up to n word jumps of each other, regardless of order. If n is not specified, the distance is set to 5. E.g., NEAR1 searches for directly adjacent words.
    Example: steel NEAR alloy
    Finds all documents that contain both the words "steel" and "alloy" within up to 5 word jumps of each other, regardless of their order.
    Example: (steel, iron) NEAR[n] (alloy, blend)
    Finds all documents that contain either of the words "steel" or "iron" within up to n word jumps (5 if n is omitted) of either of the words "alloy" or "blend".

    Syntax: SEQ[n]
    Alternative syntax: WF[n], [n]W, PRE[n], WD[n]
    Description: Ordered proximity operator. Returns all documents that contain the searched terms within up to n word jumps of each other, in the specified order. If n is not specified, the distance is set to 5.
    Example: steel SEQ3 alloy
    Finds all documents that contain both the words "steel" and "alloy" within up to 3 word jumps of each other, in the specified order ("steel" has to appear before "alloy").
    Also supported: (word 1, word 2) SEQ[n] (word 3, word 4)

    Syntax: =NEAR[n]
    Alternative syntax: =[n]D
    Description: Unordered proximity operator with exact word distance. Returns all documents that contain the searched terms exactly n word jumps apart, regardless of order. If n is not specified, the distance is set to 5.
    Example: steel =NEAR10 alloy
    Finds all documents that contain both the words "steel" and "alloy" exactly 10 word jumps apart, regardless of their order.
    Also supported: (word 1, word 2) =NEAR[n] (word 3, word 4)

    Syntax: =SEQ[n]
    Alternative syntax: =[n]W
    Description: Ordered proximity operator with exact word distance. Returns all documents that contain the searched terms exactly n word jumps apart, in the specified order. If n is not specified, the distance is set to 5.
    Example: steel =SEQ3 alloy
    Finds all documents that contain both the words "steel" and "alloy" exactly 3 word jumps apart, in the specified order ("steel" has to appear before "alloy").
    Also supported: (word 1, word 2) =SEQ[n] (word 3, word 4)

    Syntax: Space
    Alternative syntax: ADJ
    Description: Space-separated search terms are evaluated as SEQ1 chains.
    Example: steel alloy
    Finds all documents that contain both the words "steel" and "alloy" directly next to each other, in the specified order.

    Important:
    When selecting more than one segment of the patent documents (e.g., title and claims), if using proximity operators, each of these segments will be searched individually.

    E.g., the search TC=(steel =SEQ2 alloy) returns documents that contain:

    • "steel" and "alloy" in this order in exactly 2 word jumps of each other in the title
    • "steel" and "alloy" in this order in exactly 2 word jumps of each other in the claims
    • "steel" and "alloy" in this order in exactly 2 word jumps of each other in both title and claims

    Example hit:
    " [...] golf club head of claim 1 , wherein the metal face insert is composed of a material selected from the group consisting of steel, titanium alloy, and aluminum alloy."

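The four proximity operators and the per-segment evaluation can be summarized in a short Python sketch. It is a simplified model and not the actual search engine: a "word jump" is assumed to be the difference between word positions after a basic tokenization, and the names `proximity` and `search_segments` are made up for this example.

```python
import re

EXAMPLE_CLAIM = (
    "golf club head of claim 1 , wherein the metal face insert is composed of a "
    "material selected from the group consisting of steel, titanium alloy, and "
    "aluminum alloy."
)


def tokenize(text):
    """Lower-cased words without punctuation (simplified, an assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def proximity(text, left, right, n=5, ordered=False, exact=False):
    """Model of NEAR[n] (default), SEQ[n] (ordered), =NEAR[n] and =SEQ[n] (exact)."""
    tokens = tokenize(text)
    left_positions = [i for i, word in enumerate(tokens) if word == left]
    right_positions = [i for i, word in enumerate(tokens) if word == right]
    for i in left_positions:
        for j in right_positions:
            jumps = (j - i) if ordered else abs(j - i)
            if jumps < 1:
                continue  # wrong order for SEQ, or the very same word position
            hit = (jumps == n) if exact else (jumps <= n)
            if hit:
                return True
    return False


def search_segments(document, segments, left, right, **options):
    """Proximity operators are checked in each selected segment individually."""
    return any(
        proximity(document.get(segment, ""), left, right, **options)
        for segment in segments
    )


doc = {"title": "Golf club head", "claims": EXAMPLE_CLAIM}
# TC=(steel =SEQ2 alloy): "steel, titanium alloy" is exactly 2 word jumps.
print(search_segments(doc, ["title", "claims"], "steel", "alloy",
                      n=2, ordered=True, exact=True))   # True

split = {"title": "Steel golf club head", "claims": "An alloy face insert."}
# TC=(steel NEAR alloy): the terms sit in different segments, so there is no hit.
print(search_segments(split, ["title", "claims"], "steel", "alloy"))  # False
```

The second document would be a hit for TC=(steel AND alloy), because Boolean operators pool the segments, but it is not a hit for any proximity operator.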
     

    Wildcards

    Syntax: *
    Description: The *-wildcard replaces any number of characters, from zero to unlimited. Using it at the beginning of a search term is allowed (unlimited left-hand truncation).
    Example: comput*
    Finds all documents containing e.g. computing, computation and computer.
    Example: *oxide
    Finds all documents containing e.g. monoxide, dioxide, and peroxide.

    Syntax: ?
    Description: The ?-wildcard replaces exactly one character. Using it at the beginning of a search term is allowed (left-hand truncation).
    Example: analy?e
    Finds all documents containing analyse or analyze.

    Syntax: %
    Description: The %-wildcard replaces either zero or one character. Using it at the beginning of a search term is allowed (left-hand truncation).
    Example: alumin%um
    Finds all documents containing aluminium or aluminum.

    Syntax: _
    Description: The _-wildcard replaces either a space or no character.
    Example: air_bag
    Finds all documents containing air bag or airbag.
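
One way to get a feel for the four wildcards is to translate them into regular expressions. The Python sketch below does this under two assumptions that are not stated in this article: wildcards only stand for word characters, and matching is case insensitive. The function names are invented for this example.

```python
import re

# Assumed regex equivalents of the four wildcards described above.
WILDCARDS = {
    "*": r"\w*",  # zero or more characters
    "?": r"\w",   # exactly one character
    "%": r"\w?",  # zero or one character
    "_": r" ?",   # a space or nothing
}


def wildcard_to_regex(term):
    """Translate a wildcard search term into a compiled regular expression."""
    pattern = "".join(WILDCARDS.get(char, re.escape(char)) for char in term)
    return re.compile(pattern, re.IGNORECASE)


def matches(term, text):
    """True if the whole text matches the wildcard term."""
    return wildcard_to_regex(term).fullmatch(text) is not None


print(matches("comput*", "computation"))  # True
print(matches("analy?e", "analyze"))      # True
print(matches("alumin%um", "aluminum"))   # True
print(matches("air_bag", "air bag"))      # True
print(matches("analy?e", "analye"))       # False
```

The last call fails because ? requires exactly one character, whereas analy%e would accept the shorter spelling.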

     

    Escape Characters

    Syntax: " "
    Description: Escape character. Enclosing search terms in quotation marks enables searching for terms that would otherwise be interpreted as operators.
    Example: "electromagnetic near field"
    Finds all documents that contain the literal word "near" as part of this phrase, instead of evaluating the query as electromagnetic NEAR field.

    Please note that special characters ( + = & | > < ! ( ) { } [ ] ^ " ~ * ? % : / \ ) are not indexed and thus not searchable.

     

    Subqueries

    Syntax: ( )
    Description: The order of evaluation can be changed with the use of parentheses.

    Order of evaluation:
    1. Subqueries
    2. Proximity operators
    3. NOT
    4. AND
    5. OR

    For operators of the same priority level, evaluation proceeds from left to right (left-associativity).

    Example:
    (copper OR nickel) SEQ1 alloy
    is equivalent to
    copper alloy OR nickel alloy
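
The equivalence above follows from distributing the OR alternatives over the SEQ1 chain. As a small illustration, the Python sketch below expands such a query into its explicit phrases. The helper name `expand_seq1` is hypothetical, and the sketch only covers this one pattern (groups of alternatives chained by SEQ1), not the full query language.

```python
from itertools import product


def expand_seq1(*groups):
    """Expand groups of OR alternatives chained by SEQ1 into explicit phrases."""
    return [" ".join(combination) for combination in product(*groups)]


# (copper OR nickel) SEQ1 alloy
phrases = expand_seq1(["copper", "nickel"], ["alloy"])
print(" OR ".join(phrases))  # copper alloy OR nickel alloy
```

Each resulting phrase is a space-separated term chain, which the search evaluates as SEQ1.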

     

    Other

    Stop Words

    Stop words (e.g., "the" or "and") are treated as regular keywords and not filtered out. 

    • Searching for "steel =SEQ2 alloy" returns "steel and alloy" but not "steel alloy"
    • Searching for "coating =SEQ2 layer" returns "coating a layer" but not "coating layer"

     

    Punctuation Characters

    Punctuation characters and special characters, such as periods, commas or hyphens, are ignored both in the search filter and in the searched documents.

    • Periods (".") and commas (",")
      • Searching for "coating layer" returns both "coating layer" and "coating, layer"
    • Hyphens ("-") 
      • Searching for "water based" returns "water based" and "water-based"
      • Searching for "water-based" returns "water-based", "water based" and "waterbased"

    However, note that punctuation characters that define numeric values are treated as exceptions:

      • Searching for "12.5" returns "12.5" but not "125".
      • Searching for "12,5" returns "12,5" but not "125".
      • Searching for "125" returns "125" but not "12,5" or "1,25" or "12.5" or "1.25".
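
The document side of these rules can be approximated with a small normalization step. The Python sketch below is an assumption about how such a rule could look and not a description of the real index: it replaces hyphens, periods and commas with spaces, except for periods and commas that sit between two digits. It does not model the query-side behavior in which "water-based" also finds "waterbased".

```python
import re

# Hyphens are always dropped; periods and commas are kept only between two digits.
PUNCTUATION = re.compile(r"-|(?<!\d)[.,]|[.,](?!\d)")


def index_tokens(text):
    """Split a text into words after ignoring non-numeric punctuation."""
    return PUNCTUATION.sub(" ", text).split()


print(index_tokens("coating, layer"))  # ['coating', 'layer']
print(index_tokens("water-based"))     # ['water', 'based']
print(index_tokens("12.5"))            # ['12.5']
print(index_tokens("12,5 mm."))        # ['12,5', 'mm']
```

Because "12.5" and "12,5" survive as single tokens, neither of them is found by a search for "125", which matches the exceptions listed above.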

     

    Upper Case and Lower Case Letters

    The PatentSight full text search is case insensitive and does not differentiate between upper and lower case letters.

    • Searching for "LED" returns both "LED" and "led"
    • Searching for "composition" returns both "Composition" and "composition"

     


     

     

    Full Text Search Examples

    Example 1:
    Search Within a Sample Document

    Operator: AND

    Query: antibody AND polypeptide
    Matching: Match
    Explanation: Both operands to the left and right of the AND operator occur in the example document.

    Query: polypeptide AND DNA
    Matching: No Match
    Explanation: Only one of the conditions to the left and right of the AND operator is met: the word "DNA" does not occur in the example document.

    Query: polypeptide AND NOT DNA
    Matching: Match
    Explanation: Both conditions to the left and right of the AND operator are met. One operand occurs and the negated operand does not occur.

    Query: polypeptide AND NOT antibody
    Matching: No Match
    Explanation: Only one of the conditions to the left and right of the AND operator is met: "antibody" was negated with a NOT operator, but does occur in the example document.

    Operator: OR

    Query: polypeptide OR antibody
    Matching: Match
    Explanation: Both operands to the left and right of the OR operator occur in the example document.

    Query: polypeptide OR DNA
    Matching: Match
    Explanation: At least one operand to the left and right of the OR operator occurs in the example document.

    Query: RNA OR DNA
    Matching: No Match
    Explanation: None of the conditions to the left and right of the OR operator is met: neither the word "RNA" nor the word "DNA" occurs in the example document.

    Operator: NEAR

    Query: isolated NEAR position
    Matching: Match
    Explanation: Both operands to the left and right of the NEAR operator occur within up to 5 word jumps of each other in the example document. The order of occurrence does not matter for NEAR.

    Query: isolated NEAR2 position
    Matching: No Match
    Explanation: The operands to the left and right of the NEAR2 operator do not occur within up to 2 word jumps of each other in the example document.

    Query: isolated =NEAR4 position
    Matching: Match
    Explanation: Both operands to the left and right of the =NEAR4 operator occur exactly 4 word jumps apart in the example document.

    Query: isolated =NEAR6 position
    Matching: No Match
    Explanation: The operands to the left and right of the =NEAR6 operator do not occur exactly 6 word jumps apart in the example document.

    Operator: SEQ

    Query: position polypeptide
    Matching: Match
    Explanation: Both space-separated search terms occur adjacent to each other. Space-separated search terms are evaluated as SEQ1.

    Query: isolated SEQ position
    Matching: Match
    Explanation: Both operands to the left and right of the SEQ operator occur within up to 5 word jumps of each other. The order of occurrence does matter for SEQ.

    Query: position SEQ isolated
    Matching: No Match
    Explanation: While both operands to the left and right of the SEQ operator occur within up to 5 word jumps of each other, they do not occur in the specified order.

    Query: isolated =SEQ6 position
    Matching: No Match
    Explanation: The operands to the left and right of the =SEQ6 operator do not occur exactly 6 word jumps apart in the specified order in the example document.

    Operator: Wildcards

    Query: *peptide
    Matching: Match
    Explanation: The word "polypeptide" occurs in the example document, with "poly" being replaced by the *-wildcard.

    Query: HVR-??
    Matching: Match
    Explanation: The words "HVR-H1", "HVR-H2", etc. occur in the example document, with the two ?-wildcards being placeholders for "H1", "H2", etc.

    Query: X3?
    Matching: No Match
    Explanation: There is no word in the example document that begins with "X3" and ends with exactly one additional character.
    Note: X3% would match.

    Query: chain%
    Matching: Match
    Explanation: This query matches because the word "chain" occurs in the example document.
    Note: "chains" would also match.

     

    Example 2:
    Combining Several Full Text Filter Elements

    The full text search browses through patent documents but returns patent families. How this affects the search results becomes evident when comparing the following two searches.

    Search A

    TC=(steel AND alloy)

    Search B

    TC=(steel)
    AND
    TC=(alloy)

    Search A finds patent families that have at least one family member (document) which contains both "steel" and "alloy" in its title or claims. This means that one and the same family member has to meet both conditions, i.e., contain both "steel" and "alloy".

    Hit?  Patent Families
    Yes   Patent family 1, where the US member has "steel" and "alloy" in its title
    Yes   Patent family 2, where the JP member has "steel" in its title and "alloy" in its claims
    No    Patent family 3, where the KR member has "steel" in its claims and the DE member has "alloy" in its claims
    No    Patent family 4, where the FR member has "steel" in its title and the CN member has "alloy" in its claims

     

    Search B finds patent families that have at least one member (document) which contains "steel" in its title or claims and at least one member (document) which contains "alloy" in its title or claims. This means that different members of the same family can each meet one of the conditions.

    Hit?  Patent Families
    Yes   Patent family 1, where the US member has "steel" and "alloy" in its title
    Yes   Patent family 2, where the JP member has "steel" in its title and "alloy" in its claims
    Yes   Patent family 3, where the KR member has "steel" in its claims and the DE member has "alloy" in its claims
    Yes   Patent family 4, where the FR member has "steel" in its title and the CN member has "alloy" in its claims


    Consequently, Search B can result in more hits than Search A.
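
The difference between the two searches can be replayed with a small Python model of the four families from the tables above. The data structure and function names are assumptions made for this illustration. Each family member is reduced to the relevant words in its title and claims.

```python
# The four example families; each member lists the relevant words per segment.
FAMILIES = {
    "Patent family 1": [{"title": {"steel", "alloy"}, "claims": set()}],   # US
    "Patent family 2": [{"title": {"steel"}, "claims": {"alloy"}}],        # JP
    "Patent family 3": [{"title": set(), "claims": {"steel"}},             # KR
                        {"title": set(), "claims": {"alloy"}}],            # DE
    "Patent family 4": [{"title": {"steel"}, "claims": set()},             # FR
                        {"title": set(), "claims": {"alloy"}}],            # CN
}


def tc_contains(member, term):
    """True if the term occurs in the member's title or claims (TC)."""
    return term in member["title"] | member["claims"]


def search_a(members):
    """TC=(steel AND alloy): one member must contain both terms."""
    return any(tc_contains(m, "steel") and tc_contains(m, "alloy") for m in members)


def search_b(members):
    """TC=(steel) AND TC=(alloy): different members may contribute the terms."""
    return (any(tc_contains(m, "steel") for m in members)
            and any(tc_contains(m, "alloy") for m in members))


hits_a = [name for name, members in FAMILIES.items() if search_a(members)]
hits_b = [name for name, members in FAMILIES.items() if search_b(members)]
print(hits_a)  # ['Patent family 1', 'Patent family 2']
print(hits_b)  # ['Patent family 1', 'Patent family 2', 'Patent family 3', 'Patent family 4']
```

Every family found by Search A is also found by Search B, since a member that meets both conditions on its own also satisfies each condition separately.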