Legacy Product

Fusion 5.10

    Regex Field Extraction Index Stage

    The Regex Field Extraction stage (called the Regular Expression Extractor stage in versions earlier than 3.0) extracts entities from documents by matching regular expressions. Each match found in the contents of a source field is copied to the target field. The regular expression, the source fields, and the target field are all configured as properties of this stage.

    If using the REST API, this stage type is named "regex-extractor".

    For examples of how to use this stage in the Fusion UI, see Part 2 of the Getting Started tutorial.

    Example Stage Specification

    Define a regex-field-extraction stage that applies a regular expression to find storage sizes in the product 'name' field and stores each match in a dedicated field:

    {
      "type" : "regex-field-extraction",
      "id" : "storagesize-regex-extraction",
      "rules" : [ {
        "source" : [ "name" ],
        "target" : "storage_size_ss",
        "pattern" : "(\\d{1,20}\\s{0,3}(GB|MB|TB|KB|mb|gb|tb|kb))",
        "annotateAs" : "storage_size"
      } ],
      "skip" : false
    }
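
    To get a feel for what the pattern in this rule captures, here is a minimal Python sketch. It is not Fusion code: it assumes that Python's re module treats this particular pattern the same way as the regex engine behind the stage, and the sample product names and the helper extract_storage_sizes are invented for illustration.

```python
import re

# Same pattern as in the rule above. A raw string needs only single
# backslashes; the doubled backslashes in the stage spec are JSON escaping.
PATTERN = re.compile(r"(\d{1,20}\s{0,3}(GB|MB|TB|KB|mb|gb|tb|kb))")


def extract_storage_sizes(name):
    """Return every full match of PATTERN in a product name (hypothetical helper)."""
    return [m.group(0) for m in PATTERN.finditer(name)]


# Made-up product names, used only to exercise the pattern.
samples = [
    "Apple iPod touch 32 GB",
    "Kingston 8GB microSD",
    "External drive 2Tb",
    "USB cable",
]

for name in samples:
    print(name, "->", extract_storage_sizes(name))
```

    Because the pattern lists only all-uppercase and all-lowercase units, a mixed-case unit such as "2Tb" is not matched in this sketch.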

    Configuration

    When entering configuration values in the UI, use unescaped characters, such as \t for the tab character. When entering configuration values in the API, use escaped characters, such as \\t for the tab character.
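
    The escaping rule above follows from JSON string syntax, where a backslash inside a string must itself be escaped. As a rough illustration (a Python sketch, not part of Fusion; the variable names are placeholders), serializing the unescaped pattern with a standard JSON library produces the doubled backslashes that an API request body needs:

```python
import json

# The pattern as it would be typed into the UI: single backslashes.
ui_pattern = r"(\d{1,20}\s{0,3}(GB|MB|TB|KB|mb|gb|tb|kb))"

# Serializing to JSON escapes each backslash, giving the API form.
api_body = json.dumps({"pattern": ui_pattern})
print(api_body)

# Parsing the JSON again recovers the original, unescaped pattern.
decoded = json.loads(api_body)["pattern"]
print(decoded == ui_pattern)
```

    Building request bodies with a JSON library, rather than escaping by hand, avoids missing or doubled backslashes.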

    This stage allows you to extract text using regular expressions.

    skip - boolean

    Set to true to skip this stage.

    Default: false

    label - string

    A unique label for this stage.

    <= 255 characters

    condition - string

    Define a conditional script that must result in true or false. This can be used to determine if the stage should process or not.

    rules - array[object]

    object attributes:

    source (required) - array

    Display name: Source Fields

    target (required) - string

    Display name: Target Field

    writeMode - string

    Display name: Write Mode

    pattern (required) - string

    Display name: Regex Pattern

    returnIfNoMatch - string

    Display name: Return if no Match

    noMatchValue - string

    Display name: No Match Literal Value

    group - integer

    Display name: Regex Capture Group

    annotateAs - string

    Display name: Annotation Name
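
    The group attribute takes a capture group number. In standard regex syntax, group 0 is the whole match and the remaining groups are numbered by the position of their opening parenthesis. The Python sketch below shows that numbering for the example pattern. It only illustrates general regex behavior, with an invented product name, and makes no claim about how the stage behaves when group is left unset.

```python
import re

pattern = re.compile(r"(\d{1,20}\s{0,3}(GB|MB|TB|KB|mb|gb|tb|kb))")

# Invented product name, used only to produce a match.
match = pattern.search("SanDisk Ultra 64 GB microSDXC")

# Collect the text captured by group 0 (whole match) and each numbered group.
groups = {n: match.group(n) for n in range(pattern.groups + 1)}

for number, text in groups.items():
    print(number, repr(text))
```

    Here groups 0 and 1 both hold the full size ("64 GB"), because the outer parentheses wrap the entire pattern, while group 2 holds only the unit ("GB").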