Legacy Product

Fusion 5.10

    Include Documents Index Stage

    This stage passes a document to the next stage in the pipeline if it matches one or more of the specified rules (Boolean OR). If a field has multiple values, at least one value must match the specified pattern. All non-matching documents are dropped. Rules are defined using regular-expression field matching.
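
The matching behavior described above can be sketched in a few lines of Python. This is only an illustration of the OR-across-rules and any-value-matches semantics, not Fusion's implementation: the function name `passes_include_stage` is hypothetical, and the sketch assumes the pattern must match the whole field value, which this page does not specify.

```python
import re

def passes_include_stage(doc, match_rules):
    """Return True if the document matches at least one rule (Boolean OR)."""
    for rule in match_rules:
        values = doc.get(rule["field"], [])
        # Treat a single-valued field as a one-element list.
        if not isinstance(values, list):
            values = [values]
        # Assumption: the pattern must match the entire value.
        # For a multi-valued field, one matching value is enough.
        if any(re.fullmatch(rule["pattern"], str(v)) for v in values):
            return True
    # No rule matched, so the document would be dropped.
    return False

rules = [{"field": "document_type", "pattern": "(xls|xlsx|xlst|doc|docx)"}]
print(passes_include_stage({"document_type": "txt"}, rules))           # False
print(passes_include_stage({"document_type": ["pdf", "xls"]}, rules))  # True
```

A document that lacks the field entirely matches no rule in this sketch and is treated as dropped.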

    Examples

    Create the "simple-include" pipeline with a stage that includes only certain document types:

    curl -u USERNAME:PASSWORD -X POST -H "Content-type: application/json" 'http://localhost:8764/api/index-pipelines' -d '
    {
      "id" : "simple-include",
      "stages" : [ {
        "type" : "include-doc",
        "matchRules" : [ {
            "field" : "document_type",
            "pattern" : "(xls|xlsx|xlst|doc|docx)"
        }]
      }]
    }'

    Response:

    {
      "id" : "simple-include",
      "stages" : [ {
        "type" : "include-doc",
        "id" : "f701f96b-780e-4355-9dd3-6e53a89afe3e",
        "matchRules" : [ {
          "field" : "document_type",
          "pattern" : "(xls|xlsx|xlst|doc|docx)"
        } ],
        "type" : "include-doc",
        "skip" : false,
        "label" : "include-doc"
      } ],
      "properties" : { }
    }

    Send a text document through the "simple-include" pipeline:

    curl -u USERNAME:PASSWORD 'http://localhost:8764/api/index-pipelines/simple-include/collections/logs/index?simulate=true&echo=true' -H 'Content-type: application/json' -d '
    {
      "document_type": "txt"
    }'

    The empty response indicates the document was dropped:

    [ ]

    Send an XLS document through the pipeline:

    curl -u USERNAME:PASSWORD 'http://localhost:8764/api/index-pipelines/simple-include/collections/logs/index?simulate=true&echo=true' -H 'Content-type: application/json' -d '
    {
      "document_type": "xls"
    }'

    The response is document metadata, indicating the document passed the stage:

    [ {
      "id" : "9e7d1c2e-343a-49de-bc6a-1d1fc25fa93f",
      "fields" : [ {
        "name" : "document_type",
        "value" : "xls",
        "metadata" : { },
        "annotations" : [ ]
      } ]
    } ]

    Configuration

    When entering configuration values in the UI, use unescaped characters, such as \t for the tab character. When entering configuration values in the API, use escaped characters, such as \\t for the tab character.
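
As a rough illustration of the API escaping rule, a JSON serializer produces the escaped form automatically. The sketch below uses Python's standard `json` module; the `body` field and the tab-matching pattern are hypothetical and chosen only to show the backslash being doubled.

```python
import json

# Hypothetical rule: the regex itself contains backslash + t (a tab escape).
rule = {"field": "body", "pattern": r".*\t.*"}

# Serializing to JSON doubles the backslash, giving the escaped form
# that the API expects in the request body.
payload = json.dumps({"matchRules": [rule]})
print(payload)
# {"matchRules": [{"field": "body", "pattern": ".*\\t.*"}]}

# Parsing the payload restores the original, unescaped pattern.
assert json.loads(payload)["matchRules"][0]["pattern"] == rule["pattern"]
```

Building the request body with a serializer rather than by hand avoids having to track the escaping manually.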

    This stage passes a document through if any of the specified rules match; otherwise, it drops the document.

    skip - boolean

    Set to true to skip this stage.

    Default: false

    label - string

    A unique label for this stage.

    <= 255 characters

    condition - string

    Define a conditional script that must evaluate to true or false. The result determines whether the stage processes a document.

    matchRules - array[object], required

    Object attributes:

    field - string, required

    Display name: Field

    pattern - string, required

    Display name: Regex Pattern
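
The constraints listed in this section (a required matchRules array, string field and pattern attributes on each rule, a boolean skip, and a label of at most 255 characters) can be checked before a stage definition is sent to the API. The following is a minimal sketch; `validate_include_stage` is a hypothetical helper, not part of Fusion, and the label value is made up for illustration.

```python
def validate_include_stage(stage):
    """Return a list of problems found in an include-doc stage definition."""
    problems = []

    # matchRules is the only required property.
    rules = stage.get("matchRules")
    if not isinstance(rules, list):
        problems.append("matchRules is required and must be an array")
        rules = []

    # Each rule needs a string "field" and a string "pattern".
    for i, rule in enumerate(rules):
        for key in ("field", "pattern"):
            if not isinstance(rule, dict) or not isinstance(rule.get(key), str):
                problems.append(f"matchRules[{i}].{key} must be a string")

    # Optional properties.
    if "skip" in stage and not isinstance(stage["skip"], bool):
        problems.append("skip must be a boolean")

    label = stage.get("label")
    if label is not None and (not isinstance(label, str) or len(label) > 255):
        problems.append("label must be a string of at most 255 characters")

    return problems


stage = {
    "type": "include-doc",
    "label": "office-docs-only",  # hypothetical label
    "skip": False,
    "matchRules": [
        {"field": "document_type", "pattern": "(xls|xlsx|xlst|doc|docx)"}
    ],
}
print(validate_include_stage(stage))                    # []
print(validate_include_stage({"type": "include-doc"}))
# ['matchRules is required and must be an array']
```

The sketch does not try to compile the pattern, since Python's regular-expression syntax may differ from the one the stage uses.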