Legacy Product

Fusion 5.10

    Exclude Documents Index Stage

    The Exclude Documents stage drops every document that matches all of the specified rules (Boolean AND). When a field has multiple values, a rule matches if at least one of those values matches its pattern. Dropped documents receive no further processing, so they are not indexed into a Fusion collection. All non-matching documents pass to the next stage in the pipeline. Rules are defined using regular-expression field matching.
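
    The matching semantics described above can be sketched in a few lines of Python. This is an illustration only, not Fusion's implementation: the function name should_drop is hypothetical, documents are modeled as plain dictionaries, a missing field is assumed not to match, and the pattern is assumed to have to match the whole field value (Fusion may instead use partial matching).

```python
import re


def should_drop(doc, match_rules):
    """Return True when every rule matches the document (Boolean AND).

    Hypothetical helper; matchRules is required, so an empty rule list
    is not considered here.
    """
    for rule in match_rules:
        # Assumption: a field that is absent from the document never matches.
        values = doc.get(rule["field"], [])
        # Treat a single value as a one-element list so both shapes work.
        if not isinstance(values, (list, tuple)):
            values = [values]
        pattern = re.compile(rule["pattern"])
        # A multi-valued field satisfies the rule if any one value matches.
        # Assumption: the whole value must match (re.fullmatch).
        if not any(pattern.fullmatch(str(v)) for v in values):
            return False
    return True


rules = [{"field": "document_type", "pattern": "(xls|xlsx|xlst|doc|docx)"}]

print(should_drop({"document_type": "txt"}, rules))              # kept
print(should_drop({"document_type": "xls"}, rules))              # dropped
print(should_drop({"document_type": ["txt", "docx"]}, rules))    # dropped
```

    With a second rule added, a document is dropped only when both rules match, which is why adding rules narrows the set of excluded documents rather than widening it.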

    Examples

    Create the "simple-exclude" pipeline with a stage that excludes certain document types:

    curl -u USERNAME:PASSWORD -X POST -H "Content-type: application/json" 'http://localhost:8764/api/index-pipelines' -d '
    {
      "id" : "simple-exclude",
      "stages" : [ {
        "type" : "exclude-doc",
        "matchRules" : [ {
            "field" : "document_type",
            "pattern" : "(xls|xlsx|xlst|doc|docx)"
        }]
      }]
    }'

    Send a text document through the "simple-exclude" pipeline:

    curl -u USERNAME:PASSWORD 'http://localhost:8764/api/index-pipelines/simple-exclude/collections/logs/index?simulate=true&echo=true' -H 'Content-type: application/json' -d '
    {
      "document_type": "txt"
    }'

    The response is document metadata, indicating the document passed the stage:

    [ {
      "id" : "93da43ff-4218-4f24-a690-23b530926104",
      "fields" : [ {
        "name" : "document_type",
        "value" : "txt",
        "metadata" : { },
        "annotations" : [ ]
      } ]
    } ]

    Send an XLS document through the "simple-exclude" pipeline:

    curl -u USERNAME:PASSWORD 'http://localhost:8764/api/index-pipelines/simple-exclude/collections/logs/index?simulate=true&echo=true' -H 'Content-type: application/json' -d '
    {
      "document_type": "xls"
    }'

    The empty response indicates the document was dropped:

    [ ]

    Configuration

    When entering configuration values in the UI, use unescaped characters, such as \t for the tab character. When entering configuration values in the API, use escaped characters, such as \\t for the tab character.
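
    The difference between the two forms comes from JSON string escaping: a backslash inside a JSON string must itself be escaped. A minimal Python sketch shows a JSON serializer producing the escaped form automatically. The field name "body" is a hypothetical placeholder, and the stage body is only built locally, not sent to Fusion.

```python
import json

# The regex as it would be typed in the UI: a backslash followed by "t".
ui_pattern = r"\t"

stage = {
    "type": "exclude-doc",
    # "body" is a hypothetical field name used only for illustration.
    "matchRules": [{"field": "body", "pattern": ui_pattern}],
}

# json.dumps doubles the backslash, giving the form the API expects.
payload = json.dumps(stage)
print(payload)

# Decoding the payload recovers the original two-character regex.
decoded = json.loads(payload)["matchRules"][0]["pattern"]
print(decoded == ui_pattern)
```

    Building request bodies with a JSON library rather than by hand avoids having to track the extra level of escaping manually.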

    This stage drops a document if all of the specified rules match; otherwise, it passes the document through unchanged.

    skip - boolean

    Set to true to skip this stage.

    Default: false

    label - string

    A unique label for this stage.

    <= 255 characters

    condition - string

    Define a conditional script that must evaluate to true or false. Use it to determine whether the stage processes a document.

    matchRules - array[object], required

    Object attributes:

    field - string, required

    Display name: Field

    pattern - string, required

    Display name: Regex Pattern