Legacy Product

Fusion 5.10

    SolrXML V1 Connector Configuration Reference

    The SolrXML connector indexes XML files formatted according to Solr’s XML structure. It is not a generic XML file crawler; it can only index SolrXML-formatted documents.

    Deprecation and removal notice

    This connector is deprecated as of Fusion 4.2 and is removed or expected to be removed as of Fusion 5.0. Use the Solr V1 connector instead.

    For more information about deprecations and removals, including possible alternatives, see Deprecations and Removals.

    Configuration

    When entering configuration values in the UI, use unescaped characters, such as \t for the tab character. When entering configuration values in the API, use escaped characters, such as \\t for the tab character.

    Connector that indexes XML files formatted according to Solr's XML structure. It can index only SolrXML-formatted documents and is not a generic XML file crawler. Per the Solr standard, every XML file must include the <add> tag for its documents to be added to the index.
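To illustrate the <add> requirement, here is a minimal Python sketch that builds a SolrXML file and checks it. It assumes that <add> is the root element, as in standard Solr XML update messages; the helper names and the field names (id, title) are hypothetical and not part of the connector.

```python
import xml.etree.ElementTree as ET


def build_solr_xml(docs):
    """Serialize a list of {field_name: value} dicts as a SolrXML <add> message."""
    add = ET.Element("add")
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def has_add_root(xml_text):
    """Return True if the XML text has <add> as its root element (assumed layout)."""
    return ET.fromstring(xml_text).tag == "add"


# Illustrative document; field names are placeholders.
sample = build_solr_xml([{"id": "doc-1", "title": "Hello"}])
print(sample)
print(has_add_root(sample))
```

A file whose root is a bare <doc> element fails this check, which is one way to screen input before pointing the datasource at it.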

    description - string

    Optional description for this datasource.

    id - string, required

    Unique name for this datasource.

    >= 1 characters

    Match pattern: ^[a-zA-Z0-9_-]+$
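The match pattern above can be checked on the client side before a datasource is submitted. This is a minimal sketch; the function name is hypothetical. It uses re.fullmatch rather than the anchored pattern with re.match because, in Python, $ also matches just before a trailing newline.

```python
import re

# Character class taken from the documented pattern ^[a-zA-Z0-9_-]+$
ID_PATTERN = re.compile(r"[a-zA-Z0-9_-]+")


def is_valid_datasource_id(value):
    """Return True if value is non-empty and uses only letters, digits, '_' or '-'."""
    return ID_PATTERN.fullmatch(value) is not None


print(is_valid_datasource_id("my-solrxml_01"))
print(is_valid_datasource_id("my datasource"))
```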

    pipeline - string, required

    Name of an existing index pipeline for processing documents.

    >= 1 characters

    properties - Properties

    Datasource configuration properties

    commit_on_finish - boolean

    Set to true to send a commit request to Solr after the last batch has been fetched, so that the documents are committed to the index.

    Default: true

    db - Connector DB

    Type and properties for a ConnectorDB implementation to use with this datasource.

    aliases - boolean

    Keep track of original URIs that resolved to the current URI. This negatively impacts performance and the size of the DB.

    Default: false

    inlinks - boolean

    Keep track of incoming links. This negatively impacts performance and the size of the DB.

    Default: false

    inv_aliases - boolean

    Keep track of target URIs that the current URI resolves to. This negatively impacts performance and the size of the DB.

    Default: false

    type - string

    Fully qualified class name of ConnectorDb implementation.

    >= 1 characters

    Default: com.lucidworks.connectors.db.impl.MapDbConnectorDb

    exclude_paths - array[string]

    An array of regular expression patterns that indicate documents to be excluded from the index. Multiple expressions can be separated by commas.

    Default:

    generate_unique_key - boolean

    If true, a unique identifier will be added to each document. In most cases, this is an 'id' field, unless it was changed in your implementation. If your documents already include an ID field, you can set this to false.

    Default: true

    include_datasource_metadata - boolean

    Set to true to add '_lw_data_source_s' and '_lw_data_source_type_s' fields to each document in addition to the fields found in the file. These fields ensure that the documents are associated with this datasource for faceting, for information shown in the UI, or for later document removal.

    Default: true

    include_paths - array[string]

    An array of regular expression patterns that indicate documents to be included in the index. Multiple expressions can be separated by commas.

    Default: ".*\\.xml"
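The following Python sketch shows how include and exclude patterns might be combined, and why the default appears with a doubled backslash. The matching rules here are assumptions, not documented connector behavior: each pattern must match the whole path, exclusions take precedence, and an empty include list admits everything. The function name and the sample paths are hypothetical.

```python
import json
import re


def should_index(path, include_paths, exclude_paths):
    """Decide whether a path passes the filters (assumed semantics, see lead-in)."""
    if any(re.fullmatch(pattern, path) for pattern in exclude_paths):
        return False
    if not include_paths:
        return True
    return any(re.fullmatch(pattern, path) for pattern in include_paths)


# The regular expression itself contains a single backslash: .*\.xml
default_include = r".*\.xml"

# Serialized as JSON, the backslash is escaped, giving ".*\\.xml"
print(json.dumps(default_include))

print(should_index("/data/products.xml", [default_include], []))
print(should_index("/data/readme.txt", [default_include], []))
```

The JSON form is what an API request body carries, which is consistent with the escaping note in the Configuration section.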

    initial_mapping - Initial field mapping

    Provides mapping of fields before documents are sent to an index pipeline.

    condition - string

    Define a conditional script that must evaluate to true or false. Use it to determine whether the stage should process a document.

    label - string

    A unique label for this stage.

    <= 255 characters

    mappings - array[object]

    List of mapping rules

    object attributes:
        operation - string (display name: Operation)
        source - string, required (display name: Source Field)
        target - string (display name: Target Field)

    reservedFieldsMappingAllowed - boolean

    Default: false

    skip - boolean

    Set to true to skip this stage.

    Default: false

    unmapped - Unmapped Fields

    If fields do not match any of the field mapping rules, these rules will apply.

    operation - string

    The type of mapping to perform: move, copy, delete, add, set, or keep.

    Default: copy

    Allowed values: copy, move, delete, set, add, keep

    source - string

    The name of the field to be mapped.

    target - string

    The name of the field to be mapped to.
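As a rough illustration of how mapping rules with operation, source, and target attributes could act on a document, here is a Python sketch covering copy, move, delete, and keep. The behavior shown is an assumption based on the operation names only. The add and set operations are left out because this reference does not describe their semantics, and fields without a rule are simply left untouched here rather than passed through the unmapped rule. The function name and field names are hypothetical.

```python
def apply_mappings(doc, mappings):
    """Apply a list of {operation, source, target} rules to a copy of doc."""
    result = dict(doc)
    for rule in mappings:
        operation = rule["operation"]
        source = rule["source"]
        target = rule.get("target")
        if source not in result:
            continue  # nothing to map for this rule
        if operation == "copy":
            result[target] = result[source]
        elif operation == "move":
            result[target] = result.pop(source)
        elif operation == "delete":
            del result[source]
        elif operation == "keep":
            pass  # field stays under its original name
        else:
            # add and set are valid operations but are not modeled in this sketch
            raise NotImplementedError(f"operation not modeled: {operation}")
    return result


rules = [
    {"operation": "move", "source": "title", "target": "title_t"},
    {"operation": "copy", "source": "body", "target": "text"},
    {"operation": "delete", "source": "tmp"},
]
mapped = apply_mappings({"title": "T", "body": "B", "tmp": "x"}, rules)
print(mapped)
```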

    max_docs - integer

    The maximum number of documents to crawl. Use -1 to index all documents found.

    Default: -1

    path - string

    Name of the file to read, or directory containing files to read.

    >= 1 characters

    verify_access - boolean

    Set to true to require successful connection to the filesystem before saving this datasource.

    Default: true
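Putting the properties together, a datasource definition might be assembled as in the sketch below. Only keys described in this reference are used. The id, description, pipeline name, and path values are hypothetical placeholders, and any further fields that the API may require to identify the connector are not covered by this reference and are omitted.

```python
import json

# Hypothetical values; key names come from the reference above.
datasource = {
    "id": "solrxml-example",
    "description": "Indexes SolrXML files from a local directory",
    "pipeline": "default",
    "properties": {
        "path": "/data/solrxml",
        "include_paths": [r".*\.xml"],
        "commit_on_finish": True,
        "generate_unique_key": True,
        "include_datasource_metadata": True,
        "max_docs": -1,
        "verify_access": True,
    },
}

# JSON serialization escapes the backslash in the include pattern.
payload = json.dumps(datasource, indent=2)
print(payload)
```

Building the definition as a dictionary and serializing it keeps the regular expression readable in source while producing the escaped form in the request body.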