Local Filesystem V2 Connector Configuration Reference

This connector traverses a network file system (NFS), where a shared drive is mounted to the same location on all hosts in the cluster that are running this connector.

Remote connectors

V2 connectors support running remotely in Fusion versions 5.7.1 and later. Refer to Configure Remote V2 Connectors.

When entering configuration values in the UI, use unescaped characters, such as \t for the tab character. When entering configuration values in the API, use escaped characters, such as \\t for the tab character.
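
The difference between the two forms can be sketched as follows. This assumes the API accepts JSON and that the escaping meant here is ordinary JSON string escaping; the field name `delimiter` is hypothetical and used only for illustration.

```python
import json

# What you would type in the UI: a backslash followed by the letter t
# (two characters). "delimiter" is a hypothetical field name.
ui_value = r"\t"

# Serializing to JSON escapes the backslash, which gives the API form.
api_literal = json.dumps({"delimiter": ui_value})
print(api_literal)  # prints {"delimiter": "\\t"}

# Decoding the JSON body returns the original two-character value.
round_trip = json.loads(api_literal)["delimiter"]
```

Building the request body with a JSON library, rather than by hand, avoids having to count backslashes.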

Connector for file systems

description - string

Optional description

<= 125 characters

pipeline - string, required

Name of the IndexPipeline used for processing output.

>= 1 characters

Match pattern: ^[a-zA-Z0-9_-]+$
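
A minimal sketch of checking a name against the length and pattern constraints above before submitting a configuration. The helper name `is_valid_name` is hypothetical; the pattern is copied from this reference.

```python
import re

# Pattern copied from the reference: letters, digits, underscore, hyphen.
NAME_PATTERN = re.compile(r"^[a-zA-Z0-9_-]+$")


def is_valid_name(value: str) -> bool:
    """Return True if value has at least 1 character and matches the pattern."""
    return len(value) >= 1 and NAME_PATTERN.fullmatch(value) is not None


print(is_valid_name("my-pipeline_1"))  # True
print(is_valid_name("my pipeline"))    # False: contains a space
```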

diagnosticLogging - boolean

Enable diagnostic logging; disabled by default

Default: false

parserId - string, required

The Parser to use in the associated IndexPipeline.

coreProperties - Core Properties

Common behavior and performance settings.

fetchSettings - Fetch Settings

System level settings for controlling fetch behavior and performance.

indexingThreads - number

Maximum number of indexing threads; defaults to 4. This setting controls the number of threads in the indexing service used for processing content documents emitted by this datasource. Higher values can sometimes help with overall fetch performance.

>= 1

<= 10

exclusiveMinimum: false

exclusiveMaximum: false

Default: 4

Multiple of: 1

pluginInstances - number

Maximum number of plugin instances for distributed fetching. Only the specified number of plugin instances will do fetching. This is useful for distributing load between different instances.

<= 500

exclusiveMinimum: false

exclusiveMaximum: false

Default: 0

Multiple of: 1

indexMetadata - boolean

When enabled, the metadata of skipped items is indexed to the content collection.

Default: false

fetchResponseScheduledTimeout - number

The maximum amount of time for a response to be scheduled. The task will be canceled if this setting is exceeded.

>= 1000

<= 500000

exclusiveMinimum: false

exclusiveMaximum: false

Default: 300000

Multiple of: 1

indexingInactivityTimeout - number

The maximum amount of time to wait for indexing results (in seconds). If exceeded, the job will fail with an indexing inactivity timeout.

>= 60

<= 691200

exclusiveMinimum: false

exclusiveMaximum: false

Default: 86400

Multiple of: 1

pluginInactivityTimeout - number

The maximum amount of time to wait for plugin activity (in seconds). If exceeded, the job will fail with a plugin inactivity timeout.

>= 60

<= 691200

exclusiveMinimum: false

exclusiveMaximum: false

Default: 600

Multiple of: 1
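
Because both inactivity timeouts are given in seconds, the defaults and bounds are easier to reason about once converted. The sketch below only restates the numbers listed above in larger units; the constant names are hypothetical.

```python
from datetime import timedelta

# Values copied from the reference above; all are in seconds.
INACTIVITY_MIN = 60
INACTIVITY_MAX = 691200
INDEXING_INACTIVITY_DEFAULT = 86400
PLUGIN_INACTIVITY_DEFAULT = 600

print(timedelta(seconds=INACTIVITY_MIN))               # 0:01:00
print(timedelta(seconds=INACTIVITY_MAX))               # 8 days, 0:00:00
print(timedelta(seconds=INDEXING_INACTIVITY_DEFAULT))  # 1 day, 0:00:00
print(timedelta(seconds=PLUGIN_INACTIVITY_DEFAULT))    # 0:10:00
```

In other words, the indexing timeout defaults to one day, the plugin timeout defaults to ten minutes, and either can be set anywhere from one minute to eight days.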

indexContentFields - boolean

When enabled, content fields will be indexed to the crawl-db collection.

Default: false

numFetchThreads - number

Maximum number of fetch threads; defaults to 20. This setting controls the number of threads that call the connector's fetch method. Higher values can, but do not always, help with overall fetch performance.

>= 1

<= 500

exclusiveMinimum: false

exclusiveMaximum: false

Default: 20

Multiple of: 1

asyncParsing - boolean

When enabled, content will be indexed asynchronously.

Default: false
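
The numeric bounds listed for the fetch settings can be checked client-side before a configuration is submitted. The following is a rough sketch, not part of the connector: the table and the function name `check_fetch_settings` are hypothetical, and the bounds are copied from the entries above.

```python
from typing import Dict, List, Optional, Tuple

# (minimum, maximum, default) copied from the reference.
# None means the reference does not list that bound.
FETCH_SETTING_BOUNDS: Dict[str, Tuple[Optional[int], Optional[int], int]] = {
    "indexingThreads": (1, 10, 4),
    "pluginInstances": (None, 500, 0),
    "fetchResponseScheduledTimeout": (1000, 500000, 300000),
    "indexingInactivityTimeout": (60, 691200, 86400),
    "pluginInactivityTimeout": (60, 691200, 600),
    "numFetchThreads": (1, 500, 20),
}


def check_fetch_settings(settings: dict) -> List[str]:
    """Return a list of human-readable problems; empty means all checks passed."""
    errors = []
    for name, value in settings.items():
        if name not in FETCH_SETTING_BOUNDS:
            continue  # boolean settings and unknown keys are not checked here
        low, high, _default = FETCH_SETTING_BOUNDS[name]
        # "Multiple of: 1" means whole numbers only; bool is excluded explicitly
        # because Python treats True/False as integers.
        if isinstance(value, bool) or not isinstance(value, int):
            errors.append(f"{name}: expected a whole number, got {value!r}")
            continue
        if low is not None and value < low:
            errors.append(f"{name}: {value} is below the minimum {low}")
        if high is not None and value > high:
            errors.append(f"{name}: {value} is above the maximum {high}")
    return errors


print(check_fetch_settings({"indexingThreads": 11, "numFetchThreads": 20}))
```

Both bounds are treated as inclusive, matching the `exclusiveMinimum: false` and `exclusiveMaximum: false` entries.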

id - string, required

A unique identifier for this Configuration.

>= 1 characters

Match pattern: ^[a-zA-Z0-9_-]+$

properties - Plugin Configuration

Plugin specific properties.

initialFilePaths - array[string]

Set of initial paths to crawl.

addFileMetadata - boolean

Add information about documents found in the file system to the index, such as the document owner and ACLs.

Default: true

includeDirectories - boolean

When true, directory items are indexed as documents.

Default: false

sizeLimitProperties - Item Size Limits

Options for including or excluding items based on size, in bytes.

maxSizeBytes - number

Used for excluding items when the item size is larger than the configured value.

>= -2147483648

<= 2147483647

exclusiveMinimum: false

exclusiveMaximum: false

Default: -1

Multiple of: 1

minSizeBytes - number

Used for excluding items when the item size is smaller than the configured value.

>= -2147483648

<= 2147483647

exclusiveMinimum: false

exclusiveMaximum: false

Default: 1

Multiple of: 1
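
The two size limits can be read as a simple range check. The sketch below is illustrative only; the function name is hypothetical, and it assumes that a negative maxSizeBytes (such as the default of -1) means no upper limit, which this reference states for maxDepth and maxItems but not explicitly for maxSizeBytes.

```python
def is_excluded_by_size(size_bytes: int,
                        min_size_bytes: int = 1,
                        max_size_bytes: int = -1) -> bool:
    """Return True if an item of size_bytes would be excluded by the limits."""
    # Items smaller than the configured minimum are excluded.
    if size_bytes < min_size_bytes:
        return True
    # Assumption: a negative maximum disables the upper limit.
    if max_size_bytes >= 0 and size_bytes > max_size_bytes:
        return True
    return False


print(is_excluded_by_size(0))                          # True: below the default minimum of 1
print(is_excluded_by_size(2048, max_size_bytes=1024))  # True: above the configured maximum
print(is_excluded_by_size(2048))                       # False
```

Under this reading, the default minimum of 1 byte excludes empty files.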

namePatternConfig - Name Pattern Rules

inclusiveRegexes - array[string]

Regular expressions for URI patterns to include. This will limit this datasource to only URIs that match the regular expression.

Default:

exclusiveRegexes - array[string]

Regular expressions for URI patterns to exclude. This will limit this datasource to only URIs that do not match the regular expression.

Default:
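
How the inclusive and exclusive patterns might combine can be sketched as follows. The details are assumptions rather than documented behavior: patterns are matched anywhere in the URI, an empty inclusive list allows everything, and an exclusive match wins over an inclusive one. The function name and sample paths are hypothetical.

```python
import re
from typing import Iterable, List, Sequence


def filter_uris(uris: Iterable[str],
                inclusive_regexes: Sequence[str] = (),
                exclusive_regexes: Sequence[str] = ()) -> List[str]:
    """Keep URIs that match an inclusive pattern (if any) and no exclusive pattern."""
    include = [re.compile(p) for p in inclusive_regexes]
    exclude = [re.compile(p) for p in exclusive_regexes]
    kept = []
    for uri in uris:
        # Assumption: with no inclusive patterns configured, every URI is a candidate.
        if include and not any(p.search(uri) for p in include):
            continue
        # Assumption: exclusion takes precedence over inclusion.
        if any(p.search(uri) for p in exclude):
            continue
        kept.append(uri)
    return kept


sample = [
    "/mnt/share/docs/report.pdf",
    "/mnt/share/docs/draft.tmp",
    "/mnt/share/archive/old.pdf",
]
print(filter_uris(sample, inclusive_regexes=[r"/docs/"], exclusive_regexes=[r"\.tmp$"]))
# ['/mnt/share/docs/report.pdf']
```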

regexCacheSize - number

The number of regex match results to cache when evaluating regular expressions. For example, if you exclude files by filename, the regex result for each filename is cached, so that if the same filename comes up again, the cached result is reused instead of re-evaluating the expression.

>= -2147483648

<= 2147483647

exclusiveMinimum: false

exclusiveMaximum: false

Default: 10000

Multiple of: 1
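
The caching idea can be illustrated with a bounded least-recently-used cache. This is a generic sketch using Python's `functools.lru_cache`, not the connector's implementation; the exclusion pattern and function name are hypothetical.

```python
import re
from functools import lru_cache

# Hypothetical exclusion pattern, for illustration only.
EXCLUDE = re.compile(r"\.bak$")


# maxsize plays the role of regexCacheSize (default 10000 in the reference).
@lru_cache(maxsize=10000)
def is_excluded(filename: str) -> bool:
    """Evaluate the pattern once per distinct filename; repeats hit the cache."""
    return EXCLUDE.search(filename) is not None


is_excluded("notes.bak")  # first call: evaluates the regex (cache miss)
is_excluded("notes.bak")  # second call: answered from the cache (cache hit)
info = is_excluded.cache_info()
print(info.hits, info.misses)  # 1 1
```

A larger cache trades memory for fewer repeated regex evaluations, which matters most when the same names recur many times across a crawl.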

includedFileExtensions - array[string]

Set of file extensions to be fetched. If specified, all non-matching files will be skipped.

Default:

excludedFileExtensions - array[string]

Set of file extensions to skip during the fetch.

Default:
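
A possible reading of the two extension lists is sketched below. Several details are assumptions not stated in this reference: extensions are compared case-insensitively, a leading dot is optional, and an excluded extension takes precedence over an included one. The function name is hypothetical.

```python
import os
from typing import Iterable


def passes_extension_filter(path: str,
                            included: Iterable[str] = (),
                            excluded: Iterable[str] = ()) -> bool:
    """Return True if the file at path would be fetched under the extension lists."""
    # Normalize to lowercase without the leading dot (assumption).
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    inc = {e.lstrip(".").lower() for e in included}
    exc = {e.lstrip(".").lower() for e in excluded}
    if ext in exc:
        return False
    # If an inclusion list is specified, all non-matching files are skipped.
    if inc and ext not in inc:
        return False
    return True


print(passes_extension_filter("docs/report.PDF", included=["pdf"]))  # True
print(passes_extension_filter("docs/notes.txt", included=["pdf"]))   # False
print(passes_extension_filter("docs/x.tmp", excluded=[".tmp"]))      # False
```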

depthLimitConfig - Item Depth Limits

maxDepth - number

Maximum depth level for fetch items. If an item has a depth greater than the configured value, it will not be fetched. The default is "no limit" (-1).

>= -2147483648

<= 2147483647

exclusiveMinimum: false

exclusiveMaximum: false

Default: -1

Multiple of: 1
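
The depth check can be sketched as below. How the connector counts depth is not stated in this reference; the sketch assumes depth is the number of path segments below an initial path, with the initial path itself at depth 0. Function names and sample paths are hypothetical.

```python
from pathlib import PurePosixPath


def depth_below(root: str, item: str) -> int:
    """Number of path segments between root and item (assumed depth definition)."""
    # relative_to raises ValueError if item is not located under root.
    return len(PurePosixPath(item).relative_to(PurePosixPath(root)).parts)


def within_depth(root: str, item: str, max_depth: int = -1) -> bool:
    """Return True if item would be fetched under the maxDepth setting."""
    if max_depth < 0:  # -1 is the documented "no limit" default
        return True
    return depth_below(root, item) <= max_depth


print(depth_below("/mnt/share", "/mnt/share/a/b.txt"))                # 2
print(within_depth("/mnt/share", "/mnt/share/a/b.txt", max_depth=1))  # False
print(within_depth("/mnt/share", "/mnt/share/a/b.txt"))               # True
```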

maximumItemLimitConfig - Item Count Limits

maxItems - number

Limits the number of items emitted to the configured IndexPipeline. The default is no limit (-1).

>= -2147483648

<= 2147483647

exclusiveMinimum: false

exclusiveMaximum: false

Default: -1

Multiple of: 1
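
Putting the settings together, a configuration might be assembled as in the sketch below. The nesting is inferred from the order of the entries in this reference and is an assumption, in particular the placement of the file extension lists under namePatternConfig. All values (IDs, paths, patterns) are hypothetical, and a real datasource definition may require additional fields not covered here.

```python
import json

config = {
    "id": "shared-drive-docs",              # hypothetical ID
    "pipeline": "shared-drive-docs",        # hypothetical index pipeline name
    "parserId": "shared-drive-docs",        # hypothetical parser ID
    "description": "Crawls the shared documentation drive",
    "diagnosticLogging": False,
    "coreProperties": {
        "fetchSettings": {
            "indexingThreads": 4,
            "numFetchThreads": 20,
            "pluginInactivityTimeout": 600,
            "indexingInactivityTimeout": 86400,
        }
    },
    "properties": {
        "initialFilePaths": ["/mnt/share/docs"],  # hypothetical mount point
        "addFileMetadata": True,
        "includeDirectories": False,
        "sizeLimitProperties": {"maxSizeBytes": -1, "minSizeBytes": 1},
        "namePatternConfig": {
            "inclusiveRegexes": [],
            "exclusiveRegexes": [r"\.tmp$"],
            "regexCacheSize": 10000,
            "includedFileExtensions": [],
            "excludedFileExtensions": [],
        },
        "depthLimitConfig": {"maxDepth": -1},
        "maximumItemLimitConfig": {"maxItems": -1},
    },
}

# Serialize for an API request; JSON escaping of backslashes is handled here.
payload = json.dumps(config, indent=2)
print(payload)
```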