OneDrive Datasource V2 Connector Configuration Reference
OneDrive is a file hosting service that is part of the Microsoft Office Online services.
The Fusion OneDrive connector crawls a OneDrive for Business instance and retrieves data from it for indexing within Fusion.
To set up the OneDrive connector, first authenticate it with a new or existing Microsoft application. Then proceed to configuring the crawl.
Connector for OneDrive file systems
coreProperties - Core Properties
Common behavior and performance settings.
fetchSettings - Fetch Settings
System level settings for controlling fetch behavior and performance.
fetchItemQueueSize - number
Size of the fetch item queue. Larger values increase memory usage but can improve performance. Default is 10,000.
>= 1
<= 500000
exclusiveMinimum: false
exclusiveMaximum: false
Default: 10000
Multiple of: 1
fetchRequestCheckInterval - number
The amount of time to wait before checking whether a request is done.
>= 1000
<= 500000
exclusiveMinimum: false
exclusiveMaximum: false
Default: 15000
Multiple of: 1
fetchResponseCompletedTimeout - number
The maximum amount of time for a response to be completed. If exceeded, the task is retried if the job is still running.
>= 1
<= 600000
exclusiveMinimum: false
exclusiveMaximum: false
Default: 300000
Multiple of: 1
fetchResponseScheduledTimeout - number
The maximum amount of time for a response to be scheduled. The task will be canceled if this setting is exceeded.
>= 1000
<= 500000
exclusiveMinimum: false
exclusiveMaximum: false
Default: 300000
Multiple of: 1
indexContentFields - boolean
When enabled, content fields will be indexed to the crawl-db collection
Default: false
indexMetadata - boolean
When enabled, the metadata of skipped items is indexed to the content collection.
Default: false
numFetchThreads - number
Maximum number of fetch threads; defaults to 20. This setting controls the number of threads that call the connector's fetch method. Higher values can, but do not always, improve overall fetch performance.
>= 1
<= 500
exclusiveMinimum: false
exclusiveMaximum: false
Default: 20
Multiple of: 1
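The numeric fetch settings above all have inclusive bounds, so they can be checked before a configuration is submitted. The following is a minimal sketch, not part of the connector: the helper name `check_fetch_settings` and the table `FETCH_SETTING_RANGES` are hypothetical, and only the ranges and defaults are taken from this reference.

```python
# Ranges and defaults copied from the reference above: (minimum, maximum, default).
FETCH_SETTING_RANGES = {
    "fetchItemQueueSize": (1, 500_000, 10_000),
    "fetchRequestCheckInterval": (1_000, 500_000, 15_000),
    "fetchResponseCompletedTimeout": (1, 600_000, 300_000),
    "fetchResponseScheduledTimeout": (1_000, 500_000, 300_000),
    "numFetchThreads": (1, 500, 20),
}


def check_fetch_settings(settings):
    """Return a list of problems; an empty list means every checked value is in range.

    Hypothetical helper for illustration; the connector performs its own validation.
    """
    problems = []
    for name, value in settings.items():
        if name not in FETCH_SETTING_RANGES:
            continue  # booleans such as indexMetadata are not range-checked here
        low, high, _default = FETCH_SETTING_RANGES[name]
        # Both bounds are inclusive (exclusiveMinimum and exclusiveMaximum are false),
        # and "Multiple of: 1" means the value must be a whole number.
        is_whole_number = isinstance(value, int) and not isinstance(value, bool)
        if not is_whole_number or not low <= value <= high:
            problems.append(f"{name}={value!r} is outside {low}..{high}")
    return problems
```

For example, `check_fetch_settings({"numFetchThreads": 0})` reports one problem, because the minimum is 1.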
description - string
Optional description
<= 125 characters
diagnosticLogging - boolean
Enable diagnostic logging; disabled by default
Default: false
id - string (required)
A unique identifier for this Configuration.
>= 1 characters
Match pattern: ^[a-zA-Z0-9_-]+$
parserId - string (required)
The Parser to use in the associated IndexPipeline.
pipeline - string (required)
Name of the IndexPipeline used for processing output.
>= 1 characters
Match pattern: ^[a-zA-Z0-9_-]+$
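Both id and pipeline must match the pattern ^[a-zA-Z0-9_-]+$, which allows only letters, digits, underscores, and hyphens. A quick way to test a candidate value is sketched below; the function name `is_valid_name` is hypothetical and used only for illustration.

```python
import re

# Pattern copied from the "Match pattern" entries for id and pipeline.
NAME_PATTERN = re.compile(r"^[a-zA-Z0-9_-]+$")


def is_valid_name(value):
    """Return True if value is a non-empty string of letters, digits, '_' or '-'."""
    # fullmatch is used so that a trailing newline is rejected as well.
    return bool(NAME_PATTERN.fullmatch(value))
```

Under this check, `onedrive-hr_2024` is accepted, while `my datasource` is rejected because of the space.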
properties - OneDrive properties
Plugin specific properties.
clientId - string
Client Id
clientSecret - string
Client secret
fetchRetryProperties - Retry Options
A set of options for configuring retry behavior.
delayFactor - number
The retryer retries failed operations that might succeed on another attempt. It sleeps for an exponentially increasing amount of time after each failed attempt, up to the maximum time: nextWaitTime = exponentialIncrement * multiplier.
>= 1
<= 9999
exclusiveMinimum: false
exclusiveMaximum: false
Default: 2
Multiple of: 1
delayMs - number
Sets the delay between retries. Successive delays are multiplied by delayFactor, backing off exponentially up to maxDelayTimeMs.
>= 1
<= 9223372036854776000
exclusiveMinimum: false
exclusiveMaximum: false
Default: 1000
Multiple of: 1
errorExclusions - array[string]
Optional list of regular expressions matched against a failed attempt's exception class and message. If any regex matches, the request is not retried. This prevents the retryer from retrying non-recoverable errors that the connector implementation has not already ignored.
maxDelayTimeMs - number
The maximum wait time between successive retries.
>= 1
<= 600000
exclusiveMinimum: false
exclusiveMaximum: false
Default: 300000
Multiple of: 1
maxRetries - number
The retryer will retry failed operations in the case that they might succeed if attempted again. This parameter states the number of attempts to retry until giving up. This parameter, if specified, will override the "Stop retrying after time (milliseconds)" parameter.
<= 100
exclusiveMinimum: false
exclusiveMaximum: false
Default: 3
Multiple of: 1
maxTimeLimitMs - number
This setting is used to limit the maximum amount of time spent on retries. Note: this will be ignored if "Maximum Retries" is specified.
>= 1
<= 28800000
exclusiveMinimum: false
exclusiveMaximum: false
Default: 600000
Multiple of: 1
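To see how delayMs, delayFactor, maxDelayTimeMs, and maxRetries interact, the sketch below computes a plain exponential backoff schedule. It is an illustration under assumptions, not the connector's retryer: it assumes the first wait equals delayMs, that each later wait is the previous one multiplied by delayFactor, and that no jitter is applied. The function name `retry_delays` is hypothetical. maxTimeLimitMs is left out because, per the description above, it is ignored when maxRetries is specified.

```python
def retry_delays(delay_ms=1000, delay_factor=2, max_delay_time_ms=300_000,
                 max_retries=3):
    """Return the wait (in ms) before each retry under simple exponential backoff.

    Defaults mirror the defaults listed in this reference. The exact schedule
    used by the connector may differ; this is an assumed model.
    """
    delays = []
    wait = delay_ms
    for _ in range(max_retries):
        delays.append(min(wait, max_delay_time_ms))  # cap each wait at maxDelayTimeMs
        wait *= delay_factor  # the next wait grows by delayFactor
    return delays
```

With the defaults, this model yields waits of 1000, 2000, and 4000 ms. With a larger factor the cap takes over quickly: `retry_delays(1000, 10, 5000, 4)` gives 1000 ms followed by three waits of 5000 ms.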
maximumItemLimitConfig - Item Count Limits
maxItems - number
Limits the number of items emitted to the configured IndexPipeline. The default is no limit (-1).
>= -2147483648
<= 2147483647
exclusiveMinimum: false
exclusiveMaximum: false
Default: -1
Multiple of: 1
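The effect of maxItems can be pictured as truncating the stream of emitted items. The sketch below is illustrative only: `limit_items` is a hypothetical name, and treating every negative value (not just -1) as "no limit" is an assumption.

```python
from itertools import islice


def limit_items(items, max_items=-1):
    """Yield at most max_items items; a negative value is treated as no limit."""
    if max_items < 0:  # -1 is the documented "no limit" default
        return iter(items)
    return islice(items, max_items)
```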
namePatternConfig - Name Pattern Rules
excludedFileExtensions - array[string]
A set of all file extensions to be skipped from the fetch.
Default:
exclusiveRegexes - array[string]
Regular expressions for URI patterns to exclude. This will limit this datasource to only URIs that do not match the regular expression.
Default:
includedFileExtensions - array[string]
Set of file extensions to be fetched. If specified, all non-matching files will be skipped.
Default:
inclusiveRegexes - array[string]
Regular expressions for URI patterns to include. This will limit this datasource to only URIs that match the regular expression.
Default:
regexCacheSize - number
The number of regex match results to cache when evaluating regular expressions. For example, if you exclude files by filename, each filename's regex result is cached so that if the same filename comes up again, the cached result is reused.
>= -2147483648
<= 2147483647
exclusiveMinimum: false
exclusiveMaximum: false
Default: 10000
Multiple of: 1
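The name pattern rules above can be combined into a single accept-or-skip decision per URI. The sketch below shows one plausible reading, with a bounded cache standing in for regexCacheSize. The order of the checks, the case-insensitive extension comparison, and the use of unanchored regex search are all assumptions, and `make_name_filter` is a hypothetical helper rather than connector code.

```python
import re
from functools import lru_cache
from pathlib import PurePosixPath


def make_name_filter(included_exts=(), excluded_exts=(), inclusive=(),
                     exclusive=(), regex_cache_size=10_000):
    """Build a cached predicate that decides whether a URI should be fetched."""
    include_res = [re.compile(p) for p in inclusive]
    exclude_res = [re.compile(p) for p in exclusive]
    # Normalize extensions so "PDF", ".pdf" and "pdf" compare equal (assumption).
    included = {e.lower().lstrip(".") for e in included_exts}
    excluded = {e.lower().lstrip(".") for e in excluded_exts}

    @lru_cache(maxsize=regex_cache_size)  # remembers results for repeated URIs
    def accepts(uri):
        ext = PurePosixPath(uri).suffix.lower().lstrip(".")
        if ext in excluded:
            return False
        if included and ext not in included:
            return False  # an include list skips every non-matching file
        if include_res and not any(r.search(uri) for r in include_res):
            return False
        return not any(r.search(uri) for r in exclude_res)

    return accepts
```

A filter built with `make_name_filter(included_exts=["pdf", "docx"], exclusive=[r"/Archive/"])` accepts `/drive/Reports/q1.PDF` and skips both `/drive/Archive/old.pdf` and `/drive/Reports/photo.png`.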
securityTrimmingProperties - Security trimming configuration
aclCollectionName - string
Name of Solr collection to be used for storing fetched ACL records. If not specified, ACL collection name will be generated automatically using pattern '<datasource_id>_access_control_hierarchy'.
enableSecurityTrimming - boolean
Enable indexing and query-time security-trimming
Default: true
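The fallback naming rule for the ACL collection is simple enough to express directly. In the sketch below, `acl_collection_name` is a hypothetical helper, and it assumes that `<datasource_id>` in the pattern refers to the configuration's id value.

```python
def acl_collection_name(datasource_id, configured_name=None):
    """Return the configured ACL collection name, or the generated default."""
    # Pattern from the reference: '<datasource_id>_access_control_hierarchy'.
    return configured_name or f"{datasource_id}_access_control_hierarchy"
```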
sizeLimitProperties - Item Size Limits
Options for including or excluding items based on size, in bytes.
maxSizeBytes - number
Used for excluding items when the item size is larger than the configured value.
>= -2147483648
<= 2147483647
exclusiveMinimum: false
exclusiveMaximum: false
Default: -1
Multiple of: 1
minSizeBytes - number
Used for excluding items when the item size is smaller than the configured value.
>= -2147483648
<= 2147483647
exclusiveMinimum: false
exclusiveMaximum: false
Default: 1
Multiple of: 1
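The two size limits can be read as a lower and an upper bound on item size. The following sketch assumes that a negative maxSizeBytes (such as the default of -1) disables the upper bound, since otherwise the default would exclude every item; this reference does not state that explicitly. The function name `within_size_limits` is hypothetical.

```python
def within_size_limits(size_bytes, min_size_bytes=1, max_size_bytes=-1):
    """Return True if an item of size_bytes would be kept under the size limits."""
    if size_bytes < min_size_bytes:
        return False  # smaller than minSizeBytes: excluded
    if max_size_bytes >= 0 and size_bytes > max_size_bytes:
        return False  # larger than maxSizeBytes: excluded
    return True
```

Under the defaults, an empty (0-byte) item is excluded and any item of 1 byte or more is kept.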
tenantIdentifier - string
Allowed values are common, organizations, consumers, or a tenant identifier (for example, 8eaef023-2b34-4da1-9baa-8bc8c9d6a490 or example.onmicrosoft.com). For more details, see: https://docs.microsoft.com/en-us/azure/active-directory/develop/active-directory-v2-protocols#endpoints
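In the Microsoft identity platform v2.0 endpoints described at the link above, the tenant value becomes a path segment of the authority URL. The sketch below builds the token endpoint for a given tenantIdentifier. Whether the connector calls exactly this endpoint is an assumption, and `token_endpoint` is a hypothetical helper.

```python
from urllib.parse import quote


def token_endpoint(tenant_identifier):
    """Build the v2.0 token endpoint URL for a tenant value such as 'common'."""
    # The tenant is percent-encoded defensively; GUIDs and domain names pass through unchanged.
    tenant = quote(tenant_identifier, safe="")
    return f"https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token"
```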
usersFilter - array[string]
When this property is set, only the files and folders of the users listed here are retrieved. This property accepts user principal names (UPNs) only.
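Putting the pieces together, a configuration might be assembled as shown below. This is a sketch under assumptions: the nesting (fetchSettings under coreProperties, plugin options under properties) is inferred from the section headings in this reference, every value is a placeholder, and a real Fusion configuration may require additional fields that are not listed here.

```python
import json

# Placeholder values throughout; replace them with values from your own
# Microsoft application and Fusion deployment.
config = {
    "id": "onedrive-example",          # must match ^[a-zA-Z0-9_-]+$
    "pipeline": "onedrive-example",    # name of an existing index pipeline
    "parserId": "onedrive-example",    # name of an existing parser
    "coreProperties": {
        "fetchSettings": {"numFetchThreads": 20, "fetchItemQueueSize": 10000},
    },
    "properties": {
        "clientId": "<application-client-id>",
        "clientSecret": "<application-client-secret>",
        "tenantIdentifier": "example.onmicrosoft.com",
        # Only UPNs are accepted in usersFilter.
        "usersFilter": ["jane.doe@example.onmicrosoft.com"],
        "securityTrimmingProperties": {"enableSecurityTrimming": True},
    },
}

# Serialize to JSON, for example to review the configuration before submitting it.
print(json.dumps(config, indent=2))
```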