MongoDB V1 Connector Configuration Reference
Retrieve data from a MongoDB instance.
V1 deprecation and removal notice
Starting in Fusion 5.12.0, all V1 connectors are deprecated. This means they are no longer being actively developed and will be removed in Fusion 5.13.0.
The replacement for this connector is in active development at this time and will be released at a future date.
If you are using this connector, you must migrate to the replacement connector or a supported alternative before upgrading to Fusion 5.13.0. We recommend migrating to the replacement connector as soon as possible to avoid any disruption to your workflows.
On the first connection, the Fusion MongoDB connector crawls the entire MongoDB instance and saves a checkpoint.
If Process oplog is not selected, the connector recrawls the entire MongoDB instance each time you restart the data source.
In this mode the connector does not support incremental recrawling, nor does it delete entries that are deleted from MongoDB.
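The recrawl behavior described above can be summarized as a small decision function. This is only an illustrative sketch in Python, not the connector's implementation; the function name `crawl_mode` and the `has_checkpoint` flag are hypothetical.

```python
def crawl_mode(process_oplog: bool, has_checkpoint: bool) -> str:
    """Return which kind of crawl the text above describes.

    Hypothetical helper for illustration only:
    - The first connection (no checkpoint yet) is always a full crawl.
    - With Process oplog selected and a saved checkpoint, later runs
      can pick up changes from the oplog.
    - With Process oplog deselected, every restart is a full recrawl.
    """
    if process_oplog and has_checkpoint:
        return "incremental"
    return "full"


print(crawl_mode(process_oplog=True, has_checkpoint=False))   # first run
print(crawl_mode(process_oplog=True, has_checkpoint=True))    # later run
print(crawl_mode(process_oplog=False, has_checkpoint=True))   # oplog off
```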
When entering configuration values in the UI, use unescaped characters, such as \t for the tab character. When entering configuration values in the API, use escaped characters, such as \\t for the tab character.
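As a rough illustration of the difference, the Python sketch below serializes a value to JSON the way an API request body would carry it. The property name `delimiter` is hypothetical and is not a property of this connector; it only stands in for any field whose value contains a backslash sequence.

```python
import json

# What you would type in the UI: a backslash followed by the letter t.
ui_value = r"\t"

# Serializing to JSON escapes the backslash, which gives the form
# expected in an API request body.
# "delimiter" is a hypothetical property name used only for illustration.
api_fragment = json.dumps({"delimiter": ui_value})
print(api_fragment)

# Decoding the JSON again yields the same two characters typed in the UI.
assert json.loads(api_fragment)["delimiter"] == ui_value
```

Building request bodies with a JSON library, rather than by hand, takes care of this escaping automatically.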
Crawl a MongoDB database. On recrawls, the crawler can use the MongoDB oplog to discover new content and changes to existing content (updated or removed documents). To force a full resynchronization, deselect the oplog option and start the crawl again.
id - string, required
Unique name for this datasource.
>= 1 characters
Match pattern: ^[a-zA-Z0-9_-]+$
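The constraints on `id` can be checked before a datasource is submitted. The following Python sketch applies the match pattern and minimum length listed above; the helper name `is_valid_datasource_id` and the sample IDs are hypothetical.

```python
import re

# Match pattern copied from the reference entry for "id".
ID_PATTERN = re.compile(r"^[a-zA-Z0-9_-]+$")


def is_valid_datasource_id(candidate: str) -> bool:
    """Check the documented constraints: at least 1 character, and only
    letters, digits, underscores, and hyphens."""
    return len(candidate) >= 1 and ID_PATTERN.fullmatch(candidate) is not None


print(is_valid_datasource_id("mongo-products_01"))  # allowed characters only
print(is_valid_datasource_id("mongo products"))     # contains a space
```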
pipeline - string, required
Name of an existing index pipeline for processing documents.
>= 1 characters
description - string
Optional description for this datasource.
properties - Properties
Datasource configuration properties
db - Connector DB
Type and properties for a ConnectorDB implementation to use with this datasource.
type - string
Fully qualified class name of ConnectorDb implementation.
>= 1 characters
Default: com.lucidworks.connectors.db.impl.MapDbConnectorDb
inlinks - boolean
Keep track of incoming links. This negatively impacts performance and size of DB.
Default: false
aliases - boolean
Keep track of original URIs that resolved to the current URI. This negatively impacts performance and size of DB.
Default: false
inv_aliases - boolean
Keep track of target URIs that the current URI resolves to. This negatively impacts performance and size of DB.
Default: false
list_hosts - array[object]
Host and ports of Mongo nodes
Default: {"host":"localhost","port":27017}
object attributes: {
  host: {
    display name: Host
    type: string
  }
  port: {
    display name: Port
    type: integer
  }
}
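For a replica set, `list_hosts` holds one object per node. A minimal Python sketch for turning a comma-separated "host:port" string into that shape follows; the helper name `parse_hosts` and the example host names are hypothetical, and the fallback port is taken from the default shown above. IPv6 literals are not handled.

```python
def parse_hosts(spec: str, default_port: int = 27017) -> list:
    """Convert 'host1:port1,host2' into a list of {"host", "port"} objects.

    Hypothetical helper for illustration; entries without an explicit
    port fall back to default_port.
    """
    hosts = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # ignore empty entries such as a trailing comma
        host, sep, port = item.partition(":")
        hosts.append({"host": host, "port": int(port) if sep else default_port})
    return hosts


print(parse_hosts("mongo1.example.com:27018, mongo2.example.com"))
```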
list_credentials - array[object]
Credentials for Mongo databases
object attributes: {
  database: {
    display name: Database
    type: string
  }
  username: {
    display name: Username
    type: string
  }
  password: {
    display name: Password
    type: string
  }
  id: {
    display name: Auth Config id
    type: string
  }
}
collections - string
The MongoDB collections to index, in the format 'databaseName.collection'. Multiple collections can be separated by commas. The default '*.*' option crawls all databases (limited by user access) and their related collections.
>= 1 characters
Default: *.*
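The `collections` value is a single comma-separated string. A small Python sketch for building it from database and collection names is shown below; the helper name `collections_value` and the sample names are hypothetical, and the sketch assumes no whitespace is placed around the commas.

```python
def collections_value(pairs) -> str:
    """Join (database, collection) pairs into 'db.collection,db.collection'.

    Hypothetical helper for illustration. An empty list falls back to
    the documented default '*.*', which crawls everything the user can access.
    """
    if not pairs:
        return "*.*"
    return ",".join(f"{database}.{collection}" for database, collection in pairs)


print(collections_value([("shop", "products"), ("shop", "orders")]))
print(collections_value([]))
```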
process_oplog - boolean
Process updates from the oplog. Disable this option to perform a full synchronization of content in MongoDB collections with the index.
Default: true
diagnosticMode - boolean
Diagnostic mode enables more logging, including logging the ID of every document inserted, updated or deleted in the oplog.
Default: false
batch_size_solr_commit - integer
The number of documents to process between Solr commits.
Default: 1000
enable_ssl - boolean
When enabled, the connector uses SSL connections to communicate with the MongoDB server.
Default: false
customized_timestamp - integer
Customized timestamp in epoch format (for example, 1557881001) that overwrites the existing checkpoint in ZooKeeper. Use it carefully. The checkpoint is overwritten as long as the oplog is enabled. This property is transient: after you set a value and add or update the datasource, the checkpoint is replaced and the property is removed. You must then refresh the UI manually.
exclusiveMinimum: false
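The example value 1557881001 has ten digits, which suggests the timestamp is expressed in seconds since the Unix epoch. Under that assumption, a value for `customized_timestamp` can be derived from a date as in the Python sketch below; the helper name `to_epoch_seconds` is hypothetical.

```python
from datetime import datetime, timezone


def to_epoch_seconds(moment: datetime) -> int:
    """Convert a timezone-aware datetime to whole seconds since the epoch.

    Hypothetical helper for illustration. Naive datetimes are rejected so
    that the local timezone is never applied by accident.
    """
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    return int(moment.timestamp())


# 2019-05-15 00:43:21 UTC corresponds to the example value in the reference.
print(to_epoch_seconds(datetime(2019, 5, 15, 0, 43, 21, tzinfo=timezone.utc)))
```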
oplog_listener_period_time - integer
Interval, in seconds, at which the checkpoint is updated in ZooKeeper. This option takes effect only if the oplog is enabled.
Default: 60
read_preferences - string
Read preference describes how MongoDB clients route read operations to the members of a replica set.
Default: primary
Allowed values: primary, primary preferred, secondary, secondary preferred, nearest
tag_set_list - array[object]
A list of Tag Sets used for non-primary read modes
Default:
object attributes: {
  tag_set: {
    display name: Tag Set
    type: array
  }
}
commit_on_finish - boolean
Set to true to send a commit request to Solr after the last batch has been fetched, so that the documents are committed to the index.
Default: true
verify_access - boolean
Set to true to require successful connection to the filesystem before saving this datasource.
Default: true
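Putting the connection properties above together, a datasource definition might look like the following Python sketch, which builds the configuration as a dictionary and serializes it to JSON. The nesting under `properties` is an assumption based on the order of this listing, the `id`, `pipeline`, and `description` values are hypothetical, and a real request may need additional fields that this reference does not list.

```python
import json

# Illustrative datasource definition using the documented defaults.
datasource = {
    "id": "mongo-products",        # hypothetical; must match ^[a-zA-Z0-9_-]+$
    "pipeline": "mongo-products",  # hypothetical; must name an existing index pipeline
    "description": "Example MongoDB datasource",
    "properties": {                # assumed nesting, inferred from the listing order
        "list_hosts": [{"host": "localhost", "port": 27017}],
        "collections": "*.*",
        "process_oplog": True,
        "diagnosticMode": False,
        "batch_size_solr_commit": 1000,
        "enable_ssl": False,
        "oplog_listener_period_time": 60,
        "read_preferences": "primary",
        "commit_on_finish": True,
        "verify_access": True,
    },
}

# Serialize for use as a request body; Python booleans become JSON true/false.
payload = json.dumps(datasource, indent=2)
print(payload)
```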
initial_mapping - Initial field mapping
Provides mapping of fields before documents are sent to an index pipeline.
skip - boolean
Set to true to skip this stage.
Default: false
label - string
A unique label for this stage.
<= 255 characters
condition - string
Define a conditional script that must result in true or false. This can be used to determine if the stage should process or not.
reservedFieldsMappingAllowed - boolean
Default: false
mappings - array[object]
List of mapping rules
object attributes: {
  source (required): {
    display name: Source Field
    type: string
  }
  target: {
    display name: Target Field
    type: string
  }
  operation: {
    display name: Operation
    type: string
  }
}
unmapped - Unmapped Fields
If fields do not match any of the field mapping rules, these rules will apply.
source - string
The name of the field to be mapped.
target - string
The name of the field to be mapped to.
operation - string
The type of mapping to perform: move, copy, delete, add, set, or keep.
Default: copy
Allowed values: copy, move, delete, set, add, keep
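A mapping rule can be validated before it is added to `mappings`. The Python sketch below assumes that rules in `mappings` accept the same operation values that are listed for `unmapped`, which this reference does not state explicitly; the helper name `make_mapping` and the field names are hypothetical.

```python
# Operation values listed for the "unmapped" rule; assumed here to apply
# to entries in "mappings" as well.
ALLOWED_OPERATIONS = {"copy", "move", "delete", "set", "add", "keep"}


def make_mapping(source: str, operation: str, target: str = None) -> dict:
    """Build one mapping rule. "source" is the only required attribute."""
    if not source:
        raise ValueError("source is required")
    if operation not in ALLOWED_OPERATIONS:
        raise ValueError(f"unsupported operation: {operation}")
    rule = {"source": source, "operation": operation}
    if target is not None:
        rule["target"] = target
    return rule


# Hypothetical field names, for illustration only.
print(make_mapping("name", "move", target="name_s"))
print(make_mapping("internal_notes", "delete"))
```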