Fusion 5.3.0

Release date: November 18, 2020

Component versions:

Component	Version
Solr	8.6.3
ZooKeeper	3.5.7
Spark	2.4.5
Kubernetes	GKE, AKS, EKS 1.19. Rancher (RKE) and OpenShift 4 compatible with Kubernetes 1.19. OpenStack and customized Kubernetes installs are not supported. See Kubernetes support for end of support dates.
Ingress Controllers	Nginx, Ambassador (Envoy), GKE Ingress Controller. Istio is not supported.

More information about support dates can be found at Lucidworks Fusion Product Lifecycle.

Solr is updated from version 8.4.1 to 8.6.3.

Looking to upgrade?

Check out the Fusion 5 Upgrades topic for details.

Data models simplify getting started with Fusion by providing preconfigured objects that reduce the effort spent on basic setup tasks. They also help keep documents consistent across datasources and aligned with each object’s type.

Access data models in the Fusion UI by navigating to Indexing > Data Models.

Data Models in Fusion UI

Some connectors include built-in data models as a standard component. Others require you to manually create data models.

Audit Logs

Audit logs are now available in the DevOps Center’s Log Viewer. They give you a record of actions taken within Fusion, including the date, time, responsible user, and more.

Audit logs

Subscriptions

The Subscriptions feature lets you create and configure subscriptions using Apache Pulsar.

In Fusion 5.3.0, Subscriptions are included in the Fusion UI under Indexing > Subscriptions.

Subscriptions Panel

See Subscriptions UI for more information.

Fusion

New features for Smart Answers

Milvus integration

Fusion 5.3 extends support for semantic search using vectors and embeddings by integrating with Milvus, a highly scalable embeddings engine. Milvus allows Fusion to streamline the deep learning methods used for question/answer solutions like Smart Answers, for similarity-based recommendations, and for regular search. Milvus MySQL is included for metadata management.

A number of new components are introduced to manage and utilize Milvus:

New jobs

New pipelines

Smart Answers Index Pipeline
Smart Answers Query Pipeline

For the latest configuration instructions, see Configure The Smart Answers Pipelines.

New index pipeline stage

Encode into Milvus

New query pipeline stages

New deep learning models

Fusion 5.3 refreshes the deep learning models and methodologies used in training and inference for semantic search-based Smart Answers. The following models are new in this release:

bpe_en_300d_10K
bpe_en_300d_200K
bpe_ja_300d_100K
bpe_ko_300d_100K
bpe_zh_300d_50K
bpe_multi_300d_320K

The bpe_{language}_{dim_size}_{vocab_size} models are general pre-trained BPEmb embeddings that are available for different languages, including Chinese, Japanese, and Korean (CJK), as well as a multilingual model. They are also useful when the vocabulary is very large or when the data might contain many misspellings.
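The naming convention above can be unpacked mechanically. The following is a minimal sketch, not part of Fusion: the function `parse_bpe_model_name` and the `BpeModelName` type are hypothetical names introduced for illustration, and reading the `K` suffix as "thousands of vocabulary entries" is an assumption.

```python
import re
from typing import NamedTuple


class BpeModelName(NamedTuple):
    """Parts of a bpe_{language}_{dim_size}_{vocab_size} model name (hypothetical type)."""
    language: str
    dim_size: int
    vocab_size: int


# Matches names such as "bpe_ja_300d_100K".
_BPE_NAME = re.compile(r"^bpe_(?P<language>[a-z]+)_(?P<dim>\d+)d_(?P<vocab>\d+)K$")


def parse_bpe_model_name(name: str) -> BpeModelName:
    """Split a model name into language, embedding dimension, and vocabulary size."""
    match = _BPE_NAME.match(name)
    if match is None:
        raise ValueError(f"not a bpe_* model name: {name!r}")
    return BpeModelName(
        language=match.group("language"),
        dim_size=int(match.group("dim")),
        # Assumption: the "K" suffix means thousands of vocabulary entries.
        vocab_size=int(match.group("vocab")) * 1000,
    )


print(parse_bpe_model_name("bpe_ja_300d_100K"))
```

Under these assumptions, `bpe_ja_300d_100K` reads as a Japanese model with 300-dimensional embeddings and a vocabulary of 100,000 entries.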

distilbert_en
distilbert_multi

These are distilled, performance-optimized versions of BERT models designed to be used at scale. They are available for English and multilingual applications.

biobert_v1.1

This is a BERT model that was pre-trained on large-scale biomedical corpora, which makes it more suitable for biomedical domain applications.

For configuration details, see Train A Smart Answers Supervised Model, Train A Smart Answers Cold Start Model, and Advanced Model Training Configuration for Smart Answers.

Answer Extraction

To enhance how our Smart Answers customers interact with result sets that are composed of large documents, Fusion 5.3 adds Answer Extraction, allowing you to extract a paragraph, sentence, or phrase to answer questions.

When a large document is presented as a result to a query, Answer Extraction extracts the sentences out of the document that are most similar to the query content. To configure this feature, you train a model that gets deployed at the end of the Smart Answers query pipeline stage, after the resulting set of large documents is returned from Solr for final ranking. The model outputs the sentences from each document that are the most similar to the query.

Answer Extraction workflow

The Answer Extraction model is now available from the Lucidworks official Docker to be deployed as a Seldon model. See Extract Short Answers from Longer Documents for detailed configuration steps.
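To make the idea of "sentences most similar to the query" concrete, here is a minimal sketch. It is not the Answer Extraction model: it swaps the trained model for a simple bag-of-words cosine similarity, and the function name `extract_answers` and its parameters are hypothetical.

```python
import math
import re
from collections import Counter


def _tokens(text):
    """Lowercase word tokens; a stand-in for a real tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def _cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def extract_answers(query, document, top_k=1):
    """Return up to top_k sentences from document ranked by similarity to query."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    query_vec = Counter(_tokens(query))
    scored = [(_cosine(query_vec, Counter(_tokens(s))), s) for s in sentences]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop sentences that share no terms with the query.
    return [sentence for score, sentence in scored[:top_k] if score > 0]


doc = (
    "Fusion runs on Kubernetes. "
    "To reset a password, open the user settings page. "
    "Solr stores the index."
)
print(extract_answers("how do I reset my password", doc))
```

A trained model would compare dense embeddings rather than raw term counts, but the ranking step has the same shape: score each sentence against the query and keep the best ones.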

New Seldon model: spaCy

The spaCy NER and POS model that formerly shipped with Fusion is now available only from the Lucidworks official Docker to be deployed as a Seldon model.

The new Seldon model is compatible with Fusion 5.1+ and existing NLP Annotator stages.

The new Trending Recommender job analyzes signals to measure customer engagement over time. Use this job to identify spikes in popularity for specific documents or products, then display those items to your users or analyze the trends for business purposes. You can configure any time window, such as daily, weekly, or monthly.

For detailed steps to configure this job, see Identify Trending Documents or Products.
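As a rough illustration of spotting a popularity spike within a time window, the sketch below compares signal counts in the most recent window with counts in the window before it. This is an assumed, simplified approach rather than the job's actual algorithm; `trending_items`, `min_ratio`, and the `(item_id, timestamp)` signal shape are all hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def trending_items(signals, window, now, min_ratio=2.0):
    """Return (item_id, ratio) pairs whose recent activity grew by at least min_ratio.

    signals: iterable of (item_id, timestamp) tuples (hypothetical shape).
    window:  length of the comparison window, e.g. timedelta(days=7).
    """
    current_start = now - window
    previous_start = current_start - window
    current, previous = Counter(), Counter()
    for item_id, timestamp in signals:
        if current_start <= timestamp < now:
            current[item_id] += 1
        elif previous_start <= timestamp < current_start:
            previous[item_id] += 1
    trending = []
    for item_id, count in current.items():
        # Add-one smoothing avoids division by zero for brand-new items.
        ratio = count / (previous[item_id] + 1)
        if ratio >= min_ratio:
            trending.append((item_id, ratio))
    return sorted(trending, key=lambda pair: pair[1], reverse=True)


now = datetime(2020, 11, 18)
week = timedelta(days=7)
signals = (
    [("a", now - timedelta(days=d)) for d in (1, 2, 3, 4)]  # 4 recent signals
    + [("a", now - timedelta(days=10))]                     # 1 older signal
    + [("b", now - timedelta(days=d)) for d in (1, 2)]      # 2 recent signals
    + [("b", now - timedelta(days=d)) for d in (8, 9, 10)]  # 3 older signals
)
print(trending_items(signals, week, now))
```

Changing `window` to a day or a month gives the daily or monthly view described above.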

Build Training Data job

The new Build Training Data job constructs the training data required for query-time classification, that is, predicting the categories most likely to satisfy a query.

Query-time classification workflow

For detailed configuration steps, see Classify New Queries.
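One plausible shape for such training data is a set of query and category pairs weighted by how often they occur together in signals. The sketch below illustrates that idea only; it does not reproduce the job's logic, and the function `build_training_data`, the `query` and `category` field names, and the `min_count` threshold are hypothetical.

```python
from collections import Counter


def build_training_data(signals, min_count=2):
    """Aggregate signals into (query, category, count) training rows.

    signals: iterable of dicts with hypothetical "query" and "category" keys.
    Pairs seen fewer than min_count times are dropped as noise.
    """
    counts = Counter(
        (signal["query"].strip().lower(), signal["category"]) for signal in signals
    )
    rows = [
        {"query": query, "category": category, "count": count}
        for (query, category), count in counts.items()
        if count >= min_count
    ]
    # Order by query, then by most frequent category first.
    rows.sort(key=lambda row: (row["query"], -row["count"], row["category"]))
    return rows


signals = [
    {"query": "Running Shoes", "category": "footwear"},
    {"query": "running shoes", "category": "footwear"},
    {"query": "running shoes ", "category": "apparel"},
    {"query": "tent", "category": "camping"},
    {"query": "tent", "category": "camping"},
]
for row in build_training_data(signals):
    print(row)
```

A classifier trained on rows like these can then predict likely categories for a new query at query time.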

Connectors

Connectors SDK 3.0

Version 3.0 of the Connectors SDK for Java connector development is now available. See the Javadocs for complete details.

Custom connectors created with previous versions of the SDK must be recompiled with version 3.0 for compatibility with Fusion 5.3.

Remote connector support

Remote connector support returns in Fusion 5.3.0. See Use a Remote Connector with Pulsar Proxy for more information.

The Windows Share connector can access content in a Windows Share or in a Server Message Block (SMB)/Common Internet File System (CIFS) filesystem, using the SMB 2 and SMB 3 protocols.

For more information, see the Windows Share SMB 2/3 V2 connector reference documentation.

Google Drive (V2)

For more information, see the Google Drive reference documentation.

JDBC (V2)

Connect to any JDBC database.

For more information, see the JDBC V2 connector reference documentation.

AWS S3 (V2)

The AWS S3 V2 connector crawls items in a single bucket. You must specify the bucket name and AWS region in which that bucket is located.

You may crawl specific items in a bucket. If no items are specified, the entire bucket will be crawled.

For more information, see the AWS S3 V2 connector reference documentation.

Predictive Merchandiser

JSON Blob rule type

A new rule type, JSON Blob, is added. This rule type allows you to pass arbitrary JSON blobs to your frontend when a rule fires:

JSON Blob Rule Type

Detail Page template

A new template, Detail Page, is added. This template allows you to configure what details and zones are displayed when a user views a product’s details. To configure this template, navigate to Templates and edit Detail Page. You can also configure this template visually in the Merchandiser screen by hovering over a product, clicking the Detail Page button, clicking the Start Task button, and clicking the Edit Template button.