Fusion 5.10

    Trending Recommender Jobs

    The Trending Recommender job analyzes signals to measure customer engagement over time. Use this job to identify spikes in popularity for specific items or queries, then display those items to your users or analyze the trends for business purposes. You can configure any time window, such as daily, weekly, or monthly.

    Input

    signals (the COLLECTION_NAME_signals collection by default)

    Output

    Trending items or queries

    Required signals fields:

    query              [5]
    count_i            required
    type               required
    timestamp_tdt      required
    user_id
    doc_id             required
    session_id
    fusion_query_id

    [5] Required when identifying trending queries instead of trending items.

    For detailed steps to configure this job, see Identify Trending Documents or Products.
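
    The configuration properties below can be set in the Fusion UI, or assembled into a job definition and submitted programmatically. The following is a minimal sketch only: the endpoint path, host, port, and credentials are assumptions and should be replaced with the details of your own Fusion deployment.

        import requests

        # Minimal sketch of a Trending Recommender job definition. The endpoint
        # path, host, port, and credentials below are assumptions; substitute
        # the Spark jobs API details for your own Fusion deployment.
        FUSION_URL = "https://FUSION_HOST:6764"      # hypothetical host/port

        job_config = {
            "id": "trending-items-daily",
            "type": "trending-recommender",
            "trainingCollection": "COLLECTION_NAME_signals",
            "refTimeRange": 30,                      # 30-day baseline
            "targetTimeRange": 1,                    # 1-day target window
            "countField": "aggr_count_i",
            "typeField": "aggr_type_s",
            "timeField": "timestamp_tdt",
            "docIdField": "doc_id_s",
            "types": "click,add",
            "recsCount": 500,
            "outputCollection": "COLLECTION_NAME_trending",
        }

        resp = requests.post(
            f"{FUSION_URL}/api/spark/configurations",   # assumed endpoint path
            json=job_config,
            auth=("admin", "password123"),              # placeholder credentials
        )
        resp.raise_for_status()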

    Trending Recommender

    id - string (required)

    The ID for this Spark job. Used in the API to reference this job. Allowed characters: a-z, A-Z, dash (-) and underscore (_). Maximum length: 63 characters.

    <= 63 characters

    Match pattern: [a-zA-Z][_\-a-zA-Z0-9]*[a-zA-Z0-9]?
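
    As an illustration, the ID constraint can be checked before submitting a job; this small sketch copies the regular expression directly from the schema above.

        import re

        # Pattern copied from the schema above: a leading letter, then letters,
        # digits, dashes, or underscores; total length capped at 63 characters.
        ID_PATTERN = re.compile(r"[a-zA-Z][_\-a-zA-Z0-9]*[a-zA-Z0-9]?")

        def is_valid_job_id(job_id: str) -> bool:
            return len(job_id) <= 63 and ID_PATTERN.fullmatch(job_id) is not None

        print(is_valid_job_id("trending-items-daily"))   # True
        print(is_valid_job_id("1-bad-id"))               # False: must start with a letter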

    sparkConfig - array[object]

    Spark configuration settings.

    object attributes:
        key (required): { display name: Parameter Name, type: string }
        value: { display name: Parameter Value, type: string }
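
    For example, a typical pair is the parameter name spark.executor.memory with a value such as 2g.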

    trainingCollection - string (required)

    Solr Collection containing labeled training data

    >= 1 characters

    fieldToVectorize - string

    Fields to extract from Solr (not used for other formats)

    >= 1 characters

    dataFormat - string (required)

    Spark-compatible format that contains training data (like 'solr', 'parquet', 'orc', etc.)

    >= 1 characters

    Default: solr

    trainingDataFrameConfigOptions - object

    Additional spark dataframe loading configuration options

    trainingDataFilterQuery - string

    Solr query to use when loading training data if using Solr

    Default: *:*

    sparkSQL - string

    Use this field to create a Spark SQL query for filtering your input data. The input data will be registered as spark_input

    Default: SELECT * from spark_input
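
    For example, to limit the job to click signals only, you could set this to SELECT * FROM spark_input WHERE type = 'click'. Here the type field comes from the raw signals schema listed above; adjust the field name if your input uses the aggregated form.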

    trainingDataSamplingFraction - number

    Fraction of the training data to use

    <= 1

    exclusiveMaximum: false

    Default: 1

    randomSeed - integer

    For any deterministic pseudorandom number generation

    Default: 1234

    outputCollection - string

    Solr Collection to store model-labeled data to

    dataOutputFormat - string

    Spark-compatible output format (like 'solr', 'parquet', etc.)

    >= 1 characters

    Default: solr

    sourceFields - string

    Solr fields to load (comma-delimited). Leave empty to allow the job to select the required fields to load at runtime.

    partitionCols - string

    If writing to non-Solr sources, this field will accept a comma-delimited list of column names for partitioning the dataframe before writing to the external output

    writeOptions - array[object]

    Options used when writing output to Solr or other sources

    object attributes:
        key (required): { display name: Parameter Name, type: string }
        value: { display name: Parameter Value, type: string }

    readOptions - array[object]

    Options used when reading input from Solr or other sources.

    object attributes:
        key (required): { display name: Parameter Name, type: string }
        value: { display name: Parameter Value, type: string }

    refTimeRange - integer (required)

    Number of reference days: number of days to use as baseline to find trends (calculated from today)

    targetTimeRange - integer (required)

    Number of target days: number of days to use as target to find trends (calculated from today)

    numWeeksRef - number

    If using filter queries for reference and target time ranges, enter the value of (reference days / target days) here (if not using filter queries, this will be calculated automatically)
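
    For example, with refTimeRange set to 28 and targetTimeRange set to 7, and filter queries supplied for both ranges, this value would be 28 / 7 = 4.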

    sparkPartitions - integer

    Spark will re-partition the input to have this number of partitions. Increase for greater parallelism

    Default: 200

    countField - string (required)

    Field containing the number of times an event (e.g. click) occurs for a particular query; count_i in the raw signal collection or aggr_count_i in the aggregated signal collection.

    >= 1 characters

    Default: aggr_count_i

    referenceTimeFilterQuery - string

    Add a Spark SQL filter query here for greater control of time filtering

    targetFilterTimeQuery - string

    Add a Spark SQL filter query here for greater control of time filtering
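
    For example, a reference filter might be timestamp_tdt >= '2024-01-01' AND timestamp_tdt < '2024-02-01', with the target filter covering a more recent window; the date literals here are purely illustrative.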

    typeField - string (required)

    Enter type field (default is type)

    Default: aggr_type_s

    timeField - string (required)

    Enter time field (default is timestamp_tdt)

    Default: timestamp_tdt

    docIdField - string (required)

    Enter document id field (default is doc_id)

    Default: doc_id_s

    types - string (required)

    Enter a comma-separated list of event types to filter on

    Default: click,add

    recsCount - integer (required)

    Maximum number of recommendations to generate (or -1 for no limit)

    Default: 500

    type - string (required)

    Default: trending-recommender

    Allowed values: trending-recommender