Matrix Decomposition-Based Query-Query Similarity Jobs
Train a collaborative filtering matrix decomposition recommender using SparkML’s Alternating Least Squares (ALS) to batch-compute query-query similarities.
Legacy Product
The ID for this Spark job, used in the API to reference the job.
<= 128 characters
Match pattern: ^[A-Za-z0-9_\-]+$
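The two constraints above (a length limit and a match pattern) can be checked before a job definition is submitted. The following is a minimal sketch, not part of the product; the function name `is_valid_job_id` and the constant names are hypothetical.

```python
import re

# Pattern and length limit copied from the constraints listed above.
JOB_ID_PATTERN = re.compile(r"^[A-Za-z0-9_\-]+$")
MAX_JOB_ID_LENGTH = 128


def is_valid_job_id(job_id: str) -> bool:
    """Return True if job_id satisfies both the length and pattern constraints.

    fullmatch is used so that a trailing newline is rejected as well.
    """
    if len(job_id) > MAX_JOB_ID_LENGTH:
        return False
    return JOB_ID_PATTERN.fullmatch(job_id) is not None


if __name__ == "__main__":
    for candidate in ["query_sim-01", "bad id", ""]:
        print(repr(candidate), is_valid_job_id(candidate))
```

Spaces, empty strings, and any character outside letters, digits, underscore, and hyphen fail the check.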
Identifier for the recommender model. Used as the unique key when storing the model in Solr.
Collection to load and store the computed model (if absent, it won't be loaded or saved)
Whether to save the computed ALS model in Solr
Default: false
Item/Query preference collection (often a signals collection or signals aggregation collection)
Solr query to filter training data (e.g. for downsampling, or for selecting documents by minimum preference value)
Default: *:*
An item must have at least this number of unique users interacting with it to be included in the sample
Default: 2
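The minimum-unique-users threshold can be pictured as a simple pre-filter over the training signals. This is an illustrative sketch in plain Python, assuming signals arrive as `(user_id, item_id)` pairs; the function name `filter_popular_items` and its argument names are hypothetical, and the actual job applies its filtering inside Spark.

```python
from collections import defaultdict


def filter_popular_items(signals, min_unique_users=2):
    """Return the set of item ids seen with at least min_unique_users distinct users.

    signals: iterable of (user_id, item_id) pairs (an assumed input shape).
    """
    users_per_item = defaultdict(set)
    for user_id, item_id in signals:
        # A set ignores repeated interactions by the same user.
        users_per_item[item_id].add(user_id)
    return {
        item_id
        for item_id, users in users_per_item.items()
        if len(users) >= min_unique_users
    }


if __name__ == "__main__":
    sample = [("u1", "a"), ("u2", "a"), ("u1", "b"), ("u1", "b")]
    print(filter_popular_items(sample))
```

In the sample, item `b` is dropped because both of its interactions come from the same user.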
Downsample preferences for items (bounded to at least 2) by this fraction
<= 1
Default: 1
Collection to store batch-computed query/query similarities (if absent, none computed)
Collection to store batch-computed items-for-queries recommendations (if absent, none computed)
Solr field name containing stored queries
Default: query
Solr field name containing stored item ids
Default: item_id_s
Solr field name containing the stored weight (i.e. the time-decayed or position-weighted count) that the item has for that query
Default: weight_d
Batch compute and store this many query similarities per query
Default: 10
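To make the "this many query similarities per query" setting concrete, here is a minimal sketch of a top-N similarity computation over latent factor vectors. It assumes that similarity is measured as cosine similarity between the query factors learned by the decomposition, which the text above does not state; the function names and the `factors` input shape are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_similar_queries(factors, num_sims=10):
    """For each query, return up to num_sims (other_query, score) pairs, best first.

    factors: dict mapping query string -> latent factor vector (assumed shape).
    """
    result = {}
    for query, vec in factors.items():
        scored = [
            (other, cosine(vec, other_vec))
            for other, other_vec in factors.items()
            if other != query  # a query is not its own neighbour
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        result[query] = scored[:num_sims]
    return result


if __name__ == "__main__":
    toy = {"laptop": [1.0, 0.0], "notebook": [0.9, 0.1], "banana": [0.0, 1.0]}
    for query, neighbours in top_similar_queries(toy, num_sims=1).items():
        print(query, neighbours)
```

This brute-force version compares every pair of queries, so it is only suitable for illustrating the output shape on small inputs.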
Batch compute and store this many item recommendations per query
Default: 10
Number of user/item factors in the recommender decomposition (or starting guess for it, if doing parameter grid search)
Default: 100
Maximum number of iterations to use when learning the matrix decomposition
Default: 10
Confidence weight (between 0 and 1) to give the implicit preferences (or starting guess, if doing parameter grid search)
Default: 0.5
Smoothing (regularization) parameter to avoid overfitting (or starting guess, if doing parameter grid search). A slightly larger value is needed for small data sets.
Default: 0.01
Number of steps for the parameter grid search. The search is centered on the initial parameter guesses and uses an exponential step size (if <= 0, no grid search is done)
Default: 1
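A grid "centered around" an initial guess with an "exponential step size" can be sketched as follows. The multiplicative base of 2 is purely an assumption for illustration, since the description does not say which base the job uses; `grid_values` is a hypothetical helper name.

```python
def grid_values(initial_guess, steps, base=2.0):
    """Candidate values centered on initial_guess, spaced by powers of base.

    steps <= 0 disables the search and returns only the initial guess.
    base=2.0 is an assumed value, not taken from the job documentation.
    """
    if steps <= 0:
        return [initial_guess]
    return [initial_guess * base ** k for k in range(-steps, steps + 1)]


if __name__ == "__main__":
    print(grid_values(100, 1))   # candidates around a factor count of 100
    print(grid_values(0.01, 0))  # grid search disabled
```

With one step on each side, every searched parameter contributes three candidates, so the number of trained models grows quickly as more parameters are searched together.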
Seed for the pseudorandom number generator. Keep it constant to get reproducible results
Default: 13
Treat training preferences as implicit signals of interest (i.e. clicks or other actions) as opposed to explicit query ratings
Default: true
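The training settings above correspond closely to parameters of Spark ML's `ALS` estimator. The sketch below shows one plausible mapping using the defaults listed in this section; the mapping itself, the helper names `als_params` and `build_als`, and the column names `query_idx` and `item_idx` are assumptions, not the job's actual implementation. `ALS` requires numeric user and item columns, so string queries and item ids would first need to be indexed (for example with `StringIndexer`).

```python
def als_params(rank=100, max_iter=10, alpha=0.5, reg_param=0.01,
               seed=13, implicit=True):
    """Collect ALS keyword arguments from the defaults listed in this section."""
    return {
        "rank": rank,               # number of user/item factors
        "maxIter": max_iter,        # maximum training iterations
        "alpha": alpha,             # confidence weight for implicit preferences
        "regParam": reg_param,      # smoothing (regularization) parameter
        "seed": seed,               # fixed seed for reproducible results
        "implicitPrefs": implicit,  # treat preferences as implicit signals
    }


def build_als(params, user_col="query_idx", item_col="item_idx",
              rating_col="weight_d"):
    """Build a Spark ML ALS estimator; requires pyspark to be installed.

    Column names are hypothetical; user_col and item_col must be numeric.
    """
    from pyspark.ml.recommendation import ALS  # imported lazily on purpose

    return ALS(userCol=user_col, itemCol=item_col, ratingCol=rating_col,
               **params)


if __name__ == "__main__":
    print(als_params())
```

Calling `build_als(als_params()).fit(training_df)` on a suitably indexed DataFrame would then train a model with these settings.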
If true, re-train even if a model with this modelId already exists
Default: true
Additional Spark DataFrame loading configuration options
Default: query_similarity
Allowed values: query_similarity