Evaluate QnA Pipeline

Evaluate the performance of a Smart Answers pipeline.

See Evaluate a Smart Answers Query Pipeline for configuration instructions.

Configuration properties

Evaluates the performance of a QnA pipeline.

additionalParams - string

Additional query parameters to pass when retrieving results from Fusion. Specify them in dictionary format, e.g. { "rowsFromSolrToRerank": 20, "fq": "type:answer" }
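As a minimal sketch, the dictionary value can be built and serialized in Python so that quoting and commas are always well-formed. The two parameter names are the ones from the example above; the variable names are hypothetical, and whether the field accepts exactly this JSON serialization is an assumption.

```python
import json

# Parameter names taken from the example in the description above.
additional_params = {"rowsFromSolrToRerank": 20, "fq": "type:answer"}

# Serialize to a string suitable for pasting into the additionalParams field
# (assumption: the field accepts a JSON-style dictionary string).
additional_params_value = json.dumps(additional_params)
print(additional_params_value)
```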

appName - string, required

Fusion app where indexed documents or QA pairs live.

collectionName - string, required

Fusion collection where indexed documents or QA pairs live.

doWeightsSelection - boolean

Whether to perform a grid search to find the best combination of weights for the ranking scores used in the query pipeline's Compute Mathematical Expression stage.

Default: true

id - string, required

The ID for this job. Used in the API to reference this job. Allowed characters: a-z, A-Z, dash (-) and underscore (_)

<= 63 characters

Match pattern: [a-zA-Z][_\-a-zA-Z0-9]*[a-zA-Z0-9]?
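The ID constraints above can be checked before submitting a job. The sketch below copies the pattern and the 63-character limit from this reference; the function name is hypothetical, and applying the pattern to the whole string (a full match) is an assumption about how the constraint is enforced.

```python
import re

# Pattern and length limit copied from the property description above.
JOB_ID_PATTERN = re.compile(r"[a-zA-Z][_\-a-zA-Z0-9]*[a-zA-Z0-9]?")
MAX_JOB_ID_LENGTH = 63


def is_valid_job_id(job_id: str) -> bool:
    """Return True if job_id fits the length limit and fully matches the pattern."""
    if len(job_id) > MAX_JOB_ID_LENGTH:
        return False
    return JOB_ID_PATTERN.fullmatch(job_id) is not None


print(is_valid_job_id("qna-eval_1"))   # starts with a letter, allowed characters only
print(is_valid_job_id("1-qna-eval"))   # starts with a digit
```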

inputEvaluationCollection - string, required

Collection from which to pull the labeled data used in evaluation.

>= 1 characters

kList - string

The retrieval positions k at which each metric is computed (for example, recall@1, recall@3, and recall@5 with the default value).

Default: [1,3,5]

matchFieldInFile - string

Field in the evaluation collection that contains the ID or text of the ground truth answer.

Default: answer_id

matchFieldInFusion - string

Field in Fusion that contains the answer ID or text to match against the ground truth answer ID or text in the evaluation collection.

Default: doc_id

metricsList - string

List of metrics to compute during evaluation, e.g. ["recall","precision","map","mrr"]

Default: ["recall"]
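To make the relationship between kList and metricsList concrete, here is a sketch of the textbook definitions of recall, precision, and reciprocal rank at a cutoff k. It is not the job's actual implementation; the function names, the sample IDs, and the choice to truncate reciprocal rank at k are all assumptions for illustration.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant answers that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    top_k = set(ranked_ids[:k])
    return sum(1 for doc_id in relevant_ids if doc_id in top_k) / len(relevant_ids)


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top k results that are relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids) / k


def reciprocal_rank_at_k(ranked_ids, relevant_ids, k):
    """1 / rank of the first relevant result within the top k, else 0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


# Hypothetical ranking returned by a pipeline and its ground truth answer.
ranked = ["a7", "a2", "a9", "a4", "a1"]
relevant = {"a2"}

for k in [1, 3, 5]:  # the default kList
    print(k, recall_at_k(ranked, relevant, k), precision_at_k(ranked, relevant, k))
```

Averaging the per-question reciprocal rank over all test questions gives the mean reciprocal rank (MRR).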

outputEvaluationCollection - string, required

Collection in which to store evaluation results (the recommended collection is job_reports).

>= 1 characters

queryPipelineName - string, required

Name of the QnA query pipeline to evaluate.

rankingScoreField - string

Score to be used for ranking and evaluation

Default: ensemble_score

returnFields - string, required

Comma-separated fields to return from the main collection (e.g. question, answer). The job adds them to the output evaluation.

samplingProportion - number

The proportion of data to be sampled from the full dataset. Use a value between 0 and 1 for a proportion (e.g. 0.5 for 50%), or for a specific number of examples, use an integer larger than 1. Leave blank for no sampling
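The dual meaning of samplingProportion (a fraction or an absolute count) can be sketched as follows. The function name is hypothetical, and the handling of edge cases (a value of exactly 1 treated as 100%, rounding to the nearest whole row, capping at the dataset size) is an assumption, since the description does not specify it.

```python
def resolve_sample_size(sampling_proportion, total_rows):
    """Translate a samplingProportion value into a number of rows to sample."""
    if sampling_proportion is None:
        # Blank value: no sampling, use the full dataset.
        return total_rows
    if sampling_proportion <= 0:
        raise ValueError("samplingProportion must be positive")
    if sampling_proportion <= 1:
        # Interpreted as a proportion of the dataset (assumption: 1 means 100%).
        return round(total_rows * sampling_proportion)
    # Interpreted as an absolute number of examples, capped at the dataset size.
    return min(int(sampling_proportion), total_rows)


print(resolve_sample_size(0.5, 1000))   # half of the dataset
print(resolve_sample_size(200, 1000))   # a fixed number of examples
print(resolve_sample_size(None, 1000))  # no sampling
```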

scoreListForWeights - string

Ranking scores (comma-separated) used for ensemble in the query pipeline's Compute Mathematical Expression stage. The job will perform weights selection for the listed scores

Default: score,vectors_distance

seed - integer

Random seed for sampling

Default: 12345

solrScaleFunc - string

Function used in the pipeline to scale Solr scores, e.g. scale by the maximum Solr score retrieved (max), scale by the base-10 logarithm (log10), or take the square root of the score (pow0.5).

Default: max
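The three scaling options named above can be illustrated with a short sketch. It only mirrors the descriptions (divide by the maximum, base-10 logarithm, square root); the function name is hypothetical, the exact formulas the pipeline applies are an assumption, and the log10 branch assumes strictly positive scores.

```python
import math


def scale_solr_scores(scores, func="max"):
    """Scale a list of Solr scores using one of the documented function names."""
    if func == "max":
        top = max(scores) if scores else 0.0
        # Divide every score by the largest retrieved score.
        return [s / top if top > 0 else 0.0 for s in scores]
    if func == "log10":
        # Assumes strictly positive scores; math.log10 raises ValueError otherwise.
        return [math.log10(s) for s in scores]
    if func == "pow0.5":
        return [math.sqrt(s) for s in scores]
    raise ValueError(f"Unknown scale function: {func}")


print(scale_solr_scores([8.0, 4.0, 2.0], "max"))
print(scale_solr_scores([16.0, 9.0], "pow0.5"))
```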

sparkConfig - array[object]

Provide additional key/value pairs to be injected into the training JSON map at runtime. Values will be inserted as-is, so use " to surround string values

object attributes:

key (required)
display name: Parameter Name
type: string

value
display name: Parameter Value
type: string

targetRankingMetric - string

Target ranking metric to optimize during weights selection

Default: recall@3
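The idea behind weights selection can be sketched as a small grid search: try each combination of weights for the listed scores, rank candidates by the weighted sum, and keep the combination that maximizes the target metric. This is an illustration under stated assumptions, not the job's implementation: the grid values, the data layout, the function name, and the convention that a higher value is better for every score are all hypothetical.

```python
from itertools import product


def select_weights(queries, score_names, k=3, grid=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Return the weights that maximize recall@k, plus the best recall@k."""
    best_weights, best_recall = None, -1.0
    for weights in product(grid, repeat=len(score_names)):
        if sum(weights) == 0:
            continue  # an all-zero combination cannot rank anything
        hits = 0
        for query in queries:
            # Rank candidates by the weighted sum of their scores, best first.
            ranked = sorted(
                query["candidates"],
                key=lambda c: sum(w * c[n] for w, n in zip(weights, score_names)),
                reverse=True,
            )
            top_ids = [c["doc_id"] for c in ranked[:k]]
            hits += 1 if query["answer_id"] in top_ids else 0
        recall = hits / len(queries)
        if recall > best_recall:
            best_weights, best_recall = weights, recall
    return dict(zip(score_names, best_weights)), best_recall


# Hypothetical labeled query with two candidates; the correct answer has a
# low "score" but a high "vectors_distance".
queries = [
    {
        "answer_id": "a1",
        "candidates": [
            {"doc_id": "a1", "score": 0.2, "vectors_distance": 0.9},
            {"doc_id": "a2", "score": 0.9, "vectors_distance": 0.1},
        ],
    }
]

weights, recall = select_weights(queries, ["score", "vectors_distance"], k=1)
print(weights, recall)
```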

testQuestionFieldInFile - string

Defines the field in the collection containing the test question

Default: question

type - string, required

Default: argo-qna-evaluate

Allowed values: argo-qna-evaluate
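Putting the properties together, a job configuration might be assembled as in the sketch below. Only the property names, the type value, and the defaults come from this reference; every other value (IDs, app, collection, and pipeline names) is a hypothetical placeholder, and the check for required properties simply mirrors the properties marked as required above.

```python
import json

# Hypothetical placeholder values; replace them with names from your own app.
job_config = {
    "type": "argo-qna-evaluate",
    "id": "qna-pipeline-eval",
    "appName": "my_app",
    "collectionName": "my_collection",
    "queryPipelineName": "my_qna_pipeline",
    "inputEvaluationCollection": "my_collection_eval",
    "outputEvaluationCollection": "job_reports",
    "returnFields": "question,answer",
    # kList and metricsList are string-typed properties, so lists are quoted.
    "kList": "[1,3,5]",
    "metricsList": '["recall","mrr"]',
    "doWeightsSelection": True,
    "scoreListForWeights": "score,vectors_distance",
    "targetRankingMetric": "recall@3",
}

# Properties marked as required in this reference.
REQUIRED = [
    "type", "id", "appName", "collectionName", "queryPipelineName",
    "inputEvaluationCollection", "outputEvaluationCollection", "returnFields",
]

missing = [name for name in REQUIRED if not job_config.get(name)]
if missing:
    raise ValueError(f"Missing required properties: {missing}")

job_config_json = json.dumps(job_config, indent=2)
print(job_config_json)
```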