Legacy Product

Fusion 5.10

    Query Rewrite Jobs Post-processing Cleanup

    The Synonym Detection job uses the output of the Misspelling Detection job and Phrase Extraction job.

    Therefore, post-processing must occur in the order specified in this topic: the synonym detection job cleanup first, then the phrase extraction job cleanup, then the misspelling detection job cleanup.

    The Head-Tail Analysis job cleanup can occur in any order.
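    The ordering rule above can be expressed as a small check. The following Python sketch is illustrative only and is not part of Fusion; the step labels and the function name are hypothetical names chosen for this example.

```python
# Hypothetical labels for the cleanup procedures described in this topic.
REQUIRED_ORDER = ["synonym_cleanup", "phrase_cleanup", "misspelling_cleanup"]
ANY_TIME = {"head_tail_cleanup"}  # no ordering constraint


def is_valid_cleanup_order(steps):
    """Return True if the order-sensitive cleanups appear in the required order."""
    unknown = [s for s in steps if s not in REQUIRED_ORDER and s not in ANY_TIME]
    if unknown:
        raise ValueError(f"unknown cleanup step(s): {unknown}")
    # Ignore the unconstrained steps, then check that the rest never go backwards.
    positions = [REQUIRED_ORDER.index(s) for s in steps if s in REQUIRED_ORDER]
    return positions == sorted(positions)
```

    For example, a plan that runs head_tail_cleanup first and then the three ordered cleanups passes the check, while a plan that runs misspelling_cleanup before synonym_cleanup does not.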

    Synonym detection job cleanup

    Use this job to remove low confidence synonyms.

    Prerequisites

    Complete this cleanup:

    • AFTER the Misspelling Detection and Phrase Extraction jobs have successfully completed.

    • BEFORE removing the low confidence suggestions in the phrase extraction cleanup and misspelling detection cleanup procedures detailed later in this topic.

    Remove low confidence synonym suggestions

    Use either the Synonym cleanup method 1 - API call or the Synonym cleanup method 2 - Fusion Admin UI to remove low confidence synonym suggestions.

    Synonym cleanup method 1 - API call

    1. Open the delete_lowConf_synonyms.json file.

    {
      "type" : "rest-call",
      "id" : "DC_Large_QR_DELETE_LOW_CONFIDENCE_SYNONYMS",
      "callParams" : {
        "uri" : "solr://DC_Large_query_rewrite_staging/update",
        "method" : "post",
        "queryParams" : {
          "wt" : "json"
        },
        "headers" : { },
        "entity" : "<root><delete><query>type:synonym AND confidence:[0 TO 0.0005]</query></delete><commit/></root>"
      }
    }

    2. Enter solr://<your query_rewrite_staging collection name>/update in the uri field. For an app called DC_Large, the URI value would be solr://DC_Large_query_rewrite_staging/update.

    3. Change the id field if applicable.

    4. Specify the upper confidence level in the entity field.

      The entity field specifies the threshold for low confidence synonyms. Edit the upper range from 0.0005 to increase or decrease the threshold based on your data.
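    The job definition above can also be generated rather than edited by hand. The following Python sketch is one possible way to do that, not an official Lucidworks tool; the function name and its parameters are hypothetical, and it assumes the staging collection is named <app name>_query_rewrite_staging, as in the DC_Large example.

```python
import json


def build_cleanup_job(app_name, job_id, suggestion_type, upper_confidence):
    """Build a rest-call job definition that deletes low confidence suggestions.

    upper_confidence is passed as a string (for example "0.0005") so that it
    is written into the query exactly as typed.
    """
    query = f"type:{suggestion_type} AND confidence:[0 TO {upper_confidence}]"
    entity = f"<root><delete><query>{query}</query></delete><commit/></root>"
    return {
        "type": "rest-call",
        "id": job_id,
        "callParams": {
            # Assumption: the collection name follows the DC_Large example.
            "uri": f"solr://{app_name}_query_rewrite_staging/update",
            "method": "post",
            "queryParams": {"wt": "json"},
            "headers": {},
            "entity": entity,
        },
    }


job = build_cleanup_job(
    app_name="DC_Large",
    job_id="DC_Large_QR_DELETE_LOW_CONFIDENCE_SYNONYMS",
    suggestion_type="synonym",
    upper_confidence="0.0005",
)
print(json.dumps(job, indent=2))
```

    Because the definition is built as a dictionary, each key can appear only once, which avoids accidentally repeating a field such as type.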

    Synonym cleanup method 2 - Fusion Admin UI

    1. Log in to Fusion and select Collections > Jobs.

    2. Select Add+ > Custom and Other Jobs > REST Call.

    3. Enter delete-low-confidence-synonyms in the ID field.

    4. Enter <your query_rewrite_staging collection name>/update in the ENDPOINT URI field. An example URI value for an app called DC_Large would be DC_Large_query_rewrite_staging/update.

    5. Enter POST in the CALL METHOD field.

    6. In the QUERY PARAMETERS section, select + to add a property.

    7. Enter wt in the Property Name field.

    8. Enter json in the Property Value field.

    9. In the REQUEST PROTOCOL HEADERS section, select + to add a property.

    10. Enter the following as a REQUEST ENTITY (AS STRING):

      <root><delete><query>type:synonym AND confidence:[0 TO 0.0005]</query></delete><commit/></root>

      REQUEST ENTITY specifies the threshold for low confidence synonyms. Edit the upper range from 0.0005 to increase or decrease the threshold based on your data.

    Delete all synonym suggestions

    To delete all of the synonym suggestions, enter the following in the REQUEST ENTITY section:

    <root><delete><query>type:synonym</query></delete><commit/></root>

    This entry may be helpful when tuning the synonym detection job and testing different configuration parameters.

    Phrase extraction job cleanup

    Use this job to remove low confidence phrase suggestions.

    Prerequisites

    Complete this cleanup:

    • AFTER the synonym detection job cleanup.

    • BEFORE the misspelling detection job cleanup.

    Remove low confidence phrase suggestions

    Use either the Phrase cleanup method 1 - API call or the Phrase cleanup method 2 - Fusion Admin UI to remove low confidence phrase suggestions.

    Phrase cleanup method 1 - API call

    1. Open the delete_lowConf_phrases.json file.

    {
      "type" : "rest-call",
      "id" : "DC_Large_QR_DELETE_LOW_CONFIDENCE_PHRASES",
      "callParams" : {
        "uri" : "solr://DC_Large_query_rewrite_staging/update",
        "method" : "post",
        "queryParams" : {
          "wt" : "json"
        },
        "headers" : { },
        "entity" : "<root><delete><query>type:phrase AND confidence:[0 TO <INSERT VALUE HERE>]</query></delete><commit/></root>"
      }
    }

    2. Enter solr://<your query_rewrite_staging collection name>/update in the uri field. For an app called DC_Large, the URI value would be solr://DC_Large_query_rewrite_staging/update.

    3. Change the id field if applicable.

    4. Specify the upper confidence level in the entity field.

      The entity field specifies the threshold for low confidence phrases. Edit the upper range to increase or decrease the threshold based on your data.
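    Because the phrase example ships with a placeholder instead of a number, it can help to check the entity string before saving the job. The following Python sketch is a hypothetical helper written for this topic, not part of Fusion; it only inspects the text of the entity and does not contact Solr.

```python
import re

# Matches the upper bound in a range such as "confidence:[0 TO 0.0005]".
RANGE_PATTERN = re.compile(r"confidence:\s*\[0 TO ([^\]]+)\]")


def check_entity(entity):
    """Return a list of problems found in a delete-by-query entity string."""
    match = RANGE_PATTERN.search(entity)
    if match is None:
        return ["no confidence range found"]
    upper = match.group(1).strip()
    try:
        value = float(upper)
    except ValueError:
        # Catches an unreplaced placeholder such as <INSERT VALUE HERE>.
        return [f"upper bound is not a number: {upper!r}"]
    if value < 0:
        return ["upper bound is below the lower bound of 0"]
    return []
```

    An empty list means the range looks usable; an entity that still contains the placeholder is reported as having a non-numeric upper bound.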

    Phrase cleanup method 2 - Fusion Admin UI

    1. Log in to Fusion and select Collections > Jobs.

    2. Select Add+ > Custom and Other Jobs > REST Call.

    3. Enter remove-low-confidence-phrases in the ID field.

    4. Enter <your query_rewrite_staging collection name>/update in the ENDPOINT URI field. An example URI value for an app called DC_Large would be DC_Large_query_rewrite_staging/update.

    5. Enter POST in the CALL METHOD field.

    6. In the QUERY PARAMETERS section, select + to add a property.

    7. Enter wt in the Property Name field.

    8. Enter json in the Property Value field.

    9. In the REQUEST PROTOCOL HEADERS section, select + to add a property.

    10. Enter the following as a REQUEST ENTITY (AS STRING):

      <root><delete><query>type:phrase AND confidence:[0 TO <insert value>]</query></delete><commit/></root>

      REQUEST ENTITY specifies the threshold for low confidence phrases. Edit the upper range to increase or decrease the threshold based on your data.

    Delete all phrase suggestions

    To delete all of the phrase suggestions, enter the following in the REQUEST ENTITY section:

    <root><delete><query>type:phrase</query></delete><commit/></root>

    This entry may be helpful when tuning the phrase extraction job and testing different configuration parameters.

    Misspelling detection job cleanup

    Use this job to remove low confidence spellings (also referred to as misspellings).

    Remove misspelling suggestions

    Use either the Misspelling cleanup method 1 - API call or the Misspelling cleanup method 2 - Fusion Admin UI to remove misspelling suggestions.

    Misspelling cleanup method 1 - API call

    1. Open the delete_lowConf_misspellings.json file.

    {
      "type" : "rest-call",
      "id" : "DC_Large_QR_DELETE_LOW_CONFIDENCE_MISSPELLINGS",
      "callParams" : {
        "uri" : "solr://DC_Large_query_rewrite_staging/update",
        "method" : "post",
        "queryParams" : {
          "wt" : "json"
        },
        "headers" : { },
        "entity" : "<root><delete><query>type:spell AND confidence:[0 TO 0.5]</query></delete><commit/></root>"
      }
    }

    2. Enter solr://<your query_rewrite_staging collection name>/update in the uri field. For an app called DC_Large, the URI value would be solr://DC_Large_query_rewrite_staging/update.

    3. Change the id field if applicable.

    4. Specify the upper confidence level in the entity field.

      The entity field specifies the threshold for low confidence spellings. Edit the upper range to increase or decrease the threshold based on your data.
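    When choosing a threshold, it may help to see how many suggestions a given upper bound would match before deleting anything. The following Python sketch builds a Solr select URL that uses the same query as the cleanup job with rows set to 0, so that only the match count is returned. The base URL is a hypothetical placeholder and the function name is invented for this example; how Solr is reached depends on your deployment.

```python
from urllib.parse import urlencode

# Hypothetical base URL; replace it with however you reach Solr in your deployment.
SOLR_BASE_URL = "http://localhost:8983/solr"


def preview_url(collection, suggestion_type, upper_confidence):
    """Build a select URL that counts what a cleanup query would match."""
    params = {
        "q": f"type:{suggestion_type} AND confidence:[0 TO {upper_confidence}]",
        "rows": 0,  # only the match count (numFound) is needed
        "wt": "json",
    }
    return f"{SOLR_BASE_URL}/{collection}/select?{urlencode(params)}"


print(preview_url("DC_Large_query_rewrite_staging", "spell", "0.5"))
```

    The sketch only builds the URL and does not send a request. The numFound value in the response to such a request indicates how many documents the corresponding delete query would remove.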

    Misspelling cleanup method 2 - Fusion Admin UI

    1. Log in to Fusion and select Collections > Jobs.

    2. Select Add+ > Custom and Other Jobs > REST Call.

    3. Enter remove-low-confidence-spellings in the ID field.

    4. Enter <your query_rewrite_staging collection name>/update in the ENDPOINT URI field. An example URI value for an app called DC_Large would be DC_Large_query_rewrite_staging/update.

    5. Enter POST in the CALL METHOD field.

    6. In the QUERY PARAMETERS section, select + to add a property.

    7. Enter wt in the Property Name field.

    8. Enter json in the Property Value field.

    9. In the REQUEST PROTOCOL HEADERS section, select + to add a property.

    10. Enter the following as a REQUEST ENTITY (AS STRING):

      <root><delete><query>type:spell AND confidence:[0 TO 0.5]</query></delete><commit/></root>

      REQUEST ENTITY specifies the threshold for low confidence spellings. Edit the upper range from 0.5 to increase or decrease the threshold based on your data.

    Delete all misspelling suggestions

    To delete all of the misspelling suggestions, enter the following in the REQUEST ENTITY section:

    <root><delete><query>type:spell</query></delete><commit/></root>

    This entry may be helpful when tuning the misspelling detection job and testing different configuration parameters.

    Head-tail analysis job cleanup

    The head-tail analysis job puts tail queries into one of multiple reason categories. For example, a tail query that includes a number might be assigned to the 'numbers' reason category. If the output in a particular category is not useful, you can remove it from the results. The examples in this section remove the numbers category.

    Prerequisites

    The head-tail analysis job cleanup does not have to occur in a specific order.

    Remove head-tail analysis query suggestions

    Head-tail analysis cleanup method 1 - API call

    1. Open the delete_lowConf_headTail.json file.

    {
      "type" : "rest-call",
      "id" : "DC_Large_QR_HEAD_TAIL_CLEANUP",
      "callParams" : {
        "uri" : "solr://DC_Large_query_rewrite_staging/update",
        "method" : "post",
        "queryParams" : {
          "wt" : "json"
        },
        "headers" : { },
        "entity" : "<root><delete><query>reason_code_s:(\"number\" \"number spelling\" \"number rare-term\" \"question number other-specific\" \"number others\" \"number other-specific\" \"number other-extra\" \"product number other-specific\" \"product number other-extra\" \"product number spelling\" \"product number others\" \"product number rare-term\" \"product question number\" \"product number re-wording\" \"question number other-extra\" \"number re-wording\")</query></delete><commit/></root>"
      }
    }

    2. Enter solr://<your query_rewrite_staging collection name>/update in the uri field. For an app called DC_Large, the URI value would be solr://DC_Large_query_rewrite_staging/update.

    3. Change the id field if applicable.
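    The long reason_code_s query is easier to maintain as a list of reason codes. The following Python sketch shows one way to assemble the entity from such a list; the function name is hypothetical, and the three codes shown are only a subset of the example above, chosen to keep the sketch short.

```python
import json

# Reason categories to remove; this subset is taken from the example above.
REASON_CODES = ["number", "number spelling", "number rare-term"]


def build_reason_code_entity(reason_codes):
    """Build a delete-by-query entity for the given reason_code_s values."""
    # Each code is wrapped in double quotes so multi-word codes stay together.
    quoted = " ".join(f'"{code}"' for code in reason_codes)
    return (
        f"<root><delete><query>reason_code_s:({quoted})</query></delete>"
        "<commit/></root>"
    )


entity = build_reason_code_entity(REASON_CODES)
# json.dumps escapes the inner double quotes, as in the job file.
print(json.dumps({"entity": entity}))
```

    Serializing with json.dumps produces the backslash-escaped quotes that the JSON file requires, while the unescaped string is the form to enter in the Fusion Admin UI.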

    Head-tail analysis cleanup method 2 - Fusion Admin UI

    1. Log in to Fusion and select Collections > Jobs.

    2. Select Add+ > Custom and Other Jobs > REST Call.

    3. Enter remove-low-confidence-head-tail in the ID field.

    4. Enter <your query_rewrite_staging collection name>/update in the ENDPOINT URI field. An example URI value for an app called DC_Large would be DC_Large_query_rewrite_staging/update.

    5. Enter POST in the CALL METHOD field.

    6. In the QUERY PARAMETERS section, select + to add a property.

    7. Enter wt in the Property Name field.

    8. Enter json in the Property Value field.

    9. In the REQUEST PROTOCOL HEADERS section, select + to add a property.

    10. Enter the following as a REQUEST ENTITY (AS STRING):

      <root><delete><query>reason_code_s:("number" "number spelling" "number rare-term" "question number other-specific" "number others" "number other-specific" "number other-extra" "product number other-specific" "product number other-extra" "product number spelling" "product number others" "product number rare-term" "product question number" "product number re-wording" "question number other-extra" "number re-wording")</query></delete><commit/></root>

    Delete all head-tail suggestions

    To delete all of the head-tail suggestions, enter the following in the REQUEST ENTITY section:

    <root><delete><query>type:tail</query></delete><commit/></root>

    This entry may be helpful when tuning the head-tail job and testing different configuration parameters.