Query Performance Prediction through Retrieval Coherency

Query Performance Prediction (QPP) is the task of estimating how well the retrieved results satisfy the information need behind a query. While most QPP methods focus on the similarity between the query and the retrieved documents and on retrieval scores, in this work we propose a host of post-retrieval QPP metrics based on the associations among the retrieved documents. We empirically study whether commonly used QPP baselines can be improved using document associations.

  • Post-retrieval QPP baselines include:
    1. WIG
    2. Clarity
    3. QF
    4. NQC
    5. σk (known as SD)
    6. n(σx%) (known as ISD)
    7. SMV
  • Document association metrics are the following features computed on the Document Association Network:
    1. ACC : Average Clustering Coefficient
    2. ADC : Average Degree Connectivity
    3. AND : Average Neighbour Degree
    4. D : Density
    5. WACC : Weighted Average Clustering Coefficient
    6. WADC : Weighted Average Degree Connectivity
    7. WAND : Weighted Average Neighbour Degree
    8. WD : Weighted Density
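
As a rough illustration of what these features measure, the sketch below computes four of them (ACC, AND, D and WD) on a small undirected graph whose nodes are retrieved documents and whose edge weights are association strengths. The function name `association_features`, the edge-dictionary input format, the example weights, and the definition of weighted density as the sum of edge weights divided by the number of possible edges are all assumptions made for illustration, not the repository's implementation. The unweighted definitions follow standard graph theory.

```python
from itertools import combinations


def association_features(nodes, edges):
    """Compute ACC, AND, D and WD for an undirected weighted graph.

    nodes: iterable of document ids.
    edges: dict mapping (u, v) pairs to an association weight (hypothetical format).
    """
    nodes = list(nodes)
    n = len(nodes)
    neighbours = {v: set() for v in nodes}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    possible = n * (n - 1) / 2  # number of possible undirected edges
    density = len(edges) / possible if possible else 0.0
    # Assumption: weighted density = total edge weight / possible edges.
    weighted_density = sum(edges.values()) / possible if possible else 0.0

    def clustering(v):
        # Fraction of neighbour pairs of v that are themselves connected.
        k = len(neighbours[v])
        if k < 2:
            return 0.0
        links = sum(1 for a, b in combinations(neighbours[v], 2)
                    if b in neighbours[a])
        return links / (k * (k - 1) / 2)

    def neighbour_degree(v):
        # Mean degree of the neighbours of v.
        k = len(neighbours[v])
        if k == 0:
            return 0.0
        return sum(len(neighbours[u]) for u in neighbours[v]) / k

    acc = sum(clustering(v) for v in nodes) / n if n else 0.0
    avg_nd = sum(neighbour_degree(v) for v in nodes) / n if n else 0.0
    return {"ACC": acc, "AND": avg_nd, "D": density, "WD": weighted_density}


# Hypothetical network over four retrieved documents.
docs = ["d1", "d2", "d3", "d4"]
links = {("d1", "d2"): 0.9, ("d2", "d3"): 0.5,
         ("d1", "d3"): 0.7, ("d3", "d4"): 0.2}
features = association_features(docs, links)
print(features)
```

ADC and the remaining weighted variants are left out because their aggregation into a single number depends on definitions that the list above does not spell out.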

Results can be found in the results directory and can be reproduced as follows:

evaluate.py [-h] [-c corpus] [-b baseline] [-d density_metric]
  • choose corpus from ['rb04', 'gov2', 'cw09']
  • choose QPP baseline from ['WIG', 'Clarity', 'NQC', 'QF', 'ISD', 'SD', 'SMV']
  • choose density_metric from ['ACC', 'WACC', 'ADC', 'WADC', 'AND', 'WAND', 'D', 'WD']

for example:

python evaluate.py -c rb04 -b WIG -d WD

will have the following output:

Corpus=rb04
Baseline= WIG
Density=WD
Pearson rho= 0.568209855531196   Kendall tau 0.4222981366459627
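
In QPP evaluation, correlations like these are typically computed per corpus between the predicted score of each query and its actual retrieval effectiveness. A minimal pure-Python sketch of both coefficients follows. The function names and sample values are hypothetical, and the Kendall variant shown is tau-a (no tie correction), which may differ from the variant evaluate.py uses.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / pairs


# Hypothetical per-query values: predictor output vs. measured effectiveness.
predicted = [0.2, 0.5, 0.1, 0.9]
actual = [0.10, 0.40, 0.30, 0.80]
print("Pearson rho =", pearson(predicted, actual))
print("Kendall tau =", kendall_tau_a(predicted, actual))
```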

The table below shows the Pearson rho and Kendall tau correlations of the standalone baselines and of the baselines linearly interpolated with our host of document association metrics, respectively. The results show that considering the relationships between the retrieved documents can boost the performance of QPP methods.

                                    Pearson Rho                      Kendall Tau
QPP baseline  Document association  Robust 04  ClueWeb09  GOV2       Robust 04  ClueWeb09  GOV2

WIG           ACC                   0.54       0.31       0.51       0.38       0.23       0.36
WIG           WACC                  0.52       0.36       0.46       0.36       0.24       0.31
WIG           ADC                   0.49       0.29       0.58       0.34       0.21       0.39
WIG           WADC                  0.52       0.33       0.54       0.39       0.22       0.46
WIG           AND                   0.52       0.24       0.59       0.38       0.16       0.46
WIG           WAND                  0.55       0.37       0.55       0.37       0.25       0.42
WIG           D                     0.50       0.29       0.55       0.35       0.19       0.41
WIG           WD                    0.57       0.41       0.55       0.42       0.24       0.42

Clarity       ACC                   0.54       0.30       0.45       0.39       0.22       0.32
Clarity       WACC                  0.53       0.38       0.46       0.38       0.19       0.33
Clarity       ADC                   0.51       0.39       0.44       0.39       0.27       0.32
Clarity       WADC                  0.52       0.32       0.46       0.38       0.21       0.30
Clarity       AND                   0.54       0.35       0.46       0.37       0.20       0.29
Clarity       WAND                  0.54       0.33       0.45       0.36       0.17       0.28
Clarity       D                     0.52       0.32       0.45       0.39       0.25       0.32
Clarity       WD                    0.51       0.35       0.49       0.39       0.21       0.34

QF            ACC                   0.39       0.23       0.43       0.32       0.14       0.30
QF            WACC                  0.43       0.34       0.34       0.34       0.21       0.27
QF            ADC                   0.43       0.19       0.55       0.35       0.17       0.38
QF            WADC                  0.42       0.20       0.47       0.35       0.14       0.36
QF            AND                   0.42       0.25       0.51       0.32       0.17       0.37
QF            WAND                  0.41       0.28       0.55       0.33       0.17       0.42
QF            D                     0.40       0.28       0.50       0.31       0.15       0.36
QF            WD                    0.42       0.39       0.47       0.29       0.24       0.31

NQC           ACC                   0.48       0.21       0.35       0.35       0.17       0.27
NQC           WACC                  0.53       0.32       0.39       0.37       0.25       0.30
NQC           ADC                   0.52       0.19       0.41       0.37       0.18       0.29
NQC           WADC                  0.47       0.24       0.38       0.37       0.19       0.24
NQC           AND                   0.49       0.19       0.51       0.35       0.15       0.36
NQC           WAND                  0.47       0.21       0.43       0.34       0.15       0.30
NQC           D                     0.49       0.24       0.43       0.36       0.15       0.27
NQC           WD                    0.55       0.36       0.46       0.38       0.22       0.39

SD            ACC                   0.49       0.18       0.39       0.34       0.12       0.28
SD            WACC                  0.48       0.34       0.38       0.36       0.16       0.27
SD            ADC                   0.50       0.25       0.44       0.35       0.15       0.34
SD            WADC                  0.55       0.21       0.50       0.38       0.18       0.33
SD            AND                   0.50       0.31       0.48       0.33       0.23       0.35
SD            WAND                  0.47       0.26       0.43       0.39       0.18       0.30
SD            D                     0.52       0.30       0.47       0.37       0.19       0.34
SD            WD                    0.53       0.33       0.50       0.41       0.22       0.36

SMV           ACC                   0.50       0.29       0.31       0.29       0.22       0.21
SMV           WACC                  0.49       0.30       0.41       0.34       0.17       0.29
SMV           ADC                   0.52       0.18       0.49       0.38       0.16       0.34
SMV           WADC                  0.53       0.30       0.45       0.36       0.22       0.37
SMV           AND                   0.47       0.26       0.50       0.31       0.13       0.35
SMV           WAND                  0.50       0.22       0.49       0.38       0.13       0.39
SMV           D                     0.50       0.18       0.49       0.37       0.13       0.34
SMV           WD                    0.54       0.33       0.55       0.39       0.29       0.37

ISD           ACC                   0.56       0.25       0.47       0.41       0.21       0.33
ISD           WACC                  0.55       0.40       0.43       0.34       0.25       0.33
ISD           ADC                   0.56       0.35       0.57       0.39       0.21       0.38
ISD           WADC                  0.54       0.35       0.55       0.36       0.21       0.43
ISD           AND                   0.57       0.26       0.54       0.38       0.19       0.42
ISD           WAND                  0.60       0.32       0.62       0.41       0.19       0.47
ISD           D                     0.54       0.35       0.59       0.38       0.19       0.42
ISD           WD                    0.57       0.43       0.58       0.38       0.27       0.45
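
The linear interpolation of a baseline with a document association metric can be sketched as follows. The interpolation weight `lam`, the min-max normalisation step, and the function names are assumptions for illustration; the exact formulation and weight used to produce the reported results are not stated here.

```python
def min_max(values):
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def interpolate(baseline_scores, association_scores, lam=0.5):
    """Per-query score: lam * baseline + (1 - lam) * association.

    Assumption: both inputs are normalised first so that their scales
    are comparable; lam in [0, 1] is a hypothetical tuning parameter.
    """
    if len(baseline_scores) != len(association_scores):
        raise ValueError("both score lists must cover the same queries")
    b = min_max(baseline_scores)
    a = min_max(association_scores)
    return [lam * bi + (1 - lam) * ai for bi, ai in zip(b, a)]


# Hypothetical scores for three queries.
wig = [1.0, 2.0, 3.0]
wd = [10.0, 30.0, 20.0]
print(interpolate(wig, wd, lam=0.5))
```

With `lam=1.0` the combined score reduces to the normalised baseline alone, which gives a simple way to compare a standalone baseline against its interpolated version.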