nci/scores

Quantile Interval and Interval Scores Tutorial

Closed this issue · 6 comments

@reza-armuei, @nicholasloveday, @tennlee

I was taking a look at the "Quantile Interval and Interval Scores" tutorial in readthedocs (https://scores.readthedocs.io/en/latest/tutorials/Quantile_Interval_And_Interval_Scores.html). I noticed a few minor formatting and grammar issues.

Rather than commenting on the closed pull request, I thought I would open an issue. (Apologies I didn't manage to have a look at the tutorial while the pull request was still open).

I haven't looked at the tutorial in JupyterLab, so can't comment on which (if any) of these formatting issues are present in JupyterLab (some things render differently in JupyterLab compared to readthedocs). This also means I didn't uncomment any of the code cells. I also didn't review the code cells (as I don't have the domain expertise to do that).

Minor formatting issues/questions I had:

  1. [will review post-merge (Tennessee)] In the body of the text, "quantile interval score" and "interval score" are sometimes italicised and sometimes not italicised. I was a bit confused as to when they were meant to be italicised and when they weren't meant to be, but that might just be my confusion (in which case, please ignore).

  2. [done] Re. the sentence: "The quantile interval score function is defined as sum of three penalties, as follows:". Should that be "the sum of" (adding a "the" before "sum")?

  3. [done] This really doesn't matter, but if it is easy to do, would it be helpful if "So, the total score can be written as:" was in bold text? Aside from the fact this really doesn't matter, I also realise formatting in readthedocs and JupyterLab can be tricky, so if this is hard to do or you would simply prefer not to, I wouldn't worry about it.

  4. [done but need to check under-prediction formulae] Re. the following formulae:
    [image]
    Are these the right way around? Or is "U" meant to be under-prediction penalty, and "O" meant to be over-prediction penalty?

  5. Questions 5a-5c relate to the following sentence:
    [image]
    PLEASE NOTE: for 5a-5c, I didn't know how to format subscript text in a GitHub issue comment - so please ignore my lack of subscript formatting. If you make any of the following changes, please do use subscript text as needed!

5a. [went with 5c] Would the following be clearer: "ql and qu are lower and upper quantile forecasts respectively" (adding "respectively")?

5b. [went with 5c] Would the following be clearer: "αl and αu are lower and upper quantile levels respectively" (adding "respectively")?

5c. Alternatively, would it be clearer to change this sentence to dot point format, e.g. along the lines of:
"where:

  • [done] S is the scoring function (here quantile interval score),
  • [done] ql is the lower quantile forecast
  • [done] qu is the upper quantile forecast
  • [done] y is observation
  • [done] αl is the lower quantile level
  • [done] αu is the upper quantile level."
  6. Questions 6a-6b relate to the sentence: "As you can see, this score penalizes the width of interval as well as the extent to which observation falls outside of the interval forecast (i.e., under-prediction penalty when observation is larger than upper-quantile forecast, and over-prediction penalty when observation is smaller than lower-quantile forecast)."

6a. [done] For the most part, scores documentation has been using Australian/UK English spelling, rather than US English spelling (although I think there are a few US English spellings here and there in the docs). AUS/UK English spelling would be "penalise".

6b. Should any of the following changes be made? (If any/all of the below suggestions would be inaccurate or inappropriate, then please ignore accordingly).

  • [done] "width of the interval" (adding "the")
  • [done] "to which the observation falls" (adding "the") or "to which the observations fall"
  • [done] "under-prediction penalty when the observation is" (adding "the") or "under-prediction penalty when the observations are"
  • [done] "larger than the upper-quantile forecast" (adding "the")
  • [done] "over-prediction penalty when the observation is" (adding "the") or "over-prediction penalty when the observations are"
  • [done] "smaller than the lower-quantile forecast" (adding "the").
  7. [done] Re. the following formulae:
    [image]
    It is not clear to me which one is Over-Prediction Penalty and which is Under-Prediction Penalty.

  8. Questions 8a-8c relate to the following paragraph:
    [image]

8a. [done] Would it be easier to read if Scenario 1 and Scenario 2 were each on new lines?

8b. [done] Do you need the "1." before "Scenario 1"? Do you need "2." before "Scenario 2"?

8c. [maybe] I am a bit confused by the references "(indicating X% cooler)" or "(indicating Y% warmer)". For instance, in Scenario 1, does "indicating 10% cooler" mean there is a 10% chance of the temperature being cooler than 15 degrees Celsius, or does it mean something else? Please note, I don't have a scientific or meteorology background, so my confusion may stem simply from my lack of domain expertise. If the existing wording will be clear to users of scores then no worries.

  9. [done] The following paragraph has a few formatting issues (regarding italicisation, a quotation mark, an asterisk, a lack of a space between two words, and a full stop about half-way through the final sentence).
    [image]

  10. [done] Re. the sentence: "Later we will see how to calculate score for specific dimension(s) (e.g., for each time forecast step) by using preserve_dims or reduce_dims arguments." "calculate score" doesn't seem quite right grammatically. Perhaps "calculate scores"?

  11. [done] Re. the sentence: "As mentioned, we can use both the quantile interval score and the interval score for Scenario 2 due the symmetric quantile range of forecast intervals in this scenario."
    11a. I would suggest: "due to the".
    11b. Would it be clearer to say "we can use either the quantile interval score or the interval score"?

  12. [done] Re. the sentence: "An example scenario could involve assigning weights to regions based on their population when computing the these scores." I am guessing this is meant to be "computing these scores."?
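
For anyone following items 4, 6 and 7 above, here is a rough sketch of how the three penalties fit together for a single forecast and observation. It follows the wording quoted in item 6: the under-prediction penalty applies when the observation is larger than the upper-quantile forecast, and the over-prediction penalty applies when it is smaller than the lower-quantile forecast. The scaling of the two penalties by 1/αl and 1/(1 - αu) is an assumption based on the commonly used form of the score, not something confirmed in this thread. The function name `qis_sketch` and the example numbers are made up for illustration; this is not the `scores` implementation.

```python
def qis_sketch(q_lower, q_upper, obs, alpha_lower, alpha_upper):
    """Illustrative quantile interval score for one forecast/observation pair.

    Hypothetical helper, not part of the scores package. Returns the three
    penalties and their sum in a dict.
    """
    if not 0 < alpha_lower < alpha_upper < 1:
        raise ValueError("need 0 < alpha_lower < alpha_upper < 1")
    if q_lower > q_upper:
        raise ValueError("lower quantile forecast exceeds upper quantile forecast")

    # Penalty for the width of the interval.
    width = q_upper - q_lower
    # Over-prediction: the observation is smaller than the lower-quantile forecast.
    # The 1/alpha_lower scaling is an assumed convention.
    over = (q_lower - obs) / alpha_lower if obs < q_lower else 0.0
    # Under-prediction: the observation is larger than the upper-quantile forecast.
    # The 1/(1 - alpha_upper) scaling is an assumed convention.
    under = (obs - q_upper) / (1 - alpha_upper) if obs > q_upper else 0.0

    return {
        "interval_width_penalty": width,
        "overprediction_penalty": over,
        "underprediction_penalty": under,
        "total": width + over + under,
    }


# Made-up example: a 10%-90% interval forecast of 15 to 25 degrees.
inside = qis_sketch(15.0, 25.0, obs=20.0, alpha_lower=0.1, alpha_upper=0.9)
too_warm = qis_sketch(15.0, 25.0, obs=27.0, alpha_lower=0.1, alpha_upper=0.9)
too_cool = qis_sketch(15.0, 25.0, obs=14.0, alpha_lower=0.1, alpha_upper=0.9)
```

With these inputs, only the width penalty is non-zero when the observation falls inside the interval; an observation above the upper forecast adds an under-prediction penalty, and one below the lower forecast adds an over-prediction penalty.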

[done] As I mentioned in my first comment, I didn't review the code cells. (I don't have a coding background - I only started learning how to use a command line, Git and GitHub in May).

However, I did notice in the code cells you use "overal_qis_sc1", "overal_qis_sc2" and "overal_is_sc2". Would these be better as "overall_qis_sc1" etc. (i.e. "overall" rather than "overal")? As I don't understand the code, I may be incorrect about this. If "overal" is better, then no worries.

[not needed] In the tutorial gallery page (https://scores.readthedocs.io/en/latest/tutorials/Tutorial_Gallery.html#Continuous), the "Quantile Interval Score and Interval Score" thumbnail is located at the end of the continuous section.

Is that the best location for it in the tutorial gallery? If not, should it be between "Quantile Loss" and "Murphy Diagrams", or elsewhere?

As I don't have a background in this area, I don't have an opinion on this. I just wanted to raise the question.

[done] An additional thought: The first level header at the start of the tutorial currently says "Quantile Interval and Interval Scores". I was just reflecting on that, and wonder if "Quantile Interval Score and Interval Score" would be clearer?

Hi @Steph-Chong, thanks so much again for your review. I applied all your comments in #736. My answers to your two questions are as follows:

Re. your question about the location of the thumbnail for this score's tutorial: I think the order does not matter, so I did not change it.

Re. 8c: you are right. It means that there is a 10% chance that the temperature will be cooler than 15 °C in that particular example. I made no change to the text.

Re. 8c - I was reflecting further on this. Even though I did correctly figure out the meaning, I am concerned the wording is not as clear as it could be, and that readers may potentially get confused. As such, I made some suggestions for alternative wording on PR #736. @tennlee has accepted the alternative wording I proposed. @reza-armuei you may want to review what I proposed on PR #736, so that you can make any further adjustments or corrections as you wish. UPDATE: @tennlee has made some further changes again post merge. So if you want to see what it currently says, it is probably better to check what is on the develop branch (https://scores.readthedocs.io/en/latest/tutorials/Quantile_Interval_And_Interval_Score.html).

Everything (bar one item I will raise in a new issue) is resolved by #736