Talk:f-score

This article is within the scope of WikiProject Linguistics, a collaborative effort to improve the coverage of linguistics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has not yet received a rating on the project's importance scale.
This article is supported by Applied Linguistics Task Force.

Statistics Low‑importance

This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Low-importance on the importance scale.

This stub almost completely duplicates Information_retrieval#F-measure. Arguably, the two should be merged.

I think that recall should not be described jointly with the F1 score. The redirect link from Recall to F1 score should be suppressed.

Accordingly, I have changed the redirect from Recall: instead of pointing to "F1 score", it now points to "Information Retrieval", which has a section describing recall.

—

The sentence "Two other commonly used F measures are the $F_{2}$ measure, which weights recall twice as much as precision, and the $F_{0.5}$ measure, which weights precision twice as much as recall." is wrong. It's easy to see that $\beta =2$ weights recall four times as much as precision (q.v. the German article). 141.89.52.220 (talk) 12:29, 19 August 2009 (UTC)

I do not agree (anymore). Neither with the fact that $\beta >1$ puts more weight on precision, nor with the factor (twice, four times, ...). Perhaps it would be clearer if the formula were written

$${\frac {1}{F_{\beta }}}={\frac {1}{1+\beta ^{2}}}\left(\beta ^{2}\cdot {\frac {1}{\mathrm {recall} }}+1\cdot {\frac {1}{\mathrm {precision} }}\right).$$

Here, we can see that the reciprocal of recall is weighted $\beta ^{2}$ times the reciprocal of precision. Have a look at a visual illustration of the situation. It shows that for $\beta >1$, the gradient is more vertical for large parts of the plot, which means that recall is more important there. -- Jonas Wagner (talk) 18:23, 3 March 2011 (UTC)

I think you meant to say the lines for F2 are more horizontal, not vertical. Being horizontal means the value of precision matters less, which agrees with your core claim. (Preceding unsigned comment added by 78.22.80.252 (talk) 15:03, 23 August 2012 (UTC))
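For anyone who wants to check the reciprocal form numerically, here is a minimal Python sketch. The function names `f_beta` and `f_beta_reciprocal` and the sample values 0.5 and 0.8 are chosen purely for illustration and do not come from the article.

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """Usual form: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def f_beta_reciprocal(precision: float, recall: float, beta: float) -> float:
    """Reciprocal form from the comment above: 1/F is a weighted mean of 1/R and 1/P."""
    b2 = beta ** 2
    inverse = (b2 / recall + 1 / precision) / (1 + b2)
    return 1 / inverse


# The two forms agree wherever precision and recall are both nonzero.
for beta in (0.5, 1.0, 2.0):
    assert abs(f_beta(0.5, 0.8, beta) - f_beta_reciprocal(0.5, 0.8, beta)) < 1e-12

# With beta = 2, the same pair of values scores differently depending on
# which of the two metrics holds the larger value.
high_recall = f_beta(0.5, 0.8, 2.0)      # precision 0.5, recall 0.8
high_precision = f_beta(0.8, 0.5, 2.0)   # precision 0.8, recall 0.5
print(high_recall, high_precision)
```

With these illustrative values the high-recall case scores higher under F2, which is consistent with the claim that recall matters more when $\beta >1$.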

This statement, " $F_{\beta }$ measures the effectiveness of retrieval with respect to a user who attaches β times as much importance to recall as precision", as quoted directly from van Rijsbergen's book (linked from this article), appears to be in error. As per analysis above and also Chapter 8 of Manning et al.'s IR book (see here), I believe the "β times as much importance" part should read "β² times as much importance" instead. -- unkx80 (talk) 15:19, 18 August 2013 (UTC)
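A short calculation may help show how both readings ("β times" and "β² times") can arise. The sketch below writes $P$ for precision and $R$ for recall, a shorthand introduced here for brevity. It is offered as one possible reconciliation, not as a statement of what either source intends.

$$F_{\beta }={\frac {(1+\beta ^{2})\,P\,R}{\beta ^{2}P+R}}$$

$${\frac {\partial F_{\beta }}{\partial P}}={\frac {(1+\beta ^{2})\,R^{2}}{(\beta ^{2}P+R)^{2}}},\qquad {\frac {\partial F_{\beta }}{\partial R}}={\frac {(1+\beta ^{2})\,\beta ^{2}P^{2}}{(\beta ^{2}P+R)^{2}}}$$

$${\frac {\partial F_{\beta }}{\partial P}}={\frac {\partial F_{\beta }}{\partial R}}\iff R^{2}=\beta ^{2}P^{2}\iff {\frac {R}{P}}=\beta$$

So the weight attached to the reciprocal of recall in the formula is $\beta ^{2}$, while the point at which a small gain in precision and a small gain in recall change the score equally is $R/P=\beta$. Which of these counts as "importance" depends on the definition one adopts.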

—

What is the point of showing the Diagnostic Testing Diagram? F-score does not even appear in it. And it is presented with no discussion or a link back to Confusion_matrix. 70.166.151.52 (talk) 16:52, 5 April 2017 (UTC)

On the contrary, I clicked on the Talk tab to come here and comment on how exceptionally clear I found the Diagnostic Testing Diagram's definition of F1 score. In the lower right corner you'll see the top-level definition, that of F1 score. The definitions on which it depends are laid out as a table of cells, and each cell includes not only the term being defined but also its various synonyms, which one might otherwise take to mean something else. The only critique I can come up with is that the table should perhaps have been flipped to exchange the top right and bottom left corners. But that is so minor a critique, set against a rare example of how technical Wikipedia articles should be written in general, that I hesitated to bring it up at all. Jim Bowery (talk) 22:52, 19 December 2019 (UTC)

Requested move 2 October 2020

The following is a closed discussion of a requested move. Please do not modify it. Subsequent comments should be made in a new section on the talk page. Editors desiring to contest the closing decision should consider a move review after discussing it on the closer's talk page. No further edits should be made to this discussion.

The result of the move request was: moved. ( closed by non-admin page mover) — Nnadigoodluck ███ 15:59, 19 October 2020 (UTC) reply

F1 score → F-score – The article covers both the F₁ score and the more general F-score or F_β score. The F₁ score is a special case of an F_β score where β=1. As far as I can tell the most common spelling is "F-score" with a dash rather than "F score". Marko knoebl ( talk) 17:07, 2 October 2020 (UTC) —Relisting. BegbertBiggs ( talk) 18:54, 10 October 2020 (UTC) reply

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

"Equivalent" formulations for F-1 don't account for division by zero

The three formulations:

$$F_{1}={\frac {2}{\mathrm {recall} ^{-1}+\mathrm {precision} ^{-1}}}=2{\frac {\mathrm {precision} \cdot \mathrm {recall} }{\mathrm {precision} +\mathrm {recall} }}={\frac {2\mathrm {tp} }{2\mathrm {tp} +\mathrm {fp} +\mathrm {fn} }}$$

are undefined at different values:

The first is undefined when either precision or recall (or both) is zero, or when there are only true negatives (i.e. a classifier perfectly predicts only negative examples on a completely negative dataset).
The second is undefined when precision and recall are undefined, or when there are only true negatives.
The third is undefined only when there are only true negatives.

So they are not completely equivalent. Connorboyle (talk) 23:41, 24 December 2023 (UTC)
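The differences can be probed with a small Python sketch. All names here (`precision_recall`, `f1_harmonic`, `f1_product`, `f1_counts`, `safe`, `compare`) are hypothetical helpers written for this illustration, and the sketch assumes that "undefined" corresponds to Python raising `ZeroDivisionError` under plain division.

```python
def precision_recall(tp, fp, fn):
    # Either ratio raises ZeroDivisionError when its denominator is zero.
    return tp / (tp + fp), tp / (tp + fn)


def f1_harmonic(precision, recall):
    # First form: 2 / (1/recall + 1/precision)
    return 2 / (1 / recall + 1 / precision)


def f1_product(precision, recall):
    # Second form: 2 * P * R / (P + R)
    return 2 * precision * recall / (precision + recall)


def f1_counts(tp, fp, fn):
    # Third form: 2*tp / (2*tp + fp + fn)
    return 2 * tp / (2 * tp + fp + fn)


def safe(func, *args):
    """Return func(*args), or None where the expression is undefined."""
    try:
        return func(*args)
    except ZeroDivisionError:
        return None


def compare(tp, fp, fn):
    """Evaluate all three forms from raw counts; None marks 'undefined'."""
    try:
        p, r = precision_recall(tp, fp, fn)
    except ZeroDivisionError:
        return None, None, safe(f1_counts, tp, fp, fn)
    return safe(f1_harmonic, p, r), safe(f1_product, p, r), safe(f1_counts, tp, fp, fn)


print(compare(3, 1, 2))  # ordinary case: all three forms are defined
print(compare(0, 1, 1))  # precision = recall = 0
print(compare(0, 0, 1))  # precision is 0/0
print(compare(0, 0, 0))  # only true negatives
```

One observation from this sketch: when precision and recall are computed from counts, the first two forms fail on the same inputs (whenever tp = 0), because precision and recall are then either both zero or at least one is 0/0. The two forms differ only if precision and recall are supplied directly as numbers, for example `safe(f1_product, 0.0, 0.5)` gives 0.0 while `safe(f1_harmonic, 0.0, 0.5)` gives None. The third form is defined in every case except tp = fp = fn = 0.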