Talk:Sorensen–Dice coefficient: Difference between revisions
::That's what comes up when you click the 'download them to a file' link on the main page, so that's what I've used, though I made a local copy first. --[[User:PureFox|PureFox]] ([[User talk:PureFox|talk]]) 10:00, 25 September 2022 (UTC)
== SDI = 2 × (A ∩ B) / (A ⊎ B) ==
Where has this definition of Sorensen–Dice coefficient come from?--[[User:Nigel Galloway|Nigel Galloway]] ([[User talk:Nigel Galloway|talk]]) 13:
== ChatGPT is wobbly on the issue ==
<pre>
ChatGPT: To calculate the Dice-Sørensen coefficient of similarity between the strings "gg" and "gggg", we first need to find the bigrams present in each string.
For "gg", there is only one bigram: "gg".
For "gggg", there are three bigrams: "gg", "gg", "gg".
Now, we calculate the intersection and union of the sets of bigrams:
Intersection: There is one common bigram, which is "gg".
Union: There are three distinct bigrams in total.
Using the formula for the Dice-Sørensen coefficient:
Dice("gg","gggg") = 2 × |{"gg"}| / (|{"gg"}| + |{"gg", "gg", "gg"}|)
= 2 / 4
= 0.5
So, the Dice-Sørensen coefficient of similarity between the strings "gg" and "gggg" using bigrams is 0.5.
User: But {gg,gg,gg} as a set only has one element, so the formula yields 2 * 1 / (1+1), that is, 1, no?
ChatGPT: You are correct, I apologize for the oversight. Let's correct the calculation.
For the given strings "gg" and "gggg":
Intersection: There is one common bigram, "gg".
Union: There are two distinct bigrams, "gg" and "gg".
Using the formula for the Dice-Sørensen coefficient:
[...]
Dice("gg","gggg")= 1
So, the Dice-Sørensen coefficient of similarity between the strings "gg" and "gggg" using bigrams is indeed 1. Thank you for catching that mistake.
</pre>
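The two answers in the transcript correspond to two readings of "bigrams": as a plain set, where duplicates collapse, or as a multiset (bag), where each occurrence counts. The following is a minimal Python sketch of both readings, not the task's reference implementation. The function names <code>bigrams</code>, <code>dice_set</code> and <code>dice_multiset</code> are made up for illustration, and returning 1.0 when neither string has any bigrams is an assumption.

```python
from collections import Counter


def bigrams(s):
    """Return the overlapping two-character substrings of s, in order."""
    return [s[i:i + 2] for i in range(len(s) - 1)]


def dice_set(a, b):
    """Dice coefficient with bigrams treated as plain sets (duplicates collapse)."""
    x, y = set(bigrams(a)), set(bigrams(b))
    if not x and not y:
        return 1.0  # assumption: two empty bigram sets count as identical
    return 2 * len(x & y) / (len(x) + len(y))


def dice_multiset(a, b):
    """Dice coefficient with bigrams treated as multisets (each occurrence counts)."""
    x, y = Counter(bigrams(a)), Counter(bigrams(b))
    total = sum(x.values()) + sum(y.values())  # size of the multiset sum
    if total == 0:
        return 1.0  # assumption: same convention as in dice_set
    overlap = sum((x & y).values())  # Counter & keeps the minimum count per bigram
    return 2 * overlap / total


print(dice_set("gg", "gggg"))       # 1.0
print(dice_multiset("gg", "gggg"))  # 0.5
```

Under the multiset reading the result is 0.5, which matches ChatGPT's first answer; under the set reading it is 1, which matches the "corrected" answer. So the disagreement comes down to which definition is intended, not to the arithmetic.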