Talk:Sorensen–Dice coefficient: Difference between revisions

 
Where has this definition of Sorensen–Dice coefficient come from?--[[User:Nigel Galloway|Nigel Galloway]] ([[User talk:Nigel Galloway|talk]]) 13:04, 18 October 2023 (UTC)
 
== ChatGPT is wobbly on the issue ==
<pre>
ChatGPT: To calculate the Dice-Sørensen coefficient of similarity between the strings "gg" and "gggg", we first need to find the bigrams present in each string.
 
For "gg", there is only one bigram: "gg".
For "gggg", there are three bigrams: "gg", "gg", "gg".
 
Now, we calculate the intersection and union of the sets of bigrams:
 
Intersection: There is one common bigram, which is "gg".
Union: There are three distinct bigrams in total.
Using the formula for the Dice-Sørensen coefficient:
 
Dice("gg","gggg") = 2 × ∣ {"gg"} ∣ / (|{"gg"}∣ + ∣ {"gg", "gg", "gg"}∣
= 2 / 4
= 0.5
 
 
So, the Dice-Sørensen coefficient of similarity between the strings "gg" and "gggg" using bigrams is 0.5.
 
User
But {gg,gg,gg} as a set only has one element, so the formula yields 2 * 1 / (1+1), that is, 1, no?
 
ChatGPT: You are correct, I apologize for the oversight. Let's correct the calculation.
 
For the given strings "gg" and "gggg":
 
Intersection: There is one common bigram, "gg".
Union: There are two distinct bigrams, "gg" and "gg".
Using the formula for the Dice-Sørensen coefficient:
 
[...]
 
Dice("gg","gggg")= 1
 
So, the Dice-Sørensen coefficient of similarity between the strings "gg" and "gggg" using bigrams is indeed 1. Thank you for catching that mistake.
</pre>
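
For what it's worth, both answers in the transcript can be reproduced, depending on whether the bigrams are treated as a multiset (bag) or as a set. The following is only a rough sketch under that assumption, not the task's reference definition; the function names <code>bigrams</code>, <code>dice_set</code> and <code>dice_multiset</code> are made up for illustration, and returning 1.0 when neither string has any bigrams is an arbitrary convention chosen here.

```python
from collections import Counter


def bigrams(s):
    """Return the list of overlapping two-character substrings of s."""
    return [s[i:i + 2] for i in range(len(s) - 1)]


def dice_set(a, b):
    """Dice coefficient with bigrams treated as sets (duplicates collapse)."""
    x, y = set(bigrams(a)), set(bigrams(b))
    if not x and not y:
        return 1.0  # assumed convention for two bigram-free strings
    return 2 * len(x & y) / (len(x) + len(y))


def dice_multiset(a, b):
    """Dice coefficient with bigrams treated as multisets (duplicates count)."""
    x, y = Counter(bigrams(a)), Counter(bigrams(b))
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        return 1.0  # assumed convention for two bigram-free strings
    # Counter intersection keeps the minimum count of each shared bigram.
    overlap = sum((x & y).values())
    return 2 * overlap / total


print(dice_set("gg", "gggg"))       # set reading
print(dice_multiset("gg", "gggg"))  # multiset reading
```

Under the set reading "gg" and "gggg" score 1, and under the multiset reading they score 0.5, so the two figures ChatGPT gave correspond to two different readings of the definition rather than to one correct and one incorrect calculation.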