Sorensen–Dice coefficient: Difference between revisions

fmt=: ((8j6": 0{::]),' ',1{::])"1</syntaxhighlight>
 
The trick here is the concept of "intersection" that we must use. We can't use plain set intersection: the current draft task description suggests that <code>SDI = 2 × (A ∩ B) / (A ⊎ B)</code> produces a number between 0 and 1. Because we're dividing to produce this number, we must be using the cardinality of the intersection rather than the intersection itself.
But if A and B are sets, each containing the same tokens, the result using the cardinality of sets would be 2 rather than 1.
 
Instead, we treat A and B as sequences of tokens (so repeated copies of a token are distinct) and, for the cardinality of the intersection, we count the number of times each token appears in A and in B and sum the minimum of the two counts. (So tokens which appear only in A contribute 0, for example, while a token which appears 3 times in A and 2 times in B contributes 2 to the sum.)
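
For readers who do not read J, the counting rule above can be sketched in Python. This is only an illustration of the idea, not a translation of the J entry: the function names are hypothetical, the denominator is taken to be the combined length of the two token sequences (one reading of <code>A ⊎ B</code>), and returning 0 for two empty sequences is an assumption.

```python
from collections import Counter

def multiset_intersection_size(a, b):
    """Sum, over tokens in a, of the smaller of the token's counts in a and b."""
    count_a, count_b = Counter(a), Counter(b)
    # A token missing from b has count 0 there, so it contributes 0.
    return sum(min(n, count_b[token]) for token, n in count_a.items())

def sorensen_dice(a, b):
    """Sorensen-Dice coefficient of two token sequences (lists or tuples)."""
    total = len(a) + len(b)  # assumed reading of |A ⊎ B|
    if total == 0:
        return 0.0  # assumption: two empty sequences score 0
    return 2 * multiset_intersection_size(a, b) / total
```

With this sketch, a token appearing 3 times in one sequence and 2 times in the other contributes 2 to the intersection count, and two identical sequences score exactly 1.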
 
With this implementation, here are the task examples: