Most frequent k chars distance: Difference between revisions

← Older edit

Most frequent k chars distance (view source)

Revision as of 11:29, 4 January 2024

20,029 bytes added , 5 months ago

m

→‎{{header|Wren}}: Minor tidy

PureFox

9,486

edits

Revision as of 23:55, 27 May 2021 (view source) Alextretyak (talk \| contribs) (Added 11l) ← Older edit		Latest revision as of 11:29, 4 January 2024 (view source) PureFox (talk \| contribs) m (→‎{{header\|Wren}}: Minor tidy)
(8 intermediate revisions by 5 users not shown)
Line 1: {{~~clarify~~draft task}} ~~{{draft task}}{{Wikipedia\|Most frequent k characters}}~~ The string distance function (SDF) that is the subject of this entry In [[wp:information theory\|information theory]], the '''MostFreqKDistance''' is a [[wp:String metric\|string metric]] for quickly estimating how [[wp:Similarity measure\|similar]] two [[wp:Order theory\|ordered sets]] or [[wp:String (computer science)\|strings]] are. The scheme was invented by Sadi Evren SEKER,<ref name="mfkc"/> and initially used in [[wp:text mining\|text mining]] applications like [[wp:author recognition\|author recognition]].<ref name="mfkc">{{citation was proposed in a paper published in 2014 as a method for quickly estimating how [[wp:Similarity measure\|similar]] two [[wp:Order theory\|ordered sets]] or [[wp:String (computer science)\|strings]] are. This SDF has also been proposed as being suitable in bioinformatics, e.g. for comparing sequences in the [[wp:fasta]] format. Unfortunately, the original paper by Sadi Evren Seker et al. (arXiv:1401.6596 [cs.DS]) <ref name="mfkc">{{citation \| last1 = SEKER \| first1 = Sadi E. \| author1-link = Sadi Evren SEKER \| last2 = Altun \| first2 = Oguz Line 12 ⟶ 22: \| publisher = [[wp:International Association of Computer Science and Information Technology Press (IACSIT Press)]] \| title = [[wp:International Journal of Machine Learning and Computing (IJMLC)]] \| url = ~~http~~https://arxiv.org/abs/1401.6596 \| year = 2014}}.</ref> has a number of internal inconsistencies and for this and other ~~This method is originally based on a hashing function, MaxFreqKChars<ref name="hashfunc">{{citation~~ reasons is not entirely clear in several key respects, but the paper ~~\| last1 = Seker \| first1 = Sadi E. \| author1-link = Sadi Evren SEKER~~ does give some worked examples and several of these (notably the two ~~\| last2 = Mert \| first2 = Cihan~~ given under the "Worked Examples" heading below) agree with the ~~\| contribution = A Novel Feature Hashing For Text Mining~~ interpretation used by the authors of the Python entry herein, so that ~~\| url = http://journal.ibsu.edu.ge/index.php/jtst/article/view/428~~ interpretation is used in the present task description. ~~\| pages = 37 -41~~ ~~\| publisher = [[wp:International Black Sea University]]~~ ~~\| title = Journal of Technical Science and Technologies~~ ~~\| ISSN = 2298-0032~~ ~~\| volume = 2~~ ~~\| issue = 1~~ ~~\| year = 2013}}.</ref>~~ ~~classical [[wp:author recognition\|author recognition]] problem and idea first came out while studying [[wp:data stream mining\|data stream mining]].<ref name="author">{{citation~~ ~~\| last1 = Seker \| first1 = Sadi E. \| author1-link = Sadi Evren SEKER~~ ~~\| last2 = Al-Naami \| first2 = Khaled~~ ~~\| last3 = Khan \| first3 = Latifur~~ ~~\| contribution = Author attribution on streaming data~~ ~~\| doi = 10.1109/IRI.2013.6642511~~ ~~\| url = http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=6642511~~ ~~\| pages = 497-503~~ ~~\| publisher = [[wp:IEEE]]~~ ~~\| title = Information Reuse and Integration (IRI), 2013 IEEE 14th International Conference on, San Fransisco, USA, Aug 14-16, 2013~~ ~~\| year = 2013}}.</ref>~~ ~~The string distance~~ ;The Task ~~;Definition~~ ~~Method has two steps.~~ * [[wp:Hash function\|Hash]] input strings str1 and str2 separately using MostFreqKHashing and output hstr1 and hstr2 respectively * Calculate string distance (or string similarity coefficient) of two hash outputs, hstr1 and hstr2 and output an integer value (a) Show a time-efficient implementation of the SDF as if by a ~~;Most Frequent K Hashing~~ function mostFreqKSDF(s1, s2, k, n), where s1 and s2 are arbitrary ~~The first step of algorithm is calculating the hashing based on the most frequent k characters. The hashing algorithm has below steps:~~ strings, and k and n are non-negative integers as explained below; ~~'''string function''' ''MostFreqKHashing'' ('''string''' inputString, '''int''' K)~~ ~~'''def string''' outputString~~ ~~'''for each''' distinct characters~~ ~~'''count''' occurrence of each character~~ ~~'''for''' i '''from''' 0 '''to''' K~~ ~~'''char''' c = '''next''' most frequent ''i''<sup>th</sup> character (if two chars have same frequency than get the first occurrence in inputString)~~ ~~'''int''' count = number of occurrence of the character~~ ~~'''append''' to ''outputString'', c and count~~ ~~'''end for'''~~ ~~'''return''' outputString~~ (b) Show the values produced by the SDF for the following values of s1 and s2, Aim of 'Most Frequent K Hashing' function is calculating the most count of each character and returning the K most frequent character with the character and count. Rules of hash can be listed as below: with k = 2 and n = 10: * Output will hold the character and count * Most frequent character and count will appear before the least frequent at the output * if two characters have equal frequency, the first appearing in input will appear before at the output ~~Similar to the most of [[wp:hashing function\|hashing functions]], ''Most Frequent K Hashing'' is also a [[wp:one way function]].~~ ~~;Most Frequent K Distance~~ ~~Distance calculation between two strings is based on the hash outputs of two strings.~~ ~~'''int function''' ''MostFreqKSimilarity'' ('''string''' inputStr1, '''string''' inputStr2)~~ ~~'''def int''' similarity~~ ~~'''for each''' c = '''next''' character '''from''' inputStr1~~ ~~'''lookup''' c '''in''' inputStr2~~ ~~'''if''' c '''is not null'''~~ ~~similarity '''+=''' frequency of c in inputStr1 + frequency of c in inputStr2~~ ~~'''return''' similarity~~ Above function, simply gets two input strings, previously outputted from the MostFreqKHashing function. From the most frequent k hashing function, the characters and their frequencies are returned. So, the similarity function calculates the similarity based on characters and their frequencies by checking if the same character appears on both strings and if their frequencies are equal. In some implementations, the distance metric is required instead of similarity coefficient. In order to convert the output of above similarity coefficient to distance metric, the output can be subtracted from any constant value (like the maximum possible output value). For the case, it is also possible to implement a [[wp:wrapper function]] over above two functions. ~~;;String Distance Wrapper Function~~ ~~In order to calculate the distance between two strings, below function can be implemented~~ ~~'''int function''' MostFreqKSDF ('''string''' inputStr1, '''string''' inputStr2, '''int''' K, '''int''' maxDistance)~~ ~~'''return''' maxDistance - MostFreqKSimilarity(MostFreqKHashing(inputStr1,K), MostFreqKHashing(inputStr2,K))~~ Any call to above string distance function will supply two input strings and a maximum distance value. The function will calculate the similarity and subtract that value from the maximum possible distance. It can be considered as a simple [[wp:additive inverse]] of similarity. ~~;Examples~~ ~~Let's consider maximum 2 frequent hashing over two strings ‘research’ and ‘seeking’.~~ ~~<lang javascript>MostFreqKHashing('research',2) = 'r2e2'</lang>~~ ~~because we have 2 'r' and 2 'e' characters with the highest frequency and we return in the order they appear in the string.~~ ~~<lang javascript>MostFreqKHashing('seeking',2) = 'e2s1'</lang>~~ ~~Again we have character 'e' with highest frequency and rest of the characters have same frequency of 1, so we return the first character of equal frequencies, which is 's'.~~ ~~Finally we make the comparison:~~ ~~<lang javascript>MostFreqKSimilarity('r2e2','e2s1') = 2</lang>~~ ~~We simply compared the outputs and only character occurring in both input is character 'e' and the occurrence in both input is 2.~~ ~~Instead running the sample step by step as above, we can simply run by using the string distance wrapper function as below:~~ ~~<lang javascript>MostFreqKSDF('research', 'seeking',2) = 2</lang>~~ ~~Below table holds some sample runs between example inputs for K=2:~~ {\|class="wikitable" \|- !String Inputs !"Top Two" ~~!Hash Outputs~~ !SDF Output (~~max from~~k=2, n=10) \|- \|'night' 'nacht' \|n:1 i:1 ~~\|n1i1~~ n:1 a:1 ~~n1a1~~ \|98 \|- \|'my' 'a' \|m:1 y:1 ~~\|m1y1~~ a:1 ~~a1NULL0~~ \|10 \|- \|‘research’ ‘research’ \|r:2 e:2 ~~\|r2e2~~ r:2 e:2 ~~r2e2~~ \|82 \|- ~~\|‘aaaaabbbb’~~ ~~‘ababababa’~~ ~~\|a5b4~~ ~~a5b4~~ \|1 \|- \|‘significant’ ‘capabilities’ \|i:3 n:2 ~~\|i3n2~~ i:3 a:2 ~~i3a2~~ \|54 \|} ~~Method is also suitable for bioinformatics to compare the genetic strings like in [[wp:fasta]] format~~ (c) Show the value produced by the SDF for k = 2, n = 100, and the two strings: "LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV" "EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG" ; Reference Algorithm The original paper employs an unusual scheme for encoding the frequencies of single letters as a string of letters and numbers. This scheme has several disadvantages, and so in the following we take a more abstract view and simply assume, solely for the purposes of this description, that two functions are available: 1) MostFreqK(s, k) returns (in decreasing order of their frequency of occurrence) the k most frequently occurring letters in s, with any ties broken by the position of the first occurrence in s; 2) For any letter c in MostFreqK(s, k), Freq(s,c) returns the frequency of c in s. ; MFKsimilarity Next define MFKsimilarlity(s1, s2, k) as if by the following pseudo-code: <pre> int function MFKsimilarity(string s1, string s2, int k): ~~Str1= LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV~~ let similarity = 0 ~~Str2 = EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG~~ let mfk1 = MostFreqK(s1, k) ~~MostFreqKHashing(str1,2) = L9T8~~ let mfk2 = MostFreqK(s2, k) ~~MostFreqKHashing(str2,2) = F9L8~~ for each c in mfk1 ~~MostFreqKSDF(str1,str2,2,100) = 83~~ if (c occurs in mfk2) then similarity += Freq(s1, c) + Freq(s2, c) end return similarity </pre> ; String Distance Wrapper Function Now define MostFreqKSDF as if by: <pre> int function MostFreqKSDF(string s1, string s2, int k, int n) return n - MFKsimilarity(s1, s2, k) </pre> ; Worked Examples: The original paper gives several worked examples, including these two: (i) MostFreqKSDF("night", "nacht", 2, 10) For "night", the "top two" letters with their frequencies are: n:1 and i:1. For "nacht", the "top two" letters with their frequencies are: n:1 and a:1. Since "n" is the only shared letter amongst these candidates, the SDF is 10 - 1 - 1 = 8, as per the original paper. (2) MostFreqKSDF( "LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV", "EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG", 2, 100) For the first string, the "top two" letters with their frequencies are: L:9 and T:8; and for the second, the "top two" are F:9 and L:8. Since "L" is the only shared letter amongst these top-k characters, and since the sum of the corresponding frequencies is 9+8 = 17, the SDF is 83. <br/> =={{header\|11l}}== {{trans\|Python}} <~~lang~~syntaxhighlight lang="11l">F most_freq_khashing(inputString, K) DefaultDict[Char, Int] occuDict L(c) inputString Line 175: print(dict2, end' ":\n") print(dict2.map((c, cnt) -> c‘’String(cnt)).join(‘’)) print(most_freq_ksdf(str1, str2, K, maxDistance))</~~lang~~syntaxhighlight> {{out}} Line 187: =={{header\|C++}}== <~~lang~~syntaxhighlight lang="cpp">#include <string> #include <vector> #include <map> Line 242: return 0 ; } </syntaxhighlight> ~~</lang>~~ {{out}} <pre>MostFreqKHashing( s1 , 2 ) = L9T8 Line 250: =={{header\|Go}}== {{trans\|Kotlin}} <~~lang~~syntaxhighlight lang="go">package main import ( Line 368: s2 = reverseStr(s1) mostFreqKSDF(s1, s2, 2, 100) }</~~lang~~syntaxhighlight> {{out}} Line 422: =={{header\|Haskell}}== <~~lang~~syntaxhighlight ~~Haskell~~lang="haskell">module MostFrequentK where import Data.List ( nub , sortBy ) Line 461: mostFreqKSDF :: String -> String -> Int ->Int mostFreqKSDF s t n = mostFreqKSimilarity ( mostFreqKHashing s n ) (mostFreqKHashing t n ) </syntaxhighlight> ~~</lang>~~ {{out}} <pre>mostFrequentKHashing "LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV" 2 Line 470: =={{header\|J}}== '''Solution''':<~~lang~~syntaxhighlight lang="j">NB. String Distance Wrapper Function mfksDF =: {:@:[ - (mfks@:(mfkh&.>)~ {.)~ Line 487: NB. No need to fix mfksDF mfkh =: mfkh f. mfks =: mfks f.</~~lang~~syntaxhighlight> '''Examples''':<~~lang~~syntaxhighlight lang="j">verb define '' fkh =. ;@:,@:(":&.>) NB. format k hash Line 516: 'pass' ) pass</~~lang~~syntaxhighlight> '''Notes''': As of press time, there are significant discrepancies between the task description, its pseudocode, the test cases provided, and the two other existing implementations. See the [[Talk:Most frequent k chars distance#Prank_Page.3F\|talk page]] for the assumptions made in this implementation to reconcile these discrepancies (in particular, in the scoring function). Line 523: Translation of the pseudo-code of the Wikipedia article [[wp:Most frequent k characters]] to [[wp:java]] implementation of three functions given in the definition section are given below with [[wp:JavaDoc]] comments: <~~lang~~syntaxhighlight lang="java">import java.util.Collections; import java.util.Comparator; import java.util.HashMap; Line 649: } }</~~lang~~syntaxhighlight> Line 655: Defines the functions for counting and frequency outside of the hashing function, as it is easier to calculate the comparison from a structured result rather than parsing a hash. <~~lang~~syntaxhighlight lang="javascript">//returns an object of counts keyed by character const kCounts = str => { const counts = {}; Line 703: const mostFreqKSDF = (str1, str2, k, maxDistance) => { return maxDistance - mostFreqKSimilarity(str1, str2, k); };</~~lang~~syntaxhighlight> '''Testing with Jest''' <~~lang~~syntaxhighlight lang="javascript">test('hash of "LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV" is "L9T8"', () => { expect(mostFreqKHashing(str1, 2)).toBe("L9T8"); }); Line 716: test("SDF of strings 1 and 2 with k=2 and max=100 is 83", () => { expect(mostFreqKSDF(str1, str2, 2, 100)).toBe(83); });</~~lang~~syntaxhighlight> '''Testing in Console''' <~~lang~~syntaxhighlight lang="javascript">const str1 = "LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV"; const str2 = "EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG"; const K = 2; Line 725: console.log(mostFreqKHashing(str1, K)); console.log(mostFreqKHashing(str2, K)); console.log(mostFreqKSDF(str1, str2, K, maxDistance));</~~lang~~syntaxhighlight> {{out}} Line 731: F9L8 83</pre> =={{header\|jq}}== {{works with\|jq}} '''Works with gojq, the Go implementation of jq''' '''Preliminaries''' <syntaxhighlight lang="jq"># bag of words def bow(stream): reduce stream as $word ({}; .[($word\|tostring)] += 1); # Like sort_by(f) but for items that compare equal, retain the original order def sort_by_decreasing(f): # Add $n-$i def enumerate: . as $in \| length as $n \| reduce range(0;$n) as $i ([]; . + [ [$n-$i, $in[$i] ] ]); enumerate \| sort_by((.[1]\|f), .[0]) \| reverse \| map(.[1]);</syntaxhighlight> '''The Tasks''' <syntaxhighlight lang="jq"># Output: { characters: array_of_characters_in_decreasing_order_of_frequency, frequency: object} def MostFreqKHashing($K): def chars: explode \| range(0;length) as $i \| [.[$i]] \| implode; . as $in \| bow(tostring \| chars) as $bow \| $bow \| to_entries \| sort_by_decreasing(.value) # if two chars have same frequency than get the first occurrence in $in \| (reduce .[0:$K][] as $kv ({}; .[$kv.key] = $kv.value) ) as $frequency \| {characters: map(.key)[:$K], $frequency}; def MostFreqKSimilarity($in1; $in2; $K): [$in1, $in2] \| map( MostFreqKHashing($K)) as [$s1, $s2] \| reduce $s1.characters[] as $c (0; $s2.frequency[$c] as $f \| if $f then . + $s1.frequency[$c] + $f else . end) ; def MostFreqKSDF($inputStr1; $inputStr2; $K; $maxDistance): $maxDistance - MostFreqKSimilarity($inputStr1; $inputStr2; $K); def MostFreqKSDF($K; $maxDistance): . as [$inputStr1, $inputStr2] \| $maxDistance - MostFreqKSimilarity($inputStr1; $inputStr2; $K); def task2: ["night", "nacht"], ["my", "a"], ["research", "research"], ["research", "seeking"], ["significant", "capabilities"] \| MostFreqKSDF(2; 10) as $sdk \| [., $sdk] ; def task100: ["LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV", "EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG"] \| MostFreqKSDF(2; 100); task2, task100</syntaxhighlight> {{out}} <pre> [["night","nacht"],8] [["my","a"],10] [["research","research"],2] [["research","seeking"],6] [["significant","capabilities"],4] 83 </pre> =={{header\|Kotlin}}== Line 740 ⟶ 808: I have no idea what NULL0 is supposed to mean so I've ignored that. <~~lang~~syntaxhighlight lang="scala">// version 1.1.51 fun mostFreqKHashing(input: String, k: Int): String = Line 793 ⟶ 861: s2 = s1.reversed() mostFreqKSDF(s1, s2, 2, 100) }</~~lang~~syntaxhighlight> {{out}} Line 845 ⟶ 913: SDF(input1, input2, 2, 100) = 88 </pre> =={{header\|Nim}}== It is impossible to get consistent results. We chose to implement Wikipedia algorithm (note that the page is no longer available, except in Internet Archive). With this algorithm, we get the same results as those of Wikipedia except for the last example where we get 91 instead of 83. In order to avoid the limitation to 10 occurrences, we chose to use an ordered table (equivalent to Python OrderedDict). We could have used Kotlin solution or used a sequence of tuples (char, count) but the ordered table gives better performance. <syntaxhighlight lang="nim">import algorithm, sugar, tables type CharCounts = OrderedTable[char, int] func `$`(counts: CharCounts): string = for (c, count) in counts.pairs: result.add $c & $count func mostFreqKHashing(str: string; k: Positive): CharCounts = var counts: CharCounts # To get the counts in apparition order. for c in str: inc counts.mgetOrPut(c, 0) counts.sort((x, y) => cmp(x[1], y[1]), Descending) # Note that sort is stable. var count = 0 for c, val in counts.pairs: inc count result[c] = val if count == k: break func mostFreqKSimilarity(input1, input2: CharCounts): int = for c, count in input1.pairs: if c in input2: result += count func mostFreqKSDF(str1, str2: string; k, maxDistance: Positive): int = maxDistance - mostFreqKSimilarity(mostFreqKHashing(str1, k), mostFreqKHashing(str2, k)) const Pairs = [("night", "nacht"), ("my", "a"), ("research", "research"), ("aaaaabbbb", "ababababa"), ("significant", "capabilities")] for (str1, str2) in Pairs: echo "str1: ", str1 echo "str2: ", str2 echo "mostFreqKHashing(str1, 2) = ", mostFreqKHashing(str1, 2) echo "mostFreqKHashing(str2, 2) = ", mostFreqKHashing(str2, 2) echo "mostFreqKSDF(str1, str2, 2, 10) = ", mostFreqKSDF(str1, str2, 2, 10) echo() const S1 = "LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV" S2 = "EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG" echo "str1: ", S1 echo "str2: ", S2 echo "mostFreqKHashing(str1, 2) = ", mostFreqKHashing(S1, 2) echo "mostFreqKHashing(str2, 2) = ", mostFreqKHashing(S2, 2) echo "mostFreqKSDF(str1, str2, 2, 100) = ", mostFreqKSDF(S1, S2, 2, 100)</syntaxhighlight> {{out}} <pre>str1: night str2: nacht mostFreqKHashing(str1, 2) = n1i1 mostFreqKHashing(str2, 2) = n1a1 mostFreqKSDF(str1, str2, 2, 10) = 9 str1: my str2: a mostFreqKHashing(str1, 2) = m1y1 mostFreqKHashing(str2, 2) = a1 mostFreqKSDF(str1, str2, 2, 10) = 10 str1: research str2: research mostFreqKHashing(str1, 2) = r2e2 mostFreqKHashing(str2, 2) = r2e2 mostFreqKSDF(str1, str2, 2, 10) = 6 str1: aaaaabbbb str2: ababababa mostFreqKHashing(str1, 2) = a5b4 mostFreqKHashing(str2, 2) = a5b4 mostFreqKSDF(str1, str2, 2, 10) = 1 str1: significant str2: capabilities mostFreqKHashing(str1, 2) = i3n2 mostFreqKHashing(str2, 2) = i3a2 mostFreqKSDF(str1, str2, 2, 10) = 7 str1: LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV str2: EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG mostFreqKHashing(str1, 2) = L9T8 mostFreqKHashing(str2, 2) = F9L8 mostFreqKSDF(str1, str2, 2, 100) = 91</pre> =={{header\|Perl}}== <~~lang~~syntaxhighlight ~~Perl~~lang="perl">#!/usr/bin/perl use strict ; use warnings ; Line 897 ⟶ 1,060: print "MostFreqKHashing ( " . '$firststring , 2)' . " is " . mostFreqHashing( $firststring , 2 ) . "\n" ; print "MostFreqKHashing ( " . '$secondstring , 2)' . " is " . mostFreqHashing( $secondstring , 2 ) . "\n" ; </syntaxhighlight> ~~</lang>~~ {{out}} <pre>MostFreqKHashing ( $firststring , 2) is L9T8 Line 904 ⟶ 1,067: =={{header\|Phix}}== {{trans\|Go}} <!--<syntaxhighlight lang="phix">(phixonline)--> ~~<lang Phix>function mostFreqKHashing(string input, integer k)~~ <span style="color: #008080;">with</span> <span style="color: #008080;">javascript_semantics</span> ~~string cfs = ""~~ <span style="color: #008080;">function</span> <span style="color: #000000;">mostFreqKHashing</span><span style="color: #0000FF;">(</span><span style="color: #004080;">string</span> <span style="color: #000000;">input</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">integer</span> <span style="color: #000000;">k</span><span style="color: #0000FF;">)</span> ~~sequence cfsnx = {}~~ <span style="color: #004080;">string</span> <span style="color: #000000;">cfs</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">""</span> ~~for i=1 to length(input) do~~ <span style="color: #004080;">sequence</span> <span style="color: #000000;">cfsnx</span> <span style="color: #0000FF;">=</span> <span style="color: #0000FF;">{}</span> ~~integer r = input[i],~~ <span style="color: #008080;">for</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">to</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">input</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">do</span> ~~ix = find(r,cfs)~~ <span style="color: #004080;">integer</span> <span style="color: #000000;">r</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">input</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">],</span> ~~if ix>0 then~~ <span style="color: #000000;">ix</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">find</span><span style="color: #0000FF;">(</span><span style="color: #000000;">r</span><span style="color: #0000FF;">,</span><span style="color: #000000;">cfs</span><span style="color: #0000FF;">)</span> ~~cfsnx[ix][1] -= 1~~ <span style="color: #008080;">if</span> <span style="color: #000000;">ix</span><span style="color: #0000FF;">></span><span style="color: #000000;">0</span> <span style="color: #008080;">then</span> ~~else~~ <span style="color: #000000;">cfsnx</span><span style="color: #0000FF;">[</span><span style="color: #000000;">ix</span><span style="color: #0000FF;">][</span><span style="color: #000000;">1</span><span style="color: #0000FF;">]</span> <span style="color: #0000FF;">-=</span> <span style="color: #000000;">1</span> ~~cfs &= r~~ <span style="color: #008080;">else</span> ~~cfsnx = append(cfsnx,{-1,length(cfs)})~~ <span style="color: #000000;">cfs</span> <span style="color: #0000FF;">&=</span> <span style="color: #000000;">r</span> ~~end if~~ <span style="color: #000000;">cfsnx</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">append</span><span style="color: #0000FF;">(</span><span style="color: #000000;">cfsnx</span><span style="color: #0000FF;">,{-</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">cfs</span><span style="color: #0000FF;">)})</span> ~~end for~~ <span style="color: #008080;">end</span> <span style="color: #008080;">if</span> ~~cfsnx = sort(cfsnx) -- (aside: the idx forces stable sort)~~ <span style="color: #008080;">end</span> <span style="color: #008080;">for</span> ~~sequence acc := {}~~ <span style="color: #000000;">cfsnx</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">sort</span><span style="color: #0000FF;">(</span><span style="color: #000000;">cfsnx</span><span style="color: #0000FF;">)</span> <span style="color: #000080;font-style:italic;">-- (aside: the idx forces stable sort)</span> ~~for i=1 to min(length(cfs),k) do~~ <span style="color: #004080;">sequence</span> <span style="color: #000000;">acc</span> <span style="color: #0000FF;">:=</span> <span style="color: #0000FF;">{}</span> ~~integer {n,ix} = cfsnx[i]~~ <span style="color: #008080;">for</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">to</span> <span style="color: #7060A8;">min</span><span style="color: #0000FF;">(</span><span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">cfs</span><span style="color: #0000FF;">),</span><span style="color: #000000;">k</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">do</span> ~~acc &= {cfs[ix], -n}~~ <span style="color: #004080;">integer</span> <span style="color: #0000FF;">{</span><span style="color: #000000;">n</span><span style="color: #0000FF;">,</span><span style="color: #000000;">ix</span><span style="color: #0000FF;">}</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">cfsnx</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">]</span> ~~end for~~ <span style="color: #000000;">acc</span> <span style="color: #0000FF;">&=</span> <span style="color: #0000FF;">{</span><span style="color: #000000;">cfs</span><span style="color: #0000FF;">[</span><span style="color: #000000;">ix</span><span style="color: #0000FF;">],</span> <span style="color: #0000FF;">-</span><span style="color: #000000;">n</span><span style="color: #0000FF;">}</span> ~~return acc~~ <span style="color: #008080;">end</span> <span style="color: #008080;">for</span> ~~end function~~ <span style="color: #008080;">return</span> <span style="color: #000000;">acc</span> <span style="color: #008080;">end</span> <span style="color: #008080;">function</span> ~~function mostFreqKSimilarity(sequence input1, input2)~~ ~~integer similarity := 0~~ <span style="color: #008080;">function</span> <span style="color: #000000;">mostFreqKSimilarity</span><span style="color: #0000FF;">(</span><span style="color: #004080;">sequence</span> <span style="color: #000000;">input1</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">input2</span><span style="color: #0000FF;">)</span> ~~for i=1 to length(input1) by 2 do~~ <span style="color: #004080;">integer</span> <span style="color: #000000;">similarity</span> <span style="color: #0000FF;">:=</span> <span style="color: #000000;">0</span> ~~for j=1 to length(input2) by 2 do~~ <span style="color: #008080;">for</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">to</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">input1</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">by</span> <span style="color: #000000;">2</span> <span style="color: #008080;">do</span> ~~if input1[i] == input2[j] then~~ <span style="color: #008080;">for</span> <span style="color: #000000;">j</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">to</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">input2</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">by</span> <span style="color: #000000;">2</span> <span style="color: #008080;">do</span> ~~integer freq1 = input1[i+1],~~ <span style="color: #008080;">if</span> <span style="color: #000000;">input1</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">]</span> <span style="color: #0000FF;">==</span> <span style="color: #000000;">input2</span><span style="color: #0000FF;">[</span><span style="color: #000000;">j</span><span style="color: #0000FF;">]</span> <span style="color: #008080;">then</span> ~~freq2 = input2[j+1]~~ <span style="color: #004080;">integer</span> <span style="color: #000000;">freq1</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">input1</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">+</span><span style="color: #000000;">1</span><span style="color: #0000FF;">],</span> ~~if freq1=freq2 then~~ <span style="color: #000000;">freq2</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">input2</span><span style="color: #0000FF;">[</span><span style="color: #000000;">j</span><span style="color: #0000FF;">+</span><span style="color: #000000;">1</span><span style="color: #0000FF;">]</span> ~~similarity += freq1~~ <span style="color: #008080;">if</span> <span style="color: #000000;">freq1</span><span style="color: #0000FF;">=</span><span style="color: #000000;">freq2</span> <span style="color: #008080;">then</span> ~~end if~~ <span style="color: #000000;">similarity</span> <span style="color: #0000FF;">+=</span> <span style="color: #000000;">freq1</span> ~~end if~~ <span style="color: #008080;">end</span> <span style="color: #008080;">if</span> ~~end for~~ <span style="color: #008080;">end</span> <span style="color: #008080;">if</span> ~~end for~~ <span style="color: #008080;">end</span> <span style="color: #008080;">for</span> ~~return similarity~~ <span style="color: #008080;">end</span> <span style="color: #008080;">for</span> ~~end function~~ <span style="color: #008080;">return</span> <span style="color: #000000;">similarity</span> <span style="color: #008080;">end</span> <span style="color: #008080;">function</span> ~~function flat(sequence s)~~ ~~string res = ""~~ <span style="color: #008080;">function</span> <span style="color: #000000;">flat</span><span style="color: #0000FF;">(</span><span style="color: #004080;">sequence</span> <span style="color: #000000;">s</span><span style="color: #0000FF;">)</span> ~~for i=1 to length(s) by 2 do~~ <span style="color: #004080;">string</span> <span style="color: #000000;">res</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">""</span> ~~res &= sprintf("%c%d",s[i..i+1])~~ <span style="color: #008080;">for</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">to</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">by</span> <span style="color: #000000;">2</span> <span style="color: #008080;">do</span> ~~end for~~ <span style="color: #000000;">res</span> <span style="color: #0000FF;">&=</span> <span style="color: #7060A8;">sprintf</span><span style="color: #0000FF;">(</span><span style="color: #008000;">"%c%d"</span><span style="color: #0000FF;">,</span><span style="color: #000000;">s</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">..</span><span style="color: #000000;">i</span><span style="color: #0000FF;">+</span><span style="color: #000000;">1</span><span style="color: #0000FF;">])</span> ~~return res~~ <span style="color: #008080;">end</span> <span style="color: #008080;">for</span> ~~end function~~ <span style="color: #008080;">return</span> <span style="color: #000000;">res</span> <span style="color: #008080;">end</span> <span style="color: #008080;">function</span> ~~procedure mostFreqKSDF(string input1, input2, integer k, maxDistance)~~ ~~printf(1,"input1 : %s\n", {input1})~~ <span style="color: #008080;">procedure</span> <span style="color: #000000;">mostFreqKSDF</span><span style="color: #0000FF;">(</span><span style="color: #004080;">string</span> <span style="color: #000000;">input1</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">input2</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">integer</span> <span style="color: #000000;">k</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">maxDistance</span><span style="color: #0000FF;">)</span> ~~printf(1,"input2 : %s\n", {input2})~~ <span style="color: #7060A8;">printf</span><span style="color: #0000FF;">(</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"input1 : %s\n"</span><span style="color: #0000FF;">,</span> <span style="color: #0000FF;">{</span><span style="color: #000000;">input1</span><span style="color: #0000FF;">})</span> ~~sequence s1 := mostFreqKHashing(input1, k),~~ <span style="color: #7060A8;">printf</span><span style="color: #0000FF;">(</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"input2 : %s\n"</span><span style="color: #0000FF;">,</span> <span style="color: #0000FF;">{</span><span style="color: #000000;">input2</span><span style="color: #0000FF;">})</span> ~~s2 := mostFreqKHashing(input2, k)~~ <span style="color: #004080;">sequence</span> <span style="color: #000000;">s1</span> <span style="color: #0000FF;">:=</span> <span style="color: #000000;">mostFreqKHashing</span><span style="color: #0000FF;">(</span><span style="color: #000000;">input1</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">k</span><span style="color: #0000FF;">),</span> ~~printf(1,"mfkh(input1, %d) = %s\n", {k,flat(s1)})~~ <span style="color: #000000;">s2</span> <span style="color: #0000FF;">:=</span> <span style="color: #000000;">mostFreqKHashing</span><span style="color: #0000FF;">(</span><span style="color: #000000;">input2</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">k</span><span style="color: #0000FF;">)</span> ~~printf(1,"mfkh(input2, %d) = %s\n", {k,flat(s2)})~~ <span style="color: #7060A8;">printf</span><span style="color: #0000FF;">(</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"mfkh(input1, %d) = %s\n"</span><span style="color: #0000FF;">,</span> <span style="color: #0000FF;">{</span><span style="color: #000000;">k</span><span style="color: #0000FF;">,</span><span style="color: #000000;">flat</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s1</span><span style="color: #0000FF;">)})</span> ~~integer result := maxDistance - mostFreqKSimilarity(s1, s2)~~ <span style="color: #7060A8;">printf</span><span style="color: #0000FF;">(</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"mfkh(input2, %d) = %s\n"</span><span style="color: #0000FF;">,</span> <span style="color: #0000FF;">{</span><span style="color: #000000;">k</span><span style="color: #0000FF;">,</span><span style="color: #000000;">flat</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s2</span><span style="color: #0000FF;">)})</span> ~~printf(1,"SDF(input1, input2, %d, %d) = %d\n\n", {k, maxDistance, result})~~ <span style="color: #004080;">integer</span> <span style="color: #000000;">result</span> <span style="color: #0000FF;">:=</span> <span style="color: #000000;">maxDistance</span> <span style="color: #0000FF;">-</span> <span style="color: #000000;">mostFreqKSimilarity</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s1</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">s2</span><span style="color: #0000FF;">)</span> ~~end procedure~~ <span style="color: #7060A8;">printf</span><span style="color: #0000FF;">(</span><span style="color: #000000;">1</span><span style="color: #0000FF;">,</span><span style="color: #008000;">"SDF(input1, input2, %d, %d) = %d\n\n"</span><span style="color: #0000FF;">,</span> <span style="color: #0000FF;">{</span><span style="color: #000000;">k</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">maxDistance</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">result</span><span style="color: #0000FF;">})</span> <span style="color: #008080;">end</span> <span style="color: #008080;">procedure</span> ~~constant tests = {{"research", "seeking"},~~ ~~{"night", "nacht"},~~ <span style="color: #008080;">constant</span> <span style="color: #000000;">tests</span> <span style="color: #0000FF;">=</span> <span style="color: #0000FF;">{{</span><span style="color: #008000;">"research"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"seeking"</span><span style="color: #0000FF;">},</span> ~~{"my", "a"},~~ <span style="color: #0000FF;">{</span><span style="color: #008000;">"night"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"nacht"</span><span style="color: #0000FF;">},</span> ~~{"research", "research"},~~ <span style="color: #0000FF;">{</span><span style="color: #008000;">"my"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"a"</span><span style="color: #0000FF;">},</span> ~~{"aaaaabbbb", "ababababa"},~~ <span style="color: #0000FF;">{</span><span style="color: #008000;">"research"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"research"</span><span style="color: #0000FF;">},</span> ~~{"significant", "capabilities"}}~~ <span style="color: #0000FF;">{</span><span style="color: #008000;">"aaaaabbbb"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"ababababa"</span><span style="color: #0000FF;">},</span> ~~for i=1 to length(tests) do~~ <span style="color: #0000FF;">{</span><span style="color: #008000;">"significant"</span><span style="color: #0000FF;">,</span> <span style="color: #008000;">"capabilities"</span><span style="color: #0000FF;">}}</span> ~~string {t1,t2} = tests[i]~~ <span style="color: #008080;">for</span> <span style="color: #000000;">i</span><span style="color: #0000FF;">=</span><span style="color: #000000;">1</span> <span style="color: #008080;">to</span> <span style="color: #7060A8;">length</span><span style="color: #0000FF;">(</span><span style="color: #000000;">tests</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">do</span> ~~mostFreqKSDF(t1, t2, 2, 10)~~ <span style="color: #004080;">string</span> <span style="color: #0000FF;">{</span><span style="color: #000000;">t1</span><span style="color: #0000FF;">,</span><span style="color: #000000;">t2</span><span style="color: #0000FF;">}</span> <span style="color: #0000FF;">=</span> <span style="color: #000000;">tests</span><span style="color: #0000FF;">[</span><span style="color: #000000;">i</span><span style="color: #0000FF;">]</span> ~~end for~~ <span style="color: #000000;">mostFreqKSDF</span><span style="color: #0000FF;">(</span><span style="color: #000000;">t1</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">t2</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">2</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">10</span><span style="color: #0000FF;">)</span> <span style="color: #008080;">end</span> <span style="color: #008080;">for</span> ~~string s1 := "LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV",~~ ~~s2 := "EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG"~~ <span style="color: #004080;">string</span> <span style="color: #000000;">s1</span> <span style="color: #0000FF;">:=</span> <span style="color: #008000;">"LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV"</span><span style="color: #0000FF;">,</span> ~~mostFreqKSDF(s1, s2, 2, 100)~~ <span style="color: #000000;">s2</span> <span style="color: #0000FF;">:=</span> <span style="color: #008000;">"EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG"</span> ~~s1 = "abracadabra12121212121abracadabra12121212121"~~ <span style="color: #000000;">mostFreqKSDF</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s1</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">s2</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">2</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">100</span><span style="color: #0000FF;">)</span> ~~s2 = reverse(s1)~~ <span style="color: #000000;">s1</span> <span style="color: #0000FF;">=</span> <span style="color: #008000;">"abracadabra12121212121abracadabra12121212121"</span> ~~mostFreqKSDF(s1, s2, 2, 100)</lang>~~ <span style="color: #000000;">s2</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">reverse</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s1</span><span style="color: #0000FF;">)</span> <span style="color: #000000;">mostFreqKSDF</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s1</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">s2</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">2</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">100</span><span style="color: #0000FF;">)</span> <!--</syntaxhighlight>--> Output matches Go and Kotlin. Line 983 ⟶ 1,149: {{works with\|Python\|2.7+}} '''unoptimized and limited''' <~~lang~~syntaxhighlight lang="python">import collections def MostFreqKHashing(inputString, K): occuDict = collections.defaultdict(int) Line 1,006 ⟶ 1,172: def MostFreqKSDF(inputStr1, inputStr2, K, maxDistance): return maxDistance - MostFreqKSimilarity(MostFreqKHashing(inputStr1,K), MostFreqKHashing(inputStr2,K))</~~lang~~syntaxhighlight> '''optimized''' A version that replaces the intermediate string with OrderedDict to reduce the time complexity of lookup operation: <~~lang~~syntaxhighlight lang="python">import collections def MostFreqKHashing(inputString, K): occuDict = collections.defaultdict(int) Line 1,031 ⟶ 1,197: def MostFreqKSDF(inputStr1, inputStr2, K, maxDistance): return maxDistance - MostFreqKSimilarity(MostFreqKHashing(inputStr1,K), MostFreqKHashing(inputStr2,K))</~~lang~~syntaxhighlight> Test: <~~lang~~syntaxhighlight lang="python">str1 = "LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV" str2 = "EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG" K = 2 Line 1,043 ⟶ 1,209: print("%s:"%dict2) print(''.join(c + str(cnt) for c, cnt in dict2.items())) print(MostFreqKSDF(str1, str2, K, maxDistance))</~~lang~~syntaxhighlight> {{out}} <pre> Line 1,055 ⟶ 1,221: =={{header\|Racket}}== <~~lang~~syntaxhighlight ~~Racket~~lang="racket">#lang racket (define (MostFreqKHashing inputString K) Line 1,082 ⟶ 1,248: ;; (Should add more tests, but it looks like there's a bunch of mistakes ;; in the given tests...) </syntaxhighlight> ~~</lang>~~ =={{header\|Raku}}== Line 1,093 ⟶ 1,259: Maybe I am too hasty though. Lets give it a try. Implement a MFKC routine and run an assortment of words through it to get a feel for how it hashes different words. <syntaxhighlight lang="raku" ~~perl6~~line># Fairly straightforward implementation, actually returns a list of pairs # which can be joined to make a string or manipulated further. Line 1,131 ⟶ 1,297: >; say @test-words.map( { join '', mfkh($_)».kv } ).Bag;</~~lang~~syntaxhighlight> {{out}} <pre>Bag(i2n2(151))</pre> Line 1,137 ⟶ 1,303: =={{header\|Ring}}== <~~lang~~syntaxhighlight lang="ring"> # Project : Most frequent k chars distance Line 1,181 ⟶ 1,347: next return aList </syntaxhighlight> ~~</lang>~~ Output: <pre> Line 1,191 ⟶ 1,357: =={{header\|Sidef}}== <~~lang~~syntaxhighlight lang="ruby">func _MostFreqKHashing(string, k) { var seen = Hash() Line 1,271 ⟶ 1,437: MostFreqKSDF(s, t, k, limit), ) }</~~lang~~syntaxhighlight> {{out}} <pre> Line 1,290 ⟶ 1,456: =={{header\|Tcl}}== {{works with\|Tcl\|8.6}} <~~lang~~syntaxhighlight lang="tcl">package require Tcl 8.6 proc MostFreqKHashing {inputString k} { Line 1,315 ⟶ 1,481: set h2 [MostFreqKHashing $inputStr2 $k] expr {$limit - [MostFreqKSimilarity $h1 $h2]} }</~~lang~~syntaxhighlight> Demonstrating: <~~lang~~syntaxhighlight lang="tcl">set str1 "LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV" set str2 "EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG" puts [MostFreqKHashing $str1 2] puts [MostFreqKHashing $str2 2] puts [MostFreqKSDF $str1 $str2 2 100]</~~lang~~syntaxhighlight> {{out}} <pre> Line 1,330 ⟶ 1,496: ;A more efficient metric calculator This version is appreciably more efficient because it does not compute the intermediate string representation “hash”, instead working directly on the intermediate dictionaries and lists: <~~lang~~syntaxhighlight lang="tcl">proc MostFreqKSDF {inputStr1 inputStr2 k limit} { set c1 [set c2 {}] foreach ch [split $inputStr1 ""] {dict incr c1 $ch} Line 1,342 ⟶ 1,508: } return [expr {$limit - $s}] }</~~lang~~syntaxhighlight> It computes the identical value on the identical inputs. =={{header\|Typescript}}== {{trans\|Javascript}} with added type annotations <~~lang~~syntaxhighlight lang="typescript">//returns an object of counts keyed by character const kCounts = (str: string): Record<string, number> => { const counts: Record<string, number> = {}; Line 1,395 ⟶ 1,561: const mostFreqKSDF = ( str1: string, str2: string, k: number, maxDistance: number ): number => { return maxDistance - mostFreqKSimilarity(str1, str2, k); };</~~lang~~syntaxhighlight> =={{header\|Wren}}== Line 1,401 ⟶ 1,567: {{libheader\|Wren-seq}} {{libheader\|Wren-sort}} {{libheader\|Wren-~~trait~~iterate}} <~~lang~~syntaxhighlight ~~ecmascript~~lang="wren">import "./seq" for Lst import "./sort" for Sort import "./~~trait~~iterate" for Stepped var mostFreqKHashing = Fn.new { \|input, k\| Line 1,463 ⟶ 1,629: s1 = "abracadabra12121212121abracadabra12121212121" s2 = s1[-1..0] mostFreqKSDF.call(s1, s2, 2, 100)</~~lang~~syntaxhighlight> {{out}}