Most frequent k chars distance: Difference between revisions

m
syntax highlighting fixup automation
m (→‎{{header|Phix}}: added syntax colouring, marked p2js compatible)
m (syntax highlighting fixup automation)
Line 146:
{{trans|Python}}
 
<langsyntaxhighlight lang="11l">F most_freq_khashing(inputString, K)
DefaultDict[Char, Int] occuDict
L(c) inputString
Line 175:
print(dict2, end' ":\n")
print(dict2.map((c, cnt) -> c‘’String(cnt)).join(‘’))
print(most_freq_ksdf(str1, str2, K, maxDistance))</langsyntaxhighlight>
 
{{out}}
Line 187:
 
=={{header|C++}}==
<langsyntaxhighlight lang="cpp">#include <string>
#include <vector>
#include <map>
Line 242:
return 0 ;
}
</syntaxhighlight>
</lang>
{{out}}
<pre>MostFreqKHashing( s1 , 2 ) = L9T8
Line 250:
=={{header|Go}}==
{{trans|Kotlin}}
<langsyntaxhighlight lang="go">package main
 
import (
Line 368:
s2 = reverseStr(s1)
mostFreqKSDF(s1, s2, 2, 100)
}</langsyntaxhighlight>
 
{{out}}
Line 422:
 
=={{header|Haskell}}==
<langsyntaxhighlight Haskelllang="haskell">module MostFrequentK
where
import Data.List ( nub , sortBy )
Line 461:
mostFreqKSDF :: String -> String -> Int ->Int
mostFreqKSDF s t n = mostFreqKSimilarity ( mostFreqKHashing s n ) (mostFreqKHashing t n )
</syntaxhighlight>
</lang>
{{out}}
<pre>mostFrequentKHashing "LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV" 2
Line 470:
 
=={{header|J}}==
'''Solution''':<langsyntaxhighlight lang="j">NB. String Distance Wrapper Function
mfksDF =: {:@:[ - (mfks@:(mfkh&.>)~ {.)~
 
Line 487:
NB. No need to fix mfksDF
mfkh =: mfkh f.
mfks =: mfks f.</langsyntaxhighlight>
 
'''Examples''':<langsyntaxhighlight lang="j">verb define ''
fkh =. ;@:,@:(":&.>) NB. format k hash
 
Line 516:
'pass'
)
pass</langsyntaxhighlight>
'''Notes''': As of press time, there are significant discrepancies between the task description, its pseudocode, the test cases provided, and the two other existing implementations. See the [[Talk:Most frequent k chars distance#Prank_Page.3F|talk page]] for the assumptions made in this implementation to reconcile these discrepancies (in particular, in the scoring function).
 
Line 523:
Translation of the pseudo-code of the Wikipedia article [[wp:Most frequent k characters]] to [[wp:java]] implementation of three functions given in the definition section are given below with [[wp:JavaDoc]] comments:
 
<langsyntaxhighlight lang="java">import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
Line 649:
}
 
}</langsyntaxhighlight>
 
 
Line 655:
Defines the functions for counting and frequency outside of the hashing function, as it is easier to calculate the comparison from a structured result rather than parsing a hash.
 
<langsyntaxhighlight lang="javascript">//returns an object of counts keyed by character
const kCounts = str => {
const counts = {};
Line 703:
const mostFreqKSDF = (str1, str2, k, maxDistance) => {
return maxDistance - mostFreqKSimilarity(str1, str2, k);
};</langsyntaxhighlight>
 
'''Testing with Jest'''
<langsyntaxhighlight lang="javascript">test('hash of "LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV" is "L9T8"', () => {
expect(mostFreqKHashing(str1, 2)).toBe("L9T8");
});
Line 716:
test("SDF of strings 1 and 2 with k=2 and max=100 is 83", () => {
expect(mostFreqKSDF(str1, str2, 2, 100)).toBe(83);
});</langsyntaxhighlight>
 
'''Testing in Console'''
<langsyntaxhighlight lang="javascript">const str1 = "LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV";
const str2 = "EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG";
const K = 2;
Line 725:
console.log(mostFreqKHashing(str1, K));
console.log(mostFreqKHashing(str2, K));
console.log(mostFreqKSDF(str1, str2, K, maxDistance));</langsyntaxhighlight>
 
{{out}}
Line 737:
 
'''Preliminaries'''
<langsyntaxhighlight lang="jq"># bag of words
def bow(stream):
reduce stream as $word ({}; .[($word|tostring)] += 1);
Line 748:
| sort_by((.[1]|f), .[0])
| reverse
| map(.[1]);</langsyntaxhighlight>
'''The Tasks'''
<langsyntaxhighlight lang="jq"># Output: { characters: array_of_characters_in_decreasing_order_of_frequency, frequency: object}
def MostFreqKHashing($K):
def chars: explode | range(0;length) as $i | [.[$i]] | implode;
Line 789:
| MostFreqKSDF(2; 100);
 
task2, task100</langsyntaxhighlight>
{{out}}
<pre>
Line 808:
 
I have no idea what NULL0 is supposed to mean so I've ignored that.
<langsyntaxhighlight lang="scala">// version 1.1.51
 
fun mostFreqKHashing(input: String, k: Int): String =
Line 861:
s2 = s1.reversed()
mostFreqKSDF(s1, s2, 2, 100)
}</langsyntaxhighlight>
 
{{out}}
Line 919:
In order to avoid the limitation to 10 occurrences, we chose to use an ordered table (equivalent to Python OrderedDict). We could have used Kotlin solution or used a sequence of tuples (char, count) but the ordered table gives better performance.
 
<langsyntaxhighlight Nimlang="nim">import algorithm, sugar, tables
 
type CharCounts = OrderedTable[char, int]
Line 970:
echo "mostFreqKHashing(str1, 2) = ", mostFreqKHashing(S1, 2)
echo "mostFreqKHashing(str2, 2) = ", mostFreqKHashing(S2, 2)
echo "mostFreqKSDF(str1, str2, 2, 100) = ", mostFreqKSDF(S1, S2, 2, 100)</langsyntaxhighlight>
 
{{out}}
Line 1,010:
 
=={{header|Perl}}==
<langsyntaxhighlight Perllang="perl">#!/usr/bin/perl
use strict ;
use warnings ;
Line 1,060:
print "MostFreqKHashing ( " . '$firststring , 2)' . " is " . mostFreqHashing( $firststring , 2 ) . "\n" ;
print "MostFreqKHashing ( " . '$secondstring , 2)' . " is " . mostFreqHashing( $secondstring , 2 ) . "\n" ;
</syntaxhighlight>
</lang>
{{out}}
<pre>MostFreqKHashing ( $firststring , 2) is L9T8
Line 1,067:
=={{header|Phix}}==
{{trans|Go}}
<!--<langsyntaxhighlight Phixlang="phix">(phixonline)-->
<span style="color: #008080;">with</span> <span style="color: #008080;">javascript_semantics</span>
<span style="color: #008080;">function</span> <span style="color: #000000;">mostFreqKHashing</span><span style="color: #0000FF;">(</span><span style="color: #004080;">string</span> <span style="color: #000000;">input</span><span style="color: #0000FF;">,</span> <span style="color: #004080;">integer</span> <span style="color: #000000;">k</span><span style="color: #0000FF;">)</span>
Line 1,143:
<span style="color: #000000;">s2</span> <span style="color: #0000FF;">=</span> <span style="color: #7060A8;">reverse</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s1</span><span style="color: #0000FF;">)</span>
<span style="color: #000000;">mostFreqKSDF</span><span style="color: #0000FF;">(</span><span style="color: #000000;">s1</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">s2</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">2</span><span style="color: #0000FF;">,</span> <span style="color: #000000;">100</span><span style="color: #0000FF;">)</span>
<!--</langsyntaxhighlight>-->
Output matches Go and Kotlin.
 
Line 1,149:
{{works with|Python|2.7+}}
'''unoptimized and limited'''
<langsyntaxhighlight lang="python">import collections
def MostFreqKHashing(inputString, K):
occuDict = collections.defaultdict(int)
Line 1,172:
 
def MostFreqKSDF(inputStr1, inputStr2, K, maxDistance):
return maxDistance - MostFreqKSimilarity(MostFreqKHashing(inputStr1,K), MostFreqKHashing(inputStr2,K))</langsyntaxhighlight>
 
'''optimized'''
 
A version that replaces the intermediate string with OrderedDict to reduce the time complexity of lookup operation:
<langsyntaxhighlight lang="python">import collections
def MostFreqKHashing(inputString, K):
occuDict = collections.defaultdict(int)
Line 1,197:
 
def MostFreqKSDF(inputStr1, inputStr2, K, maxDistance):
return maxDistance - MostFreqKSimilarity(MostFreqKHashing(inputStr1,K), MostFreqKHashing(inputStr2,K))</langsyntaxhighlight>
Test:
<langsyntaxhighlight lang="python">str1 = "LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV"
str2 = "EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG"
K = 2
Line 1,209:
print("%s:"%dict2)
print(''.join(c + str(cnt) for c, cnt in dict2.items()))
print(MostFreqKSDF(str1, str2, K, maxDistance))</langsyntaxhighlight>
{{out}}
<pre>
Line 1,221:
=={{header|Racket}}==
 
<langsyntaxhighlight Racketlang="racket">#lang racket
 
(define (MostFreqKHashing inputString K)
Line 1,248:
;; (Should add more tests, but it looks like there's a bunch of mistakes
;; in the given tests...)
</syntaxhighlight>
</lang>
 
=={{header|Raku}}==
Line 1,259:
Maybe I am too hasty though. Lets give it a try. Implement a MFKC routine and run an assortment of words through it to get a feel for how it hashes different words.
 
<syntaxhighlight lang="raku" perl6line># Fairly straightforward implementation, actually returns a list of pairs
# which can be joined to make a string or manipulated further.
 
Line 1,297:
>;
 
say @test-words.map( { join '', mfkh($_)».kv } ).Bag;</langsyntaxhighlight>
{{out}}
<pre>Bag(i2n2(151))</pre>
Line 1,303:
 
=={{header|Ring}}==
<langsyntaxhighlight lang="ring">
# Project : Most frequent k chars distance
 
Line 1,347:
next
return aList
</syntaxhighlight>
</lang>
Output:
<pre>
Line 1,357:
 
=={{header|Sidef}}==
<langsyntaxhighlight lang="ruby">func _MostFreqKHashing(string, k) {
 
var seen = Hash()
Line 1,437:
MostFreqKSDF(s, t, k, limit),
)
}</langsyntaxhighlight>
{{out}}
<pre>
Line 1,456:
=={{header|Tcl}}==
{{works with|Tcl|8.6}}
<langsyntaxhighlight lang="tcl">package require Tcl 8.6
 
proc MostFreqKHashing {inputString k} {
Line 1,481:
set h2 [MostFreqKHashing $inputStr2 $k]
expr {$limit - [MostFreqKSimilarity $h1 $h2]}
}</langsyntaxhighlight>
Demonstrating:
<langsyntaxhighlight lang="tcl">set str1 "LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV"
set str2 "EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG"
puts [MostFreqKHashing $str1 2]
puts [MostFreqKHashing $str2 2]
puts [MostFreqKSDF $str1 $str2 2 100]</langsyntaxhighlight>
{{out}}
<pre>
Line 1,496:
;A more efficient metric calculator
This version is appreciably more efficient because it does not compute the intermediate string representation “hash”, instead working directly on the intermediate dictionaries and lists:
<langsyntaxhighlight lang="tcl">proc MostFreqKSDF {inputStr1 inputStr2 k limit} {
set c1 [set c2 {}]
foreach ch [split $inputStr1 ""] {dict incr c1 $ch}
Line 1,508:
}
return [expr {$limit - $s}]
}</langsyntaxhighlight>
It computes the identical value on the identical inputs.
 
=={{header|Typescript}}==
{{trans|Javascript}} with added type annotations
<langsyntaxhighlight lang="typescript">//returns an object of counts keyed by character
const kCounts = (str: string): Record<string, number> => {
const counts: Record<string, number> = {};
Line 1,561:
const mostFreqKSDF = ( str1: string, str2: string, k: number, maxDistance: number ): number => {
return maxDistance - mostFreqKSimilarity(str1, str2, k);
};</langsyntaxhighlight>
 
=={{header|Wren}}==
Line 1,568:
{{libheader|Wren-sort}}
{{libheader|Wren-trait}}
<langsyntaxhighlight lang="ecmascript">import "/seq" for Lst
import "/sort" for Sort
import "/trait" for Stepped
Line 1,629:
s1 = "abracadabra12121212121abracadabra12121212121"
s2 = s1[-1..0]
mostFreqKSDF.call(s1, s2, 2, 100)</langsyntaxhighlight>
 
{{out}}
10,333

edits