Rosetta Code/Find bare lang tags: Difference between revisions
(Added Erlang) |
(Added Haskell solution) |
Line 96:
1 in "Perl"
</pre>
=={{header|Haskell}}==
Many regex packages are available for Haskell. For this example, I chose TDFA (<code>Text.Regex.TDFA</code>), a very fast POSIX ERE engine. To switch engines, change the import statement. With a Perl-style RE engine, the expressions will also need slight modification.

This solution compiles into a program that takes either a space-delimited list of files as its arguments or, if no arguments are given, input from STDIN. The MediaWiki API bonus is not attempted.
<lang Haskell>import System.Environment
import Text.Printf
import Text.Regex.TDFA
import Data.List
import Data.Array
import qualified Data.Map as Map

-- Cuts every match out of the string and returns the pieces in between.
-- The fold runs from the last match to the first, so the offsets of the
-- remaining matches stay valid for the head of the accumulator.
splitByMatches :: String -> [MatchText String] -> [String]
splitByMatches str matches = foldr (\match acc ->
    let before = take (matchOffset).head $ acc
        after  = drop (matchOffset + matchLen).head $ acc
        matchOffset = fst.snd.(!0) $ match
        matchLen    = snd.snd.(!0) $ match
    in before:after:(tail acc)
  ) [str] matches

{-| Takes a string and splits it into the different languages used. All text
    before the language headers is put into the key "" -}
splitByLanguage :: String -> Map.Map String String
splitByLanguage str = Map.fromList.zip langs $ splitByMatches str allMatches
    where langs = "":(map (fst.(!1)) allMatches)
          allMatches = matchAllText (makeRegex headerRegex :: Regex) str
          headerRegex = "==[[:space:]]*{{[[:space:]]*header[[:space:]]*\\|[[:space:]]*([^ }]*)[[:space:]]*}}[^=]*=="

{-| Takes a string and counts the number of times a valid, but bare, lang tag
    appears. It does not attempt to ignore valid tags inside lang blocks. -}
countBareLangTags :: String -> Int
countBareLangTags = matchCount (makeRegex "<lang[[:space:]]*>" :: Regex)

main = do
    args <- getArgs
    (contents, files) <- if length args == 0 then do
            -- If there aren't arguments, read from stdin
            content <- getContents
            return ([content],[""])
        else if length args == 1 then do
            -- If there's only one argument, read the file, but don't display
            -- the filename in the results.
            content <- readFile (head args)
            return ([content],[""])
        else do
            -- Otherwise, read all the files and display their file names.
            contents <- mapM readFile args
            return (contents, args)
    let bareTagMaps = map (Map.map countBareLangTags.splitByLanguage) $ contents
    let tagsWithFiles = zipWith (\tags file -> Map.map (addFile file) tags) bareTagMaps files
    let allBareTags = foldl combineMaps Map.empty tagsWithFiles
    printBareTags allBareTags
  where addFile file count = (count, if count>0 && file/="" then [file] else [])
        combineMaps = Map.foldrWithKey insertItem
        insertItem = Map.insertWith (\(newC,newF) (oldC,oldF) -> (oldC+newC,oldF++newF))

printBareTags :: Map.Map String (Int,[String]) -> IO ()
printBareTags tags = do
    let numBare = Map.foldr ((+).fst) 0 tags
    printf "%d bare language tags:\n\n" numBare
    flip mapM_ (Map.toAscList tags) (\(lang,(count,files)) ->
        if count <= 0 then return () else printf "%d in %s%s\n" count (
        if lang == "" then "no language" else lang) (filesString files))

filesString :: [String] -> String
filesString [] = ""
filesString files = " ("++listString files++")"
    where listString [file] = "[["++file++"]]"
          listString (file:files) = "[["++file++"]], "++listString files</lang>
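The right fold in <code>splitByMatches</code> can be hard to follow at first. As a minimal sketch of the same idea, the version below drops the regex types and works on plain (offset, length) pairs. <code>Span</code> and <code>splitAtSpans</code> are hypothetical names used only for illustration, not part of the solution above, and the spans are assumed to be sorted by offset and non-overlapping. Because <code>foldr</code> handles the last span first, the offsets of the earlier spans are still valid for the head of the accumulator.

```haskell
-- Hypothetical stand-in for the (offset, length) pair that the solution
-- reads out of each MatchText.
type Span = (Int, Int)

-- Cut the spans out of the string and return the pieces between them.
-- Assumes the spans are sorted by offset and do not overlap.
splitAtSpans :: String -> [Span] -> [String]
splitAtSpans str = foldr step [str]
  where
    -- Split the leftmost remaining piece around the current span.
    step (off, len) (cur : rest) =
      take off cur : drop (off + len) cur : rest
    -- Unreachable: the accumulator starts with one element and never shrinks.
    step _ [] = []
```

For example, <code>splitAtSpans "aa==bb==cc" [(2,2),(6,2)]</code> evaluates to <code>["aa","bb","cc"]</code>.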
Here are the input files I used to test:
<pre><nowiki>
example1.wiki
-------------------------------------------------------------
Description
<lang>Pseudocode</lang>
=={{header|C}}==
<lang C>printf("Hello world!\n");</lang>
=={{header|Perl}}==
<lang>print "Hello world!\n"</lang>
</nowiki></pre>
<pre><nowiki>
example2.wiki
-------------------------------------------------------------
Description
<lang>Pseudocode</lang>
=={{header|C}}==
<lang>printf("Hello world!\n");</lang>
=={{header|Perl}}==
<lang>print "Hello world!\n"</lang>
<lang Perl>print "Goodbye world!\n"</lang>
=={{header|Haskell}}==
<lang>hubris lang = "I'm so much better than a "++lang++" programmer because I program in Haskell."</lang>
</nowiki></pre>
And the output:
<pre><nowiki>
6 bare language tags:

2 in no language ([[example1.wiki]], [[example2.wiki]])
1 in C ([[example2.wiki]])
1 in Haskell ([[example2.wiki]])
2 in Perl ([[example1.wiki]], [[example2.wiki]])
</nowiki></pre>
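The per-language totals above come from merging one map per input file. A minimal sketch of that merge step follows, assuming the per-file counts are already known: here they are written out by hand from the two example files rather than computed, and the sketch uses <code>Map.unionsWith</code> in place of the solution's fold over <code>Map.insertWith</code>. The names <code>Counts</code>, <code>mergeCounts</code> and <code>total</code> are hypothetical.

```haskell
import qualified Data.Map as Map

-- Language name -> (bare tag count, files that contain bare tags).
type Counts = Map.Map String (Int, [String])

-- Hand-written per-file results for the two example files (an assumption
-- of this sketch; the real program derives them with regexes).
example1, example2 :: Counts
example1 = Map.fromList
  [ ("",     (1, ["example1.wiki"]))
  , ("C",    (0, []))
  , ("Perl", (1, ["example1.wiki"]))
  ]
example2 = Map.fromList
  [ ("",        (1, ["example2.wiki"]))
  , ("C",       (1, ["example2.wiki"]))
  , ("Perl",    (1, ["example2.wiki"]))
  , ("Haskell", (1, ["example2.wiki"]))
  ]

-- Add the counts and concatenate the file lists, key by key.
mergeCounts :: [Counts] -> Counts
mergeCounts = Map.unionsWith add
  where add (c1, f1) (c2, f2) = (c1 + c2, f1 ++ f2)

-- Total number of bare tags across all languages.
total :: Counts -> Int
total = sum . map fst . Map.elems
```

With these inputs, <code>total (mergeCounts [example1, example2])</code> is 6, and the ascending key order of the merged map ("", "C", "Haskell", "Perl") matches the order of the lines in the output.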
=={{header|Perl}}==