Rosetta Code/Rank languages by popularity

You are encouraged to solve this task according to the task description, using any language you may know.

Sort the most popular programming languages by the number of members in their Rosetta Code categories
(taken from http://www.rosettacode.org/mw/index.php?title=Special:Categories&limit=5000).

Sample output on April 11 2013:

  1. 725 - Tcl
  2. 668 - Python
  3. 650 - C
  4. 626 - PicoLisp
  5. 624 - J
  6. 593 - D
  7. 590 - Ruby
  8. 588 - Go
  9. 576 - Perl 6
 10. 565 - Ada
 11. 555 - Mathematica
 12. 539 - Perl
 13. 536 - Haskell
 14. 514 - BBC BASIC
 15. 511 - REXX
 16. 493 - Java
 17. 481 - OCaml
 18. 469 - PureBasic
 19. 462 - Unicon
 20. 430 - AutoHotkey
 21. 429 - Icon
 22. 425 - Common Lisp
 23. 420 - C sharp
 24. 404 - C++
 25. 367 - JavaScript
 26. 346 - Clojure
 27. 343 - Scala
 28. 326 - R
 29. 324 - Lua
 30. 317 - PHP
 31. 315 - ALGOL 68
 32. 313 - Forth
 33. 306 - Pascal
 34. 300 - Groovy
 35. 295 - XPL0
 36. 292 - Liberty BASIC
 37. 288 - Fortran
 38. 278 - Oz
 39. 272 - E
 40. 270 - Seed7
...

A complete ranked listing of all 488 languages (from the REXX example) is included here ──► RC_POP.OUT.

Filtering wrong results is optional. You can check against Special:MostLinkedCategories.
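
As a language-neutral illustration of the task, here is a minimal Python sketch. It assumes each category shows up on the Special:Categories page as a link followed by a member count in parentheses, which is the shape most of the solutions below match against. The HTML excerpt is hand-written for the example (the three counts are borrowed from the sample output above), and the names SAMPLE_HTML, CATEGORY_RE and rank_categories are hypothetical.

```python
import re

# Hand-written excerpt in the shape the solutions below scrape.
# Counts are copied from the April 11 2013 sample output above.
SAMPLE_HTML = """
<li><a href="/wiki/Category:Tcl" title="Category:Tcl">Tcl</a> (725 members)</li>
<li><a href="/wiki/Category:C" title="Category:C">C</a> (650 members)</li>
<li><a href="/wiki/Category:Python" title="Category:Python">Python</a> (668 members)</li>
"""

# Capture the category name and its member count ("member" or "members").
CATEGORY_RE = re.compile(
    r'title="Category:([^"]+)">[^<]*</a>[^(]*\((\d+) members?\)'
)


def rank_categories(html):
    """Return (count, name) pairs sorted by descending count, then by name."""
    pairs = [(int(count), name) for name, count in CATEGORY_RE.findall(html)]
    return sorted(pairs, key=lambda p: (-p[0], p[1]))


if __name__ == "__main__":
    for place, (count, name) in enumerate(rank_categories(SAMPLE_HTML), start=1):
        print(f"{place:3}. {count:3} - {name}")
```

A real solution would fetch the page over HTTP instead of using a fixed string, and may want to drop categories that are not programming languages.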

Ada

Library: AWS
with Ada.Integer_Text_IO;   use Ada.Integer_Text_IO;
with Ada.Strings.Fixed; use Ada.Strings.Fixed;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO; use Ada.Text_IO;
 
with Ada.Containers.Ordered_Sets;
with Ada.Strings.Less_Case_Insensitive;
 
with AWS.Client;
with AWS.Response;
 
procedure Test is
 
use Ada.Strings;
 
function "+" (S : String) return Unbounded_String renames To_Unbounded_String;
 
type A_Language_Count is
record
Count  : Integer := 0;
Language : Unbounded_String;
end record;
 
function "=" (L, R : A_Language_Count) return Boolean is
begin
return L.Language = R.Language;
end "=";
 
function "<" (L, R : A_Language_Count) return Boolean is
begin
-- Sort by 'Count' and then by Language name
return L.Count < R.Count
or else (L.Count = R.Count
and then Less_Case_Insensitive (Left => To_String (L.Language),
Right => To_String (R.Language)));
end "<";
 
package Sets is new Ada.Containers.Ordered_Sets (A_Language_Count);
use Sets;
 
Counts : Set;
 
procedure Find_Counts (S : String) is
Title_Str : constant String  := "title=""Category:";
End_A_Str : constant String  := "</a> (";
 
Title_At  : constant Natural := Index (S, Title_Str);
begin
if Title_At /= 0 then
declare
Bracket_At : constant Natural := Index (S (Title_At + Title_Str'Length .. S'Last), ">");
End_A_At  : constant Natural := Index (S (Bracket_At + 1 .. S'Last), End_A_Str);
Space_At  : constant Natural := Index (S (End_A_At + End_A_Str'Length .. S'Last), " ");
Count  : constant Natural := Natural'Value (S (End_A_At + End_A_Str'Length .. Space_At - 1));
Language  : constant String  := S (Title_At + Title_Str'Length .. Bracket_At - 2);
begin
if Bracket_At /= 0 and then End_A_At /= 0 and then Space_At /= 0 then
begin
Counts.Insert (New_Item => (Count, +Language));
exception
when Constraint_Error =>
Put_Line (Standard_Error, "Warning: repeated language: " & Language);
-- Ignore repeated results.
null;
end;
end if;
-- Recursively parse the string for languages and counts
Find_Counts (S (Space_At + 1 .. S'Last));
end;
end if;
 
end Find_Counts;
 
Place : Natural := 1;
 
procedure Display (C : Cursor) is
begin
Put (Place, Width => 1); Put (". ");
Put (Element (C).Count, Width => 1); Put (" - ");
Put_Line (To_String (Element (C).Language));
Place := Place + 1;
end Display;
 
Http_Source : constant AWS.Response.Data :=
AWS.Client.Get ("http://rosettacode.org/mw/index.php?title=Special:Categories&limit=5000");
begin
Find_Counts (AWS.Response.Message_Body (Http_Source));
Counts.Reverse_Iterate (Display'Access);
end Test;
 

ALGOL 68

Works with: ALGOL 68G version mk8+ for Unix and Linux - tested with release mk15-0.8b.fc9.i386 - uses non-standard library routines http content and grep in string.

Note: the routine http content is currently not available on Win32 systems.

This example is incorrect: among others, Tcl (the top dog) is missing. Please fix the code and remove this message.
PROC good page = (REF STRING page) BOOL:
IF grep in string("^HTTP/[0-9.]* 200", page, NIL, NIL) = 0
THEN TRUE
ELSE IF INT start, end;
grep in string("^HTTP/[0-9.]* [0-9]+ [a-zA-Z ]*", page,
start, end) = 0
THEN print (page[start : end])
ELSE print ("unknown error retrieving page")
FI;
FALSE
FI;
 
MODE LISTOFSTRING = STRUCT(REF LINK first, last, INT upb);
MODE LINK = STRUCT(STRING value, REF LINK next);
 
PRIO LISTINIT = 1;
OP LISTINIT = (REF LISTOFSTRING new, REF LINK first)REF LISTOFSTRING: (
new := (first, first, (first IS REF LINK(NIL) | 0 | 1 ));
new
);
 
OP +:= = (REF LISTOFSTRING list, []CHAR item)VOID: (
HEAP LINK new := (STRING(item), REF LINK(NIL));
IF first OF list IS REF LINK(NIL) THEN
first OF list := new
ELSE
next OF last OF list := new
FI;
last OF list := new;
upb OF list +:= 1
);
 
OP UPB = (LISTOFSTRING list)INT: upb OF list;
 
OP ARRAYOFSTRING = (LISTOFSTRING list)[]STRING:(
[UPB list]STRING out;
REF LINK this := first OF list;
FOR i TO UPB list DO out[i] := value OF this; this := next OF this OD;
out
);
 
INT match=0, no match=1, out of memory error=2, other error=3;
 
PROC re split = (STRING re split, REF STRING beetles)[]STRING:(
LISTOFSTRING out := (NIL, NIL, 0); # LISTINIT REF LINK NIL; #
INT start := 1, pos, end;
WHILE grep in string(re split, beetles[start:], pos, end) = match DO
out +:= beetles[start:start+pos-2];
out +:= beetles[start+pos-1:start+end-1];
start +:= end
OD;
IF start > UPB beetles THEN
out +:= beetles[start:]
FI;
ARRAYOFSTRING(out)
);
 
 
IF STRING reply;
INT rc =
http content (reply, "www.rosettacode.org", "http://www.rosettacode.org/w/index.php?title=Special:Categories&limit=500", 0);
rc /= 0 OR NOT good page (reply)
THEN print (("Error:",strerror (rc)))
ELSE
STRING # hack: HTML should be parsed by an official HTML parsing library #
re html tag = "<[^>]*>",
re a href category = "^<a href=""/wiki/Category:.*"" title=",
re members = "([1-9][0-9]* members)";
 
MODE STATISTIC = STRUCT(INT members, STRING category);
FLEX[0]STATISTIC stats;
 
OP +:= = (REF FLEX[]STATISTIC in out, STATISTIC item)VOID:(
[LWB in out: UPB in out+1]STATISTIC new;
new[LWB in out: UPB in out]:=in out;
new[UPB new]:=item;
in out := new
);
 
# hack: needs to be manually maintained #
STRING re ignore ="Programming Tasks|WikiStubs|Maintenance/OmitCategoriesCreated|"+
"Unimplemented tasks by language|Programming Languages|"+
"Solutions by Programming Language|Implementations|"+
"Solutions by Library|Encyclopedia|Language users|"+
"Solutions by Programming Task|Basic language learning|"+
"RCTemplates|Language Implementations";
 
FORMAT category fmt = $"<a href=""/wiki/Category:"g""" title=""Category:"g""""$;
STRING encoded category, category;
FORMAT members fmt = $" ("g" members)"$;
INT members;
 
FLEX[0]STRING tokens := re split(re html tag, reply);
FOR token index TO UPB tokens DO
STRING token := tokens[token index];
FILE file;
IF grep in string(re a href category, token, NIL, NIL) = match THEN
associate(file, token);
make term(file,"""");
getf(file, (category fmt, encoded category, category));
close(file)
ELIF grep in string(re members, token, NIL, NIL) = match THEN
IF grep in string(re ignore, category, NIL, NIL) /= match THEN
associate(file, token);
getf(file, (members fmt, members));
stats +:= STATISTIC(members, category);
close(file)
FI
FI
OD;
 
OP < = (STATISTIC a,b)BOOL:
members OF a < members OF b;
 
MODE SORTSTRUCT = STATISTIC;
PR READ "prelude/sort.a68" PR;
 
stats := in place shell sort reverse(stats);
 
INT max = 10;
FOR i TO (UPB stats > max | max | UPB stats) DO
printf(($g(-0)". "g(-0)" - "gl$,i,stats[i]))
OD
FI
Sample output:
1. 233 - Python
2. 222 - Ada
3. 203 - OCaml
4. 203 - C
5. 201 - Perl
6. 193 - Haskell
7. 182 - Java
8. 179 - D
9. 178 - ALGOL 68
10. 160 - Ruby

AutoHotkey

StringCaseSense, On
Progress, b2 w120 zh0 fs9, Please wait ...
Sleep, 10
 
Link = http://www.rosettacode.org/w/index.php?title=Special:Categories&limit=5000
FileDelete, Cats.html
URLDownloadToFile, %Link%, Cats.html
FileRead, Cats, Cats.html
 
Link1 = http://rosettacode.org/wiki/Category:Programming_Languages
FileDelete, lang1.htm
URLDownloadToFile, %Link1%, Lang1.htm
FileRead, Lang1, Lang1.htm
 
LookFor = (\(previous 200\) \(<a href=")(.+?)" title="Category:Programming Languages">next 200
RegExMatch(Lang1, LookFor, Link) ; Link2
StringReplace, Link2, Link2, &amp;, &
 
FileDelete, lang2.htm
URLDownloadToFile, http://www.rosettacode.org%Link2%, Lang2.htm
FileRead, Lang2, Lang2.htm
Languages := Lang1 Lang2
 
; create list of categories with member count
Loop, Parse, Cats, `n, `r
{
If InStr(A_LoopField, "<li>") {
LookFor = title=\"Category:(.+?)"
RegExMatch(A_LoopField, LookFor, Name)
RegExMatch(A_LoopField, "(\d*)\smembers", Count)
CatsList .= Count1 "|" Name1 "`r`n"
}
}
 
; create list of languages
RegExMatch(Languages, "(<h2>Subcategories</h2>)(.*)previous 200", Match)
LookFor = <a href="/wiki/Category:.*?" title="Category:.*?">(.*?)</a>(.*)
While RegExMatch(Match2, LookFor, Match)
LangList .= Match1 "`r`n"
 
; create the final list
Loop, Parse, CatsList, `n, `r
{
StringSplit, out, A_LoopField, |
If RegExMatch(LangList, "m)^" out2 "$")
FinalList .= A_LoopField "`r`n"
}
 
Sort, FinalList, RN
Gui, -MinimizeBox
Gui, Margin, 6
Gui, Add, ListView, y10 w363 r20 Grid, Rank|Members|Category
Loop, Parse, FinalList, `n, `r
{
If A_LoopField {
StringSplit, Item, A_LoopField, |
LV_Add("", A_Index, Item1, Item2)
}
}
 
LV_ModifyCol(1, "Integer")
LV_ModifyCol(2, "Integer")
LV_ModifyCol(3, 250)
FormatTime, Timestamp,, dd MMM yyyy
Progress, Off
Gui, Show,, Rosetta Categories - %Timestamp%
Return
 
GuiClose:
ExitApp
Return

AWK

This example is incorrect. Broken because nc may close the connection too soon. It sends the HTTP request but then quits before getting the response. So the program gets an empty list of languages. Please fix the code and remove this message.


Works with: Gawk version 3.1.8

The solution for Awk now goes through the API at http://rosettacode.org/mw/api.php, so it only ranks programming languages. This solution replaces an older solution for Awk that tried to scrape the HTML.

This solution needs help from the nc command to make a TCP connection.
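
The API approach described above can be sketched in Python as follows. This is an illustration under assumptions, not a translation of the AWK program: it asks for JSON instead of the txt format, and the layout of the reply (a "query"/"pages" map plus a "query-continue" entry carrying gcmcontinue) is an assumption about the response. The helper names build_url, all_languages and fake_fetch are hypothetical, and fake_fetch stands in for a real HTTP call so the sketch runs offline.

```python
from urllib.parse import urlencode

API = "http://rosettacode.org/mw/api.php"  # endpoint named in the text above


def build_url(gcmcontinue=None):
    """Build the query URL with the same parameters as the AWK program, but format=json."""
    params = {
        "action": "query",
        "generator": "categorymembers",
        "gcmtitle": "Category:Programming Languages",
        "gcmlimit": "500",
        "prop": "categoryinfo",
        "format": "json",
    }
    if gcmcontinue:
        params["gcmcontinue"] = gcmcontinue
    return API + "?" + urlencode(params)


def all_languages(fetch):
    """Yield (pages, title) for every language, following gcmcontinue.

    `fetch` maps a URL to a decoded JSON dict. The dict layout used here
    is an assumption about the API reply, not something the text confirms.
    """
    token = None
    while True:
        reply = fetch(build_url(token))
        for page in reply.get("query", {}).get("pages", {}).values():
            info = page.get("categoryinfo")
            if info:  # skip entries that carry no member count
                yield info["pages"], page["title"].split(":", 1)[-1]
        token = (reply.get("query-continue", {})
                      .get("categorymembers", {})
                      .get("gcmcontinue"))
        if not token:
            break


def fake_fetch(url):
    """Stand-in for an HTTP request: two canned batches (assumed layout)."""
    if "gcmcontinue=" not in url:
        return {
            "query": {"pages": {"1": {"title": "Category:AWK",
                                      "categoryinfo": {"pages": 129}}}},
            "query-continue": {"categorymembers": {"gcmcontinue": "BEFUNGE|"}},
        }
    return {"query": {"pages": {"2": {"title": "Category:Tcl",
                                      "categoryinfo": {"pages": 492}}}}}


if __name__ == "__main__":
    ranked = sorted(all_languages(fake_fetch), reverse=True)
    for place, (pages, title) in enumerate(ranked, start=1):
        print(f"{place}. {pages} - {title}")
```

Passing the fetch function in as a parameter keeps the continuation logic separate from the transport, which is the part the AWK program delegates to nc.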

Translation of: Perl
Library: nc
BEGIN {
for (i = 0; i <= 255; i++)
ord[sprintf("%c", i)] = i
}
 
# Encode string with application/x-www-form-urlencoded escapes.
function escape(str, c, len, res, i) {
len = length(str)
res = ""
for (i = 1; i <= len; i++) {
c = substr(str, i, 1);
if (c ~ /[-._*0-9A-Za-z]/)
res = res c
else if (c == " ")
res = res "+"
else
res = res "%" sprintf("%02X", ord[c])
}
return res
}
 
function nc_open(gcmcontinue, host, path) {
host = "rosettacode.org"
path = "/mw/api.php" \
"?action=query" \
"&generator=categorymembers" \
"&gcmtitle=Category:Programming%20Languages" \
"&gcmlimit=500" \
(gcmcontinue "" ? "&gcmcontinue=" escape(gcmcontinue) : "") \
"&prop=categoryinfo" \
"&format=txt"
 
nc = "printf 'GET %s HTTP/1.1\r\nHost: %s\r\n\r\n' '" \
path "' '" host "' | nc '" host "' 80"
}
 
function nc_next(out) {
# Read each line of the HTTP response.
while (nc | getline > 0) {
# Ignore all lines except
# [gcmcontinue], [title] and [pages].
if (index($0, "[gcmcontinue]")) {
# " [gcmcontinue] => BEFUNGE|"
sub(/^.*=> */, "")
out["gcmcontinue"] = $0
} else if (index($0, "[title]")) {
# " [title] => Category:AWK"
sub(/^.*Category:/, "")
out["title"] = $0
} else if(index($0, "[pages]")) {
# " [pages] => 129"
sub(/^.*=> */, "")
# Ignore " [pages] => Array".
if ($0 !~ /^[0-9]+/)
continue
 
# Force conversion to number, so AWK will do
# numeric comparisons, not string comparisons.
out["pages"] = $0 + 0
 
# Return now; [pages] came after [title].
return 1
}
}
 
if ("gcmcontinue" in out) {
close(nc)
nc_open(out["gcmcontinue"])
delete out["gcmcontinue"]
return nc_next(out)
} else
return 0
}
 
BEGIN {
nc_open()
while (nc_next(language)) {
title = language["title"] # "AWK"
pages = language["pages"] # 129
 
# Insert "129 - AWK" into rank[].
i = 1
while (i <= count && (rank[i] + 0) >= pages)
i++
for (j = count; j >= i; j--)
rank[j + 1] = rank[j]
rank[i] = pages " - " title
count++
}
 
for (i = 1; i <= count; i++)
print i ". " rank[i]
}
Output from 1 April 2010:
1. 492 - Tcl
2. 458 - Python
3. 454 - PicoLisp
4. 436 - J
5. 407 - C
...
49. 129 - AWK
...

BBC BASIC

Note that language names differing only in their case are merged.
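
The merging behaviour mentioned above can be sketched in Python. This is only an illustration: the BBC BASIC program merges through a case-insensitive lookup into a sorted array, whereas this sketch uses a dictionary keyed on the case-folded name. The function name merge_by_case and the sample names are made up.

```python
from collections import defaultdict


def merge_by_case(pairs):
    """Sum counts for names that differ only in letter case.

    The first spelling seen is kept as the display name (an arbitrary choice).
    Returns (total, name) pairs sorted by descending total, then by name.
    """
    totals = defaultdict(int)
    display = {}
    for name, count in pairs:
        key = name.casefold()
        display.setdefault(key, name)
        totals[key] += count
    merged = [(totals[key], display[key]) for key in totals]
    return sorted(merged, key=lambda p: (-p[0], p[1]))


if __name__ == "__main__":
    # Made-up data: "Foo" and "FOO" collapse into one entry.
    for total, name in merge_by_case([("Foo", 10), ("FOO", 2), ("Bar", 11)]):
        print(total, "-", name)
```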

      INSTALL @lib$+"SORTLIB"
SortUp% = FN_sortinit(0,0)  : REM Ascending
SortDown% = FN_sortinit(1,0) : REM Descending
 
VDU 23,22,640;512;8,16,16,128+8 : REM Enable UTF-8 support
DIM lang$(1000), tasks%(1000)
NORM_IGNORECASE = 1
 
SYS "LoadLibrary", "URLMON.DLL" TO urlmon%
SYS "GetProcAddress", urlmon%, "URLDownloadToFileA" TO UDTF
 
PRINT "Downloading languages list..."
url$ = "http://rosettacode.org/wiki/Category:Programming_Languages"
file$ = @tmp$ + "languages.htm"
SYS UDTF, 0, url$, file$, 0, 0 TO fail%
IF fail% ERROR 100, "File download failed (languages)"
 
file% = OPENIN(file$)
index% = 0
WHILE NOT EOF#file%
REPEAT
a$ = GET$#file%
IF INSTR(a$, "<a href=""/wiki/Category") = 0 EXIT REPEAT
i% = INSTR(a$, "</a>")
IF i% = 0 EXIT REPEAT
j% = i%
REPEAT i% -= 1 : UNTIL MID$(a$,i%,1) = ">" OR i% = 0
IF i% = 0 EXIT REPEAT
lang$(index%) = MID$(a$, i%+1, j%-i%-1)
IF lang$(index%) <> "Languages" index% += 1
UNTIL TRUE
ENDWHILE
CLOSE #file%
 
C% = index%
CALL SortUp%, lang$(0)
 
PRINT "Downloading categories list..."
url$ = "http://www.rosettacode.org/w/index.php"
url$ += "?title=Special:Categories&limit=5000"
file$ = @tmp$ + "categories.htm"
SYS UDTF, 0, url$, file$, 0, 0 TO fail%
IF fail% ERROR 100, "File download failed (categories)"
 
file% = OPENIN(file$)
WHILE NOT EOF#file%
REPEAT
a$ = GET$#file%
i% = INSTR(a$, "member")
IF i% = 0 EXIT REPEAT
REPEAT i% -= 1 : UNTIL MID$(a$,i%,1) = "(" OR i% = 0
IF i% = 0 EXIT REPEAT
tasks% = VAL(MID$(a$, i%+1))
IF tasks% = 0 EXIT REPEAT
REPEAT i% -= 1 : UNTIL MID$(a$,i%,1) = "<" OR i% = 0
IF i% = 0 EXIT REPEAT
j% = i%
REPEAT i% -= 1 : UNTIL MID$(a$,i%,1) = ">" OR i% = 0
IF i% = 0 EXIT REPEAT
k% = FNwhere(lang$(), MID$(a$, i%+1, j%-i%-1), index%-1)
IF k% <> -1 tasks%(k%) += tasks%
UNTIL TRUE
ENDWHILE
CLOSE #file%
 
CALL SortDown%, tasks%(0), lang$(0)
 
VDU 14
@% = 3 : REM Column width
PRINT "List of languages as of " TIME$
FOR i% = 0 TO index%-1
IF tasks%(i%) = 0 EXIT FOR
PRINT i%+1 ". " tasks%(i%) " - " lang$(i%)
NEXT
END
 
DEF FNwhere(a$(), S$, T%)
LOCAL B%, C%, H%
H% = 2
WHILE H%<T% H% *= 2:ENDWHILE
H% /= 2
REPEAT
IF (B%+H%)<=T% THEN
SYS "CompareString", 0, NORM_IGNORECASE, S$, -1, a$(B%+H%), -1 TO C%
IF C% >= 2 B% += H%
ENDIF
H% /= 2
UNTIL H%=0
SYS "CompareString", 0, NORM_IGNORECASE, S$, -1, a$(B%), -1 TO C%
IF C% = 2 THEN = B% ELSE = -1

Output:

Downloading languages list...
Downloading categories list...
List of languages as of Sat.17 Nov 2012,00:21:11
  1. 682 - Tcl
  2. 638 - Python
  3. 626 - PicoLisp
  4. 622 - C
  5. 592 - J
  6. 581 - Go
  7. 570 - Ruby
  8. 553 - Ada
  9. 515 - Perl
 10. 514 - D
 11. 507 - Haskell
 12. 490 - Perl 6
 13. 489 - BBC BASIC
 14. 477 - Java
 15. 473 - Mathematica
 16. 469 - PureBasic
 17. 469 - OCaml
 18. 459 - Unicon
 19. 438 - REXX
 20. 428 - Icon
......
461.   1 - ScriptBasic
462.   1 - Qore
463.   1 - Opa
464.   1 - Nickle
465.   1 - Neko
466.   1 - Neat
467.   1 - MEL
468.   1 - MAPPER
469.   1 - Kotlin
470.   1 - Chapel

C

A quick-and-dirty parser that fetches the pages by shelling out to wget.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
 
const char * lang_url = "http://www.rosettacode.org/w/api.php?action=query&"
"list=categorymembers&cmtitle=Category:Programming_Languages&"
"cmlimit=500&format=json";
const char * cat_url = "http://www.rosettacode.org/w/index.php?title=Special:Categories&limit=5000";
 
#define BLOCK 1024
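char *get_page(const char *url)
{
    char cmd[1024];
    char *buf = 0, *tmp;
    size_t len = 0, bytes_read;

    snprintf(cmd, sizeof cmd, "wget -q \"%s\" -O -", url);
    FILE *fp = popen(cmd, "r");
    if (!fp) return 0;
    for (;;) {
        /* grow by one block, keeping room for the terminating NUL;
           write through an offset because realloc may move the buffer */
        tmp = realloc(buf, len + BLOCK + 1);
        if (!tmp) { free(buf); pclose(fp); return 0; }
        buf = tmp;
        bytes_read = fread(buf + len, 1, BLOCK, fp);
        if (bytes_read == 0) break;
        len += bytes_read;
    }
    pclose(fp);
    buf[len] = '\0'; /* terminate directly after the last byte read */
    return buf;
}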
 
char ** get_langs(char *buf, int *l)
{
char **arr = 0;
for (*l = 0; (buf = strstr(buf, "Category:")) && (buf += 9); ++*l)
for ( (*l)[arr = realloc(arr, sizeof(char*)*(1 + *l))] = buf;
*buf != '"' || (*buf++ = 0);
buf++);
 
return arr;
}
 
typedef struct { const char *name; int count; } cnt_t;
cnt_t * get_cats(char *buf, char ** langs, int len, int *ret_len)
{
char str[1024], *found;
cnt_t *list = 0;
int i, llen = 0;
for (i = 0; i < len; i++) {
sprintf(str, "/wiki/Category:%s", langs[i]);
if (!(found = strstr(buf, str))) continue;
buf = found + strlen(str);
 
if (!(found = strstr(buf, "</a> ("))) continue;
list = realloc(list, sizeof(cnt_t) * ++llen);
list[llen - 1].name = langs[i];
list[llen - 1].count = strtol(found + 6, 0, 10);
}
*ret_len = llen;
return list;
}
 
int _scmp(const void *a, const void *b)
{
int x = ((const cnt_t*)a)->count, y = ((const cnt_t*)b)->count;
return x < y ? -1 : x > y;
}
 
int main()
{
int len, clen;
char ** langs = get_langs(get_page(lang_url), &len);
cnt_t *cats = get_cats(get_page(cat_url), langs, len, &clen);
qsort(cats, clen, sizeof(cnt_t), _scmp);
while (--clen >= 0)
printf("%4d %s\n", cats[clen].count, cats[clen].name);
 
return 0;
}
Output:
 563 Tcl
 529 PicoLisp
 522 Python
 504 C
 500 J
 442 Go
 440 Ruby
 435 Ada
 430 PureBasic
 427 Perl
...

C#

Sorting only programming languages.
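
The idea of ranking only programming languages can be sketched in Python: take the names listed under Category:Programming_Languages and keep only the category counts whose name is in that list. The function name only_languages is hypothetical, the Tcl and Python counts are borrowed from the sample output of this section, and the count for "Programming Tasks" is made up.

```python
def only_languages(category_counts, language_names):
    """Keep categories that are languages; rank by descending count, then name."""
    wanted = set(language_names)
    kept = [(count, name)
            for name, count in category_counts.items()
            if name in wanted]
    return sorted(kept, key=lambda p: (-p[0], p[1]))


# Category name -> member count, as scraped from Special:Categories.
CATEGORY_COUNTS = {
    "Tcl": 397,
    "Python": 368,
    "Programming Tasks": 450,  # made-up count; not a language, so it is dropped
}

# Names as listed under Category:Programming_Languages.
LANGUAGES = ["Python", "Ruby", "Tcl"]

if __name__ == "__main__":
    for place, (count, name) in enumerate(only_languages(CATEGORY_COUNTS, LANGUAGES), 1):
        print(f"{place:3}. {count:3} - {name}")
```
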

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text.RegularExpressions;
 
class Program
{
static void Main(string[] args)
{
string get1 = new WebClient().DownloadString("http://www.rosettacode.org/w/api.php?action=query&list=categorymembers&cmtitle=Category:Programming_Languages&cmlimit=500&format=json");
string get2 = new WebClient().DownloadString("http://www.rosettacode.org/w/index.php?title=Special:Categories&limit=5000");
 
ArrayList langs = new ArrayList();
Dictionary<string, int> qtdmbr = new Dictionary<string, int>();
 
MatchCollection match1 = new Regex("\"title\":\"Category:(.+?)\"").Matches(get1);
MatchCollection match2 = new Regex("title=\"Category:(.+?)\">.+?</a>[^(]*\\((\\d+) members\\)").Matches(get2);
 
foreach (Match lang in match1) langs.Add(lang.Groups[1].Value);
 
foreach (Match match in match2)
{
if (langs.Contains(match.Groups[1].Value))
{
qtdmbr.Add(match.Groups[1].Value, Int32.Parse(match.Groups[2].Value));
}
}
 
string[] test = qtdmbr.OrderByDescending(x => x.Value).Select(x => String.Format("{0,3} - {1}", x.Value, x.Key)).ToArray();
 
int count = 1;
 
foreach (string i in test)
{
Console.WriteLine("{0,3}. {1}", count, i);
count++;
}
}
}
Output (as of May 30, 2010):
 1. 397 - Tcl
 2. 368 - Python
 3. 350 - Ruby
 4. 333 - J
 5. 332 - C
 6. 322 - Haskell
 7. 322 - OCaml
 8. 302 - Perl
 9. 290 - Common Lisp
10. 289 - AutoHotkey
    . . .

Object-oriented solution

using System;
using System.Net;
using System.Linq;
using System.Text.RegularExpressions;
using System.Collections.Generic;
 
class Category {
private string _title;
private int _members;
 
public Category(string title, int members) {
_title = title;
_members = members;
}
 
public string Title {
get {
return _title;
}
}
 
public int Members {
get {
return _members;
}
}
}
 
class Program {
static void Main(string[] args) {
string get1 = new WebClient().DownloadString("http://www.rosettacode.org/w/api.php?action=query&list=categorymembers&cmtitle=Category:Programming_Languages&cmlimit=500&format=json");
string get2 = new WebClient().DownloadString("http://www.rosettacode.org/w/index.php?title=Special:Categories&limit=5000");
 
MatchCollection match1 = new Regex("\"title\":\"Category:(.+?)\"").Matches(get1);
MatchCollection match2 = new Regex("title=\"Category:(.+?)\">.+?</a>[^(]*\\((\\d+) members\\)").Matches(get2);
 
string[] valids = match1.Cast<Match>().Select(x => x.Groups[1].Value).ToArray();
List<Category> langs = new List<Category>();
 
foreach (Match match in match2) {
string category = match.Groups[1].Value;
int members = Int32.Parse(match.Groups[2].Value);
 
if (valids.Contains(category)) langs.Add(new Category(category, members));
}
 
langs = langs.OrderByDescending(x => x.Members).ToList();
int count = 1;
 
foreach (Category i in langs) {
Console.WriteLine("{0,3}. {1,3} - {2}", count, i.Members, i.Title);
count++;
}
}
}

C++

Library: Boost

Built with g++ under Linux, using g++ -lboost_thread -lboost_system -lboost_regex:

#include <string>
#include <boost/regex.hpp>
#include <boost/asio.hpp>
#include <vector>
#include <utility>
#include <iostream>
#include <sstream>
#include <cstdlib>
#include <algorithm>
#include <iomanip>
 
struct Sort { //sorting programming languages according to frequency
bool operator( ) ( const std::pair<std::string,int> & a , const std::pair<std::string,int> & b )
const {
return a.second > b.second ;
}
} ;
 
int main( ) {
try {
//setting up an io service , with templated subelements for resolver and query
boost::asio::io_service io_service ;
boost::asio::ip::tcp::resolver resolver ( io_service ) ;
boost::asio::ip::tcp::resolver::query query ( "rosettacode.org" , "http" ) ;
boost::asio::ip::tcp::resolver::iterator endpoint_iterator = resolver.resolve( query ) ;
boost::asio::ip::tcp::resolver::iterator end ;
boost::asio::ip::tcp::socket socket( io_service ) ;
boost::system::error_code error = boost::asio::error::host_not_found ;
//looking for an endpoint the socket will be able to connect to
while ( error && endpoint_iterator != end ) {
socket.close( ) ;
socket.connect( *endpoint_iterator++ , error ) ;
}
if ( error )
throw boost::system::system_error ( error ) ;
//we send a request
boost::asio::streambuf request ;
std::ostream request_stream( &request ) ;
request_stream << "GET " << "/mw/index.php?title=Special:Categories&limit=5000" << " HTTP/1.0\r\n" ;
request_stream << "Host: " << "rosettacode.org" << "\r\n" ;
request_stream << "Accept: */*\r\n" ;
request_stream << "Connection: close\r\n\r\n" ;
//send the request
boost::asio::write( socket , request ) ;
//we receive the response analyzing every line and storing the programming language
boost::asio::streambuf response ;
std::istream response_stream ( &response ) ;
boost::asio::read_until( socket , response , "\r\n\r\n" ) ;
boost::regex e( "<li><a href=\"[^<>]+?\">([a-zA-Z\\+#1-9]+?)</a>\\s?\\((\\d+) members\\)</li>" ) ;
//using the wrong regex produces incorrect sorting!!
std::ostringstream line ;
std::vector<std::pair<std::string , int> > languages ; //holds language and number of examples
boost::smatch matches ;
while ( boost::asio::read( socket , response , boost::asio::transfer_at_least( 1 ) , error ) ) {
line << &response ;
if ( boost::regex_search( line.str( ) , matches , e ) ) {
std::string lang( matches[2].first , matches[2].second ) ;
int zahl = atoi ( lang.c_str( ) ) ;
languages.push_back( std::make_pair( matches[ 1 ] , zahl ) ) ;
}
line.str( "") ;//we have to erase the string buffer for the next read
}
if ( error != boost::asio::error::eof )
throw boost::system::system_error( error ) ;
//we sort the vector entries , see the struct above
std::sort( languages.begin( ) , languages.end( ) , Sort( ) ) ;
int n = 1 ;
for ( std::vector<std::pair<std::string , int> >::const_iterator spi = languages.begin( ) ;
spi != languages.end( ) ; ++spi ) {
std::cout << std::setw( 3 ) << std::right << n << '.' << std::setw( 4 ) << std::right <<
spi->second << " - " << spi->first << '\n' ;
n++ ;
}
} catch ( std::exception &ex ) {
std::cout << "Exception: " << ex.what( ) << '\n' ;
}
return 0 ;
}
Sample output (just the "top ten"):
 1. 367 - Tcl
 2. 334 - Python
 3. 319 - Ruby
 4. 286 - C
 5. 277 - Perl
 6. 272 - OCaml
 7. 264 - Ada
 8. 241 - E
 9. 239 - AutoHotkey
10. 193 - Forth

Caché ObjectScript

Class Utils.Net.RosettaCode [ Abstract ]
{
 
ClassMethod GetTopLanguages(pHost As %String = "", pPath As %String = "", pTop As %Integer = 10) As %Status
{
// check input parameters
If $Match(pHost, "^([a-zA-Z0-9]([a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,6}$")=0 {
Quit $$$ERROR($$$GeneralError, "Invalid host name.")
}
 
// create http request and get page
Set req=##class(%Net.HttpRequest).%New()
Set req.Server=pHost
Do req.Get(pPath)
 
// create xml stream with doc type
Set xml=##class(%Stream.GlobalCharacter).%New()
Set sc=xml.WriteLine("<!DOCTYPE doc_type [")
Set sc=xml.WriteLine($Char(9)_"<!ENTITY nbsp '&#160;'>")
Set sc=xml.WriteLine($Char(9)_"<!ENTITY amp '&#38;'>")
Set sc=xml.WriteLine("]>")
 
// copy xhtml stream to xml stream
Set xhtml=req.HttpResponse.Data
Set xhtml.LineTerminator=$Char(10)
While 'xhtml.AtEnd {
Set line=xhtml.ReadLine()
If line["!DOCTYPE" Continue
If line["<g:plusone></g:plusone>" {
Continue
Set line="<g:plusone xmlns:g='http://base.google.com/ns/1.0'></g:plusone>"
}
Set sc=xml.WriteLine(line)
}
 
// create an instance of an %XML.XPATH.Document
Set sc=##class(%XML.XPATH.Document).CreateFromStream(xml, .xdoc)
If $$$ISERR(sc) Quit sc
 
// evaluate following 'XPath' expression
Set sc=xdoc.EvaluateExpression("//div[@id='bodyContent']//li", "a[contains(@href, '/Category:')]/ancestor::li", .res)
 
// iterate through list elements
Set array=##class(%ArrayOfDataTypes).%New()
Do {
Set dom=res.GetNext(.key)
If '$IsObject(dom) Quit
 
// get language name and members
Set lang=""
While dom.Read() {
If 'dom.HasValue Continue
If lang="" {
If $Locate(dom.Value, "User|Tasks|Omit|attention|operations|Solutions by") Quit
Set lang=dom.Value Continue
}
If dom.Value["members" {
Set members=+$ZStrip(dom.Value, "<>P")
Set list=array.GetAt(members)
Set $List(list, $ListLength(list)+1)=lang
Set sc=array.SetAt(list, members)
Quit
}
}
} While key'=""
If array.Count()=0 Quit $$$ERROR($$$GeneralError, "No languages found.")
 
// show top entries
Write "Top "_pTop_" Languages (as at "_$ZDate($HoroLog, 2)_"):", !
For count=1:1:pTop {
Set members=array.GetPrevious(.key)
If key="" Quit
Write $Justify(count, 3), ". ", key, " - ", $ListToString(members, ", "), !
}
 
// finished
Quit $$$OK
}
 
}
Example:
USER>Do ##class(Utils.Net.RosettaCode).GetTopLanguages("www.rosettacode.org", "/mw/index.php?title=Special:Categories&limit=5000")
Top 10 Languages (as at 21 Apr 2013):
  1. 728 - Tcl
  2. 668 - Python
  3. 654 - C
  4. 630 - J
  5. 626 - PicoLisp
  6. 595 - D
  7. 590 - Ruby
  8. 589 - Go
  9. 576 - Perl 6
 10. 567 - Ada

D

import std.stdio, std.algorithm, std.conv, std.array, std.regex,
std.typecons, std.net.curl;
 
void main() {
// Get a list of just the programming languages.
immutable r1 = `"title":"Category:([^"]+)"`;
const languages = get("www.rosettacode.org/w/api.php?action=query"~
"&list=categorymembers&cmtitle=Category:Pro"~
"gramming_Languages&cmlimit=500&format=json")
.match(r1.regex("g")).map!q{a[1].dup}.array;
 
// Get a pagecount for all categories.
immutable r2 = `title="Category:([^"]+)">[^<]+` ~
`</a>[^(]+\((\d+) members\)`;
const pairs = get("www.rosettacode.org/w/index.php?" ~
"title=Special:Categories&limit=5000")
.match(r2.regex("g"))
.filter!(m => languages.canFind(m[1]))
.map!(m => tuple(m[2].to!uint, m[1].dup))
.array.sort!q{a > b}.release;
 
foreach (i, res; pairs)
writefln("%3d. %3d - %s", i + 1, res[]);
}
Sample output (top twenty as of 2013-01-24):
  1. 717 - Tcl
  2. 663 - Python
  3. 643 - C
  4. 626 - PicoLisp
  5. 622 - J
  6. 587 - Go
  7. 587 - Ruby
  8. 585 - D
  9. 568 - Perl 6
 10. 564 - Ada
 11. 554 - Mathematica
 12. 535 - Perl
 13. 532 - Haskell
 14. 514 - BBC BASIC
 15. 505 - REXX
 16. 491 - Java
 17. 478 - OCaml
 18. 469 - PureBasic
 19. 462 - Unicon
 20. 430 - AutoHotkey

Erlang

 
-module( rank_languages_by_popularity ).
 
-export( [task/0] ).
 
-record( print_fold, {place=0, place_step=1, previous_count=0} ).
 
task() ->
ok = find_unimplemented_tasks:init(),
Category_programming_languages = find_unimplemented_tasks:rosetta_code_list_of( "Programming_Languages" ),
Programming_languages = [X || "Category:" ++ X <- Category_programming_languages],
{ok, {{_HTTP,200,"OK"}, _Headers, Body}} = httpc:request( "http://rosettacode.org/mw/index.php?title=Special:Categories&limit=5000" ),
Count_categories = lists:sort( [{Y, X} || {X, Y} <- category_counts(Body, []), lists:member(X, Programming_languages)] ),
lists:foldr( fun place_count_category_write/2, #print_fold{}, Count_categories ).
 
 
 
category_counts( "", [[] | Acc] ) -> Acc;
category_counts( String, Acc ) ->
{Begin, End} = category_count_begin_end( String ),
{Category_count, String_continuation} = category_count_extract( String, Begin, End ),
category_counts( String_continuation, [Category_count | Acc] ).
 
category_count_begin_end( String ) ->
Begin = string:str( String, "/wiki/Category:" ),
End = string:str( string:substr(String, Begin), " member" ),
category_count_begin_end( Begin, End, erlang:length(" member") ).
 
category_count_begin_end( _Begin, 0, _End_length ) -> {0, 0};
category_count_begin_end( Begin, End, End_length ) ->
{Begin, Begin + End + End_length}.
 
category_count_extract( _String, 0, _End ) -> {[], ""};
category_count_extract( String, Begin, End ) ->
Category_count = category_count_extract( string:substr(String, Begin, End - Begin) ),
{Category_count, string:substr( String, End + 1 )}.
 
category_count_extract( "/wiki/Category:" ++ T ) ->
Category_member = string:tokens( T, " " ),
Category = category_count_extract_category( Category_member ),
Member = category_count_extract_count( lists:reverse(Category_member) ),
{Category, Member}.
 
category_count_extract_category( [Category | _T] ) ->
lists:map( fun category_count_extract_category_map/1, string:strip(Category, right, $") ).
 
category_count_extract_category_map( $_ ) -> $\s;
category_count_extract_category_map( Character ) -> Character.
 
category_count_extract_count( ["member" ++ _, "(" ++ N | _T] ) -> erlang:list_to_integer( N );
category_count_extract_count( _T ) -> 0.
 
place_count_category_write( {Count, Category}, Acc ) ->
Print_fold = place_count_category_write( Count, Acc ),
io:fwrite( "~p. ~p - ~p~n", [Print_fold#print_fold.place, Count, Category] ),
Print_fold;
 
place_count_category_write( Count, #print_fold{place_step=Place_step, previous_count=Count}=Print_fold ) ->
Print_fold#print_fold{place_step=Place_step + 1};
place_count_category_write( Count, #print_fold{place=Place, place_step=Place_step} ) ->
#print_fold{place=Place + Place_step, previous_count=Count}.
Sample output (top/last ten as of 2013-05-27):
1. 741 - "Tcl"
2. 676 - "Python"
3. 660 - "C"
4. 638 - "J"
5. 627 - "PicoLisp"
6. 609 - "Perl 6"
6. 609 - "D"
8. 607 - "Racket"
9. 592 - "Ruby"
10. 589 - "Go"
...
454. 1 - "Opa"
454. 1 - "Nickle"
454. 1 - "NewtonScript"
454. 1 - "Neko"
454. 1 - "Neat"
454. 1 - "MEL"
454. 1 - "MAPPER"
454. 1 - "LiveScript"
454. 1 - "Kotlin"
454. 1 - "Jacquard Loom"
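
The sample output above gives tied languages the same place and then skips ahead (6, 6, 8). A minimal Python sketch of that numbering scheme follows. It is an independent illustration, not a translation of the Erlang fold, and the function name competition_rank is hypothetical.

```python
def competition_rank(sorted_counts):
    """Yield (place, count); equal counts share a place and the next place skips.

    `sorted_counts` must already be in descending order.
    """
    place = 0
    previous = None
    for index, count in enumerate(sorted_counts, start=1):
        if count != previous:
            place = index  # a new count takes its 1-based position as its place
            previous = count
        yield place, count


if __name__ == "__main__":
    # The first eight counts from the sample output above.
    for place, count in competition_rank([741, 676, 660, 638, 627, 609, 609, 607]):
        print(f"{place}. {count}")
```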

F#

open System
open System.Text.RegularExpressions

[<EntryPoint>]
let main argv =
    let rosettacodeSpecialCategoriesAddress =
        "http://www.rosettacode.org/mw/index.php?title=Special:Categories&limit=5000"
    let rosettacodeProgrammingLaguagesAddress =
        "http://rosettacode.org/wiki/Category:Programming_Languages"

    let getWebContent (url :string) =
        using (new System.Net.WebClient()) (fun x -> x.DownloadString url)

    let regexForTitleCategoryFollowedOptionallyByMembercount =
        new Regex("""
            title="Category: (?<Name> [^"]* ) "> # capture the name of the category
            (                  # group begin for optional part
                [^(]*          # ignore up to next open paren (on this line)
                \(             # verbatim open paren
                    (?<Number>
                        \d+    # a number (= some digits)
                    )
                    \s+        # whitespace
                    member(s?) # verbatim text members (maybe singular)
                \)             # verbatim closing paren
            )?                 # end of optional part
            """, // " <- Make syntax highlighting happy
            RegexOptions.IgnorePatternWhitespace ||| RegexOptions.ExplicitCapture)
    let matchesForTitleCategoryFollowedOptionallyByMembercount str =
        regexForTitleCategoryFollowedOptionallyByMembercount.Matches(str)

    let languages =
        matchesForTitleCategoryFollowedOptionallyByMembercount
            (getWebContent rosettacodeProgrammingLaguagesAddress)
        |> Seq.cast
        |> Seq.map (fun (m: Match) -> (m.Groups.Item("Name").Value, true))
        |> Map.ofSeq

    let entriesWithCount =
        let parse str = match Int32.TryParse(str) with | (true, n) -> n | (false, _) -> -1
        matchesForTitleCategoryFollowedOptionallyByMembercount
            (getWebContent rosettacodeSpecialCategoriesAddress)
        |> Seq.cast
        |> Seq.map (fun (m: Match) ->
            (m.Groups.Item("Name").Value, parse (m.Groups.Item("Number").Value)))
        |> Seq.filter (fun p -> (snd p) > 0 && Map.containsKey (fst p) languages)
        |> Seq.sortBy (fun x -> -(snd x))


    Seq.iter2 (fun i x -> printfn "%4d. %s" i x)
        (seq { 1 .. 20 })
        (entriesWithCount |> Seq.map (fun x -> sprintf "%3d - %s" (snd x) (fst x)))
    0

Showing top 20 as of 2013-04-02

   1. 721 - Tcl
   2. 665 - Python
   3. 647 - C
   4. 626 - PicoLisp
   5. 622 - J
   6. 588 - Go
   7. 588 - Ruby
   8. 585 - D
   9. 569 - Perl 6
  10. 565 - Ada
  11. 555 - Mathematica
  12. 535 - Perl
  13. 533 - Haskell
  14. 514 - BBC BASIC
  15. 505 - REXX
  16. 491 - Java
  17. 480 - OCaml
  18. 469 - PureBasic
  19. 462 - Unicon
  20. 430 - AutoHotkey

Go

package main
 
import (
"encoding/xml"
"fmt"
"io"
"io/ioutil"
"net/http"
"net/url"
"regexp"
"sort"
"strconv"
"strings"
)
 
var baseQuery = "http://rosettacode.org/mw/api.php?action=query" +
"&format=xml&list=categorymembers&cmlimit=500"
 
func req(u string, foundCm func(string)) string {
resp, err := http.Get(u)
if err != nil {
fmt.Println(err) // connection or request fail
return ""
}
defer resp.Body.Close()
for p := xml.NewDecoder(resp.Body); ; {
t, err := p.RawToken()
switch s, ok := t.(xml.StartElement); {
case err == io.EOF:
return ""
case err != nil:
fmt.Println(err)
return ""
case !ok:
continue
case s.Name.Local == "cm":
for _, a := range s.Attr {
if a.Name.Local == "title" {
foundCm(a.Value)
}
}
case s.Name.Local == "categorymembers" && len(s.Attr) > 0 &&
s.Attr[0].Name.Local == "cmcontinue":
return url.QueryEscape(s.Attr[0].Value)
}
}
return ""
}
 
// satisfy sort interface (reverse sorting)
type pop struct {
string
int
}
type popList []pop
 
func (pl popList) Len() int { return len(pl) }
func (pl popList) Swap(i, j int) { pl[i], pl[j] = pl[j], pl[i] }
func (pl popList) Less(i, j int) bool {
switch d := pl[i].int - pl[j].int; {
case d > 0:
return true
case d < 0:
return false
}
return pl[i].string < pl[j].string
}
 
func main() {
// get languages, store in a map
langMap := make(map[string]bool)
storeLang := func(cm string) {
if strings.HasPrefix(cm, "Category:") {
cm = cm[9:]
}
langMap[cm] = true
}
languageQuery := baseQuery + "&cmtitle=Category:Programming_Languages"
continueAt := req(languageQuery, storeLang)
for continueAt > "" {
continueAt = req(languageQuery+"&cmcontinue="+continueAt, storeLang)
}
// allocate slice for sorting
s := make(popList, 0, len(langMap))
 
// get big list of categories
resp, err := http.Get("http://rosettacode.org/mw/index.php" +
"?title=Special:Categories&limit=5000")
var page []byte
if err == nil {
page, err = ioutil.ReadAll(resp.Body)
resp.Body.Close()
}
if err != nil {
fmt.Println(err)
return
}
// split out fields of interest and populate sortable slice
rx := regexp.MustCompile("<li><a.*>(.*)</a>.*[(]([0-9]+) member")
for _, sm := range rx.FindAllSubmatch(page, -1) {
ls := string(sm[1])
if langMap[ls] {
if n, err := strconv.Atoi(string(sm[2])); err == nil {
s = append(s, pop{ls, n})
}
}
}
 
// output
sort.Sort(s)
for i, lang := range s {
fmt.Printf("%3d. %3d - %s\n", i+1, lang.int, lang.string)
}
}
Output on 2 Nov 2011:
  1. 607 - Tcl
  2. 577 - PicoLisp
  3. 554 - C
  4. 554 - Python
  5. 537 - J
...
406.   3 - XBase
407.   3 - Yacas
408.   3 - Z80 Assembly
409.   3 - Zonnon
410.   3 - ΜC++

(That's μC++ there alphabetized at the end.)
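
The placement follows from ordinary string comparison: the category title starts with the Greek capital letter mu (U+039C) rather than a Latin "M", and that code point compares greater than every ASCII letter. Go compares strings byte-wise, which for UTF-8 gives the same order as comparing code points. A minimal Python 3 sketch of the same effect, using a few names from the tail of the listing above as sample data:

```python
# Sample names from the end of the listing above. "\u039c" is
# GREEK CAPITAL LETTER MU, written as an escape so it cannot be
# confused with the Latin letter M.
names = ["\u039cC++", "Zonnon", "Z80 Assembly", "Yacas"]

# Python compares strings code point by code point, so every name that
# starts with an ASCII letter sorts ahead of the Greek-initial one.
ordered = sorted(names)

print(ordered)
print(hex(ord(ordered[-1][0])))   # code point of the last name's first letter
```

A locale-aware collation would place the name differently; that is outside what this sketch shows.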

Haskell

import Network.Browser
import Network.HTTP
import Network.URI
import Data.List
import Data.Maybe
import Text.XML.Light
import Control.Arrow
import Data.Ord
 
getRespons url = do
    rsp <- Network.Browser.browse $ do
      setAllowRedirects True
      setOutHandler $ const (return ()) -- quiet
      request $ getRequest url
    return $ rspBody $ snd rsp


mostPopLang = do
  rsp <-getRespons $ "http://www.rosettacode.org/w/api.php?action=query&list=" ++
                    "categorymembers&cmtitle=Category:Programming_Languages&cmlimit=500&format=xml"
  mbrs <- getRespons "http://www.rosettacode.org/w/index.php?title=Special:Categories&limit=5000"
  let xmls = onlyElems $ parseXML rsp
      langs = concatMap (map ((\\"Category:"). fromJust.findAttr (unqual "title")). filterElementsName (== unqual "cm")) xmls

  let catMbr = second (read.takeWhile(/=' '). drop 6). break (=='<'). drop 1. dropWhile(/='>') . drop 5
      catNmbs :: [(String, Int)]
      catNmbs = map catMbr $ filter (isPrefixOf "<li>") $ lines mbrs
      printFmt (n,(l,m)) = putStrLn $ take 6 (show n ++ ". ") ++ (show m) ++ " " ++ l
      toMaybe (a,b) =
        case b of
          Just x -> Just (a,x)
          _ -> Nothing

  mapM_ printFmt $ zip [1..] $ sortBy (flip (comparing snd))
    $ mapMaybe (toMaybe. (id &&& flip lookup catNmbs)) langs
First 20:
*Main> mostPopLang
1.    421  Tcl
2.    392  Python
3.    365  PicoLisp
4.    363  J
5.    360  Ruby
6.    354  C
7.    344  Haskell
8.    337  OCaml
9.    316  Perl
10.   308  PureBasic
11.   302  AutoHotkey
12.   299  Common Lisp
13.   295  D
14.   295  Java
15.   293  Ada
16.   278  Oz
17.   260  R
18.   259  C sharp
19.   257  C++
20.   255  ALGOL 68

HicEst

CHARACTER cats*50000, catlist*50000, sortedCat*50000, sample*100
DIMENSION RankNr(1)
 
READ(ClipBoard) cats
catlist = ' '
pos = 1 ! find language entries like * 100 doors (2 members)
nr = 0
! after next '*' find next "name" = '100 doors' and next "(...)" = '(2 members)' :
1 EDIT(Text=cats, SetPos=pos, Right='*', R, Mark1, R='(', Left, M2, Parse=name, R=2, P=members, GetPos=pos)
IF(pos > 0) THEN
READ(Text=members) count
IF(count > 0) THEN
nr = nr + 1
WRITE(Text=catlist, Format='i4, 1x, 2a', APPend) count, name, ';'
ENDIF
GOTO 1 ! no WHILE in HicEst
ENDIF ! catlist is now = " 1 ... User ; 2 100 doors ; 3 3D ; 8 4D ; ..."
 
ALLOCATE(RankNr, nr)
EDIT(Text=catlist, SePaRators=';', Option=1+4, SorTtoIndex=RankNr) ! case (1) and back (4)
 
sortedCat = ' ' ! get the sorted list in the sequence of RankNr:
ok = 0
DO i = 1, nr
EDIT(Text=catlist, SePaRators=';', ITeM=RankNr(i), CoPyto=sample)
discard = EDIT(Text=sample, LeXicon='user,attention,solutions,tasks,program,language,implementation,')
IF(discard == 0) THEN ! removes many of the non-language entries
ok = ok + 1
WRITE(Text=sortedCat, APPend, Format='F5.0, 2A') ok, TRIM(sample), $CRLF
ENDIF
ENDDO
DLG(Text=sortedCat, Format=$CRLF)
END
2010-04-24 18:31
Top 10 entries (not all are languages)
1. 394 Tcl
2. 363 Python
3. 346 Ruby
4. 328 J
5. 319 C
6. 317 OCaml
7. 315 Haskell
8. 298 Perl
9. 288 WikiStubs
10. 281 Common Lisp

J

Solution:
require 'web/gethttp xml/sax/x2j regex'
 
x2jclass 'rcPopLang'
 
rx =: (<0 1) {:: (2#a:) ,~ rxmatches rxfrom ]
 
'Popular Languages' x2jDefn
/  := langs  : langs =: 0 2 $ a:
html/body/div/div/div/ul/li  := langs =: langs ,^:(a:~:{.@[)~ lang ; ' \((\d+) members?\)' rx y
html/body/div/div/div/ul/li/a := lang =: '^\s*((?:.(?!User|Tasks|Omit|attention|operations|by))+)\s*$' rx y
)
 
cocurrent'base'
 
sortTab =. \: __ ". [: ;:^:_1: {:"1
formatTab =: [: ;:^:_1: [: (20 A. (<'-') , |. , [: ('.' <"1@:,.~ ":) 1 + 1 i.@,~ 1{$)&.|: sortTab f.
 
rcPopLangs =: formatTab@:process_rcPopLang_@:gethttp
Example:
   10 {. rcPopLangs 'http://www.rosettacode.org/w/index.php?title=Special:Categories&limit=2000'
1. 687 - Tcl
2. 646 - Python
3. 637 - C
4. 626 - PicoLisp
5. 612 - J
6. 587 - Go
7. 556 - Ada
8. 550 - D
9. 549 - Mathematica
10. 526 - Perl

Notes: See some notes on the J solution.

Mathematica

Languages = Flatten[Import["http://rosettacode.org/wiki/Category:Programming_Languages","Data"][[1,1]]];
Languages = Most@StringReplace[Languages, {" " -> "_", "+" -> "%2B"}];
b = {#, If[# === {}, 0, #[[1]]]&@( StringCases[Import["http://rosettacode.org/wiki/Category:"<>#,"Plaintext"],
"category, out of " ~~ x:NumberString ~~ " total" ->x])} &/@ Languages;
For[i = 1, i < Length@b , i++ , Print[i, ". ", #[[2]], " - ", #[[1]] ]&@ Part[Reverse@SortBy[b, Last], i]]
Output 
As of 29 February 2012:
1. 637 - Tcl
2. 576 - C
3. 558 - J
4. 538 - Go
5. 485 - Ada
6. 456 - D
7. 450 - Haskell
8. 441 - Mathematica
9. 432 - Java
10. 425 - Icon
...

Oz

Library: OzHttpClient

Using web scraping. Does not filter non-language categories.

declare
[HTTPClient] = {Module.link ['x-ozlib://mesaros/net/HTTPClient.ozf']}
[Regex] = {Module.link ['x-oz://contrib/regex']}
 
fun {GetPage RawUrl}
Client = {New HTTPClient.urlGET init(inPrms(toFile:false toStrm:true) _)}
Url = {VirtualString.toString RawUrl}
OutParams
HttpResponseParams
in
{Client getService(Url ?OutParams ?HttpResponseParams)}
{Client closeAll(true)}
OutParams.sOut
end
 
fun {GetCategories Doc}
{Map {Regex.allMatches "<li><a[^>]+>([^<]+)</a> \\(([0-9]+) member" Doc}
fun {$ Match}
Category = {Regex.group 1 Match Doc}
Count = {String.toInt {ByteString.toString {Regex.group 2 Match Doc}}}
in
Category#Count
end
}
end
 
Url = "http://www.rosettacode.org/mw/index.php?title=Special:Categories&limit=5000"
 
{System.showInfo "Retrieving..."}
Doc = {GetPage Url}
 
{System.showInfo "Parsing..."}
Cs = {GetCategories Doc}
in
for
Cat#Count in {Sort Cs fun {$ _#C1 _#C2} C1 > C2 end}
I in 1..20
do
{System.showInfo I#". "#Count#" - "#Cat}
end
Output:
1. 371 - Tcl
2. 369 - Programming Tasks
3. 338 - Python
4. 324 - Ruby
5. 306 - Haskell
...
17. 225 - Oz
18. 214 - C++
19. 209 - JavaScript
20. 208 - ALGOL 68
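
The scraping step can be tried offline. The following Python 3 sketch applies the same regular expression as the Oz code to a hand-written HTML fragment; the fragment's markup is an assumption about the shape of the Special:Categories page (the real page may differ), and the "Empty" category is made up. Like the Oz entry, it does not filter non-language categories.

```python
import re

# Hand-written sample in the shape the regular expression expects.
# The first three counts come from the Oz output above; "Empty" is invented.
sample = """
<li><a href="/wiki/Category:Oz" title="Category:Oz">Oz</a> (225 members)</li>
<li><a href="/wiki/Category:Tcl" title="Category:Tcl">Tcl</a> (371 members)</li>
<li><a href="/wiki/Category:Programming_Tasks" title="Category:Programming Tasks">Programming Tasks</a> (369 members)</li>
<li><a href="/wiki/Category:Empty" title="Category:Empty">Empty</a> (1 member)</li>
"""

# Same pattern as in the Oz code: the link text, then the digits that
# precede " member" (so both "member" and "members" match).
pattern = re.compile(r"<li><a[^>]+>([^<]+)</a> \(([0-9]+) member")


def categories(doc):
    """Return (name, count) pairs in document order."""
    return [(name, int(count)) for name, count in pattern.findall(doc)]


def ranked(pairs):
    """Sort by count, largest first."""
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


for place, (name, count) in enumerate(ranked(categories(sample)), start=1):
    print(f"{place}. {count} - {name}")
```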

Perl

Sorting only programming languages.

use MediaWiki::API;
my $api = new MediaWiki::API({api_url => 'http://rosettacode.org/mw/api.php'});
 
my @pairs =
sort {$b->[1] <=> $a->[1] or $a->[0] cmp $b->[0]}
map {$_->{title} =~ s/\ACategory://;
[$_->{title}, $_->{categoryinfo}{size} || 0];}
values %{$api->api
({action => 'query',
generator => 'categorymembers',
gcmtitle => 'Category:Programming Languages',
gcmlimit => 'max',
prop => 'categoryinfo'})->{query}{pages}};
 
for (my $n = 1 ; @pairs ; ++$n)
{my ($lang, $tasks) = @{shift @pairs};
printf "%3d. %3d - %s\n", $n, $tasks, $lang;}

Perl 6

Scraping the languages and categories pages. Perl 6 automatically handles Unicode names correctly.

shell "wget -O languages.html 'http://rosettacode.org/wiki/Category:Programming_Languages'";
shell "wget -O categories.html 'http://www.rosettacode.org/mw/index.php?title=Special:Categories&limit=5000'";
 
my @lines = slurp('languages.html').lines;
shift @lines until @lines[0] ~~ / '<h2>Subcategories</h2>' /;
my @languages = gather for @lines {
last if / '/bodycontent' /;
take ~$0 if
/ '<li><a href="/wiki/Category:' .*? '" title="Category:' .*? '">' (.*?) '</a></li>' /;
}
 
my %valid = @languages X=> 1;
 
@lines = slurp('categories.html').lines;
 
my @results = sort -*.[0], gather for @lines {
take [+$1, ~$0] if
/ '<li><a href="/wiki/Category:' .*? '" title="Category:' .*? '">'
(.*?) <?{ %valid{ ~$0 } }>
'</a>' .*? '(' (\d+) ' members)' /;
}
 
for @results.kv -> $i, @l {
printf "%d:\t%3d - %s\n", $i+1, |@l;
}
Output:

(As of 2013-02-16.)

1:	687 - Tcl
2:	650 - Python
3:	638 - C
4:	626 - PicoLisp
5:	619 - J
6:	587 - Go
7:	581 - Ruby
8:	570 - D
9:	559 - Ada
10:	551 - Mathematica
11:	528 - Perl 6
12:	528 - Perl
13:	526 - Haskell
14:	513 - BBC BASIC
15:	491 - REXX
16:	488 - Java
17:	477 - OCaml
18:	469 - PureBasic
19:	462 - Unicon
20:	430 - AutoHotkey
21:	429 - Icon
22:	424 - Common Lisp
23:	416 - C sharp
24:	400 - C++
25:	359 - JavaScript
26:	339 - Scala
27:	338 - PARI/GP
28:	335 - Clojure
29:	322 - R
30:	318 - Lua
31:	314 - ALGOL 68
32:	313 - PHP
33:	308 - Forth
34:	304 - Pascal
35:	293 - Groovy
36:	292 - Liberty BASIC
37:	290 - XPL0
38:	287 - Fortran
39:	277 - Oz
40:	275 - PL/I
...
155:	 28 - TorqueScript
156:	 27 - МК-61/52
157:	 27 - Nial
...

PicoLisp

(load "@lib/http.l")
 
(for (I . X)
(flip
(sort
(make
(client "rosettacode.org" 80
"mw/index.php?title=Special:Categories&limit=5000"
(while (from "<li><a href=\"/wiki/Category:")
(let Cat (till "\"")
(from "(")
(when (format (till " " T))
(link (cons @ (ht:Pack Cat))) ) ) ) ) ) ) )
(prinl (align 3 I) ". " (car X) " - " (cdr X)) )
Output (07apr10):
  1. 390 - Tcl
  2. 389 - Programming_Tasks
  3. 359 - Python
  4. 344 - Ruby
  5. 326 - J
  6. 316 - OCaml
  7. 315 - C
  8. 312 - Haskell
  9. 296 - Perl
 10. 281 - Common_Lisp
...

PureBasic

Structure Language
count.i
Name.s
EndStructure
 
Dim Row.Language(2000)
 
; Lines have been split to fit RC's 80 char preferences
ignore$ = "Basic language learning Encyclopedia Implementations "
ignore$ + "Language Implementations Language users "
ignore$ + "Maintenance/OmitCategoriesCreated Programming Languages "
ignore$ + "Programming Tasks RCTemplates Solutions by Library Solutions by "
ignore$ + "Programming Language Solutions by Programming Task Unimplemented "
ignore$ + "tasks by language WikiStubs Examples needing attention "
ignore$ + "Impl needed"
 
URL$="http://www.rosettacode.org/w/index.php?"
URL$+"title=Special:Categories&limit=5000"
 
URLDownloadToFile_( #Null, URL$, "special.htm", 0, #Null)
ReadFile(0, "special.htm")
While Not Eof(0)
i + 1
x1$ = ReadString(0)
x2$ = Mid(x1$, FindString(x1$, "member", 1) - 4 , 3)
Row(i)\count = Val(Trim(RemoveString(x2$, "(")))
 
x3$ = Mid(x1$, FindString(x1$, Chr(34) + ">", 1) + 2, 30)
Row(i)\Name = Left(x3$, FindString(x3$, "<", 1) - 1)
If FindString(ignore$, Row(i)\Name, 1) Or Row(i)\Name = ""
Row(i)\count = 0
EndIf
Wend
offset=OffsetOf(Language\count)
SortStructuredArray(Row(), #PB_Sort_Descending, offset, #PB_Sort_Integer)
OpenConsole()
For i = 0 To 20
PrintN( Str(i + 1) + ". " + Str(Row(i)\count) + " - " + Row(i)\Name)
Next
Input()

Python

Works with: Python version 2.6

This uses MediaWiki's JSON API to query the members of Category:Programming Languages and then scrapes Special:Categories for the number of pages in each language's category.

import urllib, re
 
key1 = lambda x: int(x[1])
 
get1 = urllib.urlopen("http://www.rosettacode.org/w/api.php?action=query&list=categorymembers&cmtitle=Category:Programming_Languages&cmlimit=500&format=json").read()
get2 = urllib.urlopen("http://www.rosettacode.org/w/index.php?title=Special:Categories&limit=5000").read()
 
langs = re.findall("\"title\":\"Category:(.+?)\"",get1)
qtdmbr = re.findall("title=\"Category:(.+?)\">.+?</a> \((\d+) members\)",get2)
 
result = [(x,int(y)) for x,y in qtdmbr if x in langs]
 
for n, i in enumerate(sorted(result,key=key1,reverse=True)):
    print "%3d. %3d - %s" % (n+1, i[1], i[0])
Output (as of Sep 11, 2010):
 1. 423 - Tcl
 2. 394 - Python
 3. 368 - J
 4. 365 - PicoLisp
 5. 362 - Ruby
 6. 355 - C
 7. 351 - Haskell
 8. 339 - OCaml
 9. 316 - Perl
10. 315 - PureBasic
11. 306 - D
12. 302 - AutoHotkey
13. 300 - Common Lisp
14. 295 - Java
15. 293 - Ada
16. 278 - Oz
17. 260 - C++
18. 260 - C sharp
    . . .
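
The code above pulls the titles out of the JSON reply with a regular expression. As an alternative sketch (Python 3, not the version the entry targets), the reply can be decoded with the standard json module and the language names kept in a set for the membership test. The payload below is hand-written in the shape of a list=categorymembers reply with made-up page ids, and the "Programming Tasks" count is hypothetical; the other two counts are from the output above.

```python
import json

# Hand-written stand-in for the API reply; the page ids are invented.
payload = '''{"query": {"categorymembers": [
  {"pageid": 1, "ns": 14, "title": "Category:Tcl"},
  {"pageid": 2, "ns": 14, "title": "Category:Python"}
]}}'''

# Stand-in for the (name, count) pairs scraped from Special:Categories.
scraped = [("Python", 394), ("Programming Tasks", 400), ("Tcl", 423)]

prefix = "Category:"
langs = {
    member["title"][len(prefix):]
    for member in json.loads(payload)["query"]["categorymembers"]
    if member["title"].startswith(prefix)
}

# Keep only categories that are languages, then sort by count, largest first.
result = sorted(
    ((name, count) for name, count in scraped if name in langs),
    key=lambda pair: pair[1],
    reverse=True,
)

for place, (name, count) in enumerate(result, start=1):
    print("%3d. %3d - %s" % (place, count, name))
```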

R

library(RJSONIO)
langUrl <- "http://rosettacode.org/mw/api.php?action=query&format=json&cmtitle=Category:Solutions_by_Programming_Language&list=categorymembers&cmlimit=500"
 
languages <- fromJSON(langUrl)$query$categorymembers
languages <- sapply(languages, function(x) sub("Category:", "", x$title))
 
# fails if there are more than 500 users per language
user <- function (lang) {
userBaseUrl <- "http://rosettacode.org/mw/api.php?action=query&format=json&list=categorymembers&cmlimit=500&cmtitle=Category:"
userUrl <- paste(userBaseUrl, URLencode(paste(lang, " User", sep="")),sep="")
length(fromJSON(userUrl)$query$categorymembers)
}
 
users <- sapply(languages, user)
head(sort(users, decreasing=TRUE),15)
Output (as of March 13, 2010):
         C        C++       Java     Python JavaScript       Perl UNIX Shell 
        55         55         37         32         27         27         22 
    Pascal      BASIC        PHP        SQL    Haskell        AWK    C sharp 
        20         19         19         18         17         16         16 
      Ruby 
        14 
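
The comment in the code above notes that the count fails once a category holds more than 500 members, because a single request is capped at cmlimit. The Go and Tcl entries on this page handle this by following the API's continuation value. The Python 3 sketch below shows only the shape of such a loop; fetch_page is a hypothetical offline stand-in for one API request, and the member list is made up.

```python
def fetch_page(members, limit, start=0):
    """Hypothetical stand-in for one API request.

    Returns (batch, next_start), where next_start is None once the
    listing is exhausted, much like a missing continuation value.
    """
    batch = members[start:start + limit]
    next_start = start + limit if start + limit < len(members) else None
    return batch, next_start


def count_members(members, limit=500):
    """Keep requesting pages until no continuation marker is returned."""
    total, start = 0, 0
    while start is not None:
        batch, start = fetch_page(members, limit, start)
        total += len(batch)
    return total


users = ["user%d" % i for i in range(1234)]   # made-up category members

print(len(fetch_page(users, 500)[0]))   # what a single capped request sees
print(count_members(users))             # what the continuation loop sees
```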

Racket

#lang racket
 
(require net/url)
(require json)
 
(define proglangs_url "http://rosettacode.org/mw/api.php?action=query&list=categorymembers&cmtitle=Category:Programming_Languages&cmlimit=500&format=json")
(define categories_url "http://rosettacode.org/mw/index.php?title=Special:Categories&limit=5000")
 
(define (fetch-json urlstring)
(read-json (get-pure-port (string->url urlstring))))
 
(define programming-languages
(for/set ([h (in-list
(hash-ref (hash-ref (fetch-json proglangs_url) 'query)
'categorymembers))])
(substring (hash-ref h 'title) 9)))
 
(define result '())
(for ([l (in-port read-line (get-pure-port (string->url categories_url)))])
(let ([res (regexp-match #px"title=\"Category:(.+?)\".+\\((\\d+) member" l)])
(when (and res (set-member? programming-languages (cadr res)))
(set! result (cons (cons (cadr res)
(string->number (caddr res)))
result)))))
 
(printf "Place\tCount\tName~n")
(for ([lang (in-list (sort result > #:key cdr))]
[place (in-naturals 1)])
(printf "~a\t~a\t~a~n" place (cdr lang) (car lang)))
 

Output, 2013-5-25:

Place	Count	Name
1	737	Tcl
2	676	Python
3	660	C
4	638	J
5	626	PicoLisp
6	609	Perl 6
7	609	D
8	601	Racket
9	592	Ruby
10	589	Go
...

Recent, occasionally updated output also available at: http://www.timb.net/popular-languages.html

REXX

(Native) REXX doesn't support web-page reading, so the mentioned Rosetta Code categories and
Rosetta Code Languages pages were downloaded to local files.

This program reads the Languages file and uses its contents to validate the entries in
the categories file. This is essentially a perfect filter for the Rosetta Code categories file.
The mechanism is a (sparse) stemmed array that holds only the names of languages; most REXXes
locate an entry in such an array with a hashing algorithm   (which is very fast).

Programming note: some Unicode characters are handled as special cases:

  • ╬£C++   translated into   µC++   [Greek micro]
  • UC++   translated into   µC++   (for consistency).
  • ╨£╨Ü-61/52   (Cyrillic   МК-61/52)   translated into   MK-61/52
  • D├⌐j├á Vu   translated into   Déjá Vu
  • Cach├⌐   translated into   Caché
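
The odd-looking left-hand sides in the list above are what UTF-8 bytes look like when they are read as code page 437 characters. That the downloaded files were UTF-8 and were read under code page 437 is an assumption, but it reproduces the listed strings exactly, as this Python 3 sketch shows:

```python
# Names as they appear on the wiki, written with escapes so the exact
# code points are unambiguous.
names = [
    "\u039cC++",              # Greek capital mu followed by "C++"
    "\u041c\u041a-61/52",     # Cyrillic capital letters em and ka
    "D\u00e9j\u00e0 Vu",      # e-acute and a-grave
    "Cach\u00e9",             # e-acute
]


def as_seen_in_cp437(text):
    """Encode as UTF-8, then misread the resulting bytes as code page 437."""
    return text.encode("utf-8").decode("cp437")


for name in names:
    print(name, "->", as_seen_in_cp437(name))
```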



Note that this REXX example properly ranks tied languages.
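
One common convention for tied entries is standard competition ranking: entries with equal counts share the rank of the first of them, and the next rank skips ahead. The Python 3 sketch below illustrates that convention; it is not a translation of the REXX program, and the function name is made up. The sample counts are taken from the Erlang output earlier on this page.

```python
from collections import Counter


def rank_with_ties(entries):
    """Rank (name, count) pairs, largest count first.

    Equal counts share the rank of the first entry in the group, and the
    rank after a group skips ahead by the size of the group.
    Returns (rank, tied, count, name) tuples.
    """
    ordered = sorted(entries, key=lambda e: (-e[1], e[0]))
    ranks = []
    for position, (_, count) in enumerate(ordered, start=1):
        if position > 1 and ordered[position - 2][1] == count:
            ranks.append(ranks[-1])      # same count as the previous entry
        else:
            ranks.append(position)
    group_size = Counter(ranks)
    return [(rank, group_size[rank] > 1, count, name)
            for rank, (name, count) in zip(ranks, ordered)]


sample = [("Racket", 607), ("Perl 6", 609), ("PicoLisp", 627), ("D", 609)]
for rank, tied, count, name in rank_with_ties(sample):
    print("%3d. %-7s %3d - %s" % (rank, "(tied)" if tied else "", count, name))
```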

/*REXX program read two files and produces a ranked list of RC languages*/
sep='█'; L.=0; #.=0; u.=0; catHeap= /*assign variable defaults.*/
parse arg cutoff CinFID LinFID outFID . /*obtain specified options.*/
if cutoff==',' | cutoff=='' then cutoff=0 /*assume no cutoff default.*/
 
if CinFID=='' then CinFID='RC_POP.CAT' /*not specified? Use default. */
if LinFID=='' then LinFID='RC_POP.LAN' /*not specified? Use default. */
if outFID=='' then outFID='RC_POP.OUT' /*not specified? Use default. */
 
call tell center('timestamp: ' date() time('Civil'),79,'═'), 1, 1
#langs=reader('lang') /*assign to L.ααα */
#cats =reader('cat') /*append to the catHeap*/
#=0 /*number of categories.*/
do j=1 until catHeap=='' /*process heap of cats.*/
parse var catHeap cat.j (sep) catHeap /*pick off a category. */
parse var cat.j cat.j '<' '(' mems . /*untangle the string. */
cat.j=space(cat.j); _=cat.j; upper _ /*remove excess blanks.*/
if _=='' | \L._ then iterate /*blank or ¬ a language*/
if \datatype(mems,'W') then iterate /*"members" ¬ numeric.*/
/*handle duplicates. */
if u._\==0 then do /*(next) Possible echo the duplicate to screen.*/
/*─── say duplicate found: ' cat.j ───*/
do f=1 for # until _==@u.f; end /*f*/
say cat.j 'is a dup, old=' #.f 'add mems='mems
#.f=#.f+mems; iterate j
end
u._=u._+1
#=#+1; #.#=mems; @.#=cat.j; @u.#=_; #.#=mems /*bump counter, assign.*/
end /*j*/
 
call tell # '(total) number of languages detected, ',
# 'language's(#) "found with number of entries ≥" cutoff, 1, 1
call esort # /*sort the languages along with #*/
rank=0; tied= /* [↓] show by ascending rank. */
 
do j=# by -1 for # while #.j>=cutoff; rank=rank+1
if tied=='' then pRank=rank
tRank=rank; jp=j+1; jm=j-1; tied=
if #.j==#.jp | #.j==#.jm then tied='(tied)'
if #.j==#.jp then tRank=pRank
else pRank= Rank
 
call tell right('rank:'right(tRank,4),20) left(tied,6),
right('('#.j left("entr"s(#.j,'ies',"y")')',9),20) @.j
end /*j*/
call tell ' ☼ end-of-list. ☼', 1, 2
if cutoff>0 then call tell ' Listing stopped due to a cutoff of' cutoff"."
exit /*stick a fork in it, we're done.*/
/*───────────────────────────────ESORT subroutine───────────────────────*/
esort: procedure expose #. @.; arg N; h=N
do while h>1; h=h%2
do i=1 for N-h; j=i; k=h+i
do while #.k<#.j /*use hard swaps: @. have blanks.*/
@=@.j; #=#.j; @.j=@.k; #.j=#.k; @.k=@; #.k=#
if h>=j then leave; j=j-h; k=k-h
end /*while #.k<#.j*/
end /*i*/
end /*while h>1*/
return
/*───────────────────────────────READER subroutine──────────────────────*/
reader: arg which 2; n=0; ig_ast=1 /*ARG uppers WHICH, gets 1st char*/
if which=='L' then inFID=Linfid /*use this fileID for languages. */
if which=='C' then inFID=Cinfid /* " " " " categories.*/
oldMc='╬£C++'; newMc= "µC++" /*Unicode ╬£C++ ──> ASCII-8: µC++*/
oldUc='UC++' /*old UC++ ──> ASCII-8: µC++*/
oldMK='╨£╨Ü-'; newMK= "MK-" /*Unicode ╨£╨Ü- ──> ASCII-8: MK- */
oldDV='D├⌐j├á'; newDV= 'Déjá' /*Unicode ├⌐j├á ──> ASCII-8: Déjá*/
oldCA='Cach├⌐'; newCA= 'Caché' /*Unicode ach├⌐ ──> ASCII-8: aché*/
 
do while lines(inFID)\==0 /*read a file, 1 line at a time.*/
$=translate(linein(inFID),,'9'x) /*handle any stray tab characters*/
$$=space($); if $$=='' then iterate /*ignore all blank lines. */
if pos(oldMc,$$)\==0 then $$=changestr(oldMc,$$,newMc) /*convert micro*/
if pos(oldUc,$$)\==0 then $$=changestr(oldUc,$$,newMc) /*convert µC++ */
if pos(oldMK,$$)\==0 then $$=changestr(oldMK,$$,newMK) /*convert MK- */
if pos(oldDV,$$)\==0 then $$=changestr(oldDV,$$,newDV) /*convert Déjá */
if pos(oldCA,$$)\==0 then $$=changestr(oldCA,$$,newCA) /*convert Caché*/
$u=$; upper $u /*get an uppercase version. */
if ig_ast then do; ig_ast=pos(' * ',$u)==0; if ig_ast then iterate; end
if pos('RETRIEVED FROM',$u)\==0 then leave /*a pseudo End-Of-Data.*/
n=n+1 /*bump counter: legitimate records.*/
if which=='L' then do
if left($$,1)\=='*' then iterate /*legitimate?*/
parse upper var $$ '*' $$ '<'; $$=space($$)
L.$$=1 /*languages are stored uppercase.*/
iterate /*iterates the DO WHILE LINES(...*/
end /*(above) pick off language name.*/
 
if left($$,1)=='*' then $$=sep || space(substr($$,2))
catHeap=catHeap $$ /*append to the (CATegory) heap. */
end /*while lines...*/
 
call tell right(n,9) 'records read from file: ' inFID
return n
/*───────────────────────────────S subroutine───────────────────────────*/
s: if arg(1)==1 then return arg(3); return word(arg(2) 's',1) /*plural*/
/*───────────────────────────────TELL subroutine────────────────────────*/
tell: do '0'arg(2); call lineout outFID,' '  ; say  ; end
call lineout outFID,arg(1)  ; say arg(1)
do '0'arg(3); call lineout outFID,' '  ; say  ; end
return /*show before blank lines (if any), MSG, show after blank lines*/

Some older REXXes don't have a   changestr   BIF, so one is included here ──► CHANGESTR.REX.

all ranked 488 languages

The output for this REXX (RC_POP.REX) program is included here ──► RC_POP.OUT.

(See the talk page about some programming languages using different cases (lower/upper/mixed) for the language names).

[Note: the timestamp reflects the local time, which is USA   CST or CDT   (Central Standard Time   or   Central Daylight Time).]

Ruby

Works with: Ruby version 1.8.7

Now that there are more than 500 categories, the URL given in the task description is insufficient. I use the RC API to grab the categories, and then count the members of each category.

Uses the RosettaCode module from Count programming examples#Ruby

require 'rosettacode'
 
langs = []
RosettaCode.category_members("Programming Languages") {|lang| langs << lang}
 
# API has trouble with long titles= values.
# To prevent skipping languages, use short slices of 20 titles.
langcount = {}
langs.each_slice(20) do |sublist|
url = RosettaCode.get_api_url({
"action" => "query",
"prop" => "categoryinfo",
"format" => "xml",
"titles" => sublist.join("|"),
})
 
doc = REXML::Document.new open(url)
REXML::XPath.each(doc, "//page") do |page|
lang = page.attribute("title").value
info = REXML::XPath.first(page, "categoryinfo")
langcount[lang] = info.nil? ? 0 : info.attribute("pages").value.to_i
end
end
 
puts Time.now
puts "There are #{langcount.length} languages"
puts "the top 25:"
langcount.sort_by {|key,val| val}.reverse[0,25].each_with_index do |(lang, count), i|
puts "#{i+1}. #{count} - #{lang.sub(/Category:/, '')}"
end
Results:
2010-07-08 14:52:46 -0500
There are 306 languages
the top 25:
1. 399 - Tcl
2. 370 - Python
3. 352 - Ruby
4. 338 - J
5. 337 - C
6. 333 - PicoLisp
7. 322 - OCaml
8. 322 - Haskell
9. 299 - Perl
10. 299 - AutoHotkey
11. 288 - Common Lisp
12. 280 - Java
13. 275 - Ada
14. 270 - D
15. 267 - Oz
16. 253 - R
17. 252 - PureBasic
18. 245 - E
19. 243 - C++
20. 241 - C sharp
21. 239 - ALGOL 68
22. 236 - JavaScript
23. 221 - Forth
24. 207 - Clojure
25. 201 - Fortran
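
The slicing step in the Ruby code can be illustrated without network access. This Python 3 sketch builds one prop=categoryinfo query URL per slice of 20 titles; the endpoint is the api.php address used by other entries on this page (the address the RosettaCode Ruby module uses is not shown here, so treat it as an assumption), and the titles are made up.

```python
from urllib.parse import urlencode

# Assumed endpoint, taken from other entries on this page.
API_URL = "http://rosettacode.org/mw/api.php"


def slices(items, size):
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def categoryinfo_urls(titles, size=20):
    """Build one query URL per slice, joining the titles with "|"."""
    for chunk in slices(titles, size):
        query = {
            "action": "query",
            "prop": "categoryinfo",
            "format": "xml",
            "titles": "|".join(chunk),
        }
        yield API_URL + "?" + urlencode(query)


titles = ["Category:Lang%d" % i for i in range(45)]   # made-up titles
urls = list(categoryinfo_urls(titles))

print(len(urls))    # number of requests needed for 45 titles
print(urls[-1])     # the last, shorter request
```

Shorter slices mean more requests but keep each URL short, which is the trade-off the comment in the Ruby code describes.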

Run BASIC

sqliteconnect #mem, ":memory:"  ' make memory DB
#mem execute("CREATE TABLE stats(lang,cnt)")
a$ = httpGet$("http://rosettacode.org/wiki/Category:Programming_Languages")
aa$ = httpGet$("http://www.rosettacode.org/mw/index.php?title=Special:Categories&limit=5000")
i = instr(a$,"/wiki/Category:")
while i > 0 and lang$ <> "Languages"
j = instr(a$,"""",i)
lang$ = mid$(a$,i+15,j - i-15)
ii = instr(aa$,"Category:";lang$;"""")
jj = instr(aa$,"(",ii)
kk = instr(aa$," ",jj+1)
if ii = 0 then cnt = 0 else cnt = val(mid$(aa$,jj+1,kk-jj))
k = instr(lang$,"%") ' convert hex values to characters
while k > 0
lang$ = left$(lang$,k-1) + chr$(hexdec(mid$(lang$,k+1,2))) + mid$(lang$,k+3)
k = instr(lang$,"%")
wend
#mem execute("insert into stats values ('";lang$;"',";cnt;")")
i = instr(a$,"/wiki/Category:",i+10)
wend
html "<table border=2>"
#mem execute("SELECT * FROM stats ORDER BY cnt desc") ' order list by count descending
WHILE #mem hasanswer()
#row = #mem #nextrow()
rank = rank + 1
html "<TR><TD align=right>";rank;"</td><td>";#row lang$();"</td><td align=right>";#row cnt();"</td></tr>"
WEND
html "</table>"
 1  Tcl          687
 2  Python       650
 3  C            638
 4  PicoLisp     626
 5  J            619
 6  Go           587
 7  Ruby         581
 8  D            571
 9  Ada          559
10  Mathematica  551
11  Perl         528
12  Perl_6       528
13  Haskell      526
14  BBC_BASIC    513
15  REXX         491
16  Java         488
17  OCaml        477
18  PureBasic    469
19  Unicon       462
20  AutoHotkey   430
21  Icon         429
22  Common_Lisp  424
23  C_sharp      416
24  C++          400
25  JavaScript   359
26  Scala        339
27  PARI/GP      338
28  Clojure      335
29  R            322
...

Scala

def scrapeRosettaCodeLanguageRanks() : Seq[(String,Int)] = {
 
//
// The Programming Languages
//
val langs = {
val langsXML = scala.xml.XML.load("http://rosettacode.org/mw/api.php?action=query&list=categorymembers&cmtitle=Category:Programming_Languages&cmlimit=500&format=xml")
 
(langsXML \\ "categorymembers" \ "cm") map (c => (c \ "@title").text)
}
 
//
// The Categories
//
val cats = {
 
// The categories include a page for the language and a count of the pages linked
// therein, this count is the data we need to scrape.
val catsXML = scala.xml.XML.load("http://rosettacode.org/mw/index.php?title=Special:Categories&limit=5000")
 
(catsXML \\ "ul" \ "li") map {c =>
 
// Create a tuple pair, eg. ("Category:Erlang", 195)
(
(c \ "a" \ "@title").toString,
("[0-9]+".r.findFirstIn((c.child.drop(1)).text).getOrElse("0").toInt) // Takes the sibling of "a" and extracts the number
)
}
}
 
val ranks = ((langs map { s => cats.find( c => (c._1 == s) ) }).flatten) map {e =>
 
// Clean up the tuple pairs, eg ("Category:Erlang", 195) becomes ("Erlang", 192)
(
(if( e._1.contains("Category:") ) e._1.split("Category:")(1) else e._1),
((e._2 - 3) max 0) // 3 members (or links) are not of a task
)
} sortBy ( - _._2 )
 
ranks
}
 
val ranks = scrapeRosettaCodeLanguageRanks()
 
println("Top 50 Rosetta Code Languages by Popularity, %tF:".format(new java.util.Date) + "\n" )
ranks.take(50).zipWithIndex foreach {case ((lang,rank),i) => println( s"${i+1}. $rank - $lang" )}
 
Output:
Top 50 Rosetta Code Languages by Popularity, 2013-02-15:

1. 684 - Tcl
2. 647 - Python
3. 635 - C
4. 623 - PicoLisp
5. 616 - J
6. 584 - Go
7. 578 - Ruby
8. 567 - D
9. 556 - Ada
10. 548 - Mathematica
11. 525 - Perl
12. 523 - Haskell
13. 519 - Perl 6
14. 510 - BBC BASIC
15. 488 - REXX
16. 485 - Java
17. 474 - OCaml
18. 466 - PureBasic
19. 459 - Unicon
20. 427 - AutoHotkey
21. 426 - Icon
22. 421 - Common Lisp
23. 413 - C sharp
24. 397 - C++
25. 355 - JavaScript
26. 336 - Scala
27. 335 - PARI/GP
28. 332 - Clojure
29. 319 - R
30. 315 - Lua
31. 311 - ALGOL 68
32. 310 - PHP
33. 305 - Forth
34. 301 - Pascal
35. 290 - Groovy
36. 289 - Liberty BASIC
37. 287 - XPL0
38. 284 - Fortran
39. 274 - Oz
40. 272 - PL/I
41. 269 - E
42. 267 - Seed7
43. 256 - MATLAB
44. 254 - Factor
45. 251 - Scheme
46. 232 - Octave
47. 223 - AWK
48. 223 - Delphi
49. 213 - Smalltalk
50. 212 - Euphoria

Tcl

By web scraping

package require Tcl 8.5
package require http
 
set response [http::geturl http://rosettacode.org/mw/index.php?title=Special:Categories&limit=1000]
 
array set ignore {
"Basic language learning" 1
"Encyclopedia" 1
"Implementations" 1
"Language Implementations" 1
"Language users" 1
"Maintenance/OmitCategoriesCreated" 1
"Programming Languages" 1
"Programming Tasks" 1
"RCTemplates" 1
"Solutions by Library" 1
"Solutions by Programming Language" 1
"Solutions by Programming Task" 1
"Unimplemented tasks by language" 1
"WikiStubs" 1
"Examples needing attention" 1
"Impl needed" 1
}
 
foreach line [split [http::data $response] \n] {
if {[regexp {>([^<]+)</a> \((\d+) member} $line -> lang num]} {
if {![info exists ignore($lang)]} {
lappend langs [list $num $lang]
}
}
}
 
foreach entry [lsort -integer -index 0 -decreasing $langs] {
lassign $entry num lang
puts [format "%d. %d - %s" [incr i] $num $lang]
}
Output on 31 July 2009 (top 15 entries only):
1. 329 - Tcl
2. 292 - Python
3. 270 - Ruby
4. 250 - C
5. 247 - Ada
6. 238 - Perl
7. 223 - E
8. 221 - Java
9. 220 - AutoHotkey
10. 219 - OCaml
11. 210 - Haskell
12. 197 - ALGOL 68
13. 188 - D
14. 179 - C++
15. 175 - Forth
……

By using the API

Translation of: Ruby
Works with: Tcl version 8.5
Library: tDOM
package require Tcl 8.5
package require http
package require tdom

namespace eval rc {
    ### Utility function that handles the low-level querying ###
    proc rcq {q xp vn b} {
        upvar 1 $vn v
        dict set q action "query"
        # Loop to pick up all results out of a category query
        while 1 {
            set url "http://rosettacode.org/mw/api.php?[http::formatQuery {*}$q]"
            puts -nonewline stderr . ;# Indicate query progress...
            set token [http::geturl $url]
            set doc [dom parse [http::data $token]]
            http::cleanup $token

            # Spoon out the DOM nodes that the caller wanted
            foreach v [$doc selectNodes $xp] {
                uplevel 1 $b
            }

            # See if we want to go round the loop again
            set next [$doc selectNodes "//query-continue/categorymembers"]
            if {![llength $next]} break
            dict set q cmcontinue [[lindex $next 0] getAttribute "cmcontinue"]
        }
    }

    ### API function: Iterate over the members of a category ###
    proc members {page varName script} {
        upvar 1 $varName var
        set query [dict create cmtitle "Category:$page" {*}{
            list "categorymembers"
            format "xml"
            cmlimit "500"
        }]
        rcq $query "//cm" item {
            # Tell the caller's script about the item
            set var [$item getAttribute "title"]
            uplevel 1 $script
        }
    }

    ### API function: Count the members of a list of categories ###
    proc count {cats catVar countVar script} {
        upvar 1 $catVar cat $countVar count
        set query [dict create prop "categoryinfo" format "xml"]
        for {set n 0} {$n<[llength $cats]} {incr n 40} {
            dict set query titles [join [lrange $cats $n $n+39] |]
            rcq $query "//page" item {
                # Get title and count
                set cat [$item getAttribute "title"]
                set info [$item getElementsByTagName "categoryinfo"]
                if {[llength $info]} {
                    set count [[lindex $info 0] getAttribute "pages"]
                } else {
                    set count 0
                }
                # Let the caller's script figure out what to do with them
                uplevel 1 $script
            }
        }
    }

    ### Assemble the bits into a whole API ###
    namespace export members count
    namespace ensemble create
}

# Get the list of programming languages
rc members "Solutions by Programming Language" lang {
    lappend langs $lang
}
# Get the count of solutions for each, stripping "Category:" prefix
rc count $langs l c {
    lappend count [list [regsub {^Category:} $l {}] $c]
}
puts stderr "" ;# Because of the progress dots...
# Print the output
puts "There are [llength $count] languages"
puts "Here are the top fifteen:"
set count [lsort -index 1 -integer -decreasing $count]
foreach item [lrange $count 0 14] {
    puts [format "%1\$3d. %3\$3d - %2\$s" [incr n] {*}$item]
}
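
The count procedure above asks for categoryinfo in batches of 40 titles joined with "|". The batching step alone might be sketched in Python as follows. This is only an illustration: the function name batched_queries and the sample titles are hypothetical, the endpoint and query parameters are copied from the Tcl code, and no request is actually sent.

```python
from urllib.parse import urlencode

# Endpoint as used in the Tcl code above
API_URL = "http://rosettacode.org/mw/api.php"


def batched_queries(categories, batch_size=40):
    """Yield one categoryinfo query URL per batch of category titles.

    Hypothetical helper: it only builds URLs and does not contact the server.
    """
    for start in range(0, len(categories), batch_size):
        batch = categories[start:start + batch_size]
        params = {
            "action": "query",
            "prop": "categoryinfo",
            "format": "xml",
            "titles": "|".join(batch),  # titles are pipe-separated
        }
        yield API_URL + "?" + urlencode(params)


# Made-up titles, for illustration only
sample = ["Category:Lang%d" % i for i in range(95)]
urls = list(batched_queries(sample))
print(len(urls), "queries for", len(sample), "titles")
```

With 95 titles and a batch size of 40 this builds three URLs, the last one carrying the remaining 15 titles.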

TUSCRIPT

$$ MODE TUSCRIPT
remotedata = REQUEST ("http://www.rosettacode.org/mw/index.php?title=Special:Categories&limit=5000")
allmembers=allnames=""
COMPILE
LOOP d=remotedata
IF (d.sw."<li>") THEN
name=EXTRACT (d,":<<a<><%>>:"|,":<</a>>:")
IF (name.eq."Language users") CYCLE
IF (name.sw."Unimplemented tasks") CYCLE
IF (name.sw."Programming") CYCLE
IF (name.sw."Solutions") CYCLE
IF (name.sw."Garbage") CYCLE
IF (name.sw."Typing") CYCLE
IF (name.sw."BASIC LANG") CYCLE
IF (name.ew."USER") CYCLE
IF (name.ew."tasks") CYCLE
IF (name.ew."attention") CYCLE
IF (name.ew."related") CYCLE
IF (name.ct."*omit*") CYCLE
IF (name.ct.":*Categor*:") CYCLE
IF (name.ct.":WikiSTUBS:") CYCLE
IF (name.ct.":Impl needed:") CYCLE
IF (name.ct.":Implementations:") CYCLE
IF (name.ct.":':") name = EXCHANGE (name,":'::")
members = STRINGS (d,":><1<>>/><<> member:")
IF (members!="") THEN
allmembers=APPEND (allmembers,members)
allnames =APPEND (allnames,name)
ENDIF
ENDIF
ENDLOOP
index = DIGIT_INDEX (allmembers)
index = REVERSE (index)
allmembers = INDEX_SORT (allmembers,index)
allnames = INDEX_SORT (allnames, index)
ERROR/STOP CREATE ("list",SEQ-E,-std-)
time=time(),balt=nalt=""
FILE "list" = time
LOOP n, a=allnames,b=allmembers
IF (b==balt) THEN
nr=nalt
ELSE
nalt=nr=n
ENDIF
content=concat (nr,". ",a," --- ",b)
FILE "list" = CONTENT
balt=b
ENDLOOP
ENDCOMPILE
Output:
2011-01-24 14:05:27
1. Tcl --- 472 member
2. PicoLisp --- 441 member
3. Python --- 432 member
4. J --- 414 member
5. C --- 394 member
6. Ruby --- 385 member
7. PureBasic --- 371 member
8. Haskell --- 369 member
9. Ada --- 364 member
10. OCaml --- 352 member
11. Perl --- 339 member
11. D --- 339 member
13. Java --- 325 member
14. AutoHotkey --- 308 member
15. Common Lisp --- 305 member
16. C sharp --- 301 member
17. C++ --- 285 member
18. Oz --- 278 member
19. Clojure --- 272 member
20. R --- 266 member
21. ALGOL 68 --- 262 member
22. Perl 6 --- 258 member
23. JavaScript --- 257 member
24. E --- 254 member
25. REXX --- 253 member
26. Fortran --- 251 member
27. Forth --- 248 member
28. Lua --- 238 member
29. PHP --- 215 member
30. Unicon --- 211 member
31. Icon --- 210 member
32. Factor --- 207 member
33. Scala --- 197 member
34. PL/I --- 193 member
34. Go --- 193 member
36. Scheme --- 186 member 
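
In the listing above, languages with equal member counts share a rank and the next rank is skipped (11, 11, 13). That is the effect of the balt/nalt bookkeeping in the final loop. A rough Python sketch of the same numbering, assuming the input is already sorted by count in descending order (the name rank_with_ties is hypothetical, and the sample rows are taken from the output above):

```python
def rank_with_ties(entries):
    """Number (name, count) pairs so that equal counts share a rank.

    Assumes `entries` is already sorted by count, largest first.
    Returns a list of (rank, name, count) tuples.
    """
    ranked = []
    prev_count = None  # count of the previous entry (balt in the TUSCRIPT code)
    prev_rank = 0      # rank given to the previous entry (nalt)
    for position, (name, count) in enumerate(entries, start=1):
        # Reuse the previous rank on a tie, otherwise use the list position
        rank = prev_rank if count == prev_count else position
        ranked.append((rank, name, count))
        prev_count, prev_rank = count, rank
    return ranked


# Four rows from the output above, renumbered from 1 for the example
rows = [("OCaml", 352), ("Perl", 339), ("D", 339), ("Java", 325)]
for rank, name, count in rank_with_ties(rows):
    print("%d. %s --- %d member" % (rank, name, count))
```

For these four rows the ranks come out as 1, 2, 2, 4.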

UnixPipes

Library: curl
curl 'http://rosettacode.org/mw/index.php?title=Special:Categories&limit=5000' |
sed -n -e 's|^<li><a href="/wiki/Category:\([^"]*\).* (\([0-9][0-9]*\) members*)<.*|\2 \1|p' |
sort -rn
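
The sed expression in the pipeline keeps only list items that link to a category and rewrites each as "count name". sort -rn then orders the result by count, largest first. The same extraction can be tried offline in Python, as sketched below. The sample HTML lines are assumptions about what the category page returns, not captured output, and parse_line is a hypothetical name.

```python
import re

# Same pattern as the sed expression: the first group is the category name
# from the href, the second is the member count.
LINE_RE = re.compile(
    r'^<li><a href="/wiki/Category:([^"]*).* \(([0-9][0-9]*) members*\)<'
)


def parse_line(line):
    """Return (count, name) for a matching line, else None."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    return int(match.group(2)), match.group(1)


# Assumed shape of the page's list items (illustrative, not real output)
sample_lines = [
    '<li><a href="/wiki/Category:C" title="Category:C">C</a> (650 members)</li>',
    '<li><a href="/wiki/Category:Tcl" title="Category:Tcl">Tcl</a> (725 members)</li>',
    '<p>not a list item</p>',
]

parsed = [p for p in (parse_line(line) for line in sample_lines) if p]
# Equivalent of "sort -rn": numeric sort on the count, descending
for count, name in sorted(parsed, reverse=True):
    print(count, name)
```

As in the sed version, the name comes from the link target rather than the link text, so multi-word categories would appear in whatever form the href uses.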