Rosetta Code/Rank languages by number of users: Difference between revisions

m (→‎{{header|zkl}}: removed dead code)
m (→‎{{header|Perl 6}}: Style tweaks, minor enhancements, update list)

Revision as of 14:13, 23 December 2017

Rosetta Code/Rank languages by number of users is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Sort the most popular programming languages by their number of users on Rosetta Code. Show the languages with at least 100 users.

A way to solve the task:

Users of a language X are those listed on the page https://rosettacode.org/wiki/Category:X_User, or preferably https://rosettacode.org/mw/index.php?title=Category:X_User&redirect=no to avoid redirects. To find the list of such categories, first parse the entries of http://rosettacode.org/mw/index.php?title=Special:Categories&limit=5000. Then download and parse each language users category to count its users.
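The first step of that approach, pulling the "X User" category names out of the category listing, can be sketched in Python. This is only a minimal illustration: the link shape `href="/wiki/Category:X_User"` is an assumption about the listing's HTML, the sample markup is made up for the example, and `language_names` is a hypothetical helper name.

```python
import re
from urllib.parse import unquote

# Assumed link shape in the listing: href="/wiki/Category:<Language>_User"
CATEGORY_LINK = re.compile(r'href="/wiki/Category:([^"]+?)_User"')

def language_names(html):
    """Return the language names found in 'Category:X_User' links, in page order."""
    names = []
    for raw in CATEGORY_LINK.findall(html):
        # Undo percent-encoding and the underscores used in wiki URLs.
        name = unquote(raw).replace("_", " ")
        if name not in names:
            names.append(name)
    return names

# Made-up sample markup; the real page layout may differ.
sample = (
    '<li><a href="/wiki/Category:C_User" title="Category:C User">C User</a></li>'
    '<li><a href="/wiki/Category:UNIX_Shell_User">UNIX Shell User</a></li>'
    '<li><a href="/wiki/Category:C%2B%2B_User">C++ User</a></li>'
    '<li><a href="/wiki/Category:Sorting">Sorting</a></li>'
)
print(language_names(sample))
```

Each name found this way would then be used to download the matching category page and count its user links.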

Sample output on 17 December 2017:

Language      Users   Rank
--------------------------
C               373      1
C++             261      2
Java            257      3
Python          243      4
JavaScript      228      5
PHP             163      6
Perl            162      7
SQL             131      8
UNIX Shell      120      9
BASIC           118     10
C sharp         113     11
Pascal          109     12
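Once the counts are known, producing a table in this shape is a sort followed by a filter. The Python sketch below assumes the counts are already in a dictionary; `rank_table` is a hypothetical helper, the column widths are chosen to match the sample above, and ties are not treated specially.

```python
def rank_table(counts, minimum=100):
    """Sort by descending user count, then by name; keep languages with >= minimum users."""
    kept = [(lang, n) for lang, n in counts.items() if n >= minimum]
    kept.sort(key=lambda item: (-item[1], item[0]))
    lines = ["Language      Users   Rank", "-" * 26]
    for rank, (lang, n) in enumerate(kept, start=1):
        lines.append(f"{lang:<14}{n:>5}{rank:>7}")
    return lines

# A few counts taken from the sample outputs on this page.
lines = rank_table({"Haskell": 98, "C++": 261, "C": 373})
print("\n".join(lines))
```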

A Rosetta Code user usually declares the languages they use with the mylang template. This template is expected to appear on the User page. However, in some cases it appears on a user Talk page. It's not necessary to take this into account. For instance, among the 373 C users in the table above, 3 are actually declared on a Talk page.

Perl 6

Works with: Rakudo version 2017.11

Use the MediaWiki API rather than web scraping, since it is much faster and less resource intensive. Show languages with at least 25 users, since that is still a fairly short list and it demonstrates how tied rankings are handled. Change the $minimum parameter to adjust the cut-off point.

This is all done in a single pass; a tie is not detected until a language has the same count as the previous one, so ties are marked by a T next to the rank, indicating that this language has the same count as the previous one.
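The single-pass tie handling can be illustrated separately from the network code. The Python sketch below assumes the input is already sorted by descending count; `rank_with_ties` is a hypothetical name, and the sample counts are taken from the output listing on this page.

```python
def rank_with_ties(sorted_counts):
    """Yield (position, rank, tie_marker, count, name) in one pass over sorted data."""
    rank = 0
    last = None
    for position, (name, count) in enumerate(sorted_counts, start=1):
        if count != last:
            # New count: the rank jumps to the current position.
            last, rank, tie = count, position, " "
        else:
            # Same count as the previous entry: keep its rank and mark the tie.
            tie = "T"
        yield position, rank, tie, count, name

rows = list(rank_with_ties([
    ("Forth", 37), ("Lisp", 35), ("MATLAB", 35), ("Visual Basic .NET", 35), ("J", 34),
]))
for position, rank, tie, count, name in rows:
    print(f"#{position:3d}  Rank: {rank:2d} {tie}  with {count:<4} users:  {name}")
```

As in the description above, the first language of a tied group carries no marker, because the tie is only known once the next entry is seen.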

<lang perl6>use HTTP::UserAgent;
use URI::Escape;
use JSON::Fast;

my $client = HTTP::UserAgent.new;

my $url = 'http://rosettacode.org/mw';

my $start-time = now;

say "========= Generated: { DateTime.new(time) } =========";

my $lang = 1;
my $rank = 0;
my $last = 0;
my $tie = ' ';
my $minimum = 25;

.say for

   mediawiki-query(
       $url, 'pages',
       :generator<categorymembers>,
       :gcmtitle<Category:Language users>,
       :gcmlimit<350>,
       :rawcontinue(),
       :prop<categoryinfo>
   )\
   .map({ %( count => .<categoryinfo><pages> || 0,
             lang  => .<title>.subst(/^'Category:' (.+) ' User'/, ->$/ {$0}) ) })\
   .sort( { -.<count>, .<lang> } )\
   .map( { last if .<count> < $minimum; display(.<count>, .<lang>) } );

say "========= elapsed: {(now - $start-time).round(.01)} seconds =========";

sub display ($count, $which) {

   if $last != $count { $last = $count; $rank = $lang; $tie = ' ' } else { $tie = 'T' };
   sprintf "#%3d  Rank: %2d %s  with %-4s users:  %s", $lang++, $rank, $tie, $count, $which;

}

sub mediawiki-query ($site, $type, *%query) {

   my $url = "$site/api.php?" ~ uri-query-string(
       :action<query>, :format<json>, :formatversion<2>, |%query);
   my $continue = '';
   gather loop {
       my $response = $client.get("$url&$continue");
       my $data = from-json($response.content);
       take $_ for $data.<query>.{$type}.values;
       $continue = uri-query-string |($data.<query-continue>{*}».hash.hash or last);
   }

}

sub uri-query-string (*%fields) { %fields.map({ "{.key}={uri-escape .value}" }).join("&") }</lang>
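The continuation loop in mediawiki-query may be easier to follow in isolation. The Python sketch below mirrors the same idea under stated assumptions: `fetch` is a hypothetical callable that takes a URL and returns the response body as text, and the canned responses are invented for the example so that nothing touches the network.

```python
import json
from urllib.parse import urlencode

def mediawiki_query(fetch, site, kind, **params):
    """Yield the entries of data['query'][kind], following 'query-continue' hints.

    `fetch` is a hypothetical callable: URL in, response text out.
    """
    base = {"action": "query", "format": "json", "rawcontinue": "", **params}
    extra = {}
    while True:
        url = f"{site}/api.php?" + urlencode({**base, **extra})
        data = json.loads(fetch(url))
        pages = data.get("query", {}).get(kind, {})
        # The entries may arrive as a dict keyed by page id or as a list.
        yield from (pages.values() if isinstance(pages, dict) else pages)
        hints = data.get("query-continue")
        if not hints:
            break
        # Merge every continuation parameter into the next request.
        extra = {k: v for hint in hints.values() for k, v in hint.items()}

# Canned responses standing in for two API replies (invented for illustration).
canned = iter([
    json.dumps({"query": {"pages": {"1": {"title": "Category:C User"}}},
                "query-continue": {"categorymembers": {"gcmcontinue": "page|X"}}}),
    json.dumps({"query": {"pages": {"2": {"title": "Category:Java User"}}}}),
])
seen_urls = []

def fake_fetch(url):
    seen_urls.append(url)
    return next(canned)

titles = [p["title"] for p in mediawiki_query(
    fake_fetch, "http://rosettacode.org/mw", "pages",
    generator="categorymembers", gcmtitle="Category:Language users",
    prop="categoryinfo")]
print(titles)
```

Passing the fetcher in as an argument keeps the paging logic testable; a real run would supply a function that performs the HTTP request.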

Output:
========= Generated: 2017-12-23T14:07:11Z =========
#  1  Rank:  1    with 373  users:  C
#  2  Rank:  2    with 262  users:  C++
#  3  Rank:  3    with 258  users:  Java
#  4  Rank:  4    with 244  users:  Python
#  5  Rank:  5    with 228  users:  JavaScript
#  6  Rank:  6    with 163  users:  PHP
#  7  Rank:  7    with 162  users:  Perl
#  8  Rank:  8    with 131  users:  SQL
#  9  Rank:  9    with 121  users:  UNIX Shell
# 10  Rank: 10    with 118  users:  BASIC
# 11  Rank: 11    with 113  users:  C sharp
# 12  Rank: 12    with 109  users:  Pascal
# 13  Rank: 13    with 98   users:  Haskell
# 14  Rank: 14    with 91   users:  Ruby
# 15  Rank: 15    with 71   users:  Fortran
# 16  Rank: 16    with 65   users:  Visual Basic
# 17  Rank: 17    with 60   users:  Scheme
# 18  Rank: 18    with 59   users:  Prolog
# 19  Rank: 19    with 57   users:  Common Lisp
# 20  Rank: 20    with 54   users:  Lua
# 21  Rank: 21    with 52   users:  AWK
# 22  Rank: 22    with 51   users:  HTML
# 23  Rank: 23    with 45   users:  Assembly
# 24  Rank: 24    with 44   users:  Batch File
# 25  Rank: 25    with 42   users:  X86 Assembly
# 26  Rank: 26    with 41   users:  Bash
# 27  Rank: 27    with 40   users:  Erlang
# 28  Rank: 28    with 37   users:  Forth
# 29  Rank: 29    with 35   users:  Lisp
# 30  Rank: 29 T  with 35   users:  MATLAB
# 31  Rank: 29 T  with 35   users:  Visual Basic .NET
# 32  Rank: 32    with 34   users:  J
# 33  Rank: 33    with 33   users:  Ada
# 34  Rank: 33 T  with 33   users:  Brainf***
# 35  Rank: 33 T  with 33   users:  Delphi
# 36  Rank: 33 T  with 33   users:  Objective-C
# 37  Rank: 37    with 32   users:  Tcl
# 38  Rank: 38    with 31   users:  APL
# 39  Rank: 38 T  with 31   users:  COBOL
# 40  Rank: 40    with 30   users:  R
# 41  Rank: 41    with 28   users:  Go
# 42  Rank: 41 T  with 28   users:  Perl 6
# 43  Rank: 43    with 27   users:  Clojure
# 44  Rank: 43 T  with 27   users:  Mathematica
# 45  Rank: 45    with 25   users:  AutoHotkey
========= elapsed: 1.89 seconds =========

Stata

<lang stata>copy "http://rosettacode.org/mw/index.php?title=Special:Categories&limit=5000" categ.html, replace
import delimited categ.html, delim("@") enc("utf-8") clear
keep if ustrpos(v1,"/wiki/Category:") & ustrpos(v1,"_User")
gen i = ustrpos(v1,"href=")
gen j = ustrpos(v1,char(34),i+1)
gen k = ustrpos(v1,char(34),j+1)
gen s = usubstr(v1,j+7,k-j-7)
replace i = ustrpos(v1,"title=")
replace j = ustrpos(v1,">",i+1)
replace k = ustrpos(v1," User",j+1)
gen lang = usubstr(v1,j+1,k-j)
keep s lang
gen users=.

forval i=1/`c(N)' {
	local s
	preserve
	copy `"https://rosettacode.org/mw/index.php?title=`=s[`i']'&redirect=no"' `i'.html, replace
	import delimited `i'.html, delim("@") enc("utf-8") clear
	count if ustrpos(v1,"/wiki/User")
	local m `r(N)'
	restore
	replace users=`m' in `i'
	erase `i'.html
}

drop s
gsort -users lang
list if users>=100
save rc_users, replace</lang>

Output

     +---------------------+
     |        lang   users |
     |---------------------|
  1. |          C      373 |
  2. |        C++      261 |
  3. |       Java      257 |
  4. |     Python      243 |
  5. | JavaScript      228 |
     |---------------------|
  6. |        PHP      163 |
  7. |       Perl      162 |
  8. |        SQL      131 |
  9. | UNIX Shell      120 |
 10. |      BASIC      118 |
     |---------------------|
 11. |    C sharp      113 |
 12. |     Pascal      109 |
     +---------------------+


zkl

Uses libraries cURL and YAJL (yet another json library)
<lang zkl>const MIN_USERS=60;
var [const] CURL=Import("zklCurl"), YAJL=Import("zklYAJL")[0];

fcn rsGet{
   continueValue,r,curl := "",List, CURL();
   do{	// eg 5 times
      page:=("http://rosettacode.org/mw/api.php?action=query"
        "&generator=categorymembers&prop=categoryinfo"
        "&gcmtitle=Category%%3ALanguage%%20users"
        "&rawcontinue=&format=json&gcmlimit=350"
        "%s").fmt(continueValue);
      page=curl.get(page);
      page=page[0].del(0,page[1]);  // get rid of HTML header
      json:=YAJL().write(page).close();
      json["query"]["pages"].pump(r.append,'wrap(x){ x=x[1];
         //("2708",Dictionary(title:Category:C User,...,categoryinfo:D(pages:373,size:373,...)))
         // or title:SmartBASIC
         if((pgs:=x.find("categoryinfo")) and (pgs=pgs.find("pages")) and
            pgs>=MIN_USERS)
            return(pgs,x["title"].replace("Category:","").replace(" User",""));
         return(Void.Skip);
      });
      if(continueValue=json.find("query-continue",""))
        continueValue=String("&gcmcontinue=",
             continueValue["categorymembers"]["gcmcontinue"]);
   }while(continueValue);
   r
}

allLangs:=rsGet();
allLangs=allLangs.sort(fcn(a,b){ a[0]>b[0] });
println("========== ",Time.Date.prettyDay()," ==========");
foreach n,pgnm in ([1..].zip(allLangs))
   { println("#%3d with %4s users: %s".fmt(n,pgnm.xplode())) }</lang>
Output:
========== Wednesday, the 20th of December 2017 ==========
#  1 with  373 users: C
#  2 with  261 users: C++
#  3 with  257 users: Java
#  4 with  243 users: Python
#  5 with  228 users: JavaScript
#  6 with  163 users: PHP
#  7 with  162 users: Perl
#  8 with  131 users: SQL
#  9 with  120 users: UNIX Shell
# 10 with  118 users: BASIC
# 11 with  113 users: C sharp
# 12 with  109 users: Pascal
# 13 with   98 users: Haskell
# 14 with   91 users: Ruby
# 15 with   71 users: Fortran
# 16 with   65 users: Visual Basic
# 17 with   60 users: Scheme