Rosetta Code/Rank languages by number of users

Revision as of 21:12, 17 December 2017 by Eoraptor

Sort the most popular programming languages by their number of users on Rosetta Code. Show the languages with at least 100 users.

Rosetta Code/Rank languages by number of users is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

A way to solve the task:

Users of a language X are those referenced in the page https://rosettacode.org/wiki/Category:X_User, or preferably https://rosettacode.org/mw/index.php?title=Category:X_User&redirect=no to avoid redirections. In order to find the list of such categories, it's possible to first parse the entries of http://rosettacode.org/mw/index.php?title=Special:Categories&limit=5000.
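As a rough illustration of the approach described above, the following Python sketch extracts the `X_User` category names from the HTML of the Special:Categories page and builds the `redirect=no` URL for each one. The function names, the regular expression, and the sample HTML are assumptions made for this example; the markup of the real page may differ, and fetching the page over the network is left out.

```python
import re
from urllib.parse import quote

BASE = "https://rosettacode.org/mw/index.php"

# Assumption: category links look like href="/wiki/Category:<name>_User".
CATEGORY_LINK = re.compile(r'href="/wiki/Category:([^"]+?)_User"')

def user_categories(html):
    """Return the sorted, distinct names X for which a Category:X_User link appears."""
    return sorted(set(CATEGORY_LINK.findall(html)))

def category_url(lang):
    """Build the URL of Category:<lang>_User with redirections disabled."""
    # "%" is kept as-is so that names already percent-encoded are not encoded twice.
    title = quote("Category:" + lang + "_User", safe=":_%")
    return BASE + "?title=" + title + "&redirect=no"

# Made-up sample of the kind of markup the regular expression expects.
sample = '''
<li><a href="/wiki/Category:C_User" title="Category:C User">C User</a></li>
<li><a href="/wiki/Category:UNIX_Shell_User" title="Category:UNIX Shell User">UNIX Shell User</a></li>
<li><a href="/wiki/Category:Sorting" title="Category:Sorting">Sorting</a></li>
'''

langs = user_categories(sample)
for lang in langs:
    print(lang, category_url(lang))
```

Only the two `_User` categories are kept; the unrelated `Sorting` category is ignored.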

Sample output on 17 December 2017:

Language      Users    Rank
C               373       1
C++             261       2
Java            257       3
Python          243       4
JavaScript      186       5
PHP             163       6
Perl            162       7
SQL             131       8
UNIX Shell      120       9
C sharp         113      10
Pascal          109      11
BASIC           102      12
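
Once the user counts are known, the ranking itself is a filter followed by a sort. The Python sketch below is one possible way to produce a table like the one above. The function names and the column widths are assumptions for illustration, ties are broken alphabetically, and tied languages still receive consecutive ranks, which is a simplification.

```python
def rank_languages(counts, minimum=100):
    """Return (language, users, rank) rows, most users first, ties broken by name."""
    kept = [(lang, n) for lang, n in counts.items() if n >= minimum]
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [(lang, n, rank) for rank, (lang, n) in enumerate(kept, start=1)]

def format_table(rows):
    """Format the rows as left-aligned names and right-aligned numbers."""
    lines = ["{:<12}{:>8}{:>8}".format("Language", "Users", "Rank")]
    for lang, n, rank in rows:
        lines.append("{:<12}{:>8}{:>8}".format(lang, n, rank))
    return "\n".join(lines)

# Four counts copied from the sample output above, in arbitrary order.
# "Example" is a made-up entry that falls below the 100-user threshold.
counts = {"Java": 257, "C": 373, "Python": 243, "C++": 261, "Example": 42}

rows = rank_languages(counts)
print(format_table(rows))
```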

A Rosetta Code user usually declares that they use a language with the mylang template. This template is expected to appear on the User page, but in some cases it appears on a user Talk page instead. It is not necessary to take this into account. For instance, of the 373 C users in the table above, 3 are actually declared on a Talk page.
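
Although the distinction can be ignored, it is easy to make if desired. The Python sketch below counts links to User pages and to user Talk pages separately in the HTML of a category page. It assumes that members are linked as `/wiki/User:<name>` and `/wiki/User_talk:<name>`; that link format, the function name, and the sample HTML are assumptions for this example.

```python
import re

# Assumption: members are listed as links to /wiki/User:<name> or /wiki/User_talk:<name>.
MEMBER_LINK = re.compile(r'href="/wiki/(User|User_talk):([^"]+)"')

def count_members(html):
    """Return (users, talk_pages): distinct User pages and distinct user Talk pages linked."""
    users, talks = set(), set()
    for namespace, name in MEMBER_LINK.findall(html):
        (users if namespace == "User" else talks).add(name)
    return len(users), len(talks)

# Made-up sample with two User pages and one user Talk page.
sample = '''
<li><a href="/wiki/User:Alice" title="User:Alice">User:Alice</a></li>
<li><a href="/wiki/User:Bob" title="User:Bob">User:Bob</a></li>
<li><a href="/wiki/User_talk:Carol" title="User talk:Carol">User talk:Carol</a></li>
'''

users, talks = count_members(sample)
print(users, talks, users + talks)
```

The sum of the two counts corresponds to the simpler approach of counting every `/wiki/User` link, which is what the Stata solution on this page does.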

Stata

<lang stata>* Download the list of all categories and keep only the "X_User" entries
copy "http://rosettacode.org/mw/index.php?title=Special:Categories&limit=5000" categ.html, replace
import delimited categ.html, delim("@") enc("utf-8") clear
keep if ustrpos(v1,"/wiki/Category:") & ustrpos(v1,"_User")

* s: category page title taken from the href attribute
gen i = ustrpos(v1,"href=")
gen j = ustrpos(v1,char(34),i+1)
gen k = ustrpos(v1,char(34),j+1)
gen s = usubstr(v1,j+7,k-j-7)

* lang: language name taken from the link text
replace i = ustrpos(v1,"title=")
replace j = ustrpos(v1,">",i+1)
replace k = ustrpos(v1," User",j+1)
gen lang = usubstr(v1,j+1,k-j)
keep s lang
gen users=.

* For each category, download its page and count the lines linking to a user page
forval i=1/`c(N)' {
	local s
	preserve
	copy `"https://rosettacode.org/mw/index.php?title=`=s[`i']'&redirect=no"' `i'.html, replace
	import delimited `i'.html, delim("@") enc("utf-8") clear
	count if ustrpos(v1,"/wiki/User")
	local m `r(N)'
	restore
	replace users=`m' in `i'
	erase `i'.html
}

* Sort by decreasing number of users and show languages with at least 100 users
drop s
gsort -users lang
list if users>=100
save rc_users, replace</lang>

Output

     +---------------------+
     |        lang   users |
     |---------------------|
  1. |          C      373 |
  2. |        C++      261 |
  3. |       Java      257 |
  4. |     Python      243 |
  5. | JavaScript      186 |
     |---------------------|
  6. |        PHP      163 |
  7. |       Perl      162 |
  8. |        SQL      131 |
  9. | UNIX Shell      120 |
 10. |    C sharp      113 |
     |---------------------|
 11. |     Pascal      109 |
 12. |      BASIC      102 |
     +---------------------+