Rosetta Code/Rank languages by number of users
Revision as of 21:34, 17 December 2017
Sort the most popular programming languages by their number of users on Rosetta Code. Show the languages with at least 100 users.
A way to solve the task:
Users of a language X are those listed on the page https://rosettacode.org/wiki/Category:X_User, or preferably https://rosettacode.org/mw/index.php?title=Category:X_User&redirect=no to avoid redirects. To find the list of such categories, first parse the entries of http://rosettacode.org/mw/index.php?title=Special:Categories&limit=5000. Then download and parse each language's user category to count its users.
Sample output on 17 December 2017:
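The steps above can be sketched in Python as a few pure functions that work on HTML that has already been downloaded. This is only an illustration, not the task's reference solution. The function names are hypothetical, and the two regular expressions assume that links appear as `href="/wiki/Category:X_User"` and `href="/wiki/User:Name"`, which may not match the site's markup exactly.

```python
import re

# Assumed shape of a link on Special:Categories:
#   <a href="/wiki/Category:Python_User" ...>Python User</a>
CATEGORY_LINK = re.compile(r'href="/wiki/(Category:[^"]+?_User)"')
# Assumed shape of a member link on a category page:
#   href="/wiki/User:Alice" or href="/wiki/User_talk:Alice"
MEMBER_LINK = re.compile(r'href="/wiki/(User(?:_talk)?:[^"]+)"')


def user_categories(listing_html):
    """Return the 'Category:X_User' page names in the listing, in order, without duplicates."""
    return list(dict.fromkeys(CATEGORY_LINK.findall(listing_html)))


def category_url(page_name):
    """Build the no-redirect URL from the task description.

    page_name is used verbatim as found in the href, so it is assumed
    to be URL-encoded already (for example 'Category:C%2B%2B_User').
    """
    return ("https://rosettacode.org/mw/index.php?title="
            + page_name + "&redirect=no")


def count_users(category_html):
    """Count the distinct User and User_talk pages linked from one category page."""
    return len(set(MEMBER_LINK.findall(category_html)))
```

Counting distinct links rather than matching lines is a design choice of this sketch. Fetching the pages (for instance with `urllib.request.urlopen`) is left out so that the parsing can be checked offline.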
Language      Users   Rank
--------------------------
C               373      1
C++             261      2
Java            257      3
Python          243      4
JavaScript      228      5
PHP             163      6
Perl            162      7
SQL             131      8
UNIX Shell      120      9
C sharp         113     10
Pascal          109     11
BASIC           102     12
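A table like the one above can be produced from a mapping of language names to user counts. The following Python sketch is one possible way to do it. The function name `rank_languages` and the alphabetical tie-break on the language name are assumptions made for the example, not part of the task statement.

```python
def rank_languages(user_counts, minimum=100):
    """Return (language, users, rank) tuples for languages with at least `minimum` users.

    Languages are sorted by decreasing user count. Ties are broken
    alphabetically, which is an assumption of this sketch.
    """
    kept = [(lang, n) for lang, n in user_counts.items() if n >= minimum]
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [(lang, n, rank) for rank, (lang, n) in enumerate(kept, start=1)]


if __name__ == "__main__":
    # A few counts taken from the sample output, plus one made-up
    # entry ("Example") that falls below the threshold.
    counts = {"Perl": 162, "C": 373, "PHP": 163, "Example": 50}
    for lang, users, rank in rank_languages(counts):
        print(f"{lang:<14}{users:>5}{rank:>7}")
```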
A Rosetta Code user usually declares that they use a language with the mylang template. This template is expected to appear on the User page, but in some cases it appears on a user Talk page instead. It is not necessary to take this into account. For instance, among the 373 C users in the table above, 3 are actually declared on a Talk page.
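Although the task does not require it, the two kinds of declaration can be told apart by the namespace of the linked page. The Python sketch below assumes that members of a category page are linked as `href="/wiki/User:Name"` or `href="/wiki/User_talk:Name"`. Both that link shape and the function name `split_members` are assumptions made for the example.

```python
import re

# Assumed link shapes on a category page:
#   href="/wiki/User:Alice"       (declared on the User page)
#   href="/wiki/User_talk:Alice"  (declared on the Talk page)
MEMBER_LINK = re.compile(r'href="/wiki/(User(?:_talk)?):([^"]+)"')


def split_members(category_html):
    """Return two sets of names: those linked via User pages and those via Talk pages."""
    users, talk = set(), set()
    for namespace, name in MEMBER_LINK.findall(category_html):
        if namespace == "User_talk":
            talk.add(name)
        else:
            users.add(name)
    return users, talk
```

With this split, `len(users) + len(talk)` corresponds to the kind of total shown in the table, and `len(talk)` to the Talk page declarations mentioned above, provided no name appears in both sets.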
Stata
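<lang stata>* Download the list of all categories
copy "http://rosettacode.org/mw/index.php?title=Special:Categories&limit=5000" categ.html, replace
* Read the HTML as text; "@" is used as a delimiter so that each line should land whole in v1
import delimited categ.html, delim("@") enc("utf-8") clear
* Keep only the lines that link to a language user category
keep if ustrpos(v1,"/wiki/Category:") & ustrpos(v1,"_User")
* s: the page name from the href attribute, without the leading /wiki/
gen i = ustrpos(v1,"href=")
gen j = ustrpos(v1,char(34),i+1)
gen k = ustrpos(v1,char(34),j+1)
gen s = usubstr(v1,j+7,k-j-7)
* lang: the link text that precedes " User"
replace i = ustrpos(v1,"title=")
replace j = ustrpos(v1,">",i+1)
replace k = ustrpos(v1," User",j+1)
gen lang = usubstr(v1,j+1,k-j)
keep s lang
gen users=.

* For each category, download its page and count the lines that link to a user page
forval i=1/`c(N)' {
	local s
	preserve
	copy `"https://rosettacode.org/mw/index.php?title=`=s[`i']'&redirect=no"' `i'.html, replace
	import delimited `i'.html, delim("@") enc("utf-8") clear
	count if ustrpos(v1,"/wiki/User")
	local m `r(N)'
	restore
	replace users=`m' in `i'
	erase `i'.html
}

* Sort by decreasing number of users and list the languages with at least 100
drop s
gsort -users lang
list if users>=100
save rc_users, replace</lang>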
Output
     +---------------------+
     | lang          users |
     |---------------------|
  1. | C               373 |
  2. | C++             261 |
  3. | Java            257 |
  4. | Python          243 |
  5. | JavaScript      228 |
     |---------------------|
  6. | PHP             163 |
  7. | Perl            162 |
  8. | SQL             131 |
  9. | UNIX Shell      120 |
 10. | C sharp         113 |
     |---------------------|
 11. | Pascal          109 |
 12. | BASIC           102 |
     +---------------------+