More tasks, more languages, more users, more visits, more server power, more software, more data, more features, more expense.

I honestly don't have solid metrics for most of these; I don't know how many languages, tasks or users we had on January 1st of 2009.  We have more of all of these, now.

We were on a 256 Slicehost node, which didn't cost much.  Unfortunately, that node simply didn't have the resources, under any configuration, to do its job when we got a long surge of StumbleUpon traffic to the Ethiopian Multiplication task. So Rosetta Code moved to a Linode 540 plan and was reconfigured with caching, tuned settings and different software. Now a visitor can hardly notice when the server gets a surge of Proggit or StumbleUpon traffic. (Database backups still bring it to its knees for a few minutes, but that's something I'm working on.)

We didn't have ImplSearchBot yet, and so we didn't have automatically updating listings of missing task implementations. ISB is now gone, but a MediaWiki extension was modified to allow dynamic creation and fast access to the pages it had been maintaining.

While the analytics data is slightly spotty (a MediaWiki upgrade blew away the placement of the GA JavaScript file, and it went unfixed from August 8th to February 19th), at least a few things can be reported. To the wiki, there were at least 268,000 visits from at least 200,000 visitors for at least 750,000 page views.  Of those, about 137,000 visits came from search engines, 75,000 from other sites like wikis, social bookmarking and programmer forums, and 35,000 came directly here (or, perhaps, suppressed sending their referrer header line to web servers).

The site has gotten more active and better organized overall, with the addition of the draft task template and system, the still-in-progress language comparison table, additional languages supported by GeSHi, and so on and so forth.

To be totally honest, these are just the things that stand out in memory, which means I've probably forgotten a lot of what's happened early in the year.

On to 2010

What will 2010 bring? I don't know.  Many of my goals haven't really changed:

  • I'd like to continue to expand awareness and understanding of programming in its various forms, and continue to provide a service which helps improve the art.
  • Rosetta Code needs to start paying for its own static costs, but that requires some sort of income.
  • I want to set up the ability for people to order books based on content selection rules, but that's going to depend on further increasing the quality and organization of the site.
  • I want to sell things like T-shirts, jackets and hats.
  • I'd like to get the community even more active, without people feeling forced to do this or that.
  • I'd like to improve communications with things like an XMPP server, XMPP MUCs and a convenient in-wiki chat box. (akin to everything2's catbox, I suppose...)
  • I'd like to see more tasks covering more domains, such as concurrency, networking, GUIs and agents/AI.
  • I'd like to see more languages.
  • I'd like to see more examples taking advantage of libraries accessible to them.
  • I'd like to see more examples which use libraries also show how to implement the functionality in their own code.
  • I'd like to see the Encyclopedia pages get split into their own realm, and get more programming-relevant content that tends to escape Wikipedia for reasons of notability or origin.
I have no solid idea which of these will happen when, except that there will always be more tasks and languages being added.

If you've got ideas, hop on over to the Village Pump and discuss them.

Why (and why not) MediaWiki

One of the most common questions and criticisms I've faced when introducing seasoned programmers and netizens to Rosetta Code is "Why MediaWiki? I think that's the wrong tool for the job."

I don't disagree.

The initial reason "Why" is simple; I had an idea, and I wanted to implement it quickly before I lost focus and interest. MediaWiki let me do that, even if initial versions of the site were very sparse not only in content, but in appearance, structure and features.  We didn't have "Unimplemented in X" pages, we didn't have a system of templates, and we didn't even have a strong layout pattern. Still, we had a site, Slashdot noticed it, and the site has slowly grown over the years. It's gathered good people and good content, templates have been abused and abused again to build the site closer to what it should be, bots have been written, run and abandoned to stitch the site together better, and we've even written and modified MediaWiki extensions to suit our needs.

Still, MediaWiki isn't the right tool for the job, even if it gets the job done with enough tweaking and coaxing. The site has a fairly well-defined structure, and generalizations along the lines of that structure would lead to more and more powerful ways to use and abuse the site and its content.

So why not switch Rosetta Code to another CMS?  It's certainly doable; in a wiki community populated by active, creative coders of many perspectives, even the nastiest problems have found automated solutions, so a migration isn't in the "impossible" category. I could even fund the use of a new, clean VPS host for the duration of the migration, assuming it didn't take too long. So, why hasn't Rosetta Code been switched to another CMS?

Support. MediaWiki may not be the right tool, but it's a well-supported tool with a lot of momentum, and a lot of active developers behind it. It's well-tested, can be configured to run at large and small scales, and has ample documentation. I would love to define my ideal structure for Rosetta Code, and work with people to have a custom CMS written to service that structure while allowing for growth and shifts in the future. In fact, at least one such CMS was already written to the observed needs of Rosetta Code (though I didn't get a chance to provide input; that was a long week at work).

While I have no reason to doubt the sincerity of any developer seeking to write (or succeeding in writing) a CMS with the intention of hosting Rosetta Code content, my major concern is that support continues for the software long after we're committed. I don't have the money to pay for that support, and so I don't have any way of ensuring that the underlying software doesn't go stagnant. If the underlying software goes stagnant, then Rosetta Code itself becomes vulnerable to increased maintenance costs in countering vulnerabilities and other concerns. By "maintenance costs," I mean my spending time on the server (as I don't have a backup admin) updating software packages and fixing problems that can't be fixed through normal user action.

About the only way I would feel at ease would be if Rosetta Code weren't the only site depending on the continued quality and development of the software. I once had a discussion about this in #proggit (and #haskell, I think) on Freenode, but I never heard back from anyone involved.

Maybe some day.

A few updates

A few interesting changes have been made to the site in the past few months.

ImplSearchBot Replaced

ImplSearchBot was replaced with a customized MediaWiki extension, provided by Opticron. All of the "unimplemented in X" pages are now produced automatically by directly querying the database. For server load considerations, memcached is used to cache the results, but the list should be accurate to within the last fifteen minutes.
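The caching behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not the extension's actual code (which is a MediaWiki extension backed by memcached): the `TTLCache` class, the `query_unimplemented` function and the `unimpl:Tcl` key are all made-up names, and an in-process dictionary stands in for memcached.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for memcached: entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so expiry can be tested
        self._store = {}            # key -> (expiry_time, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]           # still fresh: skip the expensive query
        value = compute()           # missing or expired: run the query again
        self._store[key] = (now + self.ttl, value)
        return value


def query_unimplemented(language):
    # Hypothetical placeholder for the real database query.
    return ["Task A", "Task B"]


# Fifteen minutes, matching the staleness bound mentioned above.
cache = TTLCache(ttl_seconds=15 * 60)
tasks = cache.get_or_compute("unimpl:Tcl", lambda: query_unimplemented("Tcl"))
```

The trade-off is the one the post describes: a reader may see a list that is up to fifteen minutes old, but the database is queried at most once per language in that window.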

Add a Language Guide

A rather detailed guide to adding a language has been provided, and neatly walks a newcomer through adding a new language to Rosetta Code.  If you've wanted to see a language added to the site, this will tell you how to do it.

IRC Available Through MediaWiki Again

You may remember MibbitChat, a MediaWiki extension that used Mibbit to provide access to IRC channels.  Earlier this year, Freenode blocked Mibbit as an IRC client, and then set up their own client as a replacement.  WebChat is a MediaWiki extension that replaces MibbitChat (and its Freenode sibling FreenodeChat) to provide access to everyone and Freenode.  Rosetta Code now has WebChat set up to provide access to #rosettacode from within the wiki.

GeSHi Language File Generator

To accelerate the addition of language support for GeSHi, particularly as it relates to Rosetta Code, Underscore has created AutoGeSHi, a language file generator.  In this way, even if you don't know PHP, you can still create GeSHi language files.

RSS and Atom Recent Changes feeds

These aren't new, but it seems few folks know about them.  You can follow changes on the wiki by way of the Recent Changes feeds.  They're available in Atom and RSS formats.
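For anyone who would rather consume the feed from a script than a feed reader, here is a minimal Python sketch of pulling titles and timestamps out of an Atom document. The sample feed below is made up for illustration, and the feed's URL isn't given here; MediaWiki installs usually expose it through a `feed=atom` query parameter on the Recent Changes page, but treat that as an assumption and check the link on the wiki itself.

```python
import xml.etree.ElementTree as ET

# Atom elements live in this XML namespace.
ATOM = "{http://www.w3.org/2005/Atom}"


def recent_changes(atom_xml):
    """Return (updated, title) pairs for every entry in an Atom document."""
    root = ET.fromstring(atom_xml)
    changes = []
    for entry in root.iter(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="")
        updated = entry.findtext(ATOM + "updated", default="")
        changes.append((updated, title))
    return changes


# Made-up sample data standing in for the real feed contents.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Recent changes</title>
  <entry><title>Ethiopian Multiplication</title><updated>2009-12-01T10:00:00Z</updated></entry>
  <entry><title>Quine</title><updated>2009-12-01T09:30:00Z</updated></entry>
</feed>"""

for updated, title in recent_changes(SAMPLE):
    print(updated, title)
```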

Stats update coming soon

The stats update will come next week. I'll be going on a trip soon and the new academic quarter just started. Here are the top ten languages by examples in the meantime:

1. Tcl - 353
2. Python - 319
3. Ruby - 310
4. J - 268
5. OCaml - 266
6. C - 263
7. Common Lisp - 263
8. Ada - 257
9. Perl - 254
10. Haskell - 254

Java dropped off and Common Lisp came up in a real way.

Stats update

Not much happened this month in any of the stats that we graph. All steady as they go.
pageviews1109.PNG
visits1109.PNG
pageedits1109.PNG

Top ten languages by number of examples:
1. Tcl - 348
2. Python - 313
3. Ruby - 305
4. OCaml - 263
4. Common Lisp - 263
6. J - 260
7. C - 259
8. Ada - 254
9. Perl - 248
10. Java - 240
OCaml and Common Lisp are now tied. J moved up a few places. Otherwise, nothing changed.

I'd like to take note of some of the sites that gave us traffic. Besides Wikipedia, StumbleUpon, and the usual search engines, here are some sites that linked to us (the ones that gave us at least 100 visits because I had to cut it off somewhere):
blog.html.it (this entry specifically)
autohotkey.com
blog.bestinclass.dk (these two entries specifically)
rseek.org
stackoverflow.com (lots of different pages)
wiki.python.org (this page specifically)

Unexpected downtime

Tuesday, we had a few minutes of unexpected downtime. From the sounds of things, Linode rolled out an update of their node manager, and at least a few nodes* rebooted. Rosetta Code's node also rebooted. It doesn't seem that any data was lost.

* Gauged by anecdotes seen on a live Twitter search I'm no longer watching.

Stats update

Another month, another stats update. Things have calmed down since the StumbleUpon activity spike, but we're still going. Let's check out dem numbers!

(Movable Type isn't as good with images as WordPress was...I'll try to figure it out for next month. For now they'll look a little squished.)
pageviews.PNG
You can see the SU activity at the end of August. After that, things returned to a slightly elevated state. With the new server and associated changes, we can handle it. Keep stumbling, digging, reddit-ing(?), and maybe slash dotting? Maybe clear a /. with Mike Mol first.
visits.PNG
This one pretty much did the same as the views. Not really much else to say. People are still coming to see RC.
pageedits.PNG

Oddly enough, even with the slightly increased visits and views activity, edits seemed to drop off a bit (especially near the end of the month). Go tell your friends to contribute some code. We could even use some people to fix up the wikicode in places.

Top ten languages by number of examples:
1. Tcl - 343
2. Python - 306
3. Ruby - 302
4. Common Lisp - 263
5. OCaml - 259
6. C - 257
7. Ada - 254
8. J - 248
9. Perl - 245
10. Java - 231
E fell off the bottom and Java returned. C and J moved up. E and Haskell are at 230 and 228 respectively. They may move up soon.

Happened
  • We switched servers from Slicehost to Linode.  Site is running a lot faster, now.
  • ImplSearchBot was shut down.  At present, a rewrite of a rewrite of it is serving up static JSON files built from MediaWiki's category data drawn from the internal database representation.
  • Johannes Rössel created a client-side script in PowerShell to perform the type of work that ImplSearchBot performed, based on the JSON data.
Happenings
  • Opticron is building a MediaWiki extension to do on-the-fly generation of the reports that ImplSearchBot produced.  To this end, a fresh export of many of Rosetta Code's pages was produced.
  • The "Tasks not Implemented in " pages were moved to the new Reports namespace.
Happenings to Be
  • It's well past time to update and upgrade GeSHi again, and to pull in support for the various languages that have cropped up on Rosetta Code.  Michael Mol is part of the GeSHi project, so if there are any changes, features and other concerns that need to be addressed, renew them on the relevant Syntax Highlighting page.
  • With the update of the Blog software, XFeeds has been having issues.  It may be updated, replaced or removed; the specific outcome remains to be seen.
  • The server may get some additional reconfiguration to add support for Squid caching, but doing so for MediaWiki is non-trivial and will require some extensive attention.
  • Michael Mol has found an outside company that is willing to work with Rosetta Code towards the goal of publishing and selling books based on categorical specs provided by any user or visitor of the site. Naturally, the site's GFDL license will be respected; any book sold will have a free electronic copy available for download. Each book is expected to contain a list of contributors to the pages used, as well as the contents of their user pages.
  • Michael Mol has been logging #rosettacode on Freenode continually since 2007, but hasn't yet put those logs online in a consistent and updateable fashion. Assistance and/or advice would be helpful.

Because of the growth of the site, the growth of ISB's mission, and other issues unrelated to Rosetta Code, I do not have time to maintain and operate the bot in addition to other site maintenance tasks. As a result, ISB has been disabled since Labor Day 2009, and very likely won't be resumed in its normal role. It needs to be replaced by another bot, mechanism or avenue maintained and operated by someone who has more time available to respond to bugs and feature requests. To this end, ISB has been repurposed to provide fast access to the raw category data of the wiki, avoiding some of the overhead that bots that depend on category data currently face.

I should add that the "unimplemented in X" pages are still needed, or at least the information they provide is, but I don't have the time to maintain the software that explicitly creates and maintains them; there is a backlog of other things I need to work on with respect to site infrastructure, including fixes to old problems and addition of new features. I am very, very open to helping anyone interested in writing a substitute bot get started.

Until I have time to set up the new Report namespace, take a look at these JSON files. There is one JSON file there for every category on the wiki. Each JSON file contains the contents of the relevant category, with the exception of this one, which contains the names of all the categories. There is a running service on the server that updates the JSON files within a few minutes of a page being added to the category. The update is tied to the server's five-second load average, and should almost never take longer than four minutes. The timestamps on the files, for the most part, reflect the last time the category was updated; the files are only written to if the category contents change between checks, or if the file was deleted by manual means.
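The "only write when the contents change" behaviour is what makes the file timestamps meaningful. A rough Python sketch of that idea follows; it is not the service's actual code, and the function name, the file layout (a plain JSON array of page titles) and the atomic temp-file replace are all assumptions made for illustration.

```python
import json
import os
import tempfile


def write_category_if_changed(path, members):
    """Write the member list to path only if it differs from what is on disk.

    Returns True when the file was (re)written, False when it was left alone,
    so the file's modification time tracks the last real change.
    """
    new_text = json.dumps(sorted(members), ensure_ascii=False)
    try:
        with open(path, encoding="utf-8") as fh:
            if fh.read() == new_text:
                return False        # unchanged: leave the timestamp untouched
    except FileNotFoundError:
        pass                        # missing (e.g. deleted by hand): write it
    # Write to a temporary file first, then swap it in, so readers never
    # see a half-written JSON file.
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(new_text)
    os.replace(tmp_path, path)
    return True
```

A client can then poll cheaply: if a file's timestamp hasn't moved since the last check, there is nothing new to fetch for that category.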

Wander over to ImplSearchBot Fate and Replacement and discuss a replacement for the ImplSearchBot. Some time in the next couple weeks, I'll be ready to grant Bot privileges to whatever account is to replace ImplSearchBot. Don't limit your ideas to MediaWiki bots, though; there are a variety of other ways the data could be used, from RSS feeds to in-browser widgets.
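As a starting point for a replacement, the core of an "unimplemented in X" report is just a set difference between two categories. The Python sketch below assumes each category file is a JSON array of page titles (the exact file format isn't documented here, so adjust to match), and the sample data is made up.

```python
import json


def unimplemented_tasks(all_tasks_json, language_json):
    """Return the task titles that do not appear in a language's category."""
    all_tasks = set(json.loads(all_tasks_json))
    implemented = set(json.loads(language_json))
    # Pages in the language category that aren't tasks are ignored naturally.
    return sorted(all_tasks - implemented)


# Made-up sample data standing in for two downloaded category files.
tasks_json = '["Ethiopian Multiplication", "Hello world", "Quine"]'
tcl_json = '["Ethiopian Multiplication", "Quine", "Some non-task page"]'

print(unimplemented_tasks(tasks_json, tcl_json))  # ['Hello world']
```

The same result could just as easily feed an RSS item or an in-browser widget as a wiki page.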

Change of hosting, other updates

Rosetta Code is now hosted at Linode, and has roughly twice the physical server capacity that it had at Slicehost. That extra space allowed me to properly tune MySQL, as well as improve usage of memcached and install php5-xcache. I'm still not using Squid, though I was able to pull 5-7Mb/s worth of pages when using HTTP KeepAlives.

ImplSearchBot has been down since Labor Day, and will likely remain down until it is replaced.  More on that later.

You might have noticed either that the blog was down for much of the week following Labor Day, or, alternately, that the blog looks significantly different.  For a combination of performance and security reasons, I've migrated the Rosetta Code blog from WordPress to Movable Type.  Old posts have formatting issues.  I may go back and correct them as I have time, but of all the traffic data I have, nothing suggests that that would be worthwhile.

The Rosetta Code planet is not being updated at the moment, but that will be rectified this weekend.  Hopefully, I will also be able to finish the infrastructure to provide faster and simpler access to the data that ImplSearchBot depends on.  And I will likely write a couple more blog posts.
