August 2009 Archives

ImplSearchBot disabled

ImplSearchBot is disabled, and will stay disabled until I fix an urgent bug. As a result, the "Tasks Unimplemented in X" pages will not be updated again until it's dealt with. This is related to the 400%+ increase in normal load we've been seeing since Sunday.

The load average on the slice was as high as 22 when I checked it in response to an email. ISB itself won't drive the load average above 1; its operation is completely serial. If, however, the load average is already above 1 due to incoming traffic, it can push the load average above 2, which causes a nasty cycle that explodes the server load.
  1. User asks for a page
  2. Apache asks fcgid to run a MediaWiki PHP script.  fcgid hangs briefly while waiting on the script to spawn, since there are already other processes waiting for CPU time.
  3. Script spawns, queries database, waits for a bit.
At this point, if all things go well...
  1. Script spits out HTML content, Apache serves it up.
  2. User asks for another page.
If things aren't going so well...
  1. fcgid times out, terminates PHP script.  MySQL transaction has to be aborted, user gets an HTTP 500 Internal Server Error message.
  2. User hits reload.  See "User asks for a page" above.
In the worst case scenario...
  1. Server is taking a long time to return a result.  User gets impatient.
  2. User hits reload, or opens another tab while the first is still loading.  See "User asks for a page" above, with the added caveat that there's still another instance of a PHP script being waited on by fcgid and Apache; the user will have to wait a bit longer still.
Meanwhile, each of these PHP instances requires RAM that's already in short supply, meaning swap becomes an active part of the game. When I checked, 31% of the CPU time was spent waiting on I/O. When you see that, it typically means you're hitting your swap far more than sanity would imply. This is going to slow down any step involving work, such as a PHP interpreter allocating memory for a script's run, or the database pulling data into memory.

There's nothing I can do immediately about the RAM overages, but I can at least require ImplSearchBot to be more gentle. I'm going to modify it to sleep a variable duration based on load average; if the load average is above a certain value, it will go back to sleep. Otherwise, it will wake up, process a work item, and go back to sleep. In short, the bot's going to get lazy.
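The throttling idea might look roughly like the sketch below. It is only an illustration in Python, not ImplSearchBot's actual code: the function name `run_throttled`, the `max_load` and `base_sleep` parameters, the choice to scale the sleep with the load, and the idea of passing work items as callables are all assumptions made for the example.

```python
import os
import time
from collections import deque


def run_throttled(work_items, max_load=1.0, base_sleep=5.0,
                  get_load=lambda: os.getloadavg()[0], sleep=time.sleep):
    """Process work items one at a time, backing off while the host is busy.

    work_items is an iterable of zero-argument callables (hypothetical).
    get_load and sleep are injectable so the loop can be tried without
    touching the real load average or actually waiting.
    """
    queue = deque(work_items)
    processed = []
    while queue:
        load = get_load()
        if load > max_load:
            # Too busy: go back to sleep, longer the further we are over the limit.
            sleep(base_sleep * load / max_load)
            continue
        # Quiet enough: handle exactly one work item, then sleep again.
        item = queue.popleft()
        processed.append(item())
        sleep(base_sleep)
    return processed
```

With the defaults, the one-minute load average comes from `os.getloadavg()`, which is available on Unix-like systems. The thresholds here are placeholders; sensible values would depend on the machine.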

Update: See ImplSearchBot Fate and Replacement.

Planet Rosetta Code

It's about time to announce this.  Rosetta Code has a planet.  Its focus is on programming and things generally related to Rosetta Code.  If you have a blog, leave a note with an RSS or Atom feed URL, and I'll look at including it.

ImplSearchBot source code

ImplSearchBot's source code has long been available on the wiki, but it's now also available via Git:
git://implsearchbot.git.sourceforge.net/gitroot/implsearchbot
As Mwn3d has pointed out, it's not as efficient as it could be.  Additionally, there's a growing list of bugs that I haven't had time to fix.

What is ImplSearchBot?

From a practical standpoint, ImplSearchBot maintains the lists of tasks not implemented in the various languages found on Rosetta Code.

From a simplified technical standpoint, ImplSearchBot takes two categories, finds those pages which are in one category but not in the other, and builds lists of those pages. For our purposes, those categories are Programming Tasks and Programming Languages. An additional set of categories is used to organize the lists, the Omit Categories. There is one omit category per language, and each of those categories is used to identify what tasks are inappropriate for that particular language. Those tasks aren't removed from the listing, but merely identified as being less likely to be accomplished.

ImplSearchBot also calculates a language's penetration rate, as well as tracks the total number of languages and tasks.

Finally, it's been keeping its cache under version control, so that other scripts and processes may trigger on differences in cache versions, rather than querying MediaWiki's API.
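The category comparison described above amounts to a set difference. The sketch below is a hedged illustration in Python, not the bot's real implementation: the function names are made up for the example, the category memberships are passed in as plain collections of page titles rather than fetched from MediaWiki, and "penetration rate" is assumed to mean the fraction of all tasks that have an implementation in the language.

```python
def unimplemented_tasks(all_tasks, implemented, omitted=frozenset()):
    """Split the tasks missing from a language into (to_do, unlikely).

    all_tasks:   titles in the Programming Tasks category
    implemented: titles that are also in the language's category
    omitted:     titles in that language's omit category
    """
    missing = set(all_tasks) - set(implemented)
    # Omitted tasks stay in the listing, but are flagged as unlikely.
    unlikely = missing & set(omitted)
    return sorted(missing - unlikely), sorted(unlikely)


def penetration_rate(all_tasks, implemented):
    """Assumed definition: share of all tasks implemented in the language."""
    tasks = set(all_tasks)
    if not tasks:
        return 0.0
    return len(tasks & set(implemented)) / len(tasks)
```

For example, given four tasks of which two are implemented and one of the remaining two is in the omit category, `unimplemented_tasks` returns one task in each list and `penetration_rate` returns 0.5.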

So...What?

Honestly, I need your help. I wrote ImplSearchBot months ago to deal with a recurring question: "What tasks haven't been implemented in my language?" Unfortunately, the problem is more complicated than that, and as it's seen more and more use, it's needed more and more work. I simply don't have enough time available to make all the changes and fix all the problems, much less time to do that and also fix and improve other areas of the site.

So, what I'd like you to do is grab a copy of the source code, look at it, see what it does, figure out how it might be made more efficient, more effective, more useful, or more flexible, and send me a patch.

Stats update

There was a MediaWiki and WordPress upgrade at the beginning of the month. Other than that, it was business as usual.

[Figure: pageviews0709]

It just keeps going. Short Circuit let me know that a lot of views probably won't show because of caching. He's granted me access to the Google Analytics stats, but those will probably come in the next update. I have to figure out how to interpret them.

[Figure: viewsday0709]

This one seems to be smoothing out a bit. No real anomalies this month.

[Figure: pageedits0709]

See that little bump around the beginning of July? The bots were off for the upgrade.

[Figure: editsday0709]

That big dip is also from the bots being off. Other than that, it seems to be staying at the same level as last month.

[Figure: viewsedit0709]

This stat actually did go up very slightly when the bots were off, but then it was back to its usual routine.

Top ten tasks by all-time views:
  1. IsNumeric (19,949 views)
  2. Assigning Values to an Array (16,055 views)
  3. Change string case (15,382 views)
  4. Execute a System Command (15,374 views)
  5. Tokenizing A String (15,174 views)
  6. File I/O (12,880 views)
  7. Sorting an Array of Integers (11,202 views)
  8. Bubble Sort (10,681 views)
  9. Creating an Associative Array (10,541 views)
  10. Creating an Array (10,495 views)
Execute a System Command and Tokenizing a String swapped, but the rest stayed the same.

Top ten programming languages by number of examples:
  1. Tcl - 326
  2. Python - 289
  3. Ruby - 268
  4. C - 248
  5. Ada - 243
  6. Perl - 237
  7. E - 221
  8. Java - 219
  9. AutoHotkey - 218
  10. OCaml - 217
Haskell dropped off and C rose up. Tcl is still safely at the top.

About this Archive

This page is an archive of entries from August 2009 listed from newest to oldest.
