User:ImplSearchBot

What I am

I'm a bot. I'm owned by Short Circuit.

What I do

Maintain lists of tasks not currently implemented in any given language.

How often I do it

Not at all

Because of the growth of the site, the growth of ISB's mission, and other issues unrelated to Rosetta Code, Short Circuit does not have time to maintain and operate the bot alongside other site maintenance tasks. As a result, ISB has been disabled since Labor Day 2009 and likely won't resume its normal role. It needs to be replaced by another bot maintained and operated by someone who has more time available to respond to bugs and feature requests. To this end, ISB will be repurposed to provide fast access to the raw category data of the wiki, avoiding some of the overhead that bots depending on category data currently face. --Michael Mol 15:06, 10 September 2009 (UTC)

I should add that the "unimplemented in X" pages are still needed, but I don't have the time to maintain the software that explicitly creates and maintains them. I am very, very open to helping anyone interested in writing a substitute bot get started, and I would suggest that a discussion take place over at the Village Pump as to what features the substitute bot provides. --Michael Mol 17:33, 10 September 2009 (UTC)
Until I have time to set up the new Report namespace, take a look at these JSON files. There is one JSON file there for every category on the wiki. Each JSON file contains the contents of the relevant category, with the exception of this one, which contains the names of all the categories. A service running on the server updates the JSON files within a few minutes of a page being added to a category. The update is tied to the server's five-second load average, and should almost never take longer than four minutes. The timestamp on each file, for the most part, reflects the last time the category was updated; the files are only written to if the category contents change between checks, or if the file was deleted by manual means.
Wander over to ImplSearchBot Fate and Replacement and discuss a replacement for the ImplSearchBot. Some time in the next couple of weeks, I'll be ready to grant Bot privileges to whatever account is to replace ImplSearchBot. Don't limit your ideas to MediaWiki bots, though; there are a variety of other ways the data could be used, from RSS feeds to in-browser widgets. --Michael Mol 09:06, 13 September 2009 (UTC)
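The "only written to if the category contents change" behaviour of the JSON update service can be sketched as follows. This is a minimal illustration and not the code running on the server: the function name write_category_if_changed is hypothetical, and the file layout (a sorted JSON array of page titles) is an assumption, since the actual layout of the JSON files is not described here.

```python
import json
import os


def write_category_if_changed(path, members):
    """Write a category's members to `path` only when they differ.

    Assumed layout (not confirmed by the wiki): a sorted JSON array of
    page titles. Returns True if the file was written, False otherwise.
    """
    new_contents = sorted(members)
    try:
        with open(path, encoding="utf-8") as handle:
            if json.load(handle) == new_contents:
                # Unchanged: leave the file and its timestamp alone.
                return False
    except (FileNotFoundError, json.JSONDecodeError):
        # Missing (e.g. deleted by hand) or unreadable: rewrite it.
        pass
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(new_contents, handle)
    return True
```

Because an unchanged category never touches the file, the file's modification time can stand in for "last time the category changed", which matches the timestamp behaviour described above.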

How I do it

Check my source code. I upload it every time I run. The gist of it is that I look at the matrix of Category:Programming Tasks and Category:Programming Languages and look for any holes. If there are any new holes, or any newly filled holes, I update the pages related to the language in question.
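As a rough illustration of "new holes" and "newly filled holes", the comparison between two runs can be expressed with set differences. The sketch below is not taken from the bot's source; the function names and the dictionary-of-sets representation are assumptions made for the example.

```python
def changed_holes(previous, current):
    """Compare one language's unimplemented-task sets from two runs.

    Returns (new_holes, filled_holes) as sets of task names.
    """
    new_holes = current - previous      # unimplemented now, not before
    filled_holes = previous - current   # unimplemented before, solved now
    return new_holes, filled_holes


def languages_to_update(previous_by_lang, current_by_lang):
    """Return the sorted languages whose set of holes changed at all."""
    changed = []
    for lang in sorted(set(previous_by_lang) | set(current_by_lang)):
        before = previous_by_lang.get(lang, set())
        after = current_by_lang.get(lang, set())
        if before != after:
            changed.append(lang)
    return changed
```

Only the languages returned by languages_to_update would need their pages touched in a given run.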

I also save the contents of the categories I read to an SVN repository, with the intention of allowing other bots to read and use it.

Known bugs

  • I currently post at least three pages for every language that has a change between cycles. These are the omit template body, the primary listing template body, the actual "Tasks not implemented in X" page, and possibly a body for the omit category, if the omit category hasn't been created yet. For efficiency's sake, these should only really be posted if they've changed. (A change to the primary listing doesn't mean that the omit listing needs to be posted, for example.)
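One possible way to address this bug is to remember a digest of each page body as last posted and skip any page whose freshly rendered body has the same digest. The sketch below is only a suggestion under that assumption; the names digest and pages_to_post are hypothetical, and nothing here reflects how the bot's Perl code is organised.

```python
import hashlib


def digest(text):
    """Return a stable hex digest for a page body."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def pages_to_post(rendered, last_posted_digests):
    """Select only the pages whose rendered body differs from the last post.

    rendered: dict mapping page title -> newly rendered body.
    last_posted_digests: dict mapping page title -> digest of the body
    that was last posted (titles never posted are simply absent).
    """
    return {
        title: body
        for title, body in rendered.items()
        if last_posted_digests.get(title) != digest(body)
    }
```

With this in place, a change to the primary listing would leave the omit listing's digest untouched, so the omit listing would not be posted again.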

Fixing bugs

There is a SourceForge project for me, but Short Circuit hasn't yet imported the bot into it. (First, he needed to become familiar with Git, and then he ran short of time and had to deal with other pressing issues.) Contact him if you're interested in helping maintain and improve the code, or at least marshal the import of code and submitted patches into the SourceForge repo; he'll see that the running copy of the bot gets updated as needed.

A More Technical Description

At its most basic, the bot uses the MediaWiki::Bot Perl module to retrieve two core category listings, one for "what languages do we have" and one for "what tasks do we have". It then retrieves the category listing for "what tasks have been solved in language X" for each X in the "what languages do we have" listing. For each such listing, it finds the items in the "what tasks do we have" category that aren't in the "what tasks have been solved in language X" category, which gives a list of tasks that haven't been solved in language X. (This is particularly interesting to folks who are experts in language X, because it helps them find areas of the site that they might be interested in filling in.)
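The core computation described above is a per-language set difference. The following sketch shows the idea in Python rather than the bot's own Perl; the function name and the in-memory dictionaries are assumptions standing in for the category listings the bot fetches from the wiki.

```python
def unimplemented_tasks(all_tasks, languages, solved_by_language):
    """Map each language to the sorted tasks it has not yet solved.

    all_tasks: iterable of task names ("what tasks do we have").
    languages: iterable of language names ("what languages do we have").
    solved_by_language: dict mapping a language to the tasks solved in it;
    a language missing from the dict is treated as having solved nothing.
    """
    tasks = set(all_tasks)
    return {
        lang: sorted(tasks - set(solved_by_language.get(lang, ())))
        for lang in languages
    }


if __name__ == "__main__":
    holes = unimplemented_tasks(
        ["Hello World", "FizzBuzz", "Quine"],
        ["Perl", "Ada"],
        {"Perl": ["Hello World", "Quine"]},
    )
    for lang, missing in holes.items():
        print(lang, "is missing:", ", ".join(missing))
```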

The bot later was extended to pay attention to a set of listings, "what tasks SHOULDN'T be solved in language X", so that it could identify to viewers that such a task isn't appropriate for the language.
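Taking the omit listings into account amounts to splitting the unsolved tasks into two groups. The sketch below is an illustration only; the function name is hypothetical, and treating a task that is both solved and omitted as simply solved is an assumption, since the text does not say how the bot handles that case.

```python
def classify_tasks(all_tasks, solved, omitted):
    """Split a language's unsolved tasks into two sorted lists.

    Returns (not_implemented, not_applicable):
      not_implemented: unsolved tasks that could still be solved.
      not_applicable: unsolved tasks listed in the language's omit category.
    Assumption: a task that is both solved and omitted counts as solved.
    """
    unsolved = set(all_tasks) - set(solved)
    omitted = set(omitted)
    not_applicable = unsolved & omitted
    not_implemented = unsolved - omitted
    return sorted(not_implemented), sorted(not_applicable)
```

For example, with tasks A, B and C, where A is solved and C is omitted, the call returns (["B"], ["C"]).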

At one time, the bot correctly avoided making unnecessary calls to the MediaWiki API, but that functionality is currently broken.

Future features planned for the bot include tweeting changes for each language X, providing RSS feeds of changes for each language X, and announcing changes for each language X in interested IRC channels or XMPP chatrooms. It's also intended that the bot be generalized to support broader workloads, such as "what operating systems/platforms have been used to solve language X", "what libraries have been demonstrated in language X", "what operating systems have seen solutions for task Y", etc.

In short, at its core, ImplSearchBot attempts to be a domain-specific approach to identifying shared and unshared components of sets and sets-of-sets, and providing passive and active notifications of changes to the resulting matrix.

Source Code

A SourceForge project for ImplSearchBot has been created, allowing anyone to snag a copy of the source via Git:

git clone git://implsearchbot.git.sourceforge.net/gitroot/implsearchbot/implsearchbot

The source code is BSD licensed. This happens to mean that code pulled directly from Rosetta Code source examples cannot be put into ImplSearchBot, but code from ImplSearchBot may be pulled into Rosetta Code. If there's a bug, and you can fix it, please, feel free to send Short Circuit a patch!