Rosetta Code:Village Pump/Task creation process discussion

This is a particular discussion thread among many which consider Rosetta Code.

Summary

Question on what kinds of restrictions should be applied to the creation of tasks.

Discussion

Many tasks versus many languages

The new J and Python contributors are creating lots of new tasks in preference to solving existing tasks. Should we be providing guidance on the types of tasks suitable for Rosetta Code? Here are my ideas, feel free to contradict.

  • Rosetta Code is about language comparison. We should encourage solving existing tasks in preference to creating new tasks.
  • We encourage contribution from the public in their free time. If we choose tasks that are too large or difficult, then we are likely to only get a few solutions. Tasks should be chosen which can be implemented succinctly in a variety of languages.
  • Rosetta Code started mostly with trivial tasks designed to demonstrate language features. Do we want to retain that focus? Are there any feature areas we forgot?
  • The more tasks we have, the more each individual task gets lost within Category:Solutions by Programming Task. We can solve this either by adding structure (subcategories) or restricting the number of tasks (merging and rejecting).
  • Duplication should be avoided. If two tasks are demonstrating the same language features, then one should be cut.

I think it would be fine to restrict our focus considering that there are other sites, like Literate Programs, for showing off public code. --IanOsgood 12:01, 24 December 2007 (MST)

Quite often the task descriptions are created with a particular set of languages (or a particular paradigm) in mind, and other languages may not map into it completely. I think it may be better to allow creation of new tasks, but filter/merge them into a more generic task at a later time where there is redundancy. It is actually nicer to have a larger number of tasks because it allows new contributors a little more flexibility in choosing the tasks (but it becomes harder for languages to become complete). Rahul 20:09, 7 October 2008 (UTC)

I disagree. I don't think that particular languages are thought of when creating tasks. Maybe very general paradigms are thought of (like "object oriented" or "functional"), but mostly the focus is something that people do frequently or can learn a lot from. Merging isn't very easy if a lot of examples sprout up quickly, especially since merging is human enough that a bot can't do it well and--even when called upon--experts on a particular language don't frequently merge old examples and change them for the new task (see: String Length and Loop Structures vs Iteration). Having a large number of tasks is nice in some ways and not nice in others. It's nice because we can have good coverage of programming ideas, but not nice because they can get lost in the solutions category (even with the recent reorganization). I think we should continue to make sure that new tasks are unique, valuable, and possible in many languages. This can be helped by not encouraging a create then review and merge/delete sort of process.--Mwn3d 20:33, 7 October 2008 (UTC)
Oh, I meant stuff like Object_Serialization, where the task was created with languages in mind that support both objects and inheritance (roughly, a paradigm). I am not saying it is done consciously; rather, it is an observation. As a suggestion regarding the downsides of numerous related tasks, perhaps we could have a way to promote only really nice and generic tasks to the solutions page (or equivalent)? I agree with your comment on merging old examples, but I think we do need a way to introduce more generic/useful/better tasks that is less painful than updating the task description and forcing all implementations to change. Rahul 21:57, 7 October 2008 (UTC)
We've got way too much structured code on RC. I need to create a task solvable only by goto.. ;-)
Rather than promoting pages to the Solutions By Task page, why don't we move them to a "Retired" subcategory? Sadly, though, this means we're going to need some sort of approval process. Gah. Procedures and rules. I move we continue this discussion at the Village Pump. --Short Circuit 03:33, 11 October 2008 (UTC)
I think a better title for Object Serialization might be "Saving/loading program state" and an example that allows many types of language to provide examples. Whilst you might strive for program paradigm neutrality, I still think it would be useful to have tasks that are naturally trivial in some class of languages - say constraints, or bit manipulation.
Since we are comparing languages, even though the task description may describe an algorithm to use, if your language has a built-in feature that is normally used instead, you might be justified in using it, with an explanation, as well as (or instead of) the algorithm asked for. For example, it would be a shame if someone reading the quicksort task went away thinking that it was THE way to sort in Python/Perl/Ruby etc. --Paddy3118 16:43, 11 October 2008 (UTC)
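To illustrate the last point, here is a minimal sketch (not taken from the quicksort task page) that shows a hand-written quicksort next to Python's built-in sorted(), which is what one would normally use. The function name quicksort and the sample data are assumptions made for this example only.

```python
def quicksort(items):
    """Return a new sorted list using a simple, non in-place quicksort."""
    if len(items) <= 1:
        return list(items)
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x < pivot]    # elements below the pivot
    larger = [x for x in rest if x >= pivot]    # elements at or above the pivot
    return quicksort(smaller) + [pivot] + quicksort(larger)


data = [5, 3, 8, 1, 9, 2]

# The algorithm the task asks for.
by_algorithm = quicksort(data)

# The idiomatic way to sort in Python: the built-in sorted().
by_builtin = sorted(data)

print(by_algorithm)
print(by_builtin)
```

Both calls give the same ordering; an entry showing both, with a short note on which is the usual choice, keeps the reader from mistaking the exercise for the idiom.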

Three types of tasks

As I see it, three useful kinds of tasks are (and should be) on Rosetta Code: trivial examples for demonstrating discrete language features, practical examples of how to accomplish ordinary tasks, and more complex examples that show how a given language is used in practice to write nontrivial programs. That allows for a lot of different tasks. (The third category in particular covers everything from Roman Numerals to RCBF.) But tasks that don't fit neatly into any of those categories probably don't belong here, and tasks that are too similar to be simultaneously useful should be deleted. —Underscore 17:38, 13 October 2008 (UTC)

Those three categories cover a lot of tasks. Are there any on here that you think don't fit into those categories (just as an example to help specify). --Mwn3d 17:45, 13 October 2008 (UTC)
Here are a few: Apply a callback to an Array, Polynomial Fitting, Sum of squares, Search for a User in Active Directory. What these tasks lack is a clear sense of purpose, compared to similar tasks. For instance, Sum of squares is a pretty superficial variant of Sum and product of array; the chief difference isn't even the squaring, but the use of the word "array", since arrays aren't the simplest sort of data structure in some languages (like Haskell). —Underscore 23:59, 13 October 2008 (UTC)
I think Apply a callback was basically "send (or simulate sending) a function as an argument" and that's a pretty discrete language feature. Polynomial fitting seems more like one of those third category tasks (nontrivials). I see how the active directory one doesn't fit..I never quite understood that one. Sum of squares does seem like a duplicate of sum and product, but there was a long discussion about that one already that I don't want to relive. It seems like it'd be a second category task (ordinary). Am I thinking about this incorrectly? --Mwn3d 00:18, 14 October 2008 (UTC)
Using a function as an argument to another function is indeed a feature worth demonstrating; I guess I'd recommend the task's name or description be amended to reflect that purpose, since it wasn't obvious to me. Likewise, looking at "Sum of squares"'s discussion page, I feel we ought to replace it with (or it ought to be more clearly marked as) a generic function-composition task. The reason why I don't feel "Polynomial fitting" fits in the third category is because in practice, you wouldn't accomplish such a task with generic language features, but a special-purpose library, so all the tasks will look like:
Load library
Run library function
making the program trivial from the library-user's point of view, and while I feel demonstrating trivial features of a language is a worthy end, I don't feel the same way about libraries, since Rosetta Code is about languages, not libraries.
The bottom line is, if you're thinking my categories are too subjective to be useful, you may well be right. I'm going to go out on a limb and say that chances are, the only truly effective solution to this problem would be a set of clear guidelines as to what tasks do and don't belong on Rosetta, and a task-deletion process in the style of Wikipedia's Articles for Deletion. Or we could go really wild and require community consensus for tasks to be created in the first place. —Underscore 01:07, 15 October 2008 (UTC)
Those categories are a good start I think. We just need to hammer out the details and make sure we all agree on things. I wouldn't be opposed to some new rules for task creation (added to Help:Adding a new programming task), but I don't think we need rules on task deletion yet. I don't think we think about deleting tasks that much, and when we do a little bit of discussion is enough to figure out what to do. Maybe we should try to "meet" in the IRC channel to get a real time discussion going (and to make sure more people can chime in before it gets too crazy). --Mwn3d 16:14, 15 October 2008 (UTC)
Mmm… I'm afraid I can't make a commitment, since school is a little hectic at the moment. Probably the best way to do this would be for someone (possibly me, if I get the chance) to make a draft set of guidelines on a new page. Then we can all debate about it on the talk page and edit it, in the grand tradition of wikis. —Underscore 22:24, 16 October 2008 (UTC)

Hi, I gave some thought to your third category:

  • "... more complex examples that show how a given language is used in practice to write nontrivial programs"

I would think that non-trivial programs would just be too long. When reading RC and thinking of examples, I always try to think of something that would have a short solution (as well as a short definition) in most languages I might know, and that I find interesting and think might be of interest to others. This tends to have me looking again at examples of the advertised strengths (and weaknesses) of different programming languages, and at algorithms in general. I don't think it is good to go out of the way to think of examples that can be done by a large selection of languages; I think that would lead to a rather bland RC. I like it when a task is done by several languages in a similar way, then along comes an entry that does it in a new, intriguing way. Sometimes other languages then add their implementations of this new way. That's interesting! Someone once said that a good programming language will alter the way you think of solving a problem, and RC has introduced me to a few such cases; the J language explanations are mind-blowing, for example. The Perl example of how to solve look-and-say with a regexp is another. --Paddy3118 08:44, 26 April 2009 (UTC)

Needing a feature

To encourage people who know language X to write code (in X) for an existing task rather than thinking up a new one, a page (automatically generated) listing all the tasks not yet solved in language X should be created, so that a user who knows X can easily browse the tasks where he/she can help with his/her language of choice.

If there is already this feature... where is it?! --ShinTakezou 16:03, 8 December 2008 (UTC)

We had a request for this feature on the Rosetta code:Wiki Wishlist, but it seems like it's a bit hard to implement. It would likely require a bot running on the RC server. I had tried to make a bot a few weeks ago, but hit a roadblock (and lost motivation). Maybe a bot person could be added to the list of jobs that people can claim in the new topic in the Village Pump (we should call him the "Bot Commander" or something). --Mwn3d 16:10, 8 December 2008 (UTC)
Hm, ... could it be done (with efforts :( ) the following way? Adding hidden templates to tasks, like placeholders for all RC-known languages... when a user adds a language, automatically the template is superseded by the header|lang template... or the user simply must remove it by hand... Then the finding of a task unimplemented in the language X will be the same as the finding of a task implemented in the language X (Solutions by Language)... but I don't know the details of running a wiki, so maybe it is not a good idea...? --ShinTakezou 23:28, 8 December 2008 (UTC)
I'd like an automatic solution. Giving users too many instructions for adding an example may discourage it. I really think a bot is the best way. --Mwn3d 02:19, 9 December 2008 (UTC)
Well, the feature existed for a few months this year. Needs to be replaced. --Michael Mol 17:45, 13 September 2009 (UTC)
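The feature requested in this thread boils down to a set difference: all tasks minus the tasks already solved in language X. The sketch below shows only that core step. The function name unimplemented_tasks and the sample data are hypothetical, and how a bot would actually obtain the two sets from the wiki is deliberately left out.

```python
def unimplemented_tasks(all_tasks, solved_by_language, language):
    """Return the sorted list of tasks with no solution in `language`."""
    solved = solved_by_language.get(language, set())
    return sorted(set(all_tasks) - solved)


# Hypothetical sample data; a real bot would read this from the wiki's categories.
all_tasks = {"LZW compression", "Quicksort", "Roman Numerals", "Sum of squares"}
solved_by_language = {
    "C": {"Quicksort", "Sum of squares"},
    "J": {"LZW compression", "Quicksort", "Roman Numerals", "Sum of squares"},
}

for lang in ("C", "J", "Forth"):
    print(lang, unimplemented_tasks(all_tasks, solved_by_language, lang))
```

A language with no solutions at all (here "Forth") simply gets the full task list, so no special case is needed for newly added languages.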

New issue, if it is an issue

I am working on an implementation of the LZW algorithm in C (see LZW compression). There are languages where it looks so simple, since the language provides hashing or something similar in a rather standard and easy way. That is not so for C: I don't know of any widespread, common libraries that provide hashing or similar. Since I liked the task, I started working on it, but of course I first needed to create 1) an easy way of handling strings as sequences of N bytes (so not the C way), and 2) a dictionary (string hash → integer) with all the needed machinery. The code for the compressor only, with a few lines of debug output and 18 lines for the compression usage example, is 400 lines long... I think I should create a new page, like LZW compression/C. Or is it better to create derivative tasks? My doubt is this: such tasks would exist just for C and similar languages that lack natural hash or string handling (in all other cases, when a suitable task exists and the C source exists, the code is likely too specialized to be reused in the LZW task...). Is this still OK for RC? Waiting for suggestions while completing the code... --ShinTakezou 01:00, 21 December 2008 (UTC)
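For contrast, here is a rough sketch of why the compressor looks so short in a language with a built-in hash table. It is the textbook LZW compression loop written in Python, not the C code described above; the function name lzw_compress and the test string are assumptions made for this example.

```python
def lzw_compress(text):
    """Compress a string into a list of integer codes using textbook LZW."""
    # Start with one dictionary entry per single-byte character.
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256

    current = ""
    output = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            # Keep extending the current match.
            current = candidate
        else:
            # Emit the code for the longest known prefix, then learn the new string.
            output.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = ch
    if current:
        output.append(dictionary[current])
    return output


codes = lzw_compress("TOBEORNOTTOBEORTOBEORNOT")
print(codes)
# prints [84, 79, 66, 69, 79, 82, 78, 79, 84, 256, 258, 260, 265, 259, 261, 263]
```

The built-in dict does the work of both pieces that had to be written by hand in C: length-aware strings as keys and the string-to-integer dictionary.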

Is there some way of putting your code in a scrollable section of, say, only 50 lines long? --Paddy3118 15:35, 21 December 2008 (UTC)
I would suggest putting any C hash implementation into Creating an Associative Array, and then referencing that implementation in this task. Two tasks for one! --IanOsgood 16:47, 21 December 2008 (UTC)
I think that an existing C hash-table or data-structure library should be used, and that Rosetta Code should avoid reimplementing libraries like this -- Creating an Associative Array is oriented toward *using* an implementation, not making one, even if it doesn't explicitly say so. (And for what it's worth, I think the general principle to apply is: what would be enlightening to the reader?) --Kevin Reid 19:59, 21 December 2008 (UTC)
I partly agree. But the library should be widespread and common (considered almost standard) (?), and I have found no such library (if you know one, tell me; it must be simple, not like SunriseDD, which I found... it is oversized for this and other tasks... but as a last resort I will try to learn and use it). If being widespread or almost standard is not a requirement, I could upload my "hash-table" implementation to my site and drop a link to it. I suppose there is theoretically nothing wrong in doing so, since we are not really interested in performance or the like (what about this? Can I use libraries I've created, as GPLed code of course?). The aim of my implementation was to be able to write the algorithm almost as a translation of the Java code. --ShinTakezou 00:03, 22 December 2008 (UTC)
I suppose I could use uthash, it seems simple enough. Going to adapt the code, but not this night:D --ShinTakezou 00:25, 22 December 2008 (UTC)
Just to say how this ended up: I've used the Judy library... for LZW, I've created a subtask for "binary strings" (debatable), which proved to be usable (though not elegantly) in other tasks too, while I've kept a specialized "dictionary" implementation for LZW. --ShinTakezou 23:45, 26 April 2009 (UTC)