Talk:Brace expansion

From Rosetta Code
Revision as of 11:31, 26 January 2014 by rosettacode>Smls (fix typos)

Task is ready for accepting solutions!

As far as I'm concerned, the task description is stable. Feel free to add solutions for other languages, and if you have any problems with the task then drop a line here! --Smls (talk) 17:49, 25 January 2014 (UTC)

There appears to be a bug in the task. Your third test case should have backslashes on the comma by the "always leave in the backslash" rule. See the Perl 6 output. --TimToady (talk) 02:02, 26 January 2014 (UTC)
I edited the task to add the backslashes, after verifying that the Perl code actually does include the backslashes before comma. --TimToady (talk) 02:53, 26 January 2014 (UTC)
Oops, sorry about that, I verified the Perl function using a script that automatically tested it against a local copy of the test cases (and a bunch more), where I did add the backslashes. Thanks for fixing the test case here. --Smls (talk) 10:16, 26 January 2014 (UTC)
Also, with the toy output, it's not at all clear that the current Perl or Python implementations are doing this according to spec. The Perl code claims to follow the spec, but the spec is missing backslashes on the commas. The Python code talks about barfing on code that the specification says it should accept. Maybe we should require a bit more rigor in proving solutions actually pass the spec for backslashed and not-really-meta characters. --TimToady (talk) 02:22, 26 January 2014 (UTC)
Regarding the python code: the commas not in bracers can be parsed as literal chars, but the unmatched bracers as specified by task is not workable. How do you parse "{a,{b,c}" ? As "{a", "b", "c", or as "a", "{b", "c"? Same for closing bracers. This is not just a problem with descending parsers, it's simply ambiguous and counterintuitive, so best treated as a syntax error IMO. The part of spec about "{a}" parsed literally is also not done in python, which can take some workaround but is again not intuitive. --Ledrug (talk) 02:37, 26 January 2014 (UTC)
I understand that Python culture tends more toward throwing an exception if in doubt, but I think the intent of the task was to emulate the shell's notion of how to do this, hence the failsoft aspects, which are pretty easy to do with a decent backtracking engine. It's certainly straighforward in the Perl 6 solution, once I figured out the top level parses with slightly different rules than sublevels do. And it was certainly specced how it was supposed to treat weird characters. --TimToady (talk) 02:49, 26 January 2014 (UTC)
Well, I'm not really with the python culture, but that's not important. I can see why the spec is setup this way, but the fact remains that the rule about unmatched bracers is underdefined. The spec should say which bracers would be considered extra for consistency across implementations. I'll withdraw the python code for now since it does fail to follow other rules, while the task hopefully can get some clarification. --Ledrug (talk) 03:19, 26 January 2014 (UTC)
Another thing, what's supposed to happen when the string ends with an extra "\"? --Ledrug (talk) 03:29, 26 January 2014 (UTC)
I've attempted to clarify what to do in the case of ambiguity; in general, the closest brace that makes sense in context is used. I suppose a backslash at the end of the string should just be taken as a literal. --TimToady (talk) 04:00, 26 January 2014 (UTC)
Yeah, I thought the "closest brace" disambiguation was implied by "} closes the last opened alternation group " and all the talk about "balanced" brace pairs, but it's a good thing you added a more explicit clarification. As for it being "counterintuitive", well, it's how my shell (zsh) handles unbalanced braces, so at least to me it's intuitive... :) --Smls (talk) 10:16, 26 January 2014 (UTC)
Regarding backslash at the end of the string, I semi-purposefully left that case unspecced, because such inputs may well be impossible in real-life scenarios, and I didn't want to force implementors to complicate their solutions because of it. But I did implement it as "interpret literally and pass along to output" in the Perl solution, just in case. If you think it makes sense as a hard requirement, I'm OK with having it in the spec. Although I would also be fine with something like "You may assume that input strings will not end with an unescaped backslash" in the spec. --Smls (talk) 10:16, 26 January 2014 (UTC)
TimToady is right, I did intend for fault-tolerant parsing (as opposed to throwing an error on malformed input) to be a core aspect of this task (see the "Motivation" section). Of course if it's an unreasonable requirement (e.g. making the task too big/difficult) it could be dropped, but I think that would make the task much less interesting (and there are already many other Rosetta Code tasks that deal with strict parsing). Note that I managed to implement all the requirements in the Perl solution, with plain while loops and a non-backtracking regex (which could even be replaced by low-level 'substr' operations without much difficulty), so it should be possible in Python as well. --Smls (talk) 10:17, 26 January 2014 (UTC)
Hm yeah, I guess requiring solutions to demonstrate the four test cases (with a listing of the actual output), instead of some "toy output", might make sense after all (even if it will make the page very long once there are many implementations). Btw, that Perl 6 solution looks pretty sweet... :) --Smls (talk) 10:16, 26 January 2014 (UTC)
I updated both the task description and the Perl solution accordingly now. --Smls (talk) 10:57, 26 January 2014 (UTC)