Talk:Brace expansion: Difference between revisions

Content added Content deleted

Inline

Revision as of 18:59, 31 January 2014

Task is ready for accepting solutions!

As far as I'm concerned, the task description is stable. Feel free to add solutions for other languages, and if you have any problems with the task then drop a line here! --Smls (talk) 17:49, 25 January 2014 (UTC)

There appears to be a bug in the task. Your third test case should have backslashes on the comma by the "always leave in the backslash" rule. See the Perl 6 output. --TimToady (talk) 02:02, 26 January 2014 (UTC)

I edited the task to add the backslashes, after verifying that the Perl code actually does include the backslashes before comma. --TimToady (talk) 02:53, 26 January 2014 (UTC)

Oops, sorry about that, I verified the Perl function using a script that automatically tested it against a local copy of the test cases (and a bunch more), where I did add the backslashes. Thanks for fixing the test case here. --Smls (talk) 10:16, 26 January 2014 (UTC)

Also, with the toy output, it's not at all clear that the current Perl or Python implementations are doing this according to spec. The Perl code claims to follow the spec, but the spec is missing backslashes on the commas. The Python code talks about barfing on code that the specification says it should accept. Maybe we should require a bit more rigor in proving solutions actually pass the spec for backslashed and not-really-meta characters. --TimToady (talk) 02:22, 26 January 2014 (UTC)

Regarding the python code: the commas not in bracers can be parsed as literal chars, but the unmatched bracers as specified by task is not workable. How do you parse "{a,{b,c}" ? As "{a", "b", "c", or as "a", "{b", "c"? Same for closing bracers. This is not just a problem with descending parsers, it's simply ambiguous and counterintuitive, so best treated as a syntax error IMO. The part of spec about "{a}" parsed literally is also not done in python, which can take some workaround but is again not intuitive. --Ledrug (talk) 02:37, 26 January 2014 (UTC)

I understand that Python culture tends more toward throwing an exception if in doubt, but I think the intent of the task was to emulate the shell's notion of how to do this, hence the failsoft aspects, which are pretty easy to do with a decent backtracking engine. It's certainly straighforward in the Perl 6 solution, once I figured out the top level parses with slightly different rules than sublevels do. And it was certainly specced how it was supposed to treat weird characters. --TimToady (talk) 02:49, 26 January 2014 (UTC)

Well, I'm not really with the python culture, but that's not important. I can see why the spec is setup this way, but the fact remains that the rule about unmatched bracers is underdefined. The spec should say which bracers would be considered extra for consistency across implementations. I'll withdraw the python code for now since it does fail to follow other rules, while the task hopefully can get some clarification. --Ledrug (talk) 03:19, 26 January 2014 (UTC)

Another thing, what's supposed to happen when the string ends with an extra "\"? --Ledrug (talk) 03:29, 26 January 2014 (UTC)

I've attempted to clarify what to do in the case of ambiguity; in general, the closest brace that makes sense in context is used. I suppose a backslash at the end of the string should just be taken as a literal. --TimToady (talk) 04:00, 26 January 2014 (UTC)

Yeah, I thought the "closest brace" disambiguation was implied by "} closes the last opened alternation group " and all the talk about "balanced" brace pairs, but it's a good thing you added a more explicit clarification. As for it being "counterintuitive", well, it's how my shell (zsh) handles unbalanced braces, so at least to me it's intuitive... :) --Smls (talk) 10:16, 26 January 2014 (UTC)

Regarding backslash at the end of the string, I semi-purposefully left that case unspecced, because such inputs may well be impossible in real-life scenarios, and I didn't want to force implementors to complicate their solutions because of it. But I did implement it as "interpret literally and pass along to output" in the Perl solution, just in case. If you think it makes sense as a hard requirement, I'm OK with having it in the spec. Although I would also be fine with something like "You may assume that input strings will not end with an unescaped backslash" in the spec. --Smls (talk) 10:16, 26 January 2014 (UTC)

TimToady is right, I did intend for fault-tolerant parsing (as opposed to throwing an error on malformed input) to be a core aspect of this task (see the "Motivation" section). Of course if it's an unreasonable requirement (e.g. making the task too big/difficult) it could be dropped, but I think that would make the task much less interesting (and there are already many other Rosetta Code tasks that deal with strict parsing). Note that I managed to implement all the requirements in the Perl solution, with plain while loops and a non-backtracking regex (which could even be replaced by low-level 'substr' operations without much difficulty), so it should be possible in Python as well. --Smls (talk) 10:17, 26 January 2014 (UTC)

Hm yeah, I guess requiring solutions to demonstrate the four test cases (with a listing of the actual output), instead of some "toy output", might make sense after all (even if it will make the page very long once there are many implementations). Btw, that Perl 6 solution looks pretty sweet... :) --Smls (talk) 10:16, 26 January 2014 (UTC)

I updated both the task description and the Perl solution accordingly now. --Smls (talk) 10:57, 26 January 2014 (UTC)

Duplicate supression

The task currently has an ambiguity - specifically an explicit reference to an implementation (perl 6) which implements a requirement which is not explicitly stated in the task description.

More specifically, the perl 6 implementation suppresses duplicate results despite no description of this mechanism in the current task description. Consider {a,a,b} as an example which would generate duplicates. This issue shows up in the test case '{,{,gotta have{ ,\, again\, }}more }cowbell!' where nested empty prefixes appear. This case should have eight expansions. You can easily see this by placing a letter to the left of each non-escaped comma.

This means either the task description is buggy or the reference implementation is buggy. --Rdm (talk) 18:31, 31 January 2014 (UTC)

Well, I believe the task actually refers to the Perl (5) implementation, not the Perl 6 one. But leaving that aside, there's no duplicate suppression taking place here. Note what happens when I run these (through the Perl 6 version):

{X,{Y,gotta have{ ,\, again\, }}more }cowbell!
    Xcowbell!
    Ymore cowbell!
    gotta have more cowbell!
    gotta have\, again\, more cowbell!

{a,a,b}
    a
    a
    b

We still get four entries for the first one, and the dupes aren't removed from the second one. I don't know what you're doing different, offhand, perhaps not throwing away stuff when you backtrack? --TimToady (talk) 18:54, 31 January 2014 (UTC)

The reference solution (Perl) does agree with the spec: There should be no suppression of duplicates. As for the test case you mention, it has four expansions and not eight, because nested brace groups only expand the "branch" they belong to. For example, the pattern aa{,{,11}cc}22 has three expansions (aa22 aacc22 aa11cc22), not four. --Smls (talk) 18:58, 31 January 2014 (UTC)

@@ Line 38: / Line 38: @@
 :We still get four entries for the first one, and the dupes aren't removed from the second one.  I don't know what you're doing different, offhand, perhaps not throwing away stuff when you backtrack?  --[[User:TimToady|TimToady]] ([[User talk:TimToady|talk]]) 18:54, 31 January 2014 (UTC)
-:The reference solution (Perl) does agree with the spec: There should be no suppression of duplicates. As for the test case you mention, it has '''four''' expansions and not eight, because nested brace groups only expand the "branch" they belong to. For example, the pattern <code>aa{,{,11}cc}22</code> has three expansions (<code>aa22</code>&nbsp; <code>aacc22</code>&nbsp; <code>aa11cc22</code>), not four.
+:The reference solution (Perl) does agree with the spec: There should be no suppression of duplicates. As for the test case you mention, it has '''four''' expansions and not eight, because nested brace groups only expand the "branch" they belong to. For example, the pattern <code>aa{,{,11}cc}22</code> has three expansions (<code>aa22</code>&nbsp; <code>aacc22</code>&nbsp; <code>aa11cc22</code>), not four. --[[User:Smls|Smls]] ([[User talk:Smls|talk]]) 18:58, 31 January 2014 (UTC)
---[[User:Smls|Smls]] ([[User talk:Smls|talk]]) 18:58, 31 January 2014 (UTC)