Talk:Longest string challenge

:::::::::: Yes, if we forbid "boxed" data, we forbid boxed strings. But if we change the restriction to "do not represent the length of any line as a number" then that (very J specific) detail might be irrelevant. That said, my first example would check 'ab' and 'abc' [using a javascript-ish notation] with: <nowiki>match('ab ',' ab')</nowiki> and match('abc','abc'). --[[User:Rdm|Rdm]] 10:51, 15 August 2011 (UTC)
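A minimal sketch of the "do not represent the length of any line as a number" idea, written in TypeScript to stay close to the javascript-ish notation above. The helper names (<code>atLeastAsLong</code>, <code>longer</code>) are hypothetical illustrations, not the <code>match</code>-based J approach itself: instead of padding and matching, this version consumes one character from each string per step, so the shorter string is whichever empties first, and no line length is ever held as a number.

<pre>
// Hypothetical sketch: decide which of two lines is longer without ever
// representing a line's length as a number. Each step drops one character
// from both strings; the string that empties first is the shorter one.
function atLeastAsLong(a: string, b: string): boolean {
  if (b === "") return true;   // b exhausted first (or a tie): a is at least as long
  if (a === "") return false;  // a exhausted while b still has characters
  return atLeastAsLong(a.slice(1), b.slice(1)); // drop one character from each
}

function longer(a: string, b: string): string {
  return atLeastAsLong(a, b) ? a : b;
}

// longer('ab', 'abc') returns 'abc'
</pre>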
::::::::: I think the solution you suggest treads a fine line - which I think is ok because of the approach. The same structures could be used to read the entire file into memory and perform two passes - which I would think is on the wrong side of the line. There's been a suggestion elsewhere that rereading should be prohibited (and I tend to agree). So some of this (where the line should be drawn) is going to be subjective. For example, the recursive solution uses recursion effectively as a way of rereading the strings, yet it feels like it meets the intent. Ultimately, this task may encourage multiple solutions. --[[User:Dgamey|Dgamey]] 04:27, 15 August 2011 (UTC)
:::::::::: Ultimately, you might say that J is all about "doing multiple passes". Reading a file, in J, takes a reference to a string representing the file name and returns a string representing the file contents. At that point, it's a done deal: the moment you do two operations on that string, you are doing "multiple passes". I can think of no meaningful way to prohibit "multiple passes" and still allow a J solution. That said, there are good (efficiency) reasons for doing it this way for most tasks. But that gets into a long discussion of computer memory architecture and efficiency, and is not one I want to start within a paragraph indented by <code>::::::::::</code>; the short form is that cpu caches are optimized for serial processing. And, yes, this can create an issue when you get into multi-gigabyte files. One approach, in that case, would be to take the solution that worked on shorter files and recast it as working on "blocks". This typically means doing something special about the line which gets broken between blocks, and combining results from different blocks; thus you would expect a solution about 3 times the complexity of what you would need for a single file. (When you are working with data that big, this approach can be much more efficient than "line at a time" approaches. But a more important point is that most files nowadays should not be multi-gigabyte files. That said, there's also a language implementation issue here, which gets into how memory-mapped files get handled.) --[[User:Rdm|Rdm]] 10:51, 15 August 2011 (UTC)
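A sketch of the "blocks" recasting described above, again in TypeScript and under the assumption that input arrives as arbitrary chunks (for example, from a stream); none of this is from the J solution itself, and <code>longestLines</code> is a hypothetical name. The two special cases mentioned are visible directly: the line broken between blocks is carried over and glued onto the next chunk, and per-block results are combined by keeping every line tied for longest so far. (For brevity this sketch compares lengths numerically; the numberless comparison from the earlier sketch could be substituted.)

<pre>
// Hypothetical sketch of block-at-a-time processing: find the longest
// line(s) across chunks, handling the line that breaks between blocks.
function longestLines(chunks: Iterable<string>): string[] {
  let carry = "";           // partial line broken at a block boundary
  let best: string[] = [];  // all lines tied for longest so far
  const consider = (line: string) => {
    if (best.length === 0 || line.length > best[0].length) best = [line];
    else if (line.length === best[0].length) best.push(line);
  };
  for (const chunk of chunks) {
    const parts = (carry + chunk).split("\n");
    carry = parts.pop() ?? "";   // last piece may continue in the next block
    parts.forEach(consider);     // complete lines within this block
  }
  if (carry !== "") consider(carry); // input may not end with a newline
  return best;
}

// longestLines(["a\nbb", "b\nc\n"]) returns ["bbb"]
</pre>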
 
== Not pointless ==