Talk:Word wrap: Difference between revisions


Revision as of 00:25, 22 August 2013 by Gerard Schildberger (→REXX Timings: made the OW subroutine non-destructive, running low on coal.)
Latest revision as of 03:25, 17 August 2018 by Gerard Schildberger (→REXX timings: added some comments.)
(23 intermediate revisions by 2 users not shown)
Line 45:
::: Oh yes. I just took the question as asking "in a perfect world ...". --[[User:Paddy3118|Paddy3118]] ([[User talk:Paddy3118|talk]]) 10:24, 21 August 2013 (UTC)

== REXX timings ==

I created a file containing one line of about 1000000 characters containing words of 1 to 90 characters, randomly distributed such as
Line 206:
:: Translated version 2 to PL/I. Since PL/I has a limit of 32767 for character strings I had to cut the input into chunks of 20000 bytes and add extra reads. Output is identical to REXX. --[[User:Walterpachl|Walterpachl]] ([[User talk:Walterpachl|talk]]) 19:38, 21 August 2013 (UTC)

The last shown REXX program has a problem with classic REXX: '''fn''' is an unknown function.   Also, that REXX program only reads the first record of the file (does exactly one read) instead of looping until done.   It would make more sense to exclude the time to read the file and to bypass writing the records to the file, as the I/O would be unvarying and slightly dependent on other I/O activity in the system, not to mention caching.   Whoever does the first reading pays for all the I/O; the 2nd reading would come from cache.   I would benchmark a paragraph of text as the task says, not a million bytes.   Scale up the number of executions to make the timings meaningful.   Also, I took the liberty of breaking up the listing of the REXX programs into separate sections; perhaps it would be a good idea to label/identify them, not to mention bringing versions 0 and 1 up to date. -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 21:01, 21 August 2013 (UTC)

-----
Line 248:
:: version 0 and 1 remove them and reduce multiple blanks to one blank. --[[User:Walterpachl|Walterpachl]] ([[User talk:Walterpachl|talk]]) 21:59, 21 August 2013 (UTC)
::: What about version zero ???
REXX version 0 and 1 already remove leading and multiple embedded blanks (as well as trailing blanks). -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 22:06, 21 August 2013 (UTC)
:::: that's what I tried to say. the '1' got lost.--[[User:Walterpachl|Walterpachl]] ([[User talk:Walterpachl|talk]]) 22:15, 21 August 2013 (UTC)
Line 275:
:::::* REXX version 2     1.06 seconds   (optimized with   '''parse'''   statement)
:::::* REXX version 2     1.05 seconds   (optimized by making the '''ow''' subroutine non-destructive)
:::::* REXX version 2     1.01 seconds   (optimized by making the '''ow''' subroutine in-line)
:::::* REXX version 2     0.96 seconds   (optimized the inner DO loop, eliminated an   '''if'''   statement)

The   '''lastpos'''   BIF was used to find the last blank within a field of '''W''' characters, instead of searching for the last blank character by character.
<br>Further optimization was done using   '''parse'''   instead of   '''substr'''   and other such thingys. -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 23:39, 21 August 2013 (UTC)
<br>I really have to stop optimizing that REXX program, I'm running out of coal. -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 00:11, 22 August 2013 (UTC)
<br>Well, I ran out of coal ... can't stoke the fires anymore. -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 02:31, 22 August 2013 (UTC)
:: I refrained from using lastpos since the (classic?) Rexx on the host does not have it. Is version 2 that you refer to "my way" modified as noted above? Are your final versions 1 & 2 available somewhere?
I had to look up vestigual (limited English) - it should have been vestigial:-) --[[User:Walterpachl|Walterpachl]] ([[User talk:Walterpachl|talk]]) 06:35, 22 August 2013 (UTC)
::: Yes, the REXX version 2 (as mentioned above) is a cumulative modification of the original, that is, the 3rd optimization is the changed 2nd optimization. -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 21:19, 22 August 2013 (UTC)
::: (about the misspelling of ''vestigial''):   I did a <strike> quite </strike> quick web check and found many hits on ''vestigual'', but I saw the answers to a question and thought that was the correct spelling.   At least, I'm not alone in misspelling that word:   Vestigial Vsetigial Vesitgial Veetigial Veatigial Vedtigial Vewtigial Vextigial Vesrigial Vesgigial Vesyigial Vestogial Vestugial Vestkgial Vestirial Vestinial Vestitial Vestihial Vestibial Vestifial Vestigoal Vestigual Vestigkal Vestigisl Vestigizl Vestigiql Vestigiap Vestigiam Vestigiak --- that word must hold some kind of record in the number of ways to misspell a word.   But I got almost all of the letters right. -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 21:19, 22 August 2013 (UTC)
::::: There is no need to strike misspellings, just correct them.   The reason I did a strike-out for the "quite" misspelling is that the misspelling was discussed later, so I just couldn't correct it without making the comment invalid. -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 07:22, 23 August 2013 (UTC)
:: In my IBM time I learned that American colleagues are less spelling-conscious than we Europeans (or Austrians). It's a matter of emphasis on spelling in school.
Did you do quite a web check or a quiet web check? --[[User:Walterpachl|Walterpachl]] ([[User talk:Walterpachl|talk]]) 05:36, 23 August 2013 (UTC)
::: Rosetta Code isn't the place to publish such observations.   I know a snub when I hear (or read) one.   Best to just remove such comments, even from a discussion page.   Even if it were true, it's still not an unbiased opinion, and possibly not even a valid observation (too limited and narrow); it appears to be based on a limited sampling group (and by one person at that).   Not everybody has a spell-checker available.   Without spell-checkers, typos are more common.   -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 03:24, 17 August 2018 (UTC)
::: Are you sure about the   '''lastpos'''   BIF not being available in (your) host's version of REXX?   It's been around in REXX at least since 1984 (according to a VM System Product Interpreter Reference Summary), long before it was ported to MVS (or whatever it's being called now).   Which host (and release) are you using?   I didn't post any of the REXX version 2 programs since you signed your name to it, and I didn't want to publish various versions of it, as it would appear that you were the author, and it didn't seem worth all the bother to include disclaimers and whatnot, and I had so many versions.   I was just fooling around and was squeezing blood from a turnip trying to get more performance out of the program.   I probably could get more performance out of it, but I got tired of shoveling all that coal, and I had to add more code to handle a special case of long words. -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 07:28, 22 August 2013 (UTC)
:::: I'd suggest leaving the header intact and adding change lines such as * yyyyddmm GS this and that. But I really don't care. I put my name into my programs because I like to be known.
Your programs are easily recognizable by @ and $ :-) AND your unique indentation rules! --[[User:Walterpachl|Walterpachl]] ([[User talk:Walterpachl|talk]]) 05:36, 23 August 2013 (UTC)
::::: I guess some people like to be known.   However, Rosetta Code has a policy against vanity badges: they are strongly discouraged, and most have been removed.   People can look at the ''history'' file and see who entered the computer program and/or made the changes.   I learned later (after I did the tuning and timings) that timings are also discouraged, especially between languages.   This whole discussion on the REXX timings should probably be deleted.
<br>Here is the latest revision   (with not much commenting, but better than nothing):
<lang rexx>/*rexx*/
parse arg ifid w                            /*get required options from CL */
/*{timer}*/ parse arg ifid w times .        /*a good try is 10k ──► 100k.  */
/*{timer}*/ if times==''  then times=1      /*use a default if omitted.    */
s=''
         do while lines(ifid)\==0
         s=s linein(ifid)
         end   /*DO while*/
s=space(s)                                  /*remove superfluous blanks.   */
say 'length of input string:' length(s)     /*display the length of input. */
say
call time 'Reset'                           /*reset the REXX elapsed timer.*/
/*{timer}*/ do jj=1 for times               /*the repetitions thingy.      */
x=s' '                                      /*var X is destroyed (below).*/
         do while x\==''                    /*1 chunk at a time.*/
         i=lastpos(' ',x,w+1)               /*look for blank <W.*/
         if i==0 then do                    /*...no blank found.*/
                      call o left(x,w)
                      parse var x =(w) x
                      end
                 else do                    /*... a blank found.*/
                      call o left(x,i)
                      parse var x =(i) +1 x
                      end
         end   /*DO while*/
/*{timer}*/ end   /*jj*/
say
say format(time('Elapsed'),,2) "seconds for" times 'times.'
call lineout ifid
exit
/*{timer}*/ o: if jj==times then say arg(1); return   /*show last text*/
o: say arg(1); return</lang>
Here is the input file:
<pre>
────────── Computer programming laws ──────────
The Primal Scenario -or- Basic Datum of Experience:
∙ Systems in general work poorly or not at all.
∙ Nothing complicated works.
∙ Complicated systems seldom exceed 5% efficiency.
∙ There is always a fly in the ointment.
</pre>
-- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 07:28, 22 August 2013 (UTC)
: lastpos: no I'm not sure and I have alas no longer a host (pun intended). I wonder where I missed it. Thanks for massaging my program. I shall study it later and test my 1MB file. Your input, by the way, is not exactly a "paragraph", is it? --[[User:Walterpachl|Walterpachl]] ([[User talk:Walterpachl|talk]]) 07:44, 22 August 2013 (UTC)
:: As mentioned earlier, it was bigger than a paragraph; I hated to cut it down (as the file above).   I was using a 100x200 character wide console window and I needed something with some heft to it.   Plus, with almost all of us (readers of Rosetta Code) being computer programmers of one sort or another, I thought a by-product would be some people perusing the text and reflecting on the wisdom of the laws ... if not only in a Murphy's Law sort of way. -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 07:58, 22 August 2013 (UTC)

My results from testing your program:
<pre>
with
  i=lastpos(' ',x,w+1)               /*look for blank <W.*/
rexx gs text.txt 72 1000000 -> 7.09 seconds for 1000000 times.

with
  Do i=w+1 to 1 by -1
    If substr(x,i,1)=' ' Then Leave
    End
rexx gs2 text.txt 72 1000000 -> 10.88 seconds for 1000000 times.
</pre>
: I got a 45% improvement (using Regina REXX), you got a 35% improvement (using ooRexx) --- Are my assumptions correct?   How many engines does your laptop have?   How much memory?   What other processes are running?   When I run benchmarks, the computer is running pretty much naked (as possible).   No matter what the improvement (35% or 45%), that's nothing to sneeze at. -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 20:29, 22 August 2013 (UTC)
:: Nobody sneezes. Can't answer your questions. Will use lastpos from now on. thanks.
--[[User:Walterpachl|Walterpachl]] ([[User talk:Walterpachl|talk]]) 05:36, 23 August 2013 (UTC)

Unfortunately I cannot verify a similar performance difference with my 1MB file. --[[User:Walterpachl|Walterpachl]] ([[User talk:Walterpachl|talk]]) 19:58, 22 August 2013 (UTC)

With o: Return (to avoid output to screen)
<pre>
rexx gs long.txt 72 -> 2.31 seconds for 1 times
rexx gs2 long.txt 72 -> 2.36 seconds for 1 times
</pre>
--[[User:Walterpachl|Walterpachl]] ([[User talk:Walterpachl|talk]]) 20:11, 22 August 2013 (UTC)
: With a one megabyte file, you may be measuring the effects of paging in your laptop (as for elapsed time) as well as competition/interference with other processes.   That was one reason why I used a multiplier for the '''do''' loop instead of increasing the amount of text read.   The drawback is that the multiplier increases the locality of reference, and I don't know enough about the Microsoft Windows paging sub-system to know how much of an effect that is. -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 20:36, 22 August 2013 (UTC)
:: Let's leave it at that. I shall be using lastpos in the future. thanks. Nevertheless version 2 seems to be undoubtedly better than !?! --[[User:Walterpachl|Walterpachl]] ([[User talk:Walterpachl|talk]]) 05:40, 23 August 2013 (UTC)
::: I wouldn't agree that your version is   ''undoubtedly''   better.   I do have a few doubts.   Version one doesn't erase existing files; it also has more options (the ''kind'' of text justification, giving the user a choice), has a lot more documentation (comments) to explain what is happening and why, has error checking and error messages to handle bad command line options, checks for file-not-found and file-is-empty conditions, etc.   I assume you must be using a different or unknown metric(s) for ''undoubtedly better''.
Rosetta Code is not the place to crow about one's version being better than another, '''unless''' you wrote both versions and you're pointing out the value   (however one judges ''value'')   of one program entry versus another.   -- [[User:Gerard Schildberger|Gerard Schildberger]] ([[User talk:Gerard Schildberger|talk]]) 03:24, 17 August 2018 (UTC)