User talk:WillNess: Difference between revisions

Revision as of 07:43, 27 December 2016

Revision as of 06:50, 22 August 2016: WillNess (talk | contribs) (priorities...)
Latest revision as of 07:43, 27 December 2016: WillNess (talk | contribs) (done!)
(9 intermediate revisions by 3 users not shown)
Line 69:
:::: That's a beautiful expression of the code, but it has a horrible space leak such that it's "out of memory" for anything much above 10 billion and takes about 80% of the time on garbage collection; I think the problem is that the generated zip list is entirely unevaluated (including the "add to list" predicate check) until used by the "sum" and "sortBy" functions - a problem that the "foldl'" version didn't have, as it was forced to be strict. I'm afraid that while the THPoER highly abstract form can sometimes reduce complex code into something simple such that it runs faster, in this case the abstract code comes at an extremely high cost, because it is effectively expressed as non-strict lists, and constructing those lists takes many extra cycles in the innermost loop. There are also the practical matters of memory use and garbage collection... Sometimes we need to look at code from a low-level perspective to see what it is really doing; I'm afraid that in this case the non-strictness of Haskell has bitten you on the a... ;) --[[User:GordonBGood|GordonBGood]] ([[User talk:GordonBGood|talk]]) 03:56, 22 August 2016 (UTC)
::::: Yes, I saw that error for above 10--20B; I decided not to care about it here, and to go with the shorter code to express the algorithm clearly. The "out of memory" bit comes from Ideone and I'm not entirely sure why, when it's still reporting the exact same memory consumption for all the cases below 10B. '''Real''' space leaks are usually seen on Ideone as reported memory climbing up into the 100s of MBs, not so here. In any case, it's a compiler issue. And we have your version on the page now for the optimized code. :) -- [[User:WillNess|WillNess]] ([[User talk:WillNess|talk]]) 06:50, 22 August 2016 (UTC)
:::::: Just for completeness, the "out of memory" error isn't just on IdeOne; it also occurs with Windows GHC version 8.0.1 for values approaching a trillion on my machine (which has a lot of memory; I haven't tried to determine the exact limit); the < 20 billion limit on IdeOne is due to the amount of memory assigned. I suspect it isn't flagged as a "real" space leak because there is no fixed pointer to the unevaluated zipped list; there are just the unevaluated list comprehensions. When the "sum" operation is run, it forces the partial evaluation of every list element to the first part of the tuple '''but not the evaluation of the predicate on whether a given element gets included in the concatenated list or not'''; this last doesn't get evaluated until the determined values get sorted, at which point the unused values get dumped (and garbage collected). So, during the course of calculation to a trillion, something like hundreds of gigabytes would be consumed (my machine doesn't have that many and thus fails). Thus, during the execution of the program the memory use goes from zero to very high but is back to very low again by the time the program ends. Although the expression of the algorithm is beautiful, the memory use profile is not! And yes, my version shows what can be done, including that if IdeOne were 64-bit, 10 billion should take less than 150 milliseconds, not about half a second as for the "foldl'" version or about 1.5 seconds as for this version, and that maximum memory residency (as determined by "+RTS -s") is almost zero, not many, many megabytes. --[[User:GordonBGood|GordonBGood]] ([[User talk:GordonBGood|talk]]) 08:28, 22 August 2016 (UTC)
::::::: Ah, great, thanks for that. Yes, I understand the space leak here, that's why I wrote those other versions back then; just thought ''maybe'' they fixed it by now with the newer GHC somehow. You're right, all this should be made clear in the text. -- [[User:WillNess|WillNess]] ([[User talk:WillNess|talk]]) 10:34, 22 August 2016 (UTC)
::::::: I've tried it locally, and it reports "234MB memory in use" for 1B and 1187MB for 10B. I don't understand what this 7.8MB figure is that Ideone shows for both. I was misled by this. -- [[User:WillNess|WillNess]] ([[User talk:WillNess|talk]]) 10:45, 22 August 2016 (UTC)
::::::: BTW the really short version of it is just <code>(Sum c,b) = mconcat [ ( Sum (i+1), ...</code>. Here it could be argued that it '''''ought''''' to be compiled efficiently, as the whole premise of Monoids is that associativity enables re-parenthesization ''(<code>a:b:c:... = [a]++([b]++([c]++...)) = ([a]++[b])++([c]++...)</code>)'', which is the basis for the efficiency of the strict left fold. Yet it is more than twice as bad as the <code>prod</code> code, both in time and space.
::::::: BTW, I got 1B:0.45s-269MB and 10B:1.75s-1343MB, which suggests 1T:38s-'''28.8GB''' ''"total memory in use"''. The fold-based version indeed did well at 1T:11.2s-'''10MB'''. I use the signature <code>Int -> (Double, (Int,Int,Int))</code>, which produced the fastest code in my tests (simple -O2). -- [[User:WillNess|WillNess]] ([[User talk:WillNess|talk]]) 17:13, 22 August 2016 (UTC)
:::::::: Looks good; it's still about three times slower and uses more memory than my version on IdeOne, which locally gives 1T:3.75s:3MB "total memory in use", but it is good, as referents will have a choice. I went back to using Word64 (just a slightly higher range than Int) as it is the fastest even for 32-bit code, since it then uses the same type for the internal count. It looks like for all the Monoid stuff to work as you would like to see it, there will have to be all kinds of specializations that haven't been added - complex stuff, and doing "concat" efficiently has always been a headache. --[[User:GordonBGood|GordonBGood]] ([[User talk:GordonBGood|talk]]) 00:50, 23 August 2016 (UTC)

== Factors of an integer ==

The current algorithms factors_co and factors_o seem to give the wrong result. For example, factors_co 120 is missing the number 12 in its result. --[[User:Helge|Helge]] ([[User talk:Helge|talk]]) 13:15, 26 December 2016 (UTC)
: Good catch, thanks! Will fix. -- [[User:WillNess|WillNess]] ([[User talk:WillNess|talk]]) 21:06, 26 December 2016 (UTC)
: Done! Thanks again for spotting this. An added bonus was that the code simplified a bit, too. -- [[User:WillNess|WillNess]] ([[User talk:WillNess|talk]]) 07:43, 27 December 2016 (UTC)
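: For anyone who wants to re-check a report like the one about 120 and 12, here is a minimal sketch of a reference check. It is not the factors_co or factors_o code from the page; <code>divisors</code> and <code>agreesWith</code> are hypothetical names introduced here, and the approach is plain trial division up to the square root, chosen for obviousness rather than speed.

```haskell
import Data.List (sort)

-- Hypothetical reference implementation (NOT the page's factors_co /
-- factors_o): trial division up to the integer square root, emitting
-- each small divisor d together with its cofactor n `div` d.
divisors :: Int -> [Int]
divisors n
  | n < 1     = []
  | otherwise = sort (concatMap pair smalls)
  where
    -- every candidate d with d*d <= n
    smalls = takeWhile (\d -> d * d <= n) [1 ..]
    pair d
      | n `mod` d /= 0 = []              -- d does not divide n
      | d * d == n     = [d]             -- perfect square: list d once
      | otherwise      = [d, n `div` d]  -- d and its cofactor

-- Hypothetical helper: does a candidate function produce the same set
-- of factors as the reference, regardless of output order?
agreesWith :: (Int -> [Int]) -> Int -> Bool
agreesWith candidate n = sort (candidate n) == divisors n
```

: Under these assumptions, <code>12 `elem` divisors 120</code> evaluates to <code>True</code>, and a candidate can be swept over a range with something like <code>all (agreesWith candidate) [1..1000]</code>, where <code>candidate</code> stands for whichever function from the page is being checked.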