Talk:Happy numbers

From Rosetta Code
Revision as of 13:30, 7 May 2009 by Ce (talk | contribs) (→‎On caching (and laziness): But just doing one iteration is still better (and I'm now convinced it actually is).)

On caching (and laziness)

It seemed prudent to put in simple caching of n in the Python solution, but not the full-length permutations of the digits of n as cache size grows very quickly. I had thought of just putting in a 'normal form' of n into the cache which would be the digits in sorted order, but then you would have to calculate the normal form before every search of the cache, which would take more time. Time vs space, space vs time? I chose the lazy option and left them both out of the Python, but maybe I should add an example with the normal-form space optimised cache so an implementation is out there? --Paddy3118 00:28, 7 May 2009 (UTC)

You don't need any caching (see the alternative C++ version I've just added). Of course you'll do more calculations anyway, so this is still a time/space tradeof.
Also note that your "normal form" is "half" of the iteration step. Of course your cache is more effective if you also add the second step (because, after all, numbers which have a different normal form may still have the same next number (e.g. both 1111 and 2 have 4 as next value). But that's the same as using the original cache, but not caching in the first iteration. So the savings with your optimized cache are probably negligible. --Ce 07:20, 7 May 2009 (UTC)
We don't need to store the permutations. Since, for the final computation, what really matters is how many of every digits there are...; e.g. 910: I don't need to store 190, 910, 109 (and eventually 019, which indeed with the "Bag" technics is considered different from 19); it is enough I say that a number containing one 0, one 1 and one 9 is happy. --ShinTakezou 10:50, 7 May 2009 (UTC)
However if instead of the "bunch of numbers" you instead store the sum of their squares (which is actually the only thing that matters), you still get more efficient (indeed, you get more efficient in two ways; you need less storage per entry, and you need less entries, because e.g. the cases "4 ones", "4 ones and a zero", "one two", "one two and ten zeroes", etc. all go into the same cache entry).
However, I now think it would be a saving anyway. The digit square sums tend to be much smaller than the original numbers; indeed, with a 32 bit unsigned int, the largest possible value after one step would be 657 (unless I mis-calculated). If storing the flags in single bits, you'd need just 83 bytes, and lookup would be strictly O(1). Moreover, if you don't have that number in the cache, you have to do this calculation anyway, so you don't lose much time in that case, so the only efficiency loss is for numbers already in the cache, but that's just one iteration (for 32 bit uint in the worst case, 18 divisions, 18 modulo operations, 9 multiplications and 17 additions, if I counted correctly). Moreover the needed memory then goes linear in the number of digits, so even with 64 bits you won't need too much memory. --Ce 13:30, 7 May 2009 (UTC)