Talk:Gradient descent: Difference between revisions


Revision as of 10:11, 3 September 2020

Needs more information

This needs a description, purpose and preferably, a given function to solve for so that different implementations can be compared. Changed to draft status until that is supplied. --Thundergnat (talk) 12:30, 1 July 2019 (UTC)

Luckily, I managed to find a freely available book excerpt (from Google Books) which contained the C# code from which the first TypeScript example had been translated, together with some explanation of what was being done here.
I've therefore added a rudimentary task description and a Go translation to start the ball rolling. --PureFox (talk) 17:48, 8 July 2019 (UTC)
The differences between the results from the different samples are worrying - can they really be due to minor differences in the different languages' sqrt and exp functions? How much accuracy should be expected (is it worth printing 16 digits)? It looks like the answer is somewhere around 0.107, -1.22?
How was the initial guess derived? If the initial guess is changed, the results are different.
I tried a variant using 32-bit floats instead of 64-bit and the results are similar (but different, of course). I also found that (with 32-bit) delG can become 0 before b is set to alpha / delG - this presumably should be tested for? --Tigerofdarkness (talk) 19:07, 2 September 2020 (UTC)
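The zero-gradient case raised above could be guarded with an explicit test before the division. The following is only a minimal sketch in Go: delG and alpha are the names used in the discussion, while gradNorm and stepScale are hypothetical helpers introduced for illustration and are not taken from the task's examples.

```go
package main

import (
	"fmt"
	"math"
)

// gradNorm returns the Euclidean norm of a gradient vector.
// (Hypothetical helper, not from the task page.)
func gradNorm(g []float64) float64 {
	sum := 0.0
	for _, v := range g {
		sum += v * v
	}
	return math.Sqrt(sum)
}

// stepScale (hypothetical helper) returns alpha / delG. It reports
// ok = false when delG is zero, NaN or infinite, so that the caller
// can stop iterating instead of dividing by zero.
func stepScale(alpha, delG float64) (b float64, ok bool) {
	if delG == 0 || math.IsNaN(delG) || math.IsInf(delG, 0) {
		return 0, false
	}
	return alpha / delG, true
}

func main() {
	alpha := 0.1
	for _, g := range [][]float64{{3, 4}, {0, 0}} {
		delG := gradNorm(g)
		if b, ok := stepScale(alpha, delG); ok {
			fmt.Printf("delG = %g, b = %g\n", delG, b)
		} else {
			fmt.Printf("delG = %g, gradient vanished; stopping\n", delG)
		}
	}
}
```

Whether a vanished gradient should count as convergence or as an error is a decision for the task description; the sketch only shows where the test would go.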
More worrying still is the fact that the Go example, on which a lot of the other examples are based, no longer gives the same results as it did a year ago. When I ran it on my current setup (Ubuntu 18.04, Go version 1.14.7 amd64) the results were: x[0] = 0.10725956484848279, x[1] = -1.2235527984213825 which is 'miles' away from what they were before!
Just to make sure, I ran it again on last year's setup (Ubuntu 16.04, Go version 1.12.5 amd64) and the results agreed with those previously posted: x[0] = 0.10764302056464771, x[1] = -1.223351901171944.
Go has, of course, moved on a couple of versions in the interim, and a possible reason for the discrepancy is that FMA (fused multiply-add) instructions are now being supported (from v1.14), which will mean that an FP operation of the form x * y + z will be computed with only one rounding. So in theory results should be more accurate than before.
I first noticed that there was a discrepancy a couple of days back when I was trying to add a Wren example. My first attempt was a straight translation of the Go code which gave results of: x[0] = 0.10781894131876, x[1] = -1.2231932529554.
I then decided to switch horses and use zkl's 'tweaked' gradG function which gave results very close to zkl itself so I posted that. Incidentally, I wasn't surprised that there was a small discrepancy here as I'm using a rather crude Math.exp function (basically I apply the power function to e = 2.71828182845904523536) pending the inclusion of a more accurate one in the next version of Wren's standard library which will call the C library function exp().
So I don't know where all this leaves us. There are doubtless several factors at work here and, as you say, changing the initial guess leads to different results. Something else which leads to different results is whether one allows gradG to mutate 'x'. As the Go code stands, it copies 'x' to 'y' and so doesn't mutate the former. However, it looks to me as though some translations may be indirectly mutating 'x' (depending on whether arrays are reference or value types in those languages) by simply assigning 'x' to 'y'. If I make this change in the Go code, the results are: x[0] = 0.10773473656605767, x[1] = -1.2231782829927973 and in the Wren code: x[0] = 0.10757894411096, x[1] = -1.2230849416131 so it does make quite a difference. --PureFox (talk) 10:11, 3 September 2020 (UTC)
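The difference between copying 'x' to 'y' and simply assigning it can be sketched in Go as follows. This is not the gradG function from the task page: bumpAliased and bumpCopied are hypothetical helpers that only show the aliasing behaviour of Go slices.

```go
package main

import "fmt"

// bumpAliased assigns x to y, so both names share one backing array
// and the write through y is also visible through x.
func bumpAliased(x []float64) []float64 {
	y := x
	y[0] += 1
	return y
}

// bumpCopied copies x into a fresh slice first, so x is left untouched.
func bumpCopied(x []float64) []float64 {
	y := make([]float64, len(x))
	copy(y, x)
	y[0] += 1
	return y
}

func main() {
	a := []float64{1, 2}
	bumpCopied(a)
	fmt.Println("after bumpCopied: ", a) // a is unchanged
	bumpAliased(a)
	fmt.Println("after bumpAliased:", a) // a[0] has been modified
}
```

In languages where arrays are value types, a plain assignment behaves like bumpCopied; where they are reference types, it behaves like bumpAliased, which is one way a direct translation can end up computing something different.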

promoted from draft?

I thought it was normal for a task to be a draft task until at least four (or so) examples have been entered, and also to wait a week or so before promoting it. I know there are no hard and fast rules. -- Gerard Schildberger (talk) 19:21, 1 July 2019 (UTC)

In general, I wait for a minimum of 3 months and 20 implementations before I promote one of my tasks out of draft. That way there is plenty of opportunity for discussion and tweaks if necessary. As far as I'm concerned, this doesn't even rise to the level of a draft yet, let alone a full task. Reverted back to draft (again). --Thundergnat (talk) 20:57, 1 July 2019 (UTC)