Talk:Gradient descent: Difference between revisions

→‎Needs more information: Further comments.
(→‎Needs more information: using the actual gradient instead of the approximation)
(→‎Needs more information: Further comments.)
Line 17:
::I first noticed that there was a discrepancy a couple of days back when I was trying to add a Wren example. My first attempt was a straight translation of the Go code which gave results of: x[0] = 0.10781894131876, x[1] = -1.2231932529554.
 
::I then decided to switch horses and use zkl's 'tweaked' gradG function which gave results very close to zkl itself so I posted that. Incidentally, I wasn't surprised that there was a small discrepancy here as I'm using a rather crude Math.exp function (basically I apply the power function to e = 2.71828182845904523536) pending the inclusion of a more accurate one in the next version of Wren's standard library which will call the C library function exp().
 
::So I don't know where all this leaves us. There are doubtless several factors at work here and, as you say, changing the initial guess leads to different results. Something else which leads to different results is whether one allows gradG to mutate 'x'. As the Go code stands, it copies 'x' to 'y' and so doesn't mutate the former. However, it looks to me as though some translations may be indirectly mutating 'x' (depending on whether arrays are reference or value types in those languages) by simply assigning 'x' to 'y'. If I make this change in the Go code, the results are: x[0] = 0.10773473656605767, x[1] = -1.2231782829927973 and in the Wren code: x[0] = 0.10757894411096, x[1] = -1.2230849416131 so it does make quite a difference. --[[User:PureFox|PureFox]] ([[User talk:PureFox|talk]]) 10:11, 3 September 2020 (UTC)
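
::To illustrate the copy-versus-assign point above, here is a minimal Go sketch. It is not the gradG from the main page: <code>bump</code>, <code>withAlias</code> and <code>withCopy</code> are hypothetical names invented for this example, and <code>bump</code> merely stands in for any routine that writes into its working slice. Assigning a slice shares the backing array, so changes made through 'y' show up in 'x'; copying first leaves 'x' alone.

```go
package main

import "fmt"

// bump is a hypothetical stand-in for a routine that modifies
// its working slice in place (an assumption for illustration only).
func bump(y []float64) {
	for i := range y {
		y[i] += 1
	}
}

// withAlias assigns x to y, so y shares x's backing array
// and any write through y is also visible through x.
func withAlias(x []float64) []float64 {
	y := x
	bump(y)
	return y
}

// withCopy copies x's elements into a fresh slice first,
// so x is left untouched.
func withCopy(x []float64) []float64 {
	y := make([]float64, len(x))
	copy(y, x)
	bump(y)
	return y
}

func main() {
	a := []float64{1, 2}
	withCopy(a)
	fmt.Println("after withCopy: ", a) // a still holds its original values
	withAlias(a)
	fmt.Println("after withAlias:", a) // a has been modified through y
}
```

::In a language where arrays are value types, the plain assignment would behave like <code>withCopy</code> instead, which may be why straight translations can disagree.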
Line 28:
:::I suspect that Julia is also using the actual gradient function as it is (I presume) using a built-in minimising function that uses the actual gradient function.
:::--[[User:Tigerofdarkness|Tigerofdarkness]] ([[User talk:Tigerofdarkness|talk]]) 12:08, 3 September 2020 (UTC)
 
::::Yes, to get consistent results, the answer does seem to be to use Fortran's gradient function.
::::I just substituted that in the Go code and obtained results of: x[0] = 0.10762682432948055, x[1] = -1.2232596548816101 which now agrees to 6 decimal places with the Fortran, Julia and your Algol 68 and Algol W solutions. So I'm going to update the Go example on the main page and suggest that those who've previously translated it update their translations accordingly. Thanks for your efforts here. --[[User:PureFox|PureFox]] ([[User talk:PureFox|talk]]) 13:14, 3 September 2020 (UTC)
 
== promoted from draft? ==