Talk:Total circles area

 
::In contrast, the C coded Monte Carlo method takes 100 million points to compute 21.56262288. --[[User:Paddy3118|Paddy3118]] 15:15, 22 September 2012 (UTC)
::With 100 million points, the Python VdC area estimate is 21.565056432232886. --[[User:Paddy3118|Paddy3118]] 15:24, 22 September 2012 (UTC)
::: Maybe you want to add some of this information to the page itself, instead of keeping it just here.
 
::: The comparisons are incorrect. Here are two modified examples: <lang python>from __future__ import division
from math import sqrt, atan2
from itertools import count
from pprint import pprint as pp
try:
    from itertools import izip as zip
except ImportError:
    pass

# Remove duplicates and sort, largest first.
circles = sorted(set([(0.001, 0.523, 1)]))

def vdcgen(base=2):
    'Van der Corput sequence generator'
    for n in count():
        vdc, denom = 0, 1
        while n:
            denom *= base
            n, remainder = divmod(n, base)
            vdc += remainder / denom
        yield vdc

def vdc_2d():
    'Two dimensional Van der Corput sequence generator'
    for x, y in zip(vdcgen(base=2), vdcgen(base=3)):
        yield x, y

def bounding_box(circles):
    'Return minx, maxx, miny, maxy'
    return (min(x - r for x, y, r in circles),
            max(x + r for x, y, r in circles),
            min(y - r for x, y, r in circles),
            max(y + r for x, y, r in circles))

def circle_is_in_circle(c1, c2):
    x1, y1, r1 = c1
    x2, y2, r2 = c2
    return (sqrt((x2 - x1)**2 + (y2 - y1)**2) + r2) <= r1

def remove_covered_circles(circles):
    'Takes circles in decreasing radius order. Removes those covered by others'
    i, covered = 0, []
    while i < len(circles):
        c1 = circles[i]
        eliminate = []
        for c2 in circles[i + 1:]:
            if circle_is_in_circle(c1, c2):
                eliminate.append(c2)
        if eliminate: covered += [c1, eliminate]
        for c in eliminate: circles.remove(c)
        i += 1
    #pp(covered)

def main(circles):
    print('Originally %i circles' % len(circles))
    print('Bounding box: %r' % (bounding_box(circles),))
    remove_covered_circles(circles)
    print(' down to %i due to some being wholly covered by others' % len(circles))
    minx, maxx, miny, maxy = bounding_box(circles)
    # Shift to 0,0 and compute r**2 once
    circles2 = [(x - minx, y - miny, r * r) for x, y, r in circles]
    scalex, scaley = abs(maxx - minx), abs(maxy - miny)
    pcount, inside, last = 0, 0, ''
    for px, py in vdc_2d():
        pcount += 1
        px *= scalex; py *= scaley
        for cx, cy, cr2 in circles2:
            if (px - cx)**2 + (py - cy)**2 <= cr2:
                inside += 1
                break
        if not pcount % 100000:
            area = (inside / pcount) * scalex * scaley
            print('Points: %8i, Area estimate: %r'
                  % (pcount, area))
            # Hack to check if precision OK
            this = '%.4f' % area
            if this == last:
                break
            else:
                last = this
    print('The value has settled to %s' % this)
    print('error = %e' % (area - 4 * atan2(1, 1)))

if __name__ == '__main__':
    main(circles)</lang>
 
::: <lang python>from collections import namedtuple
from math import floor, atan2

Circle = namedtuple("Circle", "x y r")

circles = [Circle(0.001, 0.523, 1)]

def main():
    # compute the bounding box of the circles
    x_min = min(c.x - c.r for c in circles)
    x_max = max(c.x + c.r for c in circles)
    y_min = min(c.y - c.r for c in circles)
    y_max = max(c.y + c.r for c in circles)

    box_side = 512

    dx = (x_max - x_min) / float(box_side)
    dy = (y_max - y_min) / float(box_side)

    y_min = floor(y_min / dy) * dy
    x_min = floor(x_min / dx) * dx

    count = 0

    for r in xrange(box_side + 2):
        y = y_min + r * dy
        for c in xrange(box_side + 2):
            x = x_min + c * dx
            for circle in circles:
                if (x - circle.x)**2 + (y - circle.y)**2 <= circle.r ** 2:
                    count += 1
                    break

    print "Approximated area:", count * dx * dy
    print "error = %e" % (count * dx * dy - 4 * atan2(1, 1))

main()</lang>
::: Both examples have exactly one unit circle; the regular method's error is about 8 times smaller, and it runs 10 times faster. Both being more or less uniform sampling, there's no inherent reason one should be more accurate than the other, but regular sampling is simply going to be faster due to its simplicity. Unless you can provide good reasoning or a meaningful error estimate, it's very inappropriate to claim that one method is better at this than the others. --[[User:Ledrug|Ledrug]] 08:30, 15 October 2012 (UTC)
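The "no inherent reason" point is easy to probe on a toy case: estimate one unit circle's area with a midpoint grid and with the base-2/base-3 Van der Corput pairing used above. A rough sketch (mine, not from the task; both estimates should settle near <math>\pi</math>):

```python
from __future__ import division

def vdc(n, base):
    """n-th term of the Van der Corput sequence in the given base."""
    v, denom = 0.0, 1
    while n:
        denom *= base
        n, rem = divmod(n, base)
        v += rem / denom
    return v

def halton_estimate(npts):
    """Quasi-random estimate over the bounding box [-1, 1]^2 of the unit circle."""
    hits = sum((2 * vdc(i, 2) - 1) ** 2 + (2 * vdc(i, 3) - 1) ** 2 <= 1.0
               for i in range(npts))
    return 4.0 * hits / npts

def grid_estimate(side):
    """Midpoint-grid estimate over the same box, side * side samples."""
    hits = 0
    for i in range(side):
        x = -1.0 + (2 * i + 1) / side
        for j in range(side):
            y = -1.0 + (2 * j + 1) / side
            hits += x * x + y * y <= 1.0
    return 4.0 * hits / side ** 2
```

With a single circle, both estimators' errors are driven by how the samples straddle the boundary, so comparable accuracy is to be expected; the speed difference comes from per-sample cost.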
 
::Hi Ledrug, are you saying that the results quoted may be correct for the programs as given on the task page, but are either:
::* Not due to the use of Van der Corput?
::* Or peculiar to the example circles chosen?
:: I must admit that I only trawled the results stated for the other programs and did not do any extensive reruns with different circles as you have done; I just extracted accuracy vs. points. --[[User:Paddy3118|Paddy3118]] 08:56, 15 October 2012 (UTC)
 
== C89 vs C99 ==
 
: If the 2X speed decrease you are seeing is real (and I can't reproduce it), I suggest you create a bug report for the GCC devs, to help them fix the problem in the right place, inside GCC, instead of inside my code. --[[User:Bearophile|bearophile]] 12:13, 25 September 2012 (UTC)
 
==Excessive digits==
Maybe it's worth explaining, after the task description, how the many digits of the reference answer were computed. Did you use this http://hackage.haskell.org/package/hmpfr with the Haskell analytical solution? --[[User:Bearophile|bearophile]] 11:22, 1 October 2012 (UTC)
: No, it was calculated using the same method as the Haskell version, but in C with MPFR, at 2000 bits of precision. I only did it as an exercise to get to know MPFR, and the code was too messy to post here -- I got too bored halfway through and stopped keeping track of memory allocation, so there are hundreds of lost memory chunks and valgrind was extremely grumpy about it, but I do believe all the digits I posted are accurate. --[[User:Ledrug|Ledrug]] 20:06, 1 October 2012 (UTC)
:: If you want to create a D version of the analytical solution (using regular double or real floating point values), I will fix your code and help you with the D idioms. D has a garbage collector, so manual tracking of memory allocation/deallocation is usually not needed. Otherwise I'll eventually try to translate your Haskell code myself to D. --[[User:Bearophile|bearophile]] 23:07, 14 October 2012 (UTC)
 
I don't know D well enough to write code in it. You are welcome to write it yourself; I'll just describe the algorithm in English in case there's any confusion about it:
 
# Given a bunch of circles, give all of them the same winding. Say, imagine every circle turns clockwise.
# Find all the intersection points among all circles. Between each pair of circles there may be 0, 1 or 2 intersections; ignore the 0 and 1 cases, we only need the 2-intersection case.
# For each circle, sort its intersection points in clockwise order (keep in mind it's a cyclic list), and split it into arc segments between neighboring point pairs. Circles that don't cross other circles are treated as a single 360 degree arc.
# Imagine all circles are on a white piece of paper, and have their interiors inked black. We are only interested in the arcs separating black and white areas, so we get rid of the arcs that aren't. Those are the arcs that lie entirely within another circle. Because we've taken care of all intersections earlier, we only need to check whether any single point of an arc segment is inside any other circle; if so, remove the whole arc. The Haskell code uses the middle point of an arc for this, but anything other than the two end points would do.
# The remaining arcs form one or more closed paths. Each path's area can be calculated, and the sum of them all is the area needed. Each path is computed the way mentioned on that stackexchange page cited somewhere. One thing that page didn't mention is that it works for holes, too. Suppose the circles form an area with a hole in it; you'd end up with two paths, where the outer one winds clockwise and the inner one counterclockwise. Use the same method to calculate the areas of both: the outer one will have a positive area, the inner one negative. Just add them up.
 
There are a few concerns about this algorithm: firstly, it's fast only if there are few circles. Its complexity is maybe O(N^3) with N = number of circles, while normal scanline method is probably O(N * n) or less, with n = number of scanlines. Secondly, step 4 needs to be accurate; a small precision error there may cause an arc to remain or be removed by mistake, with disastrous consequences. Also, it's difficult to estimate the error in the final result. The scanline or Monte Carlo methods have errors mostly due to statistics, while this method's error is due to floating point precision loss, which is a very different can of worms. --[[User:Ledrug|Ledrug]] 01:24, 15 October 2012 (UTC)
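For anyone attempting the translation, the two geometric primitives involved (the pairwise intersections of step 2 and the per-arc area term of step 5) can be sketched roughly as follows (my own sketch, not the Haskell code; all names are made up):

```python
from math import sqrt, sin, cos, pi

def circle_intersections(c1, c2):
    """Intersection points of two circles given as (x, y, r) tuples.
    Returns 0, 1 or 2 points; only the 2-point case matters here."""
    (x1, y1, r1), (x2, y2, r2) = c1, c2
    dx, dy = x2 - x1, y2 - y1
    d = sqrt(dx * dx + dy * dy)
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return []                      # separate, contained, or concentric
    a = (d * d + r1 * r1 - r2 * r2) / (2 * d)
    h2 = r1 * r1 - a * a
    h = sqrt(h2) if h2 > 0 else 0.0
    mx, my = x1 + a * dx / d, y1 + a * dy / d    # foot of the common chord
    if h == 0:
        return [(mx, my)]              # tangent circles: a single point
    return [(mx - h * dy / d, my + h * dx / d),
            (mx + h * dy / d, my - h * dx / d)]

def arc_area_term(circle, a1, a2):
    """Signed contribution of one counterclockwise arc (from angle a1 to a2)
    to the enclosed path area, via Green's theorem:
    (1/2) * integral of (x dy - y dx) along the arc."""
    cx, cy, r = circle
    if a2 < a1:
        a2 += 2 * pi                   # angle lists are cyclic
    return 0.5 * (r * r * (a2 - a1)
                  + r * (cx * (sin(a2) - sin(a1))
                         - cy * (cos(a2) - cos(a1))))
```

As a sanity check, a single full-circle "arc" gives back <math>\pi r^2</math> regardless of where the circle sits; in the full algorithm the winding of each closed path determines the sign of its total, which is what makes holes come out negative.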
 
== Integral solution? ==
 
Does this solution with the [[wp:Heaviside step function|Heaviside step function]] (<math>H</math>) work?
 
:<math>\int_{(x,y)\in\mathbb{R}^2} \left(\prod_{i=1}^N H\left((r^{(i)})^2 - (x-c_x^{(i)})^2 - (y-c_y^{(i)})^2\right)\right)\, dx\, dy</math>
 
If it does maybe then we can write <math>H(x) = \lim_{k\to\infty}\frac{1}{1+e^{-2kx}}</math> and hope for things to simplify somehow?
--[[User:Grondilu|Grondilu]] ([[User talk:Grondilu|talk]]) 12:51, 3 October 2023 (UTC)
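A quick numeric check suggests not quite as written: each <math>H</math> factor is 1 exactly when the point is inside the corresponding circle, so the product is 1 only where the point lies inside ''every'' circle, i.e. the integral measures the intersection of the discs rather than their union; the union would be <math>1 - \prod_{i=1}^N (1 - H_i)</math>. A crude midpoint-grid evaluation of that corrected integrand (my own sketch; names hypothetical):

```python
from math import exp

def H(x, k=None):
    """Heaviside step; with k given, the smooth version 1/(1 + exp(-2kx))."""
    if k is None:
        return 1.0 if x >= 0 else 0.0
    return 1.0 / (1.0 + exp(-2.0 * k * x))

def union_indicator(x, y, circles, k=None):
    # 1 - prod(1 - H_i): 1 when (x, y) lies inside at least one circle
    p = 1.0
    for cx, cy, r in circles:
        p *= 1.0 - H(r * r - (x - cx) ** 2 - (y - cy) ** 2, k)
    return 1.0 - p

def grid_integral(circles, lo, hi, n=300, k=None):
    """Midpoint-rule evaluation of the double integral over the box [lo, hi]^2."""
    d = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * d
        for j in range(n):
            y = lo + (j + 0.5) * d
            total += union_indicator(x, y, circles, k)
    return total * d * d
```

For a single unit circle this converges toward <math>\pi</math> as the grid is refined and <math>k\to\infty</math>; with two disjoint circles, note that the plain product form would be 0 everywhere, which is the union-vs-intersection distinction above.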