Talk:K-d tree: Difference between revisions

 
(3 intermediate revisions by 3 users not shown)
Line 127:
again to avoid a hang, again following the Wikipedia pseudocode. (Other differences from the Wikipedia pseudocode are OK, I think, and have to do with the way the C code uses pointers instead of indexes, and keeps <code>end</code> pointing a position after the last element rather than to the last element itself.)
 
I didn't attempt to edit the example myself, partly because I don't know how :->, partly because I didn't want to be presumptuous or offend anyone. But I'm quite sure that the code as it exists is bad; my empirical testing showed that very clearly. It would be good for the code to more rigorously test itself; once the tree is constructed, it is straightforward enough to recursively check that each left/right subtree obeys the spatial partitioning imposed by the coordinate of the parent node. In fact I wrote such test code in my project, along the path to understanding and fixing this bug, and it did flag the tree constructed by the original RosettaCode version of <code>find_median()</code> as having constructed an incorrect tree, and did pass the tree constructed by the algorithm once fixed as described above. Together with a large test dataset that contains many exact duplicate coordinate values, such test code should allow the bug to be confirmed and then the fix validated.
&mdash;[[User:Bhaller|Bhaller]] 17:18, 15 August 2017 (UTC)
 
: Those sound like good observations. I do not have a strong background in kd trees, myself, so I'll leave the code handling for someone else. That said, I'm wondering if it wouldn't make sense to de-dup the points before running the algorithm against them? (I understand that in some graphics applications you might have duplicate copies of the same points, but I'm wondering why that makes sense in the context of a kd-tree?)
 
: Also, you should sign your comments on discussion pages. The wiki will fill in the details for you if you use <nowiki>--~~~~</nowiki> when you post the comment. This time, though, I manually constructed your signature from information on the history page. Anyways, thanks for taking the time to think this through - it's always good when we have people exercising the code enough to find where it has problems. --[[User:Rdm|Rdm]] ([[User talk:Rdm|talk]]) 18:03, 15 August 2017 (UTC)
 
: Hi Rdm. Thanks; I'm attempting to sign this post properly, we'll see what happens. :-> Even if de-dup'ing the points were a good idea (which would depend on the application in which one is using the k-d tree; it definitely would not be a good idea for my particular application), it wouldn't be a fix for this bug, because here the bug occurs even if two points share just a single coordinate; they don't have to share all of their coordinates. In other words, for 2-D data, the bug could be triggered by having points at (10,17) and (5, 17). You can't de-dup those. :-> --[[User:Bhaller|Bhaller]] ([[User talk:Bhaller|talk]]) 18:19, 15 August 2017 (UTC)
 
== C++ Bug? ==
 
Shouldn't the line:
 
root_ = make_tree(0, nodes_.size(), 0);
 
in the kdtree constructor be:
 
root_ = make_tree(0, nodes_.size()-1, 0);
 
As it is the following line in make_tree() reads off the end of the nodes array:
 
std::nth_element(&nodes_[begin], &nodes_[n], &nodes_[end], node_cmp(index));
 
: The correct fix is to replace &nodes_[end] by &nodes[0] + end, which is fine provided that nodes_ is not empty (the end pointer is not dereferenced). I'll fix the code soon. [[User:Simonjsaunders|Simonjsaunders]] ([[User talk:Simonjsaunders|talk]]) 12:44, 24 April 2020 (UTC)
1,777

edits