Talk:K-d tree

== New task ==
<blockquote>k-d trees are not suitable for efficiently finding the nearest neighbour in high dimensional spaces. As a general rule, if the dimensionality is k, the number of points in the data, <math>N</math>, should be <math>N \gg 2^k</math>. Otherwise, when k-d trees are used with high-dimensional data, most of the points in the tree will be evaluated and the efficiency is no better than exhaustive search, and approximate nearest-neighbour methods are used instead.</blockquote>
It would be nice to have some sort of note along those lines here too, as it is a major criterion for algorithm/data-structure selection. (Astronomy's mostly low-dimensional, 2D or occasionally 3D, so k-d trees make plenty of sense for them. Alas, the work I've done in an astronomy-allied field recently was all very high-dimensional with some dimensions not being standard-numeric, so we couldn't make good use of this sort of thing and anyway didn't need it as “nearest neighbour” wasn't a problem we had to solve. Instead, we use ''lots'' of relational databases. But I'm rambling…) –[[User:Dkf|Donal Fellows]] 06:10, 7 March 2012 (UTC)
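As a back-of-the-envelope illustration of that rule of thumb (just a sketch, not part of any task solution), the <math>2^k</math> threshold grows exponentially with the dimension, so the point count needed before pruning pays off becomes very large very quickly:

<lang go>package main

import "fmt"

func main() {
	// The N >> 2^k guideline: the number of points needed before a
	// k-d tree search can prune effectively grows exponentially in k.
	for _, k := range []uint{2, 3, 10, 20, 30} {
		fmt.Printf("k = %2d  ->  want N well above %d points\n", k, 1<<k)
	}
}</lang>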
: Certainly true. Once I put in the count of nodes visited I found that many searches on the WP data set (N = 6 and k = 2) led to all nodes being visited. My choice of the point (9, 2) was contrived to give an answer with only 3 nodes visited. It's tough to pick how much information to present in the task description and how much to expect people to do their own homework, but sure, I added the paragraph. &mdash;[[User:Sonia|Sonia]] 19:02, 7 March 2012 (UTC)
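For anyone who wants to reproduce that count, here is a minimal, self-contained sketch in Go. It hand-builds the usual Wikipedia example tree rather than doing a proper median build, and the exact number of nodes visited depends on that layout and on the pruning test, so treat it as illustrative rather than as the task's reference code:

<lang go>package main

import (
	"fmt"
	"math"
)

// node of a 2-d tree; the split dimension alternates with depth.
type node struct {
	point       [2]float64
	left, right *node
}

var visited int // nodes whose distance to the target was evaluated

func sqDist(a, b [2]float64) float64 {
	dx, dy := a[0]-b[0], a[1]-b[1]
	return dx*dx + dy*dy
}

// nearest returns the closest point to target in the subtree rooted at n,
// along with its squared distance.  dim is the split dimension at this
// depth (0 = x, 1 = y).
func nearest(n *node, target [2]float64, dim int) ([2]float64, float64) {
	if n == nil {
		return [2]float64{}, math.Inf(1)
	}
	visited++
	best, bestD := n.point, sqDist(n.point, target)

	// Search the side of the splitting plane containing the target first.
	near, far := n.left, n.right
	if target[dim] > n.point[dim] {
		near, far = far, near
	}
	if p, d := nearest(near, target, 1-dim); d < bestD {
		best, bestD = p, d
	}
	// Only cross the splitting plane if it is closer than the best
	// distance found so far; otherwise the far subtree is pruned.
	if planeD := target[dim] - n.point[dim]; planeD*planeD < bestD {
		if p, d := nearest(far, target, 1-dim); d < bestD {
			best, bestD = p, d
		}
	}
	return best, bestD
}

func main() {
	// The Wikipedia example set {(2,3),(5,4),(9,6),(4,7),(8,1),(7,2)},
	// laid out as the usual median-split tree.
	tree := &node{
		point: [2]float64{7, 2},
		left: &node{
			point: [2]float64{5, 4},
			left:  &node{point: [2]float64{2, 3}},
			right: &node{point: [2]float64{4, 7}},
		},
		right: &node{
			point: [2]float64{9, 6},
			left:  &node{point: [2]float64{8, 1}},
		},
	}

	p, d := nearest(tree, [2]float64{9, 2}, 0)
	fmt.Printf("nearest %v  squared distance %v  nodes visited %d\n", p, d, visited)
}</lang>

With this layout the query (9, 2) finds (8, 1) after evaluating only 3 of the 6 nodes, while many other queries end up touching every node, which is the behaviour described above.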