
Table of Contents
Last update: June 2026. All opinions are my own.
ML Foundations · Post 3/10
A question I used to get wrong
Which dataset has more information: A (10,000 instances, 100 features) or B (10,000 instances, 10,000 features)?
The intuitive answer is B — more features should mean more information, right? Wrong. The answer is it depends — and in practice, B usually has less useful information than A.
Why? The curse of dimensionality.
More dimensions = larger search space = more data needed
Each feature is a new dimension. Adding features makes the space your algorithm has to search exponentially bigger. The same number of data points becomes sparser and sparser — and your algorithm has to find a pattern in a much emptier space.
Picture it geometrically:
- 1 feature. Points spread along a line. 10,000 points fills the line densely.
- 2 features. Points spread across a square. 10,000 points are noticeably more spread out.
- 3 features. Points spread through a cube. The same 10,000 points now cover only a tiny fraction of the volume.
- d features. The volume grows as roughly r^d. By d = 20 or 30 your points are isolated specks in a vast empty space, and any notion of "nearest neighbour" loses meaning.
Irrelevant features add noise
If the new features carry real signal, fine. If they don't, you've made the problem strictly harder without adding any information. You've added noise dimensions that the model has to learn to ignore.
So adding features isn't free. If they aren't relevant, you have less useful information after adding them, not more.
Feature selection helps
The fix is simple in principle, hard in practice: keep the features that carry signal, drop the rest. Filter methods (correlation with the target), wrapper methods (try subsets and measure model performance), and embedded methods (Lasso, tree-based importance) all approach the same problem from different angles.
The reason this matters is leverage: removing a noise dimension is worth more than tuning a model. You're shrinking the search space, not just smoothing the search.
⚠️ When in doubt, fewer features beats more features. Adding a feature should require justification, not the other way around.
Next up — Post 4: Feature Engineering Is the Key.
