An efficient scatterplot alternative for large datasets
- Plotting ~1 million points with matplotlib.pyplot takes a very long time (38.8 seconds in this case)
- Points overlap and hide those underneath
- Each label is assigned an RGB color
- A 2D histogram is created for each label
- The RGB matrices for all labels are combined and normalized to produce a single RGB image
- Much faster (0.4 seconds)
- No obstruction
- Can be difficult to see edge cases
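
The histogram steps listed above (one RGB color per label, one 2D histogram per label, then combine and normalize) can be sketched roughly as follows. This is a minimal illustration using NumPy, not the notebook's actual code: the function name `rgb_histogram`, its parameters, the division by the global peak, and the way `gain` is applied (multiply, then clip) are all assumptions.

```python
import numpy as np


def rgb_histogram(x, y, labels, colors, bins=200, gain=1.0):
    """Combine one 2D histogram per label into a single RGB image.

    `colors` maps each label to an (r, g, b) tuple with components in [0, 1].
    Hypothetical helper: names and normalization are illustrative assumptions.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    labels = np.asarray(labels)

    # Shared extent so every label's histogram uses the same bin edges.
    extent = [[x.min(), x.max()], [y.min(), y.max()]]

    image = np.zeros((bins, bins, 3))
    for label, color in colors.items():
        mask = labels == label
        counts, _, _ = np.histogram2d(x[mask], y[mask], bins=bins, range=extent)
        # histogram2d returns counts[x_bin, y_bin]; transpose so rows follow y.
        image += counts.T[:, :, None] * np.asarray(color, dtype=float)

    # Normalize by the brightest channel value (assumed normalization).
    peak = image.max()
    if peak > 0:
        image /= peak

    # Assumed gain: brighten sparse bins, then clip back into [0, 1].
    return np.clip(image * gain, 0.0, 1.0)
```

The returned array can be displayed with `matplotlib.pyplot.imshow(image, origin="lower")`. Raising `gain` above 1 makes sparsely populated bins easier to see, at the cost of saturating dense regions.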
More info can be found in the .ipynb file
Comparison figures: Scatter, Hist, Hist (with gain)