Rendering highly granular and "zoomed out" data
There was a GIF on the internet where someone used some sort of CAD program and drew multiple vector pictures in it. In the first frames they zoom in on a tiny dot, revealing a whole different vector picture at a much smaller scale, and then they proceed to zoom in further on another tiny dot, revealing another detailed picture, repeating several times. Here is the link to the gif
Or another similar example: imagine you have a time series with a granularity of a millisecond per sample and you zoom out to reveal years' worth of data.

My questions are: how does such finely detailed data get rendered in the end, when a huge amount of data ends up aliased into a single pixel?
Do you have to go through the whole dataset to render that pixel (i.e. in the case of a time series, go through a million records just to average them into one line; or in the case of CAD, render the whole vector picture and shrink it into a tiny dot), or are there level-of-detail optimizations that can be applied so that you don't have to do this?
If so, how do they work, and where can one learn about them?

Idealistic asked 4/6, 2020 at 9:48 Comment(1)
Plotly can create such zoomable scatterplots. Since it is an open source project, all the necessary code is reviewable. Manducate
This is a very well-known problem in games development. In the following I am assuming you are using a scene graph, i.e. a node-based tree of objects.

Typical solutions involve a mix of these techniques:

  • Level Of Detail (LOD): multiple resolutions of the same model, which are shown or hidden so that only one is "visible" at any time. When to hide and show is usually determined by the distance between camera and object, but you could also include the scale of the object as a factor. Modern 3D/CAD software will sometimes offer automatic "simplification" of models, which can be used as the low-res LOD models (a small selection sketch follows after this list).

At the lowest level, you could even just use the object's bounding box. Checking whether a bounding box is in view takes only around 1-7 point checks, depending on how you check, and you can use object parenting to build transitive bounding boxes.

  • Clipping: if a polygon does not fall inside the viewport at all, there is no need to render it. In the GIF you posted, when the camera zooms in on a new scene, all that is left of the larger model is a single polygon in the background (see the viewport test after the list).

  • Re-scaling of world coordinates: as you zoom in, the coordinates of the vertices become tiny fractional floating-point numbers. Given that you want all coordinates to be as precise as possible, and that modern CPUs handle floating-point numbers with at most 64 bits of precision (and often only 32 are used for better performance), it's a good idea to reset the scaling of the visible objects. What I mean is that as your camera zooms in to, say, 1/1000 of the previous view, you can scale up the bigger objects by a factor of 1000 and at the same time adjust the camera position and focal length. Any newly attached small model would use its original scale, thus preserving its precision. This transition is invisible to the viewer, but allows you to stay within well-defined 3D coordinates while being able to zoom in infinitely (a re-basing sketch follows after the list).

    On a higher level: As you zoom into something and the camera gets closer to an object, it appears as if the world grows bigger relative to the view. While normally the camera space is moving and the world gets multiplied by the camera's matrix, the same effect can be achieved by changing the world coordinates instead of the camera.
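
To make the LOD point from the first bullet concrete, here is a minimal sketch in Python. It is not taken from any particular engine; Node, projected_size and select_lod are made-up names, and it assumes each node stores a bounding-sphere radius plus meshes ordered from most to least detailed. The node's mesh is chosen from how many pixels it would cover on screen, and the node is skipped entirely below roughly one pixel:

    import math
    from dataclasses import dataclass, field

    @dataclass
    class Node:
        # Hypothetical scene-graph node: a bounding-sphere radius (world units)
        # plus meshes ordered from most to least detailed.
        radius: float
        lod_meshes: list = field(default_factory=list)
        children: list = field(default_factory=list)

    def projected_size(node, distance, fov_y, viewport_h):
        # Approximate on-screen height of the node's bounding sphere, in pixels.
        if distance <= node.radius:
            return float(viewport_h)                  # camera is inside the sphere
        angular = 2.0 * math.atan2(node.radius, distance)   # angular diameter
        return angular / fov_y * viewport_h

    def select_lod(node, distance, fov_y, viewport_h,
                   pixels_per_level=100.0, cull_below=1.0):
        # Return the mesh to draw for this node, or None to skip its whole subtree.
        size_px = projected_size(node, distance, fov_y, viewport_h)
        if size_px < cull_below:
            return None                               # smaller than ~1 pixel: cull
        level = int(size_px // pixels_per_level)      # bigger on screen -> more detail
        return node.lod_meshes[max(0, len(node.lod_meshes) - 1 - level)]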
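
For the clipping bullet, the cheapest useful test is a rectangle-vs-rectangle rejection against the viewport in screen space; anything that fails it never needs to be rendered. A tiny sketch, with illustrative names only:

    def in_viewport(bbox_min, bbox_max, view_min, view_max):
        # bbox_min/bbox_max: (x, y) corners of the object's screen-space bounding
        # box; view_min/view_max: corners of the viewport rectangle.
        # Returns False when the box lies entirely outside the view, so the
        # object (and, with a scene graph, its whole subtree) can be skipped.
        return not (bbox_max[0] < view_min[0] or bbox_min[0] > view_max[0] or
                    bbox_max[1] < view_min[1] or bbox_min[1] > view_max[1])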
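
And a sketch of the re-scaling idea from the last bullet: whenever the visible geometry has shrunk below some threshold, multiply it (and the camera) back up so the on-screen image is unchanged but the numbers stay in a range where 32/64-bit floats are precise. The threshold and factor below are arbitrary placeholders:

    def rebase_scene(vertices, camera_pos, zoom_threshold=1e-3, factor=1e3):
        # vertices: list of (x, y, z) tuples of the currently visible geometry,
        # camera_pos: the camera position in the same space.  If everything has
        # shrunk below zoom_threshold, scale both up by `factor`: the rendered
        # picture is identical, but coordinates move back toward ~1.0, where
        # floating point is precise.  Newly attached detail models are then
        # added at their natural scale, preserving their precision.
        extent = max(max(abs(c) for c in v) for v in vertices)
        if extent < zoom_threshold:
            vertices = [(x * factor, y * factor, z * factor) for x, y, z in vertices]
            camera_pos = tuple(c * factor for c in camera_pos)
        return vertices, camera_pos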

Olen answered 17/6, 2020 at 7:23 Comment(0)
First, you can use caching with tiles, the way it's done in cartography. You'll still need to go over all the points once, but after that you'll be able to zoom in and out quite rapidly.
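
For the time-series case, that cache is typically a min/max pyramid: one pass over the data builds progressively coarser levels (like map tiles), and every later zoom level reads only about as many buckets as there are pixels. A minimal sketch; the function names are made up, not from any particular library:

    def build_pyramid(samples):
        # One pass over the raw samples.  Each level halves the resolution and
        # stores a (min, max) pair per bucket.  The full pyramid costs a small
        # constant multiple of the input; skipping the finest couple of levels
        # (which are cheap to read from the raw data directly) shrinks that a lot.
        levels = [[(v, v) for v in samples]]          # level 0: the raw samples
        while len(levels[-1]) > 1:
            prev = levels[-1]
            coarse = []
            for i in range(0, len(prev), 2):
                chunk = prev[i:i + 2]
                coarse.append((min(lo for lo, _ in chunk),
                               max(hi for _, hi in chunk)))
            levels.append(coarse)
        return levels

    def slice_for_view(levels, start, end, width_px):
        # Pick the coarsest level that still has at least one bucket per screen
        # pixel, so drawing the window [start, end) touches ~width_px buckets
        # instead of millions of raw samples.
        level = 0
        while level + 1 < len(levels) and (end - start) >> (level + 1) >= width_px:
            level += 1
        return levels[level][start >> level:end >> level]

Because each bucket keeps both its minimum and maximum, drawing a pixel column as a vertical line from min to max never loses a single-sample spike, no matter how far you zoom out.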

But if you don't have extra memory for the cache (not that much, actually; much less than the data itself), or don't have time to go over all the points, you can use a probabilistic approach.

It can be as simple as taking only every other point (or every 10th point, or whatever suits you). It yields decent results for some data. Again, in cartography it works quite well for shorelines, but not so well for houses or administrative borders, or anything else with a lot of straight lines.

Or you can take a more hardcore probabilistic approach: randomly sample some points, and if, for example, 100 sampled points hit pixel one and only 50 hit pixel two, you can more or less safely assume that if you kept sampling, pixel one would still be hit about twice as often as pixel two. So you can stop there and draw pixel one with twice the intensity (a sketch follows below).
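
A sketch of that hardcore variant, assuming only that you can index the points randomly and that the view bounds are known without scanning everything; the function and parameter names are made up:

    import random

    def sampled_density_image(points, bounds, width, height, k=20_000, seed=0):
        # points: a sequence of (x, y) pairs we can index at random.
        # bounds: (x_min, x_max, y_min, y_max) of the current view, assumed
        # known (e.g. from metadata), so the full dataset is never scanned.
        # Look at only k randomly chosen points, count how many land in each
        # pixel, and shade pixels proportionally: if 100 sampled points hit one
        # pixel and 50 hit another, the first is drawn twice as dark.  Returns
        # a height x width grid of 0-255 intensities (8 bits is all a grayscale
        # pixel can show anyway).
        rng = random.Random(seed)
        x_min, x_max, y_min, y_max = bounds
        x_span = (x_max - x_min) or 1.0
        y_span = (y_max - y_min) or 1.0
        counts = [[0] * width for _ in range(height)]
        for _ in range(k):
            x, y = points[rng.randrange(len(points))]
            col = min(width - 1, max(0, int((x - x_min) / x_span * width)))
            row = min(height - 1, max(0, int((y - y_min) / y_span * height)))
            counts[row][col] += 1
        peak = max(max(row) for row in counts) or 1
        return [[round(255 * c / peak) for c in row] for row in counts]

With k fixed, the cost no longer depends on how many points there are; the trade-off is the one raised in the comments below: a feature represented by very few raw points may simply never be sampled.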

Also consider how much information you can and want to put into a pixel. If you draw a pixel in grayscale, there are only 256 possible values, and you don't need to be more precise than that. And if you're going to draw a pixel in full color, you still need to ask yourself: will anyone notice the difference between something like rgb(123,12,54) and rgb(123,11,54)?

Stubble answered 12/6, 2020 at 16:39 Comment(4)
It's not about the color of the pixel, but its location. What if I skip 100 data points and one of them contains a HUGE spike? Then I would draw a smooth line instead of the highly irregular, 1-pixel-wide spike that would have indicated some anomaly. How to go about this? Idealistic
Color is just one of the dimensions I was talking about. In short: tiles and preprocessing, or else there is no way to deal with data anomalies with certainty. Stubble
Even without plots and other stuff, let's say you have an array of a million integers. How would you know whether there is a 42 among them, and how many of them there are? Only by iterating over the whole million (or, to make it more dramatic, by carefully reading each one). The only other option is to pick randomly from the array, and if you only see numbers around 10000, you can probably guess that there is no 42 in the array. Stubble
And don't forget about 3 Sigma. Stubble
