May 2022: NNGINE-L/Voxoid: A deep dive into how Robot Farm renders its voxel graphics
Hey there! I hope you guys have had a good month since we last talked. And if you live in the USA, I hope you’ve enjoyed the summer weather!
We have something special this month for our more nerdy readers! We’ll be going into the nitty gritty of some of the technology that powers Robot Farm. This is going to be a long, technical post, so put your thinking cap on and buckle up.
Hello, everyone. I'm the developer of the Voxoid engine that powers both Robot Farm and Battledroid. I've been asked to write a bit about some technical stuff in the engine that might interest our players, so while there's a lot of things I could write about, it feels natural to start with the very thing that inspired the name "Voxoid", namely voxel rendering.
What's so special about voxel-based 3D models? Well, usually 3D models store only the surface of a model as a triangle mesh, usually with texture coordinates for each of the triangle corners to apply a nice texture to it. This is also what our graphics cards are optimized for rendering as well, and they're bloody fast at doing it. You simply give the GPU a list of vertices (triangle corners), followed by how to connect said vertices into triangles. A modern mainstream GPU can easily handle a few million vertices and triangles while maintaining 60 FPS.
Unfortunately, voxel rendering doesn't fit this at all. A voxel model, such as the ones exported by MagicaVoxel, simply stores a list of a bunch of tiny colored cubes. This isn't just the surface of a model; this is actually volumetric data, and if there's something that makes graphics programmers freak out, it's trying to render volumes.
Let's illustrate the problem with an example. Let's say we have a cube that's 100x100x100 units big. If this was made in a normal 3D modeling software such as Blender or Maya, etc, it'd simply be a list of the 8 corners of the cube, followed by a list of triangles built using those corners. Since we have 6 faces on a cube and we'd need 2 triangles for each face, we end up with 8 corners and 12 triangles. Dirt cheap, in other words. Unfortunately, if we want to put textures on the side of the cubes, we'll actually need different texture coordinates for each of the 3 faces. This means that we can't actually reuse vertices between faces, as even if just the UV coordinates differ, we still need to duplicate the entire vertex, so in practice, we end up with 24 corners and 12 triangles. That's still dirt cheap. :>
How about a voxel model? Well, that same cube is now a voxel model made out of 100x100x100 = 1,000,000 tiny untextured, colored cubes, all put together to give us our large cube. How can we render this? Well, if we just draw each of those tiny untextured cubes as they are, we end up with 1,000,000 x 8 corners and 1,000,000x12 triangles. Suddenly, the same cube stored as a voxel model takes 8 million vertices and 12 million triangles. That's close to what we can render in total if we want to hit 60 FPS, and this is just a single object! And that's not taking into the consideration that we're rendering everything inside the cube as well. In short, a cube that runs at 5000 FPS when rendered as a simple triangle model now runs at 20 FPS as a voxel model due to the huge number of triangles and the huge amount of overdraw.
Our only choice is to try and preprocess our voxel data to make it a better fit for our graphics cards. Luckily, the voxel models that you can create in MagicaVoxel are much more limited when it comes to their shape than the arbitrary triangle meshes you can make in Blender or Maya. That will allow us to make a lot of assumptions and optimizations, which in the end will help us out quite a lot!
The first thing we'll need to do is get rid of all the useless internal voxels. We can do this by checking if a voxel is completely surrounded by other voxels, in which case it's pretty useless. The result is that we'll simply render the voxels that form the shell of the cube, meaning that we can get rid of 98x98x98 = 941,192 of our 1,000,000 voxels, leaving just 58,808 voxels to render. That's better, but it's it's still quite a lot. We're still at 470,464 vertices and 705,696 triangles. Still pretty ouch, but at least the overdraw is a lot better! We can still do much better though, but to do that we need to stop thinking of voxels as whole cubes and instead look at the 6 individual faces of a voxel.
If we have two voxels placed next to each other, there's no point in rendering the faces of the voxels that are facing the other voxel. This is actually a very common situation; for the voxels on the surface of our large cube, 5 out of 6 faces are effectively covered by neighboring voxels. If we can eliminate individual useless voxel faces, we could therefore save a lot of triangles and vertices compared to just eliminating individual vertices! Our large cube has 6 sides, each with 100x100 voxel faces on it. Each face is 4 vertices and 2 triangles, so we end up with 240,000 vertices and 120,000 triangles. Another great improvement, but we still have a long way to go!
At this point, we could slap a texture on each voxel face and render it, and we'd be on the level of Minecraft. Unfortunately, the single-colored voxels in both Robot Farm and Battledroid are much MUCH smaller than the ones in Minecraft. Doing this in the games that Voxoid powers would result in hundreds of millions of tiny triangles needing to be rendered every frame, and the games running at single digit frame rates, if you're lucky. We need to go deeper.
The next step is to start merging neighboring faces together into larger flat 2D polygons. In the case of our big 100x100x100 cube, we should in theory be able to merge an entire 100x100 voxel side of the cube into a single 2D polygon, in this case a square. If we then turn that square into two triangles, voilà! We'd be rendering our huge cube just as efficiently as if it was a triangle mesh model in the first place, with 2 triangles per cube face! Of course, this has a bit of a problem: what if the voxel faces we're merging together have different colors? We'll return back to that problem later, so for now let's ignore that.
Note that in the case of our simple cube, we just end up with 6 large square polygons, which are easy to turn into a triangle mesh by splitting them along the diagonal. In practice, we could end up with a very complex polygon with hundreds of corners and, even worse, holes inside them! Let's check out a more complex mesh and see what would happen there. Here's a nice little cottage from Robot Farm shown in MagicaVoxel, with a different palette applied for illustration purposes.
It's a little bit bigger than 100x100x100 voxels big, so given the decent amount of air around it it's made up of around 1,000,000 voxels (assuming the inside is solid).
Here's the same cottage, now in Voxoid, showing each voxel face with its own color.
If we were to draw each voxel face separately here, this little cottage would end up with a total of 78,906 faces, for a total of 315,624 vertices and 157,812 triangles. Now, this cottage is a lot more complex than just a huge cube (which had 60,000 faces to render), but that's still a lot of vertices and triangles for what we're rendering, so let's try merging adjacent faces together!
Here's the same cottage again, but with adjacent faces merged into a single large polygon. The result is 12,198 flat polygons. Now, if only we could render all these polygon efficiently, we would end up with a really nice mesh. While a large number of those surfaces are just rectangles of different shapes, some end up much more complex. For example, the frame around the top floor window (purple in the image) not only has a large number of corners, but also has a large hole in it! Fortunately, this is a very well-studied two-dimensional problem: triangulation, in other words turning an arbitrary 2D polygon, possibly with holes in it, into a set of triangles. There is in fact a lot of research into this problem! Let's go through how Voxoid converts a complicated 2D polygon into a list of triangles!
Here's the polygon we're trying to triangulate, with the original voxel faces that it's constructed from visible:
The first thing we're going to do is to identify the contour of the polygon. We do this by traveling along the edge of it in a clockwise direction, starting at its top-most, left-most corner. In addition, we'll do the same thing for each hole in the polygon, but in a counter-clockwise order.
Next, we'll connect one corner of each hole with the closest corner in the contour of the polygon, like this:
This allows us to update connect the contours of the holes with the polygon's outer contour, and therefore "eliminate" the holes! Instead of having holes, the new contour simply takes a detour along the line we added to connect the hole, follows the hole all the way aroud, then touches itself as it goes back the same way it came!
This new polygon, although weird, technically has no holes, and is therefore much easier to triangulate! The fact that the hole contours are going in counter-clockwise direction means that it all just works out when they’re connected to the polygon’s outer contour.
As it turns out, given an arbitrary polygon with N corners, it is always possible to triangulate it into (N-2) triangles. This lines up with our experience with polygons with 4 corners. Squares are polygons with 4 sides, and can be triangulated into 4-2 = 2 triangles.
I use a very simple triangulation algorithm that goes round and round the contour of the triangle and tries to "cut off" a corner of the polygon, and each time it does so it creates a new triangle. After a number of laps around the contour, it successfully manages to break down the polygon into triangles!
Nice! We have our triangles! As it turns out that this isn't just a great way of reducing the number of vertices and triangles needed to draw a voxel model, it's actually the optimal way of drawing a voxel model with triangles; it is always the way that uses the fewest vertices and triangles to do so!
There is however one more thing we can fix before we add these triangles to the mesh. There are actually multiple ways of triangulating the same polygon! For example, if you have a square, you can cut it along both diagonals, so there are two valid triangulations of a square. For polygons with even more corners, there can be a huge number of them to choose from!
At first glance, it may not seem like it would matter which triangulation we choose. After all, the number of corners and triangles stays the same for all of them. However, there is one thing we can improve: the shape of the triangles. GPUs are actually slower at rendering long, thin triangles, even if they end up covering the same area! For that reason, it's worth trying to eliminate thin triangles from the polygon!
This can be done by going through all the edges between triangles in the polygon. Each edge is shared between exactly two triangles. For each edge, we'll connect the two triangles that share that edge into a quad, then check if we can change which diagonal we split this quad along and see if the other diagonal would be shorter. Here's an example of two triangles sharing an edge, the red line:
In this case, the diagonal would get shorter if we swapped the diagonal that is used to split it into triangles, like this:
This check is actually pretty easy to do over and over again for each edge until there are no edges left to fix! Unfortuntely, this still won't always get us to the absolute best possible triangulation, but it'll get us to a good enough one. Here's how our triangulated polygon gets optimized to avoid long, thin triangles:
Phew! That's it! We're done with this polygon and can move on to one of the other 12,197 polygons in our cottage. After going through all of them, we end up with a grand total of 16,182 vertices and 32,237 triangles! And again, this is the lowest possible number of vertices and triangles to render this voxel model! Here’s how that looks:
Except... We're forgetting something. While this will get us an optimal triangle mesh for rendering our voxel model's shape, in the process we've lost the color information of each voxel when merging together neighboring faces into larger polygons! Oh no!! How do we draw triangles with many different colors on them?! Well, textures, of course. That's how we do it for everything else! We'll simply have to save the colors of the faces we merge together into polygons into tiny textures, and then place the texture onto the polygons! Phew! That was close!
Except... Now we need texture coordinates for each vertex! Each vertex is shared by exactly 3 different polygons, meaning we'll need three different versions of each vertex with different texture coordinates! That'll essentially triple the cost of drawing our voxel models! Noooo!! Is there any way we can avoid this?
Fortunately, this is where the fact that we're rendering voxels saves us. Our voxel textures are in fact extremely simple. Each pixel in our textures are always exactly 1x1 voxels large, and are always perfectly aligned with voxels. This means that we can store some very compact texture projection data per polygon instead of storing actual texture coordinates in the vertices. Therefore, as long as each triangle knows which polygon it is part of, we can just calculate the texture coordinates per pixel. And since our vertices would no longer need texture coordinates, we can then reuse the same vertex, even for different polygons!
In fact, since the coordinates of all our vertices have to be whole numbers in the range 0 to 255, we only need 3 bytes to store the position of each vertex! For reference, usually a 3D model uses 3x4 = 12 bytes for the position alone, and usually closer to 40 to 50 bytes to be able to store texture coordinates, normals, etc. None of these parameters have to be stored per vertex, as they can instead be either stored in or implicitly calculated from the 8 bytes of projection data stored per polygon instead.
Since we now essentially just have a normal triangle mesh, we can then optimize this mesh further for rendering by reordering the triangles in it to help the GPU's vertex cache do its job. This is standard practice for all 3D models in all games and can help tremendously with performance, so I won't go into detail of how it's done here.
And that's it! This entire pipeline for converting a voxel model into a triangle mesh was unfortunately very difficult to implement and took several months of work to perfect, but the end result is a voxel engine with the ability to process an incredible number of triangles every frame. The reason for this is threefold:
Each voxel model is converted to the minimum number of vertices and triangles needed to achieve the original shape of the voxel model.
Each vertex is just 3 bytes big and is extremely cheap to process.
Each vertex can reused 3 times more efficiently than usually for voxel models, drastically reducing the number of vertices AND improving the efficiency of the vertex cache due to the lower vertex count.
By reducing the number of vertices and also the cost of each vertex so significantly, Voxoid is able to achieve a raw triangle processing power that is several times higher than engines that aren't able to take advantage of the special properties of voxel models. In fact, Voxoid scenes can easily reach 20,000,000 triangles when shadow maps are taken into account, yet still can still achieve frame rates over 100 FPS on decent hardware.
Unfortunately, this post already ended up very long, so I’ve had to gloss over a large number of details, especially in the polygon triangulation part. In particular, handling holes in polygons turned out to be very complicated with a lot of edge cases.
Whew, that was a lot to take in! That’s it for this month, but we hoped you like this month’s devblog. If you enjoyed reading this more technical dive into how Robot Farm works, be sure to let us know and we’ll do this more often.
Until next time, please take care and have a wonderful month!
♥ NOKORIWARE
Stay updated!
Check these out too: