Maximize performance: merge *all* track objects

Discussion in 'Tracks' started by luchian, Aug 21, 2016.

  1. luchian

    luchian Administrator Staff Member

    I remember reading about performance at one point on the forums, and came across a thread from AC member BrunUK, the author of Donington, one of the best user-created track mods for Assetto Corsa.

    Here are the relevant parts of the conversation. It's a long-ish read, but deliberately so, so that we can understand the reasoning behind the actions as well.

    After adding a lot of extra stuff to Donington since the last release, I've noticed in the render stats window that the 'DIP' number had gone up a lot (often over 4,000). DIP = the number of draw calls, i.e. the commands the CPU has to send to the GPU to get things drawn. When there are too many of them the CPU becomes the bottleneck, and it doesn't matter how much graphics power you have, because the GPU isn't being given enough work to do. I knew that draw calls go up when there are large numbers of individual meshes, textures, materials etc., so I started to do things like consolidate separate textures into larger maps, and reduce the number of materials where possible.

    This seemed to help a bit, but it was merging meshes together that made an absolutely huge difference. My scene previously had about 2,000 individual items, but I got that down to about 850 and performance has increased massively. No change in the overall vertex/polygon count, just the way they're saved in the scene.

    One downside (the only one as far as I'm aware) is that you lose the ability to set LoDs individually. To avoid obvious popping-in, objects might need setting to be visible from further away, but it definitely seems worth the trade-off./by BrunUK

    ***
    At the other extreme, at Lake Louise @Snoopy and I found that splitting the mountains into more segments actually increased performance slightly, most probably because the engine is able to cull segments which don't need to be rendered.

    Mind you, that only adds maybe 50-60 objects in total, so the trade-off is worth it in some cases at the other extreme./by Lucas
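A rough sketch of why splitting can help culling, in Python. This is only an illustration: the `is_visible` helper, the 100 degree field of view and the ring of 8 mountain segments are all assumptions, and a real engine tests bounding volumes against the full view frustum rather than a single centre point.

```python
import math

def is_visible(center, cam_pos, cam_dir_deg, fov_deg):
    """Hypothetical 2D test: is the segment's centre inside the camera's field of view?"""
    dx = center[0] - cam_pos[0]
    dy = center[1] - cam_pos[1]
    angle = math.degrees(math.atan2(dy, dx))
    # Signed angular difference in the range [-180, 180)
    diff = (angle - cam_dir_deg + 180) % 360 - 180
    return abs(diff) <= fov_deg / 2

# Assumed layout: 8 mountain segments in a ring around the track, 45 degrees apart
segments = [
    (1000 * math.cos(math.radians(a)), 1000 * math.sin(math.radians(a)))
    for a in range(0, 360, 45)
]

camera = (0.0, 0.0)
visible_segments = [s for s in segments if is_visible(s, camera, 0.0, 100.0)]

# As one single mesh, all 8 segments' worth of geometry is processed whenever
# any part of it is on screen; split up, only the visible segments are.
print(len(visible_segments), "of", len(segments), "segments need rendering")
```

The cost is one extra draw call per extra segment, which is why this only pays off for a small number of large objects.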

    ***
    I guess it's always gonna be a question of balance depending on circumstances.

    I ran a few tests tonight which confirmed that every separate mesh will add a draw call if the object is in the camera's field of view. For example a 10x10x10 array of cubes increased the DIP value by exactly 1000. As far as I understand, it's much more efficient to give the GPU one 10,000 poly object rather than let the CPU deal with a hundred 100 poly objects.
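The cube test can be modelled with a few lines of Python. This is a toy model, not AC's renderer: it simply assumes one draw call per visible mesh with a single material, and 12 triangles per cube (6 faces, 2 triangles each).

```python
from itertools import product

def draw_calls(meshes):
    """Assumed model: one draw call per visible single-material mesh."""
    return sum(1 for m in meshes if m["visible"])

# 10x10x10 array of separate cubes
cubes = [
    {"pos": p, "triangles": 12, "visible": True}
    for p in product(range(10), repeat=3)
]

# The same geometry merged into a single mesh
merged = [{
    "pos": (0, 0, 0),
    "triangles": sum(c["triangles"] for c in cubes),
    "visible": True,
}]

separate_calls = draw_calls(cubes)
merged_calls = draw_calls(merged)
total_triangles = merged[0]["triangles"]

print(separate_calls, merged_calls, total_triangles)
```

The triangle count is identical in both cases; only the number of draw calls changes.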

    I also wanted to see what effect adding different materials would have, and it seems that each one only adds to the draw calls regardless of whether it's on a single mesh or shared by several. This suggests it makes sense to merge objects together and have several materials on each.

    The other thing about this draw call business is that I'm fairly certain it causes a big hit on the CPU. Obviously not something that's an issue with hotlapping, but when notoriously CPU-dependent AI are added into the mix it makes sense to minimise the CPU impact of the track as much as possible./by BrunUK

    ***
    I can confirm this. Merging buildings/assets into larger blocks, and packing textures into atlases really helped a lot in LuccaRing. I still have to do the opposite on the terrain, which is currently a huge mesh with a huge texture, and should probably be split into at least 4 smaller chunks. /by ir Sindaco

    ***
    Regarding DIP (Draw Indexed Primitives)/draw calls and batch size, imagine a DIP as a single colour in a painting. The more colours a painting has, the more DIPs there are. If the painting has 1000 colours, that means there are 1000 DIPs in that painting.

    The batch size is how many pixels each colour has in that painting. Let's say there are 200 pixels of red; that is counted as 1 DIP with a batch size of 200. If there are also 100 yellow pixels, then the painting will have 2 DIP with an average batch size of 150.
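The painting analogy works out as follows in Python (the `painting` dictionary is just an illustration of the numbers above):

```python
# Pixels per colour in the painting from the analogy
painting = {"red": 200, "yellow": 100}

dip = len(painting)                         # one DIP per colour
avg_batch = sum(painting.values()) / dip    # average pixels per DIP

print(dip, avg_batch)
```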

    As for how DIP is counted in 3D models:
    - 1 object with 1 material (standard) is counted as 1 DIP
    - 1 object with a multi-material (4 sub-materials) is counted as 4 DIP
    - 2 objects with 1 material are counted as 2 DIP
    - 2 objects with the exact same multi-material (4 sub-materials) are counted as 8 DIP

    Now this is why we need to be careful when we use multi/sub-object materials.
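The counting rule in the list above can be written as a tiny Python helper. The `count_dip` name is made up for this sketch; it only encodes the rule as stated here (one DIP per sub-material per object), not anything read from the engine.

```python
def count_dip(sub_materials_per_object):
    """One DIP per sub-material on each object (rule as stated in the thread).

    sub_materials_per_object: one entry per object, giving how many
    (sub-)materials that object uses; a standard material counts as 1.
    """
    return sum(sub_materials_per_object)

one_standard = count_dip([1])       # 1 object, 1 standard material
one_multi = count_dip([4])          # 1 object, multi-material with 4 subs
two_standard = count_dip([1, 1])    # 2 objects, 1 material each
two_multi = count_dip([4, 4])       # 2 objects, same 4-sub multi-material

print(one_standard, one_multi, two_standard, two_multi)
```

Note that sharing the same multi-material between two objects does not save anything: the count is per object.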

    20 separate objects with 10 polygons each, having the same multi-materials (with 10 sub-materials) is counted as 200 DIP with average batch size of 10.
    If we attach all those objects together, the result will be 10 DIP with average batch size of 200

    If we detach and break down the objects based on matID, separate the multi-material into 10 standard materials, and apply them individually, the result will be 10 DIP with average batch size of 20.
    This last method gives the best result, with the lowest number of DIP and batch size.

    To put it simply, I would suggest avoiding multi-materials unless the material is unique. Attach ALL objects that have the same material, and then split them into reasonably sized pieces.

    Just like any other racing game, AC has a very dynamic flow, with our POV changing rapidly depending on the track's direction.
    If we split our meshes into chunks of nearly 65K vertices all around the track, the GPU will have a hard time loading and unloading them on and off the screen, because even if only a fraction of an object appears on screen, the whole object needs to be processed.

    This is why we need to think carefully about how to split our objects, and we can also take the track's direction into consideration. Splitting objects into a lot of small objects is not recommended either, since it will raise the DIP and lower the batch size.

    Well, I suppose that is all the basics about DIP and batch size. /by Abulzz

    ***
    One thing I meant to add was that because AC's AI is *very* demanding on the CPU, it's actually more important than in typical 3D games to minimise the impact that the graphics have on the processor. While it's true that heavier meshes are harder for the GPU to deal with, if you split them up in a way that would give an optimal CPU/GPU balance for hotlapping, that could result in the CPU being overwhelmed when trying to deal with AI as well./by BrunUK

    ***
    I did some experiments, and my conclusion is that CPU occupancy is heavily related to physics calculation.
    Unlike other games, AC has physics data that needs to be processed by the CPU. From the different configurations that I tried, I'm pretty much certain that the AC engine calculates physics mesh data that comes inside a specific radius of a car (either our car or an AI's; that's why the CPU occupancy doubled for each AI added). As long as a physics mesh is inside that radius, it will consume CPU resources.
    But what happens if we don't separate visual and physical meshes?
    Sometimes when we're on the track, we can see far into the distance. Since those objects are visible on screen, they are obviously loaded and processed to be rendered. But since they also have physical properties, the CPU will automatically calculate the physical properties as well, even though they are not within any car's radius.
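To make the radius idea concrete, here is a small Python sketch. It is purely a model of the hypothesis described above: the `active_physics_meshes` helper, the 200 m radius and the mesh positions are all invented for illustration, and nothing here is taken from the AC engine itself.

```python
import math

def active_physics_meshes(meshes, cars, radius):
    """Return names of physics meshes within `radius` of at least one car.

    meshes: dict of name -> (x, y) centre; cars: list of (x, y) positions.
    """
    active = set()
    for name, (mx, my) in meshes.items():
        if any(math.hypot(mx - cx, my - cy) <= radius for cx, cy in cars):
            active.add(name)
    return active

# Assumed layout: two road sections near the cars, one distant mountain
meshes = {"road_a": (0, 0), "road_b": (300, 0), "mountain": (5000, 0)}
cars = [(10, 0), (280, 0)]   # our car plus one AI car

active = active_physics_meshes(meshes, cars, radius=200)
print(sorted(active))
```

In this model the distant mountain costs no physics time as long as it is a separate, purely visual mesh; if it were physical and processed whenever it is on screen, it would be calculated as well.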

    And just like with DIP and batch size, physical meshes need to be split into reasonably sized pieces, but this time also taking the radius into consideration. You might have to run some experiments to get the best configuration for a specific track. For Targa Florio, splitting the physical mesh every 400 meters or 4-5K triangles gave the best result./by Abulzz
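A minimal Python sketch of splitting a physics mesh by distance along the track. The 400 m default comes from the Targa Florio figure above; the `split_by_distance` helper and the `(triangle_id, distance_m)` input format are assumptions for illustration, since in practice the split would be done in the 3D modelling tool.

```python
from collections import defaultdict

def split_by_distance(triangles, chunk_length_m=400.0):
    """Group triangles into chunks by their distance along the track.

    triangles: iterable of (triangle_id, distance_m) pairs.
    Returns a dict of chunk index -> list of triangle ids.
    """
    chunks = defaultdict(list)
    for tri_id, distance_m in triangles:
        chunks[int(distance_m // chunk_length_m)].append(tri_id)
    return dict(chunks)

# Hypothetical 2 km stretch of road with one triangle every 0.5 m
track = [(i, i * 0.5) for i in range(4000)]
chunks = split_by_distance(track)

for index in sorted(chunks):
    print(index, len(chunks[index]))
```

The right chunk length will differ from track to track, so treat 400 m as a starting point for experiments rather than a rule.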

    ***
    Interesting about splitting up the physics objects. At the moment I have separate physics geometry for the road and kerbs, but the grass and sand are both physics and renderable. The road and kerbs are separate just for the sake of organization in the model, but they're a single mesh each. I hadn't considered splitting them up into sections, but will definitely try that./by BrunUK

    ***
    That probably explains why I got better fps splitting up my mountains, because the whole landscape is physical. I guess I'd better change that to non-physical for stuff that's too far away. I'll do some tests later on and see if that improves fps or not./by Lucas

    ***
    Here are some comparisons between Donington 1.06 and what will be 1.07:
    This is at 2560x1440 on the starting grid with 23 AI Ferrari 599XX

    Code:
             1.06     1.07

    GPU       70%      98%
    DIP      4818     3737
    FPS        75       85
    TRI      4.4M     5.5M
    SCN       12M    13.6M
    So I've added 1.6M more triangles to the scene and on the grid it's drawing 1.1M more. Despite this, reducing the DIP by over 20% means the GPU is allowed to work to its full potential and FPS increases accordingly, by about 13%./by BrunUK
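The percentages quoted for the Donington comparison can be checked with a few lines of Python (the values are copied from the table; the `pct_change` helper is just for this check):

```python
def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100

v106 = {"DIP": 4818, "FPS": 75}
v107 = {"DIP": 3737, "FPS": 85}

dip_change = pct_change(v106["DIP"], v107["DIP"])
fps_change = pct_change(v106["FPS"], v107["FPS"])

print(f"DIP: {dip_change:.1f}%  FPS: {fps_change:+.1f}%")
```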
     