Here, here, and here
The biggest takeaway from all this is that they've moved to deferred rendering for their RTS. I am a big proponent of deferred rendering, and originally pitched it as our lighting solution back when we started HaloWars. Unfortunately, due to EDRAM sizes and our lack of understanding of the hardware at the time, we couldn't figure out how to pass around enough information in memory to store the frame buffers and still be 4xAA compliant w/o taking tons of memory from our content creators. (We know better now..)
Those of you who've been around for a while will remember a nice little chat I had with some ATI people back in 2005. Below is the original post, with a few spelling corrections. What took me by such surprise back then was how quick they were to just toss the concept out the window. Obviously no one had tried to use it for the RTS genre before, but that doesn't mean it was simply 'going away.' I think these days almost all games use some sort of 'deferred' system, whether for post-process screen effects or final lighting / HDR passes, which we can effectively view as a bastardized attempt at raytracing on GPUs.
Thursday, July 28th
So, I just got back from Microsoft Meltdown 2005. I have to say, pretty interesting stuff.
So, let me rant for a second... I had a chance to discuss 'Deferred Rendering' with some of the bigwigs over at ATI. I was quite surprised at how easily they dismissed the technique as invalid. To be fair, I did come into the situation pro-deferred, and although they brought up some valid points, nothing was REALLY said that would cause deferred rendering to not be a valid technique for generalized situations. (Besides 'it doesn't scale well,' which was actually brought up by some of the DirectX ninjas.)
To preface, these were the first words out of the ATI techs' mouths (note, plural, as I debated with about 2-3 of them for the better part of 3 hours): 'Early Z will give you the same performance boost as deferred rendering (in fact, better), and the amount of time that's going to be spent turning it into a viable technique is longer than the lifetime of the technique itself. And to top it all off, forcing a single BRDF for the entire scene isn't a good plan for next gen systems. It's just not worth it.' OK, valid. Potentially graphics hardware will make a large jump w/ the DX10 stuff, where early Z cull will be pretty nice. And YES, a single BRDF for the whole scene does tend to make artists a bit squirmy. BUT, that's not the reason I like it.
The example I proposed to them: Imagine you have a full RTS. You've got dynamic environment lights, texture splatting on the terrain (avg 25 splats per chunk, let's just say 4 chunks visible), 100+ (min) skinned units on the screen, and about 100+ non-skinned prefabs. We've got 3 main directional lights (one main for shadows, the other two are just support), and then 35 local lights on the terrain, attached either to objects, or just placed by artists for 'cool' appeal (note, both point and spot lights for the locals). We're using Tom Forsyth's multi-frustum shadow map approach, oh, and we'll just throw in reflective water for the hell of it.
My argument was that in this sort of situation, deferred rendering is a BIG win. (And in my mind, 'the way'.) So first, let's talk about simple light management. I offered this suggestion: 'So, with that many lights in the world, we have to search our hierarchy for local lights every render, decide which are the most important, bind those to our shader (assuming PS2.0 for current hardware), then take the less important ones, throw them into either an ambient cube or some sort of SH mapping, and bind those to the shader as well. Simple management heuristics at this point already tell us that deferred is a big win here. Having to manage the searching and binding of each of those lights every frame is pretty difficult. On top of that, we're using the multi-frustum shadow map approach, so potentially each shadow-casting light could have 2-3 shadow maps. (Which, in its native implementation, requires each receiver to know what maps it's receiving.) So already, we're looking at a pretty messy management system for a single object on the screen: What lights do I get directly, and what lights go into my ambient setup? What shadows do I receive (and am I capped)? And do I have enough texture slots left in the shader for shadow maps (not taking into account albedo, normal, and a few control textures)? I mean, at this point, you have to admit, from sheer management issues alone, the ability to separate the lighting passes from the objects is a pretty decent win here.'
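To make that bookkeeping concrete, here is a rough Python sketch of the per-object work a forward renderer has to do every frame, next to the deferred equivalent. Everything in it is hypothetical: the `Light` record, the `importance` score, and the cap of 4 lights per shader are assumptions for illustration, not anything from a real engine.

```python
import math
from dataclasses import dataclass

MAX_SHADER_LIGHTS = 4  # assumed per-shader cap, not a figure from the post


@dataclass
class Light:
    name: str
    pos: tuple
    radius: float
    intensity: float


def importance(light, obj_pos):
    """Crude score: brighter and closer lights matter more; 0 outside the radius."""
    d = math.dist(light.pos, obj_pos)
    if d >= light.radius:
        return 0.0
    return light.intensity * (1.0 - d / light.radius)


def forward_light_setup(obj_pos, lights):
    """Forward path, run once per object per frame:
    rank the lights, bind the top few, fold the remainder into an ambient term."""
    scored = [(importance(l, obj_pos), l) for l in lights]
    scored = [(s, l) for s, l in scored if s > 0.0]
    scored.sort(key=lambda sl: sl[0], reverse=True)
    bound = [l.name for _, l in scored[:MAX_SHADER_LIGHTS]]
    ambient = sum(s for s, _ in scored[MAX_SHADER_LIGHTS:])
    return bound, ambient


def deferred_light_passes(lights):
    """Deferred path: one screen-space pass per light, with no per-object lookup."""
    return [l.name for l in lights]
```

The forward function has to run for every one of the 200+ objects on screen, and that is before shadow map and texture slot assignment enter the picture. The deferred function never looks at an object at all, which is the whole point of the argument.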
To which the ATI individuals responded: Well, that can be solved with an UBER shader derivation. Create a version of each shader for each potential combination of lights and shadows, and just assign that to the mesh before it's rendered.
Well, I countered, isn't that limiting your artists to a single, predefined group of BRDFs? Just like deferred rendering?
Not really, you can have multiple materials. You just need a version of each shader for each combination of input lights and shadows.
Granted, this is the path that Half-Life 2 took, but, from what I recall, their shader directory ended up being some ungodly number of MB in size. Fine, I'll accept that, I allowed, so what about all the other effects that we wish to add globally to a scene, such as atmospheric scattering and whatnot? That would exponentially shoot up the number of shader combinations for each type of material to be affected by these global effects.
Just Multipass it then.
What about being batch bound at that point? Seems like tripling your calls on such a setup would be a bad thing? X draws per character * Y reflective surfaces + Z shadow maps?
*Shrug* Yea, you'd be draw bound at that point..
At this point, I was just dumbfounded.
I don't know if I was just talking to the wrong group of guys, but the whole conversation came across less as 'Hey, we've done the research, and here are some viable reasons why DR would cause some problems in your pipeline.' and more like 'Hey, fuck you.'
[...]
Anyhow, my thoughts on deferred rendering rest on the sole fact that management of lights and shadows becomes quite easy. Not only that, but avoiding the headache of bazillions of shader combinations is a big win. Yeah, the DX ninjas are correct: it doesn't scale well to lower-end systems (those without FP blending, for example), and convincing your artists that a single defined set of control values through a single BRDF is useful is quite difficult. I really think that with the amount of post-processing we're moving towards, pushing the graphics card itself (forcing a fillrate bind) with a fullscreen quad is more efficient than resorting back to multipassing and batch limiting yourself (ESPECIALLY IN DENSE OBJECT ENVIRONMENTS!!!!)
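For a feel of how fast the 'bazillions of shader combinations' pile up, here is a back-of-the-envelope sketch in Python. The function and all the input numbers are made up for illustration; it also simplifies by counting only how many lights and shadow maps are bound, ignoring point vs. spot variants, which would make things worse.

```python
def shader_variants(materials, max_lights, max_shadow_maps, global_effects):
    """Count uber-shader variants when every combination gets its own compiled shader.

    Hypothetical model: a variant per material, per number of bound lights,
    per number of bound shadow maps, per on/off state of each global effect.
    """
    light_counts = max_lights + 1          # 0..max_lights lights bound
    shadow_counts = max_shadow_maps + 1    # 0..max_shadow_maps maps bound
    effect_toggles = 2 ** global_effects   # each on/off effect doubles the set
    return materials * light_counts * shadow_counts * effect_toggles


# Assumed numbers: 20 materials, up to 4 lights and 3 shadow maps per object.
print(shader_variants(20, 4, 3, 0))  # 400 variants with no global effects
print(shader_variants(20, 4, 3, 3))  # 3200 once three global effects are added
```

Every global on/off effect doubles the count, which is the exponential growth complained about above. With deferred lighting the material shaders only write surface attributes, so in this simplified model the light, shadow, and global-effect terms drop out of the per-material count.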
Now, WRT early Z: yes, I do agree that early Z will give you one hell of a performance boost. BUT, it still doesn't solve the light / shadow management situation, or the batch count submission problem. It just saves us on fillrate, which is entirely on the GPU side of things..
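As a rough illustration of the batch count problem, here is a small Python estimate using the RTS scenario from earlier in the post. The two formulas are my own simplification, and they pessimistically assume that every object is drawn into every view and every shadow map, with no culling, and that each deferred light costs one draw per view.

```python
def forward_draws(objects, passes_per_object, views, shadow_maps):
    """Forward multipass: every object is redrawn per lighting pass in every view,
    plus once into each shadow map."""
    return objects * passes_per_object * views + objects * shadow_maps


def deferred_draws(objects, lights, views, shadow_maps):
    """Deferred: one geometry pass per view, one draw per light per view,
    plus the same shadow map rendering as the forward path."""
    return objects * views + lights * views + objects * shadow_maps


# Assumed scenario: 100 skinned units + 100 prefabs, 3 lighting passes per object,
# 2 views (main + water reflection), 3 shadow maps, 3 directional + 35 local lights.
print(forward_draws(200, 3, 2, 3))    # 1800
print(deferred_draws(200, 38, 2, 3))  # 1076
```

The shadow map term is identical in both, and early Z does nothing to either number, since it only cuts pixel work after the draw call has already been submitted.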
Fine, don't listen to me in 2005. You'll have to listen to Blizzard from now on.
Assholes.
~Main
1 comment:
This dug up some memories of an old favorite phrase around the GH in the C5 era...
IT IS SHIT