Is this research for next-gen or next-next-gen?

One of the cooler things about Siggraph is that we all get to drink some Kool-Aid and yell at each other over the future of graphics. One set of course notes you should definitely check out is the Beyond Programmable Shading course. For me, the theme of Siggraph was “What would it take to actually solve this problem?” Instead of thinking about what hacks we can do to get slightly less-worse results, most talks seemed to focus on actually solving these things for good.

Problems for film-like rendering:

  • What would it take to have no aliasing anywhere.
  • What would it take to get sub-pixel triangles everywhere.
  • What would it take to have perfect shadow resolution with no acne.
  • What would it take to have perfect soft shadows with clean edges.
  • What would it take to get true environment reflections.
  • What would it take to get correct real-time GI.
  • What would it take to render hair as unique strands.
  • What would it take to get true lighting models.
  • What would it take to get correct order-independent transparency.
  • What would it take for photoreal water/fire/smoke.
  • What would it take for proper caustics.
  • What would it take for proper light scattering in media (i.e. smoke, thick liquids, etc).

Yes, there were some talks that focused on incremental improvement via hacks (like my talk), but the focus seemed to be on actually solving these long-term problems. Also, the consensus for most of these is that we can’t really solve them by just scaling up current hardware. In order to really tackle them, hardware has to change in some fundamental way. So the question becomes: are we going to care about these for next-gen (PS4, Xbox 720), or are we going to wait for next-next-gen (PS5, Xbox 1080)?

On the other end, I remember a discussion I had with Dave Cardwell (co-creator of Mudbox) back when I was working for EA. This would have been 2006-ish. Suppose we want to render a blue rubber ball with a pitch-black background. If you do that in a video-game engine, it has the classic “video-game” look. But if you render it in RenderMan, it looks real. It almost looks like you can touch it. I think we also agreed that it looked “juicy”. That’s a perspective that stuck with me. While I’d like to solve the sexy problems like rendering 20 planes of translucent geometry with correct depth-of-field, in games I still have yet to see something as simple as a concrete block look photoreal.

So what’s on my short list for the “next” things I want to get solved? Looking at Uncharted 2, here are the improvements that I’d like to make given more horsepower:

  1. Turn Everything On: There are lots of features that current hardware handles “well”, but we have to turn off because we just don’t have the cycles. Ideally, we’d like all surfaces to have a blend and all surfaces to have specular lighting, but lots of background geometry has those things cut for performance.
  2. Better Shading: We still can’t afford good lighting models. At a minimum, I’d like all materials to have two specular lobes and a fresnel term.
  3. More Lights: While we have support for deferred (light-prepass if you will) shadowed lights, we can very rarely afford them.
  4. More Triangles: I’d like more triangles, although getting more triangles on the main characters is less of a priority. For main characters (like Drake/Chloe/Elena) it would be nice to get one more subdivision, but they aren’t the bottleneck on visual quality. I’d like to bring the less-important characters and environments up to the same triangle density as the characters.
  5. Higher Quality Shadows: I don’t want anything crazy here. But if I could use the same shadows but afford more taps, I’d be a really happy guy.
  6. Full-Res Alpha: It would be really nice to render all particles at full resolution.
  7. Better Image Quality: This one is a little more aspirational, but it would be really nice to be able to afford 4x MSAA (rotated grid) with FP16 render targets.
  8. More Foliage/Overdraw: For any foliage, we have to really cut down the cards and the lighting models because of the pixel-shader cost. I’d really like to allow the artists to have many more triangles/layers and the same lighting models as everything else.
  9. Better Post: I’d like to have cleaner edges, bokeh in the DOF, higher-quality motion blur, etc.
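
As a concrete illustration of item 2, here is a minimal sketch of what “two specular lobes and a fresnel term” could look like, in Python for readability. The choice of Blinn-Phong lobes, Schlick’s Fresnel approximation, and all the exponents and weights are illustrative assumptions on my part, not the shading code of any actual engine:

```python
def schlick_fresnel(f0, cos_theta):
    """Schlick's approximation to the Fresnel reflectance."""
    return f0 + (1.0 - f0) * (1.0 - cos_theta) ** 5

def two_lobe_specular(n_dot_h, n_dot_l, v_dot_h, f0=0.04,
                      power1=32.0, weight1=0.7,    # broad lobe (sheen)
                      power2=256.0, weight2=0.3):  # tight lobe (highlight)
    """Blinn-Phong specular with two lobes and a Fresnel term.
    All constants here are illustrative, not from any particular game."""
    fresnel = schlick_fresnel(f0, v_dot_h)
    lobes = weight1 * n_dot_h ** power1 + weight2 * n_dot_h ** power2
    return fresnel * lobes * max(n_dot_l, 0.0)
```

The two exponents give a broad sheen plus a tight highlight, which reads as more “juicy” than a single lobe, while the Fresnel term brightens glancing angles.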

So that’s my wish-list. It’s not very exciting, but if I had a big chunk of extra horsepower, that’s what I’d spend it on first. Then, after I got those things, I would want the more aspirational goals in the first list.

All that brings me back to the original question: what do we want in the next generation of consoles? Of course, everyone’s opinion will be different. But then the question is how much more power we will actually have. If we don’t get enough raw horsepower for everything on that list, then I’d prefer a card that is basically the same but with more power. It’s only after we can afford those things that I’d be asking for a fundamental redesign to get us the more aspirational goals.

Compared to the 360/PS3, I think we could get most everything on my list with about a 10x increase in horsepower per pixel. Of course, I’m betting that the next generation will have 1080p as a standard and 3D as a standard too, each of which doubles the number of pixels we have to render per frame. So we would be about 4x behind in power-per-pixel before we even start. In my mind, if the next-gen GPUs are more than 40x more powerful than current hardware, then they will need some kind of redesign to take advantage of all that power. But if they are less than 40x more powerful than the current crop, I’d be happy with a generic DX10/11 GPU, and I’ll wait for filmic rendering on PS5/Xbox 1080.
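
To spell out the back-of-envelope arithmetic (all numbers are the assumptions above, not measurements):

```python
# Back-of-envelope check of the power-per-pixel argument, using the
# post's assumptions: 720p current-gen, 1080p + stereo 3D next-gen.
current_pixels = 1280 * 720
next_pixels = 1920 * 1080 * 2  # 1080p, doubled again for 3D

pixel_factor = next_pixels / current_pixels  # exactly 4.5 (rounded to ~4x above)
total_needed = 10 * pixel_factor             # 10x per-pixel wish => ~45x (~40x above)

print(pixel_factor)  # 4.5
print(total_needed)  # 45.0
```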

18 Responses to “Is this research for next-gen or next-next-gen?”

  1. Very interesting blog

    I agree, the next Xbox and PlayStation will be expected to support higher resolutions and 3D as standard. However, 3D at 1080p for each eye? I doubt it. I suspect 3D will be 540p for each eye, so transform rate goes up but fill rate remains the same as 1080p. (Quad occupancy is worse, however.)

    I think we’ll likely end up with DX11 capable cards in the next XBox and Playstation. So we’re looking at:
    1. More capable texture compression formats
    2. Geometry shaders
    3. New shader model

    All welcome additions IMO. Geometry shaders are simply better at some operations, post processing for example, which buys time we can spend elsewhere.

    I suspect we’ll see efforts from both hardware vendors to blur the line between CPU and GPU, perhaps leaning more on GPGPU than on the CPU helping out the GPU as we had/have with CELL.

    I think we’re in for new features but nothing like the leap from PS2 / Xbox1 to the current gen. Instead we’ll see yet more horsepower and programmability. I doubt the next generation will require much of a change to anyone’s content pipelines / existing renderer.

    I think quad occupancy will be a significant issue next generation. Simply put, triangles will be too small, and art budgets will be stretched to create good LODs. Keeping the GPU working efficiently will be very difficult. Ultimately, a fundamental shift in rendering architecture is required in the drive towards sub-pixel triangles; massively parallel scan-line rasterization simply breaks down. I think we’ll see hardware vendors tackling this the generation after next.

    I agree with your wish list but would add a CPU one: better profiling tools for CPU-side concurrency.

  2. Hi Martin. All those things sound good. The big question for me is if we are happy with rendering all pixels as groups of 2×2 quads for one more generation. For backgrounds, triangles will still be much larger than 4 pixels. For characters, I’m not sure. I rarely see any faceting in the main characters, even in closeups during the cut-scenes. Those guys are all less than 40k triangles (I think). If we got one more subdivision, then I’m more than happy, and I think most triangles would still be more than 4 pixels in most views.

    As far as the CPU helping the GPU or the GPU helping the CPU, I’m still a little dubious. The GPU could help with a few tasks, like ray-triangle intersections or testing a ray against 1000 AABBs, but higher-level operations are still not feasible. If you look at the 360, you can use the CPU to do GPU work, but the numbers just don’t add up and it’s not worth it. Whereas for the PS3, the Cell has enough power that it makes sense to deal with the nightmare of moving work from the RSX to the Cell. So will the next gen CPUs be more like the Cell or more like the Xenon? Last I heard the Cell was discontinued.
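
    A rough illustration of the 2×2 quad cost discussed above: GPUs shade in 2×2 pixel quads, so a small triangle pays for helper pixels it never covers. The square-footprint model below is purely illustrative, not a real occupancy measurement:

```python
import math

def overshade_factor(tri_pixels):
    # Crude model: treat a triangle as a square footprint of tri_pixels
    # pixels. The GPU shades whole 2x2 quads, so every quad the footprint
    # touches costs 4 pixel-shader invocations, even if partially covered.
    side = math.sqrt(tri_pixels)
    quads_touched = math.ceil(side / 2 + 1) ** 2  # +1 row/column of partial quads
    return (quads_touched * 4) / tri_pixels

print(overshade_factor(10000))  # large triangle: ~1.04x, near-ideal
print(overshade_factor(4))      # 2x2-pixel triangle: 4.0x overshading
```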

  3. I’ve been thinking about this quite a lot this past day.

    With triangles approaching the 2×2 size, we are on average going to have a lot of pixels turned off (simply by dint of rendering triangles, not screen-aligned quads), though you’re right, this will largely be characters, weapons and a few problem meshes (stairs & railings, for example).

    I think subdivision and displacement mapping on the GPU may be a valuable technology next generation, valuable because we can morph to a high quality LOD very cheaply. (perhaps 2 LODs for extreme close ups in cut scenes) Even if just for characters this would be great for keeping hardware utilisation high as characters run in/out of the camera.

    Turning up the resolution / MSAA will of course improve quad utilisation at the cost of additional bandwidth.

    It will be interesting to see how much bandwidth is the bottleneck; as I understand it, ALU speed is still scaling faster than bandwidth. Deferred rendering is a bandwidth eater. What we really want is more bandwidth so we can store more information and achieve more complex effects, all at a higher resolution.

    I’m considerably more interested in bandwidth improvements than ALU speed. Unfortunately, it isn’t likely that this is what I’m going to get.

    Concerning GPGPU, the GPU will remain a limited resource, and GPGPU will be hard to justify until all CPU cores are maxed out and the game is CPU bound. The CPU helping out the GPU requires something like CELL: certainly DMA’ing relatively large blocks of data around quickly rather than going through the traditional CPU cache hierarchy, which would be disastrous.

    The big problem with CELL is that unless you expend the effort to make use of all the cores all the time (not all game developers have the resources to do that), a proportion of your silicon is lying dormant. This is clearly inefficient / not good from the console manufacturer’s point of view.

    We have a need for traditional CPUs with deep cache hierarchies. We have a need for a massively parallel processor (GPU) which is good at handling rasterization. And we have a need for something in between which can:
    A. Help the CPU out with heavy number crunching tasks
    B. Move towards alternative rendering technologies

    I think there is a model we haven’t yet seen, though: the ability to pull off some, but not all, GPU cores to help with CPU tasks and return them to rendering when not needed. I suspect the threading model on GPUs is nearly sophisticated enough to do this; the difficult bit is the high-speed communication / low-latency synchronisation between CPU and GPU (very much like the RSX being able to read from the PPU cache, but two-way).

    This would give game developers the ability to grab GPU cores for vectorised number-crunching tasks (which they would have given to a CELL core on PS3), but have those cores busy helping out with rasterization when the CPU doesn’t need them.

    I am intrigued to see what AMD ‘Fusion’ turns out to be. NVidia are pushing GPGPU hard, and Intel have dabbled with Larrabee, I think partly in response to ‘Fusion’.

    Predictions for next generation:
    1. We will end up with more of the same: DX11/12-capable hardware, more ALU, more bandwidth, more programmability, still very largely rasterization.
    2. We might see some hybrid rendering, where non-rasterization-based rendering is used in very constrained / specialist circumstances.
    3. We might see something interesting for GPGPU: some closer tie between CPU and GPU.

    Next-next generation:
    1. Programmability, threading, etc. might have reached a point where fundamental changes in rendering are possible without having to jump through restrictive hoops.
    2. Whether a fundamental change in rendering is practical for a game will depend very much on a fundamental change in the way GPUs handle memory. They simply have to get better at random access, or we need new techniques for scene traversal / storage which are orders of magnitude more cache-efficient than the present best.

  4. I believe the development and research on raytracing and the use of the sparse voxel octree will be the next step in rendering. We are at a peak of rendering and the way we render needs to change for new ideas and techniques to be implemented.

  5. ps3 doesn’t use direct x anything

  6. Very insightful post.

    “Higher Quality Shadows: I don’t want anything crazy here. But if I could use the same shadows but afford more taps, I’d be a really happy guy. ”

    I find it funny that an Uncharted 2 graphics guy would put higher quality shadows on his wish list considering how good they look in UC2 =)

  7. @Martin: All good points. In general, ALU power is increasing faster than bandwidth, and that will have to drive a lot of our decisions. The Cell is great for companies (like ND) where we have the time to really optimize and take advantage of it. But most companies just don’t have the time to take advantage of it. Even the middleware companies generally spend very little time optimizing for PS3.

    @Ryan: How would you create a sparse voxel octree for a skinned mesh of 40k polys that is animating every frame?

    @Setsuna: True. Although the RSX is essentially a GeForce 7800. It has the functionality of a DX9 card.

    @Daniel: Developers definitely spend much more time focusing on the flaws in their games than the people who play them. (-:

  8. Some of the early YouTube videos of sparse voxel octrees showed rendering of a skinned character. I had to laugh at that choice: “go on then, animate it!”

  9. No kidding. A technique that runs “in realtime” is useless by itself. It’s only useful if it’s actually better. Would it be 10x faster? 10x slower? I haven’t seen a convincing argument that it would be 10x faster. Then again, maybe if we re-architected GPUs for them…

  10. I’d like to point out that it’s not necessary to have per-strand hair rendering; what matters is an end result indistinguishable from per-strand hair rendering. The same applies to many of the other questions raised.

    Video gaming has ALWAYS been about finding artistic and performance shortcuts that achieve the illusion of processing more than is really being processed. This has been true from the time of the first video games, e.g. using look-up tables rather than computing in real time.

  11. What other rendering methods have you seen that have hair that looks as good as the Nalu demo? I haven’t seen anything even close, but I’m open to suggestions.

  12. >What other rendering methods have you seen that have hair that looks as good as the Nalu demo?

    You’d be surprised; here’s another good one by NVIDIA:

  13. Just to note that the link in my last comment is also per strand rendering of sorts, so yeah more fuel for your argument 🙂

  14. Touche, salesman. (-:

  15. @Martin

    Actually, it is possible to animate sparse voxel octrees. They also enable unlimited geometric complexity and unlimited draw distances: real 3D grass stretching forever, not the fake quad grass we get now that pops in 10 feet in front of you. And you can guarantee a framerate by doing progressive rendering of the frame, then stopping at a certain number of milliseconds.

  16. Andrew: Of course, as we all know, there is a difference between “possible” and “fast enough for games”. Right now, we can’t even afford fake grass that pops in 10 feet in front of you. (-: In U2, the artists are constantly trying to minimize overdraw, and are even taking out things like specular. If you’re rendering at 1080p and you only have 3ms, how much grass can you render and how much RAM will it take?

    Btw, RAM will probably be tighter next-gen; processing power is increasing faster than RAM is. Depending on how you build your octree, that could be a huge issue as well. As you can probably tell, I’m not expecting anything meaningful from sparse voxel octrees for next-gen, but I might get proved wrong.

  17. @admin

    Yeah I agree, we probably won’t see any fully sparse voxel content next-gen, but perhaps a hybrid SVO-Polygon engine may be used?

    The best part of SVOs is really the way they simplify things: implicit LOD, unique geometry, unique texturing, all in one.

  18. Andrew, we’ll have to agree to disagree on this one. What I’d like to see is better ambient lighting and GI before going crazy on poly count. Then again, it would be cool to see someone find a good games use-case for this stuff and prove me wrong, despite how much I enjoy being right. (-: