Saturday, January 08, 2011

Using RedBean with CodeIgniter 1.7.3

I started playing around with CodeIgniter for a quick project and wanted a simple ORM for manipulating databases quickly. After looking briefly at RedBeanPHP, I decided to give it a spin. Boris Strahija had actually shown how to do this in the forums last year, but I just wanted to post my files for reference. This assumes the database.php file in your config dir is already pointing to a valid database. I was using MySQL.

  1. Download RedBeanPHP and drop the two files (._rb.php and rb.php) into system/application/libraries
  2. Create a library class called Redbean.php with the contents shown below, and drop it in the same directory
  3. Use it in your controller

Redbean.php
<?php
class Redbean
{
    public function __construct()
    {
        // Grab the CodeIgniter super object
        $this->ci =& get_instance();

        // Include database configuration
        include(APPPATH.'/config/database.php');

        // Include RedBeanPHP
        include(APPPATH.'/libraries/rb.php');

        // Database settings from config/database.php
        $hostname = $db[$active_group]['hostname'];
        $username = $db[$active_group]['username'];
        $password = $db[$active_group]['password'];
        $database = $db[$active_group]['database'];

        // Create the RedBean instance and attach it to the CI super object
        $toolbox = RedBean_Setup::kickstartDev('mysql:host='.$hostname.';dbname='.$database, $username, $password);
        $this->ci->rb = $toolbox->getRedBean();
    }
}

Using it somewhere in your controller (load the library first with $this->load->library('redbean'), or autoload it in config/autoload.php)
// example code of creating a 'post' table
$post = $this->rb->dispense("post");
$post->title = 'My first post from CodeIgniter';
$post->body ='Lorem ipsum dolor sit amet, consectetur adipisicing elit....';
$post->created = time();
$id = $this->rb->store($post); 

After running this controller, you should see a 'post' table created. Done!

What you should see in phpMyAdmin

Sunday, October 10, 2010

Incorporating Scala, Java, sbt, JOGL, Qt, and Ruby/Python

For several years now, I've been iterating on a small project that lets the user build a Sunflow scene file. When starting it, I tried to design the program with the right technologies for the task.

The language of choice
First, Sunflow is written in Java and as such runs on the JVM. Not only do I want to be able to build a Sunflow file, but I also want to render it interactively using Sunflow's libraries. There are many languages available on the JVM right now, but only a few sparked my interest. I had done everything in Java previously, which was okay but at times a little tedious. The languages I looked at as possible replacements were Scala and Clojure. Although I was very interested in Clojure, it felt too foreign to me at the time, and I worried I would end up doing something extremely naive in my design.

Scala is a statically-typed language whose syntax at times closely matches Java's. It has a rich type system and a large collection library with both mutable and immutable data structures, depending on what you need (functional programming, for example, favors the immutable ones). Scala treats its functions as first-class citizens, so they're easily passed around to other functions; I can even write a large anonymous function inline within another function call. Its syntax is also slightly less verbose than Java's, and it provides an excellent getter/setter mechanism that's extremely clean.
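As a quick illustration of those two points (this is just a toy sketch, not code from the project; the Camera class and its fields are made up), here is what first-class functions and the generated getters/setters look like:

// Toy Scala sketch (not from the project): first-class functions and the
// getter/setter support mentioned above. The Camera class is made up.
object ScalaFeatures {
  // A function value that can be passed around like any other object.
  val square = (x: Int) => x * x

  // A higher-order function that takes another function as a parameter.
  def applyTwice(f: Int => Int, x: Int) = f(f(x))

  class Camera {
    // 'var' generates a getter (fov) and setter (fov_=) automatically.
    var fov = 45.0

    // A custom setter that clamps the value but is still used like a field.
    private var _aspect = 1.0
    def aspect = _aspect
    def aspect_=(a: Double) { _aspect = math.max(0.1, a) }
  }

  def main(args: Array[String]) {
    println(applyTwice(square, 3))     // 81
    println(applyTwice(x => x + 1, 3)) // an anonymous function inline: 5

    val cam = new Camera
    cam.fov = 60.0    // generated setter
    cam.aspect = -2.0 // custom setter clamps to 0.1
    println(cam.fov + " " + cam.aspect)
  }
}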

Scala isn't all roses, however. First, Scala seems to be growing faster than my app, and as I've tried to stay up to date with the latest features, it has broken compatibility in significant ways, which has been a distraction to fix. Second, some Scala code is illegible to me. It has a complicated type system and occasionally odd symbols that seem to be pulled from esoteric languages for their own sake rather than for a simpler style. My own Scala is probably pretty Java-like, so it's fairly easy to read.

Building with Scala
Another advantage of Scala is its ability to cross-compile with Java. I was previously using Ant, which was my first foray into a Java build system. I don't like working in XML for a myriad of reasons, and after the Ant Scala compiler task started throwing bogus errors, I was advised to move over to the simple-build-tool, or sbt.

After adding the launcher to my classpath, sbt allowed me to quickly set up a new project, whose project file is written in Scala, a much nicer way to configure my build. I simply put all my Scala code in ./src/scala and all my Java code in ./src/java, and sbt compiles them together. Although it took a moderate amount of time to set up my project, in the end sbt has been a joy to work with.

JOGL
On the JVM, OpenGL rendering goes through JOGL. This can be tedious to set up, as I'm used to just tossing a jar onto the classpath and having my build system include it. JOGL, however, also requires native libraries, which need to be added separately to the Java library path. This took a while for me to figure out, but ended up being a simple setup in the end.

import sbt._
import java.io.File

class SunshineProject(info:ProjectInfo) extends DefaultProject(info)
{
  // tells sbt which class to run
  override def mainClass:Option[String] = Some("com.googlecode.sunshine.Sunshine")

  def nativeJOGL:String = {
    var os = System.getProperty("os.name").toLowerCase
    var arch = System.getProperty("os.arch").toLowerCase

    // this is to match the JOGL builds
    if (arch.matches("i386")) arch = "i586"

    if (os.contains("windows")) {
      os = "windows"
      arch = "i586"
    }
    println("OS: %s".format(os))
    println("JOGL Path: %s".format("./lib/jogl-2.0-%s-%s".format(os, arch)))

    "./lib/jogl-2.0-%s-%s".format(os, arch)
  }

  override def fork = forkRun("-Djava.library.path=%s".format(nativeJOGL) :: Nil)
}


All this basically does is query the OS and architecture from the JVM and add the respective JOGL directory to the library path. This required me to fork the JVM, as adding paths to the already-running JVM doesn't seem to work. Notice I just had to override fork and sbt knows I want to fork when running; overriding the run command alone won't fork the JVM on startup. I did something similar in Ant, but it was much longer and more difficult to read.

Using Qt
Java Swing is usually an okay GUI library for my small Java projects, but it's slightly cumbersome when trying to do something complex. Trolltech's Qt is a framework that has continually gained popularity over the years for its great documentation, intuitive API, and its event handling system.

QtJambi is the Java binding for the Qt libraries. A few months ago Trolltech dropped support for QtJambi but handed it off to the community to continue updating. So far they seem to be doing a decent job, although they have been continually asking for help.

sbt supports automatic library management via Apache Ivy. Instead of shipping every build of Qt for a given architecture along with my project, I can set Qt up as a managed library. By pointing sbt at the QtJambi repository, it will fetch the jars automatically when it resolves dependencies.

val qtDepSnapshots = "Qt Maven2 Snapshots Repository" at "http://qtjambi.sourceforge.net/maven2/"
val qtDep = "net.sf.qtjambi" % "qtjambi" % "4.5.2_01"
val qtjambiBase = "net.sf.qtjambi" % "qtjambi-base-linux32" % "4.5.2_01"
val qtjambiPlatform = "net.sf.qtjambi" % "qtjambi-platform-linux32" % "4.5.2_01"

Right now, I'm only fetching the Linux x86 libraries, as that's what I'm developing on. Adding the block above directly to my build class tells sbt to grab them for me. There's a bit of magic going on here, as I don't understand how sbt knows these vals are library dependencies rather than just values I've created in my program. Regardless, it's enough to get Qt downloaded and onto the classpath.

Incorporating a scripting language
At this point, I could just start coding my application, but since I'm doing a lot of designing on the fly, it takes a long time to compile the application just to see a small change. I wanted to incorporate a scripting language that lets me make changes to the interface quickly without exiting the program.

As much as I complain about Python, I use it a lot at work and am fairly productive in it. The Jython project provides a Python implementation that runs on the JVM. I had used it about a year ago but had to scrap it because the latency was too high. I've heard it's gotten significantly faster recently, so I gave it another shot and indeed found it much faster. Working with Qt, however, turned up a blocking bug.

[error] Exception caught after invoking slot
[error] Traceback (most recent call last):
[error]   File "", line 15, in 
[error]   File "", line 9, in __init__
[error] TypeError: Proxy instance reused


I initially wondered whether driving Qt from a scripting language simply wasn't stable, but this turned out to be a Jython-only issue. I reported it to the Jython team, who seem to be resolving it for their upcoming release.

In the meantime, I thought I would give Ruby a spin. I'm rather unfamiliar with the Ruby language, although I haven't been oblivious to the huge success it has garnered in the web community with Ruby on Rails. I've also heard great things about JRuby, the Ruby implementation on the JVM. At one point it was actually faster than the reference C implementation of Ruby, although I'm not sure that's necessarily true anymore.

The Ruby language seems to sit somewhere between Python and Perl, although that's another comment that will possibly get me shot by a Ruby developer. It's a purely object-oriented language--more so than Python--and provides a bit more syntactic flexibility than Python (for better or for worse), including some parsing syntax borrowed from Perl.

import javax.script.ScriptEngineManager
import java.io.InputStreamReader

val factory = new ScriptEngineManager()
val ruby = factory.getEngineByName("jruby")

val urlFile = "clear_scene.rb"
val url = getClass().getResource(urlFile)

ruby.eval(new InputStreamReader(url.openStream()))

In my main Scala code, I open a Qt main window and then hand things over to JRuby. My Ruby code clears out the window and fills in the UI programmatically. I can add menus and event handlers as I go, and when I'm ready to see a change, I simply restart the Ruby evaluator and my UI is rebuilt instantly with no recompilation. I see essentially no latency as I continually reload my Ruby code and watch my interface change on the fly. I plan to do most of my designing in Ruby and gradually move the classes I create over to the Scala side once they become stable, for a performance boost.
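The reload itself is nothing fancy; conceptually it's just re-running the eval above whenever I want the UI rebuilt. A rough sketch (the UiLoader class and its reload method are made up for illustration):

import javax.script.{ScriptEngine, ScriptEngineManager}
import java.io.InputStreamReader

// Illustrative sketch: rebuild the UI by re-evaluating the Ruby script.
// The UiLoader name and reload method are made up; clear_scene.rb is the
// script from the snippet above.
class UiLoader {
  private val factory = new ScriptEngineManager()

  def reload(resource: String = "clear_scene.rb") {
    // Use a fresh engine each time so stale Ruby state doesn't linger.
    val ruby: ScriptEngine = factory.getEngineByName("jruby")
    val url = getClass().getResource(resource)
    ruby.eval(new InputStreamReader(url.openStream()))
  }
}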

I like to keep one terminal open running sbt with the ~copy-resources action, which continually copies my Ruby changes over to the build path, while the other terminal compiles and runs the app as I work through the code changes in Scala/Ruby.

I've just barely started with it, but I've been enjoying Ruby for the most part. A Ruby developer would probably say I program like a Python programmer (and a Scala developer would probably say my Scala looks like Java). Anyway, I haven't hit any roadblocks importing the various Qt or Scala classes, aside from two minor inconveniences.

Scala hash maps
The classes I've built in Scala compile down to bytecode, and I've been able to use them from Java and JRuby without any problem. However, I use a few hash maps in Scala, and I'd like to be able to iterate through them in my Ruby code. JRuby provides hooks for iterating through Java collections, but not Scala ones, so I made a simple wrapper around my Scala hash maps so I can iterate through them in Ruby normally.

# Wraps a Scala HashMap so a Ruby each/block works on it. This relies on the
# map being keyed 0...size-1; Scala's get returns an Option, so the trailing
# .get unwraps the value.
class ScalaHashMap
  def initialize(hash)
    @hash = hash
  end

  def each
    @hash.size().times{|key| yield key,@hash.get(key).get}
  end
end

Getting QtJambi to see the JOGL Context
JOGL provides some useful widgets that tie directly into Java Swing, but no such widgets exist for Qt. As such, it was slightly confusing trying to find documentation on getting a GL3 context from JOGL while inside a QGLWidget.

class PanelGL < QGLWidget   
  def initialize(parent = nil)     
    super     
    @camera = Register.cameras.get(0).get      
    profile = GLProfile.get(GLProfile::GL3)     
    glCaps = GLCapabilities.new(profile)     
    glCaps.setPBuffer true     
    @pBuffer = GLDrawableFactory.getFactory(profile).createGLPbuffer(glCaps,DefaultGLCapabilitiesChooser.new(),       1, 1, nil)   
  end   
  attr_reader :camera    

  def initializeGL     
    @ctx = @pBuffer.getContext()   
  end   
  attr_reader :gl    

  def resizeGL(width,height)     
    @ctx.makeCurrent() # this line isn't required for the JOGL Swing components     
    gl = @pBuffer.getContext().getGL.getGL3     
    gl.glViewport(0,0,width,height)   
  end    

  def paintGL     
    @gl = @pBuffer.getContext().getGL.getGL3     
    @gl.glClearColor(0.3, 0.3, 0.3, 0.0)     
    @gl.glClear(GL.GL_COLOR_BUFFER_BIT | GL.GL_DEPTH_BUFFER_BIT)                  
    @gl.glEnable(GL.GL_DEPTH_TEST)      

    ...      

    @gl.glDisable(GL.GL_DEPTH_TEST)   
  end 
end 

I first create a pbuffer so I can use its context across GL widgets and share things like display lists and VBOs. The big difference in the Qt code is having to call makeCurrent() in the first GL callback my program executes, which happens to be resizeGL. Calling makeCurrent() there beforehand makes sure the GL context is running on the same thread as the Qt GUI (or vice versa, I guess).

Conclusion
That's my basic setup and now I've got a lot of coding to do.

Tuesday, January 26, 2010

Example of Cache Coherency

In my almost-popular article The State of Ray Tracing in Games, I frequently mentioned that rasterization is far more coherent than ray tracing, which is pretty incoherent in many respects. I think programmers often look at algorithms at a very high level, focusing on the big-O, which is important, but I wanted to show how coherency can directly and significantly affect run-time performance.

First, what is coherency? One dictionary defines coherent as being logically or aesthetically ordered or integrated. The word comes up frequently when someone hears a disorganized speech or discussion, where the speaker jumps back and forth between topics. No one sentence or idea is wrong per se, but as a listener it's easier to follow ideas bundled together by topic instead of having to jump around.

Cache coherency diagram I stole from wikipedia



Processors are incredibly complex nowadays, but generally speaking there are the registers, which the operations work on; the memory, which holds the data and the program; and the cache, which sits in between. One purpose of the cache is to provide a buffer of surrounding data so the processor doesn't have to go all the way back to system memory to get it. That trip might seem relatively fast, but it's pretty slow compared to just looking in the cache.

The Diligent Secretary Example
Imagine I'm typing at my office desk and want to know the names on a numbered list down the hall. I send my secretary down the hall to pull a name off the list and come back with it. I use the name and move on to the next one. This trip down the hall is relatively slow, since I'm waiting for the secretary to come back, and it doesn't make sense for him/her to go all the way down and come back with just one name--even if at that moment I only want one name. My secretary, being the witty person that he/she is, writes down ten names off the long list and comes back. I ask for the requested name and he/she gives it to me. I ask for the next name, and the secretary instantly gives it to me without going down the hall again. This happens eight more times before he/she has to go down the hall to get more names off the list, but he/she has saved both of us lots of time just by grabbing the names around what I wanted. This is coherency. Because I was asking for names in order down the list, my secretary could easily cache extra names and save the trip down there. It didn't work every time, but when it did, it was significantly faster.

Now, what happens if I ask for the names in an incoherent manner? My secretary comes back with the first group of names. I ask for name #3, which he/she quickly gives me. I then ask for name #47. My secretary sees he/she only has 1-10 on the list, so he/she needs to make another trip. This time my secretary remembers names 40-49 and comes back. I then ask for #212. Once again the secretary has to go back to the list down the hall. I'm not asking for anything more complex; I'm just asking in a seemingly incoherent order. The same phenomenon is easy to demonstrate in a real-world example.

Real Example
Just as the secretary grabs the names around the one I asked for in the hope of avoiding another trip down the hall, processors will often pull in blocks or pages of memory around what the program requested, hoping to avoid another trip to system memory. If I ask for data close (in terms of memory layout) to what I previously requested, there's a good chance the cache will already have it and can provide it faster.

Here's a really simple Python script that adds up a bunch of integers in a coherent and an incoherent manner. In the coherent function, the numbers are added up sequentially based on their order in memory. In the incoherent function, the numbers are pulled from memory in a random order. Both functions produce the exact same result, but with a noticeable difference in performance. Let's see how the two compare for different sizes of data.

import random, time

n = 10000000

# make an array of coherent and incoherent indices
coherentIndices = [number for number in xrange(0,n)]
incoherentIndices = [number for number in xrange(0,n)]
random.shuffle(incoherentIndices)

def print_timing(func):
    def wrapper(*arg):
        t1 = time.time()
        res = func(*arg)
        t2 = time.time()
        print '%s took %0.3f ms' % (func.func_name, (t2-t1)*1000.0)
        return res
    return wrapper

# create some random values
values = []
for i in xrange(0,n):
    values.append(random.randint(-10,10))


# add up all the numbers coherently
@print_timing
def coherentAdding():
    total = 0
    for i in xrange(0,n):
        total += values[coherentIndices[i]]

# add up all the numbers incoherently
@print_timing
def incoherentAdding():
    total = 0
    for i in xrange(0,n):
        total += values[incoherentIndices[i]]

coherentAdding()
incoherentAdding()

Results

With a quick graph using Many Eyes, you can see the performance difference. For small n, the difference between coherent and incoherent access is negligible. In the secretary example, this would be the equivalent of the secretary having all the names in his/her head, so that regardless of the order I ask for them, no extra trips down the hall are necessary.

As the problem scales, however, the memory fetching becomes less and less efficient. For the largest test I ran, incoherent access took almost 13 seconds longer than the 8 seconds coherent access took. In this simple example, where both functions run in O(n), it's easy to see that coherency makes a huge difference.

Relevance to Ray Tracing
From a memory perspective, ray tracing is incredibly incoherent. Firing a ray from the camera to shade a pixel, the next ray over might hit different geometry, requiring different textures and instructions to be fetched. That's bad geometry and texture coherency. Plus, ray tracing is touted for its secondary effects like reflections and refractions, and those rays are extremely incoherent: one ray might self-intersect the object being shaded while the ray next to it flies off into a distant corner of the scene. Rasterization, on the other hand, is extremely coherent. A model is pulled in once per pass and thrown out again. When rasterizing textured surfaces, the texture cache can grab large chunks of texture because there's a good chance the next fragment being rendered will hit that cache. In fact, the biggest reason rasterization is so much faster than ray tracing is probably its coherency; it makes the hardware much easier and cheaper to build.

Sunday, December 27, 2009

ngons: Are they really that evil?

Playing around with polygons and trying to write code for them, I've seen a variety of problems that can occur when working with polygonal geometry. More than once I've hit a snag in my program, scratched my head, and thought, "How do the big guys handle that?" Then I crack open Maya, build the same topology, and see it break there as well. What I've come to find is that different modeling applications handle polygons in different ways--some differences are deficiencies and some are just design choices.

Problem with Tris and Quads
A polygon, as most people know, requires at least three points, which of course makes a triangle. A triangle is an ideal planar figure: as long as the three points don't all lie on the same line, they form a plane. No matter where you move the points, they will always form a plane, and interpolating attributes between the points is easy. Triangles are simple to rasterize and ray trace, and they are pretty much universal across modeling applications.

In geometry a polygon (pronounced /ˈpɒlɪɡɒn/) is traditionally a plane figure that is bounded by a closed path or circuit, composed of a finite sequence of straight line segments (i.e., by a closed polygonal chain). -http://en.wikipedia.org/wiki/Polygon

Stepping up to four points already begins to create problems for modeling applications, because unlike a triangle, the four points don't all have to lie on the same plane. The modeler can be restrictive and require the user to keep all four points on the same plane (making things extremely difficult for the artist), or it can make certain assumptions about how to render a non-planar quad, which may or may not be what the artist wants. There's no set definition of how to manipulate or render a non-planar polygon.
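For the curious, checking whether a quad is even planar comes down to a scalar triple product: the fourth point has to lie in the plane spanned by the first three. A minimal sketch, with an illustrative Vec3 type and tolerance:

// Minimal sketch of a planarity test for a quad (illustrative Vec3 type).
case class Vec3(x: Double, y: Double, z: Double) {
  def -(o: Vec3) = Vec3(x - o.x, y - o.y, z - o.z)
  def cross(o: Vec3) = Vec3(y * o.z - z * o.y, z * o.x - x * o.z, x * o.y - y * o.x)
  def dot(o: Vec3) = x * o.x + y * o.y + z * o.z
}

// A quad p0-p1-p2-p3 is planar when (p1-p0) x (p2-p0) . (p3-p0) is zero
// (within a tolerance): the triple product is the volume of the tetrahedron
// the four points span, and zero volume means they share a plane.
def isPlanar(p0: Vec3, p1: Vec3, p2: Vec3, p3: Vec3, eps: Double = 1e-9): Boolean =
  math.abs(((p1 - p0) cross (p2 - p0)) dot (p3 - p0)) < eps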

Take a look at the examples below. In the first two images, the user grabs two vertices at opposite corners of the cube's top face and pulls down, forming what looks like a roof. For many people this may be what they expect, as the program--Blender in this case--just treats the quad as two triangles with the shared edge running across the two unselected vertices. This may seem perfectly acceptable, but look at the second example. Now we grab the other two vertices and pull down. Looking at the first example, one might assume this will also form the roof, but instead it forms some kind of double steeple with a rain gutter in between.

Pulling down two opposite vertices


Pulling down the other two vertices



What's going on? Well, Blender just knows it has to render this quad face, so it looks at the first vertex on the face and says, "Okay, let's render these as triangles, since rendering non-planar quads doesn't make sense and a planar quad is just two triangles anyway." It then takes the next two vertices around the face--the one that was pulled down and the opposite unselected vertex--says, "That's three," and renders a nice roof. If you pull the other two vertices down, as in the second example, Blender starts rendering at the same vertex as before, but goes up to the unselected vertex and then back down to the other selected vertex, forming that steeple/trench.

Here's the question: which one is correct? If you pick one and decide that's what the program should always do, you've restricted the artist from ever making the other shape easily. You may think Blender isn't very smart for not knowing how to render non-planar polygons, but that wouldn't be fair. There's no rule. If some program makes a certain decision about how to render it, that's just how it chose to deal with it. That doesn't make its solution correct, or Blender's solution incorrect.

You may be thinking, "Hmm, I bet I could write an algorithm that changes the triangulation order to guess what the artist wants..." It's been done. If you go into Maya and try this example, you can actually get the model to "pop" between these two shapes as Maya tries to figure out, by some threshold, what you're trying to do. Is Maya right and Blender wrong? Like I said, there's no rule for handling non-planar polygons. If you really want to bend a quad into two triangles with guaranteed results, just make the two triangles before you bend them (as shown below). Now there's no ambiguity at all: the computer doesn't have to make any decisions about bending anything. You're just moving the points of a triangle, and it knows how to do that, no questions asked.



Not only do quads introduce problems like these when they're non-planar, but what happens when the quad is not convex? In the example below, I have pulled a vertex over so that the resulting polygon is concave. If I had grabbed a different vertex, the renderer might have done what you expected and rendered two nice triangles inside the quad. But because I grabbed that vertex, the renderer made a triangle where it shouldn't have, and you see visual tearing where one triangle sits on top of another.



Here you might say, "Well, if the Blender programmers made their modeler smarter, it could just look at that quad, see that it's concave, and render it with the correct triangles." That's certainly a valid point, but it takes processor time to figure out. Assuming the polygon is planar, which as I already stated is iffy, the algorithm has to look at the quad and work out how to divide it up without going outside the polygon. Maya does this, and it may seem like a great idea, but that's processing power that could be spent doing other things, like rendering larger meshes. Which would you prefer?
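To make that check concrete, here's a rough sketch of the decision a modeler has to make for every quad: pick the diagonal that keeps both triangles inside the (possibly concave) polygon. It reuses the illustrative Vec3 from the earlier sketch and only handles a single quad, nothing more general:

// Rough sketch: triangulate a single quad so both triangles stay inside it,
// even when one corner is concave. Uses the illustrative Vec3 from above.
def triangulateQuad(p: Array[Vec3]): Seq[(Int, Int, Int)] = {
  require(p.length == 4)

  // Estimate the face normal with Newell's method (robust for non-planar quads).
  val n = (0 until 4).map { i =>
    val a = p(i); val b = p((i + 1) % 4)
    Vec3((a.y - b.y) * (a.z + b.z), (a.z - b.z) * (a.x + b.x), (a.x - b.x) * (a.y + b.y))
  }.reduce((u, v) => Vec3(u.x + v.x, u.y + v.y, u.z + v.z))

  // A corner is convex if its winding agrees with the face normal.
  def convexAt(i: Int): Boolean = {
    val prev = p((i + 3) % 4); val cur = p(i); val next = p((i + 1) % 4)
    (((cur - prev) cross (next - cur)) dot n) >= 0
  }

  // The splitting diagonal has to start at the concave corner, if there is
  // one; otherwise either diagonal is fine.
  if (!convexAt(1) || !convexAt(3)) Seq((1, 2, 3), (1, 3, 0))
  else Seq((0, 1, 2), (0, 2, 3))
}

Even this simplified version computes a face normal and tests corners for every quad, which is exactly the kind of per-polygon work the paragraph above is talking about; a general concave ngon needs something like full ear clipping.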

No standards across programs
Not only do the designers have to decide on and implement these rules, but what happens when the model leaves the program? Bending a quad may give you what you want in the modeler, but there's no guarantee the program you're sending that quad to handles it the exact same way. Take a ray tracer, for instance. I've never seen a ray intersection routine that handles a generic four-sided polygon; it will usually divide the quad into triangles, as a rasterizer would. There's no "correct" way to handle non-planar or concave polygons, so when you decide to model that way, you might get bitten later on. Some programs might handle these issues better than others, but they're all just making an arbitrary decision.

N-gons are evil
I have shown that just moving from a triangle to a quad opens up a whole set of problems--one little extra vertex now allows an artist to create non-planar, concave polygons. "Yuck!", I hope you are saying to yourself. What about ngons? Ngons are typically what people call polygons with five or more sides. They have all the same problems as quads, only with infinitely more ways to go wrong. Someone who has never used ngons may ask, "Well, why don't we just stay far away from ngons to avoid the problems?" That is yet another design decision. Here are some reasons why ngons should or should not be included in a modeler.

ngons are good
  • "clean" topology - when working with polygons that are going to be planar anyway, ngons don't display a bunch of edges across them, which makes the model look a little cleaner
  • nice cuts - some tools create n-sided polygons. Although they may create poor geometry, they can be extremely convenient
  • power to the artist - allowing ngons in a modeler lets the artist work more efficiently, even if it may bite him/her later. Plus, once they're ready to export, they can always clean up the mesh (some artists prefer this workflow)

An image I stole from the Blender forums. The extrude-vertex tool creates two visible ngons, which I've marked with a red dot. These ngons are very "clean" in the sense that they are planar and convex, and thus will break down into triangles without any funky results


Two images I stole from blender3darchitect.com. The artist has replaced the distracting quads over the arches with a single concave ngon. Whether it is more pleasing to the eye is debatable--the former looks like keystones to me. The artist will probably have to clean that up somehow (like triangulating it) before exporting the model


ngons are bad
  • no standards - there is no guarantee that ngons will import/export as expected. I have read that Cinema 4D, for example, supports ngons but triangulates them on import
  • subdivision - working with realtime subdivision is very popular nowadays, but how do you subdivide an ngon? Most algorithms expect fairly strict mesh topology--the popular Catmull-Clark scheme, for instance, is usually implemented around quads. People have published extensions to make these algorithms more robust, but it's up to the designer to pick which one is best for them
  • tool support - ngons require more complex data structures than quads or triangles, meaning tools that work with the polygons have to be smarter--some tools simply break or don't handle ngons at all
  • processing power - nothing is free, and at some point the program has to decide how to render the ngon to the screen. If it's a concave polygon, the program has to divide it into triangles that stay inside the original polygon. Would the artist rather spend that processing power on something else?

What's the conclusion?
I can't take sides on how important or useful ngons are. I've heard several times that once an artist models with ngons, it's hard to go without them. I'm not sure whether that's a good thing or not. Ngons seem to make things easier at certain stages of modeling, while they often need to be cleaned up later in the pipeline. Working with just triangles and quads can be extremely difficult in many situations, but several powerful tools like loop cut and loop select only work because of the special nature of quad loops, which is why some artists try to keep the model all quads for as long as possible.

Below is a screenshot from CGSociety, which has a wiki comparing the various modeling programs. It's apparent that nearly every application now supports ngons. ZBrush is excused, since I'm sure it has insane data structures to handle what only ZBrush can handle, and Blender supposedly has ngon support on the way. So regardless of whether an artist needs ngons or should use them, support for them is, or soon will be, pretty much standard.

Table of modeling applications that provide ngon support taken from CGSociety's wiki

Wednesday, October 21, 2009

Next-Gen 3D Rendering Technology: Voxel Ray Casting

Tom's Hardware recently did a follow-up to their ray tracing article, this time specifically on voxel ray casting. I thought it was a good review of the technology that did a good job presenting the pros and cons.

One of the most critical bottlenecks in any rendering technique is managing coherency. The scene in a typical game may contain gigabytes of geometry and textures, and it is important that most of the time isn't wasted pulling that data in and out of the cache.

No graphics article or post would be complete without at least one comment in the discussion section referencing The Matrix and how close it is to becoming a reality. I guess that's optimistic, considering we can't even predict the weather or build self-driving cars. I did see a few other comments that I thought deserved a bit more follow-up.

kikireeki 10/21/2009 7:28 PM
What happened to Normal Mapping?

Normal mapping is used when one wants to render limited geometry but apply a variety of normals to the otherwise flat surface, giving the illusion of detail. The point of voxel ray casting is to use truly large amounts of geometry to do that job instead; normal mapping can't fix jagged silhouettes, for instance. Each corner of a voxel has a stored normal value, and since these voxels will be smaller than a pixel when evaluated, the normal interpolated between those eight points is satisfactory.
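For reference, interpolating between the eight corner normals is just standard trilinear interpolation. A minimal sketch, with an illustrative Vec3 type and the sample's fractional position inside the voxel given as tx, ty, tz:

// Minimal sketch of trilinear interpolation of a voxel's eight corner
// normals. c is indexed c(x)(y)(z) with x, y, z in {0, 1}; tx, ty, tz are
// the sample's fractional position inside the voxel, each in [0, 1].
case class Vec3(x: Double, y: Double, z: Double) {
  def +(o: Vec3) = Vec3(x + o.x, y + o.y, z + o.z)
  def *(s: Double) = Vec3(x * s, y * s, z * s)
  def normalized = { val l = math.sqrt(x * x + y * y + z * z); Vec3(x / l, y / l, z / l) }
}

def lerp(a: Vec3, b: Vec3, t: Double) = a * (1 - t) + b * t

def trilinearNormal(c: Array[Array[Array[Vec3]]], tx: Double, ty: Double, tz: Double): Vec3 = {
  // Interpolate along x on the four voxel edges, then along y, then along z.
  val x00 = lerp(c(0)(0)(0), c(1)(0)(0), tx)
  val x10 = lerp(c(0)(1)(0), c(1)(1)(0), tx)
  val x01 = lerp(c(0)(0)(1), c(1)(0)(1), tx)
  val x11 = lerp(c(0)(1)(1), c(1)(1)(1), tx)
  val y0 = lerp(x00, x10, ty)
  val y1 = lerp(x01, x11, ty)
  lerp(y0, y1, tz).normalized // renormalize, since lerped unit normals shrink
}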

spiketheaardvark 10/21/2009 8:43 PM
I wish the article discussed how such a system would handle transparent objects and refracted light sources, such as an image of a glass of water.

It doesn't. Big disappointment, I guess. If it could, it would be sparse-voxel ray tracing, wouldn't it? The idea of ray casting is that you only fire rays from the camera directly into the scene--no secondary rays. A big reason ray tracing is so slow is that when secondary rays are fired to other parts of the scene for things like transparency, reflection, and shadows, you have to go get that geometry to intersect against and shade (including fetching the associated textures). This is really difficult when rays are shooting off in random directions, requiring everything to be paged in just to render a metallic sphere. If you have multiple reflections around the scene, there's going to be a lot of paging going on. And if you don't fire enough rays for something like color bleeding, you get something like this:



Ray casting, like rasterization, is very coherent. With rasterization, the application streams in geometry piece by piece and hopefully only has to do it once. With ray casting, the part of the octree that covers a given area of the screen can be paged in, ray cast, and paged out of memory. If you want to add secondary effects, many secondary rays leave the paged-in part of the octree and need the geometry for that part of the scene brought in, and another secondary ray from the next pixel might shoot into a different part of the scene, requiring yet another fetch. That's why voxelized octrees won't get us any closer to ray-tracing-style secondary effects. It's just to give us more geometry. It's just to give us more geometry! IT'S JUST TO GIVE US MORE GEOMETRY! I'm happy enough with that, even if it is only static geometry.

Here's Jon Olick's demo video from SIGGRAPH 2008. If you're wondering what those funky blocks are, they're the individual blocks of the octree being paged in and out of memory.


Adroid 10/22/2009 1:32 AM
Great Article. Its an interesting point that about 10 years ago the whole market went to polygons instead of voxels. Who knows where we would be if the industry went that direction. Nowadays processors are becoming more and more capable of such a feat.

I think the sweet spot is finding a mix of both... Maximizing both processor and GPU use in games. Its a heck of alot of programming, but thats why they get paid the big bucks to make games we love right?


If the whole industry had gone this way 10 years ago, we'd be playing a lot more beautiful chess games and fewer first-person shooters. I was at the conference when Jon Olick presented. Of course, someone had to ask, "How do you handle animations and rebuild your octree?" He simply replied that they're not out to solve a problem ray tracing researchers have been investigating for years. This only works on static geometry. This only works for static geometry! THIS ONLY WORKS FOR STATIC GEOMETRY! To get these voxels, they model in polygons and then sample the polygons into voxels. To do this in a game with animations, they'd have to animate the character in polygons, voxelize it, and render it again to get all the data back. And from what Jon Olick has mentioned, voxelizing is too slow for real time and has to be precomputed.

The idea is certainly to find the sweet spot: rasterize animated geometry and ray cast static geometry. Developers can still model and rig everything with polygons, and the engine can voxelize anything declared static. Voxelized objects lose the ability to deform, but they render faster. They can use the same surface shaders and even the same shadow maps, too. It can be completely transparent to the developer using the engine, aside from choosing what's static and what's dynamic. If the people excitedly pushing this say it's not going to completely replace rasterization, they have no reason to lie, right? I personally still think the GPU is best suited for rendering, as most games now, like Crysis, are already CPU-limited doing other things like AI, physics, etc.

Here's the full PDF of Jon Olick's slides. They're pretty dense and don't have many shiny pictures, but it was a great presentation. The best part of all this is that the technology is available now on current hardware and will become more prominent as more developers grow proficient with it.

Saturday, August 15, 2009

Step into Interactive Ray Tracing

I overheard someone comment recently that ray tracing is the next logical step in interactive rendering. Although this is surely a popular idea, he continued that switching to a ray tracer soon, before secondary effects (like reflections and refractions) are practical, would be worthwhile. His argument was that since ray tracers can use the same graphical "hacks" like environment maps and shadow maps to match rasterization, developers could simply swap a given hack for its physically-accurate equivalent when more performance comes down the road. Why not, right? Well, that's a tough pill to swallow.

Games are bottlenecking
The fact of the matter is that interactive rendering--especially in games--is a very performance-intensive medium. As opposed to most applications like a word processor or a database, every AAA game title is developed to squeeze every last drop of performance out of the CPU and GPU. If the developers were only getting 50% CPU/GPU utilization, they would naturally tweak the game to get more visual quality out of the system. This makes sense considering most players aren't multi-tasking between applications while running these programs and can dedicate the hardware to that one game.

This is the problem with transitioning to ray tracing: moving to ray tracing while providing the same quality of effects adds more work to an already bottlenecked system. Ray tracing interactive scenes is inherently slower for the same perceptive quality. Optics, BRDFs, and light-transport-related topics have been researched for a very long time; today's ray tracing researchers are focused on speed. Problems include rebuilding dynamic acceleration structures, corralling rays into coherent packets, and getting all the operations to march in lock step on these SIMD-plentiful architectures. It's all about performance. The only time games will adopt ray tracing is when it can do the same important effects as efficiently as rasterization. Using rasterization techniques in a ray tracer is still slower than rasterization: one still has to build acceleration structures, one still has to do crazy optimizations, and one still has to deal with the incoherent texture and geometry accesses inherent in ray tracing. So why on Earth would any developer opt for a ray tracer with half the performance for the same graphics just so that, fifty years down the road, it's easier to add secondary effects? Blah.

Rasterization's Bang for the Buck
I've stated this before, but many production companies still use rasterization because they care about visual quality. Pixar and DreamWorks, for example, still rasterize the scene first before adding any secondary effects. After all these years, these companies are still using basic shadow maps to shade direct illumination--and it's their business to make pretty pictures! The only people saying ray-traced games can do soft shadows interactively are ray tracing researchers and marketers. I could write a rasterizer that does physically-accurate shadows, but it wouldn't be interactive either. That's why I don't brag about it.

With the current growth of processing power, I'm guessing that sometime in my life I will see ray tracing become the normal rendering technique in interactive computer graphics, but I doubt it will be within the next thirty years. Rasterization will only fall behind the second it stops improving.

Five-Years Difference
Rasterization is moving at an incredible rate. In a separate article, I compared screenshots of two games only five years apart: Battlefield 1942, released in 2002, and Crysis, released only five years later. This is what ray tracing is competing with: a rapid, cut-throat industry where GPUs are getting more and more powerful and developers are creating more and more dazzling hacks. Years from now, when a ray tracer can produce images comparable to Crysis, we'll be looking back wondering how we ever tolerated such mediocre graphics.




What's coming down the pipeline?
Interactive rasterization certainly hasn't stopped moving and will continue to provide a lot of one-sided competition for ray tracers in games.

Here are some things coming down the pipeline that we can look forward to, and that will continue to choke up interactive ray tracers:


  • Highly-tessellated subdivision surfaces
  • Movie-quality anti-aliasing
  • Realistic hair
  • Interactive deforming



Thursday, August 13, 2009

Shortest Ray Tracer In the World (not really)

I saw Andrew Kensler's newest back-of-a-business-card ray tracer. It's some pretty short and simple code that renders his initials in reflective spheres. I'm not sure he could shorten the code any further without it losing all coherence.

This took 2 minutes on my workstation


And here's the code (butchered by my browser)

#include <stdlib.h> // card > aek.ppm
#include <stdio.h>
#include <math.h>
typedef int i;typedef float f;struct v{
f x,y,z;v operator+(v r){return v(x+r.x
,y+r.y,z+r.z);}v operator*(f r){return
v(x*r,y*r,z*r);}f operator%(v r){return
x*r.x+y*r.y+z*r.z;}v(){}v operator^(v r
){return v(y*r.z-z*r.y,z*r.x-x*r.z,x*r.
y-y*r.x);}v(f a,f b,f c){x=a;y=b;z=c;}v
operator!(){return*this*(1/sqrt(*this%*
this));}};i G[]={247570,280596,280600,
249748,18578,18577,231184,16,16};f R(){
return(f)rand()/RAND_MAX;}i T(v o,v d,f
&t,v&n){t=1e9;i m=0;f p=-o.z/d.z;if(.01
<p)t=p,n=v(0,0,1),m=1;for(i k=19;k--;)
for(i j=9;j--;)if(G[j]&1<<k){v p=o+v(-k
,0,-j-4);f b=p%d,c=p%p-1,q=b*b-c;if(q>0
){f s=-b-sqrt(q);if(s<t&&s>.01)t=s,n=!(
p+d*t),m=2;}}return m;}v S(v o,v d){f t
;v n;i m=T(o,d,t,n);if(!m)return v(.7,
.6,1)*pow(1-d.z,4);v h=o+d*t,l=!(v(9+R(
),9+R(),16)+h*-1),r=d+n*(n%d*-2);f b=l%
n;if(b<0||T(h,l,t,n))b=0;f p=pow(l%r*(b
>0),99);if(m&1){h=h*.2;return((i)(ceil(
h.x)+ceil(h.y))&1?v(3,1,1):v(3,3,3))*(b
*.2+.1);}return v(p,p,p)+S(h,r)*.5;}i
main(){printf("P6 512 512 255 ");v g=!v
(-6,-16,0),a=!(v(0,0,1)^g)*.002,b=!(g^a
)*.002,c=(a+b)*-256+g;for(i y=512;y--;)
for(i x=512;x--;){v p(13,13,13);for(i r
=64;r--;){v t=a*(R()-.5)*99+b*(R()-.5)*
99;p=S(v(17,16,8)+t,!(t*-1+(a*(R()+x)+b
*(y+R())+c)*16))*3.5+p;}printf("%c%c%c"
,(i)p.x,(i)p.y,(i)p.z);}}


I'm still waiting for his back-of-a-napkin photon mapper.