Sunday, September 11, 2011

Single-Pass Wireframe Rendering Explained and Extended

I got a pretty good response to my previous post on how to implement "Single-Pass Wireframe Rendering", so I thought I'd take a second to briefly explain how the edge detection actually works.

As I said before, the basic idea is that we want each fragment to know about how far it is from any given edge so we can color it appropriately.  First, we have to remember that OpenGL is an interpolation engine and it is very good at that.  When some attribute is assigned to a vertex and the triangle is rasterized, fragments created inside that triangle get some interpolated value of that attribute depending on how far it is from the surrounding vertices.

Shown below, a triangle has a three-channel attribute assigned to each vertex.  In the fragment shader, those values will be some mixture of the three channels.  You'll notice, for example, that the triangle in the image below gets less red from left to right.  The first (red) channel starts at 1, but as one moves towards the right, that channel goes to zero.

That's the basic gist of the attribute interpolation.  The nice thing about modern GPUs is that they let us put anything we want in these attributes.  They may be used in the fragment shader as colors, normals, texture coordinates, etc., but OpenGL doesn't really care how you plan on using these attributes, as they all get interpolated the same way.  In the fragment shader, we can decide how to make sense of the interpolated values.
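To make the interpolation concrete, here is a minimal CPU-side sketch of what the rasterizer does for a point inside a 2D triangle. The types and function names (Vec2, Vec3, cross2, barycentric, interpolate) are made up for illustration and are not part of OpenGL, and the sketch assumes plain linear (screen-space) interpolation rather than the perspective-corrected kind OpenGL uses by default.

```cpp
#include <array>

// Hypothetical helper types for illustration only.
struct Vec2 { double x, y; };
using Vec3 = std::array<double, 3>;

// Twice the signed area of triangle (a, b, c).
double cross2(Vec2 a, Vec2 b, Vec2 c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// Barycentric weights of point p with respect to triangle (a, b, c).
// Each weight is the share of the triangle's area opposite that vertex.
Vec3 barycentric(Vec2 p, Vec2 a, Vec2 b, Vec2 c) {
    double total = cross2(a, b, c);
    return { cross2(p, b, c) / total,
             cross2(a, p, c) / total,
             cross2(a, b, p) / total };
}

// Blend three per-vertex attributes with the weights, channel by channel.
Vec3 interpolate(const Vec3& w, const Vec3& attr0, const Vec3& attr1, const Vec3& attr2) {
    Vec3 out{};
    for (int i = 0; i < 3; ++i)
        out[i] = w[0] * attr0[i] + w[1] * attr1[i] + w[2] * attr2[i];
    return out;
}
```

With red, green and blue assigned to the three vertices, a point halfway along the edge between the red and green vertices comes out as half red, half green and no blue, which is exactly the "one channel goes to zero on an edge" behavior described here.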

Finding Edges Programmatically
Notice the pixel values along the top-left area of the triangle above have a low green value because the left and top vertices have no green in them, so pixels moving towards that edge have less and less green.  Similarly, the right side of the triangle has pretty much no red in it, because the values in the vertices above and below it have no red.  The same holds true for the bottom edge of the triangle having no blue.  The insight to be gained here and which is used in "Single-Pass Wireframe Rendering" is that values along the edges of the triangle will have a very low value in at least one of the three channels.  If ANY of the channels is close to zero, that fragment is sitting on or very near an edge.

images taken from nVidia's Solid Wireframe paper

We could just assign similar values as here and render edges if the value is below some threshold.  The problem, though, is that these values aren't in screen space and we probably want to measure our line thickness in terms of pixels.  Otherwise the edge thickness on our screen would change depending on the size of the triangle (maybe you want that, whatever).

As shown in the picture above, we calculate the altitudes of each vertex in screen space and store them in some vertex attribute.  In the fragment shader, the value (d0,d1,d2) will be somewhere between the three vertex attributes.  As described above, if any of these channels d0, d1 or d2 is close to zero, that means we're sitting on an edge.
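As a rough CPU-side illustration of the altitude idea, the sketch below mirrors the arithmetic the geometry shader performs, assuming the three points are already in pixel coordinates. The names Point, altitudes and nearEdge are hypothetical, and halfWidth is an assumed threshold parameter, not something from the paper.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

// Hypothetical type: a position in pixel coordinates.
struct Point { double x, y; };

// Distance from each vertex to the line through its opposite edge.
// Element i belongs to vertex i, matching (d0, d1, d2) in the text.
std::array<double, 3> altitudes(Point p0, Point p1, Point p2) {
    Point v0{p2.x - p1.x, p2.y - p1.y};  // edge opposite p0
    Point v1{p2.x - p0.x, p2.y - p0.y};  // edge opposite p1
    Point v2{p1.x - p0.x, p1.y - p0.y};  // edge opposite p2
    // The 2D cross product is twice the triangle's area.
    double doubledArea = std::fabs(v1.x * v2.y - v1.y * v2.x);
    return { doubledArea / std::hypot(v0.x, v0.y),
             doubledArea / std::hypot(v1.x, v1.y),
             doubledArea / std::hypot(v2.x, v2.y) };
}

// True when the interpolated distances put a fragment within
// halfWidth pixels of at least one edge.
bool nearEdge(const std::array<double, 3>& d, double halfWidth) {
    return std::min({d[0], d[1], d[2]}) < halfWidth;
}
```

For a right triangle with legs of 30 and 40 pixels, the altitudes come out as 24, 30 and 40, so a fragment whose interpolated (d0,d1,d2) has any channel under the chosen width would be drawn as part of the wire.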

nVidia has an excellent paper called Solid Wireframe, which goes into a bit more detail how this works and provides some really great illustrations.

Excluding Edges on polygons
While rendering edges is nice, I may not want every edge of a given triangle to be rendered.  For example, if I have some five-sided concave polygon that I break into triangles using some technique like ear clipping (pdf), I may not want the interior edges inside the polygon to be rendered.

A simple way to exclude an edge of a given polygon is to make sure that that value never goes to zero by setting the channels of the other vertices to some high amount Q.  This Q can be any value higher than your maximum edge-width.  In my program, I set it to 100 since I'll probably never be drawing edges thicker than that.

If Q is relatively high, fragments along that edge will not have low values in any channel
Designating which edges to exclude requires an additional vertex attribute sent down from the program.  I attach a float to each vertex, set to 1 if the edge opposite that vertex should be excluded from rendering and 0 otherwise.  I then update my geometry shader accordingly.
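Here is a small CPU-side sketch of why the Q trick works, under the assumption that channel i tracks the edge opposite vertex i (which is how the shader below is laid out). The function names edgeAttributes and nearestEdge are hypothetical, and the blend uses simple barycentric weights in place of the rasterizer.

```cpp
#include <algorithm>
#include <array>

using Vec3 = std::array<double, 3>;

// Per-vertex distance attributes. exclude[i] is 1.0 to hide the edge
// opposite vertex i and 0.0 to draw it; q is the large value Q.
std::array<Vec3, 3> edgeAttributes(const Vec3& altitude, const Vec3& exclude, double q) {
    std::array<Vec3, 3> dist{};
    for (int v = 0; v < 3; ++v)
        for (int ch = 0; ch < 3; ++ch)
            dist[v][ch] = (v == ch) ? altitude[v] : exclude[ch] * q;
    return dist;
}

// Smallest channel after blending the three vertices with weights w,
// which is what the fragment shader compares against the line width.
double nearestEdge(const std::array<Vec3, 3>& dist, const Vec3& w) {
    Vec3 d{};
    for (int ch = 0; ch < 3; ++ch)
        d[ch] = w[0] * dist[0][ch] + w[1] * dist[1][ch] + w[2] * dist[2][ch];
    return std::min({d[0], d[1], d[2]});
}
```

At the midpoint of the edge opposite vertex 0, the first channel is 0 when the edge is drawn, but it stays at Q when the edge is excluded, so the smallest channel is then decided by the two remaining edges.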

my updated vertex shader...

#version 120
#extension GL_EXT_gpu_shader4 : enable
in vec3 vertex;
in vec4 color;
in vec3 normal;
in float excludeEdge;
varying vec3 vertWorldPos;
varying vec3 vertWorldNormal;
varying float vertExcludeEdge;
uniform mat4 objToWorld;
uniform mat4 cameraPV;
uniform mat4 normalToWorld;
void main() {
vertWorldPos = (objToWorld * vec4(vertex,1.0)).xyz;
vertWorldNormal = (normalToWorld * vec4(normal,1.0)).xyz;
gl_Position = cameraPV * objToWorld * vec4(vertex,1.0);
vertExcludeEdge = excludeEdge;
gl_FrontColor = color;
}

and my updated geometry shader...

#version 120
#extension GL_EXT_gpu_shader4 : enable
#extension GL_EXT_geometry_shader4 : enable
varying in vec3 vertWorldPos[3];
varying in vec3 vertWorldNormal[3];
varying in float vertExcludeEdge[3];
varying out vec3 worldNormal;
varying out vec3 worldPos;
uniform vec2 WIN_SCALE;
noperspective varying vec3 dist;
void main(void)
{
  float MEW = 100.0; // max edge width
  // adapted from 'Single-Pass Wireframe Rendering'
  vec2 p0 = WIN_SCALE * gl_PositionIn[0].xy/gl_PositionIn[0].w;
  vec2 p1 = WIN_SCALE * gl_PositionIn[1].xy/gl_PositionIn[1].w;
  vec2 p2 = WIN_SCALE * gl_PositionIn[2].xy/gl_PositionIn[2].w;
  vec2 v0 = p2-p1;
  vec2 v1 = p2-p0;
  vec2 v2 = p1-p0;
  // 2D cross product: twice the screen-space area of the triangle
  float area = abs(v1.x*v2.y - v1.y * v2.x);

  // vertex 0: altitude in channel 0, Q (or 0) in the other channels
  dist = vec3(area/length(v0),vertExcludeEdge[1]*MEW,vertExcludeEdge[2]*MEW);
  worldPos = vertWorldPos[0];
  worldNormal = vertWorldNormal[0];
  gl_Position = gl_PositionIn[0];
  EmitVertex();

  // vertex 1
  dist = vec3(vertExcludeEdge[0]*MEW,area/length(v1),vertExcludeEdge[2]*MEW);
  worldPos = vertWorldPos[1];
  worldNormal = vertWorldNormal[1];
  gl_Position = gl_PositionIn[1];
  EmitVertex();

  // vertex 2
  dist = vec3(vertExcludeEdge[0]*MEW,vertExcludeEdge[1]*MEW,area/length(v2));
  worldPos = vertWorldPos[2];
  worldNormal = vertWorldNormal[2];
  gl_Position = gl_PositionIn[2];
  EmitVertex();

  EndPrimitive();
}

Without edge removal, each triangle has its own edge

Triangle mesh rendered excluding certain edges

Saturday, September 10, 2011

Single-Pass Wireframe Rendering

It came time for me to add a wireframe to my mesh and just when I was about to do the standard two-pass approach of rendering out my mesh faces and then rendering my wireframe with GL_LINES over that, I came across Single-Pass Wireframe Rendering, a simple idea for rendering my faces and lines in just one pass.  The idea, to put it simply, is to add some smarts to the fragment code so when it's rendering fragments close to the sides of a face, it blends in an edge color.  The paper gives several reasons why this is a better approach including better performance and some really cool added abilities.  The best part is it's very easy to add to existing code without much modification.

Adding the Geometry Shader
My code already had a basic vertex/fragment shader for doing some basic lighting, and I just needed to add a geometry shader in between that could add an attribute to each vertex specifying how far the fragment would be from the edge in screen space.

Here's the geometry shader taken almost straight from their full paper off their site...

#version 120
#extension GL_EXT_gpu_shader4 : enable
#extension GL_EXT_geometry_shader4 : enable
varying in vec3 vertWorldPos[3];
varying in vec3 vertWorldNormal[3];
varying out vec3 worldNormal;
varying out vec3 worldPos;
uniform vec2 WIN_SCALE;
noperspective varying vec3 dist;
void main(void)
{
  // taken from 'Single-Pass Wireframe Rendering'
  vec2 p0 = WIN_SCALE * gl_PositionIn[0].xy/gl_PositionIn[0].w;
  vec2 p1 = WIN_SCALE * gl_PositionIn[1].xy/gl_PositionIn[1].w;
  vec2 p2 = WIN_SCALE * gl_PositionIn[2].xy/gl_PositionIn[2].w;
  vec2 v0 = p2-p1;
  vec2 v1 = p2-p0;
  vec2 v2 = p1-p0;
  // 2D cross product: twice the screen-space area of the triangle
  float area = abs(v1.x*v2.y - v1.y * v2.x);

  // each vertex gets its altitude in its own channel and 0 elsewhere
  dist = vec3(area/length(v0),0,0);
  worldPos = vertWorldPos[0];
  worldNormal = vertWorldNormal[0];
  gl_Position = gl_PositionIn[0];
  EmitVertex();

  dist = vec3(0,area/length(v1),0);
  worldPos = vertWorldPos[1];
  worldNormal = vertWorldNormal[1];
  gl_Position = gl_PositionIn[1];
  EmitVertex();

  dist = vec3(0,0,area/length(v2));
  worldPos = vertWorldPos[2];
  worldNormal = vertWorldNormal[2];
  gl_Position = gl_PositionIn[2];
  EmitVertex();

  EndPrimitive();
}

If you're already familiar with the vertex/fragment shader pipeline, which has been around quite a few years longer than the geometry shader, you'll recognize nothing is too out of the ordinary. It takes a world position and normal, which is basically just passed off to the fragment shader for lighting purposes. Although I've done quite a bit of GLSL, this was my first attempt at using a geometry shader, and once I learned the basic idea, I found it pretty intuitive.

First, there are varying inputs from the vertex shader that come in as arrays--one element for each vertex. The names have to match up, so for the varying in vec3 vertWorldPos[3] input, there must be a corresponding varying vec3 vertWorldPos declared in the vertex shader. The exception to this is predefined variables like gl_Position, which comes in as gl_PositionIn[]. Not sure why the OpenGL designers decided to add those two letters, but whatever.

WIN_SCALE is the screen size, which we multiply by the vertex position XY after dividing by W. This takes our vertex positions from normalized device coordinates to screen space, since we want to measure our distances in pixels in the fragment shader. That's followed by a 2D cross product that gives twice the area of the triangle; dividing it by the length of an edge gives the altitude of the opposite vertex (its closest distance to that edge). Because the altitude is already in screen space, the noperspective keyword is added to disable perspective correction.
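The conversion itself is tiny. A minimal sketch in C++, with made-up Vec2/Vec4 types and a hypothetical toScreen helper standing in for the first three lines of the shader's main():

```cpp
// Hypothetical helper types for illustration only.
struct Vec2 { double x, y; };
struct Vec4 { double x, y, z, w; };

// Perspective divide followed by scaling by the window size,
// mirroring "WIN_SCALE * gl_PositionIn[i].xy / gl_PositionIn[i].w".
Vec2 toScreen(const Vec4& clip, const Vec2& winScale) {
    return { winScale.x * clip.x / clip.w,
             winScale.y * clip.y / clip.w };
}
```

One thing worth keeping in mind: after the divide, X and Y run from -1 to 1, so scaling by the full window size produces values that span twice the window. Distances then come out in half-pixel units, and passing half the window size instead would give true pixels. Either works as long as the line-width threshold is chosen to match.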

The geometry shader is responsible for actually creating the primitives via the EmitVertex() and EndPrimitive() functions. When the EmitVertex() function is called, it sends a vertex down the pipeline with attributes based on whatever the out attributes happen to be set to at the time. EndPrimitive() just tells OpenGL that the vertices already sent down are ready to be rasterized as a primitive.

The geometry shader can actually create additional geometry on the fly, but it comes with some caveats. We must designate in our C++ code an upper bound of how many vertices we might want to create. This geometry shader doesn't create any additional geometry, but it's still useful as it provides knowledge of the neighboring vertices to calculate the outgoing vertex altitudes.

Setting Up the Geometry Shader
Using Qt, the geometry shader is compiled just like the vertex and fragment shader.

QGLShader* vertShader = new QGLShader(QGLShader::Vertex);

QGLShader* geomShader = new QGLShader(QGLShader::Geometry);

QGLShader* fragShader = new QGLShader(QGLShader::Fragment);

QGLShaderProgramP program = QGLShaderProgramP(new QGLShaderProgram(parent));

The only other adjustment is when we bind our shader. Because the geometry shader can create more geometry than it receives, it requires giving OpenGL a heads up of how much geometry you might create. You don't necessarily have to create all the vertices you allocate, but it's just a heads up for OpenGL. You can also configure the geometry shader to output a different type of primitive than the input, like creating GL_POINTS from GL_TRIANGLES. Because this geometry shader is just taking a triangle in and outputting a triangle, we can just set the number of outgoing vertices to the number going in. GL_GEOMETRY_INPUT_TYPE, GL_GEOMETRY_OUTPUT_TYPE, and GL_GEOMETRY_VERTICES_OUT need to be specified prior to linking the shader.

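QGLShaderProgram* meshShader = panel->getShader();

// geometry-shader attributes must be applied prior to linking
// (assumes the program returned by getShader() has not been linked yet)
meshShader->setGeometryInputType(GL_TRIANGLES);
meshShader->setGeometryOutputType(GL_TRIANGLE_STRIP);
meshShader->setGeometryOutputVertexCount(3);
meshShader->bind(); // links the program first if it isn't linked already

meshShader->setUniformValue("WIN_SCALE", QVector2D(panel->width(),panel->height()));
meshShader->setUniformValue("objToWorld", objToWorld);
meshShader->setUniformValue("normalToWorld", normalToWorld);
meshShader->setUniformValue("cameraPV", cameraProjViewM);
meshShader->setUniformValue("cameraPos", camera->eye());
meshShader->setUniformValue("lightDir", -camera->lookDir().normalized());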

We also need to modify the fragment shader to take this distance variable into account to see if our fragment is close to the edge.

#version 120
#extension GL_EXT_gpu_shader4 : enable
varying vec3 worldPos;
varying vec3 worldNormal;
noperspective varying vec3 dist;
uniform vec3 cameraPos;
uniform vec3 lightDir;
uniform vec4 singleColor;
uniform float isSingleColor;
void main() {
// determine frag distance to closest edge
float nearD = min(min(dist[0],dist[1]),dist[2]);
float edgeIntensity = exp2(-1.0*nearD*nearD);
vec3 L = lightDir;
vec3 V = normalize(cameraPos - worldPos);
vec3 N = normalize(worldNormal);
vec3 H = normalize(L+V);
vec4 color = isSingleColor*singleColor + (1.0-isSingleColor)*gl_Color;
float amb = 0.6;
vec4 ambient = color * amb;
vec4 diffuse = color * (1.0 - amb) * max(dot(L, N), 0.0);
vec4 specular = vec4(0.0);
// edgeIntensity = 0.0; // uncomment to switch the wireframe overlay off

// blend between edge color and normal lighting color
gl_FragColor = (edgeIntensity * vec4(0.1,0.1,0.1,1.0)) + ((1.0-edgeIntensity) * vec4(ambient + diffuse + specular));
}
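To get a feel for how quickly the edge fades out, here is the same falloff and blend written as a small C++ sketch. The function names edgeIntensity and blend and the Color alias are hypothetical; the math is copied from the shader above.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

using Color = std::array<double, 4>;

// Falloff from the fragment shader: 1 on the edge itself,
// dropping off as 2^(-d*d) with distance d to the nearest edge.
double edgeIntensity(double d0, double d1, double d2) {
    double nearD = std::min({d0, d1, d2});
    return std::exp2(-1.0 * nearD * nearD);
}

// Linear blend between the edge color and the lit surface color.
Color blend(double intensity, const Color& edge, const Color& surface) {
    Color out{};
    for (int i = 0; i < 4; ++i)
        out[i] = intensity * edge[i] + (1.0 - intensity) * surface[i];
    return out;
}
```

At a distance of 1 the edge color contributes half, at 2 it contributes 1/16, and by 3 it is down to about 0.2%, which is what gives the line its soft, anti-aliased look without any extra passes.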

And that's it! It takes a bit more work to get it working with quads, but once done you can do some pretty wild and awesome tricks as shown on the author's wireframe site.



And here's the vertex shader for reference. As you can see, it's quite simple because some of the processing it used to do has been moved to the geometry shader. Nothing here really even has to do with the edge rendering.

#version 120
#extension GL_EXT_gpu_shader4 : enable
in vec3 vertex;
in vec4 color;
in vec3 normal;
varying vec3 vertWorldPos;
varying vec3 vertWorldNormal;
uniform mat4 objToWorld;
uniform mat4 cameraPV;
uniform mat4 normalToWorld;
void main() {
vertWorldPos = (objToWorld * vec4(vertex,1.0)).xyz;
vertWorldNormal = (normalToWorld * vec4(normal,1.0)).xyz;
gl_Position = cameraPV * objToWorld * vec4(vertex,1.0);
gl_FrontColor = color;
}

Wednesday, September 07, 2011

Sunshine: Fixed Normals

I'm finally starting to make some noticeable headway in my Sunshine app.  It feels like everything I do has such a little impact on the UI, but then I tell myself it's all "infrastructure" and soon I'll be able to start adding features like crazy.  I'm probably lying to myself.

Imported lamp OBJ with calculated normals
Infrastructure - boost::python bindings
After a lot of help from the boost::python mailing list, I've managed to expose my Scene object to python.  Although there isn't a lot of the C++ code exposed, the "infrastructure" for it is there so I can expose classes and individual functions quickly on a need-to-expose basis.  Right now I have a python script, which is compiled into the binary as a Qt resource, that I use to import OBJs.  Although it's fairly basic (lacks material support, ignores normals/UVs), it would take very little effort to import other geometry types.

Fixed Normals
As I mentioned, I'm ignoring the normals provided in the lamp OBJ for now and recomputing the normals per face (no smoothing).  I've had a bug in the fragment shader for the longest time that I ignored until I started moving the cube mesh away from the origin.  Looking at the GLSL, it looks like I was transforming my incoming normal by my object-to-world matrix.  That made the translation factor muck up the vector.  I ended up adding a normal object-to-world matrix that right now just consists of the model rotation matrix.  It looks a lot better!
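The bug is easy to reproduce on the CPU. The sketch below uses made-up Vec4/Mat4 aliases and a hypothetical translateX helper to build an object-to-world matrix that only translates, then pushes the same normal through it twice: once with w = 1 (treated as a point) and once with w = 0 (treated as a direction). Using w = 0 is an alternative shown only for comparison; the fix I actually used was the separate rotation-only matrix.

```cpp
#include <array>

using Vec4 = std::array<double, 4>;
using Mat4 = std::array<std::array<double, 4>, 4>;  // row-major

// Plain 4x4 matrix times column vector.
Vec4 transform(const Mat4& m, const Vec4& v) {
    Vec4 out{};
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            out[r] += m[r][c] * v[c];
    return out;
}

// Hypothetical object-to-world matrix: identity rotation,
// translated along X by tx.
Mat4 translateX(double tx) {
    Mat4 m{};                                  // all zeros
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0; // identity
    m[0][3] = tx;                              // translation column
    return m;
}
```

A normal of (0, 1, 0) with w = 1 comes out as (5, 1, 0) after a translation of 5 along X, which is no longer pointing straight up. With w = 0, or with a matrix that has no translation in it, the normal comes through untouched.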

Basic Features
Import OBJs
Write python-based mesh importers
Tumble around scene in Maya-like camera
Render "test" to aqsis window

Next Step
Add a face mode and some tools (?in python?)
Send scene to aqsis
Make render settings (resolution at least) adjustable


Saturday, September 03, 2011

Adding Python Support using Boost::Python

Choosing an Embedded Language
In my multi-year toy project, I decided it would be useful to incorporate an embedded programming language into the app to make adding features a little easier.  It basically came down to three possible languages that I thought would be appropriate: python, javascript, and lua.  Lua is known for being light-weight and fast, and is used in many games like World of Warcraft to provide scripting to the user.  Javascript itself seems to have had a bit of a resurgence in recent years in terms of stepping outside of web development.  Both languages have implementations that seemed workable for embedding into an application, including luabind for lua and V8 (produced by Google) for javascript.  Both provide a reasonable interface for mapping C++ classes.

Although both languages seemed appealing, I wasn't really satisfied with how the mappings were written to expose the C++ classes.  Also, I have much more experience with python, so I had to decide if I wanted to learn how to bind a scripting engine to my application AND learn a new scripting language.  I've done quite a bit of javascript, but never for a large application and wasn't sure how well things would map between it and C++, as both lua and javascript don't technically support objects the same way as python.  I finally decided just to go with python.  Although not as fast or light-weight as lua or javascript, I felt it provided a better one-to-one mapping of my classes, and figured users might be more comfortable with a more C-like language for application scripting.

Which Python Binding?
I ended up trying several different libraries to embed python into my app.  PythonQt seemed like a good candidate as I was writing my app using the Qt libraries, but I encountered a few strange bugs and the community seemed to be stagnating--unfortunate as the API seemed really intuitive.  Both SIP and SWIG are popular for binding, but both require a special syntax in external files, and I wanted to modify my qmake build as little as possible and didn't want to learn a new syntax.  After finally experimenting with boost::python, I found the library allowed me to write my mappings inside C++ without learning any new syntax or changing much in my build system.

Using boost::python
I had a Scene class, which naturally handles everything in my scene, that I wanted to expose to python.  boost::python has a special macro called BOOST_PYTHON_MODULE, which defines a module that can be imported into python.  After wrapping my Scene class with the class_ template inside that module, I could then import the Scene class into python from the "scene" module.  The boost::noncopyable is an optional argument that tells boost::python not to pass the Scene object to python by value, since my scene might be rather large in memory and I didn't want multiple copies.  This is more of a compiler rule as I still have to make sure I'm not passing the scene by value, but with that I get a compiler error if I try.

  class_<Scene, boost::noncopyable>("Scene");

I have also been trying to use smart pointers for heap-allocated objects.  I started out using QSharedPointer, but boost::shared_ptr is already supported by boost, so I ended up switching my smart pointers over to boost's.
typedef boost::shared_ptr<Scene> SceneP;

Sending Yourself to Python
Once that is set up, I could then pass my Scene object over to python.  I immediately hit a snag: my Scene class actually contains my python engine and calls the python code itself.

      object ignored = exec(EXAMPLE_PY_FUNCTION, pyMainNamespace);
      object processFileFunc = pyMainModule.attr("Foo").attr("processFile");
      processFileFunc(this, "test.txt"); // "this" being the Scene object

Although I created the scene in a smart pointer outside the class, I didn't have access to that smart pointer inside member functions to pass to python unless I passed it into the function, which seemed unnecessary to pass a Scene member function a smart pointer to itself.  I couldn't use "this" inside the member function as boost::python wouldn't know by default how to keep that in memory since other boost pointers were already pointing to my Scene object.  I couldn't just create a shared_ptr in the function either, because I didn't want my Scene deleted when the shared pointer goes out of scope when the function returns.

I was actually getting this error because boost didn't know how to deal with the this pointer without passing it by value (which I explicitly said I didn't want copied, right?).

Error in Python: : No to_python
(by-value) converter found for C++ type: Scene

It turns out boost has a way to deal with this situation using a special class called boost::enable_shared_from_this, which my Scene class can inherit from.

class Scene : public boost::enable_shared_from_this<Scene>

boost::enable_shared_from_this provides two functions that allow the Scene object to create shared pointers inside member functions by calling shared_from_this(), which is inherited from boost::enable_shared_from_this.

      object ignored = exec(EXAMPLE_PY_FUNCTION, pyMainNamespace);
      object processFileFunc = pyMainModule.attr("Foo").attr("processFile");
      // pass the python function a shared pointer instead of "this"
      processFileFunc(shared_from_this(), "test.txt"); // can't use boost::shared_ptr<Scene>(this) either

After updating the member function, the error went away and I could then send my Scene object to python inside and outside of the member function.  Python now has access to my scene object and I can start exposing some more functions and variables inside my Scene class.
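If you want to see the shared_from_this() behavior in isolation, without python in the picture, here is a minimal sketch. It swaps in the C++11 standard library equivalents (std::shared_ptr and std::enable_shared_from_this) for the boost ones so it compiles without boost, and the self() member function is a made-up name for illustration.

```cpp
#include <memory>

// Stand-in for my Scene class, using the std:: equivalents of the
// boost smart pointer classes (an assumption made for this sketch).
class Scene : public std::enable_shared_from_this<Scene> {
public:
    // Returns a pointer that shares ownership with whatever
    // shared_ptr already owns this object, instead of starting
    // a second, independent reference count.
    std::shared_ptr<Scene> self() { return shared_from_this(); }
};
```

Calling self() on a scene created with std::make_shared<Scene>() hands back a second pointer to the same object and bumps the shared reference count to 2. The catch is that the object must already be owned by a shared pointer when shared_from_this() is called; it can't be used on a Scene sitting on the stack.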

Below is a short working example of sending a shared pointer to python inside and outside of a member function (or you can look here).  Special thanks to the boost::python mailing list, which was very helpful in getting me going.

#include <iostream>
#include <string>

#include <boost/python.hpp>
#include <boost/python/class.hpp>
#include <boost/python/module.hpp>
#include <boost/python/def.hpp>
#include <boost/enable_shared_from_this.hpp>
using namespace boost::python;

object pyMainModule;
object pyMainNamespace;

// python code that gets executed from C++
#define EXAMPLE_PY_FUNCTION \
  "from scene import Scene\n" \
  "class Foo(object):\n" \
  "  @staticmethod\n" \
  "  def processFile(scene, filename):\n" \
  "    print('here')\n"

std::string parse_python_exception();

class Scene : public boost::enable_shared_from_this<Scene>
{
public:
  void sendYourselfToPython()
  {
    try {
      object ignored = exec(EXAMPLE_PY_FUNCTION, pyMainNamespace);
      object processFileFunc = pyMainModule.attr("Foo").attr("processFile");
      // pass a shared pointer instead of "this"
      processFileFunc(shared_from_this(), "test.txt");
    } catch (boost::python::error_already_set const &) {
      std::string perror = parse_python_exception();
      std::cerr << "Error in Python: " << perror << std::endl;
    }
  }
};
typedef boost::shared_ptr<Scene> SceneP;

// defines the "scene" module and its init function, initscene
BOOST_PYTHON_MODULE(scene)
{
  class_<Scene, boost::noncopyable>("Scene");
}

int main(int argc, char**argv)
{
  std::cout << "starting program..." << std::endl;

  // built-in modules have to be registered before the interpreter starts
  PyImport_AppendInittab("scene", &initscene);
  Py_Initialize();

  pyMainModule = import("__main__");
  pyMainNamespace = pyMainModule.attr("__dict__");

  boost::python::register_ptr_to_python< boost::shared_ptr<Scene> >();

  SceneP scene(new Scene());

  // sending Scene object to python inside member function
  scene->sendYourselfToPython();

  try {
    object ignored = exec(EXAMPLE_PY_FUNCTION, pyMainNamespace);
    object processFileFunc = pyMainModule.attr("Foo").attr("processFile");

    // send Scene object to python using smart pointer
    processFileFunc(scene, "test.txt");
  } catch (boost::python::error_already_set const &) {
    std::string perror = parse_python_exception();
    std::cerr << "Error in Python: " << perror << std::endl;
  }

  return 0;
}

// taken from
namespace py = boost::python;
std::string parse_python_exception() {
    PyObject *type_ptr = NULL, *value_ptr = NULL, *traceback_ptr = NULL;
    PyErr_Fetch(&type_ptr, &value_ptr, &traceback_ptr);
    std::string ret("Unfetchable Python error");
    if (type_ptr != NULL) {
        py::handle<> h_type(type_ptr);
        py::str type_pstr(h_type);
        py::extract<std::string> e_type_pstr(type_pstr);
        if (e_type_pstr.check())
            ret = e_type_pstr();
        else
            ret = "Unknown exception type";
    }

    if (value_ptr != NULL) {
        py::handle<> h_val(value_ptr);
        py::str a(h_val);
        py::extract<std::string> returned(a);
        if (returned.check())
            ret +=  ": " + returned();
        else
            ret += std::string(": Unparseable Python error: ");
    }

    if (traceback_ptr != NULL) {
        py::handle<> h_tb(traceback_ptr);
        py::object tb(py::import("traceback"));
        py::object fmt_tb(tb.attr("format_tb"));
        py::object tb_list(fmt_tb(h_tb));
        py::object tb_str(py::str("\n").join(tb_list));
        py::extract<std::string> returned(tb_str);
        if (returned.check())
            ret += ": " + returned();
        else
            ret += std::string(": Unparseable Python traceback");
    }
    return ret;
}

Compiling the Code
To compile the code, I used python-config to get the includes and flags.  python-config is a simple utility that queries the path of your python headers and libs depending on which version of python is installed and designated on your system.  It's a useful utility as I usually have several versions of python on my machine at a time.  It's especially nice not having to hard code your application's build system to a particular version of python.

python-config --includes
python-config --libs

g++ test.cpp -I/usr/include/python2.7 -I/usr/include/python2.7 -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -lpthread -ldl -lutil -lm -lpython2.7 -lboost_python