Sunday, 31 August 2008

CUDA v2.0

I have put off downloading the version 2.0 documentation for a while now, largely as I cannot afford a board with compute capability 2 just yet. This weekend I finally succumbed to temptation and downloaded the PDF.

For those of you playing with raytracers, figure 5.4 on page 58 should be of particular interest :) It's nice to see how much the NVIDIA engineers are improving their products all the time. Now to scrape together my pennies for a 2.0 card... Still, developing on an 8800GT isn't a bad idea as a lot of people have them - I'll just keep telling myself that for now :)

Friday, 29 August 2008

Acceptable performance guidelines for gaming

According to the PC Gaming Alliance Performance Definition these are the acceptable performance guidelines for gaming:

Pass/Fail Game Metrics (fps):
First Person Shooters: 24+ frames per second
Real Time Strategy: 20+ frames per second
Role Playing Games: 20+ frames per second
Racing Simulation: 50+ frames per second
Flight Simulation: 20+ frames per second
Playability = FPS + Responsiveness + no Graphical Problems

Assuming this is correct, it could mean that raytracing will shortly be a viable alternative to rasterising for RTS and role playing games.

4 more fps

4 more fps! ... well, on average anyway :) So it's now running at 25-30 fps continually. Last night I did a small update to optimize the number of threads. I also added more colours to the spheres, and you can now press 'R' to get some very primitive physics (bouncing).

The download link is the same as the previous post but inside the zip is a version 0.02.

This is still without my new acceleration structures. They are taking a while to implement because I want their memory layout to match the way the GPU prefers to read memory. I'm hoping for at least a 2x speed up.

Wednesday, 27 August 2008

Cudart spheres: movie and download

Here is a sample movie of the spheres in cudart. The average fps code is not quite correct, but it gives a good idea of the current speed of the system. It is now well over 25 times faster than the CPU implementation.

Cudart Spheres movie

Here is the promised download - note it has only been tested on an 8800GT with 512MB of memory. If you have any problems / comments / suggestions, please let me know.


Again this one is only doing untextured spheres with 2 light sources.

Tuesday, 26 August 2008


For whatever reason spheres in an image seem to appeal to people in general a lot more than a collection of big triangles, no matter how well shaded / textured and lit they are... So to get a bit of "ooooo" and "aaaaah" factor going I decided to add spheres to cudart over the weekend.

Of course, as well as looking better than a triangle, spheres are also much cheaper to intersect. So even with 111 objects in the scene and 2 levels of reflection it is still possible to get close to 20fps at 800x800. I have just finished an interactive demo version which I will make available for download here as soon as I've sorted out the license agreement. In the meantime here are some screenshots.
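To give an idea of why spheres are cheap, here is a minimal CPU-side sketch of a ray / sphere intersection test in Python. It is not the cudart code: the function name and argument layout are made up for illustration, and it assumes the ray direction is already normalised. The whole test comes down to a couple of dot products and one square root.

```python
import math

def intersect_sphere(origin, direction, centre, radius):
    """Return the nearest hit distance t >= 0 along the ray, or None on a miss.

    Hypothetical helper for illustration; assumes `direction` has unit length.
    """
    # Vector from the sphere centre to the ray origin.
    oc = [o - c for o, c in zip(origin, centre)]
    # Coefficients of t^2 + 2*b*t + c = 0 (the t^2 coefficient is 1 for a unit direction).
    b = sum(d * v for d, v in zip(direction, oc))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None  # the ray's line never touches the sphere
    root = math.sqrt(disc)
    t = -b - root  # nearer of the two intersections
    if t < 0.0:
        t = -b + root  # the origin is inside the sphere, so use the far hit
    return t if t >= 0.0 else None  # both hits behind the origin count as a miss
```

A ray fired down the z axis at a unit sphere centred 5 units away hits it at t = 4; move the sphere behind the origin and the function reports a miss.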


Friday, 22 August 2008


As mentioned in earlier posts I have been calculating u,v coordinates for every triangle intersection. I have avoided actually texturing my triangles as I wanted to get most of the other things working well first - I find textures tend to disguise shading / intersection glitches.

Last night I decided to implement some basic texturing and see how much longer the scene took to render. So I added the Lena image to my modified Cornell box - if you are going to be a cliche you may as well go all the way :)    Screenshot with texturing is below.

Thursday, 21 August 2008

Modern art (or when raytracing goes wrong) and some timings...

I wanted to get some proper timing figures before I implemented reflections in order to tell how badly (or well) my reflection code is performing. So I removed one or two small optimizations which would affect the timings based on the scene layout and got this:

[Image: When raytracing goes wrong]

If I need something to fall back onto - modern art might just be my thing :p

Wednesday, 20 August 2008

Speed update

To prevent confusion about my timing figures, here is how I calculate them. As my target framerate is 24 fps, I base my timings on creating a full frame from scratch every cycle and doing that for one second's worth of frames (24, rounded up to 25 in the loop to allow for some overhead). Best demonstrated with pseudocode:

setup stuff and create scene objects

start timing

for framecount 1 to 25         (25 instead of 24 to allow for some overhead later)

     generate frame

stop timing
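The same idea as a small runnable Python sketch. The names `time_frames` and `generate_frame` are placeholders rather than anything from cudart, and the frame callback is assumed to return only once the frame is really finished (on a GPU that means waiting for the kernel to complete, since kernel launches return before the work is done).

```python
import time

def time_frames(generate_frame, frame_count=25):
    """Time `frame_count` full frames and return (elapsed_seconds, frames_per_second).

    `generate_frame` is a placeholder callback that must block until the frame is done.
    """
    start = time.perf_counter()
    for _ in range(frame_count):
        generate_frame()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very fast callbacks or coarse timers.
    fps = frame_count / elapsed if elapsed > 0.0 else float("inf")
    return elapsed, fps
```

If the elapsed time for 25 frames comes in under one second, the 24 fps target is being met with a little room to spare.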

I've now set up my scene with 100 objects (the goal amount) and one light source (the goal is 2). I've enabled lighting and shadows but as yet no colour and reflections. (screenshot below)

Tuesday, 19 August 2008


My goal for cudart is to run 24 frames per second at 800*600 with a decent object and light count. Decent is a bit hard to define but for now 100 objects and 2 light sources would make me pretty happy.

Most of the screenshots until now have been only partially computed on the GPU, as I have been trying to develop my algorithms. I've found it is very easy to fall into the trap of making a photorealistic renderer, and as a result the earlier screenshots you see are running at about 1 second a frame, i.e. 1 fps... So although they might look impressive they are hardly running in realtime. Over the last few days I have been converting everything to run on the GPU and here is the first screenshot.

[Image: First purely GPU output]


Friday, 15 August 2008

Managing scene objects

After a small change to the lighting (again...) I'm fairly happy with the basic ray / light model now. I still need to implement refraction but that can wait a bit. The next task is to work out an efficient way of manipulating the many triangles in the scene. I need to group them into objects (remembering that OO is not supported on CUDA) and write functions to translate / rotate / scale these objects.

[Image: Yet another lighting change]

Thursday, 14 August 2008

Colour bleeding

Just a very quick update today as I'm driving all over the place before work this morning. Even the tail end of rush hour on the M4 will be fun. For those of you who take the M4 in the mornings - don't panic! I'm not taking Buttercup, so there won't be any massive delays :p

As you can see from the images below I fixed some of the artifacts and coloured in my Cornell box. It has a white light source very nearly in the centre. In the screenshot with reflections you can clearly see the colour bleeding (which is the desired effect). Next steps are to get the dimensions of the box correct and add some objects to it.


Wednesday, 13 August 2008


Busy night last night - a variety of bug fixes in cudart. Amazing how your lighting improves when you are not taking one root too many :p I was for some reason, probably a really good one at the time, getting the magnitude of a vector and then taking the square root of that scalar. This explains rather nicely why my reflections were so weak. Secondly, I fixed an error in some intersection calculations which was having an impact on light reflections. As you can see from the screenshots these fixes have caused a massive improvement in quality.

Archive Section

You will notice a new page link on the right today - "archives". This is going to be a collection of things from the old site along with some old code / graphics etc taken from my old archives which I recently discovered. It covers stuff like assembler, OpenGL, DirectX, casinos, video, neural networks, completion ports etc - hopefully some of it is still useful to someone.  Archives

Tuesday, 12 August 2008

Lighting and texturing

It's been a hectic few days, what with house guests and trying to keep Buttercup (my Landy) running. Actually she features in most of the example image processing pictures, but I digress. With limited time I only had a chance to fix the coordinate systems in cudart. The y axis was inverted right from the first version, which hadn't really been a problem until the surface normals turned out not to be in the usual direction, causing a bit of a debugging headache.

[Image: Colours and reflections]

Thursday, 7 August 2008

Finding a point on a ray closest to a point

This sounds like a rather trivial calculation, but rather than actually thinking about it I just reached for a linear algebra book - no luck.... Then for a calculus book ... still no luck, although I had some fun with the Taylor series.

A quick Google search later and I had a few results, but most were for the closest point between two rays and not exactly what I wanted. Some others failed to note (or deliberately chose not to) that a ray is defined as r = P0 + td where (and here's the important bit) t is an element of R AND t >= 0. They therefore do not account for the closest point falling at a negative t value, which is invalid for a ray.
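For what it's worth, the calculation itself can be sketched like this: project the vector from P0 to the query point Q onto d to get t = ((Q - P0) . d) / (d . d), then clamp t to zero so the answer stays on the ray. The Python below is an illustrative sketch only; the function and parameter names are mine, and d does not need to be normalised.

```python
def closest_point_on_ray(p0, d, q):
    """Return (point, t) for the point on the ray r = p0 + t*d, t >= 0, closest to q.

    Illustrative helper; `d` may have any non-zero length.
    """
    w = [qi - pi for qi, pi in zip(q, p0)]  # vector from the ray origin to q
    dd = sum(di * di for di in d)           # squared length of d
    if dd == 0.0:
        return list(p0), 0.0                # degenerate ray: only the origin exists
    t = sum(wi * di for wi, di in zip(w, d)) / dd
    t = max(t, 0.0)                         # a negative t is behind the origin, so clamp to P0
    return [pi + t * di for pi, di in zip(p0, d)], t
```

Without the clamp this would be the closest point on the infinite line, which is exactly the case the search results tended to solve instead.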

Tuesday, 5 August 2008

Image rotation

As I am yet to post any meaningful code I thought I'd start with a CUDA image rotation tutorial.

To keep things simple we are going to use the standard sin/cos (or matrix) rotation method and not the shear method.
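As a rough idea of the sin/cos method, here is a CPU-side Python sketch (not the tutorial's CUDA code; the function name, nearest-neighbour sampling and list-of-rows image layout are all my assumptions). It walks over every destination pixel and applies the inverse rotation to find the source pixel, which avoids holes in the output. On the GPU the two loops would naturally become one thread per destination pixel.

```python
import math

def rotate_image(src, angle_rad, fill=0):
    """Rotate a 2D image (list of rows) about its centre using nearest-neighbour sampling.

    Illustrative sketch; pixels that map outside the source are set to `fill`.
    """
    height, width = len(src), len(src[0])
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    dst = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = x - cx, y - cy
            # Inverse rotation: where in the source does this destination pixel come from?
            sx = cos_a * dx + sin_a * dy + cx
            sy = -sin_a * dx + cos_a * dy + cy
            ix, iy = int(round(sx)), int(round(sy))
            if 0 <= ix < width and 0 <= iy < height:
                dst[y][x] = src[iy][ix]
    return dst
```

With the y axis pointing down, as in most image formats, a positive angle turns the picture clockwise on screen.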

Monday, 4 August 2008


Here are some more screenshots of cudart after some changes to lighting code and some other minor changes to the intersection code (It now produces u,v values so texturing will be coming soon)

Notice the artifacts along the front edge of the prism (mostly visible in the second image) which seem to be caused by the self intersection tests and my epsilon value possibly being too high.


Canny Edge Detection

As part of my entry to the CUDA challenge I implemented a Gaussian Point Spread function and a Sobel filter.

I'm not sure how my project went - badly I suppose - as I had only obtained an 8800 GT device a week before the deadline :( I'll post the full project here once judging has finished. Actually I will probably keep working on improving it as I believe there is a lot of scope for a fully implemented version.
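Until the project itself is posted, here is a minimal CPU-side Python sketch of the Sobel step only, to show what the filter computes. It is not the contest code: the function name and the list-of-rows greyscale layout are assumptions, border pixels are simply left at zero, and the Gaussian smoothing stage is left out.

```python
import math

# Standard 3x3 Sobel kernels for the horizontal and vertical gradients.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))

def sobel_magnitude(img):
    """Return the gradient magnitude of a greyscale image given as a list of rows.

    Illustrative sketch; the one-pixel border is left at 0.0.
    """
    height, width = len(img), len(img[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0.0
            for j in range(3):
                for i in range(3):
                    p = img[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * p
                    gy += SOBEL_Y[j][i] * p
            out[y][x] = math.hypot(gx, gy)  # edge strength at this pixel
    return out
```

Every output pixel depends only on its own 3x3 neighbourhood, which is what makes this kind of filter such a good fit for one thread per pixel.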

Friday, 1 August 2008

Compiler optimizations

I've noticed when comparing assembly outputs that, despite advances in compiler optimizations, compilers still don't optimize dependencies correctly. Register dependencies can really stall a multi stage pipeline, so it is well worth hand optimizing these in speed critical sections. (a certain ray tracer springs to mind :p )