Thursday 31 July 2008


cudart is my very primitive implementation of a ray tracer on a CUDA-enabled device. At the moment crudrt would be a more appropriate name :p

It still needs stacks of work and optimizing. There are a few flaws in the way I've converted the traditionally recursive ray tracing algorithms, but at least it more or less works.
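For anyone wondering what "converting a recursive algorithm" can look like, here is a minimal sketch in Python rather than CUDA. It is not the cudart code. It assumes a toy model in which each bounce along a ray is described only by a local brightness and a reflectivity, and the names shade_recursive and shade_iterative are made up for illustration. The idea is to replace the recursive call with a loop that carries a running weight.

```python
# Toy model (an assumption for illustration): a ray's path is a list of
# (local_brightness, reflectivity) pairs, one entry per surface it bounces off.

def shade_recursive(bounces, depth=0, max_depth=4):
    """Classic recursive form: local term plus weighted result of the next bounce."""
    if depth >= max_depth or depth >= len(bounces):
        return 0.0
    local, reflectivity = bounces[depth]
    return local + reflectivity * shade_recursive(bounces, depth + 1, max_depth)

def shade_iterative(bounces, max_depth=4):
    """Loop form: carry the product of reflectivities seen so far as a weight."""
    total = 0.0
    weight = 1.0
    for depth in range(min(max_depth, len(bounces))):
        local, reflectivity = bounces[depth]
        total += weight * local   # contribution of this bounce to the pixel
        weight *= reflectivity    # later bounces are attenuated further
        if weight == 0.0:         # nothing beyond this point can contribute
            break
    return total

if __name__ == "__main__":
    path = [(0.5, 0.5), (0.25, 0.5), (1.0, 0.0)]
    print(shade_recursive(path), shade_iterative(path))
```

This only covers a single chain of reflections. Once a hit spawns both a reflected and a refracted ray, a loop with one running weight is no longer enough and some kind of explicit stack or queue of pending rays is needed.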

Here is the very first output from it (scaled down from the original size) showing a ground plane, three triangles and some lights. No shading or reflections / refractions yet - but coming soon.

[Image: First Output from cudart]

CUDA occupancy calculator

D'oh! After all my mucking around with performance calculations related to register, shared mem and global mem usage, I discovered "CUDA_Occupancy_calculator.xls" lurking in the tools directory, which does it all for you. It's even mentioned in the docs ... another D'oh!

I don't feel it was a complete waste of time as I now understand the inner workings of the multiprocessors a lot better.

If you do want to use the spreadsheet, don't forget to compile with the -cubin option, which will tell you the register usage, shared mem usage etc.
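To give a feel for the kind of arithmetic the spreadsheet does, here is a rough sketch in Python. It is a simplification and not a replacement for the spreadsheet: it ignores register and shared memory allocation granularity, and the per-multiprocessor limits are the ones I believe apply to compute capability 1.0 / 1.1 parts such as the 8800GT. The names SMLimits and estimate_occupancy are made up for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SMLimits:
    # Assumed per-multiprocessor limits for compute capability 1.0 / 1.1.
    # Check these against the spreadsheet for your own card.
    registers: int = 8192
    shared_mem_bytes: int = 16384
    max_threads: int = 768
    max_blocks: int = 8
    warp_size: int = 32

def estimate_occupancy(threads_per_block, regs_per_thread, smem_per_block,
                       limits=SMLimits()):
    """Return (resident blocks per multiprocessor, occupancy between 0 and 1).

    Simplified: no allocation granularity, so the real figure can be lower.
    """
    # Threads are scheduled in whole warps, so round the block size up.
    warps_per_block = -(-threads_per_block // limits.warp_size)
    max_warps = limits.max_threads // limits.warp_size

    # How many blocks fit under each individual limit.
    by_threads = max_warps // warps_per_block
    regs_per_block = regs_per_thread * warps_per_block * limits.warp_size
    by_regs = limits.registers // regs_per_block if regs_per_block else limits.max_blocks
    by_smem = limits.shared_mem_bytes // smem_per_block if smem_per_block else limits.max_blocks

    # The tightest limit decides how many blocks are resident at once.
    blocks = min(limits.max_blocks, by_threads, by_regs, by_smem)
    return blocks, blocks * warps_per_block / max_warps

if __name__ == "__main__":
    # Figures as you would read them from the -cubin output (made-up values).
    print(estimate_occupancy(threads_per_block=256, regs_per_thread=10, smem_per_block=4096))
    print(estimate_occupancy(threads_per_block=256, regs_per_thread=16, smem_per_block=4096))
```

Under these assumptions, going from 10 to 16 registers per thread with 256-thread blocks drops the number of resident blocks from 3 to 2, which is the sort of cliff the spreadsheet makes easy to spot.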

BV2 has a new home

After our previous hosting provider effectively changed our contract to 1/10 of the previous bandwidth, and then last month changed its billing procedures, making it effectively impossible to pay them, I decided to switch to a new host.

The new providers had the server up and going exactly when they said it would be and, completely untrue to form, all the backups of the old server worked. MySQL is now much easier to back up and restore using their management tools than it used to be. Admittedly, I last moved servers years ago.

The new host is here in the UK, so the latency will be about 1/4 of what it used to be to Canada. I will post a review of them here after a few weeks.

Friday 25 July 2008

WARP power 9 not the quickest

I've been playing with GPU coding and especially CUDA for quite a while now and thought I'd start posting my findings here. NVIDIA have done a great job with CUDA and their GPU families and I thoroughly recommend their products. I'm currently running an 8800GT as my coding board and a 7900GTX as my display card.

I was optimizing my grayscale kernel last night. It's quite an interesting one to optimize as the calculation it performs is trivial. There is no warp divergence at all as there are no conditionals in the kernel. But it depends heavily on global reads and writes.

XBox360 Game Demos

Recently I've taken to downloading the demo for any upcoming XBox game that I am even vaguely considering purchasing. I would now go so far as to say that I won't buy any game unless I've played the demo.

The try-before-you-buy approach is a good one, but I can't help feeling that the demos I have tried recently are letting the game studios and the gamers down.

Wednesday 23 July 2008


I have recently been implementing a highly parallel bitmap grayscale kernel as part of another project (more on this later).

Interestingly, almost all sources on the web, including Wikipedia, still quote the following formula:

Y = (r * 0.3) + (g * 0.59) + (b * 0.11)

This is the correct technique for older displays. Improvements in display technology (LCD, plasma etc.) mean this is no longer the case.

The more correct form for modern displays is:

Y =  (r * 0.2125) + (g * 0.7154) + (b * 0.0721)

As soon as I find the link to the paper / author that conducted this research I will post it here as it is not immediately findable on

The two equations produce markedly different results, with the second one producing a much fuller / smoother luminance range on my monitors.
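To make the difference concrete, here is a small sketch in plain Python (not the CUDA kernel itself) that applies both sets of weights to a few sample colours. The function and constant names are made up for this example, and rounding to 8-bit values is my assumption about how the result would be stored. As far as I can tell, the first set is close to the ITU-R BT.601 luma coefficients and the second is close to the BT.709 ones.

```python
# Weights exactly as quoted in the two equations above.
OLD_WEIGHTS = (0.3, 0.59, 0.11)
NEW_WEIGHTS = (0.2125, 0.7154, 0.0721)

def luminance(r, g, b, weights):
    """Weighted sum of the three channels; inputs are in the range 0..255."""
    wr, wg, wb = weights
    return r * wr + g * wg + b * wb

def to_gray(pixels, weights):
    """Convert a list of (r, g, b) tuples to rounded 8-bit gray levels."""
    return [min(255, int(round(luminance(r, g, b, weights)))) for r, g, b in pixels]

if __name__ == "__main__":
    samples = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (128, 128, 128)]
    for name, weights in (("old", OLD_WEIGHTS), ("new", NEW_WEIGHTS)):
        print(name, to_gray(samples, weights))
```

Both sets of weights sum to 1, so neutral grays come out the same either way. The difference shows up in saturated colours: the second set weights green noticeably more and red and blue less.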

I wonder how many legacy (or new) video and image editing software suites still use the older version? With the proliferation of LCD displays it's well worth a quick update.

First Post

With so many ideas and projects on the go I thought it's about time I started a blog or website. The site as you see it now is based on WordPress - thanks for a great piece of software! In the next few days (more likely weeks...) I'll be adding more stuff.

In the meantime have a look at the projects and code section to see what's on the go. (The careful observer will note that neither of these pages exists yet either....)