Before everyone gets really upset with the rest of this post, as is the trend in the OO community... I thought I'd start, rather than end, with a disclaimer: I use C++ and the STL on a daily basis in my job. Although I don't use everything the STL has to offer, it does make coding in C++ much easier. C++ itself allows fairly elegant code (if constructed carefully) whilst providing a decent level of performance. So I do actually like C++ and the STL, and they make my life at work much better :)
But this blog isn't about my day job.... It's about my tinkering with the wonderful world of parallel algorithms and CUDA code.
What a lot of people don't realize is that you *can* use the STL, C++ classes and templates in a .cu file. As long as it's client-side (host) code you should be fine. I've had a few compiler crashes when using the STL, especially the sort. To sort this out I overloaded the < operator in the class being sorted; don't try to define a custom < method instead, as it will crash the compiler.
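As a rough illustration of the workaround, here is a minimal host-side sketch in which the ordering lives in the class via an overloaded `operator<`, so `std::sort` is called without any separate comparison function. The `SceneObject` type, its fields and `sortByDistance` are made up for this example and are not taken from cudart.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical host-side record; the name and fields are illustrative only.
struct SceneObject {
    int   id;
    float distance;  // e.g. distance from the camera

    // The ordering is defined inside the class, so std::sort needs no
    // extra comparator argument.
    bool operator<(const SceneObject& other) const {
        return distance < other.distance;
    }
};

// Sorts nearest-first using SceneObject::operator<.
inline void sortByDistance(std::vector<SceneObject>& objects) {
    std::sort(objects.begin(), objects.end());
}
```

Calling `sortByDistance` on a vector of such objects reorders them by ascending `distance`. Whether a given CUDA toolkit version accepts other forms of comparison is something to test against your own compiler.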
Wednesday, 28 October 2009
Thursday, 16 April 2009
Oil and Gas
Oil and Gas: What a fascinating industry! With all the different technologies and processes involved there is so much to learn.
Why the sudden interest, you may ask? Well, I may have an opportunity to work for a company that supplies software to the oil and gas sector, in a part of the UK in which we would very much like to live, and doing a job with such a wide variety of interwoven fields is rather appealing. As I have no oil industry experience I have been reading and googling as much as possible.
It seems like the software and technologies required revolve around: 3D visualization and analysis of volumetric data (seismic), construction of data based on field measurements, Geographic Information Systems, GPS, fluid / gas flow simulations in porous media or in hollow volumes, visualization of the structures (rigs etc), drilling planning and plotting. And I'm sure there are even more things to discover.
Monday, 9 March 2009
Qt Revisited
I managed to find some time to play with Qt this weekend. Firstly, I am rather impressed by the new IDE: while not quite as good as Eclipse in terms of editing, it seems more stable and uses FAR less memory. It is definitely more user-friendly than my Visual Studio 2003 and has a nice auto-complete feature that actually seems to work.
The new widgets and libraries are great, if a little large, and it's pretty quick to whip something together. A nice touch in the IDE is the ability to just drag and drop signals to slots right on the visual layout of the form.
Labels:
AIR,
cudart (ray tracing),
Development,
Flex,
Projects,
Qt,
Rendering
Friday, 6 March 2009
Qt 4.5
As my raytracer is getting more and more complex, with lots of dynamic options and movement controls, trying to remember keypresses from a command window to turn things on and off is getting increasingly difficult.
The time has come to build a proper interface for the rendering window. As I don't want to spend too much time developing this from scratch, I thought I'd have another look at Qt (pronounced cute). It's been quite a while and a good few versions since I last used it, and I am pleasantly surprised at the improvements and additions. As it offers an OpenGL widget / rendering, it is the perfect library to use for my raytracer. Using OpenGL and Qt still gives me the option of going cross-platform with the raytracer, and as the memory handling / page-locked memory is faster on Linux this could give a few more fps :)
In short, if you are looking at a C++ friendly interface designer you could do a lot worse than using the new Qt. (http://www.qtsoftware.com/)
Thursday, 26 February 2009
Voxels, Dynamic Grids, KD Trees
During the course of developing my raytracer I've investigated various acceleration structures and made comments on them in previous posts.
My primary focus has been ray tracing of dynamic scenes, where you need to re-calculate the acceleration structure for each frame. As a result I ruled out KD-Trees (even SAH partitioned ones) early on and concentrated on methods which could be implemented well in parallel. The voxel method with fixed maximum object counts served me very well in early implementations. It is very quick to compute in parallel and its memory accesses on traversal are very kind to a CUDA implementation. However, as the number of objects in my scenes has increased, this fixed max element voxel structure has started to cause problems. It is possible for the number of elements to exceed the maximum in a cell, resulting in visual artifacts / lost objects. Increasing the maximum elements per cell stops objects being lost but slows down the raytracing stage too much. To overcome this limitation I increased the number of cells in the grid. Again this works, but it slows down the raytracing due to the higher number of voxels needing to be traversed. I was using a mesh of slightly overlapping spheres as they are much quicker to run intersection tests against than the standard cube approach.
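To make the overflow problem concrete, here is a small CPU-side sketch of a uniform grid whose cells hold at most a fixed number of object ids. It is an assumption-laden illustration, not the cudart data structure: the cap `kMaxPerCell`, the `Grid` and `Cell` types, and the choice to bin each object by a single point are all invented for the example.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Illustrative cap; the post does not give the real value.
constexpr std::size_t kMaxPerCell = 4;

struct Cell {
    std::array<int, kMaxPerCell> objectIds{};
    std::size_t count = 0;
};

struct Grid {
    int dim;                  // cells per axis
    float cellSize;           // edge length of one cell
    std::vector<Cell> cells;  // dim * dim * dim cells, flattened

    Grid(int d, float size)
        : dim(d), cellSize(size),
          cells(static_cast<std::size_t>(d) * d * d) {}

    // Flattened index of the cell containing (x, y, z), clamped to the grid.
    std::size_t cellIndex(float x, float y, float z) const {
        auto clampAxis = [this](float v) -> std::size_t {
            int i = static_cast<int>(v / cellSize);
            if (i < 0) i = 0;
            if (i >= dim) i = dim - 1;
            return static_cast<std::size_t>(i);
        };
        const std::size_t n = static_cast<std::size_t>(dim);
        return (clampAxis(z) * n + clampAxis(y)) * n + clampAxis(x);
    }

    // Returns false when the cell is already full: the object is dropped,
    // which is the "lost objects" artifact described above.
    bool insert(int objectId, float x, float y, float z) {
        Cell& c = cells[cellIndex(x, y, z)];
        if (c.count == kMaxPerCell) return false;
        c.objectIds[c.count++] = objectId;
        return true;
    }
};
```

The fixed-size array is what makes the layout predictable for parallel builds and traversal, and it is also exactly where objects get lost once a cell fills up. Raising `kMaxPerCell` or `dim` avoids the loss at the cost of more work per ray, which matches the trade-off described in the post.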
Thursday, 9 October 2008
Finally an update!
What with work / projects / email / Spore, updating the web site has definitely taken a back seat recently. The good news is that I have answered every email I've been sent, so if you haven't received a reply the chances are my spam filter ate it - just resend or post a comment here if this is the case.
On the project front - have a look at this screen shot...
[caption id="attachment_299" align="aligncenter" width="286" caption="721 Spheres and 2 lights with 1 level of reflection at +- 14fps"]
[/caption]
I am actually really happy with this, as having 8000 objects on the screen only drops the frame rate to +-9 fps. I can still make a lot of optimizations to it but the basic algorithm seems to be working well. Soon it will be time to switch back to triangle meshes and render something useful :) I've recorded a short video of it which is available here. Unfortunately the screen capture application I use to generate the recording dropped the frame rate by +-6 fps... At the start of the recording I turn reflections on and you will see the (already diminished) frame rate drop accordingly. Note that the camera isn't moving - all the objects are moving.
Tuesday, 23 September 2008
Various updates and a challenge
The last week and weekend have been really busy, with guests staying at our house, so unfortunately not much got done in the way of coding. Still, it's always good to catch up with old friends. You know you are getting old when you realize that you have had some of your workshop tools and friends longer than not having them....
If you have sent an email in the last week and have not received a reply, please be patient; I do try to answer each one.
Friday, 12 September 2008
New Downloads
Please visit the cudart downloads link to get the *hopefully* bug fixed versions of cudart for CUDA v1.1 and v2.0. The v2.0 one also includes the viewport bug fix.
Although the previous versions worked fine on the majority of systems, there were issues with any CUDA 2.0 toolkit install / new ForceWare drivers. Hopefully these are now sorted out. Please make sure you extract all the files from the zips before you run the .exe's.
Thanks again to Audun for his help on this one. It's nice to see that there are still people out there who know how to use a debugger in native mode.
Thursday, 11 September 2008
Bug Fixes
While working on my much talked about acceleration structure I have kept noticing artifacts and distortions as soon as it is applied to the scene. This has had me rather stumped for two days as the code / raw output seemed correct. Finally it occurred to me to check my scene code and in particular my viewport code.
As it turns out, my viewport code has been flawed since I converted from purely triangles to include spheres. Here is a new screenshot showing the improvement in the image quality now that it's been fixed. You will also notice that you can clearly see the spiral reflected in the sphere as well as the red light that is behind the viewer (you have to look closely for this one). The image is scaled down from the original size in order to save a bit of bandwidth - lots of traffic from the Nvidia site :)
[caption id="attachment_274" align="alignnone" width="403" caption="After bug fix of viewport code"]
[/caption]
I have yet to fix the downloadable version (downloads section) but will do so tonight.
Wednesday, 10 September 2008
Acceleration of ray tracers on CUDA 1.1 devices
As those of you who follow the blog will know, I have been working on an acceleration mechanism for my raytracer over the past week.
As it turns out, converting my large stack of A4 sheets of diagrams and equations into proper code has not been as easy as I had hoped. With about a third of it implemented I am getting no speed up whatsoever on cudart. In fact I have lost about 2 fps. This is largely to be expected as I am pretty much hitting the limits of what my card can do.
Monday, 8 September 2008
Cudart on Nvidia's CUDA page
After noticing a bit of a spike in traffic to the webserver, I see that Nvidia added cudart to the CUDA page on their site.
Although the application is fairly small, the videos are quite big, so I'm hoping my allocated bandwidth will be enough.
Almost inevitably there has been a reported problem with the application. Thanks to Audun, who posted a comment. My development PC (AMD 4400 with Windows XP) is currently running an 8800GT with 512 MB and a 7800GTX with a monitor attached. All calculations are done on the 8800GT and the display is done on the 7800GTX. Initially I thought that this was the problem, as most people run their displays on the same card. A quick fiddle around and swapping the monitor over has ruled this out - it's working perfectly.
Friday, 5 September 2008
Ray Tracing Acceleration
There are a number of ways to accelerate ray tracing, which mostly fall into two categories: (a) reduce the number of rays we trace; (b) decrease the number of intersection tests.
Of course you can also increase the efficiency of your ray / object intersections but as that problem is largely solved and optimized it can be ignored.
For (a) a subsampling method is often used which operates on the primary rays. The problem is that it can produce artifacts especially with lots of small objects.
For (b) a variety of space partitioning methods exist which reduce the number of objects a ray is intersected with, the most popular being the kd tree.
Friday, 29 August 2008
4 more fps
4 more fps! ... well, on average anyway :) So it's now running at 25-30 fps continually. Last night I did a small update to optimize the number of threads. I also added more colours to the spheres, and you can now press 'R' to get some very primitive physics (bouncing).
The download link is the same as the previous post but inside the zip is a version 0.02.
This is still without my new acceleration structures. They are taking a while to implement so as to maximize the efficiency in the way the GPU wants to read the memory. I'm hoping for at least a 2x speed up.
Wednesday, 27 August 2008
Cudart spheres: movie and download
Here is a sample movie of the spheres in cudart. The average fps code is not quite correct but gives a good idea of the current speed of the system: well over 25 times the CPU implementation now.
Cudart Spheres movie
Here is the promised download - note it has only been tested on an 8800GT with 512 MB of memory. Any problems / comments / suggestions, please let me know.
cudart-sp-0.01
Again this one is only doing untextured spheres with 2 light sources.
Tuesday, 26 August 2008
Spheres
For whatever reason spheres in an image seem to appeal to people in general a lot more than a collection of big triangles, no matter how well shaded / textured and lit they are... So to get a bit of "ooooo" and "aaaaah" factor going I decided to add spheres to cudart over the weekend.
Of course, as well as looking better than a triangle, spheres are much quicker to intersect. So even with 111 objects in the scene and 2 levels of reflection it is still possible to get close to 20 fps at 800x800. I have just finished an interactive demo version which I will make available for download here as soon as I've sorted out the license agreement. In the meantime here are some screenshots.
[gallery]
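For readers wondering why spheres are cheap, the standard ray / sphere test comes down to solving one quadratic. The sketch below is the textbook version for a normalized ray direction, written as plain C++ for clarity; it is not the cudart kernel, and the `Vec3` type and function names are assumptions made for the example.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Returns the nearest positive hit distance along the ray, or -1 on a miss.
// 'dir' is assumed to be normalized, which reduces the quadratic
// |origin + t*dir - centre|^2 = radius^2 to t^2 + 2*b*t + c = 0.
inline float intersectSphere(Vec3 origin, Vec3 dir, Vec3 centre, float radius) {
    const Vec3 oc = sub(origin, centre);
    const float b = dot(oc, dir);
    const float c = dot(oc, oc) - radius * radius;
    const float disc = b * b - c;
    if (disc < 0.0f) return -1.0f;      // ray misses the sphere

    const float root = std::sqrt(disc);
    float t = -b - root;                // nearer intersection
    if (t > 0.0f) return t;
    t = -b + root;                      // ray starts inside the sphere
    return t > 0.0f ? t : -1.0f;
}
```

The whole test is two dot products, a comparison and at most one square root, which is noticeably less arithmetic than a typical ray / triangle test.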
Friday, 22 August 2008
Texturing
As mentioned in earlier posts I have been calculating u,v coordinates for every triangle intersection. I have avoided actually texturing my triangles as I wanted to get most of the other things working well first - I find textures tend to disguise shading / intersection glitches.
Last night I decided to implement some basic texturing and see how much longer the scene took to render. So I added the Lena image to my modified Cornell box - if you are going to be a cliche you may as well go all the way :) Screenshot with texturing is below.
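As a hedged sketch of what basic texturing from u,v coordinates can look like: assuming (u, v) are barycentric weights for the second and third vertices of the hit triangle, they can blend per-vertex texture coordinates, which are then used for a nearest-neighbour lookup. The post does not say how cudart does this, so the types and function names below are purely illustrative.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Texel { unsigned char r, g, b; };

struct Texture {
    int width, height;
    std::vector<Texel> pixels;  // row-major, width * height entries
};

struct TexCoord { float s, t; };

// Blends per-vertex texture coordinates using barycentric (u, v);
// the first vertex gets the remaining weight 1 - u - v.
inline TexCoord interpolate(TexCoord a, TexCoord b, TexCoord c, float u, float v) {
    const float w = 1.0f - u - v;
    return {w * a.s + u * b.s + v * c.s, w * a.t + u * b.t + v * c.t};
}

// Nearest-neighbour lookup; coordinates outside [0, 1] are clamped to the edge.
inline Texel sample(const Texture& tex, TexCoord tc) {
    int x = static_cast<int>(tc.s * static_cast<float>(tex.width));
    int y = static_cast<int>(tc.t * static_cast<float>(tex.height));
    x = std::min(std::max(x, 0), tex.width - 1);
    y = std::min(std::max(y, 0), tex.height - 1);
    return tex.pixels[static_cast<std::size_t>(y) * tex.width + x];
}
```

The extra cost per hit is a handful of multiplies and one memory read, so the render-time impact depends mostly on how well those reads are cached.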
Thursday, 21 August 2008
Modern art (or when raytracing goes wrong) and some timings...
I wanted to get some proper timing figures before I implemented reflections in order to tell how badly (or well) my reflection code is performing. So I removed one or two small optimizations which would affect the timings based on the scene layout and got this:
[caption id="attachment_186" align="aligncenter" width="300" caption="When raytracing goes wrong"]
[/caption]
If I need something to fall back onto - modern art might just be my thing :p
Wednesday, 20 August 2008
Speed update
To prevent confusion over my timing figures, here is how I calculate them: as my target frame rate is 24 fps, I base my timings on creating a full frame from scratch every cycle and doing that 24 times to get my timing. Best demonstrated with pseudocode:
setup stuff and create scene objects
start timing
for framecount 1 to 25 (25 instead of 24 to allow for some overhead later)
    generate frame
stop timing
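The pseudocode above maps fairly directly onto a small C++ helper. This is a sketch under assumptions: `timeFrames` and the `generateFrame` callback are invented names, and it uses `std::chrono` from standard C++ rather than whatever timer the original code used.

```cpp
#include <chrono>

// Calls generateFrame() frameCount times and returns the elapsed
// wall-clock time in seconds. Scene setup is expected to happen
// before this function is called, so it is not included in the timing.
template <typename GenerateFrame>
double timeFrames(int frameCount, GenerateFrame generateFrame) {
    using Clock = std::chrono::steady_clock;
    const auto start = Clock::now();
    for (int frame = 0; frame < frameCount; ++frame) {
        generateFrame();
    }
    const auto stop = Clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

With a frame count of 25, a result under one second would mean the 24 fps target is met with one frame of headroom. If the frame is produced by asynchronous GPU work, that work would need to be finished inside `generateFrame` for the measurement to be meaningful.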
I've now set up my scene with 100 objects (the goal amount) and one light source (the goal is 2). I've enabled lighting and shadows but as yet no colour and reflections. (screenshot below)
Tuesday, 19 August 2008
Optimizations
My goal for cudart is to run 24 frames per second at 800*600 with a decent object and light count. Decent is a bit hard to define but for now 100 objects and 2 light sources would make me pretty happy.
Most of the screenshots until now have been only partially computed on the GPU, as I have been trying to develop my algorithms. I've found it is very easy to fall into the trap of making a photo-realistic renderer, and as a result the earlier screenshots you see are running at about 1 second a frame - i.e. 1 fps... So although they might look impressive they are hardly running in realtime. Over the last few days I have been converting everything to run on the GPU and here is the first screenshot.
[caption id="attachment_173" align="aligncenter" width="300" caption="First purely GPU output"]
[/caption]
Friday, 15 August 2008
Managing scene objects
After a small change to the lighting (again...) I'm fairly happy with the basic ray / light model now. I still need to implement refraction but that can wait a bit. The next task is to work out an efficient way of manipulating the many triangles in the scene. I need to group them into objects (remembering that OO is not supported on CUDA) and make methods to translate / rotate / scale these objects.
[caption id="attachment_162" align="aligncenter" width="300" caption="Yet another lighting change"]
[/caption]
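One way to group triangles without relying on classes with behaviour is to keep a single flat triangle array and describe each object as a range into it, with free functions doing the transforms. The sketch below shows that idea on the host side only; `ObjectRange`, `translate` and `scale` are hypothetical names, and this is not necessarily the layout cudart ended up using.

```cpp
#include <cstddef>
#include <initializer_list>
#include <vector>

struct Vec3 { float x, y, z; };
struct Triangle { Vec3 v0, v1, v2; };

// An "object" is just a contiguous run of triangles in the flat array:
// plain data, no virtual functions, easy to copy to the device as-is.
struct ObjectRange {
    std::size_t first;  // index of the object's first triangle
    std::size_t count;  // number of triangles in the object
};

// Moves every vertex of the object by offset d.
inline void translate(std::vector<Triangle>& tris, ObjectRange obj, Vec3 d) {
    for (std::size_t i = obj.first; i < obj.first + obj.count; ++i) {
        Triangle& t = tris[i];
        for (Vec3* v : {&t.v0, &t.v1, &t.v2}) {
            v->x += d.x; v->y += d.y; v->z += d.z;
        }
    }
}

// Scales every vertex of the object uniformly about the world origin.
inline void scale(std::vector<Triangle>& tris, ObjectRange obj, float s) {
    for (std::size_t i = obj.first; i < obj.first + obj.count; ++i) {
        Triangle& t = tris[i];
        for (Vec3* v : {&t.v0, &t.v1, &t.v2}) {
            v->x *= s; v->y *= s; v->z *= s;
        }
    }
}
```

Rotation would follow the same pattern with a 3x3 matrix applied per vertex. Because each triangle is updated independently, the loops also translate naturally into one GPU thread per triangle.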