The last week and weekend have been really busy with guests staying at our house, so unfortunately not much has been done in the way of coding. Still, it's always good to catch up with old friends. You know you are getting old when you realize that you have had some of your workshop tools and friends longer than not having them...
If you have sent an email in the last week and have not received a reply, please be patient; I do try to answer each one.
When showing off the as-yet-unreleased version to my friends yesterday I noticed that some of the second order reflections have gone missing. It turns out they are also missing from version 2; it's surprising no one has noticed. They are being calculated, but a typo in the code means they get the incorrect direction vector... oops.
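For anyone wondering why one wrong direction vector makes a reflection vanish, here is a minimal sketch of the standard mirror-reflection formula, r = d - 2(d·n)n. It is written in Python for readability and is not the actual kernel code; the helper names `dot` and `reflect` are made up for illustration, and the surface normal is assumed to be unit length.

```python
def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def reflect(d, n):
    """Reflect incoming direction d about unit normal n: r = d - 2 (d . n) n."""
    k = 2.0 * dot(d, n)
    return tuple(di - k * ni for di, ni in zip(d, n))

# First bounce: a ray heading down and to the right hits a floor facing up.
first = reflect((1.0, -1.0, 0.0), (0.0, 1.0, 0.0))
# Second bounce: the reflected ray then hits a wall facing left.
second = reflect(first, (-1.0, 0.0, 0.0))
```

A second order reflection is just this function applied again to the already reflected ray, so if the second call is fed the wrong vector the result is still computed but points somewhere useless.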
I did eventually get my acceleration structures working, but they have actually slowed it down a lot, so it's partially back to the drawing board. Random memory accesses are really bad for performance, and even if you use texture memory the cache is rather small and relies on locality. I am in the process of reworking my algorithm to make the best use of memory accesses again. The kernel's register usage also climbed to over 47, so it was re-optimized down to 36. At the moment there is basically no CPU code in cudart (apart from mem copies). I may offload some of the acceleration structure creation to the CPU, else it just sits there being lazy...
I have had quite a few requests for source code or an API. I am currently not keen on releasing source code for a number of reasons. This is very much a work in progress and I don't want to get into a whole maintenance nightmare of handling help requests on source from my trunk. Lots of the optimizations are also very specific at the moment. And the biggest reason is that I may actually do something with it in the long term. I know of lots of people who have been badly stung by GPL-type licenses, so that will never be an option for it.
As to releasing an API, this is something I am considering, but it needs a lot more work and testing. A lot of the specific code would have to be changed to be more generic. There would probably need to be two versions: one for game / demo creation and one for commercial / CAD / design raytracing where accuracy is an issue.
I was looking through some of my old backups last night in order to update my archives section and found my old Sudoku game solver. There is a story behind this and it will make the subject of another post.
I think Sudoku will make a very interesting CUDA challenge. Parts of the game can be very parallel (at least down to row/col/block level) and other parts (trying possibilities) very recursive. I am thinking of making a fun challenge where CPU implementations compete against CUDA implementations to solve 100k or 1 million Sudoku puzzles from a text file.
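To make the parallel / recursive split concrete, here is a minimal CPU-side sketch in Python. It is not the old solver mentioned above and not a CUDA implementation; the function names and the one-puzzle-per-line format (81 characters, with `0` or `.` for an empty cell) are assumptions for illustration. The candidate check per cell only reads its row, column and block, which is the part that could run independently per cell, while `solve` is the recursive trial-and-error part.

```python
def parse(line):
    """Turn an 81-character puzzle line into a list of ints (0 = empty)."""
    return [0 if ch in ".0" else int(ch) for ch in line.strip()]

def candidates(grid, i):
    """Digits still allowed in cell i, judged from its row, column and 3x3 block."""
    r, c = divmod(i, 9)
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used = set(grid[r * 9:(r + 1) * 9])                    # row
    used |= {grid[k * 9 + c] for k in range(9)}            # column
    used |= {grid[(br + dr) * 9 + bc + dc]                 # block
             for dr in range(3) for dc in range(3)}
    return [d for d in range(1, 10) if d not in used]

def solve(grid):
    """Recursive backtracking; fills grid in place and returns True if solved."""
    empties = [i for i, v in enumerate(grid) if v == 0]
    if not empties:
        return True
    # Try the most constrained cell first to keep the search tree small.
    i = min(empties, key=lambda j: len(candidates(grid, j)))
    for d in candidates(grid, i):
        grid[i] = d
        if solve(grid):
            return True
    grid[i] = 0  # undo and backtrack
    return False
```

For a batch of puzzles read from a text file, each line would go through `parse` and then `solve`; on a GPU the obvious first cut would be one puzzle per thread or block, though the recursion would need to be turned into an explicit stack.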
Part of my time this weekend went to Spore. My lovely girlfriend/partner gave it to me as a gift after having me go on about it for a-g-e-s :) It is a pretty fun game, especially for those like myself without lightning fast reflexes. I can't help but think it's not quite as good as some of the proof-of-concept / alpha videos of it. But it is a LOT of fun and well worth buying. I may post more about "Bob" my creature in future :)