Wednesday, 17 December 2008

NVIDIA Tesla Personal Supercomputing

I attended the NVIDIA Tesla personal supercomputing launch on the 4th of December in London. It was really interesting to meet other people using CUDA and GPU-based devices to solve incredibly compute-intensive problems.

What some groups have managed to do in the fields of tomography, CFD, molecular modelling and seismology is truly awe-inspiring, and it's only a matter of time before their work has a real impact on the average person's life. The advances in medical imaging, combined with the power of Tesla, are a prime example.

While I think the 1U Tesla boxes are a good idea for providing compute power in a hosting environment, I am still unsure about having a Tesla personal supercomputer on my desk. Most of the systems shown were running standard motherboards or Intel's two-socket Skulltrail board, into which two or three Tesla boards were fitted along with a Quadro card. The Tesla boards support more memory than a standard consumer GTX 280 and apparently go through stricter quality control. Visually the cooling seems impressive, and even with three of them sandwiched together on a motherboard the heat seemed to be fine (most of the cases had decent fan arrangements).

So why am I unsure about getting one (apart from the obvious fact that I cannot afford one :p )? Well, apart from supporting more memory, the Tesla does not offer anything that a GTX 280 does not.

I was really hoping that the Tesla boards would have an additional feature allowing high-speed communication with neighbouring Tesla cards, letting them bypass the PCI Express bus and host memory. (Admittedly the new i7 chips and their associated DDR3 memory have sped this up somewhat.) When I asked the product manager about the card-to-card bottleneck, he initially said that they have no plans to change the current system as it's the industry standard (PCI Express), which I think is perfectly acceptable for consumer graphics cards. I did explain to him that device-to-device transfers are currently a bit of an issue if you are running different kernels on separate GPUs.
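To make the bottleneck concrete, here is a minimal sketch of what moving a buffer between two boards looks like today: there is no direct card-to-card path, so the data has to be staged through host memory, crossing the PCI Express bus twice. The buffer size and variable names are purely illustrative.

```cuda
// Sketch only: moving data from GPU 0 to GPU 1 via host memory.
// Every byte crosses the PCI Express bus twice (device->host, host->device).
#include <cuda_runtime.h>

int main(void) {
    const size_t N = 1 << 20;                 // 1M floats, illustrative
    const size_t bytes = N * sizeof(float);

    float *host_staging = 0;
    cudaMallocHost((void**)&host_staging, bytes);  // pinned memory for faster PCIe copies

    // Buffer on the first board...
    cudaSetDevice(0);
    float *d_src = 0;
    cudaMalloc((void**)&d_src, bytes);
    // ...first hop: GPU 0 -> host, over PCI Express.
    cudaMemcpy(host_staging, d_src, bytes, cudaMemcpyDeviceToHost);

    // Buffer on the second board...
    cudaSetDevice(1);
    float *d_dst = 0;
    cudaMalloc((void**)&d_dst, bytes);
    // ...second hop: host -> GPU 1, over PCI Express again.
    cudaMemcpy(d_dst, host_staging, bytes, cudaMemcpyHostToDevice);

    cudaFreeHost(host_staging);
    return 0;
}
```

Note that this single-threaded sketch glosses over a further wrinkle: with the current CUDA 2.x runtime, each host thread can drive only one device context, so in a real multi-GPU program the two halves above would live in separate host threads. That only reinforces the point about wanting a dedicated card-to-card path.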

If NVIDIA are aiming these devices at the supercomputing community, I feel this is something worth addressing. After all, Cray and the like spent a huge amount of time working on the communication paths inside their machines. But NVIDIA really is good at listening to its clients and consumers, a fact borne out by my second conversation with the product manager about an hour later. He mentioned that a lot of people had raised the bandwidth issue with him and that it would definitely be something they would look at. Now, I've heard that type of marketing speak many times before and didn't hold too much faith in it, but within a week I heard via some contacts at NVIDIA that they have already been holding informal meetings on how to address the issue. Excellent news! I just may have to start saving all my pennies for a Tesla in the future.

All in all the presentations were good (keep up the amazing work, guys!) and I left feeling very positive about the future of "personal supercomputing". But for now, while a 1U Tesla box is a great idea for a hosting/compute environment, for development it might just be better to stick with a consumer card.


  1. I'll be building a high-performance compute cluster in the near future. To begin with, it will be small (only three nodes), but each node will have an NVIDIA GTX 260 in it. This will most likely run Rocks 5.0 with the CUDA 2.0 "Roll" package.

    Do you figure that you might be interested in using this cluster? I'm thinking of offering its services to the CUDA developer community free of charge, to see if I can't help (in my own little way) move the CUDA movement onward to more great software like the programs you have been releasing....


  2. [...] kindly offered compute time on a 3 x GTX 260 machine for developers in a comment on a previous post (he didn't leave his email address public, so contact me if you are interested and I will [...]