
Thursday, 18 June 2009

Compute Cube

In a previous post I mentioned fitting my Tesla C1060 onto my aging Asus motherboard. It has been working well, but a combination of slow host<->device transfer speeds of less than 1GB/s, only 2GB of RAM and a relatively slow processor encouraged me to upgrade.
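For anyone wanting to check their own numbers, a minimal sketch of a host->device bandwidth test using CUDA events looks something like this (the 256MB pinned buffer is just an assumption for illustration, not the exact test behind the figure above):

// Minimal sketch of a host->device bandwidth test using CUDA events.
// The 256MB buffer size and pinned host memory are assumptions here.
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    const size_t bytes = 256 * 1024 * 1024;   // 256MB test buffer (assumed size)

    void *h_buf, *d_buf;
    cudaMallocHost(&h_buf, bytes);            // pinned host memory for best throughput
    cudaMalloc(&d_buf, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);
    cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);   // elapsed time in milliseconds
    printf("Host->device: %.2f GB/s\n", (bytes / (ms / 1000.0)) / 1e9);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_buf);
    cudaFreeHost(h_buf);
    return 0;
}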

Some of the prerequisites for my new personal super computer were:

a) Must be small - my desk area is limited and I don't like putting computers on the floor where they consume dust better than any known vacuum cleaner...

b) Must have at least 2x PCI Express 2.0 (Gen 2) slots, as for decent GPU computing you need to get data in and out of the device as quickly as possible.

c) As quiet and cool as possible.

As it turns out, the last one was the trickiest and needed the C1060 itself to do a bit of CFD on the airflow in the case.

After a lot of research, measurement and two days of building, here are some pictures of the final result. The case is only 11" x 11" x 14" - OK, it's not exactly a cube... but close enough :) The tape measure in the photos is there to give a sense of scale.

Many thanks to NVidia who very kindly sent me two NVidia Tesla logos for me to stick onto the case!

[Photo gallery: pictures of the finished build]

Thursday, 7 May 2009

Tesla C1060

On Tuesday I received my brand new Tesla C1060 :) right on the day it was promised by Armari.

A quick word about Tesla suppliers... Although Armari was efficient to deal with, some of the others were absolutely terrible - possibly because I am a private individual only buying one unit, or maybe just because their overall service is poor. I won't name them here, as I have already complained and it's only fair to give them a chance to respond...

For those not familiar with the specs, it has 30 multiprocessors giving 240 stream cores, with 4GB of GDDR3 memory on a 512-bit bus. It is a rather large double-slot, 10.5-inch-long PCI Express card.
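If you want to confirm those numbers on your own card, a quick query of the device properties does the job - a minimal sketch (device 0 is assumed to be the Tesla):

// Sketch: confirming the specs above via the CUDA runtime API.
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);        // device 0 assumed to be the Tesla

    printf("Name:            %s\n", prop.name);
    printf("Multiprocessors: %d\n", prop.multiProcessorCount);      // 30 on a C1060
    printf("Global memory:   %zu MB\n", prop.totalGlobalMem >> 20); // ~4096 MB
    return 0;
}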

Wednesday, 22 April 2009

CUDA Processing of Large Data Sets

As those who follow this blog will know, I have been playing with large generated volumetric data sets (around 16GB). The first operation I performed on them was a 3D Gaussian convolution in order to smooth out a lot of the randomness. My first implementation of the kernel was just the naive one: take a point and look at all the surrounding points. As I'm using a 5x5x5 kernel, this means 125 memory reads per point - very poor!
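For illustration, a minimal sketch of that naive approach looks something like the kernel below. The names, the flat x-y-z memory layout and the clamped borders are my assumptions here, not the actual implementation - the point is simply that every output voxel pulls all 125 neighbours straight from global memory.

// Naive 5x5x5 convolution sketch: 125 global reads per output voxel.
// The grid is kept 2D (x,y) with a loop over z, which the C1060's
// compute capability 1.3 requires anyway.
__global__ void naiveGauss3D(const float *in, float *out, const float *coeff,
                             int nx, int ny, int nz)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= nx || y >= ny) return;

    // each thread walks a full column in z
    for (int z = 0; z < nz; ++z) {
        float sum = 0.0f;
        for (int dz = -2; dz <= 2; ++dz)
            for (int dy = -2; dy <= 2; ++dy)
                for (int dx = -2; dx <= 2; ++dx) {
                    int sx = min(max(x + dx, 0), nx - 1);   // clamp at the volume edges
                    int sy = min(max(y + dy, 0), ny - 1);
                    int sz = min(max(z + dz, 0), nz - 1);
                    // one global read per tap: 125 reads per output voxel
                    sum += coeff[((dz + 2) * 5 + (dy + 2)) * 5 + (dx + 2)]
                         * in[((size_t)sz * ny + sy) * nx + sx];
                }
        out[((size_t)z * ny + y) * nx + x] = sum;
    }
}

The obvious next steps (which this sketch deliberately ignores) are staging tiles in shared memory and splitting the Gaussian into separable 1D passes, which cuts the reads per point dramatically.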