
MAGI - Linux Cluster Supercomputer for Speech Recognition Group at CTU Prague



June 2003: SGE+bproc Microhowto


Most things below were written back in 1997. I keep them here for the historical record...

A revolution in supercomputing is happening right now. All the workstation manufacturers agree with this and claim that their workstations have overtaken big supercomputers, but today's reality goes further: workstations themselves are threatened by commodity PCs.

No workstation manufacturer likes to hear this said aloud. Their advertising works on the minds of decision makers with the myth of the workstation as a status symbol, quite different from what poor PC users have. With a slight touch of hysteria, they all claim that their workstations offer the best price/performance ratio, and they all avoid numbers that would allow a direct comparison, even with PCs from their own production.

Of course, the price/performance ratio is a rather vague quantity, but the vast majority of comparisons one could devise lead to the finding that the ratio for today's top PCs beats that of any workstation by a factor of somewhere between three and ten.

The potential of PCs may still be hard to use in business and industry, but it can certainly be exploited in a research environment. PC clusters are being used for supercomputing at many sites, for example at NASA Goddard Space Flight Center (Beowulf and Hrothgar and The Hive), Los Alamos National Laboratory (Loki), Drexel University (DragonWulf), CACR at Caltech (Hyglac and Naegling), a high energy physics lab in Germany (Hermes), Ames Laboratory, SCL (Alice), Sandia National Laboratory, Livermore, California (DAISy and Megalon), Duke University - physics (Brahma), Clemson University (Grendel), University of California at Irvine (Aeneas cluster), George Washington University PACET, University of Southern Queensland, FOES (Topcat), NIST, Maths and CS Div. (JazzNet), Max-Planck-Institut für Informatik (Orwell), University of Mannheim (Trumpf), UMIST Manchester (MadDog), Kasetsart University, Thailand (Smile) and many others.

Explain the Advantages of PC Clusters to Your Boss...

Maybe you are just working on a grant application and need some reasons why they should give you money to buy something like MAGI. Or maybe you are about to meet your boss (a quite respectable and nice person, but too busy to keep track of the latest technology). Or maybe YOU are the boss :)

In any case, a compact explanation of the basic advantages is needed. LOOK HERE FOR THE ARGUMENTATION

Our design objectives

Our cluster was designed as a multiuser machine for big speech recognition experiments. Our tasks typically involve big databases and split easily into parallel subtasks, which either work on different parts of the database or repeat similar experiments with different parameters. From the user's perspective, the system should be as close to normal UNIX as possible. Another design objective was to fit the budget of about $27k allocated for this.
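The first kind of split, where each node works on a different part of the database, can be sketched roughly like this. It is only an illustration in Python, not the software running on MAGI: the function name split_round_robin and the utterance file names are made up for the example.

```python
def split_round_robin(items, n_parts):
    """Deal items out to n_parts lists, like cards to players."""
    parts = [[] for _ in range(n_parts)]
    for index, item in enumerate(items):
        parts[index % n_parts].append(item)
    return parts


# Hypothetical database of 20 utterance files, split for 8 nodes.
utterances = ["utt%03d.wav" % i for i in range(20)]
parts = split_round_robin(utterances, 8)

for node, part in enumerate(parts):
    print("node %d: %d files" % (node, len(part)))
```

Dealing the files out one by one keeps the parts within one file of each other in size, so no node gets a much longer list than the others. The second kind of split (same data, different parameters) would deal out parameter settings instead of file names.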

Comparisons we made quickly ruled out all conventional workstations and left a Linux cluster of commodity PCs as the only viable alternative. For the first version of the cluster we also decided to avoid two- or four-processor motherboards; however, this is a possible upgrade path once such hardware and Linux SMP are common enough.

(Here are my Slides presented in September 1997)

Hardware

Each of the 8 nodes consists of:

All the nodes share:

Besides this, there are many cables and cables and cables...

Note that our cluster is designed as a 'cave computer': there is no floppy disk, no CDROM, no big monitor and no mouse. In short, no intimate human contact is expected. The only connections between the system and the rest of the world are two cables: a power cord and a 100 Mbps network cable. Well, you can also move harddisks, a much better medium than a floppy :-)

Software

The Red Hat distribution (4.0 at the moment; you can use a newer one) is the base operating system on the individual nodes. Our own software glues the individual nodes together into one big computer. This includes:

Commercial software to be used includes HTK (the Hidden Markov Model Toolkit), which can be converted to parallel processing with a few simple scripts that replace the original HTK programs, and Matlab, which cannot be converted this way but can at least be wrapped in a script that dedicates the full power of one node to each Matlab process.

Users are expected mainly to use the job spooling system from their own programs. They can also use PVM or normal UNIX techniques such as rsh.

More about software...

What does it look like?

(Photographs of the cluster from the front and from the other side.)

Performance

So far we have measured network bandwidth. Node-to-node communication reaches 81*10^6 bits/sec with the default netperf test. Using netperf -t UDP_STREAM -l 60 -H magi7 -- -s 32000, -S 32000, -m 4096 we got 95.73*10^6 bits/sec.
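To put these two figures in perspective, here is a small back-of-the-envelope calculation in Python. It assumes that the node-to-node links run at the nominal 100 Mbps mentioned above; LINE_RATE and the helper name utilisation are just names chosen for this sketch.

```python
# Assumption: nominal line rate of 100 Mbps, in bits per second.
LINE_RATE = 100e6

# The two netperf results quoted above, in bits per second.
measurements = {
    "default netperf test": 81e6,
    "UDP_STREAM, 4096-byte messages": 95.73e6,
}


def utilisation(bits_per_sec, line_rate=LINE_RATE):
    """Fraction of the nominal line rate that was actually achieved."""
    return bits_per_sec / line_rate


for name, rate in measurements.items():
    print("%s: %.1f%% of line rate, %.2f MB/s"
          % (name, 100 * utilisation(rate), rate / 8 / 1e6))
```

Under that assumption the default test uses about four fifths of the wire, while the tuned UDP test comes within a few percent of the nominal rate.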

Warnings

Think twice before you decide to use 3c905 cards. Driver development has not yet reached a stable state.

Do not forget to set passwords for SuperStack Switch access.

The same harddisk seems to have 255 heads, 63 sectors, 527 cylinders on the primary IDE interface, but 16 heads, 63 sectors, 8400 cylinders when moved to the secondary IDE interface. Do not be alarmed, just fdisk all disks on the primary IDE interface if you plan to move them around. So far I have had no real problems.
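A quick calculation shows why there is no need to be alarmed: the two geometries describe almost exactly the same number of sectors. The sketch below is plain arithmetic in Python; the 512-byte sector size is an assumption (the usual value for IDE disks), and total_sectors is just a helper name for this example.

```python
SECTOR_BYTES = 512  # assumed sector size, the usual value for IDE disks


def total_sectors(heads, sectors_per_track, cylinders):
    """Number of sectors addressable with a given CHS geometry."""
    return heads * sectors_per_track * cylinders


primary = total_sectors(255, 63, 527)     # as seen on the primary interface
secondary = total_sectors(16, 63, 8400)   # as seen on the secondary interface

print("primary:   %d sectors, %.2f GB" % (primary, primary * SECTOR_BYTES / 1e9))
print("secondary: %d sectors, %.2f GB" % (secondary, secondary * SECTOR_BYTES / 1e9))
print("difference: %d sectors" % (secondary - primary))
```

The difference is smaller than one cylinder of the 255-head geometry (255 * 63 = 16065 sectors), so both descriptions cover the same disk of roughly 4.3 GB, just cut into cylinders differently.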

Documentation

HOWTO-like documents: More general documents:

My questions

Is there really nobody else trying to make their Linux Cluster Supercomputer closer to normal UNIX (from the user's viewpoint)? I see things like PVM everywhere. OK, it is good, but there are dozens of little things you (as a Linux guru) could do for your dear users. Do you care or not?

What about /proc exported by NFS? A cool cluster management possibility, or not? It does not work, but might it work one day? (It is so cool that at the moment I COPY some files from /proc to simulate this currently impossible mount. Why invent other interfaces when there is /proc?)
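The copying workaround could look roughly like the following Python sketch. This is not the script actually used on MAGI: the function name snapshot_files, the choice of files and the target directory are all hypothetical. On a real node the target would be a directory that the other machines can see over NFS.

```python
import os
import tempfile


def snapshot_files(names, source_dir, target_dir):
    """Copy the named files from source_dir into target_dir.

    Returns the list of names that were actually copied.
    """
    os.makedirs(target_dir, exist_ok=True)
    copied = []
    for name in names:
        try:
            # Read the content explicitly: files in /proc report size 0,
            # so tools that trust the file size may copy nothing.
            with open(os.path.join(source_dir, name), "rb") as src:
                data = src.read()
        except OSError:
            continue  # file missing or unreadable on this node
        with open(os.path.join(target_dir, name), "wb") as dst:
            dst.write(data)
        copied.append(name)
    return copied


if __name__ == "__main__":
    # Hypothetical target: a temporary directory stands in for the
    # NFS-exported directory a real cluster node would use.
    target = tempfile.mkdtemp(prefix="proc-snapshot-")
    print(snapshot_files(["loadavg", "meminfo", "uptime"], "/proc", target))
```

Run periodically on every node, something like this gives the other machines a slightly stale but ordinary, readable view of each node's state.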

Related people around the world

Donald Becker, author of networking drivers
Robert G. Brown, known on many related mailing lists
Prof. Hank Dietz, author of Parallel Processing using Linux (also here)
Walter B. Ligon III, working at Clemson
Al Geist, PVM technical manager

Copyright

All our special cluster software is (and will be) under the GPL and will be freely available on this server once we have a stable version worth publishing.

Your Comments

You like this page? You dislike this page? Your cluster supercomputer is not listed here? You think your homepage should be listed here? Your name is misspelled here? I want to hear it! Send me your comments.

To be done

Lots of things, of course... This page, for example. Stay tuned...

More links to related pages

Specific info for Czech surfers




This page was created and this WWW server is maintained by Vaclav Hanzl