Supercomputers

2009-September-18

Recently, one of my friends brought up the topic of supercomputing and supercomputers. I forget the exact conversation, but the essence of it was that “supercomputers are cool” and that it would be cool to own one, even if it were a machine that no longer really counts as a supercomputer.

I’ll admit that for a long time I’ve also been interested in higher-end UNIX machines and other off-the-beaten-path systems. NeXT systems, SGI systems, Sun systems, and a variety of other things have always interested me. Admittedly, the specific allure of a “supercomputer” never really seemed too important, at least not before the Power Mac G4 systems, which Apple marketed as a supercomputer on a chip, were introduced.

More recently I’ve become interested in computers that can do really big things, for a variety of reasons. I’ve always been interested in SGI’s NUMAlink architecture, for example, which may or may not qualify as supercomputing. In the supercomputer realm, SGI’s NUMA machines occupy an unusual position in that they run literally all of the same software and utilities that run on SGI’s smallest desktop machines intended for a single local user.

Compared to that, it’s always annoying to hear about traditional supercomputers, and the newer cluster-style systems, because they tend not to run regular software at all. In the case of traditional supercomputers, proprietary interfaces hook the machines up to modern networks, and proprietary shells or menus and custom, in-house software get designed specifically for the machine.

In the case of newer clusters, the operating system is similar, but all of the applications are written specifically for a clustered environment, with specialized routines for getting data to and from separate physical machines, regardless of how they’re physically connected (HIPPI, InfiniBand, or even just gigabit Ethernet). The annoying thing about this is that if you can’t split your task up the right way for a cluster, you simply can’t run your task.
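
Just to illustrate what that looks like in practice, here’s a minimal sketch in the message-passing style most clusters expect, written against MPI (my choice of example here, not necessarily what any particular machine runs): every byte that moves between physical machines has to be shipped explicitly by the program.

    /* Minimal MPI sketch: data that moves between machines must be sent and
     * received explicitly.  Compile with mpicc, run with mpirun. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        double chunk[1024] = {0};

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank == 0) {
            /* Rank 0 hands a chunk of work to every other node. */
            for (int i = 1; i < size; i++)
                MPI_Send(chunk, 1024, MPI_DOUBLE, i, 0, MPI_COMM_WORLD);
        } else {
            /* Every other node explicitly receives its piece of the data. */
            MPI_Recv(chunk, 1024, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        }

        printf("rank %d of %d has its data\n", rank, size);
        MPI_Finalize();
        return 0;
    }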

My personal preference is the SGI style, for a lot of reasons. Larger SGI systems can have multiple physical users hooked up to them; even the smaller deskside units can have up to two or four physical console users, and each user can take as much of the system’s power as they need. If one user is rendering large, complex Maya scenes while the other two or three are merely checking their e-mail, the user running Maya can use most of the memory and run threads on all of the processors.

In a cluster environment, the app would need to have been programmed from the beginning to be aware of what resources it can use, and if other users start doing heavier things on the system, the app in question may not be able to scale itself back easily.
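
To make the contrast concrete, here’s a rough shared-memory sketch using OpenMP (again, purely my own illustration): one process, one address space, and the runtime grabs however many processors happen to be free, with no explicit data movement at all.

    /* Shared-memory sketch: one process, one address space; the runtime
     * decides how many processors to use.  Compile with: cc -fopenmp shared.c */
    #include <omp.h>
    #include <stdio.h>

    #define N (1 << 20)

    int main(void)
    {
        static double data[N];

        /* No explicit data movement: every thread sees the same memory. */
        #pragma omp parallel for
        for (long i = 0; i < N; i++)
            data[i] = i * 0.5;

        printf("used up to %d threads\n", omp_get_max_threads());
        return 0;
    }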

SGI is still selling systems based on NUMA, or Non-Uniform Memory Access, which essentially takes the system bus, pipes it out the back of the system, and hooks it up to additional compute, disk, memory, and I/O "bricks," as in their later Origin/Onyx and now Altix products. However, I don’t know how well they’re selling, or whether NUMA systems are even popular these days.
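
For what that means to a program, here’s a rough sketch against Linux’s libnuma (my own example, not anything SGI-specific): the memory may physically live on different bricks, but it’s all one address space that a single ordinary process can allocate from and touch with plain loads and stores.

    /* NUMA sketch: memory lives on different nodes (bricks), but one process
     * can allocate and touch all of it.  Compile with: cc numa_demo.c -lnuma */
    #include <numa.h>
    #include <stdio.h>

    int main(void)
    {
        if (numa_available() < 0) {
            fprintf(stderr, "no NUMA support on this machine\n");
            return 1;
        }

        int nodes = numa_max_node() + 1;
        printf("%d memory node(s) visible to this process\n", nodes);

        for (int node = 0; node < nodes; node++) {
            /* Ask for memory that physically resides on a specific node. */
            char *buf = numa_alloc_onnode(4096, node);
            if (buf) {
                buf[0] = 1;   /* an ordinary store, no message passing */
                numa_free(buf, 4096);
            }
        }
        return 0;
    }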

SGI does have a system on the TOP500 list, but it doesn’t appear to be a NUMA system; instead it’s described as a cluster based on InfiniBand. That’s annoying, when a NUMA-based system could be faster, lower latency, and easier to program for. Although ultimately what may end up happening is the same thing that happened when the idea of multi-threading single processes or tasks came about: people get better at it, methods become standardized, and eventually operating systems will just do it for us.

It’s possible that NUMA is just dying as a result of the death of anything that’s not x86. Maybe it really is becoming easier to do clusters. Even Cray, a big, recognizable name in “supercomputers” and “mainframes,” has resorted to manufacturing and selling machines with AMD Opteron chips in them.

The other thing I would love to see happen is for Apple or Microsoft to make their OSes work better on larger, scaled-up hardware. I think one of the big reasons traditional big-iron UNIX systems (namely, SGI’s Origin- and Onyx-style systems with that NUMA technology) aren’t so common these days is that the software isn’t scaling very well with them. Solaris has never scaled as well as some of its counterparts like IRIX, and Linux is only where it is today because of SGI.

Right now, my honest suggestion for a path to a solution for this problem, which admittedly doesn’t even exist that much, is for Apple or Microsoft to get to work on their systems. Apple ships Macs with a lot of applications that, even at the consumer level, could seriously benefit from larger machines in a NUMA arrangement, the same way SGI did things in the past. In Microsoft’s case, while Windows Live Messenger is a little bit of a resource hog, the suggestion I’m making isn’t necessarily that a single user would be able to take advantage of a small or large NUMA setup, but that hundreds or thousands of users on a single terminal server might be thankful for a system with a few terabytes of RAM and hundreds if not thousands of processors.

The question that arises from that particular suggestion is whether it’s necessary to have one giant remote access server, as compared with a cluster of individual machines behind a load balancer, which is how a lot of remote access installations work these days. My answer is that neither way is necessarily better; it just depends on the needs of the organization.

Ultimately, I suspect it’s just an inevitability that single giant computers, as they existed in the past, are on their way out. Sun has one or two single-machine systems that fill a full rack or two, SGI is still shipping the Altix 4700 as a remaining example of NUMA at work, and IBM may or may not still be selling mainframes. Beyond that, it’s cluster time, I suppose.