Purdue Plans a 1-Day Supercomputer "Barnraising"

timothy posted more than 6 years ago | from the bring-some-ethernet-and-some-lemonade dept.

Supercomputing 97

An anonymous reader points out an article which says that "Purdue University says it will only need one day to install the largest supercomputer on a Big Ten campus. The so-called 'electronic barn-raising' will take place May 5 and involved more than 200 employees. The computer will be about the size of a semi trailer. Vice President for Information Technology at Purdue Gerry McCartney says it will be built in a single day to keep science and engineering researchers from facing a lengthy downtime." Another anonymous reader adds "To generate interest on campus, the organizers created a spoof movie trailer called 'Installation Day.'"

97 comments

Amish (1)

SBrach (1073190) | more than 6 years ago | (#23268796)

The Amish are great "barn raisers"; maybe they can help.

Re:Amish (1)

mikael (484) | more than 6 years ago | (#23268840)

Are the builders of this system required to wear beards and black hats?

I've seen the websites where the Amish organise barn-raising parties. It's quite impressive. The womenfolk make sandwiches and other light meals, while the menfolk completely construct and assemble the parts to make a three or four floor structure. Presumably they can construct a house in the same amount of time?

Re:Amish (1)

silas_moeckel (234313) | more than 6 years ago | (#23269672)

They can frame a house in about the same amount of time. There is a lot of work to get the foundation ready and to finish the outside. A normal 4-6 man crew can frame a 3k square foot house and get it weather tight in about a week. They do about the same in a day or two.

Amish use websites? (1)

Ocker3 (1232550) | more than 6 years ago | (#23270436)

Wait, you're alleging the Amish(!) use a website to organise barn-raisings? Linky to prove?

Re:Amish use websites? (1)

mikael (484) | more than 5 years ago | (#23270570)

No, I've just visited photography web-pages dedicated to the craft of barn-raising:

Amish Barn-Raising [amishphoto.com]

A discussion [ittoolbox.com]

I'm amazed that so many people can be coordinated in such a confined space. There's a new building being built on my local campus. There are never more than 10 workmen on site at any time, and even then, they are always working in separate areas, operating machinery (elevators, cranes, clamps for plate glass).

OT... (0)

Anonymous Coward | more than 6 years ago | (#23268810)

Sorry for being off-topic, but is slashdot moderation currently broken for anybody else? Does anyone else see comments being moderated?

Re:OT... (0)

Anonymous Coward | more than 6 years ago | (#23268906)

it appears it is broken, yes

Re:OT... (1)

gumpish (682245) | more than 6 years ago | (#23269230)

Is it any shock?

This site is written in fucking PERL for christ's sake.

It's a wonder that it ever works.

Re:OT... (1)

DAldredge (2353) | more than 6 years ago | (#23269734)

Please go back to digg and leave the adults alone.

All together now.... (1)

Goliath (101288) | more than 6 years ago | (#23268862)

Imagine a....

Re:All together now.... (0)

Anonymous Coward | more than 6 years ago | (#23269444)

...joke that doesn't involve the expression "beowulf cluster".

Come on. You can do it.

("In Soviet Russia, our barn-raising overlords are welcome to run linux on YOU!")

Re:All together now.... (1)

MiKM (752717) | more than 6 years ago | (#23269502)

...there's no Heaven, it's easy if you try?

Abe is the biggest cluster on a BigTen campus (1)

jpu8086 (682572) | more than 6 years ago | (#23268922)

Biggest on Big10 campus is a lie.

The article lists BigRed at Indiana (#43 on Top500) based on a technicality. But even the technicality is incorrect. The ABE cluster at NCSA@UIUC (#14 on Top500) is literally on the UIUC campus.

I doubt the Purdue one will beat Abe on the Top500 list.

Re:Abe is the biggest cluster on a BigTen campus (0)

Anonymous Coward | more than 6 years ago | (#23270372)

If you actually read the article, Abe is listed as the largest (albeit in a confusing manner, but still).

And simple math tells you they probably won't beat Abe. They'll be sitting at around 6500 processors. Sounds like somebody is upset they aren't getting the publicity.

In China, India, Russia, ... (0)

Anonymous Coward | more than 5 years ago | (#23272290)

They don't build a silicon supercomputer from scratch in one day.

Don't be stupid and build this crap in one day!

They need only months of modern work, R&D, engineering, simulation, and massive computing to release an experimental 64-bit silicon supercomputer with proper 128-bit FP registers for trillions of flops with less rounding error.

You can add many compatible DDR3 RAM modules to your supercomputer instead of putting up with the incompatibility of the current Intel/AMD processors.

For a start, a minimal MIPS64 core, later adding more capabilities... eureka!

The U.S. supercomputer built from COTS (commercial off-the-shelf) crap is crap!

Re:Abe is the biggest cluster on a BigTen campus (2, Informative)

navygeek (1044768) | more than 5 years ago | (#23274146)

Someone needs to go back and read (re-read?) the article. It says ABE is the biggest on a Big Ten campus. Purdue's will be the largest not connected to a national center. Semantics? Maybe, but it doesn't invalidate the claim.

Re:Abe is the biggest cluster on a BigTen campus (1)

navygeek (1044768) | more than 5 years ago | (#23274254)

Purdue does have (at least) one very cool thing UIUC doesn't have - an operational nuclear reactor. Sure, sure UIUC may still have the facilities, but it's under a decommission order and will be shut down soon.

Re:Abe is the biggest cluster on a BigTen campus (0)

Anonymous Coward | more than 5 years ago | (#23274472)

"It will be the largest Big Ten supercomputer that is not part of a national center."

Not Funny (1)

IKILLEDTROTSKY (1197753) | more than 6 years ago | (#23268950)

Making fun of the Amish on the internet is like mooning a blind guy.

Re:Not Funny (0)

Anonymous Coward | more than 6 years ago | (#23268974)

What if you fart in his general direction?

Re:Not Funny (1)

TheLinuxSRC (683475) | more than 5 years ago | (#23270502)

Well... farts smell so that deaf people can enjoy them as well; I am not sure what that has to do with a blind person though.

Re:Not Funny (1)

Gat0r30y (957941) | more than 6 years ago | (#23269182)

Dude, that's what makes it funny. No harm no foul, and we all get a good chuckle

Dumb (3, Insightful)

Spazmania (174582) | more than 6 years ago | (#23268962)

built in a single day to keep science and engineering researchers from facing a lengthy downtime

Sounds like poor planning to me. The correct way to keep science and engineering researchers from facing a lengthy downtime: don't turn off the old computer until the new one is running and tested.

Re:Dumb (0)

Anonymous Coward | more than 6 years ago | (#23268998)

Sounds like poor planning to me. The correct way to keep science and engineering researchers from facing a lengthy downtime: don't turn off the old computer until the new one is running and tested.
Pssh. Obviously they don't have enough open plugs on their power strip.

Re:Dumb (1)

maxume (22995) | more than 6 years ago | (#23269222)

I'm sure if you built them a redundant building with proper environmental control systems (that is, cooling), that they would be happy to keep everything online while they are putting in the new one.

Re:Dumb (1)

Icegryphon (715550) | more than 6 years ago | (#23269314)

Well, that assumes they have extra space.
Maybe space for such systems is at a premium.
It's not like you can pick any old room to set up a huge cluster.

Re:Dumb (1)

Spazmania (174582) | more than 6 years ago | (#23269738)

So what you're saying is that it's poorly planned AND underfunded?

Re:Dumb (1)

slimjim8094 (941042) | more than 6 years ago | (#23270274)

Because every university is going to need a dedicated room to replace their fucking supercomputer every 10 years...

Chill out - in this case, it's easier than any other options. Plus it got them on Slashdot.

Re:Dumb (1)

Spazmania (174582) | more than 6 years ago | (#23270412)

I'm just sayin': it looks to me like a primo example of "work harder not smarter." There are other ways this could have been done than by having 200 folks play rack-and-stack at the same time. The breakage from this is gonna be out of sight.

Re:Dumb (1)

gregsv (631463) | more than 5 years ago | (#23270524)

There are other ways this could have been done than by having 200 folks play rack-and-stack at the same time.
How, exactly? If the old clusters being replaced were taking all available space, power, and/or cooling in the data center, that makes it pretty hard to build a new cluster without first turning them off. And new data centers are not exactly cheap.

Re:Dumb (1)

Spazmania (174582) | more than 5 years ago | (#23272898)

If they're out of physical space (not just power and cooling) then the facility is way oversubscribed and they'll tend to suffer failures as a result. They should have taken some of the money spent on the machine and used it to improve the facility.

If they're not out of physical space then they could have built the cabinets ahead of time, powered them up one at a time to verify correct cabling, hardware operation and software installation and then rolled them off into a corner. On the cutover day you'd then need about 30 people to shutdown and roll the old equipment out, and then roll the new cabinets into their correct locations. And since you'd have already tested the individual cabinets, you'd have a much better chance of it all working right.

Even that is bad. They have fiber-optic links connecting the campus buildings. If they don't, they need them and should have spent some of the money upgrading their campus infrastructure. With a fiber ring, they could have (temporarily) distributed the cabinets around the campus, bringing the machine up to full power. Then once the researchers sign off on it, the old one is powered down and moved out. Next, you move the new cabinets from their temporary housings back to the vacated room, one at a time. This is straightforward for clusters: you just remove those nodes for maintenance.

But that's only if you're desperate to keep the cost low. For 800 machines, we're talking about at most 40 cabinets here, 4 rows of 10 plus hvac, power and batteries. That's a room a little over 30'x30' with two air conditioners and a battery system. Skip the genset; you can live without it during two months of overlap until the old genset becomes available. Skip the raised floor and other stuff that isn't critical, shop carefully and buy the battery system used and you can put it together for well under $100k. Each of the 800 machines costs at least $5000, so for the price of 20 of the machines you can build a whole new room to house them.
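
A quick sketch of that arithmetic in code (all of the figures are the estimates stated above, not numbers from Purdue):

    # Room-vs-machines arithmetic, using the assumptions stated above:
    # 800 nodes, ~20 nodes per cabinet, at least $5,000 per node, and a
    # roughly $100k budget for a bare-bones room.
    nodes = 800
    nodes_per_cabinet = 20
    cost_per_node = 5_000       # the per-node floor estimated above
    room_budget = 100_000

    cabinets = nodes // nodes_per_cabinet          # 40 cabinets, e.g. 4 rows of 10
    room_in_nodes = room_budget / cost_per_node    # room costs ~20 nodes' worth

    print(f"{cabinets} cabinets; room budget = {room_in_nodes:.0f} nodes")
    # -> 40 cabinets; room budget = 20 nodes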

And for a multi-million dollar system, you should damn well be prepared to improve the space in which it will be housed.

Re:Dumb (0)

Anonymous Coward | more than 5 years ago | (#23273136)

How many top-40 supercomputers have you built? My point is that these people probably know a heck of a lot more about this than you do. They seem to be doing just fine without following all of your oh-so-insightful guidance.

Re:Dumb (1)

schmeckendeugler (864881) | more than 5 years ago | (#23273276)

Jesus christ, you're just full of easy answers, aren't you? Well, why don't you come on over to our campus and just fix everything right up? I'll get you an interview with the CIO right away!! You can convince the board of directors that our data center needs more space, cooling, and millions of dollars spent on it (in addition to the millions we've already spent)! Obviously nobody has thought of THAT, yet! Freaking Moron.

Re:Dumb (0)

Anonymous Coward | more than 5 years ago | (#23273396)

If they're out of physical space (not just power and cooling) then the facility is way oversubscribed and they'll tend to suffer failures as a result. They should have taken some of the money spent on the machine and used it to improve the facility.
Full != Oversubscribed. Maybe the facility was running just fine at or near capacity with the old clusters, and will run just fine at or near capacity with the new one. Spending the millions of dollars it would likely take to improve the facility to handle both clusters at once for a transition period of a couple of weeks or months would be ridiculous.

If they're not out of physical space then they could have built the cabinets ahead of time, powered them up one at a time to verify correct cabling, hardware operation and software installation and then rolled them off into a corner. On the cutover day you'd then need about 30 people to shutdown and roll the old equipment out, and then roll the new cabinets into their correct locations. And since you'd have already tested the individual cabinets, you'd have a much better chance of it all working right.
It's not that simple. The clusters will likely not be configured identically. The new one is not a drop-in replacement. Power, networking, physical layout, etc. must all be updated for the new configuration. These processes take significantly longer than putting nodes in racks.

Even that is bad. They have fiber-optic links connecting the campus buildings. If they don't, they need them and should have spent some of the money upgrading their campus infrastructure. With a fiber ring, they could have (temporarily) distributed the cabinets around the campus, bringing the machine up to full power. Then once the researchers sign off on it, the old one is powered down and moved out. Next, you move the new cabinets from their temporary housings back to the vacated room, one at a time. This is straightforward for clusters: you just remove those nodes for maintenance.
You've obviously never dealt with the logistics of moving a machine of this size around a campus the size of Purdue several times. This would be a nightmare, and would likely result in several times more time and effort being expended than the current approach.

Re:Dumb (1)

Spazmania (174582) | more than 5 years ago | (#23274026)

Full != Oversubscribed.

Yes, it does. You're supposed to have about a 20% reserve of slack on space and power cabling, and the "+1" of n+1 redundancy for battery, HVAC and generator systems. That covers instant failures of those systems, but it also leaves maneuvering room when it's time to upgrade.

Re:Dumb (1)

poot_rootbeer (188613) | more than 5 years ago | (#23274018)

If they're out of physical space (not just power and cooling) then the facility is way oversubscribed

Assuming that the new supercomputer has the same space needs as the one it's replacing, what you're telling us is that any machine room running at more than 50% volumetric capacity is oversubscribed. I can't agree with that.

Re:Dumb (1)

puriots0 (173203) | more than 6 years ago | (#23288782)

Skip the raised floor and other stuff that isn't critical, shop carefully and buy the battery system used and you can put it together for well under $100k.
The additional building power transformers and switchgear *alone* would cost well over $100k for the amount of power necessary. Add to that at least 85 tons of air conditioning necessary to cool the cluster (assuming 100% efficiency, and no redundancy). At about $40k for each 30-ton Liebert chilled/potable water DX unit, that's another $120k...
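
A sketch of that cooling line item (unit size and price are the figures assumed above; no redundancy included):

    import math

    # Cooling cost estimate using the assumptions stated above:
    # ~85 tons of cooling required, purchased as 30-ton units at ~$40k each.
    tons_required = 85
    tons_per_unit = 30
    cost_per_unit = 40_000

    units = math.ceil(tons_required / tons_per_unit)   # 3 units, no redundancy
    cooling_cost = units * cost_per_unit               # $120,000

    print(f"{units} units, about ${cooling_cost:,}")
    # -> 3 units, about $120,000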

Also, if you think that we can afford to put either a UPS or generator backup under this whole mess, you don't seem to understand what kind of minimal funding we get for facilities and infrastructure. Truthfully, considering the amount of money that it'd cost vs the cost of the downtime (and typical amount/length of power outages), putting a UPS and genset under all of the cluster nodes probably isn't worthwhile, or else our customers would actually pay us to do it.

With a fiber ring, they could have (temporarily) distributed the cabinets around the campus, bringing the machine up to full power. Then once the researchers sign off on it, the old one is powered down and moved out.
Each pair of racks has a 50Gb connection back to the switching backbone. Building the cluster in this manner would horribly hurt performance (latency), probably require an extra $200k or so in networking hardware that would be useless after we re-assembled the cluster in one location, inconvenience our users (user jobs can run for up to 30 *days* at a time), require us to somehow convince other departments on campus to give us power and cooling to house the racks, require many, MANY more man hours to assemble and then move the machines back across campus, probably resulting in more hardware failures due to the number of times they're moved, etc.

It's simply not a feasible idea to try to assemble the new cluster before the old ones were demolished.

Each of the 800 machines costs at least $5000, so for the price of 20 of the machines you can build a whole new room to house them.
Each machine cost us no more than about $2300 (excluding the infrastructure). In total, we purchased about 1000 machines for about $2.6M including racks, cables, network switches, etc, so about $2600/machine total cost.

Also, for many reasons you can't just spend money on whatever you want. The money's source dictates whether it can be spent on computers, staff salaries, building improvements, etc. It's not like "here's a check for a lot of money, do with it as you please."

Re:Dumb (1)

Spazmania (174582) | more than 6 years ago | (#23294852)

Add to that at least 85 tons of air conditioning

85 tons? So you're budgeting 350 watts of actual (not max or rated) draw per node plus an extra 15 kW of overhead? That sounds a little high to me.
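
For reference, the per-node figure does follow from the usual conversion of roughly 3.517 kW of heat per ton of refrigeration and the 800-node count used upthread (both are assumptions here, not numbers from the article):

    # Back out the implied per-node draw from 85 tons of cooling.
    KW_PER_TON = 3.517          # standard ton-of-refrigeration conversion
    tons = 85
    nodes = 800                 # node count assumed from earlier in the thread
    overhead_kw = 15

    heat_load_kw = tons * KW_PER_TON                   # ~299 kW of capacity
    per_node_w = (heat_load_kw - overhead_kw) / nodes * 1000

    print(f"{heat_load_kw:.0f} kW total, ~{per_node_w:.0f} W per node")
    # -> 299 kW total, ~355 W per node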

Even if you're right on target, it's all in the shopping. If you go brand new, top of the line Lieberts, raised floor and all the frills you'll chew through half a mil easy. On the other hand, I once bought a nice redundant 3+3 ton data center A/C unit for $50 at auction. Datacenter A/C units sell for dirt at auctions because nobody bids on them. If you don't mind getting a little creative, all you have to do is look.

you don't seem to understand what kind of minimal funding we get for facilities and infrastructure.

Sure I do. Whoever's brainchild this was, he pulled together I don't know how many researchers and got them all to synchronize their proposals to fund this beast, then gave himself a nice pat on the back for his impressive effort at coordination and planning. He didn't bother to think about basic infrastructure, and by the time anybody did, it was too late to secure funding for a room.

That's like the guy who secures funding for a bridge and forgets to buy the asphalt for the roads leading up to it.

Re:Dumb (1)

tinkergeek (800871) | more than 6 years ago | (#23288968)

Wow, you seriously have no idea about High Performance Computing, do you?! First off, even the distance between buildings increases the latency of the network beyond what is useful for multi-node programs. Yes, our campus has a wonderful fiber plant. No, it can't be used for HPC. Speed of light issues, dude.

Next, keeping batteries under a cluster at full load is not very doable. Each rack pulls 13+ kW; that's a ton o' battery. (Not to mention you also *need* to keep the cooling going during this time, which on a good day draws as much power as the heat load it removes.) Doable with standard loads on infrastructure-type servers, not HPC nodes.

A 30'x30' room?! Hmm, no. Airflow around stuff is nice, since that is what we use to cool the machines. Perhaps not the best plan, but plumbing racks for chilled water isn't doable at the moment.

And where in the world do you get your $5k/node price??? You need to think again about what a cluster is and what such a beast is used for. Then come back with a real number for us.
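
To put a rough number on the battery point above, here is a sketch; the ~40-rack count and the 15-minute ride-through target are assumptions for illustration, not figures from the thread:

    # Why "batteries under a cluster at full load" gets big fast.
    racks = 40                  # assumed rack count, for illustration only
    kw_per_rack = 13            # per the comment above
    ride_through_hours = 0.25   # assumed 15-minute ride-through target

    it_load_kw = racks * kw_per_rack        # 520 kW of compute load
    total_kw = it_load_kw * 2               # cooling roughly doubles it
    battery_kwh = total_kw * ride_through_hours

    print(f"{total_kw} kW load, ~{battery_kwh:.0f} kWh of stored energy")
    # -> 1040 kW load, ~260 kWh of stored energy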

Re:Dumb (1)

kyle6477 (1174231) | more than 5 years ago | (#23276082)

Because every university is going to need a dedicated room to replace their fucking supercomputer every 10 years... Chill out - in this case, it's easier than any other options. Plus it got them on Slashdot.
Who really needs to chill out here?

Re:Dumb (1)

Corbets (169101) | more than 5 years ago | (#23271530)

Sure. If you have lots of space, enough resources to cover the cost of maintaining dual systems, etc. etc. etc.

Sounds to me like you've never had to upgrade servers in an already overloaded data center. ;-)

Re:Dumb (1)

Spazmania (174582) | more than 5 years ago | (#23272908)

Sounds to me like you've never had to upgrade servers in an already overloaded data center. ;-)

Sure I have. I solved the problem by moving to a data center that wasn't overloaded.

When you're installing that expensive a piece of hardware, you don't try to fit it to the environment; you fit the environment to it.

Re:Dumb (1)

schmeckendeugler (864881) | more than 5 years ago | (#23273248)

I am a participant in this event. Spazmania, that is the usual modus operandi; however, there is limited space, and there is nowhere to put one thousand rack-mounted machines while the previous 75+ RACKS are emptied. Secondly, all users are QUITE aware, I'm sure, that their jobs are being temporarily placed on hold so that this cluster can be installed. Sometimes, my friend, out here on the bleeding edge, exceptions must be made.

Re:Dumb (1)

mscman (1102471) | more than 5 years ago | (#23273710)

Well if we had the space, we would have. Unfortunately, our data center is rather small and fairly stressed on cooling and power. This was the only way possible to fit the new cluster in.

Re:Dumb (1)

Spazmania (174582) | more than 5 years ago | (#23274068)

Like I said elsewhere in the thread: you'd have been better off sacrificing a few machines in the cluster and spending the money improving the space instead. Reliable computing starts with reliable infrastructure. If you're running that close to the edge then you don't have reliable infrastructure.

Re:Dumb (1)

mscman (1102471) | more than 6 years ago | (#23281542)

And as others have said elsewhere, if you would like to come convince Purdue's Board of Trustees along with our CIO to give us that money, we would be happy to. The reality is that everyone wants better computers, but can't afford a new facility. If you're interested in making a large donation... :)

Re:Dumb (1)

Spazmania (174582) | more than 6 years ago | (#23282212)

if you would like to come convince Purdue's Board of Trustees along with our CIO to give us that money, we would be happy to

No thanks. I was in my element at the DNC, but university politics are deadly. ;)

Re:Dumb (1)

puriots0 (173203) | more than 6 years ago | (#23285930)

First, the amount of time wasted by trying to do this incrementally would be a much bigger hassle than doing this all at once. The last cluster we built a rack at a time was 512 nodes (16 racks plus one network rack), and required about 2 months to construct. Even if we could achieve something like a zero-downtime switchover, the months it would take to assemble the systems and test them out incrementally would be completely unacceptable to our customers.

With the 1-week downtime, we were able to clean out and almost completely reorganize the half of our datacenter that HASN'T been touched since the building was built in 1968. I personally spent over 30 hours the first two days of the downtime shutting down machines, wiping disks, removing machines and network hardware from racks, removing cables from the floor (in a manner to not disturb the remaining critical systems), etc. We pulled over 1.6 cubic yards of *network cables* out of the floor over those first two days, and spent time re-arranging the racks of systems that were left into proper hot-aisle/cold-aisle cooling rows.

Also, we've spent around $2.5M on new hardware for this compute cluster. The necessary improvements to the datacenter would easily cost over $10M and require years to complete. In addition, a huge portion of the funds used to buy the new systems were from grants, which specified exactly what they could be used for (buying compute resources) and couldn't be used for building repair, maintenance, or construction.

We've been trying to convince the Board of Trustees that we need a new datacenter for the past 5-10 years now. No one there wants to cough up the amount of money that would be required to build such a building, despite how fundamental computing is to modern research. Part of the problem is cost per square foot... it's hard for them to take us seriously when other campus buildings are going up for around 10% of the cost per square foot of what is required for a properly constructed datacenter.

I guess another way to say this is: there are people who have been fighting these issues for over a decade now and who intimately know the problems being faced, so don't think that you're saying something new or useful that hasn't already been considered. :)

Thin on details (2, Interesting)

NotBornYesterday (1093817) | more than 6 years ago | (#23268980)

TFA mentioned the Dell 2*quad Xeon hardware, but failed to mention what kind of storage will be attached to it, what kind of network(s) they plan to use to rope it all together, what OS & filesystem they plan to use, & other stuff that would be fun to know.

If they don't tell us what they're using, how can we have flame wars over whose technology really should have been used in it? We'll be stuck with nothing to do but make up bad car analogies.

It would be like, "GM is announcing a barnraising event today to build a new car. Over 200 people will all get together at once to assemble it. It is going to have 8 cylinders and burn gas."

Sheesh.

Re:Thin on details (1)

pathological liar (659969) | more than 6 years ago | (#23269154)

You can make a pretty good guess at interconnect based on the cost (if it's there, I don't care enough to read the article) ... remember to add a factor of 2 or 3 to the price to account for the edu discount...

Re:Thin on details (1)

Digi-John (692918) | more than 6 years ago | (#23269270)

I'd say it's highly likely that the interconnect will be Infiniband. As for storage... when you get that big, I think there are generally "service nodes" that connect to the storage systems on behalf of the compute nodes; I'm just gonna go out on a limb and say they'll use either Lustre or NFS. I wish there was more information somewhere...

Re:Thin on details (1)

SanityInAnarchy (655584) | more than 6 years ago | (#23269824)

I'd guess either Lustre or gFarm -- I really don't see NFS working. Maybe NFS4 does more than I think it does?

Re:Thin on details (1)

Troy Baer (1395) | more than 6 years ago | (#23270288)

NFSv3 can scale this big for home directories if you spread the namespace and load across several beefy servers, especially if you also train your users to stage data in and out of parallel file systems (GPFS, Lustre, PVFS, etc.) and/or node-local file systems for I/O intensive jobs. There's no "silver bullet" file system that does everything well*, and there's no shame in using multiple file systems for different parts of your workload where they will work well.
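
The staging discipline described here often comes down to a small wrapper around each job. A minimal sketch (the paths, the PBS_JOBID environment variable, and the ./solver binary are hypothetical placeholders, not details of any particular cluster):

    #!/usr/bin/env python
    # Stage-in / stage-out wrapper: keep home directories on NFS, but copy
    # the job's working set to fast scratch storage before heavy I/O.
    import os
    import shutil
    import subprocess

    home_input = os.path.expanduser("~/project/input")   # NFS home (hypothetical path)
    scratch = os.path.join("/scratch", os.environ.get("PBS_JOBID", "interactive"))

    # Stage in: copy the input set onto the parallel/scratch file system.
    shutil.copytree(home_input, os.path.join(scratch, "input"))

    # Run the I/O-intensive work entirely against scratch.
    subprocess.run(["./solver", "--workdir", scratch], check=True)

    # Stage out: copy only the results back to the NFS home directory,
    # then clean up the scratch area.
    shutil.copytree(os.path.join(scratch, "output"),
                    os.path.expanduser("~/project/results"),
                    dirs_exist_ok=True)
    shutil.rmtree(scratch)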

Re:Thin on details (1)

SanityInAnarchy (655584) | more than 5 years ago | (#23271216)

I thought we were talking about a giant supercomputer, though -- I don't think we're talking about home directories.

Also, what's the footnote on your "no silver bullet" line?

Re:Thin on details (1)

Troy Baer (1395) | more than 5 years ago | (#23274628)

Er, supercomputers do have home directories, or at least rationally administered ones do.

The footnote I'd intended to put in was "There are, however, several file systems that do everything poorly", but I figured I'd be in trouble with several vendors if I gave specific examples...

Re:Thin on details (1)

petermgreen (876956) | more than 5 years ago | (#23279912)

I thought we were talking about a giant supercomputer, though -- I don't think we're talking about home directories.
Well users of the supercomputer need somewhere to keep files that their jobs on the supercomputer will need to access. Sure you could use the users central campus home directories but that is likely to be bad for performance and may also cause other issues (for example some universities are pretty tight when it comes to quotas for central storage).

Re:Thin on details (1)

SanityInAnarchy (655584) | more than 6 years ago | (#23285006)

Fair enough. I'm not so much debating that there will be any home directories...

I'm suggesting that the NFS solution may well work for central campus home directories, but looks like it would not work well at all for the kinds of files you'd be dealing with on a supercomputer.

Re:Thin on details (0)

Anonymous Coward | more than 5 years ago | (#23270532)

NFS works well with large deployments if the hardware running it is powerful enough. For large scale, look at something like BlueArc, which basically uses FPGAs to implement NFS.

Re:Thin on details (1)

gregsv (631463) | more than 5 years ago | (#23270658)

A subsection of the cluster will have Infiniband interconnect. Most nodes will be GigE connected. Storage will be NFS, served from several very high end dedicated NFS servers. The cluster will run RedHat Enterprise Linux.

Re:Thin on details (1)

tinkergeek (800871) | more than 5 years ago | (#23273814)

The majority of the cluster will use Ethernet for the networking. (All machines will connect to the *same* switch.) A small number will use our existing Infiniband infrastructure. The machines will all have a single 160GB SATA disk, formatted with ext3, and run RHEL4. Half of the cluster will have 16GB of RAM and the other half 32GB. All clusters are served networked storage over NFSv3 from four BlueArc NAS devices.

Re:Thin on details (1)

puriots0 (173203) | more than 6 years ago | (#23285814)

The storage will be provided by our already-in-place BlueArc Titan 2200 and 2500 systems. Both are two-head clusters with a few racks of disk behind each, the 2200 with 4Gb of uplink per head for home directory storage, and the 2500s each with 10Gb of uplink for scratch (high-speed) storage. They export their filesystems using NFS v3. We also provide an archival storage system using EMC's DXUL and a tape library that has a capacity of a couple of petabytes. We've tested the BlueArc Titans (which are FPGA-based NAS boxes) with over a thousand clients talking to them, running high-performance compute jobs, and they're basically the only solution that we've found which doesn't burst into flames. (well, stop working, anyways...)

The OS is RedHat Enterprise Linux 4, which is a requirement of some of our customers and of some of the software that will be running on the systems.

There are actually two sections: one with about 200 nodes, which has an SDR Infiniband interconnect plus a gigabit ethernet fabric based upon re-used Cisco 4948-10GE switches, and the other with about 600 nodes, which are networked with gigabit ethernet to a Foundry RX-16 switch that is completely non-blocking (50Gb FDX of bandwidth connecting each 48-port blade to the switch fabric).

At peak load, we're estimating that the entire system (including in-room cooling systems) will use about $650 in power each day (at Purdue's negotiated power cost). This doesn't include the water chillers across campus that supply cold water to the CRAC units.
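
Since that is a dollar figure rather than a wattage, the implied average draw depends on the rate. A sketch with an assumed $0.05/kWh (the rate is a guess, not Purdue's actual negotiated number):

    # Implied average draw from the ~$650/day power bill.
    dollars_per_day = 650
    assumed_rate = 0.05                 # $/kWh, assumption for illustration

    kwh_per_day = dollars_per_day / assumed_rate   # 13,000 kWh/day
    avg_kw = kwh_per_day / 24                      # ~542 kW average draw

    print(f"~{kwh_per_day:,.0f} kWh/day, ~{avg_kw:.0f} kW average")
    # -> ~13,000 kWh/day, ~542 kW average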

Re:Thin on details (1)

NotBornYesterday (1093817) | more than 6 years ago | (#23286130)

Whoa. Cool.

Thanks for the info, all of you. Enjoy your new toy. How's it running so far?

Hail Purdue! (0)

Anonymous Coward | more than 6 years ago | (#23268990)

Go Boilermakers!

Cmon!! Seriously! 1 day? (0)

Anonymous Coward | more than 6 years ago | (#23269004)

...so now that we're all done putting this thing together...can someone tell me what this leftover connector cable is for?...hehe. Good luck with putting together, in one day, the equivalent of a Space Shuttle console.

what a waste (0)

Anonymous Coward | more than 6 years ago | (#23269088)

Purdue sucks ass

Re:what a waste (1)

siphonophore (158996) | more than 6 years ago | (#23270004)

Perhaps you're referring to our Biofuels Conference [purdue.edu]?

Re:what a waste (1)

navygeek (1044768) | more than 5 years ago | (#23273676)

Awww... Sounds like someone didn't get into their first-choice college or got kicked out by a weed-out class. ;-)

Hey editors (1)

pathological liar (659969) | more than 6 years ago | (#23269124)

"The so-called 'electronic barn-raising' will take place May 5 and involved more than 200 employees."

Either the date or the tense is wrong.

Re:Hey editors (1)

Gat0r30y (957941) | more than 6 years ago | (#23269192)

Free tequila and beers with limes will be offered as compensation upon completion. How else could you motivate 200 college students?

Re:Hey editors (1)

Vancorps (746090) | more than 6 years ago | (#23269450)

You mean how else could you motivate 200 college students on Cinco de Mayo!

Re:Hey editors (1)

halcyon1234 (834388) | more than 6 years ago | (#23269504)

...200 college students?

Which one of you fuckers is fake-lifting?!?

Re:Hey editors (1)

oodaloop (1229816) | more than 6 years ago | (#23270184)

Either the date or the tense is wrong.
Or it was in a previous year.

Oblig. Simpsons (1)

B3ryllium (571199) | more than 6 years ago | (#23269142)

'tis a fine pool, english, but a supercomputer it ain't.

Re:Oblig. Simpsons (0)

Anonymous Coward | more than 6 years ago | (#23269414)

D'oheth!

Re:Oblig. Simpsons (1)

jimrob (1092327) | more than 5 years ago | (#23271318)

(nitpick)

Actually, since the referenced line is to the effect of, "'Tis a fine barn, English, but 'tis no pool." the line should read:

"'Tis a fine barn, English, but 'tis no supercomputer."

(/nitpick)

Re:Oblig. Simpsons (1)

B3ryllium (571199) | more than 5 years ago | (#23271360)

I think you mean "obligatory nitpick".

SysAdmin + Cinco de Mayo = Bad News (1)

Picass0 (147474) | more than 6 years ago | (#23269152)

What a crappy day to pick for such a big job.

Re:SysAdmin + Cinco de Mayo = Bad News (0)

Anonymous Coward | more than 6 years ago | (#23269272)

What's wrong with working on May 5th?

It's not a holiday anywhere that I'm aware of, in spite of the fact that white people like to say that date in Spanish for no good reason.

Re:SysAdmin + Cinco de Mayo = Bad News (1)

gatzke (2977) | more than 5 years ago | (#23270626)

I spent a Cinco de Mayo in West Lafayette a few years back... We hit the Taco Bell after a few rounds of margaritas. Good times (for Indiana).

Re:SysAdmin + Cinco de Mayo = Bad News (1)

catdevnull (531283) | more than 5 years ago | (#23272086)

Cinco de Mayo--it's like the Mexican version of St. Patrick's Day. Evidently, there was a big military victory at Puebla in the mid-1800s and the date stuck. Now it's more of an Americanized holiday/celebration of culture. It's not like the 4th of July--Mexican Independence Day is in September.

It's really big in TX, NM, AZ, and CA. Here in Texas, it's a pretty big deal--but mostly for the Tex-Mex restaurants and the cultural nazis.

Gringos usually celebrate by eating "Mexican" food (i.e., something with cheese and jalapeño peppers) and getting drunk on Corona and margaritas. Coincidentally, so do Mexicans--but usually with Miller Lite and tequila and better Mexican food.

Conflatulations!

Re:SysAdmin + Cinco de Mayo = Bad News (0)

Anonymous Coward | more than 6 years ago | (#23269408)

heheh I work on Purdue's campus (for the federal government) -- I'm thinking of asking the boss for the day off just to crack open a beer, and watch them install it! (ok dry-campus, so mountain dew: code red?)

Moderation broken? (0, Offtopic)

slashflood (697891) | more than 6 years ago | (#23269160)

Is it me or is/was the moderation system broken? At least all of the comments in the earlier story about SCO were unmoderated.

Downtime (1)

eap (91469) | more than 6 years ago | (#23269202)

"...it will be built in a single day to keep science and engineering researchers from facing a lengthy downtime."
I'm afraid the damage has already been done on the downtime front, since it has not existed up until this point in time.

Just imagine (1)

ThePawArmy (952965) | more than 6 years ago | (#23269614)

a beowulf cluster of these...

Grammar Nazi says: (2, Insightful)

siphonophore (158996) | more than 6 years ago | (#23269996)

The singular of "megaflops" is not "megaflop".

//pet peeve

Re:Grammar Nazi says: (0)

Anonymous Coward | more than 5 years ago | (#23271464)

You mean "mega-FlOp/s"?

In Soviet Russia... (1)

InSovietRussiaTroll (1282606) | more than 6 years ago | (#23270138)

The barn raises you!

Which raises the question... (1)

R2.0 (532027) | more than 6 years ago | (#23270366)

Now that there will be a working Babbage engine around, can the Amish use it?

Re:Which raises the question... (1)

justinlee37 (993373) | more than 5 years ago | (#23272288)

They could just use a Curta [wikipedia.org].

Loose definition (1)

twistah (194990) | more than 5 years ago | (#23270852)

When did we start calling clusters "supercomputers"?

Time off from my normal duties. (1)

westerman (1283038) | more than 5 years ago | (#23272768)

As a Purdue IT employee, though not one associated with central IT (who are creating the cluster), I am approaching this as a way to take part of the day off and rub shoulders with other geeks. It should be fun and maybe even informative. They are even providing a probably poor but at least free lunch. :-) As for what central IT gets, aside from a bunch of free skilled labor: good publicity and a sense of community. The Borg gets to inspect its potential minions.

how fast is it? (0)

Anonymous Coward | more than 5 years ago | (#23273370)

anybody know how many flops?

U of I Supercomputer (0)

Anonymous Coward | more than 5 years ago | (#23273452)

Check out what U of I is getting!

http://www.chicagotribune.com/business/chi-tue-ibm-uofi-supercomputer-apr08,0,5428419.story

Having installed Supercomputers... (1)

vil3nr0b (930195) | more than 5 years ago | (#23274674)

I know 200 people is going to be a disaster. Can you guarantee me Jim Stoner and his buddies can assemble rail kits or anything else? Wait until they get to the Infiniband cabling. One bend in that cable, at 100 dollars a foot, will cause all kinds of problems for the budget. No thanks. Instead, give me a software engineer and four hardware techs three days to do it properly. I guarantee you at least 195 of these folks have never installed one, and the concept scares me.

Re:Having installed Supercomputers... (1)

tuomoks (246421) | more than 6 years ago | (#23282042)

Having installed computer centers... it is scary. It's not so much what a bent cable (fiber?) costs, but do you have a spare? Are you sure that the electricity is connected correctly? Will the cooling work? Hopefully it doesn't include the fire extinguisher system installation (a scary thought!). Did you remember to secure the raised floor? Did you label everything, correctly? And so on...

We used to do that over weekends in very large installations, but it was scary every time. Too many things could have gone wrong, with catastrophic results.

Re: (1)

clint999 (1277046) | more than 6 years ago | (#23282592)

Jesus christ, you're just full of easy answers, aren't you? Well, why don't you come on over to our campus and just fix everything right up? I'll get you an interview with the CIO right away!! You can convince the board of directors that our data center ne