Wednesday, 30 June 2010

Me on HPC and multicore

Things I have said (or have been attributed as saying - not always the same thing!) - some older interviews with me in various publications about HPC, multicore, etc ...

What You Should Know about Power and Performance Efficiency
Scientific Computing, August 2010, Suzanne Tracy

"Components driving power consumption fall into two categories — those that, as consumers, we cannot control, and those we can. Power consumed by server hardware is increasing and is beyond our direct control as buyers (although manufacturers are working to optimize power efficiency). The biggest factors we can influence are design and deployment of HPC systems as a whole (datacenter included) and recognizing total cost of ownership (including power) when procuring."

"The primary strategy for optimizing power is to ensure proper total cost of ownership (including power) as the driver of procurement, not purely peak performance and initial capital cost. This enables the evolutions of datacenter optimization (e.g. run warm, “free-cooling,” hot aisles) and choices of power-efficient HPC system designs (e.g. more parallelism, lower power processors, etcetera) to be correctly attributed as delivering increased performance against cost."

"Optimizing software and algorithms is a key opportunity to dramatically improve the total cost of ownership of HPC solutions. By optimizing applications, fewer resources are required to deliver the results, thus reducing the power required. Equally, innovations in algorithms can deliver applications that are power-aware — that is, they recognize the energy consumed and the user can balance energy-cost against time-to-solution when selecting algorithms for a given simulation."

"The primary breakthrough will be the recognition of the role software (both implementation efficiency and algorithm design) has to play in delivering cost savings related to power efficiency. Beyond that, the key hardware technologies will be increased use of power switching across the system — while many modern processors will reduce power when not fully utilized, the ability to gate specific parts of the chip will improve, and the same capability will work into other parts of the system — memory, interconnect (maybe balancing power against bandwidth on a job-by-job basis), I/O, etcetera."

Multiple cores multiply programming
Scientific Computing World, June 2010, Paul Schreier

"When it comes to parallel programming, it’s easy to do something that looks right, but it’s difficult to be sure it is right and will do the same thing under all conditions," says Andrew Jones.

"We strongly urge people to use prepackaged routines such as these where other people have done the difficult work of dividing up the tasks in an optimal way," says Jones.

Personal Supercomputers?
Genomeweb, October 2009, By Matthew Dublin

"There is always going to be a class of computing power that is much bigger than anything that will physically fit on your desk because if you can buy something for $1,000 or $10,000 then there are going to be users that are prepared to buy hundreds of them for a million dollars," Jones says. "And there's always going to be something that is orders of magnitude bigger than what most people can afford but the cheap stuff gets more powerful."

"I don't think there's anything wrong with the term 'personal supercomputing' if it successfully gets a whole lot more people making use of the compute power that's available," Jones says. "It's marketing, but it's perfectly valid marketing, aimed at an audience that would normally not go anywhere near large-scale supercomputers. ... HPC can do so much for people trying to do simulations and modeling that whatever we call it to get more people to using it, the better."

With virtualization, high-performance computing becomes more mainstream, November 2008, By Jo Maitland

"Scheduling jobs, queuing jobs, shoring up resources, determining policies such as rejecting a job that doesn't have an estimate of how long the job is going to take … these are typical HPC skills but start to overlap when you're managing a virtualized compute environment," said Andrew Jones.

Jones said he does not believe mainstream computing will ever catch up with HPC. "By definition, HPC will always be more powerful than mainstream computing," he says.

Tuesday, 22 June 2010

Technical computing futures part 2: GPU and manycore success

[Originally posted on The NAG Blog]

In my previous blog, I suggested that the HPC revolution towards GPUs (or similar many-core technologies) as the primary processor has a lot in common with the move from RISC to commodity x86 processors a few years ago. A new technology appears to offer cheaper (or better) performance than the incumbent, for some porting and tuning pain. Of course, I’m not the first HPC blogger to have made this observation, but I hope to follow it a little further.

In particular, my previous blog suggested the outcome might be: “at first the uptake is tentative ... but in a few years time, we might well look back with nostalgia to when GPU’s were not the dominant processor for HPC systems” – in other words, hard going initially, but GPU/many-core will “win” eventually. I even ended up with an ambitious promise for my next blog (i.e. this one): “an idea of what/who will emerge as the dominant solution ...

Continuing the basis of using the past to guess the future, my prediction is that the next steady state of HPC processors will be GPU-like/manycore technologies (for most of the FLOPS at least) and, just like the current steady state (x86), those few companies with the strongest financial muscle will eventually own the dominant market share. However, other companies will have pioneered many of the technologies that make that dominant market share possible, enjoying good market share surges in the process.

I can even have a go at predicting some of the path that might get us to the next steady state of HPC architecture. NVIDIA has already shown us that GPUs for HPC are sometimes a good solution – and importantly, that a good programming ecosystem (CUDA) really helps adoption. Over the last year or so, I’d say the HPC community has moved from “if GPUs can work in this case ...” to “how do I make GPUs work across my workload?

As Intel’s Knights processors bring us many-core but with a familiar x86 instruction set, we might learn that getting good performance across a broad range of applications is possible, but critically dependent on software tools and hard work by skilled parallel programmers. AMD’s Fusion with tighter links between CPU & GPU, could show that the nature of the integration between the many-core/GPU unit and the rest of the system (be it CPU, network, main memory etc) will affect not only maximum performance on specific applications, but maybe more importantly the ease of getting “good enough” performance across a range of applications.

I don't know of any GPU/many-core/accelerator announcements from IBM, but it’s always possible IBM will throw in another useful contribution before the dust settles. They were one of the first into many-core processors for HPC acceleration with Cell and they cannot be easily counted out of top end HPC solutions - e.g. the forthcoming Blue Waters (POWER7) and Sequoia (BG/Q) chart-toppers.

But back to my “winner” prediction. When the revolution settles into a new steady state of mostly GPU/many-core for HPC processors, there won’t be (can’t be) critical distinctions between the various products anymore for most applications. Whichever product we consider (whether GPU or x86-based or whatever), many-core is sufficiently different from few-core (e.g. 1-8 cores) to mean that the early winners have been those users who are easily able to move their key applications across to get step changes in cost and performance.

The big winners in the next stages of the GPU/manycore emergence will be those users who can move the bulk of their high-value-generating HPC usage to many-core processors with the most attractive transition (economy and speed) compared to their competitors.

So what about the dominant solution I promised? For the technology to be pervasive, first there must be greater commonality between offerings (I stop short of standardization) so that programmers have at least a hope of portability. Secondly, users need to be able to extract the available performance. Ideally these would mean a software method that makes many-core programming “good enough easily enough” is discovered – and if so, that software method will be the dominant solution, across all hardware.

Or, if the magic bullet is still not market ready, skilled parallel programmers will be the dominant solution for achieving competitive performance and cost benefits - just like it is for HPC using commodity x86 processors today.

Wednesday, 16 June 2010

Supercomputing's future: Is it CPU or GPU?

[Article by me on ZDNet UK, 16 June, 2010]

Graphics processing units are a hot topic, but that does not assure them a place in supercomputing's future ...

Tuesday, 8 June 2010

Revealing the future of technical computing: part 1

[Originally posted on The NAG Blog]

I recall some years ago porting an application code I worked with, which was developed and used almost exclusively on a high end supercomputer, to my PC. Naively (I was young), I was shocked to find that, per-processor, the code ran (much) faster on my PC than on the supercomputer. With very little optimization effort.

How could this be – this desktop machine costing only a few hundred pounds was matching the performance of a four processor HPC node costing many times that? Since I was also starting to get involved in HPC procurements, I naturally asked why we spend millions on special supercomputers, when for a twentieth of the price, we’d get the same throughput from a bunch of high-spec PCs?

The answer then (and now) was that I was extrapolating from only one application, and that application could be run as lots of separate test cases with no reduction in capability (i.e. we didn’t need large memory etc, just lots of parameter space). However, the other major workload (which I also ported and also ran fast on the PC) would not have been able to do the size of problem we wanted on a PC – we needed the larger memory and extra grunt from parallel processing. (We did look at the newfangled Network Of Workstations emerging at the time but decided it might be a wolf in sheep’s clothing. Sorry.)

In the end, we had to find a balance between (a) speed at lowest cost for the one application; (b) the best capability for the other application (i.e. fastest solution time for the largest problems); (c) ease of programming – to get a good enough (fast-enough) code developed with the limited developer effort and funding we had; and (d) whole life affordability.

Why do I foist this reminiscence on you? Because the current GPU crisis (maybe “crisis” is a bit strong – "PR storm" perhaps?) looks very much the same to me. The desktop HPC surprise of my youth has evolved into the dominant HPC processor and so for some years now, we have been developing and running our applications on clusters of general purpose processors – and a new upstart is trying to muscle in with the same tactic – “look how fast and how cheap” – the GPU (or similar technologies – e.g. Larrabee, sorry Knights-thingy).

The issues are the same: (a) for some applications, GPUs offer substantial performance improvements for considerably less cost than a “normal” HPC processor; (b) for other applications, the limits such as off-card bandwidth etc mean that GPU’s cannot deliver the required capability; (c) the underlying concern is ease of programming for GPUs; (d) affordability – sure GPU’s are cheap to buy, but what about power costs when in bulk, or code porting costs, etc?

Maybe the result will be the same as when commodity processors and clusters eventually exploded to leave custom supercomputer hardware as the minority solution. At first the uptake (now) is tentative - and painful. Some will have great success stories, many will get burnt. But in a few years time, we might well look back with nostalgia to when GPU’s were not the dominant processor for HPC systems.

I’ll continue on the future of HPC in my next blog in a few days, including an idea of what/who will emerge as the dominant solution ...