Tuesday 8 June 2010

Revealing the future of technical computing: part 1

[Originally posted on The NAG Blog]

I recall some years ago porting an application code I worked with, which was developed and used almost exclusively on a high end supercomputer, to my PC. Naively (I was young), I was shocked to find that, per-processor, the code ran (much) faster on my PC than on the supercomputer. With very little optimization effort.


How could this be – this desktop machine costing only a few hundred pounds was matching the performance of a four processor HPC node costing many times that? Since I was also starting to get involved in HPC procurements, I naturally asked why we spend millions on special supercomputers, when for a twentieth of the price, we’d get the same throughput from a bunch of high-spec PCs?


The answer then (and now) was that I was extrapolating from only one application, and that application could be run as lots of separate test cases with no reduction in capability (i.e. we didn’t need large memory etc, just lots of parameter space). However, the other major workload (which I also ported and also ran fast on the PC) would not have been able to do the size of problem we wanted on a PC – we needed the larger memory and extra grunt from parallel processing. (We did look at the newfangled Network Of Workstations emerging at the time but decided it might be a wolf in sheep’s clothing. Sorry.)


In the end, we had to find a balance between (a) speed at lowest cost for the one application; (b) the best capability for the other application (i.e. fastest solution time for the largest problems); (c) ease of programming – to get a good enough (fast-enough) code developed with the limited developer effort and funding we had; and (d) whole life affordability.


Why do I foist this reminiscence on you? Because the current GPU crisis (maybe “crisis” is a bit strong – "PR storm" perhaps?) looks very much the same to me. The desktop HPC surprise of my youth has evolved into the dominant HPC processor and so for some years now, we have been developing and running our applications on clusters of general purpose processors – and a new upstart is trying to muscle in with the same tactic – “look how fast and how cheap” – the GPU (or similar technologies – e.g. Larrabee, sorry Knights-thingy).


The issues are the same: (a) for some applications, GPUs offer substantial performance improvements for considerably less cost than a “normal” HPC processor; (b) for other applications, the limits such as off-card bandwidth etc mean that GPU’s cannot deliver the required capability; (c) the underlying concern is ease of programming for GPUs; (d) affordability – sure GPU’s are cheap to buy, but what about power costs when in bulk, or code porting costs, etc?


Maybe the result will be the same as when commodity processors and clusters eventually exploded to leave custom supercomputer hardware as the minority solution. At first the uptake (now) is tentative - and painful. Some will have great success stories, many will get burnt. But in a few years time, we might well look back with nostalgia to when GPU’s were not the dominant processor for HPC systems.


I’ll continue on the future of HPC in my next blog in a few days, including an idea of what/who will emerge as the dominant solution ...

Tuesday 23 March 2010

What’s the next revolution in technical computing?

[Originally posted on The NAG Blog]

It’s a question that absorbs the attention of the technical computing community, especially those working at the leading edge of technology and performance (high performance computing, HPC). What is the next disruptive technology? In other words, what is the next technology that will replace a currently dominant technology? Usually a disruptive technology presents a step-change in performance, cost or ease-of use (or a combination of these) compared to the established technology. The new technology may or may not be disruptive in the sense of discontinuous change in user experience.



Why is identifying disruptive technology so important? First, those who spot the right change early enough and deploy it effectively can attain a significant advantage over competitors as a result of a substantial improvement in technical computing capability or reduction in cost. Second, identifying the right technology change in time can help ensure that future investments (whether software engineering, procurement planning, or HPC product development) are optimally spent.



However, in a field as fast moving as technical computing, spotting the next disruptive technologies of specific relevance to your individual needs can easily become a full time activity (which is why NAG helps to do this for others).



One very credible candidate for disruptive change in HPC right now is GPU computing (or related products that might be in development). However, at the Newport conference recently, the discussion turned to what the next disruptive technology to hit HPC would be (after the possible GPU disruption). One suggestion, made by John West (of InsideHPC fame), was that the next disruptive technology could be in software, especially programming tools and interfaces. This builds on the fact that parallel computing is no longer a specialist activity unique to the HPC crowd – parallel processors are becoming pervasive across all areas of computing from embedded to personal to workgroup technical computing. Parallel programming is thus heading towards a mass market activity – and the mass market is unlikely to view what we have in HPC currently (Fortran plus MPI and/or OpenMP, or limited tools, etc) with much favour. I’m not knocking any of these, but they are not mass-market interfaces to parallel computing. So perhaps the mass market, through volume of people in need – and companies driven by economics will come up with a “better” solution for interfacing with supercomputers.



As a HPC community we lost control of much of our hardware to the commodity market some years ago. Maybe we now face losing control of our software to the commodity community too.

Wednesday 24 February 2010

Events guide: What's on in supercomputing

[Article by me on ZDNet UK, 24 February, 2010]

The key events in the supercomputing calendar can provide real insights and a chance to network ...

http://www.zdnet.co.uk/news/it-strategy/2010/02/24/events-guide-whats-on-in-supercomputing-40041925/

Thursday 18 February 2010

Exascale or personal HPC?

[Originally posted on The NAG Blog]

Which is more interesting for HPC watchers - the ambition of exaflops or personal supercomputing? Anyone who answers "personal supercomputing" is probably not being honest (I welcome challenges!). How many people find watching cars on the local road more interesting than F1 racing? Or think local delivery vans more fascinating than the space shuttle? Of course, everyday cars and local delivery vans are more important for most people than F1 and the space shuttle. And so personal supercomputing is more important than exaflops for most people.

High performance computing at an individual or small group scale directly impacts a far broader set of researchers and business users than exaflops will (at least for the next decade or two). Of course, in the same way that F1 and the shuttle pioneer technologies that improve cars and other everyday products, so the exaflops ambition (and the petaflops race before it) will pioneer technologies that make individual scale HPC better.

One potential benefit to widespread technical computing that some are hoping for is an evolution in programming. It is almost certain that the software challenges of an exaflops supercomputer with a complex distributed processing and memory hierarchy demanding billion-way concurrency will be the critical factor to success and thus tools and language evolutions will be developed to help the task.

Languages might be extended (more likely than new languages) to help express parallelism better. Better may mean easier or with assured correctness rather than higher performance. Language implementations might evolve to better support robustness in the face of potential errors. Successful exascale applications might expect to make much greater use of solver and utility libraries optimized for specific supercomputers. Indeed one outlying idea is that libraries might evolve to become part of the computer system rather than part of the application. Developments like these should also help to make the task of programming personal scale high performance computing much easier, reducing the expertise required to get acceptable performance from a system using tens of cores or GPUs.

Of course, while we wait for the exascale benefits to trickle down, getting applications to achieve reasonable performance across many cores still requires specialist skills.

Thursday 4 February 2010

Don't call it High Performance Computing?

[Originally posted on The NAG Blog]

Having just signed up for twitter (HPCnotes), I've realised that the space I previously had to get my point across was nothing short of luxurious (e.g. my ZDNet columns). It's like the traditional challenge of the elevator pitch - can you make your point about High Performance Computing (HPC) in the 140 character limit of a tweet? It might even be a challenge to state what HPC is in 140 characters. Can we sum up our profession that simply? To a non-HPC person?





The inspired John West of InsideHPC fame wrote about the need to explain HPC some time ago in HPCwire. It's not an abstract problem. As multicore processors (whether CPUs or GPUs) become the default for scientific computing, the parallel programming technologies and methods of HPC are becoming important for all numercial computing users - even if they don't identify themselves as HPC users. In turn, of course, HPC benefits in sustainability and usability from the mass market use of parallel programming skills and technologies.





I'll try to put it in 140 characters (less space for a link): Multicore CPUs promise extra performance but software must be optimised to take advantage. HPC methods can help.





It's not good - can you say it better? Add a comment to this blog post to try ...





For those of you finding this blog post from the short catch line above, hoping to find the answer to how HPC methods can help - well that's what my future posts and those of my colleagues here will address.

Thursday 28 January 2010

Are we taking supercomputing code seriously?

[Article by me on ZDNet UK, 28 January, 2010]

The supercomputing programs behind so much science and research are written by people who are not software pros ...

http://www.zdnet.co.uk/news/it-strategy/2010/01/28/are-we-taking-supercomputing-code-seriously-40004192/

Tuesday 15 December 2009

2009-2019: A Look Back on a Decade of Supercomputing

[Article by me for HPCwire, December 15, 2009]

As we turn the decade into the 2020s, we take a nostalgic look back at the last ten years of supercomputing. It's amazing to think how much has changed in that time.

http://www.hpcwire.com/features/2009-2019-A-Look-Back-on-a-Decade-of-Supercomputing-79351812.html?viewAll=y

Thursday 12 November 2009

Tough choices for supercomputing's legacy apps

[Article by me on ZDNet UK, 12 November, 2009]

The prospect of hundreds of petaflops and exascale computing raises tricky issues for legacy apps ...

http://www.zdnet.co.uk/news/it-strategy/2009/11/12/tough-choices-for-supercomputings-legacy-apps-39869521/

Thursday 1 October 2009

Monday 10 August 2009

Personal supercomputing anyone?

[Article by me on ZDNet UK, 10 August, 2009]

Personal supercomputing may sound like a contradiction in terms, but it definitely exists ...

http://www.zdnet.co.uk/news/it-strategy/2009/08/10/personal-supercomputing-anyone-39710087/

Thursday 18 June 2009

When supercomputing benchmarks fail to add up

[Article by me on ZDNet UK, 18 June, 2009]

Using benchmarks to choose a supercomputer is more complex than just picking the fastest system ...

http://www.zdnet.co.uk/news/it-strategy/2009/06/18/when-supercomputing-benchmarks-fail-to-add-up-39664193/