Showing posts with label software. Show all posts

Monday 31 July 2017

HPC Getting More Choices - Technology Diversity

HPC has been easy for a while ...

When buying new workstations or personal computers, it is easy to adopt the simple mantra that a newer processor or higher clock frequency means your application will run faster. It is not totally true, but it works well enough. However, with High Performance Computing (HPC), it is more complicated.

HPC works by using parallel computing – the use of many computing elements together. The nature of these computing elements, how they are combined, the hardware and software ecosystems around them, and the challenges for the programmer and user vary significantly – between products and across time. Since HPC works by bringing together many technology elements, the interaction between those elements becomes as important as the elements themselves.

Whilst there has always been a variety of HPC technology solutions, there has been a strong degree of technical similarity of the majority of HPC systems in the last decade or so. This has meant that (i) code portability between platforms has been relatively easy to achieve and (ii) attention to on-node memory bandwidth (including cache optimization) and inter-node scaling aspects would get you a long way towards a single code base that performs well on many platforms.

Increase in HPC technology diversity

However, there is a marked trend of an increase in diversity of technology options over the last few years, with all signs that this is set to continue for the next few years. This includes breaking the near-ubiquity of Intel Xeon processors, the use of many-core processors for the compute elements, increasing complexity (and choice) of the data storage (memory) and movement (interconnect) hierarchies of HPC systems, new choices in software layers, new processor architectures, etc.

This means that unless your code is adjusted to effectively exploit the architecture of your HPC system, your code may not run faster at all on the newer system.

It also means HPC clusters proving themselves where custom supercomputers might have previously been the only option, and custom supercomputers delivering value where commodity clusters might have previously been the default.

Thursday 27 August 2015

The price of open-source software - a joint response

This viewpoint is published jointly on, (personal blog), (personal blog) under a CC-BY licence. It was written by Neil Chue Hong (Software Sustainability Institute), Simon Hettrick (Software Sustainability Institute), Andrew Jones (@hpcnotes & NAG), and Daniel S. Katz (University of Chicago & Argonne National Laboratory)

In their recent paper, Krylov et al. [1] state that the goal of the research community is to advance “what is good for scientific discovery.” We wholeheartedly agree. We also welcome the debate on the role of open source in research, begun by Gezelter [2], in which Krylov was participating. However, we have several concerns with Krylov’s arguments and reasoning on the best way to advance scientific discovery with respect to research software.

Gezelter raises the question of whether it should be standard practice for software developed by publicly funded researchers to be released under an open-source licence. Krylov responds that research software should be developed by professional software developers and sold to researchers.

We advocate that software developed with public funds should be released as open-source by default (supporting Gezelter’s position). However, we also support Krylov’s call for the involvement of professional software developers where appropriate, and support Krylov’s argument that researchers should be encouraged to use existing software where possible. We acknowledge many of Krylov’s arguments of the benefits of professionally written and supported software.

Our first major concern with Krylov’s paper is its focus on arguing against an open-source mandate on software developed by publicly funded researchers. To the knowledge of the authors, no such mandate exists. It appears that Krylov is pre-emptively arguing against the establishment of such a mandate, or even against it becoming “standard practice” in academia. There is a significant difference between a recommendation of releasing as open-source by default (which we firmly support) and a mandate that all research software must be open source (which we don’t support, because it hinders the flexibility that scientific discovery needs).

Our second major concern is Krylov’s assumption that the research community could rely entirely on software purchased from professional software developers. We agree with this approach whenever it is feasible. However, by concentrating on large-scale quantum chemistry software, Krylov overlooks the diversity of software used in research. A significant amount of research software is at a smaller scale: from few-line scripts to short programs. Although it is of fundamental importance to research, this small-scale software is typically used by only a handful of researchers. There are many benefits in employing professionals to develop research software but, since so much research software is not commercially viable, the vast majority of it will continue to be developed by researchers for their own use. We do advocate researchers engaging with professional software developers as far as appropriate when developing their own software.

Our desire is to maximise the benefit of software by making it open—allowing researchers other than the developers to read, understand, modify, and use it in their own research—by default. This does not preclude commercial licensing where it both is feasible and is the best way of maximising the software benefit. We believe this is also the central message of Gezelter.

In addition to these two fundamental issues with Krylov, we would like to respond to some of the individual points raised.

Friday 30 August 2013

All software needs to be parallel

I often use this slide to show why all software has to be aware of parallel processing now.

In short, if your software does not exploit parallel processing techniques, then your code is limited to less than 2% of the potential performance of the processor. And this is just for a single processor - it is even more critical if the code has to run on a cluster or a supercomputer.
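The slide itself is not reproduced here, but the arithmetic behind a figure like "less than 2%" can be sketched. The core count, SIMD width and fused multiply-add factor below are illustrative assumptions for a hypothetical processor, not the numbers from the slide.

```python
def serial_fraction_of_peak(cores, simd_lanes, flops_per_lane_per_cycle=2):
    """Fraction of peak FLOPS reachable by serial, scalar code.

    Assumes peak per cycle = cores * simd_lanes * flops_per_lane_per_cycle,
    and that serial scalar code sustains at most 1 flop per cycle on one core.
    """
    peak_per_cycle = cores * simd_lanes * flops_per_lane_per_cycle
    return 1.0 / peak_per_cycle


# Hypothetical processor (assumed values): 8 cores, 4-wide double-precision
# SIMD, fused multiply-add counted as 2 flops per lane per cycle.
fraction = serial_fraction_of_peak(cores=8, simd_lanes=4)
print(f"Serial scalar code reaches at most {fraction:.2%} of peak")
```

Under these assumptions the peak is 64 flops per cycle, so code that uses one core and no vector units is capped at 1/64 of it. More cores or wider vectors make the cap lower still.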

Friday 12 October 2012

The making of “1000x” – unbalanced supercomputing

I have posted a new article on the NAG blog: The making of "1000x" – unbalanced supercomputing.

This goes behind my article in HPCwire ("Chasing1000x: The future of supercomputing is unbalanced"), where I explain the pun in the title and dip into some of the technology issues affecting the next 1000x in performance.

Friday 15 June 2012

Supercomputers are for dreams

I was invited to the 2012 NCSA Annual Private Sector Program (PSP) meeting in May. In my few years of attending, this has always been a great meeting (attendance by invitation only), with an unusually high concentration of real HPC users and managers from industry.

NCSA have recently released streaming video recordings of the main sessions - the videos can be found as links on the Annual PSP Meeting agenda page.

Bill Gropp chaired a panel session on "Modern Software Implementation" with myself and Gerry Labedz as panellists.

The full video (~1 hour) is here but I have also prepared a breakdown of the panel discussion in this blog post below.

Friday 25 May 2012

Looking ahead to ISC'12

I have posted my preview of ISC'12 Hamburg - the summer's big international conference for the world of supercomputing over on the NAG blog. I will be attending ISC'12, along with several of my NAG colleagues. My blog post discusses these five key topics:
  • GPU vs MIC vs Other
  • What is happening with Exascale?
  • Top 500, Top 10, Tens of PetaFLOPS
  • Finding the advantage in software
  • Big Data and HPC 
Read more on the NAG blog ...

Friday 4 November 2011

My SC11 diary 10

It seems I have been blogging about SC11 for a long time - but it has only been two weeks since the first SC11 diary post, and this is only the 10th SC11 diary entry. However, this will also be the final SC11 diary blog post.

I will write again before SC11 in HPCwire (to be published around or just before the start of SC11).

And then maybe an SC11-related blog post after SC11 has finished.

So, what thoughts for the final pre-SC11 diary then? I'm sure you have noticed that the pre-show press coverage has started in volume now. Perhaps my preview of the SC11 battleground, what to look out for, what might emerge, ...

Friday 24 June 2011

ISC11 Review

ISC11 - the mid-season big international conference for the world of supercomputing - was held this week in Hamburg.

Here, I update my ISC11 preview post with my thoughts after the event.

I said I was watching out for three battles.

GPU vs MIC vs Fusion

The fight for top voice in manycore/GPU world will be one interesting theme of ISC11. Will this be the year that the GPU/manycore theme really means more than just NVidia and CUDA? AMD has opened the lid on Fusion in recent weeks and has sparked some real interest. Intel's MIC (or Knights) is probably set for some profile at ISC11 now the Knights Ferry program has been running a while. How will NVidia react to no longer being the loudest (only?) noise in GPU/manycore land? Or will NVidia's early momentum carry through?

Review: None of this is definitive, but my gut reaction is that MIC won this battle. GPU lost. Fusion didn't play again. My feeling from talking to attendees was that MIC was second only to the K story, in terms of what people were talking about (and asking NAG - as collaborators in the MIC programme - what we thought). Partly because of the MIC hype, and the K success (performance and power efficient without GPUs), GPUs took a quieter role than in recent years. Fusion, disappointingly, once again seemed to have a quiet time in terms of people talking about it (or not). Result? As I thought, manycore is now realistically meaning more than just NVidia/CUDA.

Exascale vs Desktop HPC

Both the exascale vision/race/distraction (select according to your preference) and the promise of desktop HPC (personal supercomputing?) have space on the agenda and exhibit floor at ISC11. Which will be the defining scale of the show? Will most attendees be discussing exascale and the research/development challenges to get there? Or will the hopes and constraints of "HPC for the masses" have people talking in the aisles? Will the lone voices trying to link the two extremes be heard? (technology trickle down, market solutions to efficient parallel programming etc.) What about the "missing middle"?

Review: Exascale won this one hands down, I think. Some lone voices still tried to talk about desktop HPC, missing middles, mass usage of HPC and so-on. But exascale got the hype again (not necessarily wrong for one of the year's primary "supercomputing" shows!)

Software vs Hardware

The biggie for me. Will this be the year that software really gets as much attention as hardware? Will the challenges and opportunities of major applications renovation get the profile it deserves? Will people just continue to say "and software too". Or will the debate - and actions - start to follow? The themes above might (should) help drive this (porting to GPU, new algorithms for manycore, new paradigms for exascale, etc). Will people trying to understand where to focus their budget get answers? Balance of hardware vs software development vs new skills? Balance of "protect legacy investment" against opportunity of fresh look at applications?

Review: Hardware still got more attention than software. Top500, MIC, etc. Although ease-of-programming for MIC was a common question too. I did miss lots of talks, so perhaps there was more there focusing on applications and software challenges than I caught. But the chat in the corridors was still hardware dominated I thought.

The rest?

What have I not listed? National flag waving. I'm not sure I will be watching too closely whether USA, Japan, China, Russia or Europe get the most [systems|petaflops|press releases|whatever]. Nor the issue of cloud vs traditional HPC. I'm not saying those two don't matter. But I am guessing the three topics above will have more impact on the lives of HPC users and technology developers - both next week and for the next year once back at work.

Review: Well, I got those two wrong! Flags were out in force, with Japan (K, Fujitsu, Top500, etc) and France (Bull keynote) waving strongly among others. And clouds were seemingly the question to be asked at every panel! But in a way, I was still right - flags and clouds do matter and will get people talking - but I maintain that manycore, exascale vs desktop, and the desperation of software all matter more.

What did you learn? What stood out for you? Please add your comments and thoughts below ...

Thursday 26 May 2011

Poll: legacy software and future HPC applications

I've added a new quick survey to the HPC Notes blog: "Which do you agree with more for developing the next generation of HPC applications?"

Is the argument about "protecting our investment" a load of nonsense? Or is throwing it all away and starting again irresponsible?

I've only allowed the two extremes as voting options - I know you might want to say "both" - but choose one!

See top right of the blog home page. Vote away ...

For clues on my own views, and some good audience debate, see the software panel video recording from the recent NCSA PSP annual meeting here:

Blog on the topic to follow shortly ... after some of your votes have been posted :-)

Thursday 24 March 2011

Investments Today for Effective Exascale Tomorrow

I contributed to this article in the March 2011 The Exascale Report by Mike Bernhardt.

"Initiatives are being launched, research centers are being established, teams are being formed, but in reality, we are barely getting started with exascale research. Opinions vary as to where we should be focusing our resources.

In this issue, The Exascale Report asks NAG's Andy Jones, Lawrence Livermore's Dona Crawford, and Growth Science International's Thomas Thurston where should we (as a global community) be placing our efforts today with exascale research and development?"

Friday 18 March 2011

Performance and Results

[Originally posted on The NAG Blog]

What's in a catch phrase?

As you will hopefully know, NAG's strapline is "Results Matter. Trust NAG".

What matters to you, our customers, is results. Correct results that you can rely on. Our strapline invites you to trust NAG - our people and our software products - to deliver that for you.

When I joined NAG to help develop the High Performance Computing (HPC) services and consulting business, one of the early discussions raised the possibility of using a new version of this strapline for our HPC business, reflecting the performance emphasis of the increased HPC activity. Probably the best suggestion was "Performance Matters. Trust NAG." Close second was "Productivity Matters. Trust NAG."

Thursday 17 March 2011

The Addictive Allure of Supercomputing

The European Medical Device Technology (EMDT) magazine interviewed me recently. InsideHPC also has pointed to the interview here.

The interview discusses false hopes of users: "Computers will always get faster – I just have to wait for the next processor and my application will run faster."

We still see this so often - managers, researchers, programmers even - all waiting for the silver bullet that will make multicore processors run their application faster with no extra effort from them. There is nothing now or coming soon that will do that except for a few special cases. Getting performance from multicore processors means evolving your code for parallel processing. Tools and parallelized library plugins can help - but in many cases they won't be a substitute for re-writing key parts of the code using multithreading or similar techniques.
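As a rough illustration of what "evolving your code" can involve, here is a minimal sketch in Python. The kernel (`partial_sum`), the chunking helper (`split_range`) and the worker count are all hypothetical names and values chosen for this example; it uses worker processes from the standard library rather than threads, which is one of the "similar techniques" rather than multithreading itself.

```python
import math
from concurrent.futures import ProcessPoolExecutor


def partial_sum(bounds):
    """Serial kernel: sum of sqrt(i) for i in [start, stop)."""
    start, stop = bounds
    return sum(math.sqrt(i) for i in range(start, stop))


def split_range(n, parts):
    """Split [0, n) into `parts` contiguous chunks of near-equal size."""
    step, extra = divmod(n, parts)
    chunks, start = [], 0
    for p in range(parts):
        stop = start + step + (1 if p < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def parallel_sum(n, workers=4):
    """Same sum as partial_sum((0, n)), up to rounding, using worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, split_range(n, workers)))


if __name__ == "__main__":
    n = 200_000
    print("serial:  ", partial_sum((0, n)))
    print("parallel:", parallel_sum(n))
```

Even in this toy case the programmer has to decide how to divide the work and how to combine the partial results, and the parallel answer can differ from the serial one in the last few digits because the additions happen in a different order. None of that comes for free from a newer processor.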

Saturday 30 October 2010

Comparing HPC across China, USA and Europe

[Originally posted on The NAG Blog]

In my earlier blog post today on China announcing the world's fastest supercomputer, I said I'd be back with more later on the comparisons with the USA, Europe and others. In this morning's blog, I made the point that the world's fastest supercomputer, in itself, is not world changing. But leading supercomputers, critically matched with appropriate expertise in programming and using them, together with the vision to ensure use across basic research, industry and defence applications, can indeed be strategically beneficial to a nation - including real economic impact.

There are plenty of reports and studies describing the strategic impact of HPC within a given organisation or at national levels (some are catalogued by IDC here), so let's take it as a premise for the following thoughts.

Friday 29 October 2010

Why does the China supercomputer matter to western governments?

[Originally posted on The NAG Blog]

There has been a lot of fuss in the mainstream media (BBC, FT, CNET, even the Daily Mail!) over the last few days about the world's fastest supercomputer being in China for the first time. And much ado on Twitter (me too - @hpcnotes).

But much of the mainstream reporting, twitter-fest, and blogging is missing the point I think. China deploying the world's fastest supercomputer is news (the fastest supercomputer has almost always been American for decades, with the occasional Japanese crown). But the machine alone is not the big news.

Monday 13 September 2010

Do you want ice with your supercomputer?

[Originally posted on The NAG Blog]

“Would you like ice with your drink?” It’s a common question of course. One that divides people – few will think “I don’t mind” – most have a firm preference one way or the other. There are people who hate ice with their drink and those who freak if there is none. National stereotypes have a role to play – in the USA the question is not always asked – it’s assumed you want ice with everything. In the UK, you often have to ask specifically to get ice.

Yet the role of ice in making our drinks chilled is misleading. I once had a discussion with a leading American member of the international HPC community about this. “No ice”, he was complaining as we headed out of a European country, “they had no ice for the drink”.

“I don’t get this obsession with ice”, I chipped in. “What?!” He looked at me as if I were mad. “Why do you like your coke warm?”

“Ah, but that’s just it”, I replied. “I hate warm drinks – I really like my coke chilled. But surely, in this modern world over a century after the invention of the refrigerator, it’s not unreasonable to expect the fluid to be chilled – without the need to drop lumps of solid water into it?”

“Ah, fair point”, he conceded.

What has this got to do with supercomputing? Perhaps the common thread is that usually we just accept the habitual choices of ways to do things – and don’t often step back to think – “are those the only choices?”

Maybe we should step back a little more often and ask ourselves what we are trying to achieve with HPC – and are the usual choices the only ways forward? Or are there different ways to approach the problem that will deliver simpler, better or cheaper performance?

Perhaps your business/research goals mean you need to conduct more complex modelling or you need faster performance. Maybe the drive of computing technology towards many-core processors rather than faster processors is limiting your ability to achieve this. (I have had several conversations recently, where companies are buying older technology because their software won’t run on multicore).

The “ice or no ice” question might be whether or not to upgrade your HPC with the latest multicore processors. But what about the “just chill the fluid” option? Well, how about upgrading the software instead, or as well?

NAG has plenty of case studies to show where enhancements to software have achieved huge gains in performance or capability (e.g.,

Sometimes buying more compute power is the right answer. Sometimes, extracting more efficient performance from what you have is the answer. Bringing them together - a balance of hardware upgrades and software innovations - might well give you the best chance of optimising cost efficiency, performance and sustainability of performance.

Monday 30 August 2010

Me on HPC 2

Things I have said (or have been attributed as saying - not always the same thing!) - some older interviews with me in various publications about HPC, multicore, etc ...

Successful Deployment at Extreme Scale: More than Just the Iron
The Exascale Report
August 2010, by John West

[full article requires subscription, extracts here are not complete, and are modified slightly to support that]

"cost of science, not just the cost of supercomputer ownership"

"lead time, and funding, to get the user community ready"

"spend a year or more selecting a machine and then deploy it as quickly as possible, makes it very difficult to build a community and get codes ready ahead of time"

"software must be viewed as part of the scientific instrument, in this case a supercomputer, that needs its own investment. High performance computing is really about the software; whatever hardware you are using is just an accelerator system."

"a machine is deployed and then obsolete within three years. And the users often have no idea what architecture is coming next. There is no real chance for planning, or a return on software development investment."

Monday 19 July 2010

Time Machines and Supercomputers

[Originally posted on The NAG Blog]

I found a Linpack App for the iPhone last week. Nothing special, just a bit of five minute fun. It seems a 3G model achieves about 20 MFLOPS. [Note 1]

What's that got to do with time machines? Well it got me thinking "I wonder when 20 MFLOPS was the performance of a leading edge supercomputer?" Actually, it was before the start of the Top500 list (1993), so finding out was beyond the research I was prepared to do for this blog.

So I thought instead about the first supercomputer I used in anger. As soon as I name it, if anyone is still reading this waffle, you will immediately fall into two camps - those who think I'm too young to be nostalgic about old supercomputers yet - and those who think I'm too old to be talking about modern supercomputers :-).

It was a Cray T3D.

You're still waiting for the time machine bit ... hang on in there.

My application on that T3D sustained about 25 GFLOPS. Which is about the same as a high end PC of recent years. What this means to me is that anyone who cares to apply the effort today with a high end PC, could get comparable results to that work of 15-20 years ago that needed the supercomputer.

Or, in other words, that supercomputer gave us a 15-20 year time advantage over everyone who didn't have supercomputers - or a few years over others with smaller supercomputers. [Note 2]

That is one of the key benefits of High Performance Computing - the ability to get a result before a competitor - you could say HPC is a time machine for simulation and modelling.

Now for the [Notes] - which actually contain the real story!

Note 1 : It's not really true to say the iPhone 3G can do 20 MFLOPS - all we can say is that particular App achieved 20 MFLOPS on that iPhone 3G. The result is a function of both the software and the hardware. Better performance can come from optimising the application as much as from buying a more powerful phone.

Note 2 : In fact, even with the same supercomputer, it would be hard for most people to replicate the results - simply because there was as much value in the software (physics, algorithms, performance engineering, implementation, etc) and the associated validation and verification program as there was in the supercomputer.

The supercomputer offered us a time machine. But the attention to performance and scalability in the application enabled us to actually use that time machine to get results faster than others - even if those others used the same supercomputer. And the validation and verification effort meant that we could trust what our time machine was telling us.
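Note 1's point, that a measured FLOPS figure belongs to the software as much as to the hardware, is easy to try for yourself. The sketch below is not the Linpack App; it is a hypothetical micro-benchmark that times two implementations of the same dot product on the same machine, with the function names and problem size chosen purely for illustration.

```python
import operator
import time


def dot_loop(a, b):
    """Naive index-based dot product, one interpreted step per element."""
    total = 0.0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total


def dot_builtin(a, b):
    """Same arithmetic, with the loop driven by built-ins instead of Python code."""
    return sum(map(operator.mul, a, b))


def measure_mflops(dot, n=200_000, repeats=5):
    """Best-of-`repeats` MFLOPS for `dot`, counting 2 flops per element."""
    a = [float(i % 7) for i in range(n)]
    b = [float(i % 5) for i in range(n)]
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        dot(a, b)
        best = min(best, time.perf_counter() - t0)
    return 2 * n / best / 1e6


for name, fn in (("loop", dot_loop), ("builtin", dot_builtin)):
    print(f"{name:8s} {measure_mflops(fn):10.1f} MFLOPS")
```

The two figures will usually differ on the same hardware, and neither is "the" performance of the machine. The actual numbers depend on the interpreter and system used, so none are quoted here.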

Tuesday 22 June 2010

Technical computing futures part 2: GPU and manycore success

[Originally posted on The NAG Blog]

In my previous blog, I suggested that the HPC revolution towards GPUs (or similar many-core technologies) as the primary processor has a lot in common with the move from RISC to commodity x86 processors a few years ago. A new technology appears to offer cheaper (or better) performance than the incumbent, for some porting and tuning pain. Of course, I’m not the first HPC blogger to have made this observation, but I hope to follow it a little further.

In particular, my previous blog suggested the outcome might be: “at first the uptake is tentative ... but in a few years’ time, we might well look back with nostalgia to when GPUs were not the dominant processor for HPC systems” – in other words, hard going initially, but GPU/many-core will “win” eventually. I even ended up with an ambitious promise for my next blog (i.e. this one): “an idea of what/who will emerge as the dominant solution ...”

Continuing the basis of using the past to guess the future, my prediction is that the next steady state of HPC processors will be GPU-like/manycore technologies (for most of the FLOPS at least) and, just like the current steady state (x86), those few companies with the strongest financial muscle will eventually own the dominant market share. However, other companies will have pioneered many of the technologies that make that dominant market share possible, enjoying good market share surges in the process.

I can even have a go at predicting some of the path that might get us to the next steady state of HPC architecture. NVIDIA has already shown us that GPUs for HPC are sometimes a good solution – and importantly, that a good programming ecosystem (CUDA) really helps adoption. Over the last year or so, I’d say the HPC community has moved from “if GPUs can work in this case ...” to “how do I make GPUs work across my workload?”

As Intel’s Knights processors bring us many-core but with a familiar x86 instruction set, we might learn that getting good performance across a broad range of applications is possible, but critically dependent on software tools and hard work by skilled parallel programmers. AMD’s Fusion with tighter links between CPU & GPU, could show that the nature of the integration between the many-core/GPU unit and the rest of the system (be it CPU, network, main memory etc) will affect not only maximum performance on specific applications, but maybe more importantly the ease of getting “good enough” performance across a range of applications.

I don't know of any GPU/many-core/accelerator announcements from IBM, but it’s always possible IBM will throw in another useful contribution before the dust settles. They were one of the first into many-core processors for HPC acceleration with Cell and they cannot be easily counted out of top end HPC solutions - e.g. the forthcoming Blue Waters (POWER7) and Sequoia (BG/Q) chart-toppers.

But back to my “winner” prediction. When the revolution settles into a new steady state of mostly GPU/many-core for HPC processors, there won’t be (can’t be) critical distinctions between the various products anymore for most applications. Whichever product we consider (whether GPU or x86-based or whatever), many-core is sufficiently different from few-core (e.g. 1-8 cores) to mean that the early winners have been those users who are easily able to move their key applications across to get step changes in cost and performance.

The big winners in the next stages of the GPU/manycore emergence will be those users who can move the bulk of their high-value-generating HPC usage to many-core processors with the most attractive transition (economy and speed) compared to their competitors.

So what about the dominant solution I promised? For the technology to be pervasive, first there must be greater commonality between offerings (I stop short of standardization) so that programmers have at least a hope of portability. Secondly, users need to be able to extract the available performance. Ideally these would mean a software method that makes many-core programming “good enough easily enough” is discovered – and if so, that software method will be the dominant solution, across all hardware.

Or, if the magic bullet is still not market ready, skilled parallel programmers will be the dominant solution for achieving competitive performance and cost benefits - just as they are for HPC using commodity x86 processors today.

Wednesday 16 June 2010

Supercomputing's future: Is it CPU or GPU?

[Article by me on ZDNet UK, 16 June, 2010]

Graphics processing units are a hot topic, but that does not assure them a place in supercomputing's future ...

Tuesday 8 June 2010

Revealing the future of technical computing: part 1

[Originally posted on The NAG Blog]

I recall some years ago porting an application code I worked with, which was developed and used almost exclusively on a high end supercomputer, to my PC. Naively (I was young), I was shocked to find that, per-processor, the code ran (much) faster on my PC than on the supercomputer. With very little optimization effort.

How could this be – this desktop machine costing only a few hundred pounds was matching the performance of a four processor HPC node costing many times that? Since I was also starting to get involved in HPC procurements, I naturally asked why we spend millions on special supercomputers, when for a twentieth of the price, we’d get the same throughput from a bunch of high-spec PCs?

The answer then (and now) was that I was extrapolating from only one application, and that application could be run as lots of separate test cases with no reduction in capability (i.e. we didn’t need large memory etc, just lots of parameter space). However, the other major workload (which I also ported and also ran fast on the PC) would not have been able to do the size of problem we wanted on a PC – we needed the larger memory and extra grunt from parallel processing. (We did look at the newfangled Network Of Workstations emerging at the time but decided it might be a wolf in sheep’s clothing. Sorry.)

In the end, we had to find a balance between (a) speed at lowest cost for the one application; (b) the best capability for the other application (i.e. fastest solution time for the largest problems); (c) ease of programming – to get a good enough (fast-enough) code developed with the limited developer effort and funding we had; and (d) whole life affordability.

Why do I foist this reminiscence on you? Because the current GPU crisis (maybe “crisis” is a bit strong – "PR storm" perhaps?) looks very much the same to me. The desktop HPC surprise of my youth has evolved into the dominant HPC processor and so for some years now, we have been developing and running our applications on clusters of general purpose processors – and a new upstart is trying to muscle in with the same tactic – “look how fast and how cheap” – the GPU (or similar technologies – e.g. Larrabee, sorry Knights-thingy).

The issues are the same: (a) for some applications, GPUs offer substantial performance improvements for considerably less cost than a “normal” HPC processor; (b) for other applications, limits such as off-card bandwidth mean that GPUs cannot deliver the required capability; (c) the underlying concern is ease of programming for GPUs; (d) affordability – sure, GPUs are cheap to buy, but what about power costs when in bulk, or code porting costs, etc?

Maybe the result will be the same as when commodity processors and clusters eventually exploded to leave custom supercomputer hardware as the minority solution. At first the uptake (now) is tentative - and painful. Some will have great success stories, many will get burnt. But in a few years’ time, we might well look back with nostalgia to when GPUs were not the dominant processor for HPC systems.

I’ll continue on the future of HPC in my next blog in a few days, including an idea of what/who will emerge as the dominant solution ...