Monday 5 October 2015

Essential Analogies for the HPC Advocate

This is an update of a two-part article I wrote for HPC Wire in 2013: Part 1 and Part 2.

An important ability for anyone involved in High Performance Computing (HPC or supercomputing or big data processing, etc.) is being able to explain just what HPC is to others.

“Others” include politicians, Joe Public, graduates possibly interested in HPC, industry managers trying to see how HPC fits into their IT or R&D programs, or family asking for the umpteenth time “what exactly do you do?”

One of the easiest ways to explain HPC is to use analogies that relate the concepts to things that the listener is more familiar with. So here is a run-through of some useful analogies for explaining HPC or one of its concepts:

The simple yet powerful: A spade


Need to dig a hole? Use the right tool for the job – a spade. Need to dig a bigger hole, or a hole through tougher material like concrete? Use a more powerful tool – a mechanical digger.

Now instead of digging a hole, consider modeling and simulation. If the model/simulation is too big or too complex – use the more powerful tool: i.e. HPC. It’s nice and simple – HPC is a more powerful tool that can tackle more complex or bigger models/simulations than ordinary computers.

There are some great derived analogies too. You should be able to give a spade to almost anyone and they should be able to dig a hole without too much further instruction. But hand a novice the keys to a mechanical digger, and it is unlikely they will be able to operate the machine effectively without either training or a lot of on-the-job learning. Likewise, HPC requires training to be able to use the more powerful tool effectively. Buying a mechanical digger also requires expertise that buying a spade doesn’t. And so on.

This analogy neatly focuses on the purpose and benefit of HPC rather than the technology itself. If you’ve heard any of my talks recently you will know this is an HPC analogy that I use frequently myself.

The moral high ground: A science/engineering instrument


I’ve occasionally accused the HPC community of being riddled with hypocrites – we make a show of “the science is what matters” and then proceed to focus the rest of the discussion on the hardware (and, if feeling pious or guilty, we mention “but software really matters”).

However, there is a critical truth to this – the scientific (or engineering) capability is what matters when considering HPC. I regularly use this perspective, often very firmly, myself: a supercomputer is NOT a computer – it is a major scientific instrument that just happens to be built using computer technology. Just because it is built from most of the same components as commodity servers does not mean that modes of usage, operating skills, user expectations, etc. should be the same. This helps to put HPC into the right context in the listener’s mind – compare it to a major telescope, a wind tunnel, or even LHC@CERN.

The derived analogies are effective too – expertise in the technology itself is required, not just the science using the instrument. Sure, the skills overlap but they are distinct and equally important.

This analogy focuses on the purpose and benefit of HPC, but also includes a reference to it being based on a big computer.

Thursday 27 August 2015

The price of open-source software - a joint response

This viewpoint is published jointly on software.ac.uk, hpcnotes.com (personal blog), and danielskatzblog.wordpress.com (personal blog) under a CC-BY licence. It was written by Neil Chue Hong (Software Sustainability Institute), Simon Hettrick (Software Sustainability Institute), Andrew Jones (@hpcnotes & NAG), and Daniel S. Katz (University of Chicago & Argonne National Laboratory).

In their recent paper, Krylov et al. [1] state that the goal of the research community is to advance “what is good for scientific discovery.” We wholeheartedly agree. We also welcome the debate on the role of open source in research, begun by Gezelter [2], in which Krylov was participating. However, we have several concerns with Krylov’s arguments and reasoning on the best way to advance scientific discovery with respect to research software.

Gezelter raises the question of whether it should be standard practice for software developed by publicly funded researchers to be released under an open-source licence. Krylov responds that research software should be developed by professional software developers and sold to researchers.

We advocate that software developed with public funds should be released as open source by default (supporting Gezelter’s position). However, we also support Krylov’s call for the involvement of professional software developers where appropriate, and support Krylov’s argument that researchers should be encouraged to use existing software where possible. We acknowledge many of Krylov’s arguments for the benefits of professionally written and supported software.

Our first major concern with Krylov’s paper is its focus on arguing against an open-source mandate on software developed by publicly funded researchers. To the knowledge of the authors, no such mandate exists. It appears that Krylov is pre-emptively arguing against the establishment of such a mandate, or even against it becoming “standard practice” in academia. There is a significant difference between a recommendation of releasing as open-source by default (which we firmly support) and a mandate that all research software must be open source (which we don’t support, because it hinders the flexibility that scientific discovery needs).

Our second major concern is Krylov’s assumption that the research community could rely entirely on software purchased from professional software developers. We agree with this approach whenever it is feasible. However, by concentrating on large-scale quantum chemistry software, Krylov overlooks the diversity of software used in research. A significant amount of research software is at a smaller scale: from few-line scripts to short programs. Although it is of fundamental importance to research, this small-scale software is typically used by only a handful of researchers. There are many benefits in employing professionals to develop research software but, since so much research software is not commercially viable, the vast majority of it will continue to be developed by researchers for their own use. We do advocate researchers engaging with professional software developers as far as appropriate when developing their own software.

Our desire is to maximise the benefit of software by making it open—allowing researchers other than the developers to read, understand, modify, and use it in their own research—by default. This does not preclude commercial licensing where it both is feasible and is the best way of maximising the software benefit. We believe this is also the central message of Gezelter.

In addition to these two fundamental issues with Krylov, we would like to respond to some of the individual points raised.