Category Archives: Reblogged

Happy birthday BASIC!

Yesterday marked the 50th anniversary of the BASIC programming language!

BASIC is the language in which many of us started our adventures in scientific computing, and the current generation of data scientists owes it a great deal of respect. Dartmouth College is holding a series of celebrations, in the very hall where the first BASIC code was run, to mark this special anniversary; many of the events can be viewed online, and the college has put together a documentary on the history and impact of the language.

Check out the party here:



IPython 2.0 has been released!

IPython 2.0 is actually out now! Yay!
If you don’t believe me then just:

pip install --upgrade ipython

See the full list of what’s new at [ ], but the highlights are:

  • interactive widgets for the notebook
  • directory navigation in the notebook dashboard
  • persistent URLs for notebooks
  • a new modal user interface in the notebook
  • a security model for notebooks.

You can check out the example IPython notebooks on nbviewer.

The guys and gals at IPython HQ are asking that we all please give it a test. They plan to have 2.0.1 released within a month – based on the initial feedback.

Bring on the bug reports!

Are modern scientists being pushed out?

“With fewer publications under their belt, those that spend time developing the computing and analysis tools needed by the scientific community can find themselves pushed out of the market, and out of academia.”

This recent article in The Conversation highlights the plight of the data scientist that wants to stay in traditional science. Is the time spent building tools and writing code better spent writing papers and getting published?

Originally posted on The Conversation by Geraint Lewis of Uni Sydney and by Chris Power of Uni WA on 26 February 2014 – [ ].


The challenge of the modern scientist is to avoid career suicide

Close your eyes and picture a scientist. What do you see?

Perhaps an Albert Einstein, staring intently at a blackboard covered in incomprehensible equations, or an Alexander Fleming, hunched over the laboratory bench, poring over a Petri dish?

The likelihood is that you will imagine the scientist as an individual of great intellect, grappling heroically with nature’s secrets and looking for the “Eureka!” moment that will transform our understanding of the universe.

This notion of the individual effort is implicit in the everyday language of scientists themselves. We talk of Newton’s Laws of Motion or Mendelian Inheritance. We have the annual pronouncements of the Nobel committee, which awards science prizes to at most three living individuals in each category.

Contemporary popular culture presents us with characters such as Big Bang Theory’s Sheldon Cooper, single-mindedly and single-handedly in pursuit of a theory of everything.

But the practice of science over the last century has witnessed a significant shift from the individual to the group, as scientific research has become more specialised and research problems have become more complex, requiring increasingly sophisticated approaches.

The lone scientist appears to be almost a myth.

The rise of ‘Big Science’

Much of science, as it is conducted now, is Big Science, characterised by major international collaborations supported by multi-government billion dollar investments.

Examples include the effort to build the next atom smasher to hunt for the Higgs boson, a telescope to uncover the first generation of stars or galaxies, and the technology to unravel the complex secrets of the human genome.

One of the key driving forces behind this wonderful growth in science has been the similarly spectacular growth in computer power and storage. Big Science now equals Big Data – for example, when the Square Kilometre Array starts observing the sky in 2020, it will generate more data on its first day than will have existed on the internet at that time.

Powerful supercomputers are the tool researchers use to sift through the wealth of data produced by observations of the universe, large and small.

At the same time, they are harnessed to provide insights into complex phenomena in simulated universes – from the way atoms and molecules arrange themselves on the surfaces of novel materials, to the complexity of folding proteins, and the evolution of structure in a universe dominated by dark matter and dark energy.

Big Science has resulted in a spectacular growth in our understanding of the universe, but its reliance on cutting-edge computing has presented a number of new challenges, not only in the cost and running expenses of supercomputers and massive data stores, but also in how to take advantage of this new power.

The Big Science bottleneck

Unlike general computer users – who may want to simply check email, social media or browse photos – scientists often need to get computers to do things that haven’t been done before. It could be anything from predicting the intricate motions of dark matter and atoms in a forming galaxy to mining the wealth of genetic data in the field of bioinformatics.

And unlike general users, scientists seldom have off-the-shelf solutions and software packages to solve their research problems. They require new, home-grown programs that need to be written from scratch.

But the training of modern scientists poorly prepares them for such a high tech future. Studying for a traditional science degree that focuses upon theory and experiment, they get limited exposure to the computation- and data-intensive methods that underpin modern science.

This changes when they enter their postgraduate years – these scientists-in-training are now at the bleeding edge of research, but the bleeding-edge computational tools often do not exist and so they have to develop them.

The result is that many scientists-in-training are ill-equipped to write software (or code, in the everyday language of a researcher) that is fit for purpose. And, just as with driving and child-rearing, they are likely to get very cross if you attempt to criticise their efforts, or suggest there is a better way of doing something.

This systemic failing is compounded by a view that the writing of good code is not so much a craft as a trivial exercise in the true effort of science (an attitude that drives us to despair).

For this reason, it is probably unsurprising that many fields are awash with poor, inefficient codes, and data-sets too extensive to be properly explored.

Coding the future

Of course, there are those to whom efficient and cutting-edge coding comes a lot more naturally. They can write the programs to simulate the Universe and take advantage of new GPU-based supercomputers, or efficiently interrogate the multi-dimensional genomic databases.

Writing such codes can be a major undertaking, consuming the entire three to four years of a PhD. Some are able to use their codes to obtain new scientific results.

But too often the all-consuming nature of code development means that an individual researcher may not uncover the major scientific results, missing out on the publications and citations that are the currency of modern science.

Those that can code are out of a job

Other researchers, those that just use rather than develop such codes, are able to reap the rewards, and this better paves their way into an academic career. The rewards go to those that seek to answer the questions, not those that make it happen.

With fewer publications under their belt, those that develop the tools needed by the scientific community find themselves pushed out of the market, and out of academia.

Some senior academics recognise this path to career suicide, and young researchers are steered into projects with a more stable future (as stable as academic careers can be).

But we are then faced with a growing challenge: who will develop the necessary tools for Big Science to continue to flourish?

How to grow an early scientist

So, what’s the answer? Clearly, science needs to make a cultural change in its understanding of what makes a good modern scientist.

As well as fertilising links with our computer scientist colleagues, we need to judge early scientists on more than their paper output and citation count. We need to examine their contribution in a much broader context.

And within this context, we need to develop a career structure that rewards those who make the tools that allow Big Science to happen. Without them, supercomputers will groan with inefficient code, and we are simply going to drown in the oncoming flood of data.

You can already code…

This article, although not written with scientists in mind, does a brilliant job of capturing the anxiety and confusion suffered by most PhD students trying to learn to code.
Originally posted on [ ] by Ed Rex in the section On Coding – [ ]

When someone tells you they code, it’s as if they’re calling you from inside the world’s most exclusive club. It’s probably a pretty great party in there, but you’ve got no idea how they got on the guest list and you’re fairly sure that even if they came out, floored the bouncer and physically carried you in, the bar staff would spot your trainers and you’d find yourself back on this side of the door in ten minutes. Like speaking Chinese or perfecting the moonwalk, coding is just one of those things you’ll never be able to do.

This, of course, is a complete myth. There’s nothing stopping you learning to code. In fact, you could start right now. Go on – don’t even read to the end of this post. Click here instead. You’ll have written your first lines of code before you next check Facebook. Or here if you want to make a website. Or here if you fancy giving an iPhone app a go. Like most things, getting started turns out to be as simple as Googling it and clicking on the first link that’s not an ad. Every coder out there has to start from square one at some point.

But you’re not really starting from square one. Because really, deep down, you already know how to do it. Code is instructions. You write the instructions, and the computer follows them. Any time you’ve given someone directions to your house, or typed in a sum on a calculator, or lined up a row of dominoes, you’ve essentially been coding. The person following your directions, you pressing the equals button, knocking over the first domino – that’s the code being run. Coding is pretty much teaching a series of steps to a computer, for the sole reason that it can follow those steps a hell of a lot quicker than you can.
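To make that idea concrete, here is a minimal sketch in Python (all the names here are made up for illustration): the directions to your house are just data, and a tiny "computer" function follows them step by step. Running the function is the code being run.

```python
# Directions to a house, written down as instructions (plain data).
directions = [
    ("walk", 3),        # walk 3 blocks
    ("turn", "left"),
    ("walk", 2),
    ("turn", "right"),
    ("walk", 1),
]

def follow(steps):
    """A tiny 'computer': carry out each instruction in order."""
    blocks_walked = 0
    turns_taken = []
    for action, arg in steps:
        if action == "walk":
            blocks_walked += arg
        elif action == "turn":
            turns_taken.append(arg)
    return blocks_walked, turns_taken

print(follow(directions))  # → (6, ['left', 'right'])
```

Nothing here is special to computers: you could hand the same list to a friend and they would end up at the same front door. The machine just follows the steps a lot faster.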

Running your first line of code and seeing it do whatever it was you told it to, you quickly realise this is something you could get used to. Most of us love giving orders, and when you sit down to code you’ve got what amounts to an uncomplaining, untiring, unerring servant literally at your fingertips. Sure, you have to issue your edicts in a fairly precise way – but ask nicely and it will do pretty much anything for you. And learning the language is easier than you might think; you’ll quickly find that amateur coders are probably the third best served group on the internet, losing out only to Google Incognitos and cat-lovers. For literally every problem you come across, someone will have had it before, asked the rest of the world about it, and received an answer that sounds like it’s been taken straight out of a computer science textbook. It’s as if Tim Berners-Lee is sitting in a room somewhere, scouring the Internet for helpless beginners, and answering each of their questions in turn under a different, ill-judged pseudonym. Bless him.

There’s the usual spiel about the astronomical salaries, the free lunches, the wearing hoodies to work – but you already know all that. Everyone has since they made that film about Justin Timberlake going to Harvard. No, a better reason to start coding, one that may trample all over your better judgement, is that it’s fundamentally creative. You just have to look at what some of the tech companies out there are doing – the Twitters and Apples of this world – to see that this much is true. Thinking that coding is the nerdy IT guy at work rebooting your computer is like thinking that music is what happens when the piano tuner comes round.

Let’s be clear – like anything, getting really good is tough. Unless you happen to be a 7-year-old, you’re probably not going to find time to rack up your 10,000 hours. But that’s not what most of us are going for, and it’s certainly no reason not to pick it up. So if you’ve ever thought you’d like one day to give it a go, treat today as that day. Or at least some time this week. Because, basically, you can already do it.