Big data

Doctors in London have stored more than one and a half thousand beating human hearts in digital form on a computer.

The aim is to discover new treatments by comparing the detailed information on the hearts and the patients' genes.

The project is the latest to make use of advances in storing large amounts of information.

The study is among a wave of new "Big Data" projects that are transforming the way in which research is carried out.

Researchers at the Medical Research Council's Clinical Sciences Centre at Hammersmith Hospital are scanning detailed 3-D videos of the hearts of 1,600 patients and collecting all the genetic information from each volunteer.

Dr Declan O'Regan, who is involved in the heart study, said that this new approach had the potential to reveal much more than normal clinical trials, in which relatively small amounts of health information are collected from patients over the course of several years.

"There is a really complicated relationship between people's genes and heart disease and we are still trying to unravel what that is. But by getting really clear 3-D pictures of the heart we hope to be able to get a much better understanding of the cause and effect of heart disease and give the right patients the right treatment at the right time".

Subtle

The idea of storing so much information on so many hearts is to compare them and to see what the common factors are that lead to illnesses. Dr O'Regan believes that this kind of analysis will increasingly become the norm in medicine.

"There are often subtle signs of early disease that are really difficult to pick up even if you know what to look for. A computer is very sensitive to picking up subtle signs of a disease before they become a problem".

The new big idea across a range of scientific research fields is called "Big Data" and there are some very big numbers involved.

Computers at the European Bioinformatics Institute in Cambridge store the entire genetic code of tens of thousands of different plants and animals. The information occupies the equivalent of more than five thousand laptops.

And to find out how the human mind works, researchers at the Institute for Neuroimaging and Informatics at the University of Southern California are storing 30,000 detailed 3D brain scans, requiring storage equivalent to 10,000 laptops.

And the Square Kilometre Array, a telescope being built in Africa, will collect enough data in one year to fill 300 million laptops. That is 150 times the current total annual global internet traffic.

Transform

Researchers at the American Association for the Advancement of Science meeting in San Jose are discussing just how they are going to store and sift through this mass of data.

According to Prof Ewan Birney of the EBI near Cambridge, Big Data is already beginning to transform the way in which research is being carried out across a range of disciplines.

"Suddenly we don't have to be afraid of measuring lots and lots of things: about humans, about oceans, about the Universe; because we know that we can be confident that we can collect that data and extract some knowledge from it".

The falling cost of storage has helped those developing systems to manage Big Data research, but faced with an imminent tsunami of information, they will have to run to stand still, finding ever more intelligent ways to compress and store it.

The other main issue is how to organise and label the data.

Just as librarians have found ways to classify books by subject or by author, a whole new science is emerging of how to classify scientific data logically so that researchers can find the things they want to find. But given the trillions of pieces of information and the complexity of the scientific fields involved, the task is much harder than organising a library.

The UK's Biotechnology and Biological Sciences Research Council announced at the meeting a £7.45m investment in Big Data infrastructure designed to help.

The emergence of Big Data can be likened to the development of the microscope: a powerful new tool for scientists to study intricate processes in nature that they have never been able to see before.

Omniscience

Those involved in developing Big Data infrastructure believe that the investment will lead to a radical shift in the way research across a variety of disciplines is carried out. They sense that a step toward omniscience is within reach: a way of seeing the Universe as it really is, rather than the distorted view that even scientists have through the filter of our limited brains and senses.

According to Paul Flicek of the EBI, Big Data could lift a veil that has been shrouding important avenues of research.

"One of the things about science is that you don't always discover the important things, you discover what you can discover. But by using larger amounts of data we can discover new things and so what will be found? That is an open question," he told BBC News.

The challenge is for scientists to find new ways to manage this data and new ways to analyse it. Just collecting data doesn't solve any problems by itself.

But properly organised and managed, it could enable scientists to identify rare, subtle events that occur only every so often in nature but have a big effect on our lives. The Higgs boson was discovered in this way.

"We are not going to slow down generating new data," says Prof Flicek. The fact that we have demonstrated that we can generate a lot of this data, we can sequence these genomes. We are never going to stop doing that and so it opens up so many more exciting things. We can learn new things and we can see things we have never seen before".
