
The AI revolution in scientific research


 

The Royal Society and The Alan Turing Institute

 

The Royal Society is the UK’s national academy of sciences. The Society’s fundamental purpose, reflected in its founding Charters of the 1660s, is to recognise, promote, and support excellence in science and to encourage the development and use of science for the benefit of humanity.

 

The Alan Turing Institute is the UK’s national institute for data science and artificial intelligence. Its mission is to make great leaps in research in order to change the world for the better.

 

In April 2017, the Royal Society published the results of a major policy study on machine learning. This report considered the potential of machine learning in the next 5–10 years, and the actions required to build an environment of careful stewardship that can help realise its potential. Its publication set the direction for a wider programme of Royal Society policy and public engagement on artificial intelligence (AI), which seeks to create the conditions in which the benefits of these technologies can be brought into being safely and rapidly.

 

As part of this programme, in February 2019 the Society convened a workshop on the application of AI in science. By processing the large amounts of data now being generated in areas such as the life sciences, particle physics, astronomy, and the social sciences, machine learning could be a key enabler for a wide range of scientific fields, pushing forward the boundaries of science.

 

This note summarises discussions at the workshop. It is not intended as a verbatim record and its contents do not necessarily represent the views of all participants at the event, or Fellows of the Royal Society or The Alan Turing Institute.

 

Data in science: from the t-test to the frontiers of AI

 

Scientists aspire to understand the workings of nature, people, and society. To do so, they formulate hypotheses, design experiments, and collect data, with the aim of analysing and better understanding natural, physical, and social phenomena.

 

Data collection and analysis are core elements of the scientific method, and scientists have long used statistical techniques to aid their work. In the early 1900s, for example, the development of the t-test gave researchers a new tool for extracting insights from data in order to test the validity of their hypotheses. Such mathematical frameworks were vital in extracting as much information as possible from data that had often taken significant time and money to generate and collect.
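
As a minimal illustration of this kind of test in modern practice, the sketch below runs SciPy’s independent two-sample t-test on two invented sets of measurements; the values and variable names are hypothetical and used only for illustration.

import numpy as np
from scipy import stats

# Hypothetical measurements from a control group and a treated group
# (the values are invented purely for this example).
control = np.array([5.1, 4.9, 5.3, 5.0, 4.8, 5.2])
treated = np.array([5.6, 5.4, 5.8, 5.5, 5.7, 5.3])

# Two-sample t-test of the null hypothesis that both groups share the same mean.
result = stats.ttest_ind(control, treated)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")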

 

The application of statistical methods to scientific challenges can be seen throughout history, often leading to discoveries or methods that underpin the fundamentals of science today. For example:

 

• Johannes Kepler’s analysis of Tycho Brahe’s astronomical observations in the early seventeenth century led to his formulation of the laws of planetary motion, which subsequently enabled Isaac Newton FRS (and others) to formulate the law of universal gravitation.

• In the mid-nineteenth century, the laboratory at Rothamsted was established as a centre for agricultural research, setting up continuously monitored experiments in 1856 that are still running to this day. Ronald Fisher FRS – a prominent statistician – was hired in 1919 to direct the analysis of these experiments. He went on to develop the theory of experimental design and to lay the groundwork for many fundamental statistical methods that remain in use today.

• In the mid-twentieth century, Margaret Oakley Dayhoff pioneered the analysis of protein sequencing data, a forerunner of genome sequencing, leading early research that used computers to analyse patterns in the sequences.

 

Throughout the 20th century, the development of artificial intelligence (AI) techniques offered additional tools for extracting insights from data.

 

Papers by Alan Turing FRS through the 1940s grappled with the idea of machine intelligence. In 1950, he posed the question “can machines think?”, and suggested a test for machine intelligence – subsequently known as the Turing Test – in which a machine might be called intelligent if its responses to questions could convince a person that it was human.

 

In the decades that followed, AI methods developed quickly, with a focus in the 1970s and 1980s on symbolic methods, which sought to create human-like representations of problems, logic, and search, and on expert systems, which worked from datasets codifying human knowledge and practice to automate decision-making. These subsequently gave way to a resurgence of interest in neural networks, in which layers of small computational units are connected in a way inspired by connections in the brain. The key issue with all these methods, however, was scalability – they became inefficient when confronted with even modest-sized data sets.

 

The 1980s and 1990s saw a strong development of machine learning theory and statistical machine learning, the latter in particular driven by the increasing amount of data generated, for example from gene sequencing and related experiments. The 2000s and 2010s then brought advances in machine learning, a branch of artificial intelligence that allows computer programs to learn from data rather than following hard-coded rules, in fields ranging from mastering complex games to delivering insights about fundamental science.
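
As a minimal sketch of what “learning from data rather than following hard-coded rules” looks like in practice, the example below fits a standard scikit-learn classifier to the bundled iris measurements; the dataset and the choice of model are arbitrary stand-ins rather than anything discussed at the workshop.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# A small, well-known dataset used here purely as a stand-in for experimental data.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# No rules are written by hand: the classifier infers them from the training examples.
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print(f"Held-out accuracy: {model.score(X_test, y_test):.2f}")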

 


Advances in AI technologies offer more powerful analytical tools

 

The ready availability of very large data sets, coupled with new algorithmic techniques and aided by fast and massively parallel computer power, has vastly increased the power of today’s AI technologies. Technical breakthroughs that have contributed to the success of AI today include:

 

• Convolutional neural networks: multi-layered ‘deep’ neural networks that are particularly well adapted to image classification tasks because they can identify the relevant features required to solve the problem (a minimal sketch follows this list).

• Reinforcement learning: a method for finding optimal strategies in an environment by exploring many possible scenarios and assigning credit to different moves based on performance.

• Transfer learning: the long-standing idea of applying concepts learned in one domain to a new, unfamiliar one. This has enabled deep convolutional networks trained on labelled data to transfer the visual features they have already discovered to classify images from different domains in which no labels are available.

• Generative adversarial networks: these continue the idea of pitting the computer against itself, co-evolving a classifier network with a generator network that produces increasingly challenging synthetic training data for it to distinguish from real examples.
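
To make the first of these techniques concrete, the sketch below defines a small convolutional image classifier in PyTorch; the layer sizes, input resolution, and number of classes are arbitrary choices for illustration rather than a recommended architecture.

import torch
import torch.nn as nn

class SmallConvNet(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Two convolution + pooling stages learn local visual features from the image.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),   # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),   # 16x16 -> 8x8
        )
        # A fully connected head maps the learned features to class scores.
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        x = x.flatten(start_dim=1)
        return self.classifier(x)

# Forward pass on a dummy batch of four 32x32 RGB images.
model = SmallConvNet()
scores = model(torch.randn(4, 3, 32, 32))
print(scores.shape)  # torch.Size([4, 10])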