Looking back to the 1990s, it is obvious that we had to become digitally literate because the means of communication went digital, and communication is central to the way that we all work. In the 2020s, decision-making is going digital, so we need to become data-literate to keep up. The arrival of a quantum internet is only likely to accelerate the move towards greater data literacy, and we are currently in an experimental phase of what is to come.
Renegotiating the Internet for the first time in decades
The quantum world is already here. Businesses, such as Banco Caixa in Spain, have already been experimenting with quantum software, which captures some of the benefits of quantum analysis on a regular computer. The base code for the new quantum internet, a completely new physical infrastructure for communications, has been agreed and this year developers have been working on building applications. That will present several challenging realities.
Firstly, it seems highly unlikely that we will get a global Quantum Internet. The collaborative construction of the World Wide Web was led by academics back in the 1980s, particularly under the personal leadership of Prof Sir Tim Berners-Lee. Today, governments such as those of Russia and China are already trying to withdraw from the global internet retrospectively, with geo-blocking technologies and legal controls. Modern politicians are in no doubt about the impact the internet has on political and social life. In the coming months, the internet borders are going up, and we don’t really know what that will mean, particularly for international research and collaboration.
How did we get here?
Small changes in base code have historically brought about enormous social upheaval. Web 1.0, released to the public in 1991, relieved internet users of the tediousness of having to know the address of the server where documents were stored. By linking through human-friendly string data URLs, finding documents became easier and the way we looked for information changed forever.
Around 2002, Web 2.0 removed another hindrance: the need to go through someone who could code in HTML and had access to the back end of websites if you wanted to add or change information online. Making the fundamental instructions ‘get’, ‘post’, ‘put’ and ‘delete’ available through clickable buttons opened the floodgates for social media and online shopping. Small changes to the base code of the internet can be revolutionary. Building a new internet is likely to transform our lives in the coming years.
An Internet based on data, not documents
The Quantum Internet offers few benefits if your main purpose is the efficient sharing of cat and dog photos, or any other documents. Document sharing is where the World Wide Web excelled. The Quantum Internet offers new processing power to solve problems with complex data, problems which are currently intractable under existing technologies. The Quantum Internet will, therefore, be built with data, not documents, at its heart, at least in the early years.
Today, many of us take it for granted that data are accessed through a document. We download a data table and study it on our own machines, or analyse unstructured data sets if we have a few more skills. It has always been possible to bypass the document and go straight to the data, though, in theory.
The same URL that made locating documents easier can be attached to individual data points, such as a temperature reading. The header can contain essential information on when and where the data was collected. We can go directly to the data and search, in theory, for data points as easily as we search for cat photos today. This is often referred to as nano tagging. The capability was built into the web from its inception, but tagging and searching individual data points didn’t catch on in people’s imagination. The code to access and analyse that data was also unreliable back then, but that is not the case today, now that we are building the Quantum Internet.
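To make the idea concrete, here is a minimal sketch in Python of what an individually addressable data point might look like. Everything in it is an assumption made for illustration: the `DataPoint` class, the `search` helper, the `data.example.org` URLs and the readings themselves are hypothetical, and it does not describe any actual quantum internet protocol or tagging standard.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DataPoint:
    """One tagged reading. All field names are hypothetical."""
    url: str                # address of this single data point
    value: float            # the reading itself, here in degrees Celsius
    collected_at: datetime  # header metadata: when it was collected
    location: str           # header metadata: where it was collected


def search(points, location, since):
    """Return the points collected at `location` on or after `since`."""
    return [
        p for p in points
        if p.location == location and p.collected_at >= since
    ]


# Made-up readings, purely for illustration.
readings = [
    DataPoint("https://data.example.org/temp/0001", 14.2,
              datetime(2022, 11, 1, 9, 0, tzinfo=timezone.utc), "Berlin"),
    DataPoint("https://data.example.org/temp/0002", 9.8,
              datetime(2022, 11, 20, 9, 0, tzinfo=timezone.utc), "Berlin"),
    DataPoint("https://data.example.org/temp/0003", 18.5,
              datetime(2022, 11, 20, 9, 0, tzinfo=timezone.utc), "Madrid"),
]

recent_berlin = search(
    readings, "Berlin", datetime(2022, 11, 10, tzinfo=timezone.utc)
)
print([p.url for p in recent_berlin])
```

The point of the sketch is that the search runs over the metadata attached to each reading, not over a document that happens to contain a table of readings.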
Implications for research
Big research publishers have already begun work to nano-tag scientific papers at the word level. This will facilitate academic literature reviews, but it is not a straightforward process. The precise semantics of data, or what a data point represents and means, need to be negotiated and agreed on in advance. For example, in chemistry, there is no simple ‘carbon’. The different atomic structures of carbon are immensely important. In contrast, in the environmental sciences or business research, one word, ‘carbon’, is usually more than adequate to cover most uses. There is a risk that data in the quantum internet will be nano-tagged long before we have had the opportunity to negotiate and agree on the semantics of that data.
It also raises questions about where research is going. We know that the current publication system is heavily flawed. Publication bias has meant that a disproportionate amount of praise has been heaped on studies that achieve statistical significance at the 95% confidence level in unusual and novel applications. Yet if 20 studies are carried out in a similar field, and only one produces a statistically significant result at the 95% level required for publication, there’s a high chance that study is simply the one-in-20 false positive we would expect to occur by chance. The praise gets heaped on it anyway.
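The arithmetic behind that claim can be checked in a few lines of Python. This is a sketch under two simplifying assumptions that are not in the text above: the 20 studies are independent, and none of them is testing a real effect. The function name and variables are illustrative only.

```python
ALPHA = 0.05     # significance threshold (the "95%" level)
N_STUDIES = 20   # number of similar studies, as in the example above


def false_positive_summary(alpha, n_studies):
    """Assume independent studies and no real effect in any of them."""
    # Average number of studies that reach significance by chance alone.
    expected = n_studies * alpha
    # Probability that at least one study reaches significance by chance.
    p_at_least_one = 1 - (1 - alpha) ** n_studies
    return expected, p_at_least_one


expected, p_at_least_one = false_positive_summary(ALPHA, N_STUDIES)
print(f"Expected chance findings: {expected:.1f}")
print(f"Probability of at least one: {p_at_least_one:.2f}")
```

Under those assumptions, one "significant" result out of 20 is exactly what chance alone would produce on average, and there is roughly a 64% probability of seeing at least one.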
Since the integrity of the academic publishing system was challenged in the early 2000s, businesses and industries have been more vocal in complaining about the challenges of replicating academic research results in their own trials. Academics rarely carry out replication studies themselves, as there is a slim chance of publication. A similar issue has appeared with machine learning solutions, which have also notoriously struggled to adapt to applications ‘in the wild’, with real-world messy data.
In the US, the Food and Drug Administration Amendments Act of 2007 requires clinical trials of experimental drugs, and their results, to be registered formally, bypassing the publication process and addressing the problem of publication bias. This may be a model we see more of in future. ‘Published with data’ studies have already become the gold standard of academic publishing. When the results of all research, disappointing or not, can be shared on the Quantum Internet and nano-tagged for easy search, we may have a more realistic view of the progress of science, but that is not guaranteed. It is unclear where peer review will fit into the process of collecting and labelling all the data we share on the Quantum Internet. A few gains in one direction may do very little to counteract the losses in another if we don’t formalise these processes soon.
New analytics need a new kind of university expert
The biggest challenge universities face from the data literacy revolution, however, is that it is interdisciplinary. We need both subject specialists in a variety of fields and people trained to make sense of the new analytics. We have always had quantitative researchers in many social sciences, but few are comfortable using machine learning algorithms, and even fewer have dabbled in quantum computing. This is a problem, as we need people who can make sense of the semantics of data, or what the data means, as well as the syntax, or the ‘grammar’ of how the data has been put together and analysed and how the results were produced.
These are the two sides of a language. They cannot be studied in isolation, and they require truly interdisciplinary work, not multi-disciplinary work, which merely requires people to work side by side in ignorance of each other’s skills. Universities tend to be built on a strict division of work into siloed subjects. These convenient faculty and departmental divisions mask the complexity of the modern world, and the problem is only getting worse. We don’t really know what it would look like to be a subject specialist and be truly data-literate, but it’s something we need to start to define. Even something as simple as bridging finance budgets for interdisciplinary projects across faculties can be like trying to cross continents in many universities.
These are all big questions around the future of teaching and research in Higher Education in the coming years. Personally, even though I see a high demand for non-academic routes into my field, data analytics, I’m a firm believer in Higher Education and the role it plays in developing the range of skills that we need for a fully data-literate society. Making decisions in human-machine teams requires critical thinking, fair and accurate reporting, and clarity over the limitations of work. These are all skills that find a natural home in academia. Machine learning algorithms bring with them a host of problems of poor contestability and data biases. Quantum computing is only going to magnify these issues, and academics may be able to steer us through the early, difficult phases.
But I also do not believe that Higher Education teaching or research will somehow enjoy a privileged, ring-fenced status. When the new human-machine labour divisions fully arrive, and these technologies are now reaching full maturity, Higher Education is going to need to make some big changes, just like everyone else. Now is a good time to reconsider fundamental roles.
The Institute of Analytics (IoA) is the leading global body for analytics and data science professionals. We help our members stay ahead of the curve of digital reform through access to the knowledge and training needed to thrive in the data age. To find out more, or to get your analytics courses accredited, visit ioa.global.org
Clare will be speaking at the OEB Academic Plenary on Friday, November 25 2022