The fourth dialogue of the Road to Bern via Geneva initiative titled ‘Using (big) data: Making data user friendly’ was held online on 14 October 2020. The event was organised by the Permanent Mission of Switzerland to the UN in Geneva and the Geneva Internet Platform (GIP), and was co-hosted by the World Economic Forum (WEF) and the International Telecommunication Union (ITU).
The session was opened by Amb. Jean-Pierre Reymond, who noted that the UN World Data Forum, now postponed to 2021, is an opportunity to better use data sets and official statistics to foster the implementation of the sustainable development goals (SDGs).
The 2018 edition of the UN World Data Forum declared that the data demands of the 2030 Agenda require urgent new solutions that leverage the power of new data sources and technologies through partnerships between national statistical authorities, the private sector, civil society, academia and other research institutions. There is a clear need for partnerships. The cross-sectoral dialogues ‘Road to Bern…via Geneva’ were organised to best implement this recommendation in co-operation with eight Geneva-based organisations. The four dialogues have been designed to follow the lifecycle of data (the collection, protection, sharing, and use of data) in order to identify and fulfil some of the most pressing data demands of the 2030 Agenda.
Welcome and opening remarks
Statistics show that 90% of all data was created in the last 2 years, highlighted Dominic Waughray (Managing Director, World Economic Forum [WEF]). It is predicted that 463 exabytes of data will be created by 2025. As we enter the decade of action, we need to explore new ways to fast-track the SDGs, and we must harness data to do so. WEF data shows that 70% of the SDGs’ underlying targets can be supported by technological innovation, and that technology can have a high impact on 10 out of 17 SDGs. Data in the 21st century is the new oil, the new powerhouse. By 2025, 49% of the world’s data will be in public cloud environments, which is a change in terms of who can access it, and what can be measured, monitored, and challenged. Nearly 30% of it will be in real time, allowing for faster, user-centric, contextualised decision-making. Analytics-based insights will grow exponentially and will help organisations, governments, and businesses make better decisions regarding resource allocation. The explosion of new technologies and cloud-based analytic transparency will change international co-operation and overseas development assistance. The WEF treats harnessing technologies such as big data as a fundamental component of its approach to sustainable development. The WEF hosts centres for the 4th industrial revolution (4IR) worldwide to foster technology governance, minimise the risks, and maximise the benefits of new technologies for society. The WEF also hosts partnerships like the 2030 Vision, which aims to channel the use of 4IR technologies towards the achievement of the SDGs. In 2019, the WEF also signed a strategic partnership framework with the UN which outlines areas of co-operation to deepen institutional engagement and jointly accelerate the implementation of the 2030 Agenda; one of these areas is digital co-operation.
The partnership is focused on meeting the needs of the 4IR in the context of the 2030 Agenda, including analysis, dialogue, and standards for digital governance and digital inclusiveness.
The pandemic has shown how ingrained digital technologies are in our lives, and it is clear that our future will be increasingly digital. The use of data has evolved from old, linear models to models where the focus is on increased efficiency and better decision-making. The challenges and opportunities show that much more is needed from Geneva’s international agencies and organisations to meet the 2030 Agenda goals, namely transformative partnerships and innovative mechanisms to accelerate the impact. In collaboration with Salesforce and Deloitte, WEF also launched UpLink, a digital platform to crowdsource innovation to overcome challenges of achieving the SDGs. One of those challenges is data visualisation. The WEF has been making data tangible through transformative maps, showing linkages between data to help the user guide themselves through complex challenges.
The first year of the decade of action has almost passed, and there are a little more than 9 years to deliver on the 17 SDGs, noted Ms Doreen Bogdan-Martin (Director, Telecommunication Development Bureau, International Telecommunication Union [ITU]). With progress lagging behind on almost every SDG, new ideas and approaches that can accelerate solutions are needed. The ITU has long advocated for the catalytic role of digital technologies in accelerating sustainable development. This role is underlined by the release of the UN Secretary General’s Roadmap, the strong focus on digital at this year’s UN General Assembly debates, and COVID-19 putting the spotlight on what it means to be unconnected in the digital age. The pandemic has illustrated the vital importance of meaningful connectivity to people’s livelihoods, employment, health and wellbeing, education, and social participation. It has also served as a wake-up call to the global community to renew efforts to connect the 3.6 billion people still offline. The UNSG’s roadmap also underlines that the global community needs to mobilise, energise, and find ways to collaborate more closely and more productively to achieve the SDGs. Bogdan-Martin noted that Geneva’s unique international ecosystem can contribute to identifying and fulfilling some of the most pressing demands of the 2030 Agenda.
Extracting value from big data is a complex activity and the results must be used effectively, Bogdan-Martin noted. It is important that all types of data are made accessible and available to inform the policy-making process. The ITU project ‘Big Data for measuring the Information Society’ identified extracting useful information from very large amounts of data as a complex challenge that requires specific skills. As big data is a relatively new field, big data scientists are scarce, especially in developing countries, and policymakers fail to understand the potential and the value of big data. Visualising big data through compelling stories is an important tool for promoting insights into results, so that policymakers understand the information that the data yields and use it in decision-making processes. The ITU has already begun this journey, and impactful and meaningful visualisations will be showcased in the 2020 edition of its ICT Facts and Figures publication, part of its Measuring Digital Development series, which will be launched next month.
Panel debate on the use of big data
The panel discussion was moderated by Steven Ramage (Group on Earth Observations, GEO) who asked the speakers to elaborate on different aspects related to the use of data and data visualisation.
Use of big data
Andiswa Mlisa (Managing Director, Earth Observations, South African National Space Agency [SANSA]) noted the importance of appropriate and useful products that governments can use for policy response. She stated that big data has five components: data visualisation and user interaction, data analytics, data processing architectures, skills for big data, and data management. When cloud computing and high-performance computing are added, they form hybrid infrastructures. Combining these aspects enables SANSA to respond to challenges that society faces in areas such as climate change, agriculture, and water resource management.
From an economist’s perspective, the advantage of big data is that it can possibly provide answers to questions we could not answer before, noted Jonathan Schwabish (Senior Fellow, Urban Institute). By merging different data sets, analysts can better understand behaviour and ultimately create better public policies based on this new understanding. From the data visualisation perspective, big data provides a lot of opportunities as it enables the visualisation of many things. It also poses challenges, such as narrowing things down to tell a useful and coherent story, and securing the computing power needed to handle the work and the data. The impact on underrepresented groups of combining big data with other data sources also needs to be considered, including not only how the data can be used for nefarious purposes, but also how data communicators and data analysts present information to the people they analyse and communicate with.
Big data can solve some of the world’s biggest challenges, highlighted Kate Kallot (Head of Emerging Areas, Nvidia), if governments have the data, the platforms to process it, and data science skills. She noted that governments lack core data science skills and that there is no mechanism to match problem owners, problem solvers, and solution providers to facilitate the collaborations required to get solutions off the ground and to scale. This is why collaboration between different bodies is necessary.
Ronald Jansen (Assistant Director, Chief of Data Innovation and Capacity Branch, UN Statistical Division) provided a few examples of the advantages of big data. The Global Working Group on Big Data works on the use of big data for price indexes and daily tracking of price developments, which were especially important during lockdowns. Another example is the automatic identification system (AIS), which tracks the positioning of ships almost in real time. The last example Jansen provided was the use of mobile data to monitor how government interventions to reduce the spread of COVID-19 affected human mobility, which was done in Ghana and the Gambia, for example.
Tariq Khokhar (Head, Data for Science and Health Priority Area, Wellcome Trust) highlighted that there is a plethora of excellent uses of big data in science and health, boosted by the availability of tools and infrastructure, as well as the skills and incentives for people to work on these problems. In the field of genetics, huge volumes of data have been generated for quite a long time, and sophisticated tools and workflows have already been built. After the emergence of the COVID-19 pandemic, sequencing analysis and data sharing started occurring in record time as a result of the infrastructure and tooling that already existed. In the field of medical imaging, machine learning has been making large strides in computer-assisted mammography, both in assistance and in diagnostics. An upcoming area is the large-scale analysis of electronic health records.
Moral standard for using big data
Standards have not been set yet and that is part of the problem, Schwabish noted. For many researchers and analysts, there are processes which dictate how to use big data ethically and carefully with privacy and security concerns, such as institutional review boards. However, these boards do not take into account the bias inherent in the data itself. These biases are structural issues in societies, and an interdisciplinary approach where social scientists work with computer scientists is needed to solve these issues, noted Khokhar.
Skills needed for the effective use and communication of big data
Kallot pointed out that innovation and big data are being driven by different actors in different parts of the globe: major tech companies are driving innovation in the Americas; governments in Asia; individuals in Europe; and in Africa, there is bottom-up artificial intelligence (AI) and data science innovation. African national statistical offices and governments still lack statistical skills, but there is an emerging developer and data science community that can help solve the problems. International collaborations need to start including the developer community and the data science community, as they have the required skills and can help invest in developing those skills in the region in the long term.
Mlisa noted that when building collaborations, we need to work as ecosystems, rather than as individual institutions or individual disciplines. One of the examples she gave was connecting with the data science competition platform Zindi to use data to map informal settlements, so that the government could deliver aid such as food parcels and water resources, helping to solve the problem of addressing in those settlements.
A multidisciplinary team is needed to actually execute things, Jansen reiterated. He noted that quite a few statistical offices are setting up data science centres adjacent to the national statistical office, so data scientists are becoming part of their work, and statisticians and data scientists are working together on big data projects. Data engineers are needed to make applications robust enough for statistical officers to use them, process the data, and work with the results, including presenting them to the public. It is up to each institute to decide which skills it will keep in-house and for which skills it will seek to partner with other institutes.
Make data analysis results more tangible through visualisation and storytelling
Khokhar noted that there are three elements that would make data visualisation actionable. (a) It needs to be adjusted to the audience’s context. (b) It needs to generate insights that are actionable. (c) It needs to be simple, as people have limited bandwidth to take in information, and picking one idea is preferable. Schwabish underlined the importance of understanding who the target audience is. It is important to deliver the content in the way the audience needs it and understands it. Ramage noted that ‘audience before content’ (ABC) is a good principle to follow. Mlisa highlighted that information should be presented in an easy-to-understand manner, and also noted that information can be made available to commercial platforms to further build applications and products.
Big data for evidence-based policy-making
The data science community needs to be included as it will help translate the findings gained from analysing data into actionable insights for governments, policy-making, and decision-making, Kallot reiterated. Greater digital collaboration between the public and private sectors and the data science communities is the only way to achieve useful and scalable AI.
Jansen highlighted that data for official purposes is the bread and butter of statistical offices, and that a lot of them stepped up during the pandemic and released data that could be considered experimental, but that proved useful for the general public. He provided the example of the statistical office of Colombia, which made a geospatial map showing COVID-19 cases and showcasing risks for elderly people and households with many inhabitants. The pandemic opened up possibilities of data for public purposes, he concluded.
People working with data need to think about where it came from, what underlies it, and what its potential biases are in order to communicate it responsibly, Schwabish concluded.
Khokhar concluded that the pandemic has brought to light the importance of data for policy-making. It is necessary for governments to elevate data and technology to essential public digital infrastructure, in order not to undermine a whole generation of policy-making which is going to be reliant on this sort of data.
Summary and conclusions
Breakout room: ‘Big data for policy-making’
Philip Thigo (Senior Adviser, Data, Innovation and Open Government, Office of the Deputy President of Kenya) summarised the discussion of use cases around big data for policy-making, how governments responded to COVID-19 using big data, and potential challenges around blind spots. Contact tracing apps were mentioned as use cases, as well as the experiences of national statistical offices in using technology applications around human mobility or layering telco data and observation data. Here, challenges pertain to automation, as countries that did not have pre-existing relationships with mobile operators are still trying to get those applications out of the gate. Participants concluded that resilience needs to be built so that those collaborations can simply be activated during a crisis. A lot of countries do not have negotiating power over how technologies are adopted and used because they need to use these technologies during the pandemic. Many of these countries potentially step outside existing data protection or data privacy regulations, and they are not protected against private companies exploiting the data for purposes other than those intended. One of the most important questions is how to ensure that policymakers have the skills and knowledge for policy-making in the 21st century. These issues are supranational, and a global or regional framework that puts countries on a level playing field is necessary. Collaboration within the public sector among agencies that use data in their work and have data collaboratives is also needed.
Breakout room: ‘Big data visualisation’
Jonathan Schwabish (Senior Fellow, Urban Institute) reported on the discussion about the core principles of data visualisation, starting with the intended audience and the aim: informing, facilitating decision-making, or advocating. It was noted that the title, labels, and annotations within a visualisation can help the audience read it and can deliver the content. This requires both practice and trial and error, and while this practice does not necessarily require in-depth training, the fact that so many analysts do not apply it suggests that training is useful. To build teams that can create good content, especially with limited resources, organisations should give existing staff the space, time, and resources to develop and improve their skills in creating visualisation projects. Tasking teams with frequent data visualisation projects can help organisations better understand how these teams work together and help them evolve towards a successful approach.
Breakout room: ‘Skills for big data’
Dominik Rozkrut (President, Statistics Poland; Chair, UN Global Working Group Task Team on Training, Competencies and Capacity Development) said that participants identified skills related to building big data platforms as the biggest challenge. They also noted that the UN World Data Forum should further stress the importance of international partnerships and co-operation that can be improved through the forum. A proposal from this group was to extend the Erasmus programme to incorporate statistical offices to facilitate faster learning. Data literacy can be developed by building data ecosystems that allow people to grow their skills, as well as by investing in these ecosystems and in understanding the different parts of data teams. National statistical offices should play an important role as they already form a part of the information infrastructure. Governments should make this infrastructure available to other players in the ecosystem in order to set up governance around the data ecosystem. The issue of trust and full transparency is important, as it explains the use of statistics: to gain insight into crucial socio-economic problems.