Horizontal analysis of societal externalities of big data

The BYTE project presents, in D4.1, a horizontal analysis of the case studies it has undertaken on positive and negative externalities in the use of big data. The practices involving big data show a wide variety in characteristics and maturity. Technical challenges often turn out to be the translation of societal externalities. Four main categories of societal externalities were reviewed: economic externalities, social and ethical externalities, legal externalities and political externalities.

We have observed positive economic externalities (and social ones when the activity concerns social aims) in terms of innovation and improvements in efficiency. This also leads to changes in business models and the appearance of new business models, which includes the ‘creative destruction’ of old models and can lead to dominance of, and dependence on, a few technological players. Further, despite these positive economic impacts, public funding proves to be important in kick-starting a data economy.

The risk of negative impacts on important social values could also be observed. In most case studies, (potential) negative effects on privacy were reported, while several case studies mentioned risks to equality and new risks of discriminatory practices. Trust was often also an issue, where the risk of manipulation and exploitation leads to distrust and withdrawal. This points to the need to develop practices, including but not limited to legal frameworks, which can assure a proper balance and thereby establish trust. In this respect, both data protection and intellectual property rights are important legal frameworks, but they often act as a barrier to big data. In general, both frameworks were considered outdated and too restrictive for big data. Political externalities mostly concerned political economics. Public sector or non-profit organisations fear rent-seeking behaviour or capture by the private sector. Further, the fear of losing control to actors abroad, and in particular US-based actors, was widely present and sometimes translates into protectionist attitudes and requirements to store data within national territories.

The overall picture shows benefits but also the potential to negatively affect other important social or ethical values. Importantly, big data is not just a technical issue but has an impact on organisational borders and the ‘business ecology’ in general. This leads to uncertainty and conflict in a range of areas, translating into distrust and reluctance among all sorts of actors and into conflicts at the political and legal levels. Organisational borders need to be redefined or redrawn, while social norms and legal frameworks also need to be clarified again, based on a proper balancing of all interests.

Case study reports on positive and negative externalities realised!

Deliverable D3.2, Case study reports on positive and negative externalities, presents the case study reports on positive and negative externalities in the use of big data that we have undertaken in the BYTE project. The case studies correspond to the domains of crisis informatics, culture, energy, environment, healthcare, maritime transportation and smart cities. Following a formally designed methodology, we have gathered evidence from the case studies by means of semi-structured interviews, disciplinary focus groups and literature review. Overall, we have conducted 49 interviews with data experts from each case study discipline, and each focus group included 6-12 external domain experts.

The crisis informatics case study is focused on the use of social media – especially Twitter data – to support humanitarian relief efforts during crisis situations.

The culture case study examines a pan-European public initiative that provides open access to digitised copies of cultural heritage works.

The energy case study analyses the impact of big data on the exploration and production of oil and gas on the Norwegian Continental Shelf.

The environment case study is probably the most mature in terms of big data. Stakeholders take for granted the availability of data, especially from authoritative sources such as prominent earth and space observation portals, and there is a growing interest in crowd-sourced data.

The healthcare case study is conducted within a health institute at a medical university in the UK. This institute facilitates the discovery of new genes, the identification of disease and innovation in health care utilising genetic data.

The maritime transportation case study analyses the use of big data in the shipping industry, which accounts for more than 90% of global trade.

Finally, the smart cities case study examines the creation of value from the potentially massive amounts of urban data that emerge through the digitalised interaction of a city’s users with the urban infrastructure of resources.


What do members of the public think about big data information practices?

We provide a comprehensive report on public sentiment towards big data as part of work package 2 of the Big data roadmap and cross-disciplinarY community for addressing socieTal Externalities (BYTE) project. The BYTE project aims to assist European science and industry in capturing the positive externalities and diminishing the negative externalities associated with big data in order to gain a greater share of the big data market by 2020. Public sentiments play a role in science and industry gaining a greater share in the big data market.

Public sentiments include perceptions of, and aspirations for, information practices relating to big data. The public sentiments towards big data practices are relevant to big data actors operating in the case study areas examined by the EU-FP7 funded BYTE project, including:

  • Environmental data
  • Crisis informatics
  • Transport data
  • Utilities/ Smart cities data
  • Cultural data
  • Energy data
  • Health data

Big data practices supporting the collection, storage and use of personal data have become a part of everyday life at all levels of society and users have raised concerns in relation to these processes. Particular concerns are the privacy and security of personal data, as well as a general distrust of those handling big data, particularly private sector companies. We find these sentiments to be particularly important because they can indicate the extent to which the public, as a major source of data, willingly contribute to the big data process currently, and how willing they may continue to be.

We examine practices such as data collection, data storage, data sharing (including selling) and data analysis. There tends to be more negative sentiment towards the information practices that the general public has a stronger connection with, such as the point of collection, as members of the public play an active role at this initial stage. There does not appear to be any overwhelmingly negative sentiment about specific practices that follow the collection of their data, aside from privacy-invasive practices. This may be because users tend to be less informed about, and more removed from, specific big data practices such as data analysis, data storage and the selling and sharing of data.

What these perceptions mean for companies and organisations is the need to foster a growing awareness to better inform their users with more transparent policies concerning the subsequent use of the data, as well as the benefits that can flow from information technology practices.

We also consider public aspirations towards big data by looking at what relevant information can tell us about how members of the public would like big data to operate in a manner that causes the fewest negative implications for them. Personal benefit is the strongest incentive for being in favour of the collection and use of personal data by government and companies. This is particularly true when a tangible public benefit is readily identified, such as when data use produces improvements in public security or where developments in health care treatment and diagnostics are achieved. Conversely, if the public see little benefit from sharing their data and have little confidence that they will see benefits in future, this may reduce the amount of data available to big data actors in the future, thereby threatening the longevity of the European big data industry.

The big data industry can benefit by fostering a more positive interaction between big data actors and users. We suggest that one of the ways in which this can be achieved is through the recognition of public aspirations, particularly the delivery of benefits and transparent practices, by incorporating them into big data practices and policies. This is also reflected in the suggestions we make in relation to a good practice framework that takes into account privacy and security concerns.

The big data industry must find ways to ensure that citizens, as a major data source, continue to comfortably and securely contribute to large data sets. Positive public sentiments towards big data are imperative to the continuation of data processing activities, and to the future of big data as a value-adding institution or process.

This research is funded by the European Union under grant number 619551.


Big data and open access

We examine open access policies and initiatives relating to big data in work package 2 of the Big data roadmap and cross-disciplinarY community for addressing socieTal Externalities (BYTE) project. The BYTE project aims to assist European science and industry in capturing the positive externalities and diminishing the negative externalities associated with big data in order to gain a greater share of the big data market by 2020.

Our comprehensive report provides a review of open access initiatives across a number of sectors. We examine what open access policies and initiatives mean in the public sector as compared with similar policies and initiatives driven by the private sector. We also explore how effective these policies and initiatives have been, identify barriers associated with open access to big data and, in particular, explore good practices.

We evaluate open access initiatives from the public and private sectors in relation to a number of case study areas including:

  • Health data – GOSgene health initiative, GenBank and Protein Data Banks (in Europe and internationally), Teralab initiative and the Yale University Open Data Access initiative;
  • Crisis data – Ushahidi crisis mapping platform and social media models including Twitter;
  • Energy data – the Norwegian Petroleum Directorate initiative called FactPages;
  • Environmental data – the GEO and the GEOSS Data Core, as well as the Copernicus Programme;
  • Transport data – the UK National Public Transport Data Repository;
  • Cultural data – Europeana; and
  • Smart cities/ utilities – Jakarta and Florence.

We find that, whilst the public sector leads the charge with open access policies relating to big data, the accessibility of big data sets is no longer synonymous only with government-held or other publicly funded research data, as powerful private sector profiling and data mining technologies are increasingly supporting open access initiatives for commercial purposes. Despite this, open access to big data policies are not yet implemented in the private sector as widely as they have been in the public sector, even though there is great potential for industry when business models include open access elements, as well as increasing opportunities for cross-sector collaboration. In fact, open access policies can be developed in relation to any type of large data set to produce a multitude of benefits, as seen in the case studies examined in this section.

Relevant to both private and public sector big data actors, our report culminates in recommendations on good practice lessons for open access to big data that can be translated across sectors, and promotes collaboration between the private and public sectors in the development of ‘open’ access projects.

Our report reflects the burgeoning relationship between big data and open access policies. It also recognises open access initiatives as a great benefit to society, providing an abundance of opportunities for Europe. Encouraging asset holders to provide free and open access to their assets requires both voluntary and prescribed policies and initiatives so that the socio-economic benefits of big data can be fully realised.



Big data implications for society

We report on a number of the legal, economic, social and ethical, and political issues that are implicated by big data as part of our work in Work Package 2 of the EU FP7-funded BYTE project. The BYTE project will develop a Big data roadmap and cross-disciplinarY community for addressing socieTal Externalities. The BYTE project will assist European science and industry in capturing the positive externalities and diminishing the negative externalities associated with big data in order to gain a greater share of the big data market by 2020.

This report provides a comprehensive outline of the potential legal, economic, social and ethical, and political impacts that could be raised by big data. By identifying these issues and understanding the positive and negative externalities they raise, we aim to stimulate vital discussion of these issues to assist the European big data industry moving forward.

We also examine these issues in relation to a number of big data case studies in actual big data practices across the following disciplinary and industrial sectors to gain an understanding of the economic, legal, social, ethical and political externalities that are in evidence:

  • Environment – Earth Observation Programme, Copernicus and the European Space Agency
  • Crisis informatics – Qatar Computing Research Institute
  • Transport – DNV GL
  • Utilities/ Smart cities – Siemens
  • Cultural – The European Library and Europeana
  • Energy – Statoil
  • Health – Institute of Child Health, University College London

For example, legal issues that arise in relation to big data include intellectual property rights, licensing and contract issues, as well as data protection and privacy risks, jurisdictional issues and implications for due process. An examination of these legal issues highlights the “gap” between technological capability and the legal framework, which means uncertain outcomes for economic competitiveness.

Big data also raises economic issues that have positive and negative social consequences. We find that big data relates to the economy because it can be a catalyst for innovation, in particular when new business models need to be developed to incorporate strategies for deriving added value from big data and to capture the efficiencies of big data across a number of sectors. However, concerns for privacy are raised alongside these positive effects.

Big data practices such as transparency, profiling and tracking, re-use and unintended secondary use of data, open access, and levels of data access raise social and ethical issues such as trust, discrimination, inequality of access, privacy, exploitation and manipulation. We find that these issues are raised by big data because big data practices deal with data from people, and this human element reflects individual social and moral codes. These issues require recognition so that big data companies and organisations can incorporate fundamental social and ethical values into big data practices and policies.

Political issues emerge as the big data economy develops. Political issues in big data revolve around changes in the balance of relationships between states, corporations and citizens, because big data seriously affects the balance of power between politicians and citizens and between states and corporations. We reveal that intermediation platforms are becoming central to economies and to building social relations and that, as most of them are American in origin, they challenge the geopolitical balance.

The expertise generated by this report will enable the project to identify these potential impacts through the case study research, and suggests potential positive and negative issues to look out for when undertaking a big data project, across all sectors and industries increasingly engaging in big data.


10 Big Data Initiatives: An insight into the big data landscape

We contribute to work package 1 of the BYTE project by undertaking an expert examination of ten big data initiatives that assists stakeholders in gaining an insight into the current and evolving big data landscape. The acronym of the EU-FP7 funded BYTE project stands for Big data roadmap and cross-disciplinarY community for addressing socieTal Externalities. The aim of this project is to assist European science and industry in capturing the positive externalities and diminishing the negative externalities associated with big data in order to gain a greater share of the big data market by 2020.

These ten big data initiatives assist in meeting the objectives of the BYTE project by providing meaningful examples of projects that involve big data and by highlighting where they contribute to societal, economic and technological developments, especially as a result of collaboration between a number of different stakeholders. We examined initiatives undertaken by governments, commercial organisations and other organisations in a number of different sectors in order to identify what sorts of big data practices are currently in evidence, and what these different stakeholders seek to gain from these projects.

The following big data initiatives reflect current programmes seeking to implement courses of action:

  • European Organization for Nuclear Research (CERN) – Worldwide LHC Computing Grid;
  • US Big Data Research and Development Initiative;
  • Australian Government Public Service Big Data Strategy;
  • UK Data Service;
  • eBay Inc. Big Data Analytics Programme;
  • UN Global Pulse Initiative;
  • European Space Agency Big Data Initiative;
  • European Bioinformatics Institute;
  • Teradata; and
  • deCODE Genetics, Iceland.

We describe each of these initiatives and consider how their conceptualisation of big data, their policies and practices and their strategies for managing big data can inform the activities of BYTE, and indeed the wider big data economy. We examine what big data means in each of these initiatives and what stakeholders are involved. Finally, we look for technological, legal and ethical issues as well as economic and social developments and externalities and consider how these might provide preliminary data for BYTE.

Importantly, our expert analysis results in three main conclusions for the BYTE project. First, the analysis challenges the traditional and long-accepted basic definition based on the 3Vs originally proposed by Gartner. Second, all the aforementioned big data policy initiatives involve cross-sector and cross-agency collaborations, business partnerships and similar emerging relationships. Finally, and most importantly, the aforementioned big data stories detail initiatives that have been implemented to utilise big data to produce positive results, impacts and externalities and provide initial information about how some big data initiatives are addressing potential negative externalities.


ICO Consultation: Big data and data protection

Trilateral has submitted written evidence to the UK Information Commissioner’s Office (ICO) in response to a consultation that it has opened on big data and data protection. The ICO document outlines the potential data protection issues raised by big data and how these might be addressed by big data practitioners. The consultation document can be found using this link.

Trilateral’s response was drafted using expertise gained in the BYTE project on the potential externalities associated with big data.

Trilateral’s response to the consultation can be found here.