MozFest: fair rules for data commons

What could and should be the rules for sharing sensor data, medical data and personal data? AMdEX’s questions at MozFest 2022 sparked lively discussions. Answers were scarcer than questions, but answers are not the point of this festival, which exists for and because of the open internet movement.

The online AMdEX session focused on three use cases, each addressing different aspects of data sharing, data management and governance. All questions during the session stemmed from Elinor Ostrom’s great work on the commons. She devoted her research to the collective management of scarce resources; the session asked how to apply her insights to data so that we (rather than big tech) remain in control of our data.

“Data raises very interesting legal, social, technical and philosophical questions,” says Robert Goené, Lead at Waag’s Future Internet Lab. There are more questions than answers, making this “such an exciting field to work in”. It is not easy: different types of data require different approaches. There is no one data commons solution for everything. “Data commons are by definition pluralistic in nature and the community sets the rules.” Thomas van Binsbergen, researcher in software languages at AMdEX partner University of Amsterdam, adds: “Transparent rules that enable human intervention in the event of conflict.”

Case 1: Marineterrein

Hayo Schreijer, director of deXes and AMdEX partner, describes the Marineterrein in Amsterdam as an area with a range of sensors and cameras. These collect a multitude of data, with sensitivity levels ranging from public to personal. “These data are interesting for students, journalists, researchers and the municipality,” says Schreijer. “But sharing data requires a certain amount of trust. How do we know that an interested user is handling the data responsibly?”

The data from the Marineterrein provided an interesting use case for AMdEX, which was also discussed at a previous event. To ensure that the Marineterrein remains in control of its data, the AMdEX team has set up an extra layer in the sharing cycle between data owners and data users. All parties must agree to a set of rules before they can initiate a transaction. Only if users accept the conditions can they access the data. For example, in some cases, a user can ‘order’ aggregated data from a data owner using micropayments, that is, by paying a small amount. Additional rules are that the user is known and works on the Marineterrein.
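To make the idea of agreeing to rules before a transaction concrete, here is a minimal sketch in Python. It is not AMdEX’s implementation: the names `Request`, `Policy` and `unmet_conditions`, the fields and the price are all hypothetical, chosen only to mirror the conditions described above (accepted terms, a known user who works on the Marineterrein, and a micropayment).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Request:
    """A hypothetical access request from a data user."""
    user_id: Optional[str]   # None means the user is not known
    works_on_site: bool      # does the user work on the Marineterrein?
    accepted_terms: bool     # did the user agree to the owner's conditions?
    payment_cents: int       # micropayment offered for the data


@dataclass(frozen=True)
class Policy:
    """Hypothetical conditions set by the data owner."""
    price_cents: int         # assumed price for an aggregated dataset


def unmet_conditions(request: Request, policy: Policy) -> List[str]:
    """Return the conditions the request fails; an empty list means access is allowed."""
    unmet = []
    if not request.accepted_terms:
        unmet.append("terms not accepted")
    if request.user_id is None:
        unmet.append("user not known")
    if not request.works_on_site:
        unmet.append("user does not work on the Marineterrein")
    if request.payment_cents < policy.price_cents:
        unmet.append("micropayment too low")
    return unmet


policy = Policy(price_cents=50)
ok = Request(user_id="researcher-17", works_on_site=True,
             accepted_terms=True, payment_cents=50)
anonymous = Request(user_id=None, works_on_site=True,
                    accepted_terms=True, payment_cents=50)

print(unmet_conditions(ok, policy))         # no unmet conditions
print(unmet_conditions(anonymous, policy))  # lists the failed condition
```

Returning the list of failed conditions, rather than a bare yes or no, is a design choice in this sketch: it keeps the reason for a refusal visible to both parties.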

There are still many challenges for data sharing. The data is often not yet ready to be shared: its quality, metadata and structure all need further improvement. The different sensitivity levels of the data (high trust, low trust) can be addressed with different conditions, for which templates are available (e.g. via the Data Sharing Coalition). These terms and conditions must be made ‘readable’ and enforceable by computers.

Stefano, one of the attendees at the session, asks who owns the data. He states that the Marineterrein is the data user, not the owner. Could the General Data Protection Regulation (GDPR) solve that thorny problem? A short discussion follows, from which it becomes clear that the GDPR does not talk about ownership, but about data subjects and processing rights.

Participant Mike was curious about the language used to describe the data commons. “Data space, data union, data commons. Do we have a shared language?” No, we don’t have one yet. Schreijer: “Focus on the work for now. We will choose the language if it becomes legally necessary.”

Case 2: BioCommons

In 2003, the human genome was successfully mapped and DNA sequencing became a multi-million dollar business. Citizens enthusiastically submitted their DNA for family tree research and genetic scans. Quirine van Eeden, concept developer at Waag, argued in her presentation that valuable insights from such scans should not lie in the hands of a few. “This data should be seen as commons. It must be shared with universities, healthcare institutions and some commercial parties. We want to start thinking about how we can properly organize genetic data. For example, individuals could share only the relevant parts of their DNA sequence. DNA also contains information about relatives. How do you organize individual and collective informed consent? How can we estimate the risk of sharing this data? Should we outsource the organization of that permission to parties that can assess that risk?”

BioCommons, a project by Waag, shows the diversity of questions in data commons and the limits of the GDPR. Session participant Thomas: “The challenge is how you communicate the way in which the rules are established and made transparent. What are the rules for the rules?” AMdEX researcher Van Binsbergen answers that the underlying system must be adaptable, because decisions in this new domain must be continuously reviewed. Discussion ensues about who should set the rules for BioCommons: collectives? Individuals? Individuals with or without their family members? And how can the transparency needed for consent be guaranteed? The topic clearly appealed to the imagination and is the subject of a separate session.

Case 3: Social housing and energy bills

Speaker Tom Griffioen is a co-founder of Clappform. This data analysis platform works with local and regional governments and with social housing corporations. The company was asked to predict which of the 20,000 households in a given region would have the highest energy bills, so that those households could get a rent reduction. The data restrictions in this case were strict: data on energy consumption was not available at the house level, nor was there data on the current financial situation of the tenants. The challenge was to share sensitive data in a fair and reliable way. To this end, Clappform is investigating the use of synthetic data. There was some confusion among the audience as to why the question had to be answered this way in the first place. Tenants know how high their energy bill is and can request a rent reduction themselves. Some suggested asking the owners of the data (the tenants) for help, as any rent reduction would benefit them.

Text: Karina Meerman