Thought leadership from our experts

The latest challenge: free flow of data and protection of non-personal data

Thomas Heymann, Covington & Burling, Germany

Building a European data economy

A recent study identified cross-border information flows as the "fastest growing component of US as well as EU trade". However, providers in the US still take the lion's share of this business. One of the key reasons is the fragmentation of the legal rules that apply to data storage and data flows in the European market. It is no wonder that the European Commission has made "Building a European Data Economy", which essentially means the free flow of data within Europe, one of its key goals (cf. COM(2017) 9 final). It is apparent that in certain contexts both information and data have become valuable commodities. This raises the question of whether there should be exclusive rights in such "data" (other than personal data) or even in information.

Data and information

Data and information have long been identified as the "basic commodity" of the "information age". Before discussing their protection, we need to understand what we mean by these concepts, in particular by "information".

Professor Herbert Zech, a German legal scholar, has distinguished three forms of information:

  • "Semantic Information" as the meaning of information (e.g. of a text), corresponding to the "knowledge" of its user,
  • "Syntactic Information" as a number of symbols and their relationships to each other,
  • "Structural Information" as the physical incorporation of Semantic Information (as in the case of a printed poem) or of Syntactic Information (as in the case of a file on a hard disk).

Ownership of data and information?

It is tempting to look for one basic principle determining the "ownership" of data. In the current debate this is the position of advocates of a "right in rem", which would be modeled upon the traditional concept of property protection (i.e. with a presumption that the rightholder can exclude any third party from unauthorized use).

However, such an all-encompassing principle is hard to conceive. First, it is often very difficult to attribute information to a specific party. Who should "own" the data generated by a GPS? The direct generator (i.e. the driver), the indirect generator (the owner of the car, e.g. a leasing company), the manufacturer of the GPS device used in generating the data, the manufacturer of the car, who has "co-sold" the GPS device, the operator of the GPS database, or the poor driver (who is technically the data subject in the sense of the privacy laws)? Each of these answers has been proposed in legal literature. Note that most of these players are in fierce competition for the data and have no real incentive to agree on the applicable principles. Similar considerations apply in the case of smart energy grids.

Another challenge is that the use and sharing of information "cuts" through many different national legal concepts.

Specific information can of course be the subject of legal protection, as in the case of patents or trade secrets. But a generic ownership right applying to any set of "Semantic Information" would (i) destroy language and communication (imagine you could prohibit the use of the sentence "what a lovely summer day"), (ii) privilege the interests of certain private parties over those of the public, and (iii) make it impossible to balance the interests of the involved private parties with each other, interests which will differ depending on the nature of the information. The same applies to the protection of "Syntactic Information", insofar as this is a simple expression of "Semantic Information". All this would evidently also lead to monopolies of "knowledge", which are clearly detrimental to technical progress and to economic and social welfare.

Finally, ownership only makes sense if it grants exclusive rights, including the right to exclude third parties. However, as information is almost always developed on the back of previously existing information, the involved interests need to be balanced very carefully.

Sharing of data

Against this background, it makes sense that the European Commission is focusing less on the "fundamental" "right in rem" per se than on the development of a legal framework which encourages better generation, collection and sharing of data. This is best achieved by a set of specific defensive rights. We are currently witnessing dramatic growth in business models that rely on such "sharing" of data. Examples (outside the open source economy) include the granting of access to data via APIs in the banking, mobility or energy sectors. In addition, data gathering and analytics are increasingly outsourced to specialized subcontractors. In this economy value is usually created not by excluding competitors but by cooperating with as many parties as possible.

It may be useful to distinguish between the different layers at which access to and protection of information can and should be regulated:

Rules protecting the commercial investment: An example is the Database Directive (Directive 96/9/EC), which aims at protecting the "investment" against "extraction" and "reutilization" of "substantial parts". Note that this has nothing to do with traditional copyright protection (as Art. 7 (4) of the Database Directive makes crystal clear). Comparable protection for confidential information is the subject matter of the Trade Secrets Directive (Directive (EU) 2016/943). However, in some cases the rightholder will have to share data (in particular in collaborative business models). That means that the definition of "trade secret" in Art. 2 (1) of the Trade Secrets Directive may have to be reconsidered, as Art. 4 (3) of the Trade Secrets Directive requires the breach of a specific confidentiality undertaking, which may not always be a practical requirement.

Rules preventing the abuse of information and data: Examples would be defamation rules. While this is evidently a rather traditional field of law, new challenges arise as a consequence of (i) the international accessibility of such information on the internet, (ii) the possibility of disseminating information anonymously and (iii) the emergence of social networks as intermediary platforms which allow for a dramatically enhanced level of distribution of such information. In other words: today almost everyone has the technical means to "publish" such information.

Enforcement of contractual restrictions: One key development for the future will be the "watermarking" of information which the original creator wants to keep protected. This will be a precondition for (i) enforcing limitations on secondary use (for evidentiary purposes) and (ii) introducing meaningful contractual limitations with the "recipient" of the information, and it will also be (iii) a key defense against liability claims alleging damages caused by incorrect or misleading information.

Limitations regarding the use of information

Just as important are rights for the data subject regarding the commercial exploitation of sensitive information. This is traditionally a domain for data privacy rules. The key justification for the collection (and secondary use) of personal data – i.e. the commercial exploitation – is the consent of the data subject.

This traditional model is also increasingly under pressure in light of the many business models based on the principle "services against data". Art. 7 (4) of the General Data Protection Regulation (Regulation (EU) 2016/679, applicable as of 25 May 2018; "GDPR") addresses this by introducing a "coupling prohibition" under which consent is valid only if it is freely given. Whether this is the case depends inter alia on whether the consent is necessary for the performance of the relevant contract. Note, however, that this still presupposes that the data subject freely "negotiates" with the counterparty on what information and data it reveals and which secondary use it authorizes. Increasingly it is argued that (i) such secondary use can be subject to antitrust rules, (ii) the consent is never really given "freely" and we therefore need quasi-public law rules (or model clauses) determining when such consent is acceptable, and (iii) the exchange "data vs. services" constitutes a contract or a "quasi contract". In the latter case the consent is a kind of "currency" which generates actionable contract rights for the data subject, e.g. for proper performance of the "free" services.

A second area where the traditional concept is coming under pressure is the determination of what data is personal and what data is anonymous. This is important because the data privacy rules traditionally apply only to non-anonymous data, and it is often argued that business models based on "big data" and "data mining" no longer require knowledge of a specific data subject. This assumption seems to underlie the GDPR, which in its recital 28 and in different contexts (e.g. Art. 6 (4) lit. e, Art. 25 (1), Art. 32 (1) lit. a GDPR) encourages "pseudonymisation". However, in practice "pseudonymous" does not mean "anonymous", and data which are truly pseudonymous (let alone anonymous) are becoming the exception. The reason is that in the age of "big data" seemingly pseudonymous data in many cases still allow the identification of the relevant data subjects. Where a pseudonym is used, it is often possible to identify the data subject by analyzing the underlying or related data.
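The point that pseudonymous data can often be traced back to a person may be easier to see in a small worked example. The following is only a minimal sketch under stated assumptions: all records, names and field names are invented for illustration, and the matching rule (an exact match on postcode and year of birth) is a deliberately simple stand-in for the far richer analysis possible with real "big data". It shows how a record that carries only a pseudonym can be linked to a named individual as soon as a second data set shares a few attributes with it.

```python
# Hypothetical "pseudonymised" records: the name has been replaced by a token,
# but descriptive attributes (postcode, year of birth) are still present.
pseudonymised = [
    {"pseudonym": "u-7f3a", "postcode": "60311", "birth_year": 1975, "trips": 212},
    {"pseudonym": "u-19c4", "postcode": "60311", "birth_year": 1988, "trips": 35},
    {"pseudonym": "u-e02b", "postcode": "10115", "birth_year": 1988, "trips": 97},
]

# Hypothetical auxiliary data set (e.g. a public register) that contains names.
auxiliary = [
    {"name": "A. Example", "postcode": "60311", "birth_year": 1975},
    {"name": "B. Sample", "postcode": "10115", "birth_year": 1988},
    {"name": "C. Placeholder", "postcode": "10115", "birth_year": 1988},
]

# Attributes assumed to be shared by both data sets (an assumption of this sketch).
QUASI_IDENTIFIERS = ("postcode", "birth_year")


def key(record):
    """Return the combination of shared attributes used for matching."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link(pseudonymised_records, auxiliary_records):
    """Map each pseudonym to a name when exactly one person matches."""
    candidates = {}
    for person in auxiliary_records:
        candidates.setdefault(key(person), []).append(person["name"])

    result = {}
    for record in pseudonymised_records:
        names = candidates.get(key(record), [])
        if len(names) == 1:  # a unique match means the pseudonym is broken
            result[record["pseudonym"]] = names[0]
    return result


reidentified = link(pseudonymised, auxiliary)
print(reidentified)
```

In this invented example one of the three pseudonyms is resolved to a name; the second has no counterpart in the auxiliary data, and the third remains ambiguous because two persons share the same attributes. The more attributes a data set contains, the rarer such ambiguity becomes.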


General rules regarding the rights in data (with non-personal and personal data as subsets) are emerging as a major area of legal interest. The European Commission is devoting increasing attention to these questions, and so should lawyers.