Antitrust risks and Big Data


The explosive growth in companies’ exploitation of big data is drawing intense scrutiny from European antitrust authorities. EU Competition Commissioner Margrethe Vestager has promised to “keep a close eye on how companies use data” and a number of European antitrust authorities have conducted full-blown studies on big data issues, including an especially comprehensive May 2016 Franco/German study on “Competition Law and data” (the Franco/German Study).

Antitrust authorities fear that big data can create barriers to entry and market power, especially where companies hold unique datasets that cannot be replicated by competitors. However, a closer look at big data-related theories of harm suggests that the focus on uniqueness may be misplaced. Many companies collect voluminous datasets that are unique and non-replicable, as well as valuable and competitively significant, without raising exclusionary concerns of the types highlighted by antitrust authorities.

The perception that limited access to big data may create barriers to entry and stifle the growth of the digital economy risks provoking an overbroad legislative reaction. In January 2017, the EU Commission consulted on a wide range of data-related issues, proposing among other things that companies be required to share non-personal, machine-generated data with third parties, including competitors, whether or not the data-holder holds a dominant position or engages in abusive conduct. Such a requirement would go far beyond any access remedy recognized in the antitrust context.

This article discusses European antitrust authorities’ concerns about the foreclosure risks of big data from the perspective of the different types and uses of big data. This approach reveals that exclusionary concerns arise in a relatively small segment of big data uses, and those situations can be assessed using traditional antitrust tools. Some important big data issues not addressed in this article include the mirror image of these exclusionary concerns – i.e., that ubiquitous big data can make markets so transparent that competition may be impaired by competitors using pricing algorithms – the role of big data in merger review, and the relationship between antitrust enforcement and data protection. Big data are a big topic, and a complex one.

What is “Big Data” and which characteristics are antitrust-relevant?

Big data are commonly defined by reference to the “three Vs”: “volume,” “velocity” and “variety.” A fourth “V” – veracity – is sometimes added. But potentially more important characteristics for antitrust purposes include who is collecting the data and on what subjects; whether comparable data are available from multiple sources; the marginal value of additional data; and the reduction in data’s value over time.

Data are often collected directly on companies’ own assets, products and services, or on their customers, and used by those companies for a variety of business purposes. But data can also be purchased as a product, in which case they can be described as “third-party data,” in contrast to the “first-party data” collected or inferred directly. The novel issues raised by big data relate mainly to first-party data, while third-party data issues can be evaluated under traditional antitrust principles.

Data are often said by proponents to be “non-rivalrous,” in that multiple entities can collect and use the same datum, and “ubiquitous.” Data are indeed non-rivalrous from a property law perspective, and data are certainly ubiquitous in a general sense. For antitrust purposes, however, such observations may be overly broad. For example, companies developing and testing products ranging from jet engines to automobiles to computers collect huge volumes of data on their products, and those data are not generally accessible to competitors. The unique and non-replicable nature of such data does not raise antitrust concerns, however, and antitrust authorities would not normally view sharing of such data among competitors as pro-competitive. 

Proponents also note that decreasing marginal returns to scale limit the competitive advantages from large amounts of data, and data’s value may decrease quite quickly over time. Again, these observations may be simplistic, since the incremental value and useful life of data depend on the type of data and the uses to which they are put. In more time-sensitive applications, such as targeting promotions to consumers passing near a coffee shop, incremental data about potential customers’ interests and activities may not exhibit diminishing marginal value, but that value may be short-lived. Conversely, where data’s value is less time-sensitive, for instance data generated by testing and developing products over months or years, beyond a certain volume incremental data seem more likely to yield diminishing marginal returns, suggesting that a big company cannot necessarily foreclose competition from smaller companies simply because it has more data.

Big Data, barriers to entry and market power

Antitrust authorities fear that the need for a large volume or variety of data may result in entry barriers when new entrants or smaller companies are unable to collect or buy access to the same kind of data as established companies. They fear that such characteristics could be self-reinforcing, converging towards a monopolization of data-related markets. Authorities acknowledge that big data can also reduce entry barriers, however, for instance if new entrants can use data to identify consumer needs more efficiently than with traditional methods. 

The Franco/German Study indicates that where “access to a large volume or variety of data is important in ensuring competitiveness on the market (which is a market-specific question), the collection of data may result in entry barriers when new entrants are unable either to collect the data or to buy access to the same kind of data, in terms of volume and/or variety, as established companies.” The Study seems to assume that where big data practices give rise to entry barriers, they are a source of market power.

But general statements about big data’s propensity to raise barriers to entry can be misleading. Although the Franco/German Study acknowledges that the importance of data varies from market to market, whether big data raise or lower barriers to entry also depends on the nature and use of the data and the availability of alternative data sources. In the case of third-party data available as a product, for example, the same data normally would be available to anyone willing to pay for them, absent exclusive contracts (as discussed below). In that case, the extent of any barrier will depend on the data’s cost, as with any input. The extent to which such costs represent a barrier to entry will vary depending on the business in question and companies’ alternatives, but these issues are not unique to big data.

In the case of first-party data collected directly, no two companies will have access to exactly the same data, so each company’s dataset is unique. The question is whether the uniqueness of a particular company’s data creates a barrier to entry for others, because they need data to compete and lack alternative sources of data that are substitutable for their purposes. This issue would not arise in relation to first-party data companies collect on their own assets, products and services, since the fact that one company has collected such data would not preclude competitors from doing the same.

The situation is more difficult with regard to first-party data companies collect on other legal and natural persons, because there are many different sources and different data are needed for different purposes. Evaluating the substitutability of different datasets can be difficult even for companies in relatively similar businesses. For example, a specialized online retailer seeking data to advertise a particular product line would have more limited data needs than a generalist online retailer. The characteristics of the required data may also vary significantly; for instance, the value of information on consumer interests in large or unusual purchases such as automobiles or leisure travel may be much more time sensitive than information on repeat purchases such as books, music or food. The cost associated with collecting these data will vary significantly, as will the alternative sources available, which could include a combination of first-party and third-party data. The volume and variety of data a company needs for a particular purpose may also vary depending on the processing tools available to it, such as internally developed algorithms or third-party software.

Where big data are collected and used for product development and testing, on the other hand, each company needs data on its own products, so a dataset developed by one company does not create a barrier to entry for another, no matter how unique and non-replicable it is. The cost of data collection and processing tools may constitute a barrier to entry, but the issue again is not unique to big data. 

These examples illustrate the risks of generalizing about the extent to which big data practices raise or lower barriers to entry. It seems clear, in any case, that the mere fact that a particular dataset – even a highly valuable one – is unique and non-replicable does not imply that competitors necessarily need access to the same or even similar data to compete.

The inference that data-related barriers to entry translate into market power is similarly open to question. Except where the data themselves are offered as a product, the nature of a dataholder’s market power depends on the competitive structure of the markets in which it is active.  Even if the need for big data does create a barrier to entry, it does not necessarily follow that a company holding such data has market power, or a dominant position, in a related market.

Big Data and exclusionary practices

The Franco/German Study identifies five types of conduct in relation to big data that it fears could be exclusionary: refusal to provide access to data; discriminatory access; exclusive contracts; tied sales and cross-usage; and discriminatory pricing. Each of these exclusionary theories of harm is discussed briefly below.

Refusal to provide access to data

The first exclusionary conduct discussed by the Franco/German Study is a refusal to provide access to data, which can be anticompetitive “if the data are an ‘essential facility’ to the activity of the undertaking asking for access.” Since third-party data are offered to third parties for consideration, this theory of harm could apply where a dominant provider in a particular data market refuses to supply such data to competitors. This is a traditional antitrust scenario, however; the more novel data access issues concern first-party data collected by a company and not otherwise made accessible to competitors or other third parties.

Under the European Court of Justice’s judgments in Microsoft, IMS and Bronner, an authority can order a company to give a competitor access to an “essential facility” if the incumbent’s refusal to grant access (i) concerns a product which is indispensable for carrying on the business of the company seeking access, (ii) prevents the emergence of a new product for which there is a potential consumer demand (this condition being applicable when the exercise of an intellectual property right is at stake), (iii) is not justified by objective considerations and (iv) is likely to exclude all competition in the secondary market.  A product or service is “indispensable” for these purposes only if there are no alternative products or services and there are technical, legal or economic obstacles that make it impossible or unreasonably difficult for any undertaking seeking to operate on the downstream market to develop products or services, even in cooperation with other companies. 
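The cumulative structure of this test can be made concrete with a short sketch. The following Python fragment is purely illustrative and rests on several assumptions: the class and field names are hypothetical, each condition is reduced to a yes/no finding (in practice each is a contested, fact-intensive assessment), and condition (ii) is treated as required only where an intellectual property right is at stake, following the parenthetical above.

```python
from dataclasses import dataclass


@dataclass
class AccessRequest:
    """Hypothetical yes/no findings for each limb of the test described above."""
    indispensable: bool             # (i) no alternatives; replication impossible or unreasonably difficult
    prevents_new_product: bool      # (ii) refusal blocks a new product with potential consumer demand
    ip_right_at_stake: bool         # assumption: limb (ii) is only required in this case
    objectively_justified: bool     # (iii) the refusal has an objective justification
    excludes_all_competition: bool  # (iv) refusal likely to exclude all competition downstream


def refusal_may_be_abusive(req: AccessRequest) -> bool:
    """Return True only if every applicable limb is satisfied (the limbs are cumulative)."""
    # Limb (ii) is treated as satisfied by default when no IP right is involved.
    new_product_limb = req.prevents_new_product if req.ip_right_at_stake else True
    return (
        req.indispensable
        and new_product_limb
        and not req.objectively_justified
        and req.excludes_all_competition
    )


# A unique dataset for which substitutable data exist fails at the first limb.
unique_but_substitutable = AccessRequest(
    indispensable=False,
    prevents_new_product=True,
    ip_right_at_stake=True,
    objectively_justified=False,
    excludes_all_competition=True,
)
print(refusal_may_be_abusive(unique_but_substitutable))
```

The sketch illustrates only that failing any single applicable limb is enough to defeat a claim for access, which is why uniqueness of a dataset, taken alone, does not settle the question.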

These requirements would be met, according to the Franco/German Study, “if it is demonstrated that the data owned by the incumbent is truly unique and that there is no possibility for the competitor to obtain the data that it needs to perform its services.” This statement seems overly broad as applied to first-party big data, for three reasons.

First, for a duty to provide access to arise under EU law, the company holding the data would need to hold a dominant position in a market for a product or service other than the data themselves. As discussed above, first-party data collected by a dominant company may or may not constitute a barrier to entry to competitors or contribute to a dominant position on the part of the dataholder. The existence of the dominant position, and the relation between the dominant company’s data and its dominant position, do not follow from the unique and non-replicable nature of the data.

Second, the potential competitor seeking access to a dominant company’s first-party data would have to require access to the data to develop a new product or service in another market for which there is potential consumer demand. In other words, there is no EU law basis to suggest that a company could be required to share its data to allow a competitor better to compete with it in the same market, even if the data in question are unique and not otherwise available to competitors. 

Third, even if the company seeking access to data wanted to use that data to offer a new product or service, it would have to show that developing the new product or service without access to the dominant company’s data would be impossible or unreasonably difficult, and that refusal to provide access would exclude all competition for the new product or service. Whether the need for data meets the required standard would depend on the data and the purpose for which the non-dominant company needs it. If a company seeking access wants to use the data to develop a new type of individually targeted advertising service, for example, the short-lived value of the data in question may make it difficult to show the requisite level of need, while the availability of many different types of data may make it difficult to show that it would be impossible or unreasonably difficult for the company seeking access to use alternative data or that the lack of access would exclude all competition for the new product or service. 

Discriminatory access

The Franco/German Study argues that “refusal to [provide] access [to] data could also be deemed anticompetitive if it is discriminatory.” According to the Study, discriminatory access to strategic information by vertically integrated companies can distort competition, for instance where marketplace operators also operating as online retailers may get access to information about their competitors selling on that marketplace and the behaviour of consumers. Thanks to such information, a vertically integrated platform might be able to adjust its product range and pricing more efficiently than a non-vertically integrated retailer. A vertically integrated platform could also restrict the information received by downstream competitors regarding transactions they are involved in. Such information transfers and limitations could make the integrated platform operator more competitive than its competitors operating on its marketplace.

The Franco/German Study cites no EU sources for the proposition that discrimination in providing access should be treated as a separate violation from a refusal to provide access, but it cites a French case in which Cegedim, which was dominant in the market for medical information databases in France, refused to sell its OneKey database to customers using the software of Euris, a competitor of Cegedim on the adjacent market for customer relationship management software in the health sector. The French Competition Authority concluded that Cegedim’s behaviour represented an abuse of its dominant position. Cegedim could be analysed as a traditional tying case, however, since Cegedim was dominant in a market for the provision of third-party data, and it tied access to its dominant database to use of its non-dominant customer relationship management software, foreclosing competition in that market.

The Franco/German Study also cites the Commission’s investigation of Google’s comparison shopping services, where the Commission claims that Google systematically favours its own comparison shopping service over competitors’ in its search result pages. In this case, however, the Commission does not allege that Google discriminates in the provision of access to data, but rather that Google’s favouring of its own comparison shopping service stifles innovation and leads to users not necessarily seeing the most relevant search results.

It is not clear from the Franco/German Study’s analysis or the examples it cites why discriminatory provision of access to big data should be viewed as a separate antitrust violation compared to a refusal to provide access, as discussed above. In particular, if a vertically integrated platform could use customer information to adjust its product range and pricing more efficiently than a non-vertically integrated retailer, that advantage, by itself, would not create an obligation for the vertically integrated platform to share its data with any non-vertically integrated retailer.

Exclusive contracts

The Franco/German Study argues that exclusive agreements or networks of agreements involving data access may infringe antitrust laws if they prevent rivals from accessing data or foreclose rivals’ opportunities to procure similar data by making it harder for consumers to adopt their technologies or platforms.

The Franco/German Study notes that the European Commission has alleged that Google’s practice of entering into exclusive contracts in the search advertising market might infringe Article 102 TFEU, because these agreements foreclose competitors from being able to challenge the company. From the Commission’s public statements in this case, however, it does not appear that the Commission alleges that Google used exclusive contracts to prevent rivals from accessing data or to foreclose rivals’ opportunities to procure similar data.

Exclusive contracts in relation to big data could, however, give rise to antitrust concerns in two situations. In relation to the supply of data, a dominant third-party data supplier could use exclusive contracts with its customers to foreclose competition from other third-party data providers. In this scenario, however, the fact that the product market involves big data would not seem to raise any novel issues. 

In relation to data collection, exclusive contracts or networks of contracts could potentially foreclose competition by third-party data providers or companies collecting first-party data on other natural or legal persons, such as platform users or consumers. The potential for individual exclusive contracts, or a network of exclusive contracts, to interfere with competitors’ access to data would need to be assessed in light of the substitutability of different types of data for the same purpose, the potential sources for each type of data and the purpose for which the data are required. 

As discussed above, the substitutability of datasets can be difficult to assess. For instance, foreclosing access to data from one population of users may not affect competition to develop products where big data are important but the identity of the data provider is not, such as developing search algorithms, spell-check programs or voice-recognition software. In applications where users’ identity is important, such as individually targeted advertising services, the competitive effect of exclusivity in agreements providing for the collection of data from third parties would need to be assessed in light of the population of potential users, the availability of substitutable data from other sources (for instance as a result of multi-homing), the duration of any contractual exclusivity, and how quickly the value of the data in question diminishes over time.

Tied sales and cross-usage

Several authorities warn that tying sales or “cross usage,” i.e., the use of data collected on a given market in another market, can have foreclosing effects. The Commission will normally take action in such cases where an undertaking is dominant in the tying market, the tying and tied products are distinct products, and the tying practice is likely to lead to anti-competitive foreclosure.

In the big data context, a tying issue would only arise where the data in question are offered as a product for consideration, i.e., in relation to third-party data. Indeed, the Cegedim case cited by the Franco/German Study as an example of discriminatory access could be viewed as such a tying case involving third-party data, as discussed above.  In the case of “cross usage,” it is not clear why any foreclosure or exclusionary issue would arise unless accompanied by a refusal by a dataholder to give access to data that can be used in more than one market, as discussed above.

Discriminatory pricing

Data are also said to facilitate price discrimination, since companies with market power who collect data about their clients’ purchasing habits may be better able to assess their willingness to pay for a given good or service and to use that information to set different prices for different customers. The Franco/German Study notes that price discrimination can be defended on economic efficiency grounds, queries whether “price discrimination in itself is within the scope of European competition law” and suggests that national competition law may be more likely to apply.

Price discrimination is in fact within the scope of EU competition law, though it is not a priority enforcement area addressed in the Commission’s Exclusionary Practices Priorities. In Clearstream, the General Court upheld a Commission decision finding that Clearstream’s charging of a higher price to Euroclear Bank for equivalent clearing and settlement services than to national central securities depositaries constituted discriminatory pricing prohibited by Union law. 

In the big-data context, similarly, a company that is dominant in a market for the provision of third-party data could be found in violation of Article 102 TFEU if it charged discriminatory prices for those data. This theory of harm, however, would not apply to first-party data or to the provision of third-party data by a company that was not dominant in a market for such data.

The Franco/German Study seems to contemplate a different concern, that the use of first-party data could facilitate discriminatory pricing by a dominant company in a different product or service market by enabling such companies to better assess their customers’ willingness to pay for their products and services. The Franco/German Study’s comments on the potential efficiencies to be derived from such practices may reflect a doubt about whether discriminatory pricing by dominant companies should be prohibited under EU law, but this issue is beyond the scope of this article.  


This article examines big-data theories of harm advanced by European antitrust authorities in light of the characteristics of different types of data and the ways companies use them. This approach reveals that the risk of data creating a barrier to entry or market power depends on the type of data involved and the characteristics that affect its value in different contexts.

For example, where third-party data are offered as a product, the fact that the seller holds such data does not create a barrier to entry, although the data’s cost could constitute a barrier, as with any input. In markets for third-party data, dominant companies may engage in exclusionary practices, as in any market. Of the theories of harm raised in the Franco/German Study, tying, exclusive contracts and discriminatory pricing seem most likely to arise in relation to third-party data.

In the case of first-party data, the fact that a company collects data on its own assets, products or services seems unlikely to raise barriers to entry or create market power. The cost and expertise required to collect and use such data could be a barrier, as with any input, but any barrier would not derive from the data as such, and a company’s use of such data should not give rise to any exclusionary concerns.

On the other hand, first-party data collected on others, such as a company’s customers, could in principle create a barrier to entry. However, the competitive significance of such data varies depending on the availability of substitutable data, the volume of data required for the desired purpose and how quickly the data’s value diminishes over time. Authorities lack clear tools to assess the substitutability and value of data across a range of markets, each of which must be evaluated on a case-by-case basis. In any event, the fact that a particular dataset is unique or non-replicable, in itself, is not necessarily an indication that big data raise barriers to entry or create market power.  

Only in the narrow situation where a dataholder is found to have a dominant position in a relevant market, the data it collects on other persons contribute to its market power, and the dataholder is abusing its dominant position, would the question of mandatory access arise. Mandatory access to data could be a suitable remedy if the data constituted an essential facility meeting the criteria set out in the European Court of Justice’s Microsoft, IMS and Bronner judgments. In this context, exclusive contracts or networks of contracts to procure data could potentially have foreclosure effects. By contrast, other exclusionary theories of harm raised in the Franco/German Study, including exclusive customer contracts, tying and discriminatory pricing, would not be expected to arise in the first-party data context.

As they develop their thinking on big data issues, antitrust authorities will need to look more closely at the substitutability of different types of data for different purposes and the role of related but distinct factors such as the availability of algorithms or other software to process such data. Meanwhile, a simplistic focus on whether data are unique or non-replicable risks provoking an overly broad legislative reaction. In particular, the European Commission’s proposal to require companies to share any non-personal, machine-generated data they collect could apply to huge volumes of first-party data companies collect on their own assets, products and services, even though such data are among the least likely to create barriers to entry or contribute to abuses of dominant positions.  Imposing such a broad access obligation could chill innovation and investment in a critical period in the evolution of big data practices.
