Commercialization of Private Data: Judicial Approach

In our daily lives, we go through any number of activities without realising that, at every step, our data is being collected, used, and commercially exploited by large corporations to serve us advertisements, make commercial calls, and further their marketing purposes.

Personal data is any information that can be used, directly or indirectly, to identify an individual based on their physical, physiological, mental, economic, cultural, or social characteristics. Commercialization of data is the process of turning existing data obtained from business operations into new revenue streams.

Personal data is any piece of information that can be cross-checked to identify a specific individual, such as fingerprints, DNA, or other details specific to that person. Personal data possesses a great deal of commercial value for businesses. As a result, files containing such personal information are bought and sold, and commercial groups are tempted to use them to identify and search for potential clients. Organizations usually collect many different types of information about people, and a single piece of data that does not identify anyone on its own may still become identifying when combined with other information. For instance, a data controller that asks people downloading products from its website for their occupation might record "banker"; since millions of people work in banks, that detail alone singles no one out and is of little use by itself, but combined with other attributes it can help pinpoint an individual.

In recent years, the collection and use of consumer data by commercial entities has transitioned from a closely held secret to an openly discussed public issue. In a short period, people have realised how little control they have over who records, purchases, and sells their data. Basic consumer activities such as shopping, ordering food, and using online services now generate more and more data, and the large internet companies that organise these activities have realised that they can use or sell that same data for other customer and marketing engagements.

How does data commercialization work?

Data commercialization is the practice of taking existing business data and turning it into new revenue streams: companies take their data and use it to generate revenue or save money. Data mining, pattern recognition, or machine learning algorithms can be applied to existing data sets to identify new business opportunities, products, or services.
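As a minimal illustration of that idea (not taken from the source), the sketch below clusters invented purchase records to surface customer segments that could seed new offerings; the feature names and figures are hypothetical.

```python
# Illustrative sketch only: clustering invented purchase data to surface
# customer segments that could suggest new products or services.
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical features per customer: [orders_per_month, avg_order_value]
purchases = np.array([
    [1, 15.0], [2, 18.0], [1, 12.0],    # occasional, low-spend
    [8, 22.0], [9, 25.0], [7, 20.0],    # frequent, mid-spend
    [3, 95.0], [2, 110.0], [4, 90.0],   # rare, high-spend
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(purchases)
for label in range(3):
    segment = purchases[kmeans.labels_ == label]
    print(f"segment {label}: {len(segment)} customers, "
          f"mean order value {segment[:, 1].mean():.2f}")
```

Each segment might then be matched to a tailored product or pricing tier, which is the "new value" the strategy section below describes.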

A data commercialization strategy involves leveraging data to create new value for a business, such as services or products tailored to the needs and preferences of customers. Analyzing data from a company's operations can also reveal inefficiencies or areas for improvement. The monetization of data is an important aspect of data commercialization; the sections below discuss how the two differ and how data can be monetized successfully.

What makes it different from data monetization?

Data monetization is the process of using existing data to generate revenue directly; companies can sell data to third parties in raw form or converted into insights. Data commercialization is broader: it involves leveraging data to bring new products or services to market, enabling companies to improve their business outcomes.[1]

What are the major components and techniques of data commercialization?

There are three major technological components contributing to the growth of data commercialization: the Internet of Things (IoT), artificial intelligence (AI), and blockchain.

The global market for end-user IoT solutions is expected to reach $1.6 trillion by 2025. IoT enables companies to collect a great deal of data from edge devices, and by leveraging IoT-based data, businesses can make decisions in real time.

As part of their data commercialization strategy, companies use AI and machine learning techniques to process their data and develop innovative products. Amazon’s Echo, for example, collects data from consumer commands and then uses that information to recommend other products to them.

A key component of data commercialization strategies is security. Blockchain enables organizations to keep an open, distributed ledger that records and transmits data securely; changes to records require verification, which makes tampering detectable.
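To make "changes to records require verification" concrete, here is a toy hash chain in Python. It is a drastic simplification of what a blockchain ledger does, not a production design: each entry commits to the hash of the previous one, so altering any record breaks every later link.

```python
# Toy hash chain: a simplified stand-in for a blockchain ledger.
# Each entry commits to the previous entry's hash, so editing any
# record invalidates all subsequent hashes and is easy to detect.
import hashlib
import json

def entry_hash(fields: dict) -> str:
    return hashlib.sha256(json.dumps(fields, sort_keys=True).encode()).hexdigest()

def append(chain: list, data: str) -> None:
    prev = chain[-1]["hash"] if chain else "0" * 64
    entry = {"data": data, "prev": prev}
    entry["hash"] = entry_hash({"data": data, "prev": prev})
    chain.append(entry)

def verify(chain: list) -> bool:
    prev = "0" * 64
    for entry in chain:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash({"data": entry["data"], "prev": entry["prev"]}):
            return False
        prev = entry["hash"]
    return True

ledger: list = []
append(ledger, "consent granted: user 42")
append(ledger, "data shared with partner A")
print(verify(ledger))           # True
ledger[0]["data"] = "tampered"
print(verify(ledger))           # False: the change is detected
```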

What are some of the challenges associated with data commercialization?

Businesses cannot create value from data unless it is clean and usable; otherwise, data commercialization will fail. Data is collected and governed differently depending on the context, and its value depends on its availability, structure, and cleanliness. If you want to reuse the same data for a different use case or business unit, you have to consider whether it is still useful in the new context. When integrating data sourced from different systems, companies can also encounter technical challenges; for example, they must consider whether the data is encrypted.
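As a hedged sketch of the "clean and usable" point, the check below screens invented records for the availability, structure, and cleanliness issues just mentioned before they are reused in a new context; the field names and thresholds are hypothetical.

```python
# Illustrative data-quality gate: records (with invented fields) must be
# present, correctly structured, and clean before reuse in a new context.
REQUIRED_FIELDS = {"customer_id": int, "country": str, "monthly_spend": float}

def is_usable(record: dict) -> bool:
    for field, expected_type in REQUIRED_FIELDS.items():
        value = record.get(field)
        if value is None:                          # availability
            return False
        if not isinstance(value, expected_type):   # structure
            return False
    return record["monthly_spend"] >= 0            # cleanliness: no junk values

records = [
    {"customer_id": 1, "country": "IN", "monthly_spend": 120.0},
    {"customer_id": 2, "country": "IN"},                          # missing field
    {"customer_id": 3, "country": "IN", "monthly_spend": -5.0},   # dirty value
]
usable = [r for r in records if is_usable(r)]
print(f"{len(usable)} of {len(records)} records usable")
```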

How should data be commercialized?

Here are some practices to overcome the challenges of commercializing data:

  • An effective data management and collection process requires a data map, which takes into account what data you have, which data is sensitive, and so forth.
  • Identify commercially viable offerings. Data prices should be comparable to those of other companies. There are two broad approaches to pricing data (a toy costing sketch follows this list):
      • In quantitative pricing, the selling price is determined by the marginal costs associated with data generation, such as storage and collection costs.
      • In qualitative pricing, the value of the data set is determined by customers. Other factors, such as return on investment and competitors’ prices, also affect the price of data.
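As a toy illustration of the quantitative approach above, the sketch below prices a data set from the marginal costs of generating it; all figures, and the margin parameter, are invented for the example.

```python
# Toy quantitative-pricing sketch: price a data set from the marginal
# costs of generating it (collection and storage), plus a margin.
# All figures and the margin are invented for illustration.
def quantitative_price(records: int,
                       collection_cost_per_record: float,
                       storage_cost_per_record: float,
                       margin: float = 0.30) -> float:
    marginal_cost = records * (collection_cost_per_record + storage_cost_per_record)
    return marginal_cost * (1 + margin)

# Example: 100,000 records at $0.002 collection + $0.001 storage each.
print(f"${quantitative_price(100_000, 0.002, 0.001):,.2f}")  # $390.00
```

Qualitative pricing, by contrast, has no such closed formula: the same data set would be priced by surveying what customers are willing to pay, what competitors charge, and the return on investment buyers expect.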

Individual rights protected by Indian IT laws

The Information Technology (Reasonable Security Practices and Procedures and Sensitive Personal Data or Information) Rules, 2011, notified under the Information Technology Act, 2000, govern data protection in India. They impose certain obligations on organizations that collect, process, store, and transfer sensitive personal data or information of individuals, such as obtaining consent, publishing a privacy policy, responding to requests from individuals, and complying with disclosure and transfer restrictions.

The Right to Privacy is a fundamental right of every citizen, protected under Article 21 of the Indian Constitution. According to the Supreme Court, information about a person, and that person’s right to access such information, also fall within the scope of the right to privacy.[2]

The IT Act contains the following relevant provisions:

  • Section 43A – if a body corporate that possesses, deals with, or handles any sensitive personal data or information in a computer resource it owns, controls, or operates is negligent in implementing and maintaining reasonable security practices, and thereby causes wrongful loss or wrongful gain to any person, it is liable to pay damages by way of compensation.[3]
  • In accordance with Section 72A, a person who discloses personal information in breach of a lawful contract or without the consent of the person concerned, intending to cause, or knowing that such disclosure is likely to cause, wrongful loss or wrongful gain, shall be punished with imprisonment for up to three years, a fine of up to five lakh rupees, or both.[4]
  • The Data Protection Rules make it mandatory for any legal entity to publish a privacy policy and give individuals rights over their sensitive personal information. Besides providing individuals with access to their information and the ability to verify it, the Rules require legal entities to obtain consent before disclosing personal information, except to law enforcement, and allow that consent to be withdrawn.[5]

Because these regulations are widely dispersed, the current legal framework contains many loopholes. Their applicability is limited to sensitive personal data generated and transmitted electronically, and some of their provisions may be overridden by contract.

Concerns over access, use, and control of personal data through artificial intelligence

AI systems have several unique characteristics compared with traditional health technologies. Notably, they can be prone to certain types of errors and biases, and sometimes cannot easily, or even feasibly, be supervised by human medical professionals due to the “black box” problem. This opacity may also extend to how health and personal information is used and manipulated if appropriate safeguards are not in place. In response to this problem, many researchers have been developing interpretable forms of AI that will be easier to integrate into medical care.

A significant portion of existing technology relating to machine learning and neural networks rests in the hands of large tech corporations. Google, Microsoft, IBM, Apple and other companies are all “preparing, in their own ways, bids on the future of health and on various aspects of the global healthcare industry.” Information-sharing agreements can grant these private institutions access to patient health information, and some recent public-private partnerships for implementing machine learning have resulted in poor protection of privacy. For example, DeepMind, owned by Alphabet Inc. (hereinafter referred to as Google), partnered with the Royal Free London NHS Foundation Trust in 2016 to use machine learning to assist in the management of acute kidney injury. Critics noted that patients were not afforded agency over the use of their information, nor were privacy impacts adequately discussed. A senior advisor with England’s Department of Health said the patient information was obtained on an “inappropriate legal basis.” Further controversy arose after Google subsequently took direct control of DeepMind’s app, effectively transferring control over stored patient data from the United Kingdom to the United States. The ability to essentially “annex” mass quantities of private patient data to another jurisdiction is a new reality of big data, and one that puts patient privacy at greater risk.

While some of these violations of patient privacy may have occurred despite existing privacy laws, regulations, and policies, the DeepMind example makes clear that appropriate safeguards must be in place to maintain privacy and patient agency in the context of such public-private partnerships. Beyond the possibility of general abuses of power, AI systems pose a novel challenge because their algorithms often require access to large quantities of patient data and may use the data in different ways over time. The location and ownership of the servers and computers that store and access patient health information are therefore important in these scenarios. Regulation should require that patient data remain in the jurisdiction from which it is obtained, with few exceptions.

There have been calls for greater systemic oversight of big data health research and technology in order to protect privacy, but this is not always achievable due to the way that commercial implementations can be managed.

Given the recent examples of corporate abuse of patient health information, it is unsurprising that issues of public trust can arise. For example, a 2018 survey of four thousand American adults found that only 11% were willing to share health data with tech companies, versus 72% with physicians. Moreover, only 31% were “somewhat confident” or “confident” in tech companies’ data security. In some jurisdictions like the United States, this has not stopped hospitals from sharing patient data that is not fully anonymized with companies like Microsoft and IBM. A lack of public trust might heighten public scrutiny of or even litigation against commercial implementations of healthcare AI.

Re-identification: a problem

Another concern with commercial AI’s use of big data relates to the external risk of privacy breaches from highly sophisticated algorithmic systems themselves. Healthcare data breaches have risen in many jurisdictions around the world, including the United States, Canada, and Europe. And while they may not be widely used by criminal hackers at this time, AI and other algorithms are contributing to a growing inability to protect health information. A number of recent studies have highlighted how emerging computational strategies can be used to identify individuals in health data repositories managed by public or private institutions.
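One such strategy is a simple linkage attack: joining an “anonymized” data set with a public record on shared quasi-identifiers. The sketch below, on entirely invented records, shows how a unique match on as little as ZIP code, birth date, and sex can re-identify a patient.

```python
# Illustrative linkage attack on invented data: an "anonymized" health
# record can be re-identified by joining it with a public data set on
# quasi-identifiers (ZIP code, date of birth, sex).
anonymized_health = [
    {"zip": "02138", "dob": "1945-07-31", "sex": "F", "diagnosis": "hypertension"},
    {"zip": "02139", "dob": "1982-01-15", "sex": "M", "diagnosis": "diabetes"},
]
public_register = [
    {"name": "Jane Doe", "zip": "02138", "dob": "1945-07-31", "sex": "F"},
    {"name": "John Roe", "zip": "02144", "dob": "1990-03-02", "sex": "M"},
]

quasi_identifiers = ("zip", "dob", "sex")
for health in anonymized_health:
    matches = [p for p in public_register
               if all(p[k] == health[k] for k in quasi_identifiers)]
    if len(matches) == 1:  # a unique match re-identifies the record
        print(f"{matches[0]['name']} -> diagnosis: {health['diagnosis']}")
```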

This reality raises potential privacy risks in allowing private AI companies to control patient health information, even where anonymization occurs. It also raises questions of liability, insurability, and other practical issues that differ from instances where state institutions directly control patient data. Considering the variable and complex legal risk that private AI developers and maintainers may take on when handling large quantities of patient data, carefully constructed contracts will be needed to delineate the rights and obligations of the parties involved, as well as liability for the various potential negative outcomes.

Through the use of generative models, AI developers may be able to mitigate ongoing privacy concerns. Generative models can produce realistic but synthetic patient data that is not connected to actual individuals. Apart from the initial need for real data to create the generative model, this enables machine learning without long-term use of real patient data.
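As a toy stand-in for the generative models described above (real systems would use far richer models, such as GANs), the sketch below fits a multivariate Gaussian to invented patient measurements and samples synthetic records from it.

```python
# Toy synthetic-data sketch: fit a multivariate Gaussian to (invented)
# patient measurements, then sample synthetic records from the fitted
# distribution instead of retaining the real ones. This only shows the
# principle; production systems would use richer generative models.
import numpy as np

rng = np.random.default_rng(0)

# Invented "real" data: columns = [age, systolic_bp, cholesterol]
real = np.array([
    [54, 130, 210], [61, 142, 225], [47, 118, 190],
    [70, 150, 240], [58, 135, 205], [65, 145, 230],
], dtype=float)

mean = real.mean(axis=0)
cov = np.cov(real, rowvar=False)

# Sample synthetic patients; none corresponds to a real individual.
synthetic = rng.multivariate_normal(mean, cov, size=4)
print(np.round(synthetic, 1))
```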

Conclusion

Data are a new goldmine in today’s hyper-connected world. The Internet of Things, data-mining technologies, predictive analytics, cloud computing, and data virtualization, among many other advances, have made it possible to collect huge volumes of data, analyse them at unprecedented speed, and build insights into people and their behaviours. This is why European businesses often wish to commercialise their data and capitalise on its monetary potential. To do so unimpeded, however, they need to stay compliant with data protection legislation. This paper has hopefully shown that commercialising data in compliance with data protection law in Europe can be more problematic than it may appear at first.

In particular, there are two significant hurdles in the way of abiding by data protection law. Firstly, it will often be unclear whether the GDPR applies to personal data at all (what we have called the ‘if problem’). The GDPR applies to personal data only; as we have seen, however, the notion of what constitutes ‘personal data’ remains elusive and might prove extremely wide in scope. Anonymising personal data to de-personalise them and escape the GDPR regime may be very hard, if not practically impossible, in many circumstances. And even where formerly personal data have been ex hypothesi anonymised, they may cease to be anonymous if new technologies arise that enable re-identification. Secondly, even where the GDPR applies, it will often be difficult to predict how it applies (what we have called the ‘how problem’). It is unclear what lawful basis for processing can be invoked to trade data; nor is it always easy to demarcate sensitive data from non-sensitive data: their dividing line is as shifty as it is essential in determining how a business should go about commercialising its data. Ultimately, the ‘Protean’[6] character of personal data means that compliant commercialisation will remain a moving target.


[1] Gulbahar Karatas, In-Depth Guide to Data Commercialisation in 2023.

[2] Justice K.S. Puttaswamy & Anr. v. Union of India, Writ Petition (Civil) No. 494 of 2012.

[3] Section 43A, The Information Technology Act, 2000.

[4] Section 72A, The Information Technology Act, 2000.

[5] Rule 8 of the Data Protection Rules.

[6] It comes from the shape-shifting sea god Proteus in Hellenic mythology.


Author: Prity Kumari Suman

