Introduction

By 2018 almost every citizen of the country had a digital identity, over 80% of the country had a bank account and 68% of all payments in the country were digital. According to the World Bank, in the period between 2014 and 2017, for every two new bank accounts that were opened anywhere in the world, one of them was in India. In the month of December 2022 alone, 8 billion digital transactions were processed in the country.

The Global Findex Database 2017. Measuring Financial Inclusion and the Fintech Revolution.
https://documents1.worldbank.org/curated/en/332881525873182837/pdf/126033-PUB-PUBLIC-pubdate-4-19-2018.pdf

The Challenge of Data Governance

The Data Revolution

While the internet had already been with us for decades by that point, there were, at the turn of the century, just 400 million people on it – nearly half of whom lived in North America and Europe. Today that number is fast approaching 5 billion, with the vast majority of internet users accessing it from Asia.
https://ourworldindata.org/grapher/number-of-internet-users?time=2000..2020

At the turn of the century, there were just 740 million cell phone subscriptions worldwide.
https://www.statista.com/statistics/262950/global-mobile-subscriptions-since-1993/

By the end of 2021, 4.3 billion people were using the mobile internet even though there were still 3.2 billion people living within the footprint of a mobile broadband network and not using it.
https://www.gsma.com/r/wp-content/uploads/2022/12/The-State-of-Mobile-Internet-Connectivity-Report-2022.pdf

Today 38% of mobile internet users have reported that they use the mobile internet for education at least once a week – an increase of over 10% when compared to 2019. Around 21% of them use it to manage their health and 14% to order goods or services.
https://www.gsma.com/r/wp-content/uploads/2022/12/The-State-of-Mobile-Internet-Connectivity-Report-2022.pdf

The single largest use of the internet, after communication, is to watch videos. Over 2.6 billion people worldwide use YouTube every month.
https://www.globalmediainsight.com/blog/youtube-users-statistics/#overview

467 million of them are in India.
https://www.statista.com/statistics/280685/number-of-monthly-unique-youtube-users

T-Series is the most subscribed YouTube channel globally, with 232 million subscribers.
https://www.statista.com/statistics/277758/most-popular-youtube-channels-ranked-by-subscribers/

Every day, people watch over a billion hours of video.
https://blog.youtube/news-and-events/you-know-whats-cool-billion-hours/

More than 70% of people watch video on a mobile device.
https://www.socialmediatoday.com/social-business/mobile-usage-trends-youtube-infographic

There are over 10 billion YouTube installs on Android devices.
https://earthweb.com/how-many-downloads-does-youtube-have/

A recent study found that data from wearables could be used to predict the onset of influenza-like illness up to a week before symptoms appeared by analysing changes in heart rate and sleep patterns – offering hope for a new kind of early warning system in the event of another global pandemic.
https://www.statnews.com/2021/09/29/wearables-predict-presymptomatic-infections-study/

Given that in India only 14 per cent of small businesses have access to credit, leading to an astounding $530 billion credit gap, there is almost no option but to find ways to use these unusual technological workarounds.
https://www.financialexpress.com/industry/sme/msme-fin-530-billion-massive-credit-gap-in-indias-msme-sector-out-of-819-billion-addressable-demand-report/3031372/

Google famously discovered that the most effective teams are those made up of people with diverse backgrounds and perspectives, leading to a fundamental shift in the way that the company hired.
https://www.thinkwithgoogle.com/future-of-marketing/management-and-culture/diversity-and-inclusion/-diversity-in-the-workplace/

There is no greater evidence of how real that possibility is than the fact that in 2022, a short film generated entirely by text-to-video AI won the Jury Award at Cannes.
https://www.youtube.com/watch?v=5dvxY6vXHsA&ab_channel=GlennMarshallNeuralArt

The Devil’s in the Data

In the United States, the Stateville Correctional Center in Illinois was notorious for its Panopticon-inspired roundhouse.
https://www.chicagotribune.com/news/breaking/ct-stateville-roundhouse-closed-met-20161201-story.html

In Cuba, the Presidio Modelo was a direct implementation of the Panopticon design with five circular buildings, each housing up to 2,500 prisoners.
https://www.amusingplanet.com/2014/04/presidio-modelo-abandoned-panopticon.html

Inasmuch as this project resulted in the creation of some of the earliest networked intelligence databases, its long-term impact on networked surveillance was on account of the manner in which it served as a training ground for a generation of data scientists who learned the art of data mining in the process. By making it possible to carry out data analysis across a vast network of computers, ARPA expanded Hollerith’s original “punch-photograph” vision of the individual into something much larger and more all-encompassing. Yasha Levine. 2018. Surveillance Valley: The Secret Military History of the Internet. Perseus Books, USA.

In 1975, NBC correspondent Ford Rowan broke the news about the existence of a sophisticated computer communications network that the military was using to spy on Americans. “If you pay taxes, or use a credit card,” he said, “if you drive a car or have ever served in the military, if you’ve ever been arrested, or even investigated by a police agency, if you’ve had major medical expenses or contributed to a national political party, there is information on you somewhere in some computer. Congress has always been afraid that computers, if all linked together, could turn the government into ‘big brother’ with the computers making it dangerously easy to keep tabs on everyone. In 1968 it killed a proposal for a national data bank which would have held all the computer files on every American. Last year the Congress rejected Fednet, a plan to hook together the computer files of various government agencies. But NBC News has learned that while Congress was voting down plans for big computer link ups, the Defence Department was developing exactly that capability: the technology to connect virtually every computer.”
https://books.google.co.in/books?id=IucI84ApkBEC&printsec=frontcover&source=gbs_ge_summary_r&cad=0#v=onepage&q&f=false

It is hard to imagine how, having known the problems that could arise from networked data, we did nothing to stop it. How the NBC Report that had anticipated all the problems we are struggling to control was ignored to the point where so few even know of its existence. Levine, Yasha. Surveillance Valley: The Secret Military History of the Internet. New York: PublicAffairs, 2018. 288 pp. ISBN 978-1-61039-802-2

On the 14th of June, 2014, the Chinese government unveiled its Social Credit System.
https://chinacopyrightandmedia.wordpress.com/2014/06/14/planning-outline-for-the-construction-of-a-social-credit-system-2014-2020/

It was a system designed to shape social behaviour using incentives that would “allow the trustworthy to roam everywhere while making it hard for the discredited to take a single step.”
https://www.brookings.edu/wp-content/uploads/2019/08/FP_20190827_digital_authoritarianism_polyakova_meserole.pdf

Jack Ma, the swashbuckling head of Alibaba, said it best: “Whoever owns enough data and computing ability can predict problems, predict the future, and judge the future.” Chin, J., & Lin, L. (2022). Surveillance State: Inside China’s Quest to Launch a New Era of Social Control. St. Martin’s Press.

A customer of the Target store outside Minneapolis walked in, in something of a rage, demanding to see the manager. His teenage daughter had just been sent a bunch of shopping coupons for baby products and he was furious that Target could have done something like this. “She’s still in high school,” he said, “and you’re sending her coupons for baby clothes and cribs? Are you trying to encourage her to get pregnant?” The poor store manager had no idea why she had been sent this material and could only assume that this was an inadvertent mistake by the marketing department. And so he apologised as best he could under the circumstances, assuring the customer that this would not happen again. A couple of days later the manager received a sheepish call from the irate customer, who said he’d had a chat with his daughter and had been made aware of goings-on in her life that he had been unaware of at the time. She was, in fact, pregnant, and due in August. Fry, H. (2019). Hello World.
https://doi.org/10.17104/9783406732201

In another incident, this time involving domestic violence, a smart device that had turned itself on during a heated (and violent) argument misheard the angry boyfriend’s repeated screams at his girlfriend, “Did you call the sheriff? Did you call the sheriff?”, as an instruction and, of its own accord, phoned the sheriff’s department.
https://www.nytimes.com/2017/07/11/business/amazon-echo-911-emergency.html

When Karen Navarra was discovered in her house with fatal lacerations on her head and neck, the police were puzzled. Her stepfather, Anthony Aiello, had visited her that night, but only briefly, to drop off some homemade pizza and biscotti at her house. When interviewed, the nonagenarian said his stepdaughter was alive when he left her – in fact she had given him two roses and walked him to the door. But as the investigation wore on, the police discovered that the Fitbit fitness tracker Karen was wearing showed that her heart rate had spiked significantly at around 3:20 pm, when she was still with him in her apartment, and that it slowed down rapidly to a halt at 3:28 pm, a full five minutes before the neighbours confirmed he left. This was enough for Anthony to be arrested for her murder.
https://www.nytimes.com/2018/10/03/us/fitbit-murder-arrest.html

Take the case of Heidi Waterhouse who, after traumatically losing a much-wanted pregnancy, kept seeing advertisements for products for a newborn baby even after she had manually unsubscribed herself from every email provider she could think of that might be targeting advertisements at her. Or Carly-May Kavanagh, whose polycystic ovary syndrome made it incredibly difficult, if not impossible, to conceive – and who yet kept getting sent advertisements for maternity clothes and baby toys in her social feed. Fry, H. (2019). Hello World.
https://doi.org/10.17104/9783406732201

Judges with daughters have been shown to be more likely to rule favourably towards women compared to their brethren who do not. Adam N. Glynn and Maya Sen, ‘Identifying judicial empathy: does having daughters cause judges to rule for women’s issues?’, American Journal of Political Science, vol. 59, no. 1, 2015, pp. 37–54,
https://scholar.harvard.edu/files/msen/files/daughters.pdf.

There is evidence to suggest that judges are more likely to award bail if they have just come back from a recess, as compared to those who are approaching a food break. Keren Weinshall-Margel and John Shapard, ‘Overlooked factors in the analysis of parole decisions’, Proceedings of the National Academy of Sciences of the United States of America, vol. 108, no. 42, 2011, E833,
http://www.pnas.org/content/108/42/E833.long.

In a study in the UK, when 81 judges were presented with the same hypothetical fact situation and asked to decide whether or not they would award bail to the imaginary defendants, they failed to agree on a single one of the 41 cases presented to them. Mandeep K. Dhami and Peter Ayton, ‘Bailing and jailing the fast and frugal way’, Journal of Behavioral Decision-making, vol. 14, no. 2, 2001, pp. 141–68,
http://onlinelibrary.wiley.com/doi/10.1002/bdm.371/abstract.

In general, machine-predicted jobs tend to be less diverse and more stereotypical for women than for men, reflecting for the most part the skewed gender and ethnicity distribution found in US Labor Bureau data, which in turn mirrors existing societal biases.
https://arxiv.org/pdf/2102.04130.pdf

Data Regulation

In Stratton Oakmont v. Prodigy, a New York state court was called upon to determine whether Prodigy, an online service provider, should be held liable for defamatory statements made by an anonymous user on its platform. The plaintiff argued that, unlike CompuServe, Prodigy actively moderated and controlled the content on its platform. Since it was an active moderator and in control of the information that was posted, its operations were more akin to those of a publisher than of a passive distributor of information. The court agreed and ruled: “It is Prodigy’s own policies, technology and staffing decisions which have altered the scenario and mandated the finding that it is a publisher.” Kosseff, J. (2019). The Twenty-Six Words That Created the Internet. In Cornell University Press eBooks.
https://doi.org/10.7591/9781501735783

India’s DPI Journey

India’s Digital Economy

In 2016, Haresh Chawla, partner at True North, one of India’s most experienced and respected private equity funds, wrote an article titled “How India’s Digital Economy can Rediscover its Mojo”. It was a long, well-argued essay about the current state of India’s digital economy and the new directions it ought to be considering. While the entire piece is well worth reading, what was particularly striking was the unique demographic lens with which he analysed the country.
https://www.foundingfuel.com/article/how-indias-digital-economy-can-rediscover-its-mojo/

Today 700 million Indians have smartphones and access to some of the cheapest data plans on the planet.
https://www.fortuneindia.com/technology/india-to-have-1-billion-smartphone-users-by-2026/107220

Categories of DPI

What Could Go Wrong?

In a controversial report first issued in 2017, the Centre for Internet and Society reported that about 130 million Aadhaar numbers and other related confidential data had accidentally been made public.
https://cis-india.org/internet-governance/information-security-practices-of-aadhaar-or-lack-thereof-a-documentation-of-public-availability-of-aadhaar-numbers-with-sensitive-personal-financial-information-1

A year later, The Tribune broke the news that an anonymous WhatsApp group was selling Aadhaar card details for Rs 500 a pop.
https://www.tribuneindia.com/news/archive/nation/rs-500-10-minutes-and-you-have-access-to-billion-aadhaar-details-523361

In 2019, the Cyberabad police lodged an FIR against IT Grids Pvt. Ltd, for having in its possession 78 million Aadhaar records from Andhra Pradesh and Telangana – presumably obtained from the State Resident Data Hub.
https://www.medianama.com/2019/04/223-it-grids-aadhaar-leak/

In 2021, a hacker group called Red Rabbit got access to 2.5 million records of Airtel customers, including their Aadhaar numbers, addresses, dates of birth, names, and phone numbers.
https://www.deccanherald.com/business/technology/25-lakh-airtel-customers-data-with-aadhaar-ids-leaked-company-denies-any-data-breach-946942.html

In 2022, a security researcher was able to access a portion of the Pradhan Mantri Kisan Samman Nidhi website that revealed Aadhaar information about the 100 million farmers registered on it.
https://techcrunch.com/2022/06/13/aadhaar-leak-pm-kisan/

By the end of the day, questions were being raised about whether the breach was in fact all that it had been portrayed to be. Reporters from India Today’s open source investigation team claimed to have got in touch with the hacker, who admitted that the results generated by the chatbot were accessed using a vulnerability in another platform associated with the Health Ministry.
https://www.indiatoday.in/india/story/cowin-data-leak-investigation-hacker-telegram-chatbot-covid-2392087-2023-06-12

The Tribune report on the breach of Aadhaar data, which alleged that identity information of Indian citizens was available for as little as Rs. 500, led the World Economic Forum to state, in its Global Risks Report 2019, that the largest data breach of the year had taken place in India, “where the government ID database, Aadhaar, reportedly suffered multiple breaches that potentially compromised the records of all 1.1 billion registered citizens.”
https://www3.weforum.org/docs/WEF_Global_Risks_Report_2019.pdf

Around the world, examples abound of data breaches in which highly sensitive information leaks into the public domain, allowing anyone who obtains it from the dark web to access victims’ accounts (username and passcode), financial information (bank account and credit card details) or health records (personal medical information). Very few incidents of this nature have, so far, been reported in India. The worst instance of this sort reported in India was the breach of 35 million user accounts of Juspay customers, which included masked card details and card fingerprints. This breach allegedly took place using an unrecycled, old Amazon Web Services access key to access a test server.
https://www.hackread.com/juspay-data-breach-card-data-sold-dark-web/

In the list of the many individuals around the world who had been the target of surveillance using this software were over 170 Indians – including sitting ministers of the government, leaders of the opposition, journalists, members of the legal community, businessmen, assorted government officials, scientists, and human rights activists.
https://thewire.in/rights/project-pegasus-list-of-names-uncovered-spyware-surveillance

Santoshi Kumari was a young girl who lived in Karimati, a rural village in the heart of Jharkhand. On the 20th of September, 2017 her family ran out of food – the card they had relied on for food rations had been revoked because it was not linked to an Aadhaar number and they had no other means to get what they needed to survive. For eight days they went without food, unable to beg or borrow it from neighbours. At 8 pm on the 28th of September, Santoshi complained of a stomach ache. Two hours later she was dead.
https://thewire.in/politics/jharkhand-death-aadhaar-ration-card

According to a list compiled by activists Reetika Khera and Siraj Dutta, a good proportion of the hunger-related deaths recorded in 2017 and 2018 were on account of failures in the Aadhaar system. Earlier that year, the State of Jharkhand had required these databases to be seeded with beneficiary Aadhaar details to eliminate all those who were extracting more from the system than they were entitled to. In the process, they excluded many who depended on the food they received from it to live from day to day.
https://thewire.in/rights/of-42-hunger-related-deaths-since-2017-25-linked-to-aadhaar-issues