Hi, my name is Sarah Lamdan and I'm a professor of law at the City University of New York School of Law. I'm making this video to give a brief overview of, and a sneak peek into, my book, Data Cartels, which is coming out on November 8th and is published by Stanford University Press. This book is of special interest to libraries and librarians because it focuses on companies that we work with a lot. My background is in librarianship, and that is actually what brought me to this topic.
In 2017, I was working as a reference librarian at the law school, teaching a lot of classes about how to use Lexis and Westlaw, which are the two main legal research companies in the United States. They're a duopoly in legal information platforms: all law libraries have them, and a lot of public, academic, and other types of libraries also have vendor contracts with them. We worked so much with these platforms as legal researchers that I used to joke that some days I felt like a product rep for Lexis.
So I was really surprised when, one day in 2017, the article on the screen showed up on my computer. It was about a slew of tech companies that were lining up to learn how to work with ICE, the Immigration and Customs Enforcement agency in the United States, to build something called an extreme vetting surveillance program. This was around the same time that ICE was in the news a lot for doing things that raised human rights questions. The agency was being probed because it was separating children from their families and doing a lot of other cruel and unjust things.
So I was really surprised to learn that, among the companies vying to do work on ICE's tech surveillance program, LexisNexis and Thomson Reuters, the parent company of the Westlaw legal research platform, were both on the lists of attendees at meetings to learn how to work with ICE and help ICE do data surveillance. I was surprised because I only knew Lexis and Westlaw as legal research platforms, so I couldn't figure out why these companies would be involved in data surveillance. I did some digging and found out that these companies are not only legal research platforms, they're also some of the government's biggest personal data brokers.
When I noticed that, a colleague and I wrote a blog post for the American Association of Law Libraries website. We sent it to the librarians who ran the blog, and the blog post went up. But within two minutes, it was taken down by the organization. I was really surprised by that because it felt like censorship, and librarians aren't really the type to censor. This made me curious about why we were being stopped from talking critically about Lexis and Westlaw's data brokering work. It also made me curious about what exactly these companies were doing and what kind of power they have over librarians and all of the informational markets that they operate in.
So ever since that happened in 2017, I've been trying to connect the hidden informational world of data analytics companies like Reed Elsevier LexisNexis and Thomson Reuters. When I learned in 2017 that LexisNexis was also a data broker, I was surprised, which made me wonder whether data workers were surprised that LexisNexis is also a legal information platform. And when I found out that LexisNexis is also associated with Elsevier, one of the biggest academic publishers in the world, that made me even more curious.
So I wrote the book to lay out exactly what these companies are doing in various informational markets, and then to show how all of those activities are connected. Through my research, I realized that these companies not only act as monopolists in each of the markets that they occupy, they also block out competition. And they're not especially customer friendly, because they don't have to be when they are such market leaders. So they exploit consumers, including librarians. I started to compare some of the behavior of these major companies to cartels because, even though we don't know if they have explicit agreements about certain decisions that they make, it's clear that they're acting in lockstep to occupy multiple information markets and to make new informational products that rely on exploiting people's data privacy.
I'm going to discuss that in a little more detail. First, I'm going to give an overview of these information markets. Then I'm going to give you a preview of each chapter in my book. I start the book with an overview of how I believe these information systems work and why I think these companies are data cartels. Each chapter after that deals with a specific market that companies like Reed Elsevier LexisNexis occupy, and how they're infusing data analytics into each of those markets to create more products and get more consumers locked into their products. They occupy these data spaces by taking advantage of their informational wealth: their troves and troves of copyrighted informational assets and their personal data dossiers, which are really, really robust. So I'll go over all of that. And then in my last chapter, I give some solutions. I'm a law professor, so I've framed this in terms of legal problems: the problems of antitrust and cartelization, lack of fiduciary duty, and lack of data privacy laws.
So in the final chapter, I present some solutions, and I'll go over a few of those at the end. And then if you are interested, you can get the book when it comes out in November.
I'm going to start by explaining how these companies have transitioned from publishing, or just providing informational content, to becoming data analytics companies, and how that benefits them. This is a common kind of diagram that shows what data analytics companies do. At the bottom of the pyramid, you have what the companies have done traditionally. They offer raw data, such as personal data, court cases, and court opinions. So that's structured data. That's information. They offer these troves of information, whether it be an Elsevier journal article or a profile of somebody who's published an Elsevier journal article, et cetera. So they can offer data and information as a service, that raw informational product.
Or they can do what we've seen them do since the 90s, which is organize data into searchable databases to create smart data systems and big, robust informational platforms. We see that with ScienceDirect. We see that with Lexis and Westlaw. These are systems where you can log in and use not only the raw data, but also the way they've organized and structured their platforms, so that you can do better research, find what you need, and save documents in various files. So they have platforms, workflow tools, and other systems that help you organize the raw data that they're providing.
Now, where they're really looking to open up their markets, and where the future of their work lives, is at the top of this value pyramid, in the most valuable type of information that they're now producing: predictive analytics. They make predictive policing products. They make predictive academic metrics.
Those metrics predict which grant-funded projects are going to be the most lucrative and what pharmaceutical companies should invest in. With their financial data products, they predict future market trends. And then there are prescriptive analytics, which tell people what to do: they tell grant funders who to give grants to and tell lawyers where to file their lawsuits. They're making those products now, and that's the way they've shifted from general publishing, or creating and collating information and data, to actually using that data to create whole new informational products that give predictions and prescriptions.
These companies have the benefit of having thousands of technologists and billions of dollars to spend on acquiring data analytics companies and building their own data analytics products. They also have the benefit of having huge troves of information in multiple information markets. Just a few companies dominate each of these critical information categories and marketplaces.
In the United States, legal information is provided by a duopoly: Thomson Reuters' Westlaw or Reed Elsevier LexisNexis's Lexis law product. Academic information is dominated by a small oligopoly of publishers, which are also transitioning to data analytics and data metrics. You see Elsevier again in this market, along with Springer, Wiley, and just a few other companies. It's worth noting that Thomson Reuters actually invented the metrics system that is now Clarivate's metrics system, so Thomson Science was, before that, kind of the first academic metrics company. All these companies move around and sell different assets and products to one another, but they all operate in the same little cluster of informational power. There's also personal information.
Reed Elsevier LexisNexis and Thomson Reuters split a company called ChoicePoint, which was the first ubiquitous data fusion and personal data dossier enterprise in the United States. So they are both now some of the biggest government data brokers, and they have the biggest data dossiers, with tens of thousands of sources pouring in data about us, updated constantly in real time. I call it a fire hose of personal data that they're both receiving. Not only do they sell the raw data to companies like Palantir, which is a predictive policing systems company, or to government agencies that contract with companies like Palantir. They are also building their own predictive data analytics products, which they call risk products and business solution products. Those products do things like guess at who might commit insurance fraud, or guess at who might be a good or bad employee or tenant. They make all sorts of predictions based on our personal data and the data analytics algorithms that they build.
They also have a lot of power in the financial information industry. They make the top-shelf financial data and predictions about markets and industries by scraping data about stock prices and other materials from EDGAR filings, and by using data analytics to draw predictions and prescriptions, once again, for financial opportunities.
And finally, there's the news information industry. Reuters is its own news company. And LexisNexis actually has one of the biggest news archives, or maybe the biggest news archive, in the world, or so they claim. So they also collect a lot of news information, and that news information is also really useful from an analytics standpoint.
You can easily use algorithms and keyword searches to find news information about companies, people, judges, scholars, and other subjects that generate interesting data analytics.
So the premise of the book is that, until now, the data analytics companies have done a good job of obscuring the immensity of their informational power. They do it by maintaining each of their products in separate silos, which are the silos that we're used to seeing as librarians, and by obscuring what their data products do with vague names like special services and risk solutions. In turn, we as consumers have treated each of the companies' markets as separate entities, not as pieces of the same problem. If we deal with each of the data companies' product lines (personal data, academic information, legal information, financial information, and news) as if they are separate, we will never get to the heart of the problems that these data analytics companies cause. So the goal of the book, Data Cartels, is to bring all of these different industries together, show how they're all interlinked, and think about how we can get at the heart of the information access problems and the data privacy problems that these companies cause.
Here's a quick overview of each chapter of the book. The first chapter after the introduction, chapter two, deals with personal data. It talks about how these companies engage in institutional data brokering. They sell our data to government agencies, including law enforcement, and to other institutions that provide insurance, housing, employment opportunities, and health care, the essential services that we rely on. They basically tell these institutions whether we are a good bet for their services or whether we should not be getting their services.
The data companies do this by amassing invasive data dossiers on each of us and by gathering data from thousands of sources, updated in real time. This chapter unpacks that work and raises some of the data privacy problems that the personal data industry causes.
The next chapter deals with academic research, which is probably the market that we're most familiar with. Academic research is often publicly funded and often conducted at public institutions by public employees. But the fruits of our academic labor are treated like private property by companies that are acting more and more like data analytics firms. The companies treat our public research like their personal portfolios of copyright assets, and their paywalls prevent people from accessing the information they need to make the best decisions and to conduct their own research. People who can access the research platforms are subjected to surveillance, because the companies have turned their websites into data collection tools. Those tools fuel data analytics products that grant funders and research institutions use to determine which projects receive grant funding and who gets hired. You can see how it might be a little bit dangerous for the companies that run these academic research data analytics to also run personal data analytics enterprises that sell data to law enforcement and other enterprises that are not academic. We'll see this theme of interconnectedness flow throughout the book.
The next market I discuss is legal information. As I've already said, there are a few data analytics companies that paywall all of our public laws. They make it impossible for anyone who can't afford their legal information platforms to see the most accessible and up-to-date versions of case law, statutes, and other legal materials.
The government edicts doctrine says that the law should be publicly accessible, but companies have found ways to turn laws into their property that they can sell and put behind paywalls. Now they're finding ways to use legal information platforms, and the personal data drawn from them, to help those who can afford legal data analytics services game the law by predicting which judges will be the most favorable and which legal strategies will be the most lucrative. Meanwhile, pro se litigants, including prisoners who are often left to represent themselves, are unable to access the legal information they need.
The next market that I discuss is financial information. Financial information is interesting because there's a wealth of it online. It's not hard to find stock updates and other sorts of financial information. SEC filings are available through EDGAR, and most companies post their filings on their websites. The problem with financial information is that the most reliable, top-shelf, easiest-to-use, most up-to-date financial information is paywalled by a few financial data companies. The privatization of corporate data creates a two-tiered information system. The public can access outdated, erroneous, and hard-to-read public information, while people who can afford to subscribe to fancy financial data services can get minute-by-minute financial updates and investment information. This information asymmetry causes the very problems that the Securities and Exchange Commission was tasked with preventing: consumers fall prey to online stock-buying scams and panics, losing money on bad deals, while people who can afford expensive data tools have faster access to information, and better financial information at their fingertips, than the general public.
The final market that I talk about is news. When public news becomes private property owned by data companies, both the quality and the availability of news decline.
And I think we've seen that as we've watched more and more of our news sources become paywalled, shut down, or bought out by these consolidated news enterprises. Private data companies have participated in the collapse of the news industry. Over the past decade, local news sources have been shuttered and sold off to national news corporations, which has made local news hard to get and has made the news that people receive more biased towards the viewpoints of whatever company owns the remaining news services. News was once considered a public necessity, and news infrastructure was supported and subsidized by federal, state, and local government. Moving away from government-supported news information systems has led to inaccessible or disappearing local news and to the spread of misinformation instead of vetted, verifiable news.
I made this the chapter before the solutions chapter because I actually think that some of the best solutions to our current information problems can be found in our past systems for ensuring that we had access to news information. Public news systems and public broadcasting systems, I think, are good models, and that is one of the things I discuss at the end of the book when I talk about what we can do.
The big data problems I identify in each of these chapters call for blended, multifaceted solutions. There's no easy fix, because it's such a big problem and it involves so many different types of information. There's no one magic thing that will balance the informational playing field. So in the last chapter of the book, I talk about a variety of suggested solutions and ideas. Some of them are new. Some of them are wonderful ideas from other experts that I wanted to draw together into one space. Here are just a couple of them, as a little preview of the book.
I discuss antitrust interventions that could allow information projects to flourish without inevitably being procured or stamped out by data analytics companies. I talk about enforcing consumer protection rules, and maybe rethinking privacy rights and passing data privacy laws that could prevent data analytics companies from siphoning and selling our personal data and risk products without our consent. And one of the things I talk about most is ensuring that we have sufficient public funding, public support, and resources to help information experts like all of you support and maintain our public information infrastructure. Even though much of this critical information should be public, it isn't free to produce. Anyone who's tried to produce an open access journal or maintain any sort of open access infrastructure knows that it takes a lot of time, money, and resources to set up an enduring, long-lasting open access data and information system, and to curate and keep track of those ecosystems online. So we need sufficient support. That's where I get to some of my public broadcasting and public news television analogies for the digital era.
So I conclude by saying this. Despite all of the gloom and doom that I sow in the chapters of my book, I truly believe there is a possibility to treat essential information more as a public resource and less as a part of these big data analytics companies' proprietary portfolios. I do think that if we stop treating private data as an extractive industry, we can all swim in the ocean of knowledge. I have to credit that quote to Carl Malamud, who does a lot of wonderful work through publicresource.org in the legal information spectrum, and also in the academic information spectrum. So, to borrow his quote, that's the conclusion I come to. And it's upbeat, but it's food for thought.
And I hope that this book will generate ideas from all of you that are even better and more practical than the solutions that I introduce in this book. So I hope to hear from you, and I hope you enjoy Data Cartels.