This is Think Tech Hawaii, Community Matters here.
Welcome and aloha everybody. My name is Mark Shklav and I am the host of Think Tech Hawaii's Law Across the Sea program. A new term of art has entered our vocabulary. The new word is Big Data. Today our Law Across the Sea program is titled Big Data Has Arrived. And my guest is Hawaii attorney Glenn Melchinger. Glenn is a director at the law firm of Alston Hunt Floyd and Ing and his practice often involves issues related to computer systems, cyber technology, and Big Data. I've asked Glenn to talk to us today about Big Data: to define it, to tell us where it's from, where it's going, and address some of the issues that have developed around this new world and new word. So Glenn, first, welcome, good to see you here today. Thank you for being my guest.
Thank you very much, Mark, it's a pleasure to be here.
Glenn, what is Big Data? Where is it from? Where is it going? You're going to have to dumb this down a little bit for me too, okay? Because I need to understand it.
Right. Well, no, I think you're plenty savvy yourself. But Big Data is a term that's used in many different ways. And it's rather vague and abstract, but to try and simplify it a little bit: essentially we're talking about huge amounts of data that have been gathered. I've read this statistic that in the past couple of years, we've accumulated more data than in the prior, say, 3,000 years of human existence. I also heard that statistic when we were back in 2010 and 2012. So maybe now we have even more. It's exponentially increasing. There's a lot of different types of data.
And this is like on the computers and on the internet, or I mean, is that what we're talking about?
Right. And so what are we comparing it to? We're comparing it to maybe people who are doing cave drawings or writing down the movement of stars or Archimedes getting out of his bathtub and writing down the water displacement or whatever, or monks handwriting Bibles.
And from all of that time coming forward, we end up with a lot of data. But now we're talking about computers that create data by themselves. An example might be Google. Google is an attempt, of course, to manage and enable everybody to search through a ton of data, which is the internet, the World Wide Web. And the way they've achieved that is by creating even more data about that data and indexing it. So there's layers and layers and layers, and it's all different types. You end up with video data, you end up with, increasingly, say, AI-generated information about that data.
Artificial intelligence.
Correct. Artificial intelligence. Algorithms that'll go and look at an image and identify, OK, that's a cat, and label the photos. There are competitions for this. And of course, user-generated data is the one that we're probably the most familiar with. Every time you walk down the street with your smartphone, there's probably GPS tracking you. That's data. You might be posting something to Facebook. That's data. You might be emailing, texting, all sorts of different things. You might pay for something with some payment software on your phone. And you end up with a ton of data that we're all generating every year.
And so many of us, I mean, if you watch people walking down the street, they're not supposed to do it as they cross the street, but they're all on their handset of some sort.
Right. Right. And I think it is now against the law to look at your phone as you're crossing the street. I don't know whether that works for paper or not, but correct, there's a ton of data being generated all over the place. And corporate data: every time you go on Amazon, the flip side of that is they're tracking what you're looking at, they're tracking what you're interested in. You go to your Netflix queue and it has suggestions for you.
That's based on an algorithm that looks at what you've looked at in the past and ranks things and tries to predict what you might be interested in next.
So we as humans begat data and data begets more data is what I'm hearing you say.
I think that's right. There's more and more data about data also being generated, and there's a huge and potentially lucrative area of study, data analytics, which people are going into and trying to figure out what that data can be used for.
So the big data, that term, we're talking about all of this now because I've heard it more and more frequently, big data or big data. I've heard that being used and I'm not sure how it's being used, what the ultimate meaning of it is. Is it scary? Should we be happy with it? What are the uses of it in our lives?
My thought is that essentially, just to go back and define it a little bit more, what is big data? It's something that is perhaps inscrutable. It's hard to absorb how much is there or exactly what's in it. And I came into this from the context of litigation and searching through documents trying to find specific things. And I learned there's a competition called the TREC competition. It's the Text Retrieval Conference, and all the e-discovery software vendors get together and try to assault a big collection of maybe over a million emails. And they try and find particular things in it. And my understanding is that there's some corpus of data that was being used for that, and it took some years of it being used before people realized, well, there's personally identifiable information in there. There's Social Security numbers, there's credit card numbers, there's other things. And in particular, I'm talking about an Enron data set of actual email. And it took a while before that maybe got cleaned out pretty well.
So it's data that's so big that you might have it sitting right in front of you, more or less, but you really can't pierce it and you don't know what's in it perhaps.
But my understanding is that there are attempts to pierce it. It is being pierced now in many different areas: commerce and litigation, personal use. And what's that about? What's happening in that respect?
There are a lot of people who, and I think the key word perhaps that we used before is the word predict. There are a lot of people who want to look at what data there is and try and see where things are going to go. Netflix wants to know, for example, what are you going to look at? They want to make suggestions because the more they suggest things that you want to see, the more often you're going to go back. Because you may not know from a small interface on your phone or your tablet what they have available. So it depends on their algorithm to serve up these things. An algorithm, that's how people are attacking big data. Again, in the litigation context, it's predictive coding or technology-assisted review. It's ways of taking sort of brute-force computing power and going against all the text and finding which word is related to another word, how often they come up, in what type of documents, and delivering that to you rather than trying to read through a million emails from start to finish.
Well, so in the commercial use of it, is big data the total universe of data? Or is it somehow broken down into smaller parts? Or how does the algorithm work in that respect? And then I want to talk about the litigation aspect of it too. But first, the commercial.
Sure, sure. So the tools that people are using to go against big data. There's several different issues in the question. So one is perhaps the algorithms and the tools that people are using. Second is, what is the data? What are people using?
What they're using is the raw source that they're going to analyze with their algorithm or whatever software they develop. And then you also asked back there, what's sort of the promise of, where does all of this go and why bother and who cares, right? So first off, to talk about the algorithms. People are increasingly getting better at attacking what's in a big data set because of the advent of AI and neural net computers that can actually learn what you're looking for. I sat and watched some MIT classes on autonomous vehicles and neural nets and so on. And it's interesting to hear the professors admit they don't really know why it's learning what it's learning. But some of the tools have really improved dramatically in the past year even. Even just a couple of months ago, there was a significant achievement. In chess, we have 1997: Deep Blue beats Garry Kasparov. It took 20 years before anyone could tackle the problem of Go, however. And so DeepMind, a UK company that Google purchased, created a program called AlphaGo.
And Go is the Japanese version of chess, if you will.
Right. However, it's also a lot more complex. Chess, you have 64 squares. You've got eight pawns. You've got a limited number of pieces. The full-size Go board is 19 by 19. What I've heard is that the different branches and possibilities, mathematically, the possible Go games, are greater than the number of atoms in the universe or something like that. I mean, maybe you're familiar with Moore's law, which is that computing power, or the number of transistors you can scribe on a silicon wafer, doubles every two years or something like that. Apparently, we're coming to the end of that, according to Intel, by 2020 or so. But then we have quantum computing to go to, so maybe we'll be okay. But the computing power cannot possibly cover all the possible outcomes if it's the number of atoms in the universe.
And it cannot then say, okay, this is the best move for you to make right now in Go. So some different level of intelligence had to be developed in the system. So it's an incredible achievement. You know, in October 2015, the AlphaGo program beat the first professional player, and then by, I think it was about May of this year, it beat the highest-ranked player in the world, the world champion. And it was given a ninth-dan rating and then it sort of retired. There's a film on this, but even more interesting perhaps is how did it get so strong? How did it become better than humans at what only humans have been able to do?
And from what you tell me, I mean, it's like you need to have some ability to discern where you're going in a way. It's not just math.
Exactly, exactly. And that's why it's been such a wake-up call, and it was the Holy Grail of AI, as I understand it. It's because it requires some sort of higher-order level of strategy perhaps. So in this case, you end up with it finally achieving that victory, and the training it apparently did was to play itself. The researchers had the program go back and forth and play itself, and it was trained on amateur games.
So it's becoming, you know, are we getting into, you know, the computers will become self-aware and I don't know, you know.
But it's not generalizable at this point, it seems, but there are increasingly sort of more general programs that can learn different games. One of them was one that starts to play Pong, and everyone remembers perhaps this game, well, not everyone. The youth probably don't, but I remember it as one of the first video games I played, you know, and the little thing goes back and forth on the screen. The computer can figure that out. When it gets to more complex games, then it becomes more difficult.
Okay, now we have this vast amount of data out there. What are the commercial companies trying to dig from that? And then I want to hear your response to that.
Then I want to take a little bit of a break and then I want to come back and hear about law, where we're going.
So companies, I think, are looking through their data to try and figure out where there is some kind of actionable intelligence. It would make sense to talk a little bit about what the promise of big data is. Big data sets have been used, for example, to help reallocate resources in a very efficient way. There's somebody who's done this to prevent too many women in India from having to use cooking fires, which damage their lungs. They have created a system and managed and wrangled a bunch of different data sets together to figure out, where do we put clean fuel stores that are close to every single village in India? How do we allocate that and put them on the map? And there are people who've solved these problems using data sets and using algorithms. That's kind of the promise. The companies, they're looking at it to find, how do we find the next big thing? How do we find the next product that's going to hit? How do we become Apple or something? How do we make money?
But I am pleased to hear that there sounds to be some good that comes out of this. And I want to talk about that after the break. Thank you.
Nothing is making sense for me and you. There's got to be solutions. How to make a brighter day.
Hello, everyone. I'm DeSoto Brown, the co-host of Human Humane Architecture, which is seen on Think Tech Hawaii every other Tuesday at 4pm. And with the show's host, Martin Despang, we discuss architecture here in the Hawaiian Islands and how it not only affects the way we live, but other aspects of our life, not only here in Hawaii, but internationally as well. So join us for Human Humane Architecture every other Tuesday at 4pm on Think Tech Hawaii.
Welcome back. I'm Mark Shklav, host of Law Across the Sea on Think Tech Hawaii. And I am with Glenn Melchinger, and we are talking about big data or big data.
However you want to pronounce it, I suppose. The question is, what does it do? Where is it from? Where is it going? What is it being used for? And Glenn, when we left off, I was a little bit worried about big data and how it was being used. Because, I mean, we were talking about the ability now that it can play a game of Go and win. But does that require intelligence? I'm not sure. But then you also talked about some ways that it's being used to help people. To tell where we can use gas, I guess, in India to cut down on harmful effects. And I guess that you delve into that data and find out what makes sense in the location of stores and that type of thing. But generally speaking, my impression was big data was being used by commerce to make money. To figure out what's the best way to go. How do we use it in law? Tell us a little bit more about that. And I want to talk at some point about the pros and cons more. I want to talk about if we're losing something by using big data or going into it. Let's talk about the law. Where are we with the law? What are the goods and bads, if you will? Or are there bads or goods?
Well, you know, it's hard to perhaps put a judgment on just the data, whether it's good or bad. It's out there. On the other hand, it is becoming something that a select few very large companies have lots of. And it is essentially becoming an asset for them to the extent that they can actually create some value out of it. I've heard a statistic recently that a lot of attempts to do data analytics on big data and create something or monetize it in some way have not necessarily come to fruition. It may be that somebody goes through a whole process in a company and then they find out, well, my employees, they really aren't making better decisions about anything. And so whether it plays out as good or bad, that's a little hard to say.
That's a judgment call, right?
I think it's a judgment call.
You know, I mean, if we step back sort of scientifically, the data is data. It's information that's out there. There are people, however, who are pointing to a bad potential use of it, or perhaps I should say a harmful use. There's a bestseller out called Weapons of Math, M-A-T-H, Destruction. Cathy O'Neil wrote this and it became a New York Times bestseller last year. And she talks a lot about how the algorithms specifically can be discriminatory. And why is that? Well, you're looking to predict the future from a past data set in which you've only collected certain types of things, and it's limited in certain ways, and for that matter, the data only encompasses what is actually measurable in some way. You know, you can't run a lot of analytics on things that you can't quantify. And that's the field she comes from. She was a quant, which is someone who's doing math, in Wall Street terms. Then, you know, you have to begin to look at algorithms perhaps as something like an argument or an opinion. That's her take.
She's saying math destruction.
Weapons of math destruction, right. And so she points in her book to a couple of particular examples. Pre-2008, all the trading algorithms and other things and the inability to perceive the risk of subprime mortgages and collateralized debt obligations and that kind of thing led to collapses, led to a lot of people losing their jobs. And what do we learn from that is her question. And I'm not, you know, taking a position on her book either way, but I think the question ends up being, what do you do with the data? What do we decide to do with the data? Is there some overarching morality that we're going to impose on how to use it, for example?
And it comes down to humans then, is what I hear you saying. Even though we're dealing with data and machines and computers and technology, humans are still involved in this. And it's still coming back to the good and the bad of people.
Right.
And we're living in an age which is incredible because we're bordering on whether we really get to this stage where we're living in the Terminator movies or something like that. Or whether HAL is going to become sentient and kick us out of the spaceship or whatever it is. And there are people who are warning about this. Masayoshi Son is talking about the singularity, when computers exceed human-level understanding. But that's one of the benefits of big data too. You have microscopes which allow you to see small stuff. You have electron microscopes for even smaller things, particle colliders. All these types of things. Telescopes. We can see through things with X-rays, et cetera. Big data. There's one example that I came across. Somebody who was watching his kid learn language. He did something that all parents should do, I'm sure: put nine cameras in all of his rooms and recorded everything that everybody was saying. God forbid we live in that. But he made fascinating discoveries about how his son's vocabulary increased when a new word would become learned. And it wasn't necessarily how many times his boy saw it. It was maybe different contexts. The word comes up in different contexts, in different ways. So there's things that big data told him about how we learn language that are beyond our ability to perceive, especially when you're sleepless and waking up and doing midnight feedings or whatever the heck it is. So that may be one of the further promises of big data, which is its extra, superhuman level of perception about trends and how things go.
And one thing, with respect to the legal aspect: I've heard people are concerned about their privacy. And they're concerned that anybody now, or somebody with the right algorithm, can go in and find your background and your banking information and go deep into a lot of things about you. How are we protected? Are we protected or is this a new world that we still have to walk into? We have to wait and see.
I'm blanking on the name of the case, but I think the Supreme Court has picked up on the third-party doctrine, as it's called, which is: if I choose to walk down the street with my phone and I have my GPS turned on, that GPS data, which sits in the hands of a third-party ISP or sits in the hands of a phone service provider, that's not my data anymore. There are cultures, however, that have a very different view. The EU has a very different conception about what is private, what is personal. They have a very strict regime about bringing data out of Europe into the US or anywhere else.
So what happens? What's the result of that?
Well, the result of that is in the US, once you put it out there, more or less, generally speaking, you kind of lose control to some extent. Especially if someone is criminally intent on getting your information, hackers, etc., if it's out there, they're going to get it.
So there might be more protection in Europe than in the United States?
There may be.
That's what I hear you saying right now.
Right. I mean, at least in Europe, having control over your information is deemed a fundamental right. And in some other countries, if you look at China, which just passed the new cybersecurity law, in broad strokes it creates the ability for the Alibabas and the Baidus to become the computational force to help the government find people they want to find. And even Apple and other companies that have tried to get into that market have had to bow to that. You may or may not remember there was an Apple versus FBI fight some time ago about whether anyone could create a backdoor through the cryptography in the iPhone. Apple fought that. Ironically, that was solved by somebody coming in saying, we'll break it. Don't worry. And my understanding is that for about seven figures, they just broke into the iOS and got the data.
So on the other hand, yes, privacy or no privacy, that's a big question. That's a big thing for the future and for law and practice.
And as we close out here, I'd like you to tell me a little bit about where you see this going. I feel a little bit of hope because, well, both hope and sometimes despair, I guess, but because human beings are involved. And some of this requires the judgment of human beings, although machines seem to be adopting some of that in some ways. I don't know if they can become humanity. Perhaps not. Perhaps that's too much to expect. I think it is. But where are we going with this? What are your concluding thoughts on the big data?
Well, I mean, I hope the dialogue heads it into its being a tool.
For good.
For some constructive social purpose.
Right.
And using it with other tools. Well, it gets into another further discussion and I'm not sure how much time we have left. But there are ways of people thinking about using what they're calling thick data, which is the unmeasurable, the things that you can't quantify, and also combining those two things together to make decisions. Because at the end of the day, I hope people begin to realize that if you have a set data set that may or may not be biased, it's not going to help you predict what's really coming down the pike. The future is unknowable. So perhaps the question ultimately becomes one that Lewis Carroll, a mathematician in his own right, asked through Humpty Dumpty in Through the Looking-Glass several years ago. Alice is querying, saying, can you really use words? And you say data, I say data, whatever. Can you really use words to mean so many things? And he says basically that the question is who is to be master? That's all. And that seems to be the fundamental question of our time.
Okay, well, thank you very much. I think we have a closing photo of Humpty Dumpty and Alice for us. And so perhaps we have to look back in time for some answers and what it all means. Glenn, thank you very much. I appreciate your time.
Thank you, Mark. Aloha.
Aloha.