[Opening of the recording is unintelligible.]
[...] how frequently certain apps open and close network sockets. And we make the data available to the user.

I think there's one telling example of how our data isn't really ours. Well, what you can do is pull out your smartphone, pull out the SD card, mount it on your computer, and you can access the data on your terms and not the smartphone's. No, you can't, because, well, at least not with any Google Nexus device since the first one (which was meant to be given to hackers), because they don't have SD card slots. I wonder why. Unless you root your device, there's no SD card slot, so you have to access the data on the device's terms.

So, press a button, and our app will extrude its internal SQLite database to somewhere where the user can mount the storage, SD card or not, and get at it. And we've seen young coders doing that and getting their SQLite files, which you can play with using the Python standard library, and they're merrily plotting histograms of when apps have been pestering them with notifications and starting to look at their own data. And this is the slightly ironic bit.
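As a rough illustration of the kind of thing the young coders were doing with their exported files, here is a minimal sketch using only the Python standard library. The talk does not describe the app's schema, so the table name `notifications` and its columns `app` and `ts` (Unix seconds) are assumptions made up for this example, as are the helper names.

```python
import sqlite3
from collections import Counter
from datetime import datetime, timezone


def notifications_by_hour(conn, app=None):
    """Return a list of 24 counts: notifications seen in each hour of day (UTC).

    Assumes a hypothetical table notifications(app TEXT, ts INTEGER),
    where ts is a Unix timestamp in seconds.
    """
    sql = "SELECT ts FROM notifications"
    params = ()
    if app is not None:
        sql += " WHERE app = ?"
        params = (app,)
    counts = Counter()
    for (ts,) in conn.execute(sql, params):
        counts[datetime.fromtimestamp(ts, tz=timezone.utc).hour] += 1
    return [counts.get(hour, 0) for hour in range(24)]


def text_histogram(per_hour):
    """Render the hourly counts as a crude text histogram, one row per hour."""
    return "\n".join(f"{hour:02d} {'#' * n}" for hour, n in enumerate(per_hour))


# Usage against an exported file (the file name is hypothetical):
#   conn = sqlite3.connect("exported.db")
#   print(text_histogram(notifications_by_hour(conn)))
```

Counting in UTC keeps the sketch simple; for a plot that is meant to show "breakfast" and "evening" you would convert to the user's local time first.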
Periodically, the app phones home and uploads suitably non-creepy anonymised data to our server, CKAN, or at least built on CKAN, which claims it's basically Drupal for data. However, we gave them a phone with the app installed on it, theirs to keep and treasure, and, you can't really see it, but there is a start and stop button. So, not only do you have the choice to install the app or not, you can press a stop button and go, right, I'm not going to play anymore. So, in some sense, yeah, we're sort of out-NSAing the NSA. On the other hand, honest guv, it's there to do some sort of social good.

So, what does it do? If you look at your Android app, I mean your Android settings, you can see that you get aggregated network traffic data: since I last reset the counter, how much data has Facebook eaten? Wasn't my idea, actually. This, doing it the easy way and reading the instructions, came from one of our young coders. So what we do is fairly easy (well, I managed it): get an Android service up and running and just poll network traffic data for individual apps every half a second. And if you see a constant, sort of monotonic increase in the amount of data, that's one chunk and you log it.

Slightly more deviously, Android is just about still a Linux system, lurking on top of a Linux kernel. Every Android app is a user on that Linux system, and there's an API call: you can find out which app corresponds to which UID. And, I guess, a back door, though not really a back door, as I think Google doesn't particularly worry about it, apparently: you can ferret around in /proc/<whatever PID>/net/tcp, or udp as the case may be. You can look in that little virtual file, poll it every half a second, and you get a hex-encoded IP address and, more tellingly and perhaps more usefully, a hex-encoded port. So, if our app shares a group with the other apps in question, we see their network traffic, we see their sockets.
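To make the hex-encoded address and port concrete, here is a minimal sketch in Python of decoding the IPv4 form of that virtual file. It is not the app's code (which runs on Android); the function names are made up for illustration, and it assumes a little-endian machine, which is how the kernel prints the address on typical ARM and x86 devices. Later Android releases restrict access to these files, so treat it as a description of the format rather than something guaranteed to work on a current phone.

```python
import socket
import struct


def decode_ipv4_endpoint(field):
    """Decode one 'ADDR:PORT' field, e.g. '0100007F:0035' -> ('127.0.0.1', 53).

    The address is a 32-bit integer printed in host byte order
    (little-endian assumed here); the port is plain hexadecimal.
    """
    addr_hex, port_hex = field.split(":")
    packed = struct.pack("<I", int(addr_hex, 16))
    return socket.inet_ntoa(packed), int(port_hex, 16)


def parse_proc_net_tcp(text):
    """Parse the text of an IPv4 /proc/.../net/tcp (or udp) table.

    Returns one dict per socket with the local and remote endpoints,
    the numeric state code and the owning UID.
    """
    rows = []
    for line in text.splitlines()[1:]:  # the first line is the column header
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or truncated lines
        rows.append({
            "local": decode_ipv4_endpoint(fields[1]),
            "remote": decode_ipv4_endpoint(fields[2]),
            "state": int(fields[3], 16),
            "uid": int(fields[7]),
        })
    return rows
```

Polling would then amount to reading the file on a timer, parsing it, and comparing the set of (uid, remote endpoint) pairs with the previous snapshot to spot sockets opening and closing.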
So, we know what protocols have been used, we know when the socket's been opened, we know the IP address and, perhaps more interestingly, the port. So, do that every half a second. It's not as reliable; we don't get every single bit of every single app playing ball with that. But I guess it gives us a little more information, or more detailed information, than the standard traffic data dodge.

There's a picture. You can't see it. Chrome has gone and done something at, well, nearly three o'clock in the afternoon. What it was doing was on TCP 443, so it was nice HTTPS, with a whole lot of port 80 stuff underneath it. Interestingly, to give a little rundown of the apps that do play ball: Google Plus does, Skype does, most web browsers do.

So, other simple things we record, because other apps that get at this are recording it already: network names, MAC addresses, et cetera. One of the slightly hackier and cheaper things I've done is that you can have your app ask the user to let it be an accessibility service. And instead of reading out notifications in a synthesised voice or flashing them up in large print, it's just dumping the time and the name of the app, not the details of the notification (that would be far too creepy). It's dumping that to its internal database as well.

And also GSM cells. Yeah, full GPS would have been a bridge too far: not nice for the device's battery and altogether far too creepy. I just get the cell IDs. Now, because of the nature of the project, we didn't want to go cap in hand to Google and say, can we use your location API? And that wouldn't allow us to differentiate data that came from wireless hotspots and data that came from just cell locations. So, very kindly, OpenCelliD have a big database full of, as near as dammit, pretty much every (not entirely complete) cell ID in the UK. And so that we don't hammer their poor API, which they've so generously given out, what I've done is include a huge compressed dump of UK cell IDs.
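A bundled dump like that can be turned into an offline lookup table in a few lines. The sketch below is in Python rather than the app's own Android code, and it assumes a gzip-compressed CSV with a header row naming the columns `mcc`, `net`, `area`, `cell`, `lon` and `lat`; that layout, the function names and the sample values in the usage note are assumptions for illustration, not details given in the talk.

```python
import csv
import gzip


def load_cells(lines):
    """Build {(mcc, net, area, cell): (lat, lon)} from CSV lines with a header row."""
    index = {}
    for row in csv.DictReader(lines):
        key = (int(row["mcc"]), int(row["net"]), int(row["area"]), int(row["cell"]))
        index[key] = (float(row["lat"]), float(row["lon"]))
    return index


def load_cells_gz(path):
    """Load the same index from a gzip-compressed CSV file on disk."""
    with gzip.open(path, "rt", newline="") as handle:
        return load_cells(handle)


def lookup(index, mcc, net, area, cell):
    """Return (lat, lon) for a cell, or None if the dump does not contain it."""
    return index.get((mcc, net, area, cell))
```

For example, with a made-up row `GSM,234,10,1234,5678,0,-0.14,51.57` under a matching header, `lookup(index, 234, 10, 1234, 5678)` gives `(51.57, -0.14)`. Returning `None` for unknown cells matters because, as noted above, the database is not entirely complete.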
And we also didn't particularly want to go cap in hand to Google and then use their Maps API either. So, it's a pity you can't see this. This is actually some of my cell data, so I'm prepared to have my mobile life, admittedly somewhat invisible, on this screen here. I'm the red dot. Yes, I live somewhere in Highgate. That's the GSM cell I connect to most frequently. It's got me on the wrong side of the Archway Road, so I can't be particularly well stalked by it. I work somewhere in Central London, not especially far from Covent Garden. Sometimes I walk home through Camden and Kentish Town. That's about it. Enough to do some funky, maybe K-means, clustering and see: does my behaviour differ much from that of some of our teenage coders? How many nexuses of social activity does a teenage coder have as opposed to a middle-aged one? And we just do that by, again, not using the Google Maps API. We get an Android WebView, the control-less web browser, and use the very lovely OpenLayers JavaScript library to display OpenStreetMap.

And yes, I'll admit it now: the UI and the UX (I never said I did UI or UX) is clunky. To anyone watching at home, yes, I will fix the layouts. I promise. I've had other fish to fry.

The data. Just an example. We've only been running this for, I guess, three months now. So maybe you can see the histograms. Turns out three of our young coders downloaded and used the same game, Don't Tap the White Tile. That literally encapsulates what it is: you've got to not tap the white tile. You've got a sort of scrolling Mondrian picture, and you have to press the tiles, just not the white ones. And it suffers pretty much the same fate as most game apps seem to. We've got a histogram of time of day, and we've normalised for each individual user. It covers about a three-week period for all the players. So, spikes at certain times of the day: five o'clock, a bit at breakfast.
And not much anywhere else. And eventually it peters out. Of course, this plot is not temporal, but we could show how quickly the usage falls off when they get bored with it. Except player B was pretty keen, playing this thing around seven o'clock in the evening. But not nearly as keen as on this title. Here we've got two just-about-uniform distributions, butted together. There's not an hour of the day when this app has not phoned home. It's only slightly more likely to phone home at breakfast than it is to use the net at four in the morning. What could it be? Some massively addictive multiplayer online game? Minecraft? Are you building a cathedral on the sly? No. Well, the idea is, you move the little red dot through the scrolling maze. That's it. It's called The Line. And the list of permissions it requires is, well, I can't read them here either, but it wants full GPS. It wants to know when you've turned the little flashlight on. It wants to know your Wi-Fi state. It wants to access the SD card. It wants GSM cell location as well. The works.

So once you've used our little tool to identify an app that's getting up to these sorts of capers, can we fight back? This is not going to be proper hard-boiled infosec stuff, 'cos I'm not that yet. But can we grab the app's package file, the .apk file that you get from the Play Store? Decompress it. In there, there's a file called AndroidManifest.xml. It's literally that: a manifest of what services and permissions and events an app is concerned with. Can we have a go at decompiling it and seeing what it's up to? Turns out you don't need to have a not particularly pleasant afternoon trying to get the Android SDK installed. There is a Chrome, or better still Chromium, app; you can just point it at the relevant bit of the Play Store and you will get your .apk file. Then there's a lovely little tool called Apktool. One line, pointing it at the .apk file. What lurks within? You don't really need to be a Java coder or an Android coder. I'm not really one of those.
To see what's going on. There is an intent filter, in other words an event filter, looking for things that are called cn.jpush. So, some sort of push notification service. I wonder whether that was why The Line app was being so utterly busy on port 3000, which is weird. There is a service from umeng.com, who are an app analytics service. I don't know which service wraps the other one. So yeah, we're getting intents and push notification services. So yes, the app is phoning home, and it's a notification service on a weird port. That's why it's on port 3000.

dex2jar. The Dalvik virtual machine isn't a Java virtual machine. It takes its own little thing, so you can use dex2jar to get a JAR file back again, and with the JD suite of Java decompilers you can get the source code and, without knowing anything about proper semi-native Android coding, look for the usual suspects. Where is the phone state listener and where are the location listener objects? They're there. So jpush.cn has got them, umeng has got them, and there's also a class provided by tencent.com that is making reference to latitude and longitude. So yes, to send you these notifications it really, really needs to know exactly where you are, apparently. So yes, it really is phoning home that often and telling its masters exactly where you are.

So yes, I will fix the UX. Can we look for patterns and interesting anomalies in other data? Social networks, for instance. Do some funky machine learning, maybe a bit of K-means, spatially and temporally. Get some interesting insights on the behaviour of users. We're going to hold a second hack day. In fact, the picture on the front slide was a bunch of our Young Rewired State coders playing nicely together, much like grown-up coders often don't. Can we confront them with their data? Or, in fact, can they confront their data, and how will that change their attitudes? I'll leave that to my learned cultural studies colleagues.
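Going back to the manifest inspection described earlier in the talk: once a tool such as Apktool has decoded AndroidManifest.xml into plain XML, listing what an app asks for takes very little code. The following is a minimal sketch in Python; the function name is made up, and the package and class names in the usage note are invented placeholders, not the contents of any real app's manifest.

```python
import xml.etree.ElementTree as ET

# Namespace that Android manifests use for attributes such as android:name.
ANDROID_NS = "http://schemas.android.com/apk/res/android"
NAME = f"{{{ANDROID_NS}}}name"


def summarise_manifest(xml_text):
    """List requested permissions, declared services and intent-filter actions.

    Expects the text of an already decoded (plain XML) AndroidManifest.xml,
    not the binary XML found inside a raw .apk.
    """
    root = ET.fromstring(xml_text)
    return {
        "permissions": [e.get(NAME) for e in root.iter("uses-permission")],
        "services": [e.get(NAME) for e in root.iter("service")],
        "actions": [e.get(NAME) for e in root.iter("action")],
    }
```

Fed a manifest that declares, say, `android.permission.ACCESS_FINE_LOCATION` and a service called `com.example.push.PushService`, the summary lists both, which is enough to spot location permissions sitting next to a push or analytics service without reading any Java.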
I'd love to tack on a little inbuilt demographic survey, so that when we open this thing out to more individuals we can ask: how does the behaviour depend on the gender they present, age, et cetera? We'll gradually let the data be available to the right people, probably the academic community at first. We'll finally get on to the Play Store, and maybe I'll get a little more hard-boiled and start in when apps emerge that look ripe for the prodding. You'd think it perhaps a little more impressive if you're wild with Wireshark or Burp Proxy or drozer, et cetera, which I misspelled, but luckily you can't see it.

You cannot see any of this text, but we are not on the Play Store yet. But if you want to play with the app, go on, and give me your data; you get to keep it too. There's a GitHub.io site. We are KingsBSD on Twitter. We have a blog. Feel free to file past and have a look at it. The slides will be on SlideShare; if you Google or search for KingsBSD, you'll get our SlideShare and therefore you can get the slides. Any questions? Yup.

Yeah, the... With your cell tower location, you can get a rough location without asking for location permission?

You have to ask for location permission, but you're not tapping into Google's service that packs... There's no way of going to Google and going, tell me where this cell tower is, because it knows and it doesn't want to tell you. It'll tell you the phone's location based on multiple cell towers and Wi-Fi networks, which is, for us, too creepy and invasive. We'd rather just go, where was the nearest cell tower? It's suitably defocused because it's about the network, it's about the cell network, not so much about the user, and it's just a reasonable proxy for where the user is.

Ooh, a feature request. Yeah. Interestingly, when I was sort of peeking into what The Line was made out of, I think these things are sort of assembled on an assembly line with not too much tender loving care.
There's a C++ library that the actual game canvas is on, and there's a completely separate, allegedly cross-platform, app builder. And there are three other games that look suspiciously like The Line that appeared to use exactly the same C++ library. I haven't yet decompiled the C++, and I don't know if I will, so I don't know whether these are in fact the same app, decompiled and repackaged, but people do that. So it's pretty easy to perform surgery on these things, because even I can get at the innards. So you're quite at liberty, unless it's a paid app, to grab The Line, open it up and perform some surgery. I mean, there are... I found myself playing this zombie shooting game for a little bit, and if you've got no network access it just goes, I'm not playing. I want to... no.

Not as far as I know. I guess there's one thing called the Android Hacker's Handbook, and there's a sort of pulp... well, there are two Packt Publishing tomes on Android pen testing. What you could do, I guess, if you really desperately wanted to play the game, would be to, you know, level up in your infosec-fu and hacking-fu, and torture the app by putting it in an emulator, so it goes through some sort of proxy, and you could sort of convince it that it's on the net when it isn't really, I suppose.

CyanogenMod, yeah. I'll repeat that, because that's a good one, in case the mic hasn't picked it up: apparently, yep, if you get the likes of CyanogenMod, you can to some extent feed an app false data and go, look, if this thing's not going to shut up and insists on having my location, I'll just send it some rubbish and it'll be fobbed off with that. That's another good point, because I don't know if there's any more data I could successfully wring out of apps, but the one thing I haven't done is assume that the phones are rooted, because you can't: phones mostly aren't going to be rooted. Keep going.
Buy a nicer platform, I guess, or a less invasive one. This is the thing: even now this is sort of my business, and I install some of these apps just to see what they do. Now, instead of just blindly hitting install without reading any permissions, now, sure, it's after I've blindly accepted the permissions, but I'm more likely to go, oh, I probably should have read those. People are obviously willing to make this swap. They go, yeah, I'll move the little red ball in exchange for telling... I mean, yeah, the ad service is clearly being sent location. Now, we don't know how fine-grained the actions it takes based on that are. It might just go, oh, right country, right language, there you go. What if they team up with somebody else? They've got that data. Not to be too tinfoil-hat over this, but, oh yeah, it can be used for purposes they didn't intend it for at first. And of course you have got no say in what their intended purpose was in the first place.

Who's going to install the app and try it out? Or who thinks it's none of our damn business? Stony silence. Oh, you have? Sorry.

That, exactly. Yeah, there's market share and all that, but ideally we'd have wanted, and some of them actually are, we'd have wanted our teenage coders to collaborate on the writing of all the apps. So if it's iOS, that's another hundred dollars per phone just for the privilege of a developer licence. With Android, at least you can say, okay, I'm going to install this, not from the app store, without rooting the phone. And if you root an iPhone, I think it messes up the user experience quite horribly, because all the DRM stuff isn't going to work anymore. So yeah, I'm afraid no iPhones yet, unless you want to write one. Go on, you know you want to.

Am I evicted yet? Yeah, apparently, because if you dump to a SQLite database every time a cell location changes, and that happens quite a bit, the phone gets warm. So yeah, it's got better if I cache events and then every five minutes squirt
those down to the SQLite database, and it is less bad. If you're in a crisis, then stop recording. And I also only upload to the CKAN server every ten minutes, and only if you're on Wi-Fi. So I'm not evil. Well, I am, but I'm not evil enough to eat up your data plan. I'll stalk you. That being said, I mean, what have I really uploaded? A bunch of cell locations, fairly vague. I mean, the top app in the socket database, or the socket table rather, is, massively, by a long margin, the web browsers. All you can go is, well, how much of it is port 80 and how much not? So you've used some apps. We want it to be about the apps and not... well, to some extent it is going to be about the user, but we're mostly into stalking the apps and not the user. Probably sometimes we can't avoid doing both. Tumbleweed.

So, to do what, sorry? Eventually, just getting into the, I don't know if you've heard of Shibboleth, getting into the UK academic authentication system. We do intend to share the data. Like I said, I didn't want to pass judgment on what the apps are doing already, but it would be a little bit strange if we, air quotes, gave away all the data to the man. So we've got to think about that. I guess if you weren't from an academic research group and you came to us and asked us nicely, can we have some data, then yeah, obviously we'd like it as open as possible without it doing any sort of harm. So if we were utterly sure that really open public access wouldn't do that, then yes. It's written into the terms of the research grant: whatever we build has got to stick around for at least five years.

Was anyone stunned into silence by my lightning talk, where I talked about our Twitter analytics? At least there wasn't any matrix multiplication in this one. Okay, I think you've suffered enough. [inaudible] Thank you.