Thank you. Thank you. So hi everyone, thanks for being here. It's my first talk ever, in French or English or whatever, so if I'm nervous, I guess that's normal. Before I start, I want to apologize to the Bitcoin experts. I'll take some shortcuts so that everyone can understand the principles, and I know some of the Bitcoin experts may be offended by the shortcuts I take. I'm sorry.

Basically, this talk is about de-anonymizing the Bitcoin network by regrouping addresses using public sources. I only have half an hour for the talk, so we'll start right away.

First of all, what this talk is not about. I get these questions every time I talk about Bitcoin, and seriously, I hate answering them, because I don't have any opinion on them. Mostly, I don't care. I don't know if Bitcoin will replace currency. Whether you buy is your choice. Who is Satoshi Nakamoto? I have no idea. Every sentence that starts with "Does Mt. Gox something something": I don't care. And is Bitcoin real money? It depends.

So now, what is the talk about? It's about addresses, blocks, the blockchain and transactions, and about the limits of pseudo-anonymity, because we are told that Bitcoin is an anonymous network, but that's not exactly true. During this presentation, I'll try to teach you how to associate addresses with a possible entity, and then how to use external sources of information to match a real person to that entity, a node of the network. And finally we'll get a tool release, because I'm pretty sure you just want to have the tool and try it at home.
So first of all, Bitcoin 101. As I said, I'll take some shortcuts.

First, the address is your identity on the network. It's based on the public key. Basically, you generate a public/private key pair, you take your public key, you hash it, you encode it, and you get an address. That's mostly it: an address is a simpler representation of a public key. So far I've seen 70 million addresses on the network.

Now for blocks and the blockchain. A block mostly contains transactions. When a transaction is created on the network, it stays in the unconfirmed state. As soon as it's included in a block, it becomes confirmed. I know that, for a Bitcoin expert, you have to wait some blocks to make sure it's really confirmed, but for the purpose of this presentation, when a transaction is in a block, it's confirmed. The blocks are chained together, and they create the blockchain. So the chain of blocks, which builds up the big registry of all the transactions, is the blockchain. A block is mined about every 10 minutes, and as I said previously, new transactions are included in this new block created by the network and move to the confirmed state. So they will be in the transaction history forever, or at least for as long as Bitcoin exists.

Transactions. One thing to understand is that a transaction is a money movement, like the transactions you do every day in real life. Bitcoin is still real life, but you understand what I mean. One key thing is that the inputs equal the outputs. For example, here on the slide, Alice wants to pay Bob. She refers to previous transactions, one where Alice received money from Fred and one where Alice received money from Charles. And then she says: okay.
I received two bitcoins in this transaction, I received three bitcoins in the other transaction, so now I have the right to spend those bitcoins and send them to someone else. So she sends them to Bob. The output here will be Bob, five bitcoins: two plus three equals five. If Alice wanted to send only one bitcoin to Bob, what she would have done is, for example, send one bitcoin to Bob and four to Alice. So she's sending the change back to herself. If you want to send money to someone, you basically send the amount you want and keep the rest by sending it directly to yourself. That's the important part: the amount of the inputs equals the amount of the outputs.

As I said previously, Alice refers to previous transactions to say: hey guys, I received money in this transaction back in the day, and now I would like to spend it. So she's referring to the outputs of previous transactions. Alice says: okay, I have this transaction here, I received money in its output, and now I will spend it as an input in the new transaction. That's it. She's basically just writing in the ledger: I received money there, I now spend it and sign for it in the new transaction. And to prove that I have the right to spend it, and that it's really me who has this address, I sign the input of the transaction, to prove to you that I own the private key associated with this address.

Fun fact: I have seen approximately... sorry guys, when I was in elementary school, I think I missed the class where they explained numbers in English. I'm pretty dumb with numbers in English. I'm sorry. So, 65 million transactions have been seen on the network.

As I said previously, inputs and outputs. An output is an amount of money that you want to send, next to the hash of the recipient's public key. For example, I want to send money to Laurent. I take his public key.
I hash it and put three bitcoins next to the hash of Laurent's public key. When he wants to spend the money he received from me, what he does is take the hash of the transaction where he received money from me, and say that he received money from me in this output. He signs the previous transaction with his private key and puts his public key next to it, so the network can verify the signature.

When the network verifies the signature, what happens is this. The network takes the public key, hashes it, and compares the result to the previous hash160. If it matches, that's good: the public key matches. Now the network has to verify that he owns the private key associated with this public key. So the network takes the public key and verifies the signature that was made with the private key. If the signature is good, it says: okay, that's the right person, and he proved to me that he has this private key.

Let me do a small recap here, because I may have lost some people in the process. Basically, you have a public key, you hash it, you get a hash160, you encode it, and you get an address. That's important for my tool, which is why I redo the diagram here. The other part is really the exchange mechanism, the transaction mechanism. As you can see, in transaction three you have inputs that refer to an output of transaction one and an output of transaction two. So I take the first input, get the output it refers to, take the hash160, encode it, and get address zero. I do the same thing for the other one. So I take the transaction output at index zero. For this one, I take this hash160 here.
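The encoding step from the recap (hash160 to address) can be sketched in Python as follows. This is a minimal illustration of standard Base58Check encoding with the mainnet pay-to-public-key-hash version byte `0x00`, not the code of the tool presented in the talk. The function name `hash160_to_address` is made up for this example, and the all-zero hash160 is only a placeholder input.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def hash160_to_address(hash160: bytes, version: int = 0x00) -> str:
    """Base58Check-encode a 20-byte hash160 into a Bitcoin address."""
    if len(hash160) != 20:
        raise ValueError("hash160 must be exactly 20 bytes")

    # Version byte + hash160, followed by a 4-byte double-SHA256 checksum.
    payload = bytes([version]) + hash160
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    data = payload + checksum

    # Interpret the bytes as one big integer and write it in base 58.
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded

    # Each leading zero byte is represented by the character '1'.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


# Placeholder input: a hash160 made of 20 zero bytes.
print(hash160_to_address(bytes(20)))
```

Going from a public key to the hash160 itself (SHA-256 followed by RIPEMD-160) is left out here, because RIPEMD-160 is not available in every Python build.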
I convert it to an address and get another address. So basically, if the transaction is valid, that means I was able to sign input one and input zero. I was able to sign both inputs with the private keys associated with those addresses. That means that, at a certain point, I had those two private keys in my possession to sign and create a valid transaction. So I can tell that those two addresses belong to one entity on the network. It could be a person. It could be an automated system. It could be a laundering scheme. It could be anything, but they belong to one entity on the network.

For example, on this slide I have four transactions that I parsed. I found that addresses one and two were used together in a transaction. I found that addresses four, five and six were used together in a transaction, and seven and ten too, and there is also another transaction that uses two, four and seven. So I can associate them like this. That means that the person behind it has addresses one, two, four and seven, because the same addresses were reused in different transactions. By doing this, you're leaking the information that you have one, two, four and seven. You can do this for all the nodes, then merge them together and tell that these are all the addresses of one entity. So basically you can regroup addresses, saying that those addresses belong to one entity on the network. But it's still anonymous, as I still only have addresses. I don't have a real identity. I need external sources for that.

So the goal of the tool is to create a graph of the Bitcoin blockchain. I take every block, take the transactions of every block, create a node using the input addresses of each of those transactions, and merge those nodes together. Finally, I get a graph where a node represents an entity and an edge represents a money flow, meaning money moving from one entity to another.

My assumption, as I said previously, concerns multiple addresses used as inputs of one transaction.
They belong to the same entity. That means there's one special case, a negligible exception: multi-signature transactions. It's the same principle as in a company where you need two persons to sign a check for the check to be valid. It's possible to do that in Bitcoin, but the number of transactions that use this kind of mechanism is negligible. So we just don't crawl those transactions and assume they don't exist, because there are quite few of them on the network.

The other assumption is that people don't know how to use Bitcoin and reuse addresses in multiple transactions. Instead of sending the change to a new address, they send the change back to an already existing address and reuse this address multiple times. That means I can do the merge that I explained on the previous slide.

The tool now, the funny part, because I know you just want to hear a good story, and that's a good story. I came in with a graph problem and said: oh, that will be complicated, it will be a really complex program, I don't know if I'll be able to do it for the NorthSec presentation. In the end, the graph problem took me a day, a day and a half, and the big data problem took me until last week to figure out. It works well in theory. The data is public, so that will be easy, right?
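The merge of addresses that appear together as inputs can be sketched with a union-find (disjoint-set) structure. This is an assumption about one reasonable way to do it, not the tool's actual code; the class and method names are hypothetical. The sample transactions mirror the slide example: inputs (1, 2), (4, 5, 6), (7, 10) and (2, 4, 7).

```python
from collections import defaultdict


class AddressClusters:
    """Union-find over addresses; each root stands for one entity (node)."""

    def __init__(self):
        self.parent = {}

    def find(self, address):
        """Return the representative address of the entity owning `address`."""
        self.parent.setdefault(address, address)
        root = address
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every visited address straight at the root.
        while self.parent[address] != root:
            next_address = self.parent[address]
            self.parent[address] = root
            address = next_address
        return root

    def union(self, first, second):
        """Merge the entities that own `first` and `second`."""
        root_first, root_second = self.find(first), self.find(second)
        if root_first != root_second:
            self.parent[root_second] = root_first

    def add_transaction(self, input_addresses):
        """Apply the heuristic: all inputs of one transaction share an owner."""
        addresses = list(input_addresses)
        for address in addresses:
            self.find(address)  # register addresses seen for the first time
        for other in addresses[1:]:
            self.union(addresses[0], other)

    def entities(self):
        """Return the current clusters as a list of address sets."""
        groups = defaultdict(set)
        for address in list(self.parent):
            groups[self.find(address)].add(address)
        return list(groups.values())


clusters = AddressClusters()
slide_transactions = [
    ["addr1", "addr2"],
    ["addr4", "addr5", "addr6"],
    ["addr7", "addr10"],
    ["addr2", "addr4", "addr7"],  # this one links the three groups together
]
for inputs in slide_transactions:
    clusters.add_transaction(inputs)

for entity in clusters.entities():
    print(sorted(entity))
```

With this structure, each merge only touches the chain of parents of the addresses involved, so no pass over all existing nodes is needed per transaction.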
You know, you have the whole blockchain, you can download it from the internet. It's easy, it's public, and it's only nodes in memory. So that's an array of bytes and integers. That will totally fit in memory, I have more than enough memory. Well, I was wrong, as you can imagine.

Idea number one: get a block from the blockchain, take each transaction of this block, take each input of the transaction, get the referenced output to get the hash160, calculate the Bitcoin address from the hash160, put it in a graph. And finally, watch it run for three months, and flip tables. So basically, the first idea was wrong.

Idea number two, so I started over: get a block, take each transaction of the block, take each input of this transaction, compress the public key found in the input, calculate the address from the public key, because, you know, the hash160 is just a hash of the public key. And wait, those addresses do not exist. All the addresses that I calculated, in fact, never existed. So I did something wrong here. Again, a table was flipped.

Idea number three: the same thing as in idea number two, except that, in fact, before converting it to a hash160 you don't have to compress the public key. And I wrote good code: object-oriented, linked lists, good Python code. Everything was clean, everything was understandable. It was my best 100 lines of code ever, everything was clear and easy to follow. Then I found that the merge routine is linear, and I would still be crawling right now if I hadn't stopped this program. It would have worked, but I wouldn't have been able to present today. So again, the tables. Idea number four.
Yeah, that's something I do when I have problems. So I decided to throw away the clean code. I wrote crappy code with hash maps that was a total mess and impossible to follow. But it worked. So I ran the crawler for two or three hours, got an out-of-memory exception, and said: okay, now I have a big problem. I cannot do it with linked lists and I cannot do it with hash maps. I'm screwed, mostly. So, another table flipped.

Idea number five: realize that you are running 32-bit Python, with a memory limit of 4 GB. Decide to use 64-bit Python. Run it for 16 hours straight and get another out-of-memory exception, and this one was real: I exhausted the RAM of my laptop plus my swap file. So, okay, no problem, I'll use a DB. Every 1,000 blocks I'll just dump into the DB and then merge with the DB, and I'll figure out how to make it work. And my transactional DB just died, mostly. So I was still in a bad position. Finally, I was able to use MongoDB with multiple instances, inserting every 1,000 blocks into the DB, and wait for 24 hours. And finally I was able to make it work.

So the minimum system requirements for running the tool, for the crawling, not the DB: 30 GB of hard drive, 32 GB of RAM, 64-bit Python for sure, 24 hours of free time, and four tables to flip. That's really, really important, or you will not succeed.

So finally, the part that you want: the demo. The tool is like this. That's my tool.
I know, it's the least impressive tool ever. What I will present today: as I told you, I was dumping into a DB, so I have the MongoDB on my computer, and we're going to go through a real use case. This use case is taken from a Radio-Canada story. Basically, they interviewed someone who got CryptoLocker, which locked all her files, and she had to pay to get back the videos of her kids, the classic CryptoLocker stuff. The point is, they didn't blur the address. So as you can see, or maybe not if you're in the back, you can more or less guess the beginning of the address. And since I have the DB, I will be able to find this address again. So we're doing it live.

So I have the DB there. As I said before, I have approximately 70 million entries in the database. I'll try to guess the address. I'm sorry, I need my slides for this. Okay, so the address that we found starts with this. I'm spoiling my presentation a bit, but well. So basically the address starts with 1LG, then N, U, V, something something something. Oh yeah, nice, we found an address. That's really interesting. I crawled this address, and it's associated with a node ID. Now I want to know whether, for example, this bad person has other addresses that he's using. We'll see. So basically, I'll try to query by the node ID.
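The lookups in this demo (a partial address to a full address, an address to its node ID, and a node ID to every address of that node) can be pictured with plain dictionaries standing in for the database. This is only a sketch: the field names `address` and `node_id`, the function names, and the placeholder addresses are all assumptions, not the tool's real schema or data.

```python
from collections import defaultdict

# Placeholder records: one entry per crawled address (field names are assumed).
RECORDS = [
    {"address": "1ExampleAAA", "node_id": 42},
    {"address": "1ExampleBBB", "node_id": 42},
    {"address": "1ExampleCCC", "node_id": 42},
    {"address": "1OtherDDD", "node_id": 7},
]


def build_indexes(records):
    """Index records both ways: address -> node ID and node ID -> addresses."""
    node_of = {}
    addresses_of = defaultdict(set)
    for record in records:
        node_of[record["address"]] = record["node_id"]
        addresses_of[record["node_id"]].add(record["address"])
    return node_of, addresses_of


def find_by_prefix(node_of, prefix):
    """Return every known address that starts with the visible prefix."""
    return sorted(address for address in node_of if address.startswith(prefix))


def other_addresses(address, node_of, addresses_of):
    """Return the other addresses grouped under the same node ID."""
    return sorted(addresses_of[node_of[address]] - {address})


node_of, addresses_of = build_indexes(RECORDS)
matches = find_by_prefix(node_of, "1ExampleA")
print(matches)
print(other_addresses(matches[0], node_of, addresses_of))
```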
So I tell it: get me all the addresses of this node ID. And we get four other addresses associated with this person. Thank you, my groupies. So what we can do now, if we want to know how much money this crypto-locker malware received: what we know now is that the people behind this CryptoLocker also have at least four other addresses. So we go on a website and check each address. Oh, this one received 11 bitcoins. This one, 2 bitcoins. This one, 2.5. And this one, 49. So at a certain point, the addresses of this CryptoLocker received approximately 70 bitcoins, and at the time we did this it was worth approximately 600 US dollars per bitcoin. So they received approximately $150,000 with this campaign, at least.

So what we can do next. For example, I have a real use case from ECEP. They tell me: okay, we found those addresses. In the blog post they said: we discovered a crypto malware that has this address, and the other version that we got has this other address, and we're pretty sure that they are linked together. Well, we can confirm it for real. For example, if we take the first address and look up its node ID, you see that the node ID is 1258538. And the other node ID, as you may think, and you're right, is exactly the same node ID. So we can confirm for real that the two addresses belong to the same person. And now what we want to know is how many addresses this entity possesses. I'm really slow at typing, I'm sorry guys. So we get over 160,000 different addresses related to this node. So basically that's a laundering scheme, or an automated tool, or something. And if we want to go further, we can find out how much money it has, but seriously guys, I'm sorry, I will not do it today, because I have to continue my slides. It's in the what's next. Basically, what's next for my tool is
to release the crappy code, as I told you before, so that you can crawl the network. You can also take the DB from there and continue crawling from that point. The DB that will be available was made on Monday, not this week, the week before, so you can continue crawling from that block and just go further, and you'll get the whole database up to now. So basically, that's all the groups: all the node IDs that each represent a person, and the addresses related to each node ID. What you can do right now is take any address, find something interesting about it on the internet, using Google for example, and then find the other addresses that belong to the same entity.

Version two will be "follow the money". I will crawl the blockchain again, get every transaction, and record the money movements between the nodes. So you'll be able to know how much money moved from one node to another, or how much money was sent into a big laundering scheme, and then whether any amount coming out of this big laundering scheme looks like the amount that was sent in. So we can say: okay, node number one here was sending money into the laundering scheme, and it was sent back out to node number two. So that's probably the same person, because as much money was sent into the laundering scheme as came out and went to this node. That will be in version two, eventually, and it will be available on my GitHub as well.

Version three, eventually, will be a collaborative tool, so that people will be able to add comments on these nodes and say: okay, this node is probably this bad guy, this node is probably this person, this node is probably this system. And you'll be able to de-anonymize the Bitcoin network like this.

So that's it for me. Do we have any questions? Do we have time for questions, or am I busted? I'm pretty