So let's go ahead and introduce ourselves. I'm Tom Holt. I'm an assistant professor at Michigan State University. I run an open-source laboratory there for internet-based research on computer hackers in various communities online, be they deviant, criminal, or otherwise. And I'm a criminologist by training. So my interest is in understanding the social dynamics of behavior, what drives individuals, what motivates them, and sort of subcultural forces and dynamics. I've been looking at hacking for the last five or so years, and just find it fascinating. I'm Max Kilger, and I'm a profiler for the Honeynet Project. I'm a social psychologist by training. So my main interests are motivations, future threats, counterterrorism, stuff like that. And I also run the Spartan Devils Honeynet Project, which Max is a partner of as well. Our other two colleagues who are not able to make it today, Dr. Strumsky and Dr. Smirnova, played a critical role in developing this research. In fact, some of the slides that we'll go through were directly generated by them. I am not a social network analyst by training, so for a lot of what we'll go through, should you have very, very specific questions, if you are a network person, we can certainly try to get as good an answer as possible. So with all of that in mind, let's go ahead and jump into the presentation itself. First and foremost, you go to DEF CON, you're going to hear a lot about the new hotness. You're going to hear about Conficker, you're going to hear about Storm, you're going to hear about the zero-day exploits, all the really funky, sexy, brand-new stuff. But at the same time, there's a wide range of old stuff that's circulating around. If you look at the hacker community regularly, you've probably seen Pinch. There's all kinds of free bots out there. There's crypters. You can still find Sub7 versions floating around, script kiddies downloading anything and everything. So you've got this dynamic of old and new.
Why do we have so much old stuff circulating around? Why do we have new stuff that only a handful of people know about? What really accounts for this relationship? So there's a variety of explanations that can be given. The old stuff works, and some of it works really well, if you download the program and know how to compile it, if it's not totally broken. If you buy it, you can get a lot of that stuff cheap. You can find a crypter for $2 or $3. You can buy various iterations of Pinch in some sort of working format for $20, $30. You can buy all kinds of stuff. Another potential explanation is that the really, really high-skill individuals who have the capacity to program and develop a rootkit that works really, really well are possibly not as high in number as the semi-skilled individuals out there who can run the programs, who can kind of understand how they work technically. Do you have any color to add, Max? No, it's okay. And we're just going to sort of tag-team this throughout, so you're probably going to hear a couple different voices as we go. So it might be a reflection of audience and ability. There's potentially a small triangle at the top of the really great guys and then a wider audience of the folks who can kind of use these tools, and maybe can't. So with that in mind, the question that we're asking is how do we find those really, really high-skill folks? How do we identify the individuals who can create the new tools, who can find the zero-days, who are writing the code to really make this stuff happen? And then where do they sit relative to the population of semi-skilled users, of the individuals who sort of have the ability, and maybe they don't? There's not a lot of social science research looking at the hacker community in this kind of context. We've looked at social ties qualitatively with interviews and things like that, but there's not a lot of predictive research, attempts to model and proactively find individuals and figure out what they're up to.
And so that's our goal. What we've done is to go to all the resources that are out there. There's forums, there's IRC, there's blogs, there's all kinds of stuff that you can go to to mine for information. And so what we've done is focus on the Russian and Eastern European hacker community using data generated from social networking sites, from individual resources where they're blogging, where they're talking about themselves through LiveJournal. And the blogs themselves give you a great deal of information. They do talk to a certain degree about what they're up to, what they're coding. They talk about their social relationships, who their friends are, who they're talking to. And you can get information about where they live, what they feel is important, how they spend their time. So you can understand a lot of the social world of these individuals looking at this kind of data. And this slide gives you a sense of all the various bits of information that are immediately provided in a profile on LiveJournal. They can list their location, their education, be it through an affiliation or where they're directly attending. They give biographical information. They do talk to a certain degree about their emotional state at any given point in time, whether they're angry or they're happy or they're sad, what they're listening to. There are interests, and these can be geographically bound, and that speaks again a little bit to location. We also get a lot of information about friends. And on LiveJournal, there are some distinctive points about social relationships. Friends are the individuals that one person looks at. So a friend is someone that you look to for information. "Also friend of" are the people who read your journal but don't have access to your really sensitive stuff. Mutual friends, and this is the most important point, are the individuals who recognize and read each other. So it's an immediate recognition of we are peers, we are clubs.
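As a rough illustration of the friend and mutual-friend distinction just described, here is a minimal sketch. It assumes friend links can be represented as directed pairs (reader, person being read); the handles and the `mutual_friends` helper are hypothetical and not taken from the study data.

```python
from collections import defaultdict

# Hypothetical directed "friend" links: (reader, person_being_read).
# The handles are invented for illustration; none come from the study.
friend_edges = [
    ("alice", "bob"),
    ("bob", "alice"),
    ("alice", "carol"),
    ("dave", "alice"),
]

def mutual_friends(edges):
    """Return, per user, the set of users whose friend link is reciprocated."""
    follows = defaultdict(set)
    for reader, target in edges:
        follows[reader].add(target)

    mutual = defaultdict(set)
    for reader, targets in follows.items():
        for target in targets:
            # A link is mutual only when the target also reads the reader.
            if reader in follows.get(target, set()):
                mutual[reader].add(target)
    return dict(mutual)

result = mutual_friends(friend_edges)
print(result)
```

In this toy data only alice and bob read each other, so they are the only pair counted as mutual friends; carol and dave appear in one-directional links only.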
And then there's communities, the different groups that people belong to. And actually that brings up an interesting point, which is that what you do with this information really takes you in several different directions. If you're interested in prosecution and pursuit, this is one way that you're going to go after that. But you can also take an entirely different direction, which is basically more from maybe an intel community standpoint, which is where is the community going, what groups are going to coalesce together, what new threats are going to emerge. And so really depending on what you're going to do, this information can serve several purposes. And so for our purposes, we're going to try to understand what the networks look like, what the community looks like in terms of demographics, in terms of peer relationships, etc. So to give you a nice robust example of what a profile looks like, this is an individual named Crash. Crash is a person who was extremely successful and skilled. In 2006 and 2007, he released a bot called SuicideDDoS. It was deemed in the top three by some different communities in terms of active functioning bots. He sold it, he's released a lot of different code. And so to look at his site, you can gather a lot of information. We can see his physical location in the Russian Federation. We can see his email address. We get the teams that he belongs to. So the Hell Knights Crew, rootkits.ru, and then Crash's own website. We get his interests, and you can see there are some good interests here. They're in alphabetical order, so he talks about alternative, he's into beer, botnets, bugs. Cyberterrorism, which is kind of an interesting one to throw out there as important. Drugs, exploits, hacking, programming, ring zero. So he's got a ton of interests, even traditional terrorism. He doesn't give us anything about school, but we see his friends. So we have a sense of his associations.
And to read Crash's blog and to look through the content, you can get some good surprises. Crash isn't here, is he? Oh, good. Okay, hopefully not. Crash, during a marathon programming session of about 48 hours, decided that he was fed up and needed to take a break, and so he took a photo of his home setup, mid-process of writing a rootkit for Linux systems. He says, I've been smoking, I've been drinking, I've been up for 48 hours. I need to take some sleeping pills and just pass out. So he takes this picture beforehand. You can see he's got two laptops. I don't know how clear this might be from the back, but he's got two different boxes running and a tower. He's got OllyDbg on one. He's got another sort of assembler program running. So he's mid-progress. Plus, you can see this giant empty bottle of vodka in the center. So he's drinking, he's coding, he's trying to have fun. 4:27 a.m. and just plowing along, and that was the title of that post. So you can find some really sexy stuff in these blogs. So with that in mind, what we're going to do is walk you through the social dynamics of this community using 364 individuals from eight different groups. These are all individuals with LiveJournal profiles. The groups that we selected are actively involved in the distribution, creation and sale of malicious software, and to a lesser extent stolen data. So the groups include Hackzona, Mazafaka (if you're familiar with any of these communities, some of these might be ones that you've heard of), the Hell Knights Crew, so lots of different stuff. Dr. Smirnova, who's part of the project, is a native Russian, and she was instrumental in helping to translate a lot of this information, given that I understand Russian a little bit, but it's not as good as her ability. We also did a lot of open-source searching, going to Google and looking at each specific handle, plus a variety of keywords, to try to assess the individual and their level of risk. Are they heavily involved?
Are they producing malware? Are they selling it? Are they just posting in hacker sites? Do they just run a security blog? Are we not finding them at all? And so we're using that as a metric of risk. And one of the things I want to make clear here is this is not a random sample, and so the sort of basic inferential statistics that you might otherwise use aren't really applicable here. Understandably, you can't really get a random sample out of the hacker community. It's very difficult. The numbers that you see and the trends that you see, they're trends, but you have to take them with just a tiny grain of salt just because it's not a true random sample. And that's an excellent point. And that kind of ties into some of the threat level assessment as well. What is important for you all to know is how we looked at this data. We coded the individuals from zero to three, or equivalently from one to four. So if you're a zero, we don't have anything about you. If you're a one, that means you're a computer security blogger. You're just talking about what's going on in vulnerabilities and hacks, et cetera. You're not actually participating in attacks yourself. The twos are the moderate individuals who are posting in hacker sites, who have maybe some status or they're registered in a variety of places. And then our high-risk individuals, our threes, are the people who are actively releasing malware, who are administrators of sites, who have specific roles and functions within each group. So Crash would be an example of our three assessment. So with that in mind, here's the data. Here's the different groups that we have. You can actually see that there's a lot of variation in terms of how many members there are in each of the groups. Two of them have over 100 members and there are a couple of them in single digits. So there is a lot of variance. And that's a very important point.
Because when you talk about a community like Damage Lab or Mazafaka or Zloy, these are groups that have very wide and high numbers of registered members. So the reflection that we're getting here might just be a slice of that community. The account variation here as well speaks to the type of LiveJournal account that they have. There's the basic account, there's a paid account, which gives you a little bit more functionality, and then there's plus accounts which give even more. You can set up a donation program where people can pay you or give you something in particular. So there's some variation here in terms of how these individuals have accounts as well. So beyond accounts, now let's talk about risk or threat levels. So we talked already about this. We searched on individual handles, we gathered the information, we compiled it, and we put it together to see what's happening. So this is again the detail. We have zero, one, two, and three. Threes are high risk, twos are low risk, one's a security blogger, zero is no risk. Let's see what this population looks like in a nice pie chart here. So actually what you can see is that if you believe the numbers, you really have to worry about 6.3% of the folks, plus or minus something. So that's actually kind of an interesting thing. Almost 70% you basically don't have to worry about. It's the top 6% where you're going to have problems. And so the twos, which comprise about 19%, are those semi-skilled or at least somewhat active individuals who are looking at the community. And they're larger in terms of percentage than our threes. So again, if you believe the notion that there are fewer high-skill than low-skill individuals, this is reflected here. In terms of communities and risk, this again speaks to some variation. And there is actually a lot of variation here, as you can sort of see. You can look at the mean risk number and look at the standard deviation.
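The two summaries on these slides (the share of people in each risk class, and the mean and standard deviation of risk per group) could be computed along these lines. This is only a sketch under stated assumptions: the records, group labels, and helper names are invented, and only the zero-to-three coding comes from the talk.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev

# Hypothetical (group, risk_code) records. Coding follows the talk:
# 0 = no risk, 1 = security blogger, 2 = moderate, 3 = high risk.
# The values themselves are invented, not the study's data.
records = [
    ("group_a", 0), ("group_a", 0), ("group_a", 2), ("group_a", 3),
    ("group_b", 3), ("group_b", 3),
    ("group_c", 0), ("group_c", 1), ("group_c", 0), ("group_c", 0),
]

def risk_shares(rows):
    """Percentage of individuals in each risk class (the pie-chart view)."""
    counts = Counter(code for _, code in rows)
    total = sum(counts.values())
    return {code: 100.0 * n / total for code, n in sorted(counts.items())}

def group_stats(rows):
    """Mean and population standard deviation of risk code per group."""
    by_group = defaultdict(list)
    for group, code in rows:
        by_group[group].append(code)
    return {g: (mean(codes), pstdev(codes)) for g, codes in by_group.items()}

shares = risk_shares(records)
stats = group_stats(records)
print(shares)
print(stats)
```

Because the sample is not random, numbers like these are descriptive only; a small group with a high mean and a low standard deviation is simply a prompt to look more closely.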
And this is kind of interesting because in the past the hacking communities have been a lot more what you'd call sort of status homogeneous. That is, they had skills that were sort of very similar in nature, and that's how they sort of formed as a group. But what we're finding now, sort of years later, is that there's a lot more variety in the skill levels that we see in the hacking groups, which is a very interesting sort of phenomenon. Yeah, there's some clustering almost, where the high-skilled people are hanging together. So the Hell Knights group, the two guys who are in it are very, very good. Hackzona, there's a lot more variance. Ruhack, and Ruhack is interesting here because they are a LiveJournal-only group. They don't have any other presence outside of this community. There's only nine members, but their risk level is somewhat higher than Damage Lab and Mazafaka. And that's actually really interesting because you might normally just sort of roll over Ruhack and just say, oh, well, we don't have to worry about those guys. But when you look at this sort of mean risk level, what it says is there's probably something else here. Maybe you better take a closer look at these guys. So in terms of risk and group membership, one thing that we noted is that the higher an individual's level of risk, the more groups they belong to. And that, again, might be an important indicator. They're not just in one community, they're participating across communities. Yeah, I think that's actually probably a pretty key indicator. So in terms of journal entries, what we wanted to do is just look at some of the basic relationships in terms of their use of LiveJournal and their threat and group assessments. So this is the average number of journal entries by group. And you can see Ruhack has a very high level of journal entries. But the Hell Knights Crew don't have as many, neither does Kupzu, neither does Zloy. So journal entries point to some kind of relationship.
In terms of risk, however, the relationships are somewhat more distributed. It's not a really good indicator of anything other than just posting. You can see that our risk three folks, the more dangerous folks, are as active in terms of talking as the risk zero. So moving from posting, let's look at friends, at the individuals these people look at. And so in the distribution here, we see the BH crew. And for those of you who don't know, the BH crew, or Bugger Hucker crew, goes out, finds active members who can identify zero-day exploits, has them work on code, and they put out sort of little manuals online about, here's the latest stuff. And so the BH crew have a lot of friends. Damage Lab, the Hell Knights Crew, Zloy don't have nearly as many. So this is somewhat interesting. When we look at friends by risk class, we see that the level three folks are looking at a lot of people. The risk ones are looking at a lot more. The distribution is again somewhat different. So from friends, let's get to the really sexy, meaty part of this: the mutual friends, the people who know each other and talk to each other frequently. We see Mazafaka and Ruhack have a lot of mutual friends. The Hell Knights Crew, Damage Lab, Kupzu, not as many. And this is a somewhat significant point in terms of group variation. Thinking about mutual friends, where do the risk levels come into play? And here you can see that the high-risk folks, risk group three, have more than risk two and more than risk one. And that actually makes pretty good sense because they're a much more sort of cohesive group. They're more highly skilled. They need to trust each other more. They're more center-focused. They're a much more sort of dense social network and less likely to sort of want to associate with outsiders. And as we have lots of people coming in, there's open seats all over the place. I can see a lot of them.
So if you've got a seat next to you, raise your hand real quick so we can get everybody into a chair and not have all the aisles blocked. Cool. So yeah, if you've got an open space, let's get everybody in here. So we've got mutual friends. I'm going to try to continue talking, and I'm going to talk really loud for a few minutes just so I can be heard over the din. So we've got mutual friends. Now let's look at group membership. Group membership, again, is an important indicator because what we're seeing is how many different communities these people belong to. It could be that they're with another forum group like Mazafaka. It could be that they're just tied to a heavy metal group, something like that. So here we've got Hackzona, whose members belong to a lot of different groups. Ruhack, again, there's a pretty high distribution. Given that Ruhack has a small number of people, but higher risk, that's pretty relevant. Yeah, and in fact, what you see is the groups with the higher-risk individuals in them, they tend to have narrower interests. They tend to have less group membership as well. In terms of group membership and risk, we see an interesting pattern here. Our high-risk folks don't belong to as many communities. And that might be an indicator, again, of a certain degree of isolation. They don't want to be as heavily involved perhaps. They want to stay sort of under the radar, and so they're going to stay much less visible by not joining as many groups and not having as many sort of casual associations. So the final metric that we wanted to consider is how long have these groups actually been around? How long have they been established within LiveJournal? The BH crew, Cup Zoo, Zloy, Mazafaka have had a pretty long-term relationship in terms of months, so 50 months or more. That's a pointer about how established these communities are.
When we think about the hacker community, largely it's going to be forums, IRC, but it seems like they're active in some other outlets as well. In terms of months and risk, everything seems to be relatively the same. There's not an easy point of identification or attribution in terms of time. That's kind of interesting because you would think that the people that were highly skilled and have been around for a while, they'd be around longer on LiveJournal, but that's really not the case. It's not a very good predictor. So now that we've covered the basics of LiveJournal, let's look at the population itself. Let's get into some of the demographic relationships. When we think about the hacker community, there's some ideas that are always perpetuated about gender, about education, age, etc. So using the data that we have, we wanted to try to parse out as many demographic relationships as possible. Given that individuals might not necessarily admit certain bits of information about themselves, we went through and extrapolated information based on posts. So in terms of age, someone might get a shout saying happy birthday on a specific date, and so we can use that as a metric to understand age. There are ICQ profiles, for which individuals give their numbers. When you look at the profile, sometimes they say how old they are. In terms of education, assuming that there's a standard trajectory of age within the Russian educational system, we can give some idea about how old an individual might be. And actually, let me just stop for a second. One of the questions we're often asked is, well, if these guys are really skilled and they're trying to stay under the radar and they're under pressure from law enforcement or intel sources, why are they putting this stuff up on the web? We get that question a lot. And a lot of that has to do with social identity. That is, if you're out there in that virtual world and you don't put any information out there, you really don't have a social identity.
You can't really communicate with other people. They don't know who you are. You don't really have an identity. So in order for them to basically exist in that virtual world, they really have to put some information out there so that they have some sort of online identity. That's sort of like issue one. The second issue has to do sort of with commerce. If you're creating malware and trying to sell it, or you're carding and trying to sell it, you're like any other business. If you don't advertise, you don't get customers. And so even though you may be very skilled and you're collecting this information you're going to sell, you have to advertise in order to be able to sell it. So you've got to be out there. So just to give you an example of how we would determine age. One of the users, ZDoZ, or I believe that's the correct pronunciation, said he graduated from high school in 2002. He gave us when he studied at a university. So from 2003 to 2008, he says he's been studying somewhere, preparing for studies. So taking that as a baseline, if we assume most Russians graduate by 16, and he's in university for five years, then he might be around 22 years of age. So this is an estimate. This is one potential. This might not be concrete, but it's at least a rough guess. And so looking at age distribution, this chart points to something extremely relevant. We see in terms of age patterns, they're somewhat similar across groups with one exception. We have nothing about the Hell Knights group. And so, given that these are individuals who are extremely high risk, we don't have any idea about age. And that actually might be a useful clue: when there's not very much information, that suggests that maybe you better hunt harder. These guys might be adept at hiding themselves, but very instrumental. Pardon? There are two members of the Hell Knights Crew in this. So again, both of these individuals are high-risk individuals.
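The age estimate for ZDoZ described earlier is simple arithmetic, and can be written out as a small sketch. The function name and the choice of 2008 as the reference year (the last year of study he lists) are assumptions made for illustration; the graduation age of 16 is the assumption stated in the talk.

```python
def estimate_age(school_grad_year, reference_year, school_grad_age=16):
    """Rough age estimate: assumed graduation age plus years elapsed since then.

    school_grad_age=16 mirrors the talk's assumption about when most
    Russians finish high school; it is a guess, not a verified figure.
    """
    if reference_year < school_grad_year:
        raise ValueError("reference_year must not precede school_grad_year")
    return school_grad_age + (reference_year - school_grad_year)

# High school finished in 2002, university listed through 2008.
estimated = estimate_age(school_grad_year=2002, reference_year=2008)
print(estimated)  # 22
```

As the speakers note, this is a rough guess rather than a concrete value; a different assumed graduation age shifts the result by the same amount.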
So in terms of age and risk, the distribution is somewhat the same. But there is kind of an interesting note, which is that you see there's sort of a slight decline with each class, which sort of makes sense because some of the younger hackers tend to be a bit more adept, pick things up a little faster. And so that's why you see that, except for risk class three, where the highly skilled folks are. And that basically sort of says, if you're going to be highly skilled, that takes some time, experience, and education. And so you have to sort of earn your chops. That's why you're a bit older. So looking at age, we also look at education. About 14% of this sample were actively enrolled in some kind of educational institution. But again, we see some unusual information provided by these folks. So some of them say that they're at the Scientific Research Institute of Sorcery and Wizardry. I want to join that one. So clearly a fake place that's actually tied to a popular book within the Russian community. Fake location. We get information that five individuals are attending Lomonosov Moscow State University, which is a very prestigious institution there. The National Research Nuclear University, Moscow State Technical University, and some of these places do in fact have supercomputing capabilities and very strong engineering and computer programs. So the few people who give us these bits of information appear to be, to some degree, in very high-class educational places. We get a little bit of a typology here as well. Some are studying engineering, mathematics, physics, and then some are at technical institutions. So it's distributed somewhat interestingly. Then we looked at location. And location can be determined based on the place where they talk about going to school, could be determined by ICQ profiles, could be from the communities that they belong to, as there are some geographically bound communities. So it might be pertaining to Ukraine or to St.
Petersburg or something else. Interests, same way. And we can corroborate the information that they provide in terms of a location with some of the descriptions of places and locations that they visit in the journal entries themselves. And this is really actually pretty important for the pursuit and prosecution guys: there's really a tendency for people to eventually disclose where they live or where they're going to school. It will eventually come out. So if you look hard enough, you will find it. And also this points to some trouble with extradition, if we're thinking about how some of these places might work with law enforcement here. Looking just at a general distribution, we have 64% where we can't determine anything about location. So a healthy percentage is hidden. But of the individuals where we can gather information, many of them appear to reside within the Russian Federation and to a lesser extent in Ukraine. And so breaking that out a little bit further, looking at the Russian Federation, we see Moscow, St. Petersburg, and Novosibirsk as the most prominent locations noted. And then the 10% in Ukraine and 2% in Belarus. But location talks about something very, very interesting. The BH crew talks about meetups in person in certain places in Moscow. The BH crew guys know each other really, really well. And if you read the information on the right here: B.H. Crew CQ, the command, reminds that we are seeking the information on unfaithful follower EPXOFF, his nicknames (he gives a variety of nicknames), his approximate IP address, his probable place of work, all resources. A common pool of money, something, the BH crew is putting on the table. All comments are screened, we won't forget you. That actually really points out a very interesting component of the hacking community, which is there are very strong norms within the community and within the groups.
And when you violate that norm, you screw someone over, basically they will come after you until hell freezes over. So it's much stronger than you might find, say, in traditional mainstream society if someone breaks a norm. So looking at locational variation, we just wanted to compare what we have in terms of the Russian Federation against other metrics. So the graph here at the top is information from a variety of studies on Runet use, or in other words, internet usage across Russia. And though our data is not perfect, it does sync up to a degree with their metrics. That sort of gives you some warm and fuzzy feeling. It's not a random sample, but looking at the data, it doesn't seem to skew bizarrely, and so at least you feel a bit better about the results. And we also wanted to compare it against known information in the hacker community. So we took data from the Chaos Constructions Hackaround Conference of 2007, where individual attendees could list their name, their group affiliation, and where they were coming from. And so taking that data, we broke it down into various categories, and you can see the graph here at the top is the Chaos part. The bottom is our information, and again, it syncs up pretty well. So just using hacker metrics, it appears to match up to a degree. So in terms of location, much like education, we get some people giving extreme details, or at least some unusual examples. So Nate N8 says that he lives in Value Towns in Bobway, and that he only speaks Albanian. Kind of unusual. We get another guy, Arcanoid, who talks about going to St. Petersburg. He says that he lives in Moscow. To read his blog, he appears to reside in both places, or at least travel regularly. But just like education, we get some fake places, the Scientific Research Institute of Sorcery and Wizardry. Also, a handful listed Hogwarts. They're sometimes going to block that information. We also got some conflicting information in terms of gender.
Given that there are male or female attributions for words and phrases in Russian, we were able to determine something about gender. And with Mr. Buggers, he's one of the founders of the BH crew. He says that he's female in his ICQ profiles, but everywhere else, he's male. His pronouns appear to be tied to male gender. So it's somewhat mixed. It's hard to really ferret it out. Usually with a bit of sort of semantic analysis, you can sort of figure out pretty clearly and pretty quickly whether they're male or female. But this was very blurred. In terms of gender overall, the women in this sample were about 5% of the total community. So a very small slice of this group. There's variation in their individual use of the blogs. Bob Nielken, for example, just talks about her day-to-day life. Coyote the One talks a lot about programming, but nothing in terms of hacking. And then Thirteen Ya just talks about her friends and what she's up to. One of the interesting side notes, in discussing it with some people, is that females in the hacking community tend to have very strong skills and are very expert. And so they're actually probably more dangerous than the average male. And so to look at the sample, we only had one high-risk, one very skilled individual female, named E7, who was really well connected to the rest of the network. These are some of the programs that she's released on her own site. There's a remote shell program, a couple of rootkit task managers and different mechanisms, and a network scanner. And she's had some other tools that are out there. This is a shot of her LiveJournal page. She's got 521 people that she's looking at, 403 mutual friends. She's a member of 87 communities. She's very densely connected. And there's a picture of her that appears on the pages of a great many of the individuals in this network. So her face is very, very well known.
In terms of LiveJournal, now that we've done some of the demographic stuff, let's look at interests, because individuals can say exactly what they are fascinated by. And so for the average number of interests, just as a sheer volume question, the Hell Knights Crew, even though there's two members, they have the largest number of interests overall. Ruhack, that LiveJournal-only group, is pretty high up there. Damage Lab, Mazafaka, not quite as high. In terms of interests by class, just as an average, our risk three guys talk about the widest number of things. And that generally sort of coincides with the assumption that people in the high-risk categories are generally curious about everything. And so they have a lot of interests, it makes sense. And so given the interests that are out there, we tried to see if we could predict individual risk based on the interests that they described. So we set it up initially with four categories. Malware, just grabbing all the various phrases that are used that fit into this category. Same with operating systems, drugs and alcohol, and assembly language. Those are like the four essentials of hacking, right? So we figured maybe this would be useful. Given that we have such a small number of active individuals, we broke it down into zero, where we have no history of attacks or malware, passive actors, the bloggers, and then active actors. So zero, one, and two. And we tried to predict across these three groups. And to look at the predictive analysis, malware is significant, operating systems are significant, but drugs and alcohol and assembly language are not. Damn. And so the higher threat level individuals, in looking at this, are very, very detailed about what they're interested in, and perhaps this might be part of it. The low-level guys just talk about hacking, computer security, in these broad terms. The higher threat individuals get into botnets, buffer overflows, ring zero, et cetera.
They give a wide range and very specific information. So taking interests and getting relatively little out of that, we thought, well, let's just make a broad computer hacking category. And looking at interest in computer hacking, the Hell Nights crew, 100% hacker interest. Same with Ruhack, they have a pretty high percentage, but some of the other groups are much lower. And then interest in computer hacking by risk, you can see for the risk level three guys, that's of primary importance. The higher the risk, the more interest in hacking, which makes perfect sense. Now, this is a sociogram. We figured we've got this information. Let's try to map out what the relationships look like based on individuals who express an interest in hacking. And so the little dots up here are nodes, or specific individuals. The lines are the connections between these individuals. And the size of the dot reflects an individual's interest. And so the bigger dots, like the two yellow ones at the bottom, are indicative of higher interest in computer hacking. The smaller dots, less interest. And you can see that the groups also seem to cluster around the interest. So they're in clumps in various parts of the sociogram. So it's very good. We used some simple, free social networking software available on the Net, Pajek. Yeah, Pajek. It's actually pretty good stuff. You should go snag it. I think it's by somebody in the Czech Republic. And, oh, actually Polish. Polish? Yeah, Poland. So to look at this, what's really important to note when you examine sociograms is where individuals lie. And what this graph suggests is that individuals are really, really well connected. There's nobody who's isolated. There's no one just sitting out on the periphery. There are lots of dense connections between individuals. So this suggests computer hacking is an interest that ties the group together in quite a number of ways.
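To make the "nobody is isolated" reading of a sociogram concrete, here is a minimal sketch of how that check could be done on a friend list. It assumes ties are undirected pairs of usernames; the function names and data layout are hypothetical, and this is not the software or procedure the speakers used.

```python
def build_graph(nodes, ties):
    """Undirected adjacency sets; every node appears even with no ties."""
    graph = {node: set() for node in nodes}
    for a, b in ties:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def isolates(graph):
    """Nodes with no ties at all, i.e. dots with no lines in the sociogram."""
    return sorted(node for node, neighbours in graph.items() if not neighbours)


def is_connected(graph):
    """True if every node can be reached from every other node."""
    if not graph:
        return True
    start = next(iter(graph))
    seen = {start}
    stack = [start]
    while stack:
        for neighbour in graph[stack.pop()]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(graph)
```

An empty result from `isolates` together with `is_connected` returning true would correspond to the picture described here: one network with no one sitting entirely outside it.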
When we break it out by risk, again, there's density of connections. The green represents unknown or no risk. The yellow is low risk. The orange is moderate risk. And the red is our high-risk individuals. And you can see that there are two clusters of high-risk individuals. And you can see that again and again. And we'll talk just a little bit more about it in a couple of slides or so. And so because of the density of connections, we can say that regardless of their level of risk, people know each other. People are examining one another. Now, taking all the different interests that are out there, what we wanted to try to do is refine our categories a little bit further. So we listed everything out into a table and then split it down into 46 different categories. So alcohol, animals and wildlife, art, et cetera. And we determined low-risk and high-risk. So if somebody's talking about malware, if they're talking about specific forms of hacking, ring zero, et cetera, that's more of a high-risk indicator than, say, heavy metal music, drugs and Bluetooth. So low-risk and high-risk interests. So to look at low-risk interests across groups, HackZona has a lot of low-risk interests. Ruhack, Hell Nights, Mazafucka have much less. When we look at interests by risk level, we see that the risk level two guys have a healthy amount but not as much as our zero risk. And our risk three are much, much lower. The risk three, high-hazard guys are really interested more in hacking stuff, not necessarily music or other stuff. And then to look at high-risk categories, we see the opposite, which is what we would want to find. We see that the Hell Nights crew have a pretty high number of high-risk interests. Mazafucka, Ruhack, a little bit better. But some of the groups do not. When we look at risk by risk, so high-risk interests by risk level, this is a good thing to see. We see a real trend up for the high-risk individuals. Perfectly sensible.
So again, here's another sociogram, size reflecting how many high-risk interests individuals have. And you can see that, again, there's a pretty dense series of connections, but our green guys, our no-risk folks, are very, very tightly coupled, less so with our low-risk and high-risk individuals. That makes sense given what we said about the high-risk people being a very interconnected, strong, dense network. And now to look at risk based on density of connection. So here we're seeing where individuals are in the network based on the connectivity and who they talk to and how. And what's important to note here is where our red individuals are, our high-risk players, and then the distribution of others throughout this network. And what you can see is, once again, that classic split for these high-risk individuals. There's one set in the center there that's a very tightly packed core. They have a lot of connections with each other and also with some of the other hacking groups. And then there's this other set of guys that are off to the left. And they're much more isolated and not quite as connected. And that's very interesting. It's sort of a dichotomy that's come out in our research. And you'll notice that the green, or the no-risk, are distributed throughout, somewhat evenly. And so they're throughout the network, but our higher-risk individuals are closer in to the center. You've got a couple of cliques, here and here, of individuals connected to one another. So that's risk. What we wanted to do next is just get a general sense of social ties. And so this is the entire network, color-coded by group membership. And so we have the BH crew here in the middle. We have Ruhack dead center, well connected to a variety of groups. And then up at the top, we have the Damage Lab guys, some of them very uniquely tying others together.
The Hell Nights crew are up here, and to a degree, they're on the periphery. But we see this clustering of Zloy, of Cup Zoo. So the network is relatively dense. Everybody is tied to a degree to one another. And what you can see is it's very, very dense within each of the groups and much less dense across the groups, which is important. And density is going to tell us something perhaps about how information flows. So because there are a lot of redundant ties here, we might be able to explain why so many things get recycled, perhaps. Right, and it's really important to look at these communications channels because you want to figure out how an exploit is being passed. Where's it going to go to next? How's the information being transmitted? It's going to be transmitted through these social bonds. So, I'm sorry, can you say that a little louder? We are trying to dig down a little bit further. We've just been running through some of the initial predictive and network graphs, so we haven't really dug down into why some of these actors are, to a degree, on the periphery, but we do want to look at that further. Yeah, like these two individuals here acting as primary connectors, yeah, that is relevant, but we also do see some ties to others. So we might argue in social networking research that these couple of individuals here are hubs that do link some groups together, but there are also outside relationships that do matter, so there might be certain individuals who are more prominent within these networks than others. And sort of key communicators across the groups. But we need to look at that a bit further. Beyond network actors, we wanted to look at the strength of ties across groups, so how often and in what way these groups look at one another. And what's really interesting here is that the thick lines indicate stronger ties and the thin lines indicate weaker ties among the different groups.
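The contrast between dense ties within groups and sparser ties across groups can be put into numbers. The sketch below is a hypothetical illustration, assuming each person belongs to exactly one group and ties are undirected. The talk suggests some people belong to several groups, which this simple version does not handle, and it is not the speakers' own calculation.

```python
from itertools import combinations


def densities(group_of, ties):
    """Return (within-group density, across-group density).

    Density is ties observed divided by ties possible among the
    relevant pairs of people. `group_of` maps person -> group name.
    """
    tie_set = {frozenset(tie) for tie in ties}
    within_possible = across_possible = 0
    within_seen = across_seen = 0
    for a, b in combinations(sorted(group_of), 2):
        tied = frozenset((a, b)) in tie_set
        if group_of[a] == group_of[b]:
            within_possible += 1
            within_seen += tied
        else:
            across_possible += 1
            across_seen += tied
    within = within_seen / within_possible if within_possible else 0.0
    across = across_seen / across_possible if across_possible else 0.0
    return within, across
```

A within-group figure much higher than the across-group figure would match what the slide shows: information has many redundant paths inside a group and comparatively few bridges between groups.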
And what you see is Damage Lab at the top, which is reasonably strongly tied to most if not all of the other groups. And one of the things that's interesting is if you're a pursuit and prosecution guy and you're having a tough time getting into one of the groups, what you do is, well, let's take a look at these ties. Let's see another group. Oh, look, there's a strong tie between Hell Nights and Damage Lab, so I'll go find somebody in Damage Lab, I'll compromise them, and they'll lead me to the Hell Nights guy. So you can kind of see a triumvirate up here of Damage Lab, Cup Zoo, and the Hell Nights, and then HackZona having some pretty strong relationships. Everything on the left is a little bit less well connected than those on the right, and that's an important point. A couple of final slides in terms of popularity. These are mutual friends and the number of mutual friends that a person has. So the size of a node is indicative of how many mutual friends they have. So if you're big, you've got a lot of mutual friends. If you're small, you don't have as many. And so here, this is just by network group, and we're seeing in the middle here, the blue, we have Mazafucka, the reds are the Ruhack individuals who are relatively dead center, and then Zloy over here. So we have some clustering. There are some individuals from the BH crew who are spread throughout the groups. We have the Damage Lab guys up here on the left. So there's some distribution that varies by group membership. And more important than anything else, we think, is where do these high-risk actors sit and what's their popularity like? And so with the color coding here, the fours are high-risk people. You can see they're relatively centered in the middle of this network. They're tightly clustered, followed by the moderate-risk individuals, although we have a couple on the periphery. And then the no-risks are spread throughout.
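The mutual-friend measure used for node size here can be illustrated with a short sketch. It assumes friend lists are directed (each user lists whom they follow) and counts a friend as mutual only when the listing goes both ways; the function name and data layout are hypothetical.

```python
def mutual_friend_counts(friends_of):
    """Count reciprocated friendships per user.

    `friends_of` maps each user to the set of users they list as friends.
    A friend is mutual only if that friend lists the user back.
    """
    counts = {}
    for user, listed in friends_of.items():
        counts[user] = sum(
            1 for other in listed if user in friends_of.get(other, set())
        )
    return counts
```

In a sociogram drawn from this, the count would drive the size of each dot, so a large node is someone many people both follow and are followed by.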
And once again, you see that dichotomy for the high-risk group, which is there's a central core sitting right there in the center, and then these folks sitting way out on the side, more isolated. So once again, it's the same evidence that points to a persistent characteristic of the group. And so to look at this as a function of connectivity, the community itself appears to be looking toward these high-risk people. They're well connected. They have pretty high levels of mutual friends relative to some of the others. And that's important because when we look at all of this data as a whole, we can draw a couple of conclusions. We can offer a few ideas about what this means for the broader hacker community, particularly this Russian community here. First and foremost, looking at blogs and social network sites can give you a massive amount of information, a lot to parse through. The problem, however, is that you've really got to drill down into this to actually find things of value, to find reasonable metrics in terms of risk and otherwise. Next, the Russian hacker community, just in terms of demographics within this population, appears to be in Moscow, St. Petersburg, and a few other places. As for the gender and education levels that we see, there's not a lot of data about their higher institutions, et cetera. Perhaps these are individuals who've learned on their own. Maybe they have gone to a technical school. Maybe they haven't. Additionally, there's a small number of high-threat hackers in this particular network, given that we're looking at a wide variety of communities here. Perhaps this might give some support to the notion that, relatively speaking, there's a smaller number of high-risk actors than there are low-risk within this community, within the larger community. And the redundancy that we see, the heavy ties across a lot of these groups, might be an indication of why certain things filter throughout the community more heavily.
Why there are multiple iterations of Pinch and lots of different releases of the same tool. Recycling happens a lot. Primary innovation might not be as common in tools. So the Confickers, the Storms, et cetera, are going to happen. But secondary innovation, taking the stuff that's older and tweaking it in a variety of different ways, might happen more heavily within this particular network. Finally, given the high centrality of the high-risk individuals, the community knows who's important, knows who to look at, and they're getting some information from those high-risk folks. The moderate-risk people are on the periphery, but are to a degree looking at them as well. Our group membership information might not be as strong a predictor as we initially hoped for, given the significant value of, say, the Ruhack group. Which really comes to an interesting observation, which is, years and years ago, the community was a very strong meritocracy, where your status really was very tightly tied to the skills that you had in networking, hacking, computer operating systems, things like that. And what we're seeing is it's not quite the meritocracy it used to be, and we're seeing a lot more variety of skill sets inside a particular hacking group, which is a shift from years before. And if we're going to take anything away from this in terms of do interests matter, are there any key indicators to help us get at high-risk folks? First and foremost, general interests aren't all that helpful. Talking about drugs, talking about beer, lots and lots of people are doing that. But very specific interests are to a degree helpful. They can give us some indicators of who's going to be significant. Belonging to multiple groups is also going to be an indicator of risk. And centrality, even if you don't know who is a high threat, looking at the networks as a whole might help to indicate who's important.
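One simple way to act on the centrality point, when nobody in the network has been labeled yet, is to rank people by degree centrality. This is a sketch under stated assumptions: the talk does not say which centrality measure was used, and degree centrality (share of other people a node is directly tied to) is chosen here only because it is the simplest. The function names are hypothetical.

```python
def degree_centrality(ties):
    """Fraction of the other nodes each node is directly tied to.

    `ties` is a list of undirected (a, b) pairs.
    """
    neighbours = {}
    for a, b in ties:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbours.items()}


def top_candidates(ties, k):
    """The k most central nodes, ties in score broken alphabetically."""
    scores = degree_centrality(ties)
    return sorted(scores, key=lambda node: (-scores[node], node))[:k]
```

A ranking like this is only a starting point for prioritizing who to look at first. As the limitations discussed in the talk make clear, a quiet node with few visible ties is not necessarily a harmless one.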
Because that really addresses one of the really big problems, which is there are a lot of people out there to sift through. And if you have to sift through them all, you're not going to be able to do it. You really have to prioritize and concentrate on what look like the highest-probability targets. All of this, however, is tempered to a degree by some of the limitations that we have. This is just a portion of the actors that are out there, given the thousands of people who appear to be enrolled in all these different forums and websites. There might also be some self-selection, some bias here in terms of LiveJournal. Also, the no-risk individuals, or the zeros within this network, might actually be high-risk players that are operating very, very well below the radar. Right, so that big chunk that you saw, the 70% of these no-risk unknowns, probably 1 or 2% of them are the nastiest folks there. And they're really under the radar, and there's no information. And so if you're operating on a site that's got robots.txt set to keep search engines out and you're really heavily involved in some nasty stuff, we're not going to find you just through a basic Google search. The other point to make is that we really need to mine through this data even further. We want to try to improve our predictive model if we can at all, and we want to try to gather some more data, if possible. And the last thought I'd like to leave you with is basically every human behavior leaves an evidence trail. You might not have the sensitivity to pick it up, but it's there. And if you look hard enough, you may find it. So with all of that having been said, if you've got any questions, if you've got any comments, you all are awesome for having come to this talk, and we really, really appreciate it. We thank you all for showing up. If you have any questions, you can hit me on email.