So last time we talked about the scientific method applied to digital investigations. Today we're going to talk specifically about digital forensic investigation processes. They are a little bit different from traditional investigation processes, whereas the scientific method applies to any type of investigation you're trying to do. So, whenever we're thinking about investigation procedure, we have to remember that every case is different. A standard operating procedure for each case type is difficult to create because certain types of cases will be different from others. For example, child exploitation and hacking cases will leave very different traces, and the investigation processes and data sources that we'll be looking at will be very different. So, every case is different. That means that making specific rules for every type of case becomes a bit of a problem. And that's where your expertise comes in: knowing how to investigate certain types of cases.
Procedure will be based on the laws in your jurisdiction. Every jurisdiction, every country, will have different approaches or acceptable processes, but for digital investigations, because computers are basically the same everywhere, most countries are very similar in the procedures they accept. Standards of evidence are usually driven by the requirements of judges, like we talked about before. The forms and formats of the evidence that you can submit to court may have different requirements in different jurisdictions. So just be aware of what your country does and how your country wants the data to be presented to courts.
Many investigation methodologies have been proposed. The one we're going to be talking about today is from the Digital Forensic Research Workshop, and it was proposed in 2001.
It was essentially the first time that digital forensic science was actually being defined by the academic community as well as practitioners. It's one of the older, more well-accepted methodologies, but there are a lot out there and they're all quite similar, just with different specific features.
So a very basic procedure for digital investigations is, first off, acquire data without altering or damaging the original. We want to make copies of the data from the suspect system without changing anything at all. Now, this is pretty easy from an investigation point of view because computers are very good at copying data exactly. So the first thing we have to do is make a copy, or duplicate, of the suspect's data and verify that it actually is the same. I'll talk about how to make that verification later. Second, verify that recovered data is the same as the original. We work with copies of the data instead of the originals, and when we do our analyses and extract some information, we can verify that the data is actually the same as what is on the suspect system. Again, this goes back to auditing and third-party investigators verifying what you've actually done and whether you handled the data properly or not. Third, analyze the data without modifying it. During analysis we are interacting with the copy of the suspect's data, and we don't want to modify either the suspect data or the copy. Our analysis procedures might change information, and we want to make sure that doesn't happen. And finally, clearly report our findings. Most countries accept at least a general version of this: acquire the data without altering it, verify that the recovered data is the same as the original, analyze the data without modifying it, and clearly report your findings.
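One common way to check that a copy is the same as the original is to compare cryptographic hashes of the two. The following is a minimal Python sketch, assuming SHA-256 and ordinary files standing in for a disk image. The function names `sha256_of` and `copy_matches` are hypothetical and chosen only for illustration; this is not the procedure of any particular forensic tool.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    # Open in binary read mode; this code never writes to the file.
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_matches(original_path, copy_path):
    """Return True if both files produce the same SHA-256 digest."""
    return sha256_of(original_path) == sha256_of(copy_path)
```

If the two digests match, the copy can be treated as identical in content to the original. Recording the digest at acquisition time also lets a third party repeat the check later.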
So for the Digital Forensic Research Workshop investigation procedure, the first step is identification. Identify, first off, that a crime or event has actually taken place. And this is very difficult for investigators. Most types of crimes are actually reported by victims. Police do detect quite a few different types of crimes, online and offline, and a patrol officer might walk by and see a crime taking place. But they only detect a very small portion themselves; victims come and complain to police much more often. However, not all victims make complaints, and not all people know that they were victimized in some way. So investigators don't see as many crimes as there are. We actually see a very small amount. So during the identification phase we need to figure out, first off, that a crime has actually taken place. It might come to us as a complaint. For digital systems, we might have an anomaly detection system or an intrusion detection system tell us that something has happened. And a lot of investigations start through auditing: auditors find that some money went missing or something like that, and investigators start their case at that point. So first, we have to identify that some crime has actually taken place. That's exactly like traditional investigations.
Next is preservation. We need to preserve the digital crime scene. So think about the computer itself as the crime scene. The crime took place either using the computer or, we can kind of say, within the computer. So we want to make sure that we preserve that data just like a picture. We take a snapshot of the state of that computer, and that snapshot is what we will actually do our analysis on. So we have to think about a lot of different things. Is the computer currently on? Is the device currently on? While computers and devices are on, they're constantly changing.
So how can we best preserve the data in a system that's constantly changing? There are a couple of different techniques that we'll talk about when we get to live data forensics. How do we preserve case-relevant data? How do we identify what data is most likely to be case-relevant? That goes back to information gathering and hypothesis testing, essentially, to find out what data is going to be related to our hypothesis. And then we also think about who is taking custody of physical or digital artifacts. Again, this is where chain of custody starts, like we talked about before. Once I collect this device, I have to record, first off, that I had possession of it, and I need to know who had possession of this device every step of the way until we get to court. So preservation is making sure that we're actually collecting the data and ensuring that it's not changing through the entire investigation process.
Next is collection: collecting both the suspect hardware and the data. There are a lot of devices now, so we want to make sure that we're actually collecting the hardware and the data and that all of those things are complete. We also have to think about whether, if we're collecting physical devices like phones, tablets, and computers, we actually have legal authority to collect those devices. If we walk into somebody's house and we just take their computer, we're stealing. Even if you're a police officer and you do that, you're stealing unless you have authority to actually collect that device. So think about who gives you that authority. It's usually something like prosecution, judges, etc. Normally it's given by some sort of warrant, and the warrant essentially sets our scope of what we can take. So next is, what is the scope of the warrant? Are we actually allowed to take mobile devices or just computers? Can we take the mice?
Can we take the keyboard? What is the scope of the warrant, and how are we limited by that warrant? That depends on how you actually write the warrant itself.
Volatile data. Volatile data has become much more relevant within the last, let's say, 10 years. We now have very good ideas about how to collect things like RAM or encrypted hard drives while the computer is on. So how do we actually collect the data from a computer that's on when that data is changing constantly? And how do we ensure that we can verify the data is the same as the original if it's constantly changing? We'll talk about that. Static data, or post-mortem acquisition, is much more straightforward because the data is not changing. We just make copies of data that's not changing. It's much easier. We'll talk about that as well.
And data reduction. Computers and devices now have a lot of storage space, which means they have a lot of data on them. We need very fast, efficient ways to filter out the data that's not relevant to our case so we can focus on only the information that is relevant.
During the collection phase, we're also labeling and documenting all hardware, cables, and connectors. I'll link to more information about how we can do that labeling. And we want to verify collected data, and we'll talk more about this as well. Whenever we're verifying collected data, we want to have a witness document the actions of the evidence custodian. So whoever's actually collecting the physical computer or the device, someone should be documenting what they're doing, why they're doing it, what time they're doing it, all of that information. We can verify collected data using something called hashing, which we'll do in a lab very soon.
Next is examination. Once we've collected the data, we've acquired it, and we have copies of the original, we want to examine the collected hardware and data.
We do not want to modify any of the original. Again, we want to verify that the data is exactly the same as the suspect's. And we want to extract, or let's say interpret, all of the data and get it into information form. So, like we talked about in the first lecture this week, we're converting the data into human-readable information that we can actually learn something from and draw some conclusion from. A lot of that translation process can be done automatically with forensic tools, but not all of it. So we also need to understand how to do some of those things manually as well. Some investigators like to use examination checklists to make sure that they've covered the main topics during investigations, especially newer investigators. Say you're going to make a checklist for child exploitation cases, for example: what things should we be looking at? My recommendations for checklists are these. First off, don't make checklists too specific because, again, every case will be a little bit different. Next, don't write on the checklists, because then they can be considered case notes and may be admissible. We don't necessarily want our checklists to be admitted in court. Otherwise they might question why we did this and not something else. And maintain chain of custody, like we've talked about before: knowing where the data is and who had access to the data the whole time. That needs to be very well documented, and we will also talk about that in a second. The examination phase requires preprocessing to get the data into a manageable form, or a form that we can actually understand. We use a lot of filtering techniques. Out of all of the data, we're trying to filter out everything that's not relevant to us and focus only on the things that we're interested in when we're proving or supporting a hypothesis. We use a lot of pattern matching.
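As a rough illustration of that kind of pattern matching, here is a minimal Python sketch that keeps lines of text containing a wanted keyword and skips lines that also contain an excluded word. The function name `keyword_hits` and the sample words are hypothetical, and the sketch assumes already-decoded text; real forensic tools also search raw data and multiple encodings, which this does not attempt.

```python
import re


def keyword_hits(lines, include, exclude=()):
    """Return (line_number, line) pairs that contain at least one include
    word as a whole word and none of the exclude words (case-insensitive).
    Assumes `include` is a non-empty list of words."""

    def compile_words(words):
        # Escape each word so it is matched literally, not as a regex.
        alternatives = "|".join(re.escape(word) for word in words)
        return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)

    include_re = compile_words(include)
    exclude_re = compile_words(exclude) if exclude else None

    hits = []
    for number, line in enumerate(lines, start=1):
        if not include_re.search(line):
            continue
        if exclude_re and exclude_re.search(line):
            continue
        hits.append((number, line))
    return hits


sample = [
    "Transfer the invoice today",
    "invoice template (sample)",
    "nothing relevant here",
]
for number, line in keyword_hits(sample, ["invoice"], ["sample"]):
    print(number, line)
```

Keeping the line number with each hit matters for reporting, because another analyst needs to be able to find the same hit in the same place.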
So searching for words: we want to find certain words but not other words, right? So we need to do a lot of pattern matching, and we'll talk about how to do that. We do a lot of hidden data discovery and a lot of hidden data extraction or deleted data extraction. For digital investigators, common tasks are keyword searching, recovering deleted information, and analysis of the information that already exists. So basically parsing all of this out and turning it into information that we can use to prove different hypotheses.
Once we've examined the data and converted it into a form of information that we can understand, we need to analyze it. This is where the actual human brain comes in. We need to figure out what this information actually means in terms of the question that we're asking. Does it support or refute some hypothesis that I'm investigating? So we analyze the extracted information and figure out what it means in terms of the case that we're looking at. How does this information relate to given hypotheses, and how does it relate to the overall question being asked? Again, context is very important. If I have one small piece of information and I don't have the context in which it was found, I might come to false conclusions. So we need to understand not only this particular piece of information but everything else related to it as well. And like we talked about before, testing one hypothesis will almost certainly lead to asking more questions. Once you find something, you're going to be asking a lot more questions along the way, and that's the investigation procedure. So data must be analyzed in context, like I talked about. Think about an example where we look at all the images on a suspect's computer and we find maps of Seoul. We find pictures of guns. We find pictures and maybe instructions on how to make bombs, things like that.
What kind of a conclusion are you going to come to? If you find Seoul and guns and bombs, you're probably going to think, okay, this is probably some terrorist plot. Whoever owns this computer is trying to blow something up or do something horrible. Or maybe they were just looking at the news and it was talking about terrorism, and they're also interested in visiting Seoul. So if we don't have the proper context, we can jump to conclusions. And it's very easy to jump to conclusions like that, just looking at those things right next to each other, when actually maybe it was one or two news stories that they were reading. So we have to think about the computer as a whole, not only focus on one particular piece of information.
So, analysis methods. There are basically three types, and this is also true for traditional investigations. The first is relational analysis: relating people, places, and things, and also link analysis. Basically trying to link multiple people together into networks to figure out how these people are connected. How is this person connected to a phone number? What numbers did that phone number call? Things like that, trying to relate things together. This is a very powerful method for giving people a lot of information relatively easily, because we all quite easily understand relations between people and objects. So it's also very easy to show to a jury or a court. Functional analysis is about how a system or application works and how it was configured. We use this a lot for things like malware analysis. We want to understand how this virus works. What was it doing? What are the functions that it could have performed? So functional analysis is basically how things work. Then temporal analysis: looking at the timeline to identify patterns or gaps in time. This is actually one of the ones that I use very, very often.
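To make the idea of gaps in a timeline concrete, here is a minimal Python sketch that sorts event timestamps and reports any stretch longer than a chosen threshold with no recorded activity. The function name `find_gaps`, the one-hour threshold, and the sample timestamps are all made up for illustration; in a real case the timestamps would come from file system and log metadata.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, threshold):
    """Return (start, end) pairs where two consecutive events are more
    than `threshold` apart."""
    ordered = sorted(timestamps)
    gaps = []
    # Pair each event with the one that follows it in time.
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > threshold:
            gaps.append((earlier, later))
    return gaps


# Hypothetical event times, deliberately out of order.
events = [
    datetime(2020, 1, 1, 13, 30),
    datetime(2020, 1, 1, 9, 0),
    datetime(2020, 1, 1, 9, 5),
    datetime(2020, 1, 1, 13, 45),
]
for start, end in find_gaps(events, timedelta(hours=1)):
    print(f"No activity between {start} and {end}")
```

A gap on its own proves nothing. It is only a prompt to ask what happened in that period, and the answer still has to be interpreted in context.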
If we look at the whole timeline of the computer, we can find out when people were the most active and what they were doing at certain times, and eventually you can start to recognize when people have been modifying times to try to make it look like they were doing other things.
And then finally, our next phase is presentation. Results must be communicated clearly. Any time we're talking about investigation, no matter how good you are at investigating, if you don't present your results well, no one will understand you. So presentation is by far the most important phase. It's also potentially the most boring, but we still have to do it and we have to do it very, very well. We want to summarize the key results first in some sort of executive summary. We then restate the details of the investigation later in the case report. The report should be very comprehensive about all actions taken, what information you found, and how you think that information relates to the actual case. There should be a factual basis for all conclusions, and we should show support for our conclusions and present evidence as much as possible. Use the evidence to guide the reader to some conclusion. So whatever evidence you have, just show it in terms of a kind of story and let the person come to their own conclusions about the evidence you're presenting. Usually they'll come to the same conclusion as you. Ensure your documentation is detailed enough that another forensic analyst could do the analysis and come to the same conclusions as you. They should be able to follow the investigation procedure that you've done and come to exactly the same results. A visual timeline or link analysis can help demonstrate the results. Showing these links between people and objects and credit cards or phone numbers or whatever very visually will help people understand much more easily than trying to explain it.
And then finally, the decision process.
Once we've actually made our presentation, we've gone through this whole collection, analysis, all of this process, and presented it to our bosses, it's usually our bosses that have to make some sort of decision about whether we need more investigation or whether our investigation is essentially finished. If it's enough for courts, if they think they can get either a prosecution or that the suspect is potentially not guilty based on our analysis, then it's enough. Otherwise, we need to do more investigation. So that's it for today. Thank you very much.