In this set of slides, we will talk about data exfiltration in advanced persistent threats (APTs). We mentioned when discussing APTs that one of their goals is espionage. If you recall the APT lifecycle, once the APT has gained access and established its foothold in the target network, it will actively gather and extract information of interest. Such information can include intellectual property, customer information, and state secrets.
Data exfiltration in APTs often follows a structured approach, which makes sense given the sensitive nature and the value of the target information. Data exfiltration goes through all or some of the following stages. First, the APT establishes access to staging servers, that is, servers inside and outside the target network where the gathered information can be temporarily placed. Next, once the information of interest is available, it is copied to the internal staging servers. Once enough information has been gathered, it is aggregated, compressed, and encrypted. At this point, the data is ready for the actual exfiltration, which entails copying it from the internal staging servers to the external ones. Finally, the data is removed from the staging servers to cover the tracks of the attack.
There are several methods for exfiltrating data. These methods can generally be categorized in two orthogonal ways. First, the channel can be either overt or covert. Overt channels are channels that other computers also use for transferring data between locations. HTTP downloads and FTP are some examples. An overt channel is likely to be monitored, but on the other hand, the advantage of using an existing channel is that the target network already allows its use. In a covert channel, information is intentionally hidden, for example by embedding it into another protocol, as happens in tunneling.
Second, both overt and covert channels can be encrypted to add an additional layer of protection against detection.
Let's now have a look at one example of a covert channel, namely DNS tunneling. Tunneling is the operation of embedding a protocol inside a different protocol. An example is tunneling IPv6 traffic over IPv4, that is, encapsulating IPv6 packets into IPv4 packets. In this case, tunneling is used to enable connectivity when an end-to-end IPv6 path does not exist. Tunneling is therefore a transmission mechanism and not malicious per se.
In the case of DNS, data can be embedded into DNS queries. DNS is a good candidate for data exfiltration, or for tunneling more in general, because in most cases it is not heavily monitored from a security perspective. A large number of DNS queries leave the target network every day as part of the name resolution process, so it is easy to sneak in some additional packets. Also, DNS tunneling does not mean using crafted packets with destination port 53. Rather, exfiltration using DNS makes use of the regular name resolution process, in practice behaving like regular DNS traffic.
Let's have a look at how DNS tunneling works and how we can exfiltrate data over it. At a higher level of abstraction, a tunnel allows communication between the client-side tunnel endpoint and the server-side tunnel endpoint. Using the tunnel, data that is input at the client side can be received at the server side. Under the hood, the server-side endpoint runs a DNS server that is authoritative for a forged domain. The client-side endpoint embeds data in a DNS query by manipulating the subdomain part of the queried name. Through the name resolution process, this query is delivered to the server-side endpoint, which extracts the data and uses it.
Let's look at another example of data exfiltration: steganography. Steganography is defined in general as the practice of concealing information within a different message.
In cybersecurity, this could mean hiding information inside a file, image, or video, for example. Steganography is different from cryptography because cryptography aims at hiding the meaning of the message, while steganography hides the message itself. Of course, steganography and cryptography can be combined to make the detection of hidden channels more complex.
The question is then where a message can be hidden. Steganography allows an attacker to be creative, and a message or part of it can be hidden essentially everywhere: in unused protocol fields, in redundant space in files, among the frames of a video or audio stream. Let's have a look at this last example.
Voice over IP (VoIP) has been used in several ways for implementing a covert channel. The overall idea is to embed data into a VoIP stream in such a way that the overall perceived performance is still good enough for the user not to notice that something is going on. There are several reasons for choosing VoIP. For example, a VoIP session uses several protocols, so there are more options for hiding information and a higher chance of remaining undetected. Also, VoIP can achieve a relatively high steganographic bandwidth, that is, the bandwidth at which you can exfiltrate data. For example, with a voice stream of 50 packets per second, you can achieve a bandwidth of 50 bits per second just by adding one bit of information to each packet.
Researchers have identified several techniques for VoIP steganography. Steganophony refers to adding additional data to the voice payload, taking care that the overall quality is still acceptable. Also, as we mentioned already, data can be added to unused fields in VoIP-related protocols. Audio packets can be intentionally delayed, since packets that arrive late will be discarded by real-time multimedia protocols, but they can still be picked up by the covert channel endpoint.
Data can also be exfiltrated by masquerading it as VoIP traffic, by converting data octets into audio tones and faking a call. Finally, the HICCUPS technique inserts malformed packets into the VoIP stream, leveraging the fact that those will be dropped at the destination, but again, they can be intercepted by the covert channel endpoint.
As for APTs in general, it can be fairly complex to detect data exfiltration, mostly given the multitude of options an attacker can choose from to exfiltrate data. In any case, data exfiltration detection is one of the steps that benefits most from a shift of focus, namely from detecting incoming malicious activity to detecting outgoing malicious activity. A large spectrum of techniques can be applied when detecting data exfiltration. Network monitoring and host operation monitoring can identify anomalous data flows or hosts performing suspicious operations, like frequent encryption and compression of data. Digital watermarking can, for example, help in tracking data for forensic purposes.
With respect to mitigation, too, several techniques are possible. At the base of all of them, however, there is always the need to properly define security policies about who can access what data, and whether and how data should be encrypted. Some systems proactively deny access to data, or refuse to perform some operations, if suspicious activities are identified. This could be the case, for example, of a mail server refusing to forward a suspicious attachment. Another example of mitigation is implementing self-protecting data, namely an additional software layer that implements corporate security policies and ensures that the data is accessible only to authorized entities.
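To make the DNS tunneling mechanism described earlier and the idea of monitoring outgoing traffic a bit more concrete, here is a minimal Python sketch. It only builds and inspects query names as strings and sends no network traffic. Everything in it is an assumption made for illustration and not taken from any specific tool: the domain `tunnel.example.com`, the `<sequence>.<data>.<domain>` name layout, the choice of Base32, the function names, and the length and entropy thresholds used by the detector.

```python
import base64
import math
from collections import Counter

MAX_LABEL = 63  # maximum length of a single DNS label (RFC 1035)


def to_query_names(data: bytes, domain: str, chunk_size: int = 30) -> list[str]:
    """Client side: embed data in the subdomain part of query names.

    Assumed layout (hypothetical): <sequence>.<base32 data>.<domain>
    """
    names = []
    for seq, start in enumerate(range(0, len(data), chunk_size)):
        chunk = data[start:start + chunk_size]
        # Base32 only produces letters and digits, which are valid in host names.
        label = base64.b32encode(chunk).decode("ascii").rstrip("=").lower()
        if len(label) > MAX_LABEL:
            raise ValueError("chunk too large for a single DNS label")
        names.append(f"{seq}.{label}.{domain}")
    return names


def decode_query_names(names: list[str], domain: str) -> bytes:
    """Server side: the authoritative server recovers the data from the names."""
    suffix = "." + domain
    parts = {}
    for name in names:
        if not name.endswith(suffix):
            continue  # not a query for the tunnel domain
        seq, _, label = name[:-len(suffix)].partition(".")
        padded = label.upper() + "=" * (-len(label) % 8)  # restore Base32 padding
        parts[int(seq)] = base64.b32decode(padded)
    return b"".join(parts[k] for k in sorted(parts))


def shannon_entropy(text: str) -> float:
    """Entropy in bits per character of the empirical character distribution."""
    counts = Counter(text)
    total = len(text)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def looks_suspicious(name: str, min_len: int = 40, min_entropy: float = 3.5) -> bool:
    """Monitoring side: flag names whose longest label is long and random-looking.

    The thresholds are illustrative guesses, not tuned values.
    """
    longest = max(name.split("."), key=len)
    return len(longest) >= min_len and shannon_entropy(longest) >= min_entropy
```

For example, `to_query_names(payload, "tunnel.example.com")` returns a list of names that `decode_query_names` turns back into `payload`, while an ordinary name such as `www.example.com` is not flagged by `looks_suspicious`. A heuristic like this is only a starting point: an attacker can use shorter labels or lower-entropy encodings at the cost of bandwidth, which is one reason detection is usually combined with volume and frequency monitoring.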