APT Detection Indicators – Part 3: Command & Control Channels

APT Detection Indicators – Part 1
APT Detection Indicators – Part 2

When securing a network, most organizations are more concerned with controlling inbound traffic than outbound traffic. However, outbound traffic is a significant risk: malware and targeted attackers use it as a channel for Command and Control (C&C) as well as data exfiltration.

Understanding C&C and C&C channels is critical to effectively detect, contain, analyze, and remediate targeted malware incidents. Malware allows attackers to remotely control infected computers via C&C channels. These activities pose a threat to organizations and can be mitigated by detecting and disrupting C&C channels on the network.

This APT Detection Indicators – Part 3 blog covers the following:

  • Risks associated with Outbound Traffic
  • Typical Command and Control Channels
  • Techniques used to circumvent controls
  • Methods for detecting and preventing evasion techniques

There is no way to eliminate all risk associated with outbound traffic short of closing all ports, since attackers are very creative in hiding their activities, testing for available protocols to tunnel, and leveraging various obfuscation techniques. However, a good understanding of the techniques and risks should enable organizations to detect abnormalities (also see: APT Anomaly Detection) and make informed decisions on improving and fine-tuning egress policy.

It is vital to practice heightened operational awareness around critical data and assets. Organizations should segment and wrap critical data within the deeper protection of well-monitored infrastructure (also see Adaptive Zone Defense). In addition, layered defensive tactics (multiple layers and means of defense) can prevent security breaches and buy an organization time to detect and respond to an attack, reducing the consequences of a breach.

A Recap on Malware

Malicious software, also known as malware, has existed for almost as long as computers have been around. A lot of effort has been put into stopping malware over the years, but it remains a growing pandemic: every day, a huge amount of new malware is released.

Command and Control Channel Establishment

Botnets

Botnets consist of computers infected with malware; these infected machines are called bots. The bots connect to a C&C infrastructure to form a bot network, or botnet. The C&C infrastructure allows the attacker to control the bots connected to it. Bots can be instructed to steal user data, financial credentials, or credit card details from the infected computers. A large group of bots can be used to perform a Distributed Denial of Service (DDoS) attack and bring down a server. Criminals also sell bot access to other criminals.

Targeted Attacks

In the case of a targeted attack the attacker wants to infect a specific organization. This is quite different from the regular botnets described above, where the criminal is not interested in which machines they infect. The goal of a targeted attack can be to steal certain data from the target or sabotage target systems.

This is achieved by infecting one or just a few computers with malware which contacts a C&C server. The C&C server allows the attacker to remotely control the infected computers. The control functionality can be used to infect other computers or search for documents the attacker is interested in. After the data of interest has been found the attacker gives instructions to exfiltrate the data. The exfiltration usually happens via a channel separate from the C&C channel.

Detecting targeted attacks is much harder than detecting untargeted attacks. The malware is only sent to a few targets, making anti-virus detection unlikely, as antivirus vendors are unlikely to obtain a sample of the malware. Detecting the C&C traffic also becomes harder as Intrusion Detection System (IDS) signatures for malware are unlikely to be available and the C&C infrastructure is less likely to appear on any blacklists.

Simple malware may be caught by sandboxes, which are useful pieces in Solving the APT Defense Puzzle. In the case of targeted attacks, however, the malware authors test their attacks before releasing them, making it more difficult to detect, classify, and attribute APT threats via sandbox-based methods. Detection of targeted attacks therefore relies heavily on heuristics or human inspection as the last line of defense.

Malware C&C Network Protocol Usage

Command and Control channels can vary widely in their complexity. The control infrastructure can range from simple HTTP requests to a malicious domain, to more complicated approaches involving resilient peer-to-peer technologies that lack a centralized server and are consequently harder to analyze. A small group of malware uses TLS to encrypt (some of) its communication. It is interesting to note that almost all of this TLS traffic is described as HTTPS traffic. Furthermore, most of the known samples fail to complete the TLS handshake. This may indicate that the malware does not actually implement TLS, but merely communicates on a port which is normally used for TLS connections.

Advanced Threat Actor using C&C Channel Example

C&C Channel Detection Techniques

The following are some examples of C&C channels and the techniques used to detect them. We will explore this topic in greater detail in future blogs together with the use of open-source tools.

Blacklisting

A simple technique to limit access to C&C infrastructure is to block access to IP addresses and domains which are known to be used by C&C servers.

Signature based

A popular technique for detecting unwanted network traffic is to use a signature based Intrusion Detection System (IDS). The advantage of signature based detection is that known bot traffic can be easily detected if malware researchers have created a signature. The disadvantage is that bots are often obfuscating or encrypting their traffic which makes it much harder or even impossible to write a signature.

DNS protocol based

Malware needs to know the IP address of the C&C infrastructure to communicate. This address can be hard-coded or it can be retrieved from a domain name. Using a domain name provides more flexibility as it allows the attacker to change the IP address easily. The infected computer doesn’t even need direct outbound connectivity, as long as it can resolve the host name through a local DNS server that performs recursive lookups on the Internet. DNS has been involved in two recent large-scale breaches that resulted in the compromise of millions of accounts.

Network administrators should look for the following:

  • DNS responses with a low to very low TTL (time to live) value, which is somewhat unusual
  • DNS responses containing a domain that belongs to one of a long list of dynamic DNS providers
  • DNS queries issued more frequently by the client than would be expected given the TTL for that hostname
  • DNS requests for a hostname outside the local namespace that are answered with a resource record pointing to an IP address within 127.0.0.0/8, 0.0.0.0/32, RFC 1918 IP space, or anywhere inside the public or private IP space of the organization
  • Consecutive DNS responses for a single unique hostname that contain only a single resource record, but change more than twice every 24 hours
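
The checks above can be sketched as a simple scoring function. This is an illustrative sketch only, not any specific tool's API: the field names, the 300-second "low TTL" threshold, and the tiny dynamic-DNS list are all assumptions.

```python
# Illustrative sketch: score a single DNS response against the indicators above.
import ipaddress

DYNAMIC_DNS_SUFFIXES = {"dyndns.org", "no-ip.com", "duckdns.org"}  # sample only

def dns_indicators(hostname, ttl, answer_ip, queries_per_ttl_window):
    """Return the names of the indicators matched by one DNS response."""
    hits = []
    if ttl < 300:                                   # low to very low TTL
        hits.append("low_ttl")
    if any(hostname == s or hostname.endswith("." + s)
           for s in DYNAMIC_DNS_SUFFIXES):          # dynamic DNS provider
        hits.append("dynamic_dns")
    if queries_per_ttl_window > 1:                  # re-queried faster than TTL allows
        hits.append("excess_queries")
    ip = ipaddress.ip_address(answer_ip)
    if ip.is_loopback or ip.is_private or ip.is_unspecified:
        hits.append("internal_answer")              # external name, internal answer
    return hits
```

Each matched indicator on its own is weak evidence; several together warrant investigation.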

Maintaining a DNS server and C&C server at a fixed address increases the chance that it will be taken down. Therefore, attackers have started using fast-flux domains. These are domains for which the owner rapidly changes the IP address to which a domain points and, optionally, the IP address of the DNS server as well.
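
A crude fast-flux heuristic along these lines counts how many distinct IP addresses, spread across how many distinct networks, a domain has resolved to over an observation window. The thresholds and the /16 grouping below are illustrative assumptions, not tuned values.

```python
# Hedged sketch of a fast-flux heuristic: many answer IPs across many
# networks for a single domain is suspect.
from collections import defaultdict

class FastFluxDetector:
    """Flag domains resolving to many IPs across many networks."""
    def __init__(self, ip_threshold=10, network_threshold=5):
        self.ips_seen = defaultdict(set)            # domain -> {answer IPs}
        self.ip_threshold = ip_threshold
        self.network_threshold = network_threshold

    def observe(self, domain, ip):
        self.ips_seen[domain].add(ip)

    def is_fast_flux(self, domain):
        ips = self.ips_seen[domain]
        networks = {ip.rsplit(".", 2)[0] for ip in ips}   # crude /16 grouping
        return (len(ips) >= self.ip_threshold
                and len(networks) >= self.network_threshold)
```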

IRC protocol based

First-generation botnets used Internet Relay Chat (IRC) as the channel for a central command and control mechanism. Bots connect to the IRC servers and channels selected by the attacker and wait for commands. Although IRC botnets are easy to use, control, and manage, they suffer from a central point of failure.

Peer to peer protocol based

To overcome the IRC issue, a peer-to-peer architecture is used in the second generation of botnets: instead of having a central C&C server, the attacker sends a command to one or more bots, which then deliver it to their neighbors. Increasingly, the peer-to-peer (P2P) protocol is being used for C&C channels.

Examples include Zeus v3, TDL v4 (Alureon), and ZeroAccess. A roughly 10x increase in the number of malware samples using P2P has been observed in the past 12 months.

P2P C&C channels can often be identified by DNS, reverse DNS, or passive DNS analysis, as the peers generally do not try to hide. Typically, all members of a malware P2P swarm have been compromised with the same malware: detect one and you will quickly identify hundreds of compromised assets.

HTTP protocol based

Implementing a second-generation P2P botnet is difficult and complex. Therefore, attackers have begun to use the centralized C&C model once again, using the HTTP protocol to publish commands on certain web servers.

The vast majority of malware examined uses HTTP as the C&C protocol. According to Mandiant, 83% of all backdoors used by APT attackers are outgoing sessions to TCP port 80 or 443. However, only a few samples use TLS to communicate with the C&C server. All of the TLS-capable malware accepts connections to servers with invalid certificates. If the servers indeed use invalid certificates, this property could be used to detect these use cases. Similarly, the double connection attempt in the case of an invalid certificate might trigger detection.

The majority of the examined malware uses HTTP-based C&C channels. The HTTP requests generated by these malware samples are usually GET requests with a spoofed User-Agent, most often mimicking the installed Internet Explorer version. Thus, detecting spoofed User-Agents might provide a method for C&C channel detection.
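
One hedged way to act on this is per-host User-Agent profiling: learn which User-Agent strings a host normally presents, and flag strings never seen before. A perfect spoof of the installed browser's User-Agent would evade this sketch; it catches imperfect spoofs and rare agents. The class and its parameters are assumptions for illustration.

```python
# Illustrative per-host User-Agent baseline; not a production detector.
from collections import Counter, defaultdict

class UserAgentProfiler:
    """Learn the User-Agent strings each host normally presents."""
    def __init__(self, min_samples=20):
        self.per_host = defaultdict(Counter)
        self.min_samples = min_samples

    def learn(self, host, user_agent):
        self.per_host[host][user_agent] += 1

    def is_anomalous(self, host, user_agent):
        seen = self.per_host[host]
        if sum(seen.values()) < self.min_samples:
            return False                            # baseline still too small
        return user_agent not in seen               # never-before-seen agent
```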

Here are some indicators that can be used to detect C&C channel sessions simply by passively looking at network traffic:

  • The certificate isn’t signed by a trusted CA
  • The domain names are random (i.e. don’t really exist)
  • Validity period is stated to be exactly one month
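
These three indicators could be scored against a captured server certificate roughly as follows. The certificate fields are assumed to be pre-parsed (extracting them, e.g. with the `cryptography` package, is out of scope here), the trusted-CA list is a placeholder, and "exactly one month" is approximated as 30 days.

```python
# Hedged sketch: score the passive TLS indicators above for one certificate.
import datetime

TRUSTED_CA_CNS = {"DigiCert", "GlobalSign", "Let's Encrypt"}  # placeholder list

def cert_indicators(issuer_cn, subject_cn, not_before, not_after, known_domains):
    """Return the names of the indicators matched by one server certificate."""
    hits = []
    if issuer_cn not in TRUSTED_CA_CNS:             # not signed by a trusted CA
        hits.append("untrusted_ca")
    if subject_cn not in known_domains:             # "random" / unknown domain
        hits.append("unknown_domain")
    if not_after - not_before == datetime.timedelta(days=30):
        hits.append("one_month_validity")           # suspiciously exact lifetime
    return hits
```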

Temporal based

A bot regularly has to send traffic to the C&C server in order to be able to receive new commands. Such traffic is sent automatically and usually on a regular schedule. The behavior of user-generated traffic is much less regular, so bots may be detected by measuring this regularity.
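
A minimal sketch of this regularity measurement computes the coefficient of variation of the gaps between connections; near-zero variation suggests automated beaconing rather than human browsing. The 0.1 threshold and minimum sample count are illustrative assumptions.

```python
# Illustrative beacon check over the connection timestamps (in seconds)
# observed from one host to one destination.
import statistics

def looks_like_beacon(timestamps, cv_threshold=0.1, min_samples=4):
    """True if connection times are suspiciously regular (possible beacon)."""
    if len(timestamps) < min_samples:
        return False                                # not enough evidence
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean_gap = statistics.mean(gaps)
    if mean_gap <= 0:
        return False
    # Coefficient of variation: near zero means machine-like regularity.
    return statistics.pstdev(gaps) / mean_gap < cv_threshold
```

Real bots often add jitter to their schedule, so a production detector would need a looser threshold or a periodicity test.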

Anomaly detection

Anomaly detection is based on the assumption that it is possible to build a model of legitimate traffic content. Anomaly detection of network traffic can be a very powerful tool in detecting command & control channels. Unfortunately, to be most effective the baselining (defining what is “good” about the network) should take place before the first compromise. However, some forms of anomaly detection still add tremendous value:

  • Develop a quick set of signatures to ensure that each TCP session on port 80 and 443 consists of valid HTTP or SSL traffic, respectively. Use a tool such as FlowGrep, or review proxy logs for failures. This would be a useful exercise in general for all traffic that is not relayed through an application proxy, and is not blocked from direct access to internet resources.
  • Persistent connections to HTTP servers on the internet, even outside regular office hours, should be the exception rather than the rule, so valid exceptions can be filtered out, making this a potent mechanism to identify compromises. Is the attacker operating from the same time zone as your organization?
  • Persistent requests for the same file on a remote web server, but using a different parameter can indicate data smuggling over HTTP.
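
The first bullet might be approximated in code as a payload sanity check: does port 80 traffic start like an HTTP request, and port 443 traffic like a TLS handshake? Real deployments would use IDS signatures, FlowGrep, or proxy logs as described above; this sketch only illustrates the idea.

```python
# Illustrative protocol sanity check for the first bytes of a TCP session.
HTTP_METHODS = (b"GET ", b"POST ", b"HEAD ", b"PUT ", b"DELETE ", b"OPTIONS ")

def payload_matches_port(port, payload):
    """Crude check that traffic on 80/443 looks like HTTP/TLS respectively."""
    if port == 80:
        return payload.startswith(HTTP_METHODS)     # plausible HTTP request line
    if port == 443:
        # TLS records open with handshake type 0x16 and major version 0x03.
        return len(payload) >= 3 and payload[0] == 0x16 and payload[1] == 0x03
    return True                                     # no expectation for other ports
```

Sessions that fail the check are candidates for a tunneled or custom C&C protocol hiding on a well-known port.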

Correlation based

One method to reduce the number of false positives for bot detection is to require several correlated events before raising an alert. This allows the system to use events which by themselves have a high false positive rate; by requiring multiple events, the system is able to filter out most false positives. The events may be correlated for a single host or for a group of hosts.

The advantage of using correlation to detect bots is that there are fewer false positives compared to using just the individual events. At the same time, this can be a disadvantage because stealthy bots, which generate just one or two events, may not be detected.
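
A toy version of such a correlation rule raises an alert for a host only after several distinct event types occur within a time window. The window length, required count, and event names below are assumptions for illustration.

```python
# Toy correlation engine: alert only on several distinct event types per host.
from collections import defaultdict

class Correlator:
    """Alert on a host only after several distinct event types in one window."""
    def __init__(self, window=3600, required_types=3):
        self.events = defaultdict(list)             # host -> [(time, event_type)]
        self.window = window
        self.required_types = required_types

    def add(self, host, time, event_type):
        """Record an event; return True when the alert threshold is reached."""
        self.events[host].append((time, event_type))
        recent = {etype for t, etype in self.events[host]
                  if time - t <= self.window}
        return len(recent) >= self.required_types
```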

C&C Channel Detection Techniques

Social Networks

In order to defeat social network-based botnets, organizations must think ahead of the attackers. Regardless of the channel, provider, or account, social network messages are in text. Thus, if malware wants to use social networks for their C&C, they would encode their commands textually. Just like legitimate messages may include web links, so might C&C messages (e.g., links for downloading payload).

Web-based Attack/Detection Characteristics

By using an HTTP connection as a communication channel, a web-based malware attack can avoid detection by a firewall and increase the threat of the attack. One of the attack's characteristics is its small traffic signature, which also fits perfectly well within the normal traffic flow. Since most firewalls do not filter HTTP traffic, it is not easy to detect any abnormal behavior.

In addition, the fast-flux domain technique allows a fully qualified domain name (FQDN) to point to multiple IP addresses. These IP addresses can be scattered all over the world, making a malicious domain difficult to track and analyze. Attackers can make a fast-flux domain constantly associate with various IP addresses.

However, a fast-flux domain's need for numerous IPs is a useful characteristic. Detection of fast-flux domain techniques, together with the use of connection regularity, can be used as the basis for web-based detection. In addition to enhancing the accuracy of detection, it can also be used to detect different types of botnet/malware.

Conclusion

By using the results of malware analysis to hone C&C channel detection capabilities, an organization can begin remediating a malware incident. Any identified C&C channels serve as helpful indicators of compromise (IOCs) that can be used to detect other instances of the same or similar malware. IOCs related to C&C include domain names, IP addresses, protocols, and even patterns of bytes seen in network communications, which could represent commands or encoded data. Matt Jonkman’s team regularly publishes updated signatures for known Command and Control channels. If setting up such a system sounds like a bit of work, have a look at BotHunter.

Coming Soon

In APT Detection Indicators – Part 4 we will add details to this introduction to C&C Channel detection techniques, and integrate with the prior introductory APT Detection Indicators – Part 2 discussion of free and open source software (FOSS) tools, with some hands-on examples developing and using Indicators of Compromise. While Security Onion, as good as it is, doesn't provide the same level of functionality one might expect from a commercial product, it still offers many of the features inherent to those products.

Many commercial vendors are now supplementing detection and alerting with visualization techniques, and FOSS tools have been meeting the needs of security visualization practitioners for years. Security Onion includes Squert, which in turn makes use of AfterGlow and the Graphviz libraries to provide on-demand visualizations of captured traffic. If an attacker scans from a beachhead host (pivoting laterally), the related scanning traffic from the pivot host presents itself in a tidy visualization.

Thanks for your interest!

Nige the Security Guy.

APT Anomaly Detection – Part 1: Eliminating the Noise

The rapid discovery of a breach is key to minimizing the damage of a targeted attack. Context-aware anomaly detection improves an organization's security accuracy and efficiency by bringing relevant suspect events into focus, helping eliminate distracting noise.

Improve security analyst efficiency, reduce operational overhead and cost by eliminating noise

In APT Anomaly Detection – Part 1 we present a primer on the various options for Network Behavior Analysis as a complement to other core technologies and tools, adding to the capability to detect and investigate targeted attacks. The series then digs into improving the accuracy of events through triage, sharpening detection precision and eliminating noise.

Signal to Noise Ratio

It’s a known fact that a lot of time is typically wasted on analyzing false positives generated by technology that is not correctly baselined, customized, tuned, and optimized. Depending upon the environment, false positives can be numerous and very difficult to verify, costing analysts valuable time determining whether or not something is an event they should be worried about.

Security Event Signal to Noise Ratio

Organizations today are exposed to a greater volume and variety of network attacks than ever before. Adversaries are exploiting zero-day vulnerabilities, taking advantage of risks introduced by cloud and mobile computing, and applying social engineering tactics to compromise user accounts. Advanced attackers are both patient and clever, evading detection at the network level. Security professionals wrestle with efficiently detecting these threats and effectively resolving them.

Reportedly Neiman Marcus experienced 60,000 alerts during their latest breach and Target was flooded with alerts. In both cases, the alerts failed to generate proper action. Relying on a tool (or tools) for alerts is useless if it generates too much noise and not enough signal. Too many alerts without the proper context fail to guide the right response.

Insider attacks are on the rise. To monitor and act on internal abuse, as well as comply with data protection regulations, organizations need to tie network security events to local systems and user credentials. Correlating threat information from intrusion prevention systems with actual user identities (logged on to local systems) allows security professionals to identify breaches of policy and fraudulent activity more accurately within the internal network.

Context-Aware Security

Traditional defenses, such as signature-based anti-malware tools and stateful inspection firewall technology, are less and less effective against new threats: they have no knowledge of applications in use, normal traffic patterns, or user activity in the context of a network’s normal behavior. New approaches to security, such as those focusing on context awareness and security intelligence, will provide the next-generation technology to cope with evolving threats.

Inside IT: Context-Aware Computing

Leveraging Context-Aware Security

Context-aware security is the use of supplemental information to improve security decisions at the time the decision is made. The goal? More-accurate security decisions capable of supporting more-dynamic business and IT environments as well as providing better protection against advanced threats.

If possible, all information security infrastructure must become context-aware – endpoint protection platforms, access control systems, network firewalls, IPS systems, security information and event management (SIEM) systems, secure web gateways, secure email gateways, data loss prevention (DLP) systems, and so on.

The shift to incorporate “application awareness”, “identity awareness”, “virtualization awareness”, “location awareness”, “content awareness” and so on are all facets of the same underlying shift in information security infrastructure to become context-aware.

Why Context-Aware Security is Needed

To understand contextual security, organizations should understand the signature of a typical attack. A common type of APT attack involves embedding Trojan horse code in PDF documents delivered as an email attachment. When the unsuspecting email recipient clicks on the attachment, malicious code is unleashed, but it doesn’t immediately execute, delaying until any antimalware program is no longer watching. When the Trojan does finally execute, it discreetly begins collecting data and sending GET requests to commonly visited sites to test network connectivity. If it detects an active network connection, the Trojan initiates a status beacon message to a command-and-control node.

The Signature of an Advanced Targeted Threat

As malware authors continue to introduce new antivirus evasion techniques, organizations must learn how to detect attacks that have slipped through the net and are living on the network. As the Mandiant APT1 report illustrated to the security community, attackers are capable of staying inside an organization’s network for years if organizations lack robust measures to detect and remediate attacks.

Network Baseline and Behavior Analysis

Network Behavior Anomaly Detection (NBAD) techniques were originally developed to monitor network traffic thresholds for shifts or changes that could indicate attacks or signal specific issues with systems. Over time, NBAD evolved into Network Behavior Analysis (NBA) which focuses on the establishment of a comprehensive network baseline. This overall baseline is then continually monitored for deviations or exceptions that may alert an analyst to an issue.

Behavior-based Anomaly Detection

There are three main components of a network behavior monitoring strategy for use in information security and network operations:

  • Traffic Flow Patterns and Data: Network flow data such as NetFlow, sFlow, and Jflow.
  • Network Performance Data: Simple Network Management Protocol (SNMP) events and Quality of Service (QoS) data for system and network performance.
  • Passive Traffic Analysis: Passive analysis tools can continually monitor traffic for protocol anomalies, tunneled protocols, use of non-standard ports and protocol field values, etc.

Ultimately, these tools can also provide a much higher degree of visibility into what systems and applications are communicating with one another and where they are on the network, which provides intrusion prevention systems with much needed environmental context.

Forensic NetFlow and IPFIX analysis tools are ideal security layers with which to detect and investigate APTs. Network flows provide a complete account of all network activity, both at the perimeter of the network and at the network core. Advanced flow analysis solutions trigger alarms by monitoring for suspect behavioral patterns within the network flows. Identifying suspicious traffic patterns involves automated correlation of different types of contextual information, then deciphering the associated intent and danger.

One of the best ways to detect whether internal hosts are communicating with external APT launch points is to compare NetFlow data to a host reputation list. By sending NetFlow from the Internet-facing routers to a NetFlow collector that can compare all flows to the host reputation database, internal machines talking with known compromised Internet hosts can be identified.
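
At its core, the NetFlow/reputation comparison described above reduces to matching flow destinations against a known-bad set. This sketch assumes a simplified flow record of (source IP, destination IP, destination port); the addresses in the usage example are documentation examples, not real indicators.

```python
# Illustrative sketch: match simplified flow records against a reputation list.
def flag_flows(flows, reputation_blacklist):
    """flows: iterable of (src_ip, dst_ip, dst_port) tuples.
    Return the flows whose destination appears on the reputation blacklist."""
    return [flow for flow in flows if flow[1] in reputation_blacklist]
```

A real collector would also normalize directions, handle IPv6, and refresh the reputation feed continuously.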

Getting started with Contextual Security

In order to combat these increasingly common scenarios, organizations must implement four lines of defense.

Rule Sets: Usually in conjunction with an intrusion detection system such as Snort.

Formulating effective rule sets is a fundamental portion of the contextual approach to network security. Rule sets are typically reactionary since they are usually only formulated after an attack vector has been identified but are still an important tool.

Also see, the APT Detection Indicators series which discusses Indicators of Compromise (IoCs) that can be used to develop and correlate rules.

Statistical Correlation: Utilize statistical and correlation methods to analyze the latest trends in malware.

This is the key that ties all of the other methods together since it meshes rule sets, log examination and data exfiltration monitoring. Correlation methods are used to examine whatever alerts are currently configured and to look for relationships between each alert that is triggered. These relationships can be with regard to type of alert, port number or any other type of selector configured by the security analyst. Statistical methods do not rely on prior knowledge of an attack vector, but rather on the time and frequency of a set of alerts.

Monitoring: Monitor for unusual data exfiltration attempts.

Examining and blocking data exfiltration attempts is the most important portion of a context-aware security paradigm and the last line of defense when attempting to combat APT attacks. It is important for an organization to know what should and should not be leaving the network.

Log Analysis: Strongly emphasize the need to manually examine logs.

Automating log reviews with tools such as Splunk is a popular technique, and when operating in a highly trafficked network, automation is indeed a necessity. However, when attempting to discover new attacks against a network, nothing is as effective as human observation and intuition. Human intuition, along with informed experience should alert the security administrator to any site that looks suspicious, which could then spawn a new network monitoring rule to block that avenue of attack in the future.

Context Reduces Noise

As attackers become better at hiding out on networks, organizations need to be aware of the context surrounding security events to better sniff out APTs and reduce the noise. This means setting up the right kind of alerts based on Indicators of Compromise (IoCs) as well as previous attack vectors and correlating the information between triggered alerts. Most importantly this means having some human eyes monitoring data leaving the network and looking over logs to become familiar with the network and spot interesting traffic that may not be coded yet as triggers.

If an organization cannot connect all the dots across its network, it will be unable to fend off a new breed of persistent, stealthy malware. The organization needs to consider whether to build and operate this capability in-house (since security is mission critical), whether to partner with a consultancy or service to co-source both monitoring and skilled resources in a hybrid SOC, or whether to outsource completely to a managed service (if security is simply not a core competency) – although the last option needs strong process integration for contextual awareness of internal business operations, as well as strict SLAs to ensure preparedness to respond.

Conclusion

Protecting an organization's data from APT invasion is an ongoing, daily task. Vigilance and healthy paranoia are a good defense against possible incursions. Many experts combating APTs suggest that organizations always be on the alert – that is, assume an APT is always present or already underway and operate defensively rather than passively. Use Red Teams (see: APT Red Teams) to keep skills current and hone capabilities and response.

Holistic Logging

Improving communications visibility with evolving contextual anomaly detection is one of the best ways to detect internal malware that has circumvented traditional firewalls. Many APTs have no trouble sneaking right past even the best security appliances; however, they have a habit of exhibiting the same suspicious behaviors – see Defensible Security Posture for details on the signature of an APT and the Cyber Kill Chain.

In APT Anomaly Detection – Part 2 we will expand upon the above topics in more detail as well as discuss the options to add contextual sources, as well as fine tune and improve detection precision to improve analyst efficiency and reduce operational overhead and cost. This post is complemented by the APT Detection Indicators blog series which discusses Indicators of Compromise (IoCs) as well as useful open source tools and techniques to detect APTs.

Thanks for your interest!

Nige the Security Guy.