A Reality Check on Firewall Visibility
A question I love to ask next-generation firewall (NGFW) and intrusion prevention system (IPS) administrators is whether they have seen a gradual decline in their deployments’ security efficacy over the last few years. Most answer with a resounding “yes,” and then wonder how I knew. With over 90% of Internet traffic being encrypted with Transport Layer Security (TLS), and most intranet applications not far behind, this is not exactly a tough nut to crack. From URL filtering to malware detection to IPS signatures, all advanced network security appliances rely on deep packet inspection (DPI) and full application message reassembly to detect and block prohibited or malicious content. As TLS encryption becomes the default mode of network communication between clients and servers, this loss of visibility quietly happens to NGFW and IPS operators without any change at the device configuration or policy level. The only way to regain full network-level visibility into the traffic flows and deliver reasonable threat protection efficacy is to enable TLS decryption before or at the security device.
Opportunistic Decryption
There are two typical deployment modes for TLS decryption: inbound and outbound.
The inbound one is relatively straightforward since the network security device in front of a server possesses the latter’s private key and an associated certificate. Therefore, an NGFW or an IPS device in the middle of the flow will appear to the client just like the legitimate server, thus enabling full TLS decryption and DPI capabilities. This allows maintaining a desired security efficacy score with all the DPI features, albeit at a lower performance level. While some modern security appliances leverage state-of-the-art hardware components to significantly accelerate TLS decryption, an overall throughput degradation of 50-70% from the cleartext inspection use case is not uncommon. If aiming to achieve a high NGFW or IPS efficacy in the modern world, one must sacrifice some performance for visibility.
The outbound use case is trickier because the security device must spoof the server’s certificate and re-sign it with a private Certificate Authority (CA); no sane publicly trusted CA would issue an intermediate CA or a wildcard certificate to a random edge security device, at least not for long. Then one must either get this private CA’s identity certificate into each managed client’s local trust store or expect the clients to constantly accept the browser’s nagging security warnings. Many mobile device and Software-as-a-Service (SaaS) cloud applications use TLS mutual certificate authentication or public key pinning, both of which break outbound decryption on a transit security device. This makes DPI highly impractical in extranet edge deployments with many unmanaged clients or lots of undecryptable SaaS traffic. However, NGFW and IPS solutions like Cisco Firepower Threat Defense (FTD) have some tricks up their sleeve to gain visibility into TLS traffic without going down the dark path of full decryption.
Seeing into TLS 1.2
When a client opens a TLS 1.2 connection to a server, it may include a cleartext extension called Server Name Indication (SNI) in the initial ClientHello message. As the name implies, this extension indicates which Fully Qualified Domain Name (FQDN) the client is trying to reach. This cleartext field may be used by content delivery network (CDN) providers or transit load-balancers to process the session in a certain way without having to terminate the TLS layer. However, this field can also be used by an extranet edge firewall to loosely determine what resource a client is trying to access for URL categorization, SaaS application detection, and even full TLS decryption engagement purposes. Be warned that since there is no guarantee that the client will supply a true destination FQDN with this extension, one can only put limited faith in this early classification decision.
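To make the passive SNI read concrete, here is a minimal Python sketch of pulling the host name out of a raw ClientHello without terminating TLS. It is an illustration only, not how FTD or any other product implements it. The function names extract_sni and build_client_hello are hypothetical, and the sketch assumes the whole ClientHello arrives in a single TLS record, which a real in-path device cannot rely on.

```python
import struct
from typing import Optional


def extract_sni(record: bytes) -> Optional[str]:
    """Return the SNI host name from a raw TLS ClientHello record, or None."""
    # TLS record header: content type (1), version (2), length (2).
    if len(record) < 5 or record[0] != 0x16:  # 0x16 = handshake record
        return None
    body = record[5:5 + struct.unpack("!H", record[3:5])[0]]
    # Handshake header: message type (1), length (3).
    if len(body) < 4 or body[0] != 0x01:  # 0x01 = ClientHello
        return None
    pos = 4 + 2 + 32  # skip handshake header, client_version, random
    try:
        pos += 1 + body[pos]                                   # session_id
        pos += 2 + struct.unpack("!H", body[pos:pos + 2])[0]   # cipher_suites
        pos += 1 + body[pos]                                   # compression_methods
        ext_end = pos + 2 + struct.unpack("!H", body[pos:pos + 2])[0]
        pos += 2
        while pos + 4 <= min(ext_end, len(body)):
            ext_type, ext_len = struct.unpack("!HH", body[pos:pos + 4])
            pos += 4
            if ext_type == 0x0000:  # server_name extension
                # Layout: list length (2), name type (1), name length (2), name.
                name_type = body[pos + 2]
                name_len = struct.unpack("!H", body[pos + 3:pos + 5])[0]
                if name_type != 0:  # 0 = host_name
                    return None
                return body[pos + 5:pos + 5 + name_len].decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        return None  # truncated or malformed input
    return None


def build_client_hello(host: str) -> bytes:
    """Build a synthetic ClientHello carrying only an SNI extension (demo helper)."""
    name = host.encode("ascii")
    entry = b"\x00" + struct.pack("!H", len(name)) + name   # host_name entry
    sni = struct.pack("!H", len(entry)) + entry             # server_name_list
    ext = struct.pack("!HH", 0x0000, len(sni)) + sni
    hello = (
        b"\x03\x03" + bytes(32)      # client_version, random
        + b"\x00"                    # empty session_id
        + b"\x00\x02\x13\x01"        # a single cipher suite
        + b"\x01\x00"                # null compression only
        + struct.pack("!H", len(ext)) + ext
    )
    handshake = b"\x01" + len(hello).to_bytes(3, "big") + hello
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake
```

Calling extract_sni(build_client_hello("www.example.com")) returns the host name exactly as the client wrote it, which is the whole point: the value is only as trustworthy as the client that supplied it.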
A TLS 1.2 server will respond to the client with a Certificate message which contains the server’s identity certificate in cleartext. As with the SNI extension, an in-path firewall can read the server’s FQDN from that message without engaging TLS decryption. Since the chances of collusion between the client and the server are lower, the transit security appliance can make the same URL categorization, application detection, and other security policy decisions with much higher confidence. It can also compare the server’s stated identity to the previously inspected SNI data to detect and block a client who is trying to circumvent the edge security checks. That said, one cannot reliably verify a server’s identity and its possession of the private key which corresponds to the presented certificate without completing the full TLS handshake. Full assurance requires the NGFW or IPS to engage in TLS decryption with all the associated caveats.
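The comparison between the client's SNI and the server's stated identity can be sketched as follows. This is a simplified illustration, assuming the DNS names have already been extracted from the certificate's subject alternative names. The function names are hypothetical, and the wildcard handling is a deliberately reduced version of the usual rules (one wildcard, left-most label only).

```python
from typing import Iterable


def name_matches(pattern: str, host: str) -> bool:
    """Match one certificate DNS name against a host name.

    Simplified rules: comparison is case-insensitive, and "*" may stand in
    for exactly one left-most label of a name with at least three labels.
    """
    p = pattern.lower().rstrip(".").split(".")
    h = host.lower().rstrip(".").split(".")
    if len(p) != len(h):
        return False
    if p[0] == "*":
        return len(p) > 2 and p[1:] == h[1:]
    return p == h


def sni_consistent(sni: str, cert_dns_names: Iterable[str]) -> bool:
    """True if the SNI value is covered by any name in the server certificate."""
    return any(name_matches(name, sni) for name in cert_dns_names)
```

A mismatch, for example an SNI of www.example.com answered by a certificate that only lists names under another domain, is the signal a transit device could use to flag a client that supplied a misleading SNI. A match still proves nothing about the server holding the private key, as noted above.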
How TLS 1.3 Changes the Game
By now, most of us know that TLS 1.3 is the new standard. While it brings many improvements to the security posture and especially the speed of TLS connection establishment, it does not make TLS 1.2 obsolete or insecure. The main security benefit of TLS 1.3 is that it only supports ciphers which offer Perfect Forward Secrecy (PFS), so a single compromised private key cannot be used to decrypt every connection to a given server. However, you would be hard-pressed to find a modern TLS 1.2 implementation that does not take advantage of the optional PFS capabilities either. Therefore, both TLS 1.2 and 1.3 will co-exist in most networks for many years to come.
One of the biggest myths which I hear about TLS 1.3 is that it makes decryption by transit security devices impossible. This is simply not true, since both inbound and outbound TLS decryption happen exactly as with TLS 1.2. However, TLS 1.3 no longer lets the server present its certificate to the client in the clear. While the SNI extension is still unencrypted, the passive inspection approach described above is no longer as reliable. Therefore, most transit security devices engage full TLS decryption for 1.3 connections as early as possible. This in turn leads to a degraded customer experience when a security device encounters a resource that should not be decrypted by policy, since the only way to disengage TLS inspection on that transit device is by dropping the session and letting the client re-establish it directly.
One solution to this problem is implemented in the upcoming FTD 6.7 software with a feature called TLS Server Identity Discovery. When this capability is enabled for NGFW and IPS use cases, the FTD intercepts a TLS 1.3 handshake message from a client to an unknown server and then opens a side connection to this server to discover its identity. FTD uses the same source IP address and TCP port as the client and mimics the ClientHello message as much as possible to get the server to present its true certificate. Once the server’s identity is established, FTD applies an appropriate application or URL policy to permit or deny access, or even engage full TLS decryption. It also caches the server’s identity to avoid repeated identity lookups for multiple clients that access the same resource. This significantly improves both the security efficacy and user experience, especially when full TLS decryption may be required by policy. Expect to see many similar NGFW and IPS features that rely on passive inspection and behavioral inference rather than DPI in future FTD releases.
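The probe-once-then-cache idea can be illustrated with a short Python sketch. To be clear, this is not FTD's implementation: the class name, the (IP, port) cache key, the fixed time-to-live, and the injected probe callable are all assumptions made for the example, and the side connection itself is left to whatever probe function the caller supplies.

```python
import time
from typing import Callable, Dict, Tuple


class ServerIdentityCache:
    """Remember discovered server identities so that only the first client
    to reach an unknown server triggers a side-connection probe."""

    def __init__(
        self,
        probe: Callable[[str, int], str],          # hypothetical: returns the server's FQDN
        ttl: float = 300.0,                        # assumed expiry, in seconds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._probe = probe
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[Tuple[str, int], Tuple[str, float]] = {}

    def lookup(self, ip: str, port: int) -> str:
        """Return the cached identity, probing the server only on a miss or expiry."""
        key = (ip, port)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]
        identity = self._probe(ip, port)
        self._entries[key] = (identity, now + self._ttl)
        return identity
```

With a cache like this in front of the policy lookup, a second client connecting to the same server address is classified from the cached identity, and the server sees only one extra handshake per expiry interval.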
At the Crossroads of Decryption
The job of an NGFW or an IPS administrator was never easy, but now there is also pervasive encryption to worry about. For inbound deployments, TLS decryption must be at least considered to maintain a reasonable level of security efficacy with threat protection features that require DPI, such as malware blocking or intrusion prevention. The same goes for outbound deployments in highly controlled environments where private CA certificate distribution and undecryptable flows do not present a problem. There are definitely performance implications, but one may be pleasantly surprised by how much throughput certain modern security platforms offer even with full TLS decryption. Stateful scalability features, such as FTD clustering, allow pooling multiple physical security modules or appliances to satisfy even the most demanding TLS throughput requirements. For all other cases, look for a security product that provides at least some level of visibility into TLS traffic without requiring full decryption.
To get started with inspecting TLS-encrypted traffic on Cisco FTD, refer to the “Understanding Traffic Decryption” chapter of the Cisco Firepower Management Center Configuration Guide. Keep in mind that we publish FTD throughput numbers with 50% of fully decrypted TLS flows in each Firepower appliance data sheet.
Thanks for such good info about TLS 1.3. Very informative.
Check out what Nubeva (nubeva.com) is doing with symmetric key intercept. They can discover final symmetric encryption keys and send them to NGFWs before the first packet even arrives. This allows the NGFW to do its normal decryption and full packet inspection without proxying or MITM latency. Plus the original client-server connection is never interrupted.
Billy,
The concept of endpoint cooperation in transit decryption is not new, though most of the prior proposals leave it to the application to willingly supply the decryption keys rather than passively sniff the memory and extract them without consent. There are several problems that I see with such an approach:
1. Once extracted, the keys can be saved to decrypt a captured session even after it is concluded; it is generally considered a very poor privacy practice. By allowing the key scrubber to examine application memory, you are also exposing a lot more than just the keys.
2. It does not help outbound decryption at all, given that you cannot readily install such a memory scrubber on unmanaged devices. Even with fully managed devices, very few would allow such broad memory access; this excludes pretty much every mobile endpoint.
3. The memory scrubbing process isn’t free, so it merely shifts cycles from an edge firewall to the compute layer. If you are going through the trouble of deploying such a solution, you could also drop the network layer completely and simply inspect application messages above the TLS layer on the workload itself, effectively making it a RASP.
4. As long as the actual decryption happens on a transit network device, it is still MITM with all the appropriate caveats, including the performance and latency impact. The key extraction costs that are shifted from the network device to an external memory scrubber are relatively minor as compared to the actual decryption/encryption overhead for the rest of the session. Even with a highly transactional traffic profile, you will likely spend more cycles passing the keys back and forth.
5. While the original TLS session is not interrupted, you also lose the ability to modify its content (e.g. for virtual patching purposes).
All things considered, we are back to the good old MITM, but with extra steps.
Thanks for the information. I will be waiting for the FTD 6.7 release.
Great work Andrew!
Very informative.