Steganography is the ancient art of invisible communication, where the goal is to hide the very fact that you are trying to hide something. It adds another layer of protection on top of cryptography: an encrypted message looks like gibberish, so anyone who sees it immediately notices that you have something to hide. Steganography embeds the (encrypted) secret message into an innocuous-looking object so that the final communication looks perfectly normal. The “analog” form of steganography is the art of writing with invisible ink. The digital version hides the message by subtly modifying the cover object. Probably the most researched area of digital steganography uses digital images as the cover medium into which the message is inserted. The oldest (and very detectable) technique replaces the least significant bit of each colour channel with bits of the communicated message. Of the two pictures shown below, the first is the cover object and the second is the stego object.



The hidden message (shown below) is in the picture in 3-bit resolution.
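
As an aside, the least-significant-bit replacement described above can be sketched in a few lines of Python. This is only a minimal illustration, not the tool used to produce the pictures: it works on a plain list of byte values standing in for the colour channel values of an image, and all function names are hypothetical.

```python
def text_to_bits(text):
    """Encode text as a list of bits, most significant bit of each byte first."""
    return [(byte >> shift) & 1
            for byte in text.encode("utf-8")
            for shift in range(7, -1, -1)]


def bits_to_text(bits):
    """Inverse of text_to_bits."""
    data = bytes(
        sum(bit << (7 - k) for k, bit in enumerate(bits[i:i + 8]))
        for i in range(0, len(bits), 8)
    )
    return data.decode("utf-8")


def embed_lsb(cover, bits):
    """Replace the least significant bit of the first len(bits) cover values."""
    if len(bits) > len(cover):
        raise ValueError("message is longer than the cover")
    stego = list(cover)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_lsb(stego, n_bits):
    """Read the message bits back from the least significant bits."""
    return [value & 1 for value in stego[:n_bits]]


# Stand-in for the colour channel values of a small image (assumption).
cover = [(37 * i + 11) % 256 for i in range(64)]
message_bits = text_to_bits("hi")
stego = embed_lsb(cover, message_bits)
recovered = bits_to_text(extract_lsb(stego, len(message_bits)))
max_change = max(abs(c - s) for c, s in zip(cover, stego))
```

Each value changes by at most one, which is why the stego picture looks identical to the naked eye, yet the message sits in fixed, predictable positions, which is part of why the technique is so detectable.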


Naturally, steganography can serve both sides. Dissidents in oppressive countries may use it as a private means of conversation that does not raise the suspicion of the authorities. On the other hand, terrorist groups and spies use it for coordination. This brings us to why research in steganography, and especially in steganalysis, which strives to detect the presence of steganography in a communication channel, is important for cyber security. If we think about the desired properties of a botnet’s command and control (C&C) channel from a higher perspective, it should be invisible, which is exactly the property steganography tries to achieve. Thus, there is a great deal of inspiration to be drawn from steganalytic techniques if one wants to detect and disrupt C&C channels. Botnet designers are aware of the potential of steganography and are starting to use it, fortunately in a very naive way, as described in some recent reports and blog posts [1],[2],[3].

The traditional approaches to steganalysis focus on detecting the presence of a message in a single object by exploiting differences between the statistics of cover and stego objects. The accuracy of the steganalysis depends on the length of the hidden message and the choice of the cover object. Over the last couple of years, my colleague Dr. Andrew D. Ker and I have been investigating an alternative scenario, which we believe to be more realistic. In our scenario, the steganalyst intercepts many objects (images) from multiple users. The focus is no longer on identifying the particular object(s) carrying a hidden message, but on pointing to the user who is guilty of using steganography. In our view, this scenario is highly realistic, as it covers, for example, an edge device connecting the internal network to the Internet, or a cloud proxy service such as CWS. We have validated this approach on images from a popular social network service. Although we have improved the accuracy of steganalysis in the wild, the error rates were still too high to prove in court that steganography had been used. Such proof can be obtained if and only if the message can be extracted from the stego object, which means we have to get the embedding key. Is it possible to do so?

In our recent paper, we show that it is possible using the oldest technique of all: exhausting all possible keys. This approach works only if the following two conditions are satisfied:

  1. The key space has to be exhaustible. In our work, we argue that the key spaces of most steganographic algorithms are exhaustible in practice, since the key is either derived from a passphrase that can be guessed by a dictionary attack, or used as a 32-bit integer seed to initialise a random number generator.
  2. The plain-text is in a recognizable format, so that each candidate key can be verified as being either valid or wrong.
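
The two conditions can be illustrated with a toy example. The sketch below is not the attack from the paper: it assumes a made-up embedding scheme in which the key seeds Python's `random` module to shuffle the embedding positions, a hypothetical `MSG:` marker that makes the plain-text recognizable, and a key space shrunk to 12 bits so the loop finishes quickly.

```python
import random

MAGIC = b"MSG:"             # hypothetical marker making the plain-text recognizable
KEY_SPACE = range(2 ** 12)  # deliberately tiny; real seeds may have 32 bits


def embedding_path(key, cover_len):
    """Key-dependent order in which cover positions are used (toy scheme)."""
    path = list(range(cover_len))
    random.Random(key).shuffle(path)
    return path


def embed(cover, payload, key):
    """Write the payload bits into the LSBs along the keyed path."""
    bits = [(byte >> s) & 1 for byte in payload for s in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("payload is longer than the cover")
    stego = list(cover)
    for pos, bit in zip(embedding_path(key, len(cover)), bits):
        stego[pos] = (stego[pos] & ~1) | bit
    return stego


def extract(stego, key, n_bytes):
    """Read n_bytes from the LSBs along the path derived from key."""
    path = embedding_path(key, len(stego))[:8 * n_bytes]
    bits = [stego[pos] & 1 for pos in path]
    return bytes(
        sum(bit << (7 - k) for k, bit in enumerate(bits[i:i + 8]))
        for i in range(0, len(bits), 8)
    )


def exhaust_keys(stego):
    """Try every key and keep those that decode to the recognizable marker."""
    return [key for key in KEY_SPACE
            if extract(stego, key, len(MAGIC)) == MAGIC]


cover_rng = random.Random(0)
cover = [cover_rng.randrange(256) for _ in range(256)]  # stand-in for pixel data
secret_key = 1234
payload = MAGIC + b"hello"
stego = embed(cover, payload, secret_key)
candidates = exhaust_keys(stego)
```

The list `candidates` always contains the true key; a wrong key survives only if it happens to decode all 32 marker bits correctly, which is very unlikely. Without a recognizable format there would be nothing to compare against, and every key would look equally valid.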

In the paper, we consider refined versions of the key exhaustion attack that exploit metadata such as the message length or the decoding matrix size, which must be stored along with the payload so that the message can be correctly extracted from the stego object. We show how simple implementation errors lead to leakage of key information and to powerful inference attacks; moreover, eliminating such leakage completely seems difficult. We believe, and this is our future work, that a steganographer implementing a new algorithm has to choose between being more susceptible to the key-exhaustion attack or to the statistical attack.

The scenario investigated in the paper assumes that the steganalyst has access to multiple stego objects embedded with the same key. Since keys are notoriously re-used, this is highly probable. Moreover, the same can be assumed of a botnet’s C&C communication. Thus, our work paves the way toward better analysis of steganographic C&C channels.
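
To give a feeling for why metadata and key re-use help the attacker, here is a toy sketch (again, not the method from the paper). It assumes a hypothetical scheme that stores an 8-bit payload length along a key-dependent path before the payload itself. A wrong key usually decodes a length that cannot fit into the cover, so it can be rejected without knowing anything about the plain-text, and intersecting the surviving keys across several objects embedded with the same key shrinks the candidate set quickly.

```python
import random

KEY_SPACE = range(2 ** 10)  # deliberately tiny for the demonstration
HEADER_BITS = 8             # hypothetical length header, in bits


def embedding_path(key, cover_len):
    """Key-dependent order in which cover positions are used (toy scheme)."""
    path = list(range(cover_len))
    random.Random(key).shuffle(path)
    return path


def embed(cover, payload, key):
    """Embed the payload length, then the payload, along the keyed path."""
    header = [(len(payload) >> s) & 1 for s in range(HEADER_BITS - 1, -1, -1)]
    body = [(byte >> s) & 1 for byte in payload for s in range(7, -1, -1)]
    bits = header + body
    if len(bits) > len(cover) or len(payload) >= 2 ** HEADER_BITS:
        raise ValueError("payload does not fit")
    stego = list(cover)
    for pos, bit in zip(embedding_path(key, len(cover)), bits):
        stego[pos] = (stego[pos] & ~1) | bit
    return stego


def decoded_length(stego, key):
    """Length header as it would be read under a candidate key."""
    value = 0
    for pos in embedding_path(key, len(stego))[:HEADER_BITS]:
        value = (value << 1) | (stego[pos] & 1)
    return value


def plausible_keys(stego):
    """Keys whose decoded length is non-zero and fits into the cover."""
    max_payload = (len(stego) - HEADER_BITS) // 8
    return {key for key in KEY_SPACE
            if 0 < decoded_length(stego, key) <= max_payload}


rng = random.Random(0)
secret_key = 123
payloads = [b"attack at dawn", b"hold position", b"retreat", b"new orders follow"]
stego_objects = [
    embed([rng.randrange(256) for _ in range(256)], payload, secret_key)
    for payload in payloads
]
per_object = [plausible_keys(stego) for stego in stego_objects]
survivors = set.intersection(*per_object)
```

Each set in `per_object` still contains many wrong keys, because the length check alone is weak, but the true key is in all of them, so `survivors` can only get smaller as more objects are intercepted.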

The paper received the best paper award at the prestigious ACM Information Hiding and Multimedia Security Workshop 2014 in Salzburg, Austria (www.ihmmsec.org).

For further reading, I recommend the paper available at Andrew Ker’s website: http://www.cs.ox.ac.uk/andrew.ker/docs/ADK62B.pdf

[1] http://blog.sucuri.net/2013/07/malware-hidden-inside-jpg-exif-headers.html

[2] http://www.fireeye.com/resources/pdfs/fireeye-hot-knives-through-butter.pdf, page 15

[3] http://blog.malwarebytes.org/security-threat/2014/02/hiding-in-plain-sight-a-story-about-a-sneaky-banking-trojan

In the paper, I used my old affiliation with the Czech Technical University in Prague because parts of the work had been done before the acquisition of Cognitive Security by Cisco.


Tomas Pevny

Technical Leader

Cisco's threat defense in Prague