All posts by abaumhof

Search Engine Poisoning (malicious ad)

September 19, 2013 abaumhof

A very valid question that comes up all the time is “how do people get infected with malware?” or “how do people lose personal information?” There are so many ways that people are blown away by some of the examples I show them.

Today I came across a nice one again: malicious ads, also known as search engine poisoning. I used Coinbase for some bitcoin activities and I wanted to transfer some bitcoins, so I typed “bitcoin” into Google and this is what came up:

So far so good and everything looks great. I now just click on the first link, as this is an ad where someone pays Google money, and Google not being evil must mean that this is good, right? Wrong.

All visual signs suggest that this is legitimate. The URL goes to google.com, but that should be OK as well, right? A look at the source reveals that the link goes to Google first, then to one URL shortener, then to another URL shortener, and only then to the final destination!
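To make the chain of hops easier to picture, here is a minimal sketch of walking such a redirect chain. The URLs and the `FAKE_HOPS` table are hypothetical stand-ins (the real ad URLs are not reproduced here), and `next_hop` is a pluggable lookup so the example runs without touching the network.

```python
from typing import Callable, Optional
from urllib.parse import urlsplit


def follow_redirects(start_url: str,
                     next_hop: Callable[[str], Optional[str]],
                     max_hops: int = 10) -> list[str]:
    """Return every URL visited, starting with start_url."""
    chain = [start_url]
    while len(chain) <= max_hops:
        target = next_hop(chain[-1])
        if target is None or target in chain:  # end of chain, or a loop
            break
        chain.append(target)
    return chain


def hosts(chain: list[str]) -> list[str]:
    """Reduce each URL to its host name, which is what the user should check."""
    return [urlsplit(url).hostname or "" for url in chain]


# Hypothetical redirect table: ad click -> shortener -> shortener -> phishing page.
FAKE_HOPS = {
    "https://www.google.com/aclk?id=1": "http://short-one.example/a",
    "http://short-one.example/a": "http://short-two.example/b",
    "http://short-two.example/b": "http://wallet-login.example/signin",
}

chain = follow_redirects("https://www.google.com/aclk?id=1", FAKE_HOPS.get)
print(" -> ".join(hosts(chain)))
```

The point of the sketch is that the host shown in the ad is only the first entry of the list; the last entry is where the browser actually ends up.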

After the first URL shortener, we’ll see this!

oops…

Luckily it was already known that this site is up to no good, as this server held a number of “nice” phishing pages designed to steal your bitcoin wallet information. With the current price of over 120 USD for one bitcoin, that could be a very lucrative business.

Some examples are:

Approximately one hour after notifying Google, the malicious ad was gone, but please make sure you double-check where you click.

Uncategorized

How I missed doing Research

September 19, 2013 abaumhof

I can’t believe that the last post on this blog was in 2011… what a ride the last (almost) two years have been…

I have the best intentions to provide more insights into some of the work that I’m doing and hope it is interesting / inspiring to someone.

Malware

how people get infected – or the perfect storm which luckily turned out to be harmless

November 24, 2011 abaumhof

One of the questions that always comes up at conferences and discussions is “how exactly do people get infected with malware?”. The funny thing is that we malware researchers deal with this on a day-to-day basis, and the obvious reply ranges from malicious ad banners to infected homepages/drive-by downloads and phishing. All these techniques unfortunately work great.

One technique that is always a bit overlooked is malicious email, due to the “perception” that spam filters are effective (whatever that means, especially in light of the RSA attack where the initial payload was an email which ended up in the spam folder and people moved it to the inbox to execute the attachment!!!). And my spam filter typically works really, really well…

Well, the other day I received an email from Facebook saying that someone had commented on a photo of me:

The interesting fact was that this email wasn’t in my spam folder, but rather in my inbox. Being one of the over 800 million Facebook users, I thought: cool… let’s check it out. Clicking on the link straight away downloaded a file called “FBviewer.exe”… Yeah, Facebook viewer… makes sense.

Unsurprisingly this doesn’t smell right, so I asked VirusTotal what all the AV engines think of this file. As it turns out, not much: 3 out of 43 AV engines detect something. Two use a heuristic (which may well be a false positive) and one AV detects something, but something completely wrong. The VT link is here: http://www.virustotal.com/file-scan/report.html?id=a34587d5bb473761ad9deb406dbc7515815325ca98d896238b696adf339b43cb-1320717477.

The disturbing fact is that none of the big AV engines detected this, so the “real” detection rate is close to zero.

I’ve no idea how Avast thinks this is Carberp, because this is a Spyeye trojan. When we analyzed the trojan, the C&C server was obviously still active, but luckily didn’t reply with a configuration file, so this instance of the Spyeye trojan was instructed to do nothing.

In conclusion, this was almost a perfect storm: a fairly well written email with a convincing topic that most people would fall for and that made it past the spam filter, followed by the download of a trojan with a virtually zero detection rate and probably a fairly high hit rate, because people really want to find out what’s happening with this photo.

Luckily the trojan didn’t do anything on our system, but hey once you are owned, you are owned.

in-depth report, Malware

Man In The Browser: Mac OS X Edition: 1 of 3

November 16, 2011 abaumhof

1 Introduction

In one of the recent TrustDefender Labs reports (“Man-in-the-Browser 101 or it works as designed (Win & Apple Mac)” – http://www.trustdefender.com/trustdefender-labs-blog-td-labs-blog-man-in-the-browser-101-or-it-works-as-designed-win-aamp-apple-mac.html) we gave a brief overview of what a Man In The Browser attack is all about, and in that discussion we briefly touched on Man In The Browser (MITB) attacks on Mac OS X. But is this really plausible?

The reputation of Apple’s OS has been that it’s more secure and doesn’t suffer from these problems. Well, we have touched on, and debunked, this myth earlier, but the malware examples we have looked at have all been pretty unsophisticated. So maybe there is something in this idea of a more secure OS? Well, given the title of this article, you can guess that’s not true!

We will be looking at three different ways that MITB can be achieved on Mac OS X as well. The most interesting fact is that it is actually fairly easy to do and works exactly the same way as on Windows. Given the fraudsters’ detailed knowledge of MITB on Windows, one can only assume that they have the same knowledge on Mac, and it is only a matter of time before they maintain two completely different versions of their trojan, perhaps with a shared configuration.

2 Browser Hooking

In the previous article, we talked about function hooking, and, although that article was focused on Windows, the concept is exactly the same for Apple’s operating systems. The basic idea is simple. When the web browser calls a system function, we re-route the call to execute our code first.

We have identified three different methods for getting our code to be called before a system function is called. This list is not exhaustive by any means, and we will cover each in turn in a later in-depth report, but the point is there are multiple ways of achieving this goal.

Specifically,

Library overloading using DYLD_LIBRARY_PATH
Function overloading using DYLD_INSERT_LIBRARIES
Code injection

The first two of these use features made available by the dynamic linker, and the last is a more complicated method where we inject some code directly into another process’s memory. However, regardless of which method is used, they all rely on the same basic idea. Rather than directly calling a system function, we call our function first. Like this:

We will cover lazy linking, and explore what this image shows in more detail, in a later article. For now, it’s enough to know that the system used to resolve symbols in shared libraries can be perverted and used to call our code instead. All three of the mentioned methods exploit this mechanism, albeit in slightly different ways. Typically, in this situation, the injected code will do its thing, and then call the original system code, so the system call still does what it needs to for the process to run (at least apparently) correctly.
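The “our code first, then the original” pattern can be sketched in a few lines. This is only a language-level analogy in Python, not the dynamic-linker mechanism itself; `FakeNetworkLibrary`, `read_response` and the `hook` helper are hypothetical names made up for the illustration.

```python
import functools


def hook(original, before=None, after=None):
    """Wrap `original` so our code runs first, then the real function."""
    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        if before is not None:
            before(*args, **kwargs)         # our code gets control first
        result = original(*args, **kwargs)  # the original still does its job
        if after is not None:
            result = after(result)          # optionally post-process the result
        return result
    return wrapper


class FakeNetworkLibrary:
    """Hypothetical stand-in for a shared library the browser calls into."""
    def read_response(self):
        return "<html>original page</html>"


calls = []
lib = FakeNetworkLibrary()
lib.read_response = hook(
    lib.read_response,
    before=lambda: calls.append("read_response called"),
    after=lambda html: html + "<!-- seen by hook -->",
)
page = lib.read_response()
print(calls, page)
```

The caller still invokes `read_response` exactly as before and still gets a working result, which is why this kind of interposition is hard to notice from inside the application.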

2.1 Results

So, now we get to the interesting part! We have the pieces in place; we just need to figure out the appropriate places to hook and inject our code. The details of this aren’t really important here, but suffice to say it’s not a big issue to figure these things out. For MITB malware, we need to find a function that is called to retrieve network traffic, i.e. our HTML content, from the network. And, if we hook at the right place, we can get access to HTML content before it’s encrypted and sent, and also after it’s received and decrypted. This means that SSL doesn’t help protect the content. We can then add whatever content we like… and we can do all this with no special access rights!

This sounds very familiar from all the MITB trojans on Windows such as Zeus, Spyeye, Carberp, Gozi, Mebroot, … And yes, exactly the same thing is easily doable on Apple Mac OS X as well.

So what does it look like?

Note the https url, the lock symbol indicating a secure session… and the extra form field in the sign on section. In case you aren’t familiar with citibank’s sign on page, the relevant section of the same site without our hook looks like this:

Note that we picked Citibank for this exercise simply because they are a well known bank. There is no inherent flaw in their website (that we know of, anyway); it’s a fairly typical banking site, secured with SSL and a certificate from a trustworthy CA (whatever that means these days!).

It’s also worth noting that the code required to inject an extra form element into the page was of medium complexity and highly targeted to this particular website. This was mentioned in an earlier article, which suggested that the meaningful intellectual property is in the configuration file rather than in the code injection method. All major MITB trojans share the same “Zeus style” web-inject configuration format to target many brands at the same time. This is where the real value of a MITB trojan lies and, given that it’s brand-specific, it is not tied to the delivery platform at all.

All three methods outlined in this report will result in the same compromised page where 99% of the content comes from Citibank, but some additional form fields or malicious JavaScript fragments are injected locally – even though EV-SSL is used and the padlock is visible.

The above results are possible using all three methods.

2.2 What next?

Over the next few articles we will look in greater detail at the 3 methods of hooking mentioned above. We will also touch on the security features available in Mac OS X (10.6 and 10.7) and how effective they are for this kind of attack.

in-depth report, Malware

Zeus Code Update Part 2 – Who is behind the recent changes?

November 6, 2011 abaumhof

Executive Summary

In our TrustDefender Labs report last month, we looked at different samples of the notorious Zeus trojan that are based on the leaked source code. We focused on the changes the authors made to the encryption of the configuration files in order to circumvent automated detection tools and to stay undetected for longer. So the analysis was limited to a fairly narrow set of changes.

This time, we look at similar Zeus trojans from a different angle. We consider the entire set of changes and we try and estimate the complexity of those changes, which may give us an insight into the level of understanding shown by the respective authors when altering the Zeus source code.

Much to our surprise we were able to conclude that these changes were made either by the original Zeus author or by authors who know the source code very, very well and are specialists in developing malware.

In any case, the release of a series of versions, with quite substantial changes, shows that whoever is working on the code is quite capable of making extensive and sometimes fundamental changes in a short space of time.

We can therefore expect the pace of development to continue into the future, and even if the original author is no longer working on the code, we expect further versions to emerge at short intervals.

Methodology

Rather than looking in minute detail at every routine, which would be painstaking, and take too long to be viable, we look at the overall interactions between functions – which functions call other functions, and we then map that call pattern back to the reference copy of Zeus at the time of the source code release. This lets us see which new functions have been introduced, which have been deleted and which have been changed.
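As a rough illustration of that comparison, the sketch below diffs two call maps. It assumes each sample has already been reduced to a dictionary of routine name to called routines and that routines in both samples have been matched up under the same names; the names shown are hypothetical, and our actual tooling was a set of IDA scripts and Perl programs rather than this code.

```python
def diff_call_graphs(reference, sample):
    """Compare two {routine: [callees]} maps.

    Returns (added, removed, changed), each a sorted list of routine names.
    A routine counts as changed when the set of routines it calls differs.
    """
    ref_names, new_names = set(reference), set(sample)
    added = new_names - ref_names
    removed = ref_names - new_names
    changed = {
        name for name in ref_names & new_names
        if set(reference[name]) != set(sample[name])
    }
    return sorted(added), sorted(removed), sorted(changed)


# Hypothetical call maps for a reference copy and a modified copy.
reference = {
    "Core::init": ["Crypt::rc4", "Registry::read"],
    "Crypt::rc4": [],
    "Registry::read": [],
}
sample = {
    "Core::init": ["Crypt::aes", "Registry::read"],
    "Crypt::aes": [],
    "Registry::read": [],
}

added, removed, changed = diff_call_graphs(reference, sample)
print("added:", added, "removed:", removed, "changed:", changed)
```

Note that a routine whose internal logic changes without altering what it calls will not show up in `changed`, which is the limitation described further down.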

This is a much smaller problem, is possible to automate, and still gives us a good feel for the changes made. There are still some challenges – for instance, the way Zeus is built means that sometimes several functions are folded into one, and sometimes they are not. This is done by the Zeus author to evade detection from signature scanners. Also, some functions are interchangeable, so any of a set may be called. Lastly, compile-time defines can be used to include or remove functionality, possibly so that some functions can be sold at a premium.

However, we can compensate for all these.

Once this is done, we can then take a more detailed look at a smaller set of functions that stand out as a result of this analysis.

This approach does not pick up all changes – for instance, small changes in logic in functions do not affect the flow, and may not therefore be picked up. However, even with this limitation, the results are useful.

The IDA Pro program was used extensively in this analysis, as this has a great deal of functionality enabling the cross referencing of functions.

Benchmark Copy

We used a benchmark copy of Zeus to compare the other copies to. The copy was chosen to be very similar to the copies of Zeus created by compiling the source code. However, it is an in-the-wild copy – we did not create it by compiling the source code.

It is though not an exact map to the source code, and therefore we also took care to compensate for this in our analysis.

ICE IX

The copy of ICE IX we used had MD5 checksum of 62f770d7db6dd6825b793ec5c456d7e2 and a size of 99840 bytes.
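For readers who want to check a sample against the checksums quoted in this report, a minimal sketch of computing the MD5 and size of a file follows. The helper name `md5_and_size` is ours, and the demo hashes a small temporary file rather than a real sample.

```python
import hashlib
import os
import tempfile


def md5_and_size(path, chunk_size=65536):
    """Return (hex MD5 digest, size in bytes) of the file at `path`."""
    digest = hashlib.md5()
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so large samples do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return digest.hexdigest(), size


# Demo on a throwaway file containing the three bytes "abc".
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
    demo_path = tmp.name

checksum, size = md5_and_size(demo_path)
os.remove(demo_path)
print(checksum, size)
```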

When we analysed this copy we found it was generated from 58 source modules which contained 634 routines. All of the source modules were present in the original Zeus source. Of the 634 routines, only one of them could be said to be new code. All the others were from the original Zeus source code, and although there may have been minor logic changes to these, in essence, the code remains the same.

The one routine added was not complex, being a string manipulation routine to convert a 16 byte hash to an ASCII string.

We would have to class ICE IX as containing only the most basic of changes, which could be done with little understanding of how the Zeus code fits together.

In addition, though we did not verify this ourselves, it has been reported elsewhere that the author has also introduced a string handling bug which causes a memory leak.

Ironically, this sample turned out to be closer to the Zeus source code than the representative sample we had been previously using.

RC4 replaced with AES

The next copy of Zeus considered is the one where the RC4 algorithm was replaced by AES. The sample used had an MD5 hash of d67d38800d6463d3db835f64224654e6 and a size of 200192 bytes.

The sample contained 655 routines in 59 modules. There was one new module introduced, which contained 11 functions used to disguise kernel calls.

In total there were 23 new functions. The other new functions were added to existing modules: 3 were added to the CoreInject module to handle thread injection, and 9 to the Crypt module to replace the RC4 algorithm with AES.

This is a much bigger set of changes than with ICE IX.

However, there are still some interesting points.

As noted in our previous paper, there are some anomalies with the way the AES algorithm is called. AES is only ever used to encrypt 16 bytes at a time, never the entire block to be encrypted. This suggests that the author was unfamiliar with the AES code which is imported. The likelihood is that it was found somewhere on the internet and just copied in. The net effect is that the new encryption is technically weaker than the old.

The new module disguises the way that kernel calls are made. It does this by pushing Kernel module names onto the stack, and then searching the stack to match the names with hash values. This is nothing new – malware has been doing this for a long time; indeed, I remember using this as a technique to detect if a program is malicious several years ago. This therefore adds nothing to the security of the malware, and makes it more likely, not less likely that it will be detected by security programs. It also does not fit in with the ‘ethos’ of the rest of the malware. Everywhere else, if a string needs to be disguised, then the CryptedStrings routine is used. The disguise is also not used consistently. Only a few kernel calls are protected in this way, with no obvious reasons why these should be selected over other system calls. In short, this may be another piece of pre-existing code that the author has imported.
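From the analyst’s side, undoing this kind of disguise usually comes down to hashing a list of candidate names and looking the stored values up. The sketch below shows the idea; the rotate-right-by-13 hash is an assumption chosen because it is a common choice in malware, not the hash this sample actually uses, and the function names are hypothetical.

```python
def ror13_hash(name: str) -> int:
    """Assumed 32-bit name hash: rotate right by 13, then add each byte."""
    value = 0
    for byte in name.encode("ascii"):
        value = ((value >> 13) | (value << 19)) & 0xFFFFFFFF
        value = (value + byte) & 0xFFFFFFFF
    return value


def resolve_by_hash(wanted_hashes, candidate_names):
    """Map each stored hash back to a candidate name (None if unknown)."""
    table = {ror13_hash(name): name for name in candidate_names}
    return [table.get(value) for value in wanted_hashes]


candidates = ["ntdll.dll", "kernel32.dll", "user32.dll"]
stored = [ror13_hash("kernel32.dll"), 0xDEADBEEF]  # second value matches nothing
print(resolve_by_hash(stored, candidates))
```

Because the lookup is this cheap, hiding names behind hashes buys the malware very little against anyone who already knows the technique.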

Lastly, the three thread handling routines are very similar to the old thread handling, but use the new module to disguise some of the system calls.

Overall, more knowledge of the Zeus architecture was needed to make these changes than for ICE IX, but they are still not substantial.

Registry Storage Version

The sample used had an MD5 hash of 8807fbdc494e946e25bfdad74cd756d9 and a size of 169984 bytes. It was obtained from the wild – the only one of the three samples where this was the case.

This version of Zeus changes the way that data is stored in the registry and as other researchers have noted, also uses a peer to peer network as its command and control infrastructure.

It contains 79 modules and 788 routines. It is therefore immediately apparent that the changes are on a whole different scale than the other two samples, with 21 new modules and around 241 new routines.

As well as the new functionality, some of the existing modules were almost completely rewritten, most notably the socket handling. Altogether, 110 old routines were retired, and 137 routines were altered.

There has also been some rationalising of code, with the hash routines and random number handling code moved out of the Crypt module into their own modules, and the PESETTINGS code moved from the Core module into its own module.

The new routines incorporate quite substantial changes. A brief summary follows.

The registry code has been altered to change the way registry subkey names are generated. Previously, only three keys were used, their names were randomly generated, and a note of the keys used was stored in encrypted form in the PESETTINGS area. Now the keys used are string representations of base 20 numbers. There is a complex algorithm, described in our previous paper, used to generate these numbers. This allows more data to be stored in the registry, which may have been needed as a result of the changes to the communications architecture.
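To show what a “string representation of a base 20 number” looks like, here is a minimal conversion sketch. The digit alphabet (0-9 followed by a-j) is an assumption for illustration, and the algorithm that picks which numbers to use is not reproduced here.

```python
DIGITS = "0123456789abcdefghij"  # assumed alphabet: 20 symbols


def to_base20(number: int) -> str:
    """Render a non-negative integer as a base 20 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return DIGITS[0]
    out = []
    while number:
        number, remainder = divmod(number, 20)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


for value in (0, 19, 20, 399, 123456):
    print(value, "->", to_base20(value))
```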

The communications architecture has been changed to use a peer to peer network. Other researchers have noted that this appears to be based on the Kademlia protocol. This means that the botnet is more robust, and cannot be taken down so easily. This change will also require a change to the C&C code to also use the new peer to peer architecture.

The Mersenne Twister algorithm has been added to the random number code. This may have been copied from code available on the internet – a Google search found the exact same code. It is not clear that the author understands the purpose of the algorithm. It is intended to generate random number sequences. However, in the code it is just used to transform the random number generated by the original code into a different random number.
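The difference between the two usages can be sketched with Python’s `random` module, which is itself built on the Mersenne Twister. The function names are ours, and this only mirrors the pattern described above, not the code in the sample.

```python
import random


def sequence(seed: int, count: int) -> list[int]:
    """Intended use: seed once, then draw a whole sequence of numbers."""
    generator = random.Random(seed)
    return [generator.getrandbits(32) for _ in range(count)]


def transform_once(value: int) -> int:
    """Usage seen in the sample: feed one number in, take one number out."""
    return random.Random(value).getrandbits(32)


print(sequence(2011, 4))
print(transform_once(2011))
```

Used the second way, the generator is just a fixed mapping from one number to another, so the output is no more random than the number that went in.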

The crypt code has been changed to allow for the easier use of different encryption algorithms. Currently this is not actually used, so may be a preparatory architecture change for future versions.

Several new kernel calls are hooked:

nspr4.dll::PR_Poll
wininet.dll::InternetSetOptionA
wininet.dll::InternetSetStatusCallback
wininet.dll::InternetSetStatusCallbackW
ws2_32.dll::WSARecv
ws2_32.dll::recv

The HTTP inject mechanism has support for regular expression matching added. This also means that the way that configuration data controlling this activity is stored is changed. Previously this was stored in the configuration area using an ID of 20007. The new data is stored in a different format using ID 30003. Items are separated using the ‘magic’ data ‘ERCP’. This may stand (when little endian reversed) for Perl Compatible Regular Expression. Items with ID of 30004 and 30005 were also added. These changes will also need a change to the Zeus configuration builder code, so that the new formats of configuration data can be packaged.
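The little endian reversal is easy to check: reading the four bytes ‘ERCP’ as a little endian 32-bit value and writing that value out most significant byte first gives ‘PCRE’. A small sketch:

```python
import struct

magic_in_config = b"ERCP"

# Interpret the four bytes as a little endian 32-bit integer.
as_dword = struct.unpack("<I", magic_in_config)[0]

# Written most significant byte first, the same value spells the constant
# as a programmer would have typed it in source code.
as_typed = as_dword.to_bytes(4, "big")

print(hex(as_dword), as_typed)
```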

Code has been added to rebuild the import table, and also to adjust it after any kernel hooks are removed. Code has also been added to adjust any relocation entries during the routine decryption process; without this, the decryption would fail if the program is loaded anywhere other than 0x400000.
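The arithmetic behind a relocation adjustment is simple: every absolute address is shifted by the difference between the actual load address and the preferred one. The sketch below assumes 32-bit addresses and a preferred base of 0x400000; the `rebase` helper and the sample addresses are hypothetical.

```python
PREFERRED_BASE = 0x400000


def rebase(addresses, actual_base, preferred_base=PREFERRED_BASE):
    """Shift absolute 32-bit addresses by the load-address delta."""
    delta = actual_base - preferred_base
    return [(address + delta) & 0xFFFFFFFF for address in addresses]


pointers = [0x401000, 0x40A2B4]
print([hex(p) for p in rebase(pointers, 0x400000)])    # loaded where expected
print([hex(p) for p in rebase(pointers, 0x10000000)])  # loaded elsewhere
```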

There is some special code to handle two security urls and set a cookie:

https://*/ebc_ebc1961/ebc1961.asp*=RemoteLogon*
https://securentrycorp.*/Authentication/zbf/index?*domainId=*
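To see which kinds of URL these masks would catch, the sketch below turns each mask into a regular expression. It assumes ‘*’ is the only wildcard and matches any run of characters, which is our reading of the masks rather than confirmed trojan behaviour; the test URLs are made up.

```python
import re

MASKS = [
    "https://*/ebc_ebc1961/ebc1961.asp*=RemoteLogon*",
    "https://securentrycorp.*/Authentication/zbf/index?*domainId=*",
]


def mask_to_regex(mask: str) -> "re.Pattern[str]":
    """Escape everything literally, then let each '*' match any characters."""
    parts = [re.escape(part) for part in mask.split("*")]
    return re.compile(".*".join(parts), re.DOTALL)


def matching_masks(url: str) -> list[str]:
    return [mask for mask in MASKS if mask_to_regex(mask).fullmatch(url)]


print(matching_masks("https://bank.example/ebc_ebc1961/ebc1961.asp?action=RemoteLogon&x=1"))
print(matching_masks("https://securentrycorp.bank.example/Authentication/zbf/index?x=1&domainId=7"))
print(matching_masks("https://bank.example/login"))
```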

The overall impression is that there are a large number of changes that have been made to quite different areas of the program, and that the person doing this was very familiar with the architecture of Zeus. The changes made are sympathetic to the way Zeus ‘does things’.

Other copies of Zeus

During the investigation, we also noticed that several routines in this copy of Zeus were also present in earlier copies we got from the wild. This suggested some other lines of research. It’s possible to get some idea of the evolution of malware by looking at the changes from version to version.

If we see two copies with the same changes, then this will not have occurred by chance. The most likely scenarios are that either the same author has produced the two versions, or that two groups of malware writers are sharing source code with each other.

We were able to find four in-the-wild copies of Zeus which showed a distinct evolution. The first copy was from June 2011. The next copy was the copy under analysis, which was from September 2011. Two further copies were from October 2011.

The June 2011 copy contained 700 routines. It changed the way the HTTP inject mechanism works, and the way that configuration data controlling this activity is stored, introducing ID 30003 and others. The way the botnet C&C is contacted also changed, with alternative URLs being generated and tried on a daily basis if the original C&C becomes unavailable.

The September 2011 copy contained 788 routines. It added base 20 registry key naming and peer-to-peer networking.

The two October 2011 copies are still awaiting full analysis. However, they added a third routine to the collection of encrypted subroutines. This requires corresponding changes to the builder program that creates Zeus, and is not a trivial change. They contained 833 and 823 routines.

Regular Expression Handling

The fact that the copies all contain regular expression handling is very interesting. The version of source code released was 2.0.8.9. However, regular expression handling was only added to Zeus in version 2.1, which was released around October 2010.

We therefore went back again in our archives and located a copy of the 2.1 Zeus release to compare the regular expression handling. It was effectively identical.

This means that someone out there has the 2.1 source of Zeus, which was never publicly released, and is still actively working on it.

Is the original Author Still Active?

From the changes we see that there is intense ongoing development with this branch of Zeus.

We now stray into the realms of conjecture. Someone has access to the source code for the 2.1 version of Zeus, which was never publicly released. The person making these changes is obviously very familiar with the Zeus architecture and has made complicated changes. The replacement of the communications architecture with the peer to peer architecture is a non-trivial change and requires back end changes to the C&C as well. The addition of a third encrypted subroutine requires changes to the build process. The addition of a new format for the configuration inject data requires changes to the configuration packaging code. These indicate the author has an understanding of the whole Zeus package, not just the malware client itself.

The way the changes are made is sympathetic to Zeus. The modules added contain small numbers of routines and are logically organised. The CryptedStrings routines are used to hide interesting strings from casual view. Often when a new person takes over maintenance of code, they start doing things in their own preferred way which is different to the original author. However, this does not appear to be happening here.

The copies are in the wild, suggesting a buyer or buyers are readily available.

The Zeus source code was released around the weekend of 7th May. The first version of this new branch of Zeus was detected at the beginning of June 2011 – less than one month later. This is a very short space of time for someone new to Zeus to read and understand the code, and then to create such an extensive derivative work.

We have to wonder therefore, if this is actually still the work of the original author. He announced his retirement, and released the source code. However, this may have just been a smokescreen. Certainly, whoever is continuing to maintain the Zeus code is very familiar with the whole Zeus environment, not just the malware itself. It could also be someone who has worked closely with the original author in the past, or who the author has trained up to take his place.

Are there any signs that this might not be the original author? Well, possibly. In the June 2011 version, the new configuration data handling introduced two memory handling subroutines that were placed in their own module, and not in the standard memory handling module. The fragmenting of routines into different modules is also arguably finer-grained than with the original code.

As with all such guesses, only time will tell. All we can do now is sit back and wait for further information.

Conclusion

The first two copies analysed have fairly trivial changes which are (in my opinion) not the work of the original author.

The third copy is far more interesting. There is a case to be made from the availability of 2.1 source code, the complexity of changes, and the short period between the release of the source code and the emergence of versions of this variant, that the original Zeus author is still at work.

This is still conjecture at this point, and it may also be due to other authors familiar with coding this type of malware.

We can therefore expect the pace of development to continue into the future, and even if the original author is no longer working on the code, we expect further versions to emerge at short intervals.

Applications used

The applications used to research this document were:

IDA PRO 6.2
ActiveState Perl
010 Hex Editor
Editplus text editor

A series of IDA scripts and perl programs were used to auto-analyse each Zeus sample and create a cross-reference listing of routines. A further series of perl programs were then used to analyse and compare the cross reference listings.

in-depth report, Malware

New IceIx (Zeus variant) changes its encryption method (again)

October 20, 2011 abaumhof

Executive Summary

In our last in-depth report we looked at enhancements of the notorious Zeus trojan that focus solely on making life harder for automated detection tools and tracking software. We looked at three variants that are based on the leaked source code. The fear is that a proliferation of too many different variants will make it harder to detect and track the various trojans.

One of the variants is called IceIx and on October 13 we noticed the presence of a new in-the-wild IceIx Zeus variant. We therefore decided to take a brief look. The sample had a size of 169,984 bytes and an MD5 of ed34b46a4524c7d05e45200eaf09f765. It contained exactly the same number of routines as the previous variant, 634. There were minor changes to around 20 of the routines; the rest were unchanged.

As you’ll see below, the changes are minimal, but the result is that all automatic decryption routines will fail!

Well, it seems that the bad guys continue to “tweak” the encryption algorithm and the arms race continues up until we finally implement some proactive solutions rather than reactive countermeasures.

RC4 Variant

This version of Zeus is very similar to the initial ICE IX release, but has the RC4 algorithm tweaked. In the standard algorithm, a 256 byte ‘state’ table is initialised with the RC4 key, and the data is then decrypted as follows.
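For reference, the standard (unmodified) RC4 algorithm can be sketched as follows, using the same x and y names as the text. This is textbook RC4 only; the tweak made by this variant is not reproduced here.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4: the same call both encrypts and decrypts."""
    # Key scheduling: initialise the 256 byte state table with the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) & 0xFF
        state[i], state[j] = state[j], state[i]

    # Keystream generation: x and y walk the table and are XORed into the data.
    x = y = 0
    out = bytearray()
    for byte in data:
        x = (x + 1) & 0xFF
        y = (y + state[x]) & 0xFF
        state[x], state[y] = state[y], state[x]
        out.append(byte ^ state[(state[x] + state[y]) & 0xFF])
    return bytes(out)


ciphertext = rc4(b"Key", b"Plaintext")
print(ciphertext.hex())
print(rc4(b"Key", ciphertext))
```

Any change to how x and y are updated produces a different keystream, which is why decryption tools built on the standard algorithm stop working even though the change itself is small.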

The modified algorithm changed the way the x and y values are altered.

The areas affected in the decryption process are indicated in red in the diagram (please see our previous paper[1] for an explanation of the processes).

Although the areas affected seem extensive, in fact only a trivial change to the RC4 algorithm is needed for a security researcher to be able to decrypt the configuration file.

Notes on altering a Cryptographic Algorithm

For a layman to alter a cryptographic algorithm is usually bad. Seemingly trivial changes may introduce flaws that a cryptographer can exploit, and the chances are that the resulting output may be less resistant to cryptographic attack.

For a Zeus author, a slight lessening of cryptographic resistance probably does not matter. After all, the key and algorithm are provided with the Zeus source anyway. The end goal is to make life harder for a malware analyst, not a cryptographer, and so in this case the trade-off is presumably acceptable to the author.

Configuration file download

The previous variant added extra parameters, ‘id’ and ‘hash’ to the HTTP command used to get the configuration data. This version changes the names of the parameters to ‘bn’ and ‘sk’.

The configuration file available at the ZeusTracker site has a version number 0x01010500.

Botnet

The botnet name used by this variant is rotewa11.

Summary

It seems as though this Zeus author has found at least one buyer even though his changes to Zeus are minor. It seems that advertising does pay, even for malware.

This is the first time we have seen a Zeus variant using an altered version of a standard cryptographic algorithm, although no doubt it will not be the last.

[1] http://www.trustdefender.com/zeus-trojan-update-new-variants-based-on-leaked-zeus-source-code.html

Curiosities, Malware, Strategy

Research about “Why isn’t everyone hacked every day” also applies to the fraudsters

October 5, 2011 abaumhof

Ok, so I was reading the article by Michael Kassner “Why isn’t everyone hacked every day” (http://www.techrepublic.com/blog/security/why-isnt-everyone-hacked-every-day/6633) which talks about a paper by Cormac Herley and Dinei Florencio about “Where Do all the attacks go?” (http://research.microsoft.com/pubs/149885/WhereDoAllTheAttacksGo.pdf)

In short the paper gives a plausible explanation of why the internet still works at large even though the security state is a mess (well, not exactly, but almost).

One of the key statements that caught my eye was “Thus, how common a security strategy is, matters at least as much as how weak it is”.

Basically this means that if you are a bank and you deploy the same security solution as everyone else, you are more likely to be hit, as the fraudsters can just reuse an attack vector.

However, it occurred to me that this is obviously also true for the bad guys: if all the bad guys were using the Zeus trojan to perpetrate their crimes, the good guys could obviously be much better prepared to defeat the attacks.

If the bad guys are using a new trojan (with new configuration files, encryption, hooking, …) all the time, then the good guys have a much harder time to provide the same level of protection.

If you use any run-of-the-mill Zeus trojan service, chances are high that you’ll be detected pretty quickly through zeustracker and the many services, provided by many companies, that give financial institutions an early warning.

However if you either use a different trojan or alter the trojan to make the detection and decryption harder (such as the Zeus variants based on the leaked Zeus source code - http://www.tidos-group.com/blog/?p=429), the chances of not being detected will increase. We have seen this over and over again and when Carberp hit the scene, it made a big impact because nobody really knew what was happening and how things worked.

So while the paper by Herley and Florencio talks about the security industry, the same thing applies to the fraudsters as well, and as such I’m sure we’ll always see a proliferation of lots of different trojans for exactly this reason.

in-depth report, Malware

Zeus Trojan Update – New Variants based on leaked Zeus Source Code – TrustDefender indepth report

September 30, 2011 abaumhof Leave a comment

This is a blog post featuring the TrustDefender indepth report for “Zeus Trojan Update – New Variants based on leaked Zeus Source Code” by Alex Shipp / Andreas Baumhof

1 Introduction

When the source code of the Zeus Trojan (v.2.0.9.8) leaked into the public in April this year, it was clear that this would have some serious implications for the security industry. At the time, there was speculation that this would result in a large number of new variants, as malware writers got hold of the code and started work on their own versions.

After a period of silence, we have seen at least three new variants based on the leaked Zeus source code appearing within the last couple of weeks. None of the three variants modified the core of the Zeus code; all of them focused on AV evasion and making sure that security researchers/tools cannot easily decrypt the configuration files.

The configuration files define what a Zeus Trojan does, and are therefore the holy grail to each Trojan.

In this report, we look in detail at these new variants and the changes they introduced.

The Zeus Trojan is complicated, with more than 600 subroutines. Rather than examine the entire code for changes, this research just looks at the processes involved in obtaining a decoded configuration file. This is a useful benchmark for a researcher, because the information we are usually interested in is which sites are under attack by a particular copy of Zeus, and any custom code used in those attacks. Both these pieces of information are contained in the configuration file, which is encrypted.

2 Executive Summary

The three variants that were released within just a few weeks are:

ICE IX
- This version of Zeus is claimed by its author to make life harder for tracking companies by making it more difficult for them to download the configuration file.
- This is clearly a stab at ZeuS Tracker and all security companies that try to monitor configuration files and decrypt them to tell a brand whether it is exposed.
Registry Storage Version
- This version stores the configuration file in different locations and also attempts to clear the encryption keys in memory.
- Many configuration decryption tools search for the RC4 keys in memory which this version clearly tries to avoid.
RC4 replaced with AES
- The RC4 encryption algorithm has been replaced by AES.
- This means that all the existing tools don’t work anymore and have to be rewritten.

Three different authors have developed strains of Zeus with their own ideas of how to improve the malware. However, in each case there have been no fundamental changes to the architecture, but just small changes which are more in keeping with the way Zeus has evolved itself. With a small amount of effort, it is still possible for security researchers to obtain and decrypt the configuration files.

The worst-case scenario of a proliferation of uncrackable Zeus variants has not therefore materialised. This may still be a question of time. The effort needed for a large scale rewrite of Zeus may mean that we have to wait a while longer for the nastier variants to arrive. Alternatively, it might also mean that basing your malware on code to which security researchers also have access is a bad move for malware authors. They may have decided it is better to write their own malware from scratch, and not reuse code which may be considered compromised.

Lastly, old versions of Zeus still proliferate. Many of the samples we see are still based on the good old original source!

The picture above summarises the changes in the three versions. More detail is supplied in the following notes.

4 The Zeus Trojans in detail

4.1 Benchmark Copy

To start with, we take a brief look at a benchmark copy of Zeus to see how the configuration file could be decoded at the time of release of the source code. The benchmark copy chosen was very similar to the copies of Zeus created by compiling the source code. The following diagram summarises the steps needed to decrypt the Zeus configuration file.

Zeus Protection Algorithm

The diagram shows just how complicated Zeus has become in order to try and protect its configuration file from security researchers.

1. Zeus applies a permutation algorithm to its ‘BaseConfig’ data area to create a key used by the RC4 encryption algorithm.
2. This key is used to search the Zeus exe for a marker used to store another data area, the ‘Overlay’ area.
3. Once found, the key is then used to decrypt the entire Overlay data area, recovering data identified as INSTALLDATA.
4. This INSTALLDATA data area contains an XOR key and size information used to decrypt code which Zeus needs to carry on its execution.
5. Zeus now copies itself to a new location, updates the Overlay area with new data it calls PESETTINGS, and starts the new copy.
6. Zeus XORs its ‘BaseConfig’ data with another data area to extract a second RC4 key, and a URL.
7. The URL is used to download the Zeus configuration file.
8. The RC4 key is used as the first stage in decryption.
9. An XOR based algorithm called ‘VisualDecrypt’ is used to complete the decryption. The parameter file is now available, but is still stored in an internal compressed Zeus format.
10. The UCL decompression algorithm is used to decompress parts of the configuration file, which can now finally be viewed by security researchers.
11. The new copy repeats steps one to three to decrypt the Overlay area and recover the PESETTINGS data. A new RC4 key from the PESETTINGS data is used to re-encrypt the configuration file from step 8 and store it in the registry. Zeus can now use this local copy rather than downloading the configuration data each time.
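As a rough illustration of steps 8 and 9, the sketch below chains a textbook RC4 implementation with a chained-XOR layer. Two things here are assumptions rather than findings of this report: the XOR layer is written in the form commonly described for Zeus 2.x (each byte XORed with the encrypted byte before it), and RC4 is keyed with a raw key via the standard key schedule, which may not match how a given sample stores its key material. The function names are ours, and the UCL step is left out.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4: key scheduling followed by keystream XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def visual_decrypt(data: bytes) -> bytes:
    """Assumed XOR layer: walk backwards, XORing each byte with its predecessor."""
    buf = bytearray(data)
    for i in range(len(buf) - 1, 0, -1):
        buf[i] ^= buf[i - 1]
    return bytes(buf)


def visual_encrypt(data: bytes) -> bytes:
    """Inverse of visual_decrypt, included only so the sketch can be round-tripped."""
    buf = bytearray(data)
    for i in range(1, len(buf)):
        buf[i] ^= buf[i - 1]
    return bytes(buf)


def decrypt_config(rc4_key: bytes, blob: bytes) -> bytes:
    """Step 8 (RC4) followed by step 9 (XOR layer); step 10 (UCL) is not covered."""
    return visual_decrypt(rc4(rc4_key, blob))
```

The output of `decrypt_config` would still be in the compressed internal format, so a UCL decompressor is needed before the file is readable.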

Even though this is incredibly complicated, this is still a simplification of the actual process. The Zeus builder kit can vary the order and locations of all the data areas within the BaseConfig and Overlay areas so that recompiling the same source code can create a version with different characteristics.

Furthermore, most copies of Zeus are packed with one or more packers, which means that several unpacking steps need to be taken before the real work can begin.

However, we can make some simplifications as well. We said earlier that one of the goals of the security researcher was to decrypt Zeus configuration files. If this is all we need to do, then we can miss out most of the steps. Starting with a copy of Zeus and a configuration file, we only need to carry out steps 6, 8, 9 and 10 to end up with a decoded parameter file. If we don’t have a parameter file, we can carry out step 7, or even run the copy of Zeus in a virtual machine and capture the network traffic as it downloads a copy. We can use either copy of Zeus for this purpose – the original, or the new copy it creates.

The stages coloured red in the diagram illustrate this.

Most of the time, this is the situation that security researchers are in. In some cases the researcher might have available a compromised machine, and with it, the copy of the configuration stored in the registry, plus the changed copy of Zeus. In this case it is a little more difficult, but the parameter file can be recreated by following steps 1, 2, 3, 11, 9 and 10. Here, the copy of Zeus used must be the changed copy. The RC4 key generated in the Overlay area is unique to the machine that Zeus is on, and no other copy will therefore work. Again, the stages coloured red in the diagram illustrate this.

We can now take a look at the new versions of Zeus to see how they have affected this process.

4.2 ICE IX

This version of Zeus is claimed by its author to make life harder for tracking companies by making it more difficult for them to download the configuration file. This is claimed to prolong the life of the botnet, giving a greater return on investment. Analysis of the code shows that this is done by changing the format of the URL used to fetch the configuration data. Extra data is appended which is checked for correctness before the Zeus server will return the configuration file.

Whether this will cause the tracking companies the difficulties the author envisages is a question for consideration. The tracker site zeustracker.abuse.ch already has a blog post on the new format[1], which suggests it did not give them any particular difficulty. In any case, no matter how difficult the author makes his algorithm, the copy of Zeus can always be executed so that it will download its configuration file. This suggests that any arms race the author may choose to pursue is essentially futile.

There are other changes in this version of Zeus, also designed to make it more attractive to potential purchasers[2], but these are outside the scope of this paper.

The areas of change are indicated in red in the following diagram.

4.3 Registry Storage Version

This version of Zeus changes the way that data is stored in the registry. The benchmark version uses 3 subkeys with randomly generated names and stores the names in the Overlay area. This version uses many more subkeys. To avoid having to expand the Overlay area, the subkey names are generated using an algorithm which eventually produces a base 20 number which is used as the subkey name.
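To make the idea concrete, here is a minimal sketch of turning a number into a base 20 string that could serve as a subkey name. The 20-character alphabet, the `subkey_name` helper and the way it mixes a seed with an index are all hypothetical, chosen only to show why generated names need no storage in the Overlay area; the actual algorithm in this variant is not reproduced here.

```python
ALPHABET = "abcdefghijklmnopqrst"  # hypothetical 20-symbol alphabet


def to_base20(value: int) -> str:
    """Encode a non-negative integer as a base 20 string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return ALPHABET[0]
    digits = []
    while value:
        value, rem = divmod(value, 20)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def subkey_name(seed: int, index: int) -> str:
    """Hypothetical derivation: mix a per-machine seed with the subkey index."""
    mixed = (seed * 31 + index) & 0xFFFFFFFF
    return to_base20(mixed)
```

Because the name is recomputed from the seed and index each time, any number of subkeys can be addressed without recording their names anywhere.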

It is not clear if this has been done to make life harder for security researchers. It may simply be a new scheme to allow for the easy expansion of registry keys used.

The registry data is no longer protected by the VisualDecrypt algorithm. This might be because the author decided that, since the 100 bit RC4 algorithm should take longer than the lifetime of the universe to crack[3], there was not much point in adding extra protection with an XOR layer.
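The brute-force claim can be checked with some back-of-the-envelope arithmetic. The guess rate below is purely an assumption for illustration (one billion keys per second), and the age of the universe is taken as roughly 13.8 billion years.

```python
KEY_BITS = 100
GUESSES_PER_SECOND = 1e9            # assumed attacker speed, not a measured figure
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10     # approximate


def years_to_exhaust(key_bits: int, guesses_per_second: float) -> float:
    """Years needed to try every key of the given length at the given rate."""
    return 2 ** key_bits / guesses_per_second / SECONDS_PER_YEAR


ratio = years_to_exhaust(KEY_BITS, GUESSES_PER_SECOND) / AGE_OF_UNIVERSE_YEARS
```

Under these assumptions, exhausting the key space takes on the order of thousands of times the age of the universe, so an extra XOR layer adds little against a brute-force attack.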

Some attempt is also made to clear the RC4 keys in memory after usage. This may help protect against memory probes attempting to locate the key while the program is running. However, it does nothing to protect against static analysis. Also, the coverage is not complete, and some copies are not destroyed; and of course, the keys are still in memory while they are being used.
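The limitation of incomplete coverage can be shown with a toy example. This is not the malware's code, just a sketch with made-up key bytes: wiping one buffer does nothing for a copy that was made elsewhere.

```python
def wipe(buf: bytearray) -> None:
    """Overwrite a mutable buffer with zeros in place."""
    for i in range(len(buf)):
        buf[i] = 0


key = bytearray(b"\x13\x37\xc0\xde")  # made-up key material
stray_copy = bytes(key)               # a copy taken elsewhere in the program
wipe(key)
# key is now all zeros, but stray_copy still holds the original bytes
```

A memory probe that finds `stray_copy`, or that looks while the key is in use, still recovers the key.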

Lastly, the configuration file location is no longer stored in the same way. Instead of a single hardcoded URL, the BaseConfig data contains a series of IP addresses and ports. These are contacted on a rolling basis via a UDP based protocol. This makes it harder for security researchers to download the configuration file, although as mentioned earlier, running the bot and making it download the file itself is always an option.

As usual, changes are indicated in red.

4.4 RC4 replaced with AES

The final copy of Zeus we look at has the RC4 algorithm replaced by AES.

This replaces one algorithm uncrackable in the lifetime of the universe with another, also uncrackable in the same time span. It therefore adds nothing to the theoretical safety of the configuration file, although of course it does require security researchers to add AES to their arsenal of tools, if they have not already.

Indeed, there are some minor flaws in the author’s implementation of AES, which render it less secure than it could be, and possibly less secure than the RC4 version. However, in practice they probably still leave the configuration file uncrackable with readily available technology[4], and the author can always trivially correct these flaws in the next version.

There is one area where security is enhanced. The code which was protected in step 5 by a 4 byte XOR key is now protected by AES. The XOR key was trivially easy to deduce, because the protected code did not differ much from one version of Zeus to another. Thus, by knowing what 4 of the bytes probably should be, the key can be determined in only a few tries. Encrypting with AES gives a much higher degree of protection. However, only two start-up routines are protected in this way, and knowledge of this code is not really necessary for decryption of the configuration file, or for understanding the main routines of Zeus. As before, changes are indicated in red in the following diagram.
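The weakness of the 4 byte XOR key is a classic known-plaintext attack, sketched below. The helper names are ours, and the sample bytes (a typical x86 function prologue) and key are illustrative guesses rather than bytes taken from a Zeus sample.

```python
def xor_repeat(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key (the same call encrypts and decrypts)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def recover_key(ciphertext: bytes, known_plain: bytes,
                offset: int = 0, key_len: int = 4) -> bytes:
    """Recover a repeating XOR key from key_len known plaintext bytes at offset."""
    if len(known_plain) < key_len:
        raise ValueError("need at least key_len known plaintext bytes")
    key = bytearray(key_len)
    for i in range(key_len):
        pos = offset + i
        # Each known byte reveals the key byte aligned with its position.
        key[pos % key_len] = ciphertext[pos] ^ known_plain[i]
    return bytes(key)


# Illustrative data: guessed code bytes and a made-up key.
code = bytes([0x55, 0x8B, 0xEC, 0x83, 0xEC, 0x10, 0x53, 0x56])
secret = b"\xde\xad\xbe\xef"
protected = xor_repeat(code, secret)
guessed_key = recover_key(protected, code[:4])
```

If the first guess at the four plaintext bytes is wrong, the decrypted code will not disassemble sensibly and the researcher simply tries the next candidate, which is why only a few tries are needed.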

5 More information

Please feel free to get in touch with us at labs@trustdefender.com

6 Appendix – technical details

This section looks in more detail at the specific changes