Maximum Security:

A Hacker's Guide to Protecting Your Internet Site and Network

11 Trojans

This chapter examines one of the more insidious devices used to circumvent Internet security: the trojan horse, or trojan. No other device is more likely to lead to total compromise of a system, and no other device is more difficult to detect.

What Is a Trojan?

Before I start, I want to offer a definition of what a trojan is because these devices are often confused with other malicious code. A trojan horse is

An unauthorized program contained within a legitimate program. This unauthorized program performs functions unknown (and probably unwanted) by the user.
A legitimate program that has been altered by the placement of unauthorized code within it; this code performs functions unknown (and probably unwanted) by the user.
Any program that appears to perform a desirable and necessary function but that (because of unauthorized code within it that is unknown to the user) performs functions unknown (and probably unwanted) by the user.

The unauthorized functions that the trojan performs may sometimes qualify it as another type of malicious device as well. For example, certain viruses fit into this category. Such a virus can be concealed within an otherwise useful program. When this occurs, the program can be correctly referred to as both a trojan and a virus. The file that harbors such a trojan/virus has effectively been trojaned. Thus, the term trojan is sometimes used as a verb, as in "He is about to trojan that file."

Classic Internet security documents define the term in various ways. Perhaps the most well known (and oddly, the most liberal) is the definition given in RFC 1244, the Site Security Handbook:

A trojan horse program can be a program that does something useful, or merely something interesting. It always does something unexpected, like steal passwords or copy files without your knowledge.

Another definition that seems quite suitable is that given by Dr. Alan Solomon, an internationally renowned virus specialist, in his work titled All About Viruses:

A trojan is a program that does something more than the user was expecting, and that extra function is damaging. This leads to a problem in detecting trojans. Suppose I wrote a program that could infallibly detect whether another program formatted the hard disk. Then, can it say that this program is a trojan? Obviously not if the other program was supposed to format the hard disk (like Format does, for example), then it is not a trojan. But if the user was not expecting the format, then it is a trojan. The problem is to compare what the program does with the user's expectations. You cannot determine the user's expectations for a program.

Cross Reference: All About Viruses by Dr. Alan Solomon can be found at http://www.drsolomon.com/vircen/allabout.html.

Anyone concerned with viruses (or who just wants to know more about virus technology) should visit Dr. Solomon's site at http://www.drsolomon.com/.

At day's end, you can classify a trojan as this: any program that performs a hidden and unwanted function. This may come in any form. It might be a utility that purports to index file directories or one that unlocks registration codes on software. It might be a word processor or a network utility. In short, a trojan could be anything (and could be found in anything) that you or your users introduce to the system.

Where Do Trojans Come From?

Trojans are created strictly by programmers. One does not get a trojan through any means other than by accepting a trojaned file that was prepared by a programmer. True, it might be possible for a thousand monkeys typing 24 hours a day to ultimately create a trojan, but the statistical probability of this is negligible. Thus, a trojan begins with human intent or mens rea. Somewhere on this planet, a programmer is creating a trojan right now. That programmer knows exactly what he or she is doing, and his or her intentions are malefic (or at least, not altruistic).

The trojan author has an agenda. That agenda could be almost anything, but in the context of Internet security, a trojan will do one of two things:

Perform some function that either reveals to the programmer vital and privileged information about a system or compromises that system.
Conceal some function that either reveals to the programmer vital and privileged information about a system or compromises that system.

Some trojans do both. Additionally, there is another class of trojan that causes damage to the target (for example, one that encrypts or reformats your hard disk drive). So trojans may perform various intelligence tasks (penetrative or collective) or tasks that amount to sabotage.

One example that satisfies the sabotage-tool criteria is the PC CYBORG trojan horse. As explained in a December 19, 1989 CIAC bulletin ("Information about the PC CYBORG (AIDS) Trojan Horse"):

There recently has been considerable attention in the news media about a new trojan horse which advertises that it provides information on the AIDS virus to users of IBM PC computers and PC clones. Once it enters a system, the trojan horse replaces AUTOEXEC.BAT, and may count the number of times the infected system has booted until a criterion number (90) is reached. At this point PC CYBORG hides directories, and scrambles (encrypts) the names of all files on drive C:. There exists more than one version of this trojan horse, and at least one version does not wait to damage drive C:, but will hide directories and scramble file names on the first boot after the trojan horse is installed.

Cross Reference: You can find the CIAC bulletin "Information about the PC CYBORG (AIDS) Trojan Horse" at http://www.sevenlocks.com/CIACA-10.htm.

Another example (one that caused fairly widespread havoc) is the AOLGOLD trojan horse. This was distributed primarily over the Usenet network and through e-mail. The program was purported to be an enhanced package for accessing America Online (AOL). The distribution consisted of a single, archived file. Unzipping the archive revealed two files, one of which was a standard INSTALL.BAT file. Executing the INSTALL.BAT file resulted in 18 files being expanded to the hard disk. As reported in a security advisory ("Information on the AOLGOLD Trojan Program") dated Sunday, February 16, 1997:

The trojan program is started by running the INSTALL.BAT file. The INSTALL.BAT file is a simple batch file that renames the VIDEO.DRV file to VIRUS.BAT and then runs it. VIDEO.DRV is an amateurish DOS batch file that starts deleting the contents of several critical directories on your C: drive, including
c:\
c:\dos
c:\windows
c:\windows\system
c:\qemm
c:\stacker
c:\norton

When the batch file completes, it prints a crude message on the screen and attempts to run a program named DOOMDAY.EXE. Bugs in the batch file prevent the DOOMDAY.EXE program from running. Other bugs in the file cause it to delete itself if it is run from any drive but the C: drive. The programming style and bugs in the batch file indicates that the trojan writer appears to have little programming experience.

Cross Reference: You can find the security advisory titled "Information on the AOLGOLD Trojan Program" at http://www.emergency.com/aolgold.htm.

These trojans were clearly the work of amateur programmers: kids who had no more complex an agenda than causing trouble. These were both destructive trojans and performed no sophisticated collective or penetrative functions. Such trojans are often seen, and usually surface, on the Usenet news network.

However, trojans (at least in the UNIX world) have been planted by individuals who are also involved in the legitimate development of a system. These are inside jobs, where someone at a development firm inserts the unauthorized code into an application or utility (or, in rare instances, the core of the operating system itself). These can be far more dangerous for a number of reasons:

These trojans are not destructive (they collect intelligence on systems); their discovery is usually delayed until they are revealed by accident.
Because most servers that matter run UNIX, some highly trusted (and sensitive) sites can be compromised. By servers that matter, I mean those that provide hundreds or even thousands of users access to the Internet and other key networks within the Internet. These are generally governmental or educational sites, which differ from sites maintained, for example, by a single company. With a single company, the damage can generally travel only so far, placing the company and all its users at risk. This is a serious issue, to be sure, but is relevant only to that company. In contrast, the compromise of government or educational sites can place thousands of computers at risk.

There are also instances where key UNIX utilities are compromised (and trojaned) by programmers who have nothing to do with the development of the legitimate program. This has happened many times and, on more than one occasion, has involved security-related programs. For example, following the release of SATAN, a trojan found its way into the SATAN 1.0 distribution for Linux.

NOTE: This distribution was not the work of Farmer or Venema. Instead, it was a precompiled set of binaries intended solely for Linux users, compiled at Temple University. Moreover, the trojan was confined to a single release, that being 1.0.

Reportedly, the file affected was a program called fping. The story goes as follows: A programmer obtained physical access to a machine housing the program. He modified the main() function and altered the fping file so that when users ran SATAN, a special entry would be placed in their /etc/passwd file. This special entry was the addition of a user named suser. Through this user ID, the perpetrator hoped to compromise many hosts. As it happened, only two recorded instances of such compromise emerged. Flatly stated, the programming was of poor quality. For example, the trojan provided no contingency for those systems that made use of shadowed passwords.

NOTE: The Slackware distribution of Linux defaults to a nonshadowed password scheme. This may be true of other Linux distributions as well. However, the programmer responsible for the trojan in question should not have counted on that. It would have been only slightly more complicated to add a provision for this.
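An entry planted in /etc/passwd, as in the fping incident, is the kind of change that a simple comparison against a known-good copy can surface. The following is a minimal sketch, not a reconstruction of any tool used at the time. The function names are hypothetical, and the fields of the suser line are invented for illustration (the reports give only the account name).

```python
def parse_passwd(text):
    """Map each account name to its UID field from passwd-format text."""
    accounts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 7:
            continue  # skip malformed lines rather than guess at their meaning
        accounts[fields[0]] = fields[2]
    return accounts

def new_accounts(baseline_text, current_text):
    """Return the account names present now but absent from the baseline."""
    before = parse_passwd(baseline_text)
    after = parse_passwd(current_text)
    return sorted(name for name in after if name not in before)

# Hypothetical sample data; only the name "suser" comes from the incident reports.
baseline = (
    "root:x:0:0:root:/root:/bin/sh\n"
    "daemon:x:1:1:daemon:/usr/sbin:/bin/false\n"
)
current = baseline + "suser::0:0::/:/bin/sh\n"

print(new_accounts(baseline, current))
```

The baseline copy is only useful if it is kept somewhere an intruder cannot rewrite it, such as removable or read-only media.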

As you can see, a trojan might crop up anywhere. Even a file originating from a reasonably trusted source could be trojaned.

Where Might One Find a Trojan?

Technically, a trojan could appear almost anywhere, on any operating system or platform. However, with the exception of the inside job mentioned previously, the spread of trojans works very much like the spread of viruses. Software downloaded from the Internet, especially shareware or freeware, is always suspect. Similarly, materials downloaded from underground servers or Usenet newsgroups are also candidates.

Sometimes, one need not travel down such dark and forbidden alleys to find a trojan. Trojans can be found in major, network-wide distributions. For example, examine this excerpt from a CIAC security advisory ("E-14: Wuarchive Ftpd Trojan Horse"), posted to the Net in 1994:

CIAC has received information that some copies of the wuarchive FTP daemon (ftpd) versions 2.2 and 2.1f have been modified at the source code level to contain a trojan horse. This trojan allows any user, local or remote, to become root on the affected UNIX system. CIAC strongly recommends that all sites running these or older versions of the wuarchive ftpd retrieve and install version 2.3. It is possible that versions previous to 2.2 and 2.1f contain the trojan as well.

wu-ftpd (the wuarchive FTP daemon) is one of the most widely used FTP servers in the world. This advisory affected thousands of sites, public and private. Many of those sites are still at risk, primarily because the system administrators at those locations are not as security conscious as they should be.

TIP: Pick 100 random hosts in the void and try their FTP servers. I would wager that out of those hosts, more than 80% are using wu-ftpd. Of those, perhaps 40% are running older versions that, although they may not be trojaned, have security flaws of some kind.

C'mon! How Often Are Trojans Really Discovered?

Trojans are discovered often enough that they are a major security concern. What makes trojans so insidious is that even after they are discovered, their influence is still felt. Trojans are similar to sniffers in that respect. No one can be sure exactly how deep into the system the compromise may have reached. There are several reasons for this, but I will limit this section to only one.

As you will soon read, the majority of trojans are nested within compiled binaries. That is to say: The code that houses the trojan is no longer in human-readable form but has been compiled. Thus, it is in machine language. This language can be examined in certain raw editors, but even then, only printable character strings are usually comprehensible. These most often are error messages, advisories, option flags, or other data printed to STDOUT at specified points within the program:

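#include <iostream>
using std::cout;

void my_function()
{
    // These string literals survive compilation as readable text.
    cout << "The value you have entered is out of range!\n";
    cout << "Please enter another:";
}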
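The point about printable strings can be checked directly. The sketch below pulls runs of printable ASCII out of a block of bytes, much as a raw editor would reveal them. The function name and the sample bytes are hypothetical stand-ins for a real compiled binary.

```python
import re

def printable_strings(data, min_len=4):
    """Return runs of at least min_len printable ASCII characters found in data."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, data)]

# Hypothetical stand-in for a compiled binary: opaque bytes around two messages.
sample = (
    b"\x7fELF\x02\x01\x00\x00"
    b"The value you have entered is out of range!\n\x00"
    b"\x90\x90\xc3Please enter another:\x00\x01"
)

for text in printable_strings(sample):
    print(text)
```

Only the two messages come back. Everything else in the sample (the machine instructions, in a real binary) stays opaque, which is why a trojan's logic is so hard to spot this way.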

Because the binaries are compiled, they come to the user as (more or less) point-and-shoot applications. In other words, the user takes the file or files as is, without intimate knowledge of their structure.

When authorities discover that such a binary houses a trojan, security advisories are immediately issued. These tend to be preliminary and are later followed by more comprehensive advisories that may briefly discuss the agenda and method of operation of the trojan code. Unless the user is a programmer, these advisories spell out little more than "Get the patch now and replace the bogus binary." Experienced system administrators may clearly understand the meaning of such advisories (or even clearly understand the purpose of the code, which is usually included with the comprehensive advisory). However, even then, assessment of damages can be difficult.

In some cases, the damage seems simple enough to assess (for example, instances where the trojan's purpose was to mail out the contents of the passwd file). The fix is pretty straightforward: Replace the binary with a clean version and have all users change their passwords. This being the whole of the trojan's function, no further damage or compromise is expected. Simple.

But suppose the trojan is more complex. Suppose, for example, that its purpose is to open a hole for the intruder, a hole through which he gains root access during the wee hours. If the intruder was careful to alter the logs, there might be no way of knowing the depth of the compromise (especially if you discover the trojan months after it was installed). This type of case might call for reinstallation of the entire operating system.

NOTE: Reinstallation may be a requisite. Many more of your files might have been trojaned since the initial compromise. Rather than attempt to examine each file (or each file's behavior) closely, it might make better sense to start over. Equally, even if more files haven't been trojaned, it's likely that passwords, personal data, or other sensitive materials have been compromised.

Conversely, trojans may be found in executable files that are not compiled. These might be shell scripts, or perhaps programs written in Perl, JavaScript, VBScript, Tcl (a popular scripting language), and so forth. There have been few verified cases of this type of trojan. The cracker who places a trojan within a noncompiled executable is risking a great deal. The source is in plain, human-readable text. In a small program, a block of trojan code would stand out dramatically. However, this method may not be so ludicrous when dealing with larger programs or in those programs that incorporate a series of compiled binaries and executable shell scripts nested within several subdirectories. The more complex the structure of the distribution, the less likely it is that a human being, using normal methods of investigation, would uncover a trojan.

Moreover, one must consider the level of the user's knowledge. Users who know little about their operating system are less likely to venture deep into the directory structure of a given distribution, looking for mysterious or suspicious code (even if that code is human readable). The reverse is true if the user happens to be a programmer. However, the fact that a user is a programmer does not mean he or she will instantly recognize a trojan. I know many BASIC programmers who have a difficult time reading code written in Perl. Thus, if the trojan exists in a scripting language, the programmer must first be familiar with that language before he or she can identify objectionable code within it. It is equally true that if the language even slightly resembles a language that the programmer normally uses, he or she may be able to identify the problem. For example, Perl is sufficiently similar to C that a C programmer who has never written a line of Perl could effectively identify malicious code within a Perl script. And of course, anyone who writes programs in a shell language or awk would likewise recognize questionable code in a Perl program.

NOTE: Many Perl programs (or other scripted shell programs) are dynamic; that is, they may change according to certain circumstances. For example, consider a program that, in effect, rewrites itself based on certain conditions specified in the programming code. Such files need to be checked by hand for tampering because integrity checkers will always report that the file has been attacked, even when it has not. Granted, today, there are relatively few dynamic programs, but that is about to change. There is talk on the Internet of using languages like Perl to perform functions in Electronic Data Interchange (EDI). In some instances, these files will perform functions that necessarily require the program file to change.

What Level of Risk Do Trojans Represent?

Trojans represent a very high level of risk, mainly for reasons already stated:

Trojans are difficult to detect.
In most cases, trojans are found in binaries, which remain largely in non-human-readable form.
Trojans can affect many machines.

Let me elaborate. Trojans are a perfect example of the type of attack that is fatal to the system administrator who has only a very fleeting knowledge of security. In such a climate, a trojan can lead to total compromise of the system. The trojan may be in place for weeks or even months before it is discovered. In that time, a cracker with root privileges could alter the entire system to suit his or her needs. Thus, even when the trojan is discovered, new holes may exist of which the system administrator is completely unaware.

How Does One Detect a Trojan?

Detecting trojans is less difficult than it initially seems. But strong knowledge of your operating system is needed; also, some knowledge of encryption can help.

If your environment is such that sensitive data resides on your server (which is never a good idea), you will want to take advanced measures. Conversely, if no such information exists on your server, you might feel comfortable employing less stringent methods. The choice breaks down to need, time, and interest. The first two of these elements represent cost. Time always costs money, and that cost will rise depending on how long it has been since your operating system was installed. This is so because in that length of time, many applications that complicate the reconciliation process have probably been installed. For example, consider updates and upgrades. Sometimes, libraries (or DLL files) are altered or overwritten with newer versions. If you were using a file-integrity checker, these files would be identified as changed. If you were not the person who performed the upgrade or update, and the program is sufficiently obscure, you might end up chasing a phantom trojan. These situations are rare, true, but they do occur.

Most forms of protection against (and prevention of) trojans are based on a technique sometimes referred to as object reconciliation. Although the term might sound intimidating, it isn't. It is a fancy way of asking "Are things still just the way I left them?" Here is how it works: Objects are either files or directories. Reconciliation is the process of comparing those objects against themselves at some earlier (or later) date. For example, take a backup tape and compare the file PS as it existed in November 1995 to the PS that now resides on your drive. If the two differ, and no change has been made to the operating system, something is amiss. This technique is invariably applied to system files that are installed as part of the basic operating system.
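Object reconciliation can be sketched in a few lines. The example below records a snapshot of a directory and compares it with a later one. The function names are hypothetical, and the snapshot uses only size and modification time for brevity. Those two values are weak evidence on their own, as the discussion that follows makes clear.

```python
import os
import tempfile

def snapshot(root):
    """Record (size, mtime) for every file under root, keyed by relative path."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            state[os.path.relpath(path, root)] = (info.st_size, info.st_mtime_ns)
    return state

def reconcile(old, new):
    """Return sorted lists of added, removed, and changed paths."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
    return added, removed, changed

# Demonstration in a throwaway directory; the file names are illustrative only.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "ps"), "wb") as f:
        f.write(b"original utility")
    baseline = snapshot(root)

    with open(os.path.join(root, "ps"), "wb") as f:
        f.write(b"original utility plus extra code")
    with open(os.path.join(root, "newfile"), "wb") as f:
        f.write(b"x")
    report = reconcile(baseline, snapshot(root))

print(report)
```

In practice the baseline would come from a backup tape or the original distribution media rather than from the live system.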

Object reconciliation can be easily understood if you recognize that each time a file is altered in some way, that file's values change. For example, one way to clock the change in a file is by examining the date it was last modified. Each time the file is opened, altered, and saved, a new last-modified date emerges. However, this date can be easily manipulated. Consider manipulating this time on the PC platform. How difficult is it? Change the global time setting, apply the desired edits, and archive the file. The time is now changed. For this reason, time is the least reliable way to reconcile an object (at least, relying on the simple date-last-modified time is unreliable). Also, the last date of modification reveals nothing if the file was unaltered (for example, if it was only copied or mailed).
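To see how little the last-modified date proves, consider this sketch. It rewrites a file and then restores the earlier timestamps with os.utime, a standard library call. The file name and contents are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "utility")
    with open(path, "wb") as f:
        f.write(b"original contents")
    before = os.stat(path)

    # Alter the file, then put the recorded access and modification times back.
    with open(path, "wb") as f:
        f.write(b"altered contents, now longer")
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))

    after = os.stat(path)
    mtime_unchanged = after.st_mtime_ns == before.st_mtime_ns
    size_unchanged = after.st_size == before.st_size

print(mtime_unchanged, size_unchanged)
```

No change to the system clock is needed: any user who can write to the file can also reset its timestamps.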

NOTE: PC users who have used older machines can easily understand this. Sometimes, when the CMOS battery fails, the system may temporarily fail. When it is brought back up, you will see that a few files have the date January 1, 1980.

Another way to check the integrity of a file is by examining its size. However, this method is extremely unreliable because of how easily this value can be manipulated. When editing plain text files, it is simple to start out with a size of, say, 1,024KB and end up with that same size. It takes cutting a bit here and adding a bit there. But the situation changes radically when you want to alter a binary file. Binary files usually involve the inclusion of special function libraries and other modules without which the program will not work. Thus, to alter a binary file (and still have the program function) is a more complicated process. The programmer must preserve all the indispensable parts of the program and still find room for his or her own code. Therefore, size is probably a slightly more reliable index than time. Briefly, before I continue, let me explain the process by which a file becomes trojaned.

The most common scenario is when a semi-trusted (known) file is the object of the attack. That is, the file is native to your operating system distribution; it comes from the vendor (such as the file csh in UNIX or command.com in DOS). These files are written to your drive on the first install, and they have a date and time on them. They also are of a specified size. If the times, dates, or sizes of these files differ from their original values, this raises immediate suspicion.

Evil programmers know this. Their job, therefore, is to carefully examine the source code for the file (usually obtained elsewhere) for items that can be excluded (for example, they may single out commented text or some other, not-so-essential element of the file). The unauthorized code is written into the source, and the file is recompiled. The cracker then examines the size of the file. Perhaps it is too large or too small. The process then begins again, until the attacker has compiled a file that is as close to the original size as possible. This is a time-consuming process. If the binary is a fairly large one, it could take several days.

NOTE: When an original operating-system distributed file is the target, the attacker may or may not have to go through this process. If the file has not yet been distributed to anyone, the attacker need not concern himself or herself with this problem. This is because no one has yet seen the file or its size. Perhaps only the original author of the file would know that something was amiss. If that original author is not security conscious, he or she might not even know. If you are a programmer, think now about the very last binary you compiled. How big was it? What was its file size? I bet you don't remember.

When the file has been altered, it is placed where others can obtain it. In the case of operating-system distributions, this is generally a central site for download (such as sunsite.unc.edu, which houses one of the largest collections of UNIX software on the planet). From there, the file finds its way into workstations across the void.

NOTE: sunsite.unc.edu is the Sun Microsystems-sponsored site at UNC Chapel Hill. This site houses the greater body of free software on the Internet. Thousands of individuals--including me--rely on the high-quality UNIX software available at this location. Not enough good can be said about this site. It is a tremendous public service.

For reasons that must now seem obvious, the size of the file is also a poor index by which to measure its alteration. So, to recount: Date, date of last access, time, and size are all indexes without real meaning. None of these alone is suitable for determining the integrity of a file. In each, there is some flaw--usually inherent to the platform--that makes these values easy to alter. Thus, generating a massive database of all files and their respective values (time, size, date, or alteration) has only very limited value:

...a checklist is one form of this database for a UNIX system. The file content themselves are not usually saved as this would require too much disk space. Instead, a checklist would contain a set of values generated from the original file--usually including the length, time of last modification, and owner. The checklist is periodically regenerated and compared against the save copies, with discrepancies noted. However...changes may be made to the contents of UNIX files without any of these values changing from the stored values; in particular, a user gaining access to the root account may modify the raw disk to alter the saved data without it showing in the checklist.

There are other indexes, such as checksums, that one can check; these are far better indexes, but also not entirely reliable. In the checksum system, the data elements of a file are added together and run through an algorithm. The resulting number is a checksum, a type of signature for that file (bar-code readers sometimes use checksums in their scan process). On the SunOS platform, one can review the checksum of a particular file using the utility sum. sum calculates (and prints to STDOUT or other specified mediums) the checksums of files provided on the argument line.

Although checksums are more reliable than time, date, or last date of modification, these too can be tampered with. Most system administrators suggest that if you rely on a checksum system, your checksum list should be kept on a separate server or even a separate medium, accessible only by root and other trusted users. In any event, checksums work nicely for checking the integrity of a file transferred, for example, from point A to point B, but that is the extent of it.
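A deliberately simple example shows why a checksum catches accidental damage but not deliberate tampering. The function below is a plain additive checksum written for illustration. It is an assumption of this sketch and not the algorithm used by the SunOS sum utility.

```python
def additive_checksum(data):
    """Add up every byte value and keep the low 16 bits."""
    return sum(data) & 0xFFFF

# Same bytes in a different order: the meaning changes, the checksum does not.
original = b"PATH=/bin:/usr/bin"
tampered = b"PATH=/usr/bin:/bin"

print(additive_checksum(original) == additive_checksum(tampered))
print(original == tampered)
```

Real checksum and CRC algorithms are less naive than this one, but they share its basic limitation: they were designed to catch transmission errors, not an adversary who adjusts the data until the value matches.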

NOTE: Users who have performed direct file transfers using communication packages such as Qmodem, Telix, Closeup, MTEZ, or others will remember that these programs sometimes perform checksum or CRC checks as the transfers occur. For each file transferred, the file is checked for integrity. This reduces--but does not eliminate--the likelihood of a damaged file at the destination. If the file proves to be damaged or flawed, the transfer process may begin again. When dealing with sophisticated attacks against file integrity, however, this technique is insufficient.

Cross Reference: Tutorials about defeating checksum systems are scattered across the Internet. Most are related to the development of viruses (many virus-checking utilities use checksum analysis to identify virus activity). A collection of such papers (all of which are underground) can be found at http://www.pipo.com/guillermito/darkweb/news.html.

MD5

You're probably wondering whether any technique is sufficient. I am happy to report that there is such a technique. It involves calculating the digital fingerprint, or signature, for each file. This is done utilizing various algorithms. A family of algorithms, called the MD series, is used for this purpose. One of the most popular implementations is a system called MD5.

MD5 is an algorithm that generates a digital fingerprint (loosely, a signature) of a file. It belongs to a family of one-way hash functions called message digest algorithms and is defined in RFC 1321. Concisely stated:

The algorithm takes as input a message of arbitrary length and produces as output a 128-bit "fingerprint" or "message digest" of the input. It is conjectured that it is computationally infeasible to produce two messages having the same message digest, or to produce any message having a given prespecified target message digest. The MD5 algorithm is intended for digital signature applications, where a large file must be "compressed" in a secure manner before being encrypted with a private (secret) key under a public-key cryptosystem such as RSA.

Cross Reference: RFC 1321 is located at http://www.freesoft.org/Connected/RFC/1321/1.html.

When one runs a file through an MD5 implementation, the signature emerges as a 32-character hexadecimal value (the 128-bit digest written as 32 hex digits). It looks like this:

2d50b2bffb537cc4e637dd1f07a187f4

Many sites that distribute security fixes for the UNIX operating system employ this technique. Thus, as you browse their directories, you can examine the original digital signature of each file. If, upon downloading that file, you find that the signature is different, there is a 99.9% chance that something is terribly amiss.
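Checking a downloaded file against a published value takes only a few lines. This is a minimal sketch using Python's standard hashlib module. The function names and the sample file are hypothetical, and the published value would normally be copied from the distribution site.

```python
import hashlib
import os
import tempfile

def md5_of_file(path, chunk_size=65536):
    """Return the MD5 digest of a file as 32 hexadecimal characters."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published(path, published_hex):
    """Compare a file's digest with the value published by the distributor."""
    return md5_of_file(path) == published_hex.strip().lower()

# Demonstration with a throwaway file whose contents are the three bytes "abc".
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "patch.tar")
    with open(path, "wb") as f:
        f.write(b"abc")
    fingerprint = md5_of_file(path)
    ok = matches_published(path, "900150983CD24FB0D6963F7D28E17F72")
    tampered_ok = matches_published(path, "2d50b2bffb537cc4e637dd1f07a187f4")

print(fingerprint, ok, tampered_ok)
```

Practical collisions have since been demonstrated for MD5, so a stronger function such as SHA-256 (hashlib.sha256) is the usual choice today. The procedure is the same, and MD5 is kept here only to match the text.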

MD5 performs a one-way hash function. You may be familiar with these operations from other areas of cryptography, including the schemes used to protect password files.

Some especially security-conscious programs use the MD4 and MD5 algorithms. One such program is S/Key, a trademark of Bellcore. S/Key implements a one-time password scheme. One-time passwords are nearly unbreakable. S/Key is used primarily for remote logins and to offer advanced security along those channels of communication (as opposed to using little or no security by initiating a normal, garden-variety Telnet or Rlogin session). The process works as described in "S/Key Overview" (author unknown):

S/Key uses either MD4 or MD5 (one-way hashing algorithms developed by Ron Rivest) to implement a one-time password scheme. In this system, passwords are sent cleartext over the network; however, after a password has been used, it is no longer useful to the eavesdropper. The biggest advantage of S/Key is that it protects against eavesdroppers without modification of client software and only marginal inconvenience to the users.

Cross Reference: Read "S/Key Overview" at http://medg.lcs.mit.edu/people/wwinston/skey-overview.html.
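The idea behind such a one-time password scheme can be sketched as a hash chain. The server stores the result of hashing a secret n times, and each login presents the value one step earlier in the chain. This is a simplified illustration and not the S/Key implementation. The real system also folds the digest to a shorter value and encodes it as short words, and the class and function names here are hypothetical.

```python
import hashlib

def h(data):
    """One application of the one-way function (MD5 here, as in the text)."""
    return hashlib.md5(data).digest()

def chain(secret, n):
    """Apply the hash n times to the secret."""
    value = secret
    for _ in range(n):
        value = h(value)
    return value

class Server:
    """Stores only the last accepted chain value, never the secret itself."""

    def __init__(self, initial):
        self.stored = initial

    def login(self, otp):
        if h(otp) != self.stored:
            return False
        self.stored = otp  # the next login must present the preceding value
        return True

secret = b"hypothetical seed and passphrase"
n = 5
server = Server(chain(secret, n))

first = chain(secret, n - 1)
ok_first = server.login(first)
replay = server.login(first)  # an eavesdropper replays the captured password
ok_second = server.login(chain(secret, n - 2))

print(ok_first, replay, ok_second)
```

Because the hash is one-way, a captured password lets the eavesdropper compute only values later in the chain, all of which have already been used.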

With or without MD5, object reconciliation is a complex process. True, on a single workstation with limited resources, one could technically reconcile each file and directory by hand (I would not recommend this if you want to preserve your sanity). However, in larger networked environments, this is simply impossible. So, various utilities have been designed to cope with this problem. The most celebrated of these is a product aptly named TripWire.

TripWire

TripWire (written in 1992) is a comprehensive system-integrity tool. It is written in classic Kernighan and Ritchie C (you will remember from Chapter 7, "Birth of a Network: The Internet," that I discussed the portability advantages of C; it was this portability that influenced the choice of language for the authors of TripWire).

TripWire is well designed, easily understood, and implemented with minimal difficulty. The system reads your environment from a configuration file, which contains all filemasks (the types of files that you want to monitor). The monitoring can be quite fine-grained. For example, you can specify which changes may be made to files of a given class without TripWire reporting them (or, for more wholesale monitoring of the system, you can simply flag a directory as the target of the monitoring process). The original values (digital signatures) for these files are kept in a database file. That database file (simple ASCII) is consulted whenever a file's current signature needs to be compared with the original. Hash functions included in the distribution are

MD5
MD4
CRC32
MD2
Snefru (Xerox secure hash function)
SHA (The NIST secure hash algorithm)

Reportedly, MD5 and the Xerox secure hash function (Snefru) are both used by default to generate values for all files. However, the TripWire documentation indicates that any of these functions can be applied to a single file, a subset of files, or all files.
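The general approach (record signatures in a baseline database, then compare later) can be sketched as follows. This is a minimal Python illustration, not TripWire's code or database format. The function names are hypothetical, and because Snefru is not available in Python's standard library, SHA-1 is swapped in as the second hash alongside MD5.

```python
import hashlib
import os


def signature(path, algorithms=("md5", "sha1")):
    """Return {algorithm: hex digest} for one file, read in chunks."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def build_baseline(root, algorithms=("md5", "sha1")):
    """Walk `root` and record a signature for every regular file."""
    baseline = {}
    for directory, _subdirs, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(directory, filename)
            if os.path.isfile(path):
                baseline[path] = signature(path, algorithms)
    return baseline


def compare(baseline, current):
    """Report paths added, removed, or changed since the baseline."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "changed": sorted(
            path for path in baseline
            if path in current and baseline[path] != current[path]
        ),
    }
```

A typical run would call build_baseline once on a system believed to be clean, store the result, and later pass a fresh build_baseline result to compare. Using two independent hashes per file means an attacker would have to defeat both to slip a modified file past the check.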

Altogether, TripWire is a very well-crafted package with many options.

Cross Reference: TripWire (and papers on usage and design) can be found at ftp://coast.cs.purdue.edu/pub/tools/unix/TripWire/.

TripWire is a magnificent tool, but there are some security issues. One such issue relates to the database of values that is generated and maintained. Essentially, it breaks down to the same issue discussed earlier: Databases can be altered by a cracker. Therefore, it is recommended that some measure be undertaken to secure that database. From the beginning, the tool's authors were well aware of this:

The database used by the integrity checker should be protected from unauthorized modifications; an intruder who can change the database can subvert the entire integrity checking scheme.

Cross Reference: Before you use TripWire, read "The Design and Implementation of TripWire: A File System Integrity Checker" by Gene H. Kim and Eugene H. Spafford. It is located at ftp://ftp.cs.purdue.edu/pub/spaf/security/Tripwire.PS.Z.

One method of protecting the database is extremely sound: Store the database on read-only media. This virtually eliminates any possibility of tampering. In fact, this technique is becoming a strong trend in security. In Chapter 21, "Plan 9 from Bell Labs," you will learn that the folks at Bell Labs now write their logs to write-once or read-only media. Moreover, in a recent security consultation, I was surprised to find that the clients (who were only just learning about security) were very keen on read-only media for their Web-based databases. These databases were quite sensitive, and the information, if changed, could threaten the security of other systems.

Kim and Spafford (authors of TripWire) also suggest that the database be protected in this manner, though they concede that this could present some practical, procedural problems. Much depends upon how often the database will be updated, how large it is, and so forth. Certainly, if you are implementing TripWire on a wide scale (and in its maximum application), the maintenance of a read-only database could be formidable. Again, this breaks down to the level of risk and the need for increased or perhaps optimum security.
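Where keeping the whole database on read-only media is impractical, one compromise is to keep only a digest of the database there and to verify the database against it before each use. The Python sketch below illustrates that idea. It is not a TripWire feature; the function names and the layout of the reference file (a single hex digest) are assumptions made for the example.

```python
import hashlib


def digest_of(path, algorithm="md5"):
    """Return the hex digest of a file, read in chunks."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def database_is_intact(database_path, reference_path, algorithm="md5"):
    """Compare the database's digest with one stored elsewhere.

    `reference_path` is assumed to live on read-only media and to
    contain nothing but the expected hex digest.
    """
    with open(reference_path, "r", encoding="ascii") as handle:
        expected = handle.read().strip()
    return digest_of(database_path, algorithm) == expected
```

The trade-off is the one the text describes: every legitimate database update requires writing a new reference digest, so the procedure is only as convenient as your access to the protected media.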

TAMU

The TAMU suite (from Texas A&M University, of course) is a collection of tools that will greatly enhance the security of a UNIX box. These tools were created in response to a very real problem. As explained in the summary that accompanies the distribution:

Texas A&M University UNIX computers recently came under extensive attack from a coordinated group of Internet crackers. This paper presents an overview of the problem and our responses, which included the development of policies, procedures, and tools to protect university computers. The tools developed include `drawbridge', an advanced Internet filter bridge, `tiger scripts', extremely powerful but easy to use programs for securing individual hosts, and `xvefc', (XView Etherfind Client), a powerful distributed network monitor.

Contained within the TAMU distribution is a package of tiger scripts, which form the basis of the distribution's digital signature authentication. As the above-mentioned summary explains:

The checking performed covers a wide range of items, including items identified in CERT announcements, and items observed in the recent intrusions. The scripts use Xerox's cryptographic checksum programs to check for both modified system binaries (possible trap doors/trojans), as well as for the presence of required security related patches.

Cross Reference: Xerox hash.2.5a can be found on the PARC ftp site (ftp://parcftp.xerox.com/pub/hash/hash2.5a/). This package is generally referred to as the Xerox Secure Hash Function, and the distribution is named after Snefru, a pharaoh of ancient Egypt. The distribution at the aforementioned site was released in 1990, and source is included. For those interested in hacking the Snefru distribution, the material here is invaluable. Also, refer to a sister document that describes the distribution and gives a more comprehensive explanation: A Fast Software One Way Hash Function by Ralph C. Merkle (there is a full citation at the end of this chapter in the Resources section).

The TAMU distribution is comprehensive and can be used to solve several security problems, over and above searching for trojans. It includes a network monitor and packet filter.

Cross Reference: The TAMU distribution is available at ftp://coast.cs.purdue.edu/pub/tools/unix/TAMU/.

ATP (The Anti-Tampering Program)

ATP is a bit more obscure than TripWire and the TAMU distribution, but I am not certain why. Perhaps it is because it is not widely available. In fact, searches for it may lead you overseas (one good source for it is in Italy). At any rate, ATP works somewhat like TripWire. As reported by David Vincenzetti, DSI (University of Milan, Italy) in "ATP--Anti-Tampering Program":

ATP 'takes a snapshot' of the system, assuming that you are in a trusted configuration, and performs a number of checks to monitor changes that might have been made to files.

Cross Reference: "ATP--Anti-Tampering Program" can be found at http://www.cryptonet.it/docs/atp.html.

ATP then establishes a database of values for each file. One of these values (the signature) consists of two checksums. The first is a CRC32 checksum, the second an MD5 checksum. You might be wondering why this is so, especially when you know that CRC checksums are not entirely secure or reliable, as explained previously. The explanation is this: Because of its speed, the CRC32 checksum is used in checks performed on a regular (perhaps daily) basis. MD5, which is more comprehensive (and therefore more resource and time intensive), is intended for scheduled, periodic checks (perhaps once a week).
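The two-tier arrangement described above can be sketched in Python as follows. This is only an illustration of the idea, not ATP's code or database format. The function names are hypothetical, and the sketch assumes that every recorded file still exists when a check runs.

```python
import hashlib
import zlib


def crc32_of(path):
    """Fast checksum, suitable for frequent (for example, daily) checks."""
    value = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            value = zlib.crc32(chunk, value)
    return value & 0xFFFFFFFF


def md5_of(path):
    """Slower cryptographic digest, for periodic full checks."""
    hasher = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def snapshot(paths):
    """Record a (CRC32, MD5) signature pair for each file."""
    return {path: (crc32_of(path), md5_of(path)) for path in paths}


def quick_check(database):
    """Return files whose CRC32 no longer matches the snapshot."""
    return [p for p, (crc, _md5) in database.items() if crc32_of(p) != crc]


def full_check(database):
    """Return files whose MD5 no longer matches the snapshot."""
    return [p for p, (_crc, md5) in database.items() if md5_of(p) != md5]
```

The quick check catches accidental or careless changes cheaply. The full check is the one to rely on against a deliberate attacker, because a modified file can be padded to reproduce its old CRC32 value but not, in practice, its old MD5 value.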

The database is reportedly encrypted using DES. Thus, ATP provides a flexible (but quite secure) method of monitoring your network and identifying possible trojans.

Cross Reference: ATP docs and distribution can be found at ftp://security.dsi.unimi.it/pub/security.

Hobgoblin

The Hobgoblin tool is an interesting implementation of file- and system-integrity checking. It utilizes Ondishko Consistency checking. The authors of the definitive paper on Hobgoblin (Farmer and Spafford at Purdue) claim that the program is faster and more configurable than COPS and generally collects information in greater detail. What makes Hobgoblin most interesting, though, is that it is both a language and an interpreter. The programmers provided for their own unique descriptors and structural conventions.

The package seems easy to use, but there are some pitfalls. Although globbing conventions (from both csh and sh/bash) are permissible, the Hobgoblin interpreter reserves familiar and often-used metacharacters that have special meaning. Therefore, if you intend to deploy this powerful tool in a practical manner, you should set aside a few hours to familiarize yourself with these conventions.

In all, Hobgoblin is an extremely powerful tool for monitoring file systems. However, I should explain that the program was written specifically for systems located at the University of Rochester and, although it has been successfully compiled on a variety of platforms, your mileage may vary. This is especially so if you are not using a Sun3, Sun4, or VAX with Ultrix. In this instance, some hacking may be involved. Moreover, it has been observed that Hobgoblin is lacking some elements present in other file-integrity checkers, although I believe that third-party file-integrity checkers can be integrated with (and their calls and arguments nested within) Hobgoblin.

Cross Reference: Hobgoblin and its source are located at ftp://freebsd.cdrom.com/.20/security/coast/tools/unix/hobgoblin/hobgoblin.shar.Z.uu.Z.

On Other Platforms

You're probably wondering whether there are any such utilities for the Windows platform. It happens that there are, though they are perhaps not as powerful or reliable. Most of these tools rely on simple checksums and are, therefore, not as comprehensive as tools that employ MD5. Flatly stated, the majority of the tools for the Microsoft platform are intended for use as virus scanners.

For this reason, I have not listed these utilities here (a listing of them does appear in Chapter 14, "Destructive Devices"). However, I do want to address a few points: It is generally assumed that trojans are a security problem primarily for UNIX and that when that problem is a Windows problem, it usually involves a virus. There is some truth to this, and there are reasons for it.

Until recently, security on IBM compatibles running Microsoft products was slim. There was no need for complex trojans that could steal (or otherwise cull) information. Thus, the majority of trojans were viruses encased in otherwise useful (or purportedly useful) programs. That situation has changed.

It should be understood that a trojan can be written just as easily for a Microsoft platform as for any other. Development tools for these platforms are powerful, user-friendly applications (even VC++ far surpasses C compiling utilities made by other firms). And, now that the Windows environment is being used as Internet server material, you can expect trojans to emerge there as well.

Summary

People generally equate trojan horses with virus attacks and, while this is accurate to some degree, it is not the whole picture. True, trojans on the PC-based operating systems have traditionally been virus related, but on the UNIX platform, a totally different story emerges. On the UNIX platform, crackers have consistently crafted trojans that compromise security without damaging data or attaching unauthorized code to this or that executable.

In either case, however, one thing is clear: Trojans are a significant security risk to any server as well as to machines networked to that server. Because PC-based servers are becoming more common on the Internet, utilities (above and beyond those virus checkers already available) that can identify trojaned files must be developed.

Resources

What follows is an extensive list of resources concerning object reconciliation. Some of these documents are related to the process of object reconciliation (including practical examples) and some are related to the process by which this reconciliation is performed. All of them were handpicked for relevancy and content. These are the main papers available from the void (some books are sprinkled in as well). I recommend that every system administrator at least gain a baseline knowledge of these techniques (if not actually implement the procedures detailed within).

"MDx-MAC and Building Fast MACs from Hash Functions." Bart Preneel and Paul C. van Oorschot. Crypto 95.

ftp.esat.kuleuven.ac.be/pub/COSIC/preneel/mdxmac_crypto95.ps

"Message Authentication with One-Way Hash Functions." Gene Tsudik. 1992. IEEE Infocom 1992.

http://www.zurich.ibm.com/Technology/Security/publications/1992/t92.ps.Z

"RFC 1446--1.5.1. Message Digest Algorithm." Connected: An Internet Encyclopedia.

http://www.freesoft.org/Connected/RFC/1446/7.html

"Answers To FREQUENTLY ASKED QUESTIONS About Today's Cryptography." Paul Fahn. RSA Laboratories. 1993 RSA Laboratories, a division of RSA Data Security.

http://www.sandcastle-ltd.com/Info/RSA_FAQ.html

"The Checksum Home Page." Macintosh Checksum.

http://www.cerfnet.com/~gpw/Checksum.html

"RFC 1510--6. Encryption and Checksum Specifications." Connected: An Internet Encyclopedia.

http://www.freesoft.org/Connected/RFC/1510/69.html

"RFC 1510--6.4.5. RSA MD5 Cryptographic Checksum Using DES (rsa-md5des)." Connected: An Internet Encyclopedia. J. Kohl. Digital Equipment Corporation, C. Neuman, ISI. September 1993.

http://www.freesoft.org/Connected/RFC/1510/index.html

"Improving the Efficiency and Reliability of Digital Time-Stamping." D. Bayer and S. Haber and W. S. Stornetta. 1992.

http://www.surety.com

"A Proposed Extension to HTTP: Simple MD5 Access Authentication." Jeffery L. Hostetler and Eric W. Sink. 1994.

http://www.spyglass.com/techreport/simple_aa.txt

"A Digital Signature Based on a Conventional Encryption Function." Ralph C. Merkle. Crypto 87, LNCS, pp. 369-378, SV, Aug 1987.

"An Efficient Identification Scheme based on Permuted Kernels." Adi Shamir. Crypto 89, LNCS, pp. 606-609, SV, Aug 1989.

"An Introduction To Digest Algorithms." Proceedings of the Digital Equipment Computer Users Society Australia, Ross N. Williams. Sep 1994.

ftp://ftp.rocksoft.com/pub/rocksoft/papers/digest10.tex

"Data Integrity With Veracity." Ross N. Williams.

ftp://ftp.rocksoft.com/clients/rocksoft/papers/vercty10.tex

"Implementing Commercial Data Integrity with Secure Capabilities." Paul A. Karger. SympSecPr. Oakland, CA. 1988. IEEECSP.

"Trusted Distribution of Software Over the Internet." Aviel D. Rubin. (Bellcore's Trusted Software Integrity (Betsi) System). 1994.

ftp://ftp.cert.dfn.de/pub/docs/betsi/Betsi.ps

"International Conference on the Theory and Applications of Cryptology." 1994 Wollongong, N.S.W. Advances in Cryptology, ASIACRYPT November 28-December 1, 1994. (Proceedings) Berlin & New York. Springer, 1995.

"Managing Data Protection" (Second Edition). Dr. Chris Pounder and Freddy Kosten, Butterworth-Heinemann Limited, 1992.

"Some Technical Notes on S/Key, PGP..." Adam Shostack.

http://www.homeport.org/~adam/skey-tech-2.html

"Description of a New Variable-Length Key, 64-Bit Block Cipher" (Blowfish). Bruce Schneier. Counterpane Systems.

http://www.program.com/source/crypto/blowfish.txt