Linux for Newbies, Part 10:
Finding Things on a Linux System

by Gene Wilburn

(The Computer Paper, May 2000. Copyright © Wilburn Communications Ltd. All rights reserved)


"Now where did I put that file?" is a common question on all computer systems. On Linux systems file dispersal is a particular fact of life when you add programs to your system either via RPM's or hand-rolled tarballs. Where did the RPM put that executable or library file? Where are those sample Java programs I was experimenting with? Where did that PNG graphic I created with the GIMP get stored?

Finding things is somewhat easier on Linux than on some operating systems because Linux has a logical file structure. You can usually find binary files, for instance, by simply looking in logical places such as /bin, /sbin, /usr/bin, /usr/sbin, and /usr/local/bin. Nonetheless, some executables end up elsewhere, such as /opt/wp/bin or /usr/X11R6/bin.

Most of your own files end up somewhere in your /home directory. If you're running an Apache web server, your HTML docs can be either under Apache itself (/home/httpd/html on Red Hat systems), or in your home directory -- e.g., /home/yourname/public_html.

Even though the logic of the file system helps, when you are looking for something among the thousands of files that typically populate any Linux setup, it's nice to know that there are aids for finding things.

Quick Finders

Quite often you need a quick way to know where a binary file resides. Different Unix systems have slightly differing philosophies about where certain things are stored, and the slight variations sometimes need to be taken into account. If, for example, you download a Perl script, it may have the bangpath statement #!/usr/local/bin/perl as the first line. If the script doesn't run on your Linux system there's a good chance that your Perl is not in /usr/local/bin. To find out where it is, use the which command:
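A minimal sketch of the lookup, assuming perl is the program you are hunting for. The fallback message is an addition for systems where perl is not on the path:

```shell
# Ask where the shell finds the perl executable on the current PATH.
which perl || echo "perl was not found on your PATH"
```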

Which displays the full path of a program. With this information you can change the bangpath at the top of the script to match (for example, #!/usr/bin/perl), or add a symbolic link from /usr/local/bin/perl to /usr/bin/perl.

Locate is similar in purpose to which, but instead of searching your path it provides a command-line front end to a database of all the files stored on your computer. It often displays a considerable amount of output, so you may need to pipe the results through less or more.

You can use locate to find the related files for a program. For instance, the default vi editor in Red Hat Linux is Vim. You know that Vim has help files you can invoke from within an edit session but you wonder where those files reside. You can find out with the locate command:
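A plausible form of the search, assuming "vim" as the search string. The pipe to head is an addition that limits the output to the first 20 matches; piping to less instead lets you page through everything:

```shell
# List files whose path contains "vim", showing only the first 20 matches.
# Prints nothing if locate is missing or its database has not been built yet.
locate vim 2>/dev/null | head -n 20
```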

The locate command shows that there are a large number of help files in /usr/share/vim/doc.

Locate uses the locatedb database, which must be updated periodically to keep up with changes and additions to your system. Most Linux systems have a cron job set to update the database. However, if you don't run your system 24 hours per day, you may need to update the database manually. You do this by logging in as root and running the updatedb command.

Whereis is a command similar to locate, but somewhat simpler:
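For instance, assuming perl is again the command of interest (the fallback message is an addition for systems without whereis):

```shell
# Report the binary, source and man page locations known for "perl".
whereis perl || echo "whereis found nothing, or is not installed"
```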

Whereis only displays a program's binary executable, its source and its man page locations. This is preferable to locate when you don't want all the extra related file locations.

There are times when you can remember the name of a command but can't remember what it does. What is mformat for instance? Just type the following:
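A minimal sketch. What it prints depends on whether the mformat man page is installed and indexed on your system, so the fallback message is an addition:

```shell
# Print the one-line manual description for mformat, if its man page is indexed.
whatis mformat || echo "no whatis entry for mformat on this system"
```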

Whatis is complementary to whereis. Because whatis also relies on a database, that database needs to be rebuilt occasionally. To rebuild it, log in as root and run the makewhatis command (mandb on many newer distributions).

Sometimes a different kind of search is necessary. You know there's a utility that does such and such, but you can't quite remember its name. What is the name of that command that can format a DOS disk? Apropos to the rescue. Apropos searches the whatis database for matching strings. You can find the program you're looking for by typing:
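A sketch of such a search. The keyword "format" is an assumption, and any word you expect in the command's description will do. The results depend on which man pages are installed:

```shell
# Search the whatis database for entries that mention "format".
apropos format || echo "no matching entries, or apropos is not available"
```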

Ah yes, it's mformat!

These handy finding aids are quick and easy. Once you start using them, finding most files becomes a simple task. But sometimes you're after a needle in the haystack. For that, you turn to find, arguably one of the most useful and powerful tools on a Unix/Linux system.

Find--a Unix/Linux power tool

Find does just what the name suggests--it finds things. It's a brute-force seeker, recursing down through every directory in its path looking for patterns (including wildcards and regular expressions). Because it has no pre-existing database, it's much slower than locate or whereis, but it is thorough and dependable. If find doesn't find it, it most likely doesn't exist.

Linux uses GNU Find, a Swiss Army knife of a utility with options galore. Find has a somewhat unusual syntax that requires a little study. The most common use of find is to find a single file or a set of files related by name. The following example shows a way to look for any filenames or directory names called "perl", starting at the root directory ("/"):
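A sketch of the short form. The article starts the search at "/"; /usr/bin is used here only as an assumption, to keep the run quick and free of permission warnings:

```shell
# Short form: with no action given, GNU find prints each match.
# To search the whole system, replace /usr/bin with /
find /usr/bin -name perl
```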

The syntax in the preceding example is a modern shorthand that won't work with all versions of find on all Unix systems. Because of this, I prefer the classic, slightly more verbose, syntax of find (using the same example):
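The same search in the classic form, again with /usr/bin standing in for "/" as an assumption:

```shell
# Classic form: quote the search term and ask for the matches to be printed.
# To search the whole system, replace /usr/bin with /
find /usr/bin -name "perl" -print
```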

This more universal version of the command includes the -print qualifier (meaning print to screen) and surrounds the search term in quotes. The search expression can include wildcards. If you're pretty sure, for instance, that there's a file on the system that ends in "perl" the search can be rephrased as:
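To see what the wildcard does without trawling the whole disk, here is a sketch that runs against a small scratch directory. The directory and the three file names are made up for the illustration:

```shell
# Build a scratch directory with three hypothetical file names.
top=$(mktemp -d)
touch "$top/perl" "$top/miniperl" "$top/perldoc"

# "*perl" matches names that end in "perl". The quotes stop the shell
# from expanding the wildcard before find sees it.
ends_in_perl=$(find "$top" -name "*perl" -print | sort)
echo "$ends_in_perl"

rm -r "$top"
```

Only perl and miniperl are listed; perldoc does not end in "perl", so it is skipped.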

Likewise you can surround the entire search term with wildcards, e.g. "*perl*", fishing out any instance of a file or directory that has "perl" anywhere in its name.

Advanced uses of find

Find has an armada of optional qualifiers, including file types, date qualifiers and boolean logic ("and", "or", and "not"). One of the most common uses of find is looking for files of a certain age. For example, you can search down through your home directories looking for files that were last modified exactly ten days ago by using the -mtime qualifier (date last modified):
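A sketch that can be run safely anywhere. It uses a scratch directory in place of /home and backdates one file with GNU touch; both are assumptions made for the demonstration:

```shell
# Scratch directory standing in for /home.
top=$(mktemp -d)
touch "$top/today.txt"
# Backdate one file by 250 hours, i.e. ten days and ten hours (GNU touch syntax).
touch -d "250 hours ago" "$top/ten-days.txt"

# -mtime 10 matches files last modified ten whole days ago
# (the fraction of a day is dropped before comparing).
ten_day_files=$(find "$top" -mtime 10 -print)
echo "$ten_day_files"

rm -r "$top"
```

On a real system the equivalent would be along the lines of find /home -mtime 10 -print.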

Likewise you can look for files that have not been accessed for over 90 days by typing:
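Again as a sketch against a scratch directory (an assumption; the real search would start at /home or wherever your users' files live):

```shell
top=$(mktemp -d)
touch "$top/recent.txt"
# Create a second file and backdate only its access time (GNU touch syntax).
touch -a -d "100 days ago" "$top/stale.txt"

# +90 means "more than 90 days"; -type f limits the search to regular files.
stale_files=$(find "$top" -type f -atime +90 -print)
echo "$stale_files"

rm -r "$top"
```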

The -type qualifier, in this example, tells find to search only for files of type "f" (regular files), as opposed to directories, device files or other special files. The -atime qualifier inspects the "last access" time stamp on any files found. This can be very useful for cleaning up systems. If you have a rule, for instance, that user files that have not been accessed for over a year should be automatically deleted, you add the -exec qualifier to find:
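Because this one deletes things, the sketch below works only inside a throwaway directory. The directory and file names are assumptions for the demonstration:

```shell
top=$(mktemp -d)
touch "$top/keep.txt"
# A file whose last access was well over a year ago (GNU touch syntax).
touch -a -d "400 days ago" "$top/old.txt"

# Delete regular files that have not been accessed for more than 365 days.
find "$top" -type f -atime +365 -exec rm {} \;

remaining=$(ls "$top")
echo "$remaining"

rm -r "$top"
```

Only keep.txt survives. Pointed at /home instead of the scratch directory, the same find line would enforce the one-year rule for real.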

The syntax following the -exec operator may look a little odd. The "{}" is a placeholder that find replaces with the name of each file it finds. The ";" marks the end of the command that -exec runs, and the backslash in front of it keeps the shell from treating the semicolon as its own command separator. Anything that uses the "rm" (remove) command is potentially dangerous. Use the -print qualifier in place of -exec during testing until you're sure you've got it right.

Now, let's think about what we've got here: a utility (find) that can traverse an entire file system, or selected parts of the file tree, and execute commands on the things it finds. We now have the makings of some very powerful scripts, such as a global search and replace on a Web site. You could combine find with a command-line Perl statement to change all instances of "Copyright 1999" to "Copyright 2000" on all the HTML pages of your site, making a backup copy of all the changed files, just to be safe:
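One way this could look, sketched against a scratch directory holding a single made-up page. The directory, the file name and its contents are assumptions, and perl must be installed:

```shell
site=$(mktemp -d)
echo "Copyright 1999 Example Co." > "$site/index.html"

# perl -p loops over every line, -i.bak edits the file in place and keeps
# the original as a .bak copy, and -e supplies the substitution to run.
find "$site" -name "*.html" -exec perl -pi.bak -e 's/Copyright 1999/Copyright 2000/g' {} \;

new_text=$(cat "$site/index.html")
old_text=$(cat "$site/index.html.bak")
echo "$new_text"

rm -r "$site"
```

The edited page now reads "Copyright 2000", while index.html.bak still holds the 1999 text.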

To get rid of all the *.bak files this command created, after you've verified that everything was changed correctly, you can type:
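A sketch of the cleanup, with a dry run first as recommended earlier. The scratch directory and file names are assumptions:

```shell
site=$(mktemp -d)
touch "$site/index.html" "$site/index.html.bak" "$site/about.html.bak"

# Dry run: list what would be deleted.
find "$site" -name "*.bak" -print
# Then remove the backups for real.
find "$site" -name "*.bak" -exec rm {} \;

left_over=$(ls "$site")
echo "$left_over"

rm -r "$site"
```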

This is the inner beauty of Unix--providing logical building blocks that allow you to string together existing utilities to create your own custom utilities.

Another use for find is searching for programs on your system that have special permissions set. Some executables, such as lpr or sendmail, need to execute with root privilege and are "set user ID" to root (SUID). Knowing where these are on your system can help you watch for security vulnerabilities.

You can find all such files on your system by typing:
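A sketch using the -perm qualifier, demonstrated on a scratch directory so it can be tried without root. The directory and file names are assumptions:

```shell
top=$(mktemp -d)
touch "$top/plain" "$top/suid-demo"
chmod 4755 "$top/suid-demo"   # set the SUID bit on one scratch file

# -perm -4000 matches files that have the SUID bit set,
# whatever their other permission bits are.
suid_files=$(find "$top" -type f -perm -4000 -print)
echo "$suid_files"

rm -r "$top"
```

For a system-wide audit you would run the same find line as root, starting at "/" instead of the scratch directory.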

If you see a file in this listing that doesn't make sense, you may have detected the work of an intruder.

Find has so many qualifiers and so many potential uses that this article only begins to convey a sense of its full power. You can find out more about using find from advanced Unix books, such as Unix Power Tools, O'Reilly & Associates (ISBN 1-56592-260-3, $85.95).

Next time: regular expressions

Gene Wilburn (gene@wilburn.ca) is a Toronto-based IT specialist, musician and writer who operates a small farm of Linux servers.

-30-