Most problems with Linux are really easy to fix if you know how to find Linux error messages. This article will show you the top five places to find relevant error messages.
#1: Find Linux Error Messages By Asking The Linux Kernel
Linux includes a built-in method for keeping track of its most recent error messages. It’s called the Linux kernel ring buffer.
Computer scientists know what a ring buffer does, but most everyone else doesn’t, so here’s a quick explanation.
A ring buffer is a fixed-size block of storage, usually kept in computer memory (RAM). It always stays the same size: the kernel sets the size of its ring buffer when it starts up and does not grow or shrink it afterward. In order to stay the same size, a ring buffer deletes the oldest lines in its memory when newer lines arrive, making it look like lines start at one end of the ring and disappear when they reach the other end.
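For readers who like to see the idea in action, here is a minimal sketch of a ring buffer written as a shell function around awk. The function name keep_last and the three-slot size are made up for this illustration; the kernel’s real buffer is far larger and is written in C.

```shell
# keep_last N: read lines on stdin, remember only the newest N of them in a
# circular array, then print the survivors oldest-first.
# (Hypothetical helper for illustration; not how the kernel is implemented.)
keep_last() {
    awk -v size="$1" '
        { slot[NR % size] = $0 }    # each new line overwrites the oldest slot
        END {
            start = (NR > size) ? NR - size + 1 : 1
            for (i = start; i <= NR; i++) print slot[i % size]
        }'
}

# Five messages arrive, but the buffer only has room for three.
printf '%s\n' one two three four five | keep_last 3
```

Only three, four, and five are printed: one and two were overwritten when the newer lines arrived.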
Because the kernel stores the ring buffer in computer memory, you can’t read it like a regular file. You need to use a special command, dmesg. On many systems, any user can type this command to see what the kernel did recently, although some distributions restrict it to root. The output of dmesg has a specific format; here’s a sample:
[143297.900019] sd 0:0:0:0: [sda] Starting disk
[143298.192032] sd 1:0:0:0: [sdb] Starting disk
[143317.464817] agpgart-amdk7 0000:00:00.0: AGP 2.0 bridge
The number in the brackets to the left indicates when the event happened. Unfortunately, the kernel doesn’t keep track of time the same way you or I do. The kernel knows nothing about days or hours—that’s the job of end-user software—the kernel cares only about how many seconds since you turned on the computer. The first message above occurred 143,297 seconds after I most recently turned on my computer.
So to figure out when a warning or error occurred, you need to know how many seconds your computer has been running. Luckily, the kernel makes it easy to find this number. Run the following command:
cat /proc/uptime
The number of seconds since the kernel began running appears first. Subtraction will tell you exactly how many seconds ago the error occurred, or you can keep reading to discover how to get the computer to do the math for you.
The second field in the ring buffer output usually indicates the Linux kernel driver responsible for printing the message. In the example above, the messages come from “sd” (the SCSI disk driver) and agpgart-amdk7 (the AGP [video card] Graphics Address Remapping Table [GART] driver for my AMD K7 motherboard).
In some cases, as in the example above, the third field shown is the hardware address. The rest of the message is the actual warning or error message.
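The timestamp subtraction described above can be sketched in a few lines of shell. The helper names stamp_of and event_age are hypothetical, and the uptime value in the worked example is invented so the arithmetic is easy to follow.

```shell
# stamp_of LINE: print the bracketed timestamp that starts a dmesg line.
stamp_of() {
    printf '%s\n' "$1" | sed -n 's/^\[ *\([0-9.]*\)].*/\1/p'
}

# event_age UPTIME STAMP: print how many seconds ago the event happened,
# rounded to the nearest whole second.
event_age() {
    awk -v up="$1" -v stamp="$2" 'BEGIN { printf "%.0f\n", up - stamp }'
}

line='[143297.900019] sd 0:0:0:0: [sda] Starting disk'

# Worked example with an invented uptime of 143400.50 seconds.
event_age 143400.50 "$(stamp_of "$line")"    # prints 103

# On a live system, pass the first field of /proc/uptime instead, e.g.:
#   event_age "$(cut -d' ' -f1 /proc/uptime)" "$(stamp_of "$some_dmesg_line")"
```

If your copy of dmesg comes from a reasonably recent util-linux, it may also accept the -T option, which prints human-readable times directly.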
#2: Find Linux Error Messages By Checking The System Log
All sorts of programs besides the Linux kernel need to store their warning and Linux error messages—and many programs want to keep those messages around even after the kernel ring buffer fills up, so programmers added the system log program to Unix and Linux.
The system log program, syslogd, starts running shortly after the computer boots up. The first thing it does is copy all of the messages from dmesg, so you’ll find all of your kernel ring buffer messages neatly stored with easy-to-read date stamps. After starting up, syslogd creates a special file, /dev/log, to which any program can write an error message. Syslogd reads each message as it’s written and adds the useful information shown below:
Jan 28 07:58:45 callisto kernel: [164734.375126] PM: Basic memory bitmaps freed
Jan 28 07:58:52 callisto acpid: 1 client rule loaded
Jan 28 07:58:52 callisto anacron[31557]: Anacron 2.3 started on 2011-01-28
The fields shown here are the date, the hostname (callisto in my case), the name of the program that created the log entry, and optionally the program’s Process I.D. (PID), followed by the actual log entry.
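To make the field layout concrete, here is a rough sketch that labels each part of a syslog line. The function name parse_syslog and the output layout are my own choices for this illustration, and the sketch assumes the traditional format shown above (three date fields, hostname, program name).

```shell
# parse_syslog: read syslog lines on stdin and label each field.
# (Hypothetical helper; assumes the traditional "Mon DD HH:MM:SS host prog:" layout.)
parse_syslog() {
    awk '{
        tag = $5
        sub(/:$/, "", tag)                    # drop the trailing colon
        pid = "-"                             # "-" means no PID was logged
        if (match(tag, /\[[0-9]+\]$/)) {      # e.g. anacron[31557]
            pid = substr(tag, RSTART + 1, RLENGTH - 2)
            tag = substr(tag, 1, RSTART - 1)
        }
        msg = $0
        for (i = 1; i <= 5; i++)              # strip the five header fields
            sub(/^ *[^ ]+ */, "", msg)
        printf "date=%s %s %s | host=%s | program=%s | pid=%s | message=%s\n",
               $1, $2, $3, $4, tag, pid, msg
    }'
}

# Two of the sample lines from above.
parse_syslog <<'EOF'
Jan 28 07:58:45 callisto kernel: [164734.375126] PM: Basic memory bitmaps freed
Jan 28 07:58:52 callisto anacron[31557]: Anacron 2.3 started on 2011-01-28
EOF
```

The kernel line reports pid=- because the kernel has no Process I.D., while the anacron line reports pid=31557.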
To read the logs generated by syslogd, you need to become root. This is a security measure designed to protect the logs from hackers and snooping users, as the logs may contain confidential information. Use su or sudo to become root on your system and run the following command to read the system log:
less /var/log/syslog
#3: Find Linux Error Messages For Long-Running Programs
Ever wonder why so many Linux programs end in the letter “d”? It comes from a very old computer science term, “daemon” (pronounced demon), which comes from ancient Greek mythology—a daemon was a soulless creature created by the Greek gods to do boring, mundane work over and over. Linux daemons, like syslogd, also spend all of their time doing the same job over and over again.
Because these programs are usually so boring, they tend to store their logs separately from the system log. The logs are created by syslogd, so they use the same format shown above and you still need to be root in order to read them. Daemon log files are where you want to look if a Linux service, such as printing, stops working. Read them by typing the following command as root:
less /var/log/daemon.log
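A busy daemon log mixes messages from many services, so it can help to show only the lines from the one you care about. The following is a small sketch, assuming the syslog format shown earlier; only_program is a made-up name, and the input is the sample log from this article.

```shell
# only_program NAME: print just the log lines written by program NAME.
# (Hypothetical helper; matches on the program-name field of each line.)
only_program() {
    awk -v name="$1" '{
        tag = $5
        sub(/(\[[0-9]+\])?:$/, "", tag)   # strip the optional [PID] and the colon
        if (tag == name) print
    }'
}

only_program anacron <<'EOF'
Jan 28 07:58:45 callisto kernel: [164734.375126] PM: Basic memory bitmaps freed
Jan 28 07:58:52 callisto acpid: 1 client rule loaded
Jan 28 07:58:52 callisto anacron[31557]: Anacron 2.3 started on 2011-01-28
EOF
```

On a real system you would feed it the log file as root, for example by piping the output of cat /var/log/daemon.log into only_program followed by the name your service logs under.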
#4: Find Linux Error Messages For Programs With A Lot To Say
Some programs, most of them daemons, print a lot of warnings, errors, or just general information, so they try to avoid cluttering the system log or the daemon log by printing messages to their own log files. Almost every one of these programs stores its files in the /var/log directory. Programs name their files after themselves. For example, the MySQL database server stores its logs in /var/log/mysql.
You can see which logs are available by typing:
ls /var/log
You can read the log you want by typing less /var/log/filename, where filename is the file you want to read.
#5: Find Linux Error Messages For Broken Programs
Although you can’t usually fix kernel and daemon errors without error messages, sometimes end-user programs will give you a tough time too. If the graphical program you’re using behaves weirdly or gives you obscure errors, use the following simple method to get a little bit more information:
- Find out the program’s command name. Every program has one. Usually you can guess it from the program name; for example, Firefox’s command name is firefox. If you can’t guess it, right click on the program’s icon (launcher) and check the properties, where the command name will be listed.
- Open a terminal.
- Type the program’s command name and repeat the steps that cause the problem.
- Close down the program. The program’s debugging messages should appear in the terminal.
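The steps above can also be wrapped in a small shell function so that the messages are saved to a file you can paste into an email later. This is only a sketch: run_logged and the log file path are names chosen for the example, and the sh -c command is a stand-in for the real program.

```shell
# run_logged LOGFILE COMMAND [ARGS...]: run COMMAND, showing its messages in
# the terminal and saving both normal output and error output to LOGFILE.
# (Hypothetical helper for illustration.)
run_logged() {
    logfile=$1
    shift
    "$@" 2>&1 | tee "$logfile"
}

# Stand-in for a misbehaving program; replace it with, say, firefox.
demo_log="${TMPDIR:-/tmp}/run_logged_demo.log"
run_logged "$demo_log" sh -c 'echo "normal output"; echo "warning: something odd" >&2'
```

The 2>&1 part matters because many programs print their warnings to the error stream, which a plain pipe would not capture.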
What To Do With Linux Error Messages
Many Linux error messages will tell you what the problem is right away—so all you need to do is fix the problem. Other messages are more obscure. For these messages, I suggest you start by searching Google for the text of the error—you’ll almost always find a mailing list post explaining the Linux error messages. If you still can’t figure out the error message, find the user mailing list for the program you’re using and politely ask for help there. Be sure to include a good description of your problem and the complete Linux error message in your email.
David A. Harding is a Linux Professional Institute certified system administrator and freelance writer with over 10 years’ experience working with Linux. He’s been published in over a dozen magazines and has given over 50 presentations about Linux, including two Software Freedom Day keynotes. Dave always loves to hear from readers at dave@dtrt.org. Also see this article written by Dave: Linux Tools for Windows. Dave tends to find Linux error messages all over the place, including, in one disturbing case, on a plane about to take off.