Pigs In Space

Today I noticed that my root filesystem has a little less free space than I would really like it to have, so I decided to do a bit of cleanup…

Finding The Space Hogs

I’m a bit old-fashioned, so I still tend to do this sort of thing from the command line rather than using fancy GUI tools. Over the years, this has served me well, because I can get things done even when I only have terminal access to a system (e.g., via ssh or a console) and nothing but simple standard commands to work with.

One quick trick is that you can easily find the top ten space hogs within any given directory using the following command:

du -k -s -x * | sort -n -r | head

Let me break down what this does.

The ‘du’ command provides you with information on disk usage. Its default behavior is to give you a recursive listing of how much space is used beneath all directories from the current directory on down.

The ‘-k’ option tells du to report all utilization numbers in kilobytes. This will be important in a moment when we sort the list: du’s default unit varies between systems (512-byte blocks on some, 1024-byte blocks on others), and human-readable output such as that produced by the ‘-h’ option mixes units. It’s much easier for a generic sort program to sort plain numbers like 1258291 and 333824 than strings like 1.2G and 326M.
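
To see why mixed units trip up a numeric sort, here is a small sketch. The two sizes are made-up sample values rather than output from a real system, and the variable names are only there for illustration.

```shell
# sort -n only reads the leading number on each line and ignores the unit
# suffix, so 326M is treated as 326 and 1.2G as 1.2.
human_sorted=$(printf '%s\n' 1.2G 326M | sort -n -r)
echo "$human_sorted"   # 326M comes out first, even though 1.2G is larger

# The same two sizes expressed in kilobytes sort the way you would expect.
kb_sorted=$(printf '%s\n' 1258291 333824 | sort -n -r)
echo "$kb_sorted"      # 1258291 comes out first
```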

The ‘-s’ option tells du to only report a sum for each file or directory specified, rather than also recursively reporting on all of the subdirectories underneath.

The ‘-x’ option tells du to stay within the same filesystem. This is important if you’re exploring a filesystem that might have other filesystems mounted beneath it, as it tells du not to include information from those other filesystems. For instance, if you’re trying to clean up /var and /var/lib/lxc is a different filesystem mounted from its own separate storage, you don’t want to include the stuff under /var/lib/lxc in your numbers.

Finally, we specify an asterisk wildcard (‘*’) to tell du to spit out stats for each file and directory within the current directory. (Note that if you have hidden files or directories — files that begin with a ‘.’ in the Linux/Unix world — the asterisk will ignore those by default in most shells.)
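
If you want hidden entries counted too, one approach is to let find list everything in the directory and hand the names to du. This is only a sketch and not part of the one-liner above: it assumes a find that supports -mindepth and -maxdepth (GNU and BSD find do, but they are not in POSIX), and the function name hogs_with_hidden is made up for this example.

```shell
hogs_with_hidden() {
    # $1 is the directory to inspect (defaults to the current directory).
    # find lists every entry directly inside it, dotfiles included, and
    # passes them all to du in one go.
    find "${1:-.}" -mindepth 1 -maxdepth 1 -exec du -k -s -x {} + |
        sort -n -r | head
}
```

Running hogs_with_hidden /var would then report on things like /var/.something alongside the visible entries, with each name prefixed by the directory you passed in.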

Next, we pipe the output of the du command to the ‘sort’ command, which does pretty much exactly what it sounds like it should do.

The ‘-n’ option tells sort to do a numeric sort rather than a string sort. By default sort will use a string sort, which would yield results like “1, 10, 11, 2, 3” instead of “1, 2, 3, 10, 11”.
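
A quick way to convince yourself of the difference, as a minimal sketch (the paste command is only there to put the results on one line, and the variable names are just for illustration):

```shell
# Default (string) sort compares character by character, so "10" lands before "2".
string_sorted=$(printf '%s\n' 1 2 3 10 11 | sort | paste -s -d ' ' -)
echo "$string_sorted"    # 1 10 11 2 3

# Numeric sort compares the values themselves.
numeric_sorted=$(printf '%s\n' 1 2 3 10 11 | sort -n | paste -s -d ' ' -)
echo "$numeric_sorted"   # 1 2 3 10 11
```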

The ‘-r’ option tells sort that we want it to output results in reverse order (i.e., last to first, or biggest to smallest).

Finally, we pipe the sorted output to the ‘head’ command. The head command will spit out the “head,” or the first few lines of a file. By default, head will spit out the first ten lines of a file.

The net result is that this command gives the top ten space hogs in the current directory.

On my system I know that it’s almost invariably the /var directory that sucks up space on my root filesystem, so I started there:

[root@server ~]# cd /var
[root@server var]# du -k -s -x * | sort -n -r | head
3193804 lib
2859856 spool
1386684 log
195932 cache
108 www
88 tmp
12 kerberos
12 db
8 empty
4 yp

This tells me that I need to check out /var/lib, /var/spool, and /var/log. The /var/cache directory might yield a little space if cleaned up, but probably not a lot, and everything else is pretty much beneath my notice.

From here, I basically just change directory to each of the top hitters and repeat the process, cleaning up unnecessary files as I go along.
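
If you get tired of typing the whole pipeline each time, you could wrap it in a small shell function. This is just a sketch under a few assumptions: the name hogs is made up, the optional second argument relies on head accepting -n, and like the original command it skips hidden entries and will complain if the directory is empty.

```shell
hogs() {
    # $1: directory to inspect (default: current directory)
    # $2: how many entries to show (default: 10, same as plain head)
    # The subshell keeps the cd from changing the caller's directory.
    (
        cd "${1:-.}" || exit 1
        # "--" stops option parsing, so names starting with "-" are safe.
        du -k -s -x -- * | sort -n -r | head -n "${2:-10}"
    )
}
```

With that defined, hogs /var/lib shows the top ten under /var/lib, and hogs /var/spool 5 shows only the top five under /var/spool, without leaving whatever directory you happen to be in.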
