Monthly Archives: May 2018

Sys Army Knife – What’s in list x but not list y?

It’s time once again to pull out your sys army knife and explore how to best use some of the tools available to system administrators out there! These “sys army knife” posts explore how to use common Linux/Unix command line tools to accomplish tasks that system administrators may encounter day-to-day.

I’m regularly involved in large-scale data center migration projects, so I quite commonly have to look at two different lists of things and figure out which entries are unique to each list.

For instance, I might have a list of machines that we’re planning to migrate. If someone gives me an updated list of machines in the data center, I have to figure out whether there are machines we don’t have to migrate after all, or new machines we have to plan for.

Sysadmins do it with one line

If each of your lists contains only unique values, this task can be done with a simple one liner, like this:

cat file1 file2 file2 | sort | uniq -u

For example, let’s say that I have two lists. The first list is in a file named x and looks like this:

appserver01
appserver02
dbserver01
webserver01
webserver02
webserver03

The second list is in a file named y and looks like this:

appserver02
appserver03
dbserver01
webserver01
webserver03
webserver04

This shows the values unique to x:

$ cat x y y | sort | uniq -u
appserver01
webserver02

… and this shows the lines unique to y:

$ cat y x x | sort | uniq -u
appserver03
webserver04

How does it work?!?

What the commands above do is this: They take one copy of one file, two copies of a second file, sort the results, and then only print out lines that occur a single time.

You start with one copy of the first file, which means you have one copy of every line in that file. Then you add two copies of the second file. This means that you will have three copies of any line that is in both files, and two copies of any line that only occurs in the second file, but you’ll still only have one copy of any line that only exists in the first file. Thus, if you search for lines that only occur once in the final results, you’ll only find lines that are unique to the first file.

Here’s a little more detail:

The first part of the command (cat file1 file2 file2) concatenates together one copy of file1 and two copies of file2 and spits that out.

We then take the output of that cat command and pipe (‘|’) it to the sort command, which will produce a sorted copy of the data it receives. We need to do this because the next command we use expects its input to be sorted, and won’t produce correct results if the input it receives isn’t sorted.

Finally, we pipe the sort output to the ‘uniq’ command. The ‘-u’ option to the uniq command tells it to only print unique lines (i.e., lines that only exist once).
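
If you want to see the counting in action, swap ‘uniq -u’ for ‘uniq -c’, which prefixes each line with the number of times it appears (the exact spacing varies between implementations). Using the x and y files from above:

$ cat x y y | sort | uniq -c
      1 appserver01
      3 appserver02
      2 appserver03
      3 dbserver01
      3 webserver01
      1 webserver02
      3 webserver03
      2 webserver04

The lines with a count of 1 are exactly the lines that ‘uniq -u’ prints.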

There can be only one…

You may encounter situations where the contents of your lists have duplicate values. If you have no duplicate values in file1, but duplicate values in file2, the command chain will still work as expected. However, if you have duplicate values in file1, those values will never show up in the results, even if they only exist in file1. This is because ‘uniq -u’ only prints lines that appear exactly once in the combined input, and a line duplicated within file1 already appears at least twice.

The quick and easy way around this is to simply create a copy of file1 that removes any duplicates before starting:

sort -u file1 > file1.nodupes

Then use that file without the duplicates in the command chain:

cat file1.nodupes file2 file2 | sort | uniq -u
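
If you’d rather not create an intermediate file, and you’re using a shell that supports process substitution (bash, for example), you can de-duplicate file1 on the fly:

cat <(sort -u file1) file2 file2 | sort | uniq -u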

The beauty of it all

This may seem like an esoteric problem that you’re not likely to encounter very often, but you might be surprised how often this problem comes up. Here are just a few examples off the top of my head:

  • Find files that are unique between two servers
  • Find installed packages that are unique between two servers (sketched below)
  • Using old and new server lists figure out which servers are gone and which are new
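
As a rough sketch of the installed-packages example, assuming two RPM-based hosts you can reach over ssh (the hostnames here are just placeholders), you could do something like this:

ssh server1 rpm -qa | sort -u > pkgs.server1
ssh server2 rpm -qa | sort -u > pkgs.server2
cat pkgs.server1 pkgs.server2 pkgs.server2 | sort | uniq -u
cat pkgs.server2 pkgs.server1 pkgs.server1 | sort | uniq -u

The first ‘cat’ pipeline shows packages that only exist on server1, and the second shows packages that only exist on server2. Since ‘rpm -qa’ includes version numbers, this will also flag packages that are installed on both servers at different versions, which may or may not be what you want.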

These commands are all very simple standard commands that exist on pretty much any Unix or Linux system out there: I started using these commands way back in the late 80’s on a MicroVax II running ULTRIX and have since used them on multiple versions of AIX, BSD, HP-UX, IRIX, Linux, and SunOS/Solaris.

Systemd Sucks… Up Your Disk Space

Over the last several years, the advent of systemd has been somewhat controversial in the Linux world. While it undeniably has a lot of features that Linux has been lacking for some time, many people think that it has gone too far: They think it insinuates itself into places it shouldn’t, unnecessarily replaces functionality that didn’t need to be replaced, introduces bloat and performance issues, and more.

I can see both sides of the argument and I’m personally somewhat ambivalent about the whole thing. What I can tell you, though, is that the default configuration in Fedora has a tendency to suck up a lot of disk space.

Huge… tracts of land

The /var/log/journal directory is where systemd’s journal daemon stores log files. On my Fedora systems, I’ve found that this directory has a tendency to grow quite large over time. If left to its own devices, it will often end up using many gigabytes of storage.
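
If you’re curious how much space it’s using on your own system, a quick du will tell you:

du -sh /var/log/journal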

Now that may not sound like the end of the world. After all, what are a few gigabytes of log files on a modern system that has multiple terabytes of storage?

Like the whole systemd argument, you can take two different perspectives on this:

Perspective 1: Disk is cheap, and if I’m not using that disk space for anything else, why not go ahead and fill it up with journal daemon logs?

Perspective 2: Why would I want to keep lots of journal daemon logs on my system that I probably won’t ever use?

I tend to take the second perspective. In my case, this is compounded by several other factors:

  1. I keep my /var/log directory in my root filesystem and deliberately keep that small (20GB), so I really don’t want it to fill up with unnecessary log files.
  2. I back up my entire root filesystem nightly to local storage and replicate that to remote storage. Backing up these log files takes unnecessary time, bandwidth, and storage space.
  3. I have a dozen or so KVM virtual machines and LXC containers on my main server. If I let the journal daemon logs on all of these run amok, that space really starts to add up.

Quick and Dirty Cleanup

If you’re just looking to do some quick disk space reclamation on your system, you can do this with the ‘journalctl’ command:

journalctl --vacuum-size=[size]

Quick note: Everything in this post requires root privileges. For simplicity, I show all the commands being run from a root shell. If you’re not running in a root shell, you’ll need to preface each command with ‘sudo’ or an equivalent to run the command with root privileges.

When using the journalctl command above, you specify what size you want the systemd journal log files to take up, and it will try to reduce the journal log files to that size.

Note that I say try. This command can’t do anything to log files that are currently open on the system, and various other factors may limit how much space it can actually reclaim.

Here’s an example I ran within an LXC container:

[root@server ~]# du -sh /var/log/journal
168M /var/log/journal
[root@server ~]# journalctl --vacuum-size=10M
Deleted archived journal /var/log/journal/ac9ff276839a4b429790191f8abb21c1/system@f24253741e8c412a9fe94a48257c2b35-0000000000000001-00055dcc288c8a73.journal (16.0M).
Deleted archived journal /var/log/journal/ac9ff276839a4b429790191f8abb21c1/user-2000@b54e732b7ea1430c95020d6a6553dccb-0000000000000f7b-00055dcef80287ee.journal (8.0M).
Deleted archived journal /var/log/journal/ac9ff276839a4b429790191f8abb21c1/system@f24253741e8c412a9fe94a48257c2b35-0000000000002c74-00056030edf54f82.journal (8.0M).
Deleted archived journal /var/log/journal/ac9ff276839a4b429790191f8abb21c1/user-2000@b54e732b7ea1430c95020d6a6553dccb-0000000000002d0e-00056056271d449c.journal (8.0M).
Deleted archived journal /var/log/journal/ac9ff276839a4b429790191f8abb21c1/system@f24253741e8c412a9fe94a48257c2b35-0000000000003d92-00056295d1dfc0cb.journal (8.0M).
Deleted archived journal /var/log/journal/ac9ff276839a4b429790191f8abb21c1/user-2000@b54e732b7ea1430c95020d6a6553dccb-0000000000003e4d-000562bca405ac7c.journal (8.0M).
Deleted archived journal /var/log/journal/ac9ff276839a4b429790191f8abb21c1/system@000562f8e6bc4730-4bc5e6409eab3024.journal~ (8.0M).
Deleted archived journal /var/log/journal/ac9ff276839a4b429790191f8abb21c1/system@866bd5425da84c0387e801f0d9f0dbe0-0000000000000001-0005630ace84e3a6.journal (8.0M).
Deleted archived journal /var/log/journal/ac9ff276839a4b429790191f8abb21c1/system@0005630ace895afa-db4ba70439580a20.journal~ (8.0M).
Deleted archived journal /var/log/journal/ac9ff276839a4b429790191f8abb21c1/user-2000@b54e732b7ea1430c95020d6a6553dccb-0000000000004f97-00056526bc67d197.journal (8.0M).
Deleted archived journal /var/log/journal/ac9ff276839a4b429790191f8abb21c1/system@000567364f741221-fef3cfcfe59c68bc.journal~ (8.0M).
Deleted archived journal /var/log/journal/ac9ff276839a4b429790191f8abb21c1/system@00056779411bc792-2224320b49ef5929.journal~ (8.0M).
Deleted archived journal /var/log/journal/ac9ff276839a4b429790191f8abb21c1/system@7e3dc17225834c50ab9cbec8c0551dc4-0000000000000001-000567794116176d.journal (8.0M).
Deleted archived journal /var/log/journal/ac9ff276839a4b429790191f8abb21c1/user-2000@b54e732b7ea1430c95020d6a6553dccb-0000000000006c42-000567adbbc43bff.journal (8.0M).
Deleted archived journal /var/log/journal/ac9ff276839a4b429790191f8abb21c1/system@000567ef1af7e427-3c61c0089c605c91.journal~ (8.0M).
Deleted archived journal /var/log/journal/ac9ff276839a4b429790191f8abb21c1/system@ae06f5eac535470a823d126d23143e57-0000000000000001-000569a28b47d93a.journal (8.0M).
Vacuuming done, freed 136.0M of archived journals from /var/log/journal/ac9ff276839a4b429790191f8abb21c1.
[root@server ~]# du -sh /var/log/journal
32M /var/log/journal

As you can see, while this did reduce the logs significantly (from 168M down to 32M), it couldn’t get them all the way down to the 10M I requested.
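
Incidentally, if pruning by age makes more sense for you than pruning by size, reasonably recent versions of journalctl also accept a time-based variant (check your systemd version if it doesn’t recognize the option), for example:

journalctl --vacuum-time=2weeks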

It’s also really important to remember that cleaning up log files with journalctl is not a permanent solution. Once you clean them up they’ll just start growing again.

The Permanent Fix

The way to permanently fix the problem is to update the journal daemon configuration to specify a maximum retention size. The configuration file to edit is /etc/systemd/journald.conf. On a Fedora system the default configuration file looks something like this:

# This file is part of systemd.
#
# systemd is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or
# (at your option) any later version.
#
# Entries in this file show the compile time defaults.
# You can change settings by editing this file.
# Defaults can be restored by simply deleting this file.
#
# See journald.conf(5) for details.

[Journal]
#Storage=auto
#Compress=yes
#Seal=yes
#SplitMode=uid
#SyncIntervalSec=5m
#RateLimitIntervalSec=30s
#RateLimitBurst=1000
#SystemMaxUse=
#SystemKeepFree=
#SystemMaxFileSize=
#SystemMaxFiles=100
#RuntimeMaxUse=
#RuntimeKeepFree=
#RuntimeMaxFileSize=
#RuntimeMaxFiles=100
#MaxRetentionSec=
#MaxFileSec=1month
#ForwardToSyslog=no
#ForwardToKMsg=no
#ForwardToConsole=no
#ForwardToWall=yes
#TTYPath=/dev/console
#MaxLevelStore=debug
#MaxLevelSyslog=debug
#MaxLevelKMsg=notice
#MaxLevelConsole=info
#MaxLevelWall=emerg

The key line is “#SystemMaxUse=”. To specify the maximum amount of space you want the journal daemon log files to use, uncomment that line by removing the hash mark (‘#’) at the start of the line and specify the amount of space after the equals (‘=’) at the end of the line. For example:

SystemMaxUse=10M

You can use standard unit designators like M for megabytes or G for gigabytes.

Once you’ve updated this configuration file, it will take effect the next time the journal daemon restarts (typically upon system reboot). To make it take effect immediately, simply tell systemd to restart the journal daemon using the following command:

systemctl restart systemd-journald

Note that if you’ve specified a very small size, like the example above, this still might not shrink the logs down to the size specified. For example:

[root@server ~]# systemctl restart systemd-journald
[root@server ~]# du -sh /var/log/journal
32M /var/log/journal

As you can see, we still haven’t reduced the log files down below the maximum size we specified. To do so, you have to stop the journal daemon, completely remove the existing log files, and then restart the journal daemon:

[root@server ~]# systemctl stop systemd-journald
Warning: Stopping systemd-journald.service, but it can still be activated by:
systemd-journald.socket
systemd-journald-audit.socket
systemd-journald-dev-log.socket
[root@server ~]# rm -rf /var/log/journal/*
[root@server ~]# systemctl start systemd-journald
[root@server ~]# du -sh /var/log/journal
1.3M /var/log/journal

Ta-da! Utilization is now down to a minimal amount, and as the log grows, the journal daemon should keep it below the maximum size you’ve specified.
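
If you want to keep an eye on things going forward, journalctl can also report the journal’s current disk usage directly:

journalctl --disk-usage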

Pigs In Space

Today I noticed that my root filesystem has a little less free space than I would really like it to have, so I decided to do a bit of cleanup…

Finding The Space Hogs

I’m a bit old-fashioned, so I still tend to do this sort of thing from the command line rather than using fancy GUI tools. Over the years, this has served me well, because I can get things done even if I only have terminal access to a system (e.g., via ssh or a console) and only have access to simple standard commands on a system.

One quick trick is that you can easily find the top ten space hogs within any given directory using the following command:

du -k -s -x * | sort -n -r | head

Let me break down what this does.

The ‘du’ command provides you with information on disk usage. Its default behavior is to give you a recursive listing of how much space is used beneath all directories from the current directory on down.

The ‘-k’ option tells du to report all utilization numbers in kilobytes. This will be important in a moment when we sort the list: forcing a single, consistent unit keeps the numbers easy to compare, since it’s much easier for a generic sort program to sort 1200 and 326 than it is to sort human-readable values like 1.2G and 326M.

The ‘-s’ option tells du to only report a sum for each file or directory specified, rather than also recursively reporting on all of the subdirectories underneath.

The ‘-x’ option tells du to stay within the same filesystem. This is important if you’re exploring a filesystem that might have other filesystems mounted beneath it, as it tells du not to include information from those other filesystems. For instance, if you’re trying to clean up /var and /var/lib/lxc is a different filesystem mounted from its own separate storage, you don’t want to include the stuff under /var/lib/lxc in your numbers.

Finally, we specify an asterisk wildcard (‘*’) to tell du to spit out stats for each file and directory within the current directory. (Note that if you have hidden files or directories, meaning names that begin with a ‘.’ in the Linux/Unix world, most shells will leave those out when expanding the asterisk.)

Next, we pipe the output of the du command to the ‘sort’ command, which does pretty much exactly what it sounds like it should do.

The ‘-n’ option tells sort to do a numeric sort rather than a string sort. By default sort will use a string sort, which would yield results like “1, 10, 11, 2, 3” instead of “1, 2, 3, 10, 11”.

The ‘-r’ option tells sort that we want it to output results in reverse order (i.e., last to first, or biggest to smallest).

Finally, we pipe the sorted output to the ‘head’ command. The head command will spit out the “head,” or the first few lines of a file. By default, head will spit out the first ten lines of a file.

The net result is that this command gives the top ten space hogs in the current directory.
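
As a small variation: if you know you’ll be on a system with GNU coreutils, you can keep the sizes human-readable and still sort them correctly by pairing du’s ‘-h’ option with sort’s ‘-h’ option, which understands suffixes like K, M, and G:

du -s -x -h * | sort -h -r | head

The ‘-k’ version above has the advantage of working on pretty much any system, while the GNU variant is a little easier on the eyes.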

On my system I know that it’s almost invariably the /var filesystem that sucks up space on my root filesystem, so I started there:

[root@server ~]# cd /var
[root@server var]# du -k -s -x * | sort -n -r | head
3193804 lib
2859856 spool
1386684 log
195932 cache
108 www
88 tmp
12 kerberos
12 db
8 empty
4 yp

This tells me that I need to check out /var/lib, /var/spool, and /var/log. The /var/cache directory might yield a little space if cleaned up, but probably not a lot, and everything else is pretty much beneath my notice.

From here, I basically just change directory to each of the top hitters and repeat the process, cleaning up unnecessary files as I go along.
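
In other words, something like this, repeated for whichever directories bubble to the top (the output will obviously vary from system to system, so I’ve omitted it here):

cd /var/lib
du -k -s -x * | sort -n -r | head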