copyenv: Copy the environmental variables of any running process

In a previous article I talked about copysshenv, which extracts the environmental variables related to ssh-agent out of a running process. As explained in that article, this is necessary if your system starts ssh-agent when you log in to a desktop, but you later want to access the same running agent when you log in remotely. ssh-agent uses environmental variables to let ssh clients know how to talk to it. The normal way for an ssh client to get those variables is to be a child (or grandchild, or great-grandchild, etc.) of the ssh-agent process, inheriting the variables that way. If you can't arrange that, you're out of luck, unless you run Linux and use copysshenv, in which case you can find the ID of a process that has the environment variables and use copysshenv to extract them from that process.

That version of copysshenv has two flaws. First, you have to know the ID number of the process (usually by using the ps command). Second, it only works for ssh-related environmental variables.

So I wrote copyenv. Handling the second flaw is actually trivial, and is achieved by no longer limiting the output to the ssh variables. In fact, I've since re-implemented copysshenv by replacing it with the following one-line script: copyenv | grep SSH.

The old copysshenv required the process ID as the command-line argument, but the new copyenv accepts either the process ID or the process name. It works a lot like the Linux version of the killall command. (In fact, copyenv can fall back on pidof, which is part of the same package as killall and probably shares source code.) As such, it has the same main drawback as pidof and killall: it can't tell which process you want if there are two or more instances of the same process running. In this case, copyenv will print the environmental variables of all of them, and for each variable printed, it will also print the ID of the process it came from.

Here's a truncated example of running copyenv on all the running bash processes. The output is sorted so you can compare the same variable across each running copy of bash.

jdimpson@artoo:~$ copyenv bash | sort
export COLORTERM=gnome-terminal # pid 7669
export COLORTERM=gnome-terminal # pid 7672
export COLORTERM=gnome-terminal # pid 7681
export COLORTERM=gnome-terminal # pid 7689
export COLORTERM=gnome-terminal # pid 9732
export DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-OQ3vPHkSk8,guid=ac373a4c08f01e435d5427a24aa28ce3 # pid 7669
export DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-OQ3vPHkSk8,guid=ac373a4c08f01e435d5427a24aa28ce3 # pid 7672
export DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-OQ3vPHkSk8,guid=ac373a4c08f01e435d5427a24aa28ce3 # pid 7681
export DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-OQ3vPHkSk8,guid=ac373a4c08f01e435d5427a24aa28ce3 # pid 7689
export DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-OQ3vPHkSk8,guid=ac373a4c08f01e435d5427a24aa28ce3 # pid 9732
export DISPLAY=:1.0 # pid 7669
export DISPLAY=:1.0 # pid 7672
export DISPLAY=:1.0 # pid 7681
export DISPLAY=:1.0 # pid 7689
export DISPLAY=:1.0 # pid 9732
export GNOME_DESKTOP_SESSION_ID=Default # pid 7669
export GNOME_DESKTOP_SESSION_ID=Default # pid 7672
export GNOME_DESKTOP_SESSION_ID=Default # pid 7681
export GNOME_DESKTOP_SESSION_ID=Default # pid 7689
export GNOME_DESKTOP_SESSION_ID=Default # pid 9732
export GNOME_KEYRING_SOCKET=/tmp/keyring-sZEbo7/socket # pid 7669
export GNOME_KEYRING_SOCKET=/tmp/keyring-sZEbo7/socket # pid 7672
export GNOME_KEYRING_SOCKET=/tmp/keyring-sZEbo7/socket # pid 7681
export GNOME_KEYRING_SOCKET=/tmp/keyring-sZEbo7/socket # pid 7689
export GNOME_KEYRING_SOCKET=/tmp/keyring-sZEbo7/socket # pid 9732
export GTK_RC_FILES=/etc/gtk/gtkrc:/home/jdimpson/.gtkrc-1.2-gnome2 # pid 7669
export GTK_RC_FILES=/etc/gtk/gtkrc:/home/jdimpson/.gtkrc-1.2-gnome2 # pid 7672
export GTK_RC_FILES=/etc/gtk/gtkrc:/home/jdimpson/.gtkrc-1.2-gnome2 # pid 7681
export GTK_RC_FILES=/etc/gtk/gtkrc:/home/jdimpson/.gtkrc-1.2-gnome2 # pid 7689
export GTK_RC_FILES=/etc/gtk/gtkrc:/home/jdimpson/.gtkrc-1.2-gnome2 # pid 9732
export HISTCONTROL=ignoreboth # pid 9732
export HOME=/home/jdimpson # pid 4580
export HOME=/home/jdimpson # pid 7669
export HOME=/home/jdimpson # pid 7672
export HOME=/home/jdimpson # pid 7681
export HOME=/home/jdimpson # pid 7689
export HOME=/home/jdimpson # pid 9732
export LANG=en_US.UTF-8 # pid 4580
export LANG=en_US.UTF-8 # pid 7669
export LANG=en_US.UTF-8 # pid 7672
export LANG=en_US.UTF-8 # pid 7681
export LANG=en_US.UTF-8 # pid 7689
export LANG=en_US.UTF-8 # pid 9732
export "LESSCLOSE=/usr/bin/lesspipe %s %s" # pid 9732
export "LESSOPEN=| /usr/bin/lesspipe %s" # pid 9732
export LOGNAME=jdimpson # pid 4580
export LOGNAME=jdimpson # pid 7669
export LOGNAME=jdimpson # pid 7672
export LOGNAME=jdimpson # pid 7681
export LOGNAME=jdimpson # pid 7689
export LOGNAME=jdimpson # pid 9732
export LS_COLORS=no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:su=37;41:sg=30;43:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.svgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36: # pid 9732
export MAIL=/var/mail/jdimpson # pid 4580
export MAIL=/var/mail/jdimpson # pid 7669
export MAIL=/var/mail/jdimpson # pid 7672
export MAIL=/var/mail/jdimpson # pid 7681
export MAIL=/var/mail/jdimpson # pid 7689
export MAIL=/var/mail/jdimpson # pid 9732
export OLDPWD=/ # pid 7669
export OLDPWD=/ # pid 7672
export OLDPWD=/ # pid 7681
export OLDPWD=/ # pid 7689
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:$HOME/bin:/home/jdimpson/bin # pid 9732
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:$HOME/bin # pid 4580
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:$HOME/bin # pid 7669
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:$HOME/bin # pid 7672
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:$HOME/bin # pid 7681
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:$HOME/bin # pid 7689
export previous=N # pid 7669
export previous=N # pid 7672
export previous=N # pid 7681
export previous=N # pid 7689
export previous=N # pid 9732
export PREVLEVEL=N # pid 7669
export PREVLEVEL=N # pid 7672
export PREVLEVEL=N # pid 7681
export PREVLEVEL=N # pid 7689
export PREVLEVEL=N # pid 9732
export PWD=/home/jdimpson # pid 7669
export PWD=/home/jdimpson # pid 7672
export PWD=/home/jdimpson # pid 7681
export PWD=/home/jdimpson # pid 7689
export PWD=/home/jdimpson # pid 9732
export QEMU_ALSA_DAC_BUFFER_SIZE=4096 # pid 4580
export QEMU_ALSA_DAC_BUFFER_SIZE=4096 # pid 7669
export QEMU_ALSA_DAC_BUFFER_SIZE=4096 # pid 7672
export QEMU_ALSA_DAC_BUFFER_SIZE=4096 # pid 7681
export QEMU_ALSA_DAC_BUFFER_SIZE=4096 # pid 7689
export QEMU_ALSA_DAC_BUFFER_SIZE=4096 # pid 9732
export QEMU_AUDIO_ADC_FIXED_FREQ=48000 # pid 4580
export QEMU_AUDIO_ADC_FIXED_FREQ=48000 # pid 7669
export QEMU_AUDIO_ADC_FIXED_FREQ=48000 # pid 7672
export QEMU_AUDIO_ADC_FIXED_FREQ=48000 # pid 7681
export QEMU_AUDIO_ADC_FIXED_FREQ=48000 # pid 7689
export QEMU_AUDIO_ADC_FIXED_FREQ=48000 # pid 9732
export QEMU_AUDIO_DAC_FIXED_FREQ=48000 # pid 4580
export QEMU_AUDIO_DAC_FIXED_FREQ=48000 # pid 7669
export QEMU_AUDIO_DAC_FIXED_FREQ=48000 # pid 7672
export QEMU_AUDIO_DAC_FIXED_FREQ=48000 # pid 7681
export QEMU_AUDIO_DAC_FIXED_FREQ=48000 # pid 7689
export QEMU_AUDIO_DAC_FIXED_FREQ=48000 # pid 9732
export QEMU_AUDIO_DRV=alsa # pid 4580
export QEMU_AUDIO_DRV=alsa # pid 7669
export QEMU_AUDIO_DRV=alsa # pid 7672
export QEMU_AUDIO_DRV=alsa # pid 7681
export QEMU_AUDIO_DRV=alsa # pid 7689
export QEMU_AUDIO_DRV=alsa # pid 9732
export QUIET=no # pid 7669
export QUIET=no # pid 7672
export QUIET=no # pid 7681
export QUIET=no # pid 7689
export QUIET=no # pid 9732
export runlevel=2 # pid 7669
export RUNLEVEL=2 # pid 7669
export runlevel=2 # pid 7672
export RUNLEVEL=2 # pid 7672
export runlevel=2 # pid 7681
export RUNLEVEL=2 # pid 7681
export runlevel=2 # pid 7689
export RUNLEVEL=2 # pid 7689
export runlevel=2 # pid 9732
export RUNLEVEL=2 # pid 9732
export SESSION_MANAGER=local/artoo:/tmp/.ICE-unix/6910 # pid 7669
export SESSION_MANAGER=local/artoo:/tmp/.ICE-unix/6910 # pid 7672
export SESSION_MANAGER=local/artoo:/tmp/.ICE-unix/6910 # pid 7681
export SESSION_MANAGER=local/artoo:/tmp/.ICE-unix/6910 # pid 7689
export SESSION_MANAGER=local/artoo:/tmp/.ICE-unix/6910 # pid 9732
export SHELL=/bin/bash # pid 4580
export SHELL=/bin/bash # pid 7669
export SHELL=/bin/bash # pid 7672
export SHELL=/bin/bash # pid 7681
export SHELL=/bin/bash # pid 7689
export SHELL=/bin/bash # pid 9732
export SHLVL=2 # pid 7669
export SHLVL=2 # pid 7672
export SHLVL=2 # pid 7681
export SHLVL=2 # pid 7689
export SHLVL=3 # pid 9732
export SSH_AGENT_PID=6937 # pid 7669
export SSH_AGENT_PID=6937 # pid 7672
export SSH_AGENT_PID=6937 # pid 7681
export SSH_AGENT_PID=6937 # pid 7689
export SSH_AGENT_PID=6937 # pid 9732
export SSH_AUTH_SOCK=/tmp/keyring-sZEbo7/ssh # pid 7669
export SSH_AUTH_SOCK=/tmp/keyring-sZEbo7/ssh # pid 7672
export SSH_AUTH_SOCK=/tmp/keyring-sZEbo7/ssh # pid 7681
export SSH_AUTH_SOCK=/tmp/keyring-sZEbo7/ssh # pid 7689
export SSH_AUTH_SOCK=/tmp/keyring-sZEbo7/ssh # pid 9732
export SSH_AUTH_SOCK=/tmp/ssh-HgWvto4579/agent.4579 # pid 4580
export "SSH_CLIENT= 1493 22" # pid 4580
export "SSH_CONNECTION= 1493 22" # pid 4580
export SSH_TTY=/dev/pts/4 # pid 4580
export STY=9731.pine # pid 9732
export "TERMCAP=SC|screen|VT 100/ANSI X3.64 virtual terminal:n :DO=E[%dB:LE=E[%dD:RI=E[%dC:UP=E[%dA:bs:bt=E[Z:n :cd=E[J:ce=E[K:cl=E[HE[J:cm=E[%i%d;%dH:ct=E[3g:n :do=^J:nd=E[C:pt:rc=E8:rs=Ec:sc=E7:st=EH:up=EM:n :le=^H:bl=^G:cr=^M:it#8:ho=E[H:nw=EE:ta=^I:is=E)0:n :li#24:co#80:am:xn:xv:LP:sr=EM:al=E[L:AL=E[%dL:n :cs=E[%i%d;%dr:dl=E[M:DL=E[%dM:dc=E[P:DC=E[%dP:n :im=E[4h:ei=E[4l:mi:IC=E[%d@:ks=E[?1hE=:n :ke=E[?1lE>:vi=E[?25l:ve=E[34hE[?25h:vs=E[34l:n :ti=E[?1049h:te=E[?1049l:us=E[4m:ue=E[24m:so=E[3m:n :se=E[23m:mb=E[5m:md=E[1m:mr=E[7m:me=E[m:ms:n :Co#8:pa#64:AF=E[3%dm:AB=E[4%dm:op=E[39;49m:AX:n :vb=Eg:G0:as=E(0:ae=E(B:n :ac=140140aaffggjjkkllmmnnooppqqrrssttuuvvwwxxyyzz{{||}}~~..--++,,hhII00:n :po=E[5i:pf=E[4i:k0=E[10~:k1=EOP:k2=EOQ:k3=EOR:n :k4=EOS:k5=E[15~:k6=E[17~:k7=E[18~:k8=E[19~:n :k9=E[20~:k;=E[21~:F1=E[23~:F2=E[24~:F3=EO2P:n :F4=EO2Q:F5=EO2R:F6=EO2S:F7=E[15;2~:F8=E[17;2~:n :F9=E[18;2~:FA=E[19;2~:kb=:K2=EOE:kB=E[Z:n :*4=E[3;2~:*7=E[1;2F:#2=E[1;2H:#3=E[2;2~:#4=E[1;2D:n :%c=E[6;2~:%e=E[5;2~:%i=E[1;2C:kh=E[1~:@1=E[1~:n :kH=E[4~:@7=E[4~:kN=E[6~:kP=E[5~:kI=E[2~:kD=E[3~:n :ku=EOA:kd=EOB:kr=EOC:kl=EOD:km:" # pid 9732
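
If the point of all that sorting is to see where the processes disagree, a small filter can do the comparing for you. This is only a sketch and not part of copyenv: pids_by_value is a made-up helper name, it assumes the "export NAME=value # pid N" line format shown above, and it skips the quoted lines (the ones whose values contain spaces).

```shell
# pids_by_value NAME: read copyenv-style lines on stdin and print each
# distinct value of NAME followed by the PIDs that hold that value.
# (hypothetical helper; assumes unquoted "export NAME=value # pid N" lines)
pids_by_value() {
    awk -v name="$1" '
        $1 == "export" && index($2, name "=") == 1 {
            pid = $NF
            value = $0
            sub(/ # pid [0-9]+$/, "", value)   # drop the trailing PID comment
            sub(/^export [^=]*=/, "", value)   # drop the "export NAME=" prefix
            pids[value] = pids[value] " " pid
        }
        END { for (v in pids) print v ":" pids[v] }
    '
}
```

Run as "copyenv bash | pids_by_value SSH_AUTH_SOCK" on the processes above, it should print one line for /tmp/keyring-sZEbo7/ssh listing five PIDs and one line for /tmp/ssh-HgWvto4579/agent.4579 listing PID 4580, in no particular order.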

Like copysshenv, the output from copyenv is suitable to copy to a file that can then be "sourced". A practical application would be to duplicate the DBUS settings from another bash process:

jdimpson@artoo:~$ env | grep DBUS
jdimpson@artoo:~$ ps auxwww | grep bash
jdimpson 520 0.0 0.1 5968 3392 pts/0 Ss Nov09 0:00 bash
jdimpson 528 0.0 0.1 5968 3444 pts/1 Ss Nov09 0:00 bash
jdimpson 533 0.0 0.1 5968 3476 pts/2 Ss Nov09 0:00 bash
jdimpson 540 0.0 0.1 5968 3428 pts/3 Ss Nov09 0:00 bash
jdimpson@artoo:~$ copyenv 533 | grep DBUS > tmp
jdimpson@artoo:~$ cat tmp
export DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-9GBQaur4oj,guid=1ce705587b2d8d0bee00c8094cd9540c
jdimpson@artoo:~$ . tmp
jdimpson@artoo:~$ env | grep DBUS
DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-9GBQaur4oj,guid=1ce705587b2d8d0bee00c8094cd9540c
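
The temporary file isn't strictly needed. Here is a sketch of the same idea using eval. The fake_copyenv function and its bus address are stand-ins I made up so the example runs on its own; in real use the command substitution would hold "copyenv 533 | grep DBUS" instead.

```shell
# Stand-in for "copyenv 533 | grep DBUS" (hypothetical output).
fake_copyenv() {
    echo 'export DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-example'
}

# Evaluate the export lines directly in the current shell, no temp file.
eval "$(fake_copyenv)"
echo "$DBUS_SESSION_BUS_ADDRESS"
```

Keep in mind that eval runs whatever text it is handed, so this is only as trustworthy as the process the variables came from.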

Here's what copyenv looks like:


#!/bin/bash

# print the environmental variables of a process, given the process name or the process ID.

# copyright 2009 Jeremy D. Impson
# note: TERMCAP env var embeds newlines by putting a \, then a real newline.  this script removes the leading \ and newline, replacing it with literal \n. but it does nothing to unescaped newlines. They will break this script, or at least, generate incorrect output.

# with no argument, fall back on our own process ID
PROC=$1;
if [ -z "$PROC" ]; then
	PROC=$$;
fi

# if $PROC isn't a process ID number, assume it's the name of the process, 
# and use pgrep or pidof to translate to process ID number.
if [ ! -e "/proc/$PROC/environ" ]; then
	if which pgrep > /dev/null; then
		#PROC=`pgrep -x -u $USER $PROC`; # less elegant but more robust to let the file permissions on the /proc/<pid>/environ file handle access control
		PROC=`pgrep -x "$PROC"`;
	else if which pidof > /dev/null; then
		PROC=`pidof "$PROC"`;
	fi;  fi
fi

# if we had to translate the process name to ID number, then we may have found multiple processes 
# with the same name. count them.
CNT=0;
for pid in $PROC; do
	CNT=`expr $CNT + 1`;
done

# *sigh* At this point I should just rewrite in perl.  Too many multiline hacks.
PRINTPID=;
if [ $CNT -gt 1 ]; then
	PRINTPID=1;
fi

# look up the environmental variables for each process, and print them out.
RET=0;
for pid in $PROC; do
	if [ ! -e "/proc/$pid/environ" ]; then
		echo "Can't find process environment for $1 (/proc/$pid/environ)" >&2 ;
		RET=1;
	else
		if [ ! -r "/proc/$pid/environ" ]; then
			echo "Can't read environment for $pid (/proc/$pid/environ)" >&2 ;
			RET=1;
		else
			#sed -e ':a' -e '$!N;s/\\\n/\\n/;ta' -e 's/\x00/\n/g' < /proc/$pid/environ | sed -e 's/^/export "/' -e 's/$/"/' -e "s/$/ # pid $pid/";
			sed -e ':a' -e '$!N;s/\\\n/\\n/;ta' -e 's/\x00/\n/g' < /proc/$pid/environ | while read VAR; do
				if echo "$VAR" | egrep -q '[ 	]'; then
					echo -n export \"$VAR\"  # VAR has spaces, so print quotes
				else
					echo -n export $VAR 
				fi
				if [ -z "$PRINTPID" ]; then
					echo;
				else
					echo " # pid $pid"; # if there are more than one process, print a comment to help keep track
				fi
			done
		fi
	fi
done
exit $RET;

copyenv first tests to see if the user has supplied any argument. If not, copyenv will use its own process ID number. This probably isn't useful except for demonstration purposes. Then, it checks the user's argument to see if it is a valid process ID number by looking up an entry in the /proc filesystem. If the number is a valid ID, and the user has permission to look at the process' environment, this check will succeed. If it fails, copyenv assumes the user argument is a process name rather than an ID number, and uses either pgrep or pidof to convert the name into one or more ID numbers.

copyenv then looks up the environmental variables for each ID number in turn, reformatting each into a human- and shell-readable format, and printing the result. If there was more than one ID number, copyenv also prints the process ID that each variable value is associated with, in case that is important to you. (Makes it good for later grepping, for example.)

The value of any environmental variable may contain a space, so copyenv tries to quote such values so that the output is suitable to be sourced by a shell. It only adds the quotes if a space (or tab) is present. It should just quote everything, but that would require detecting and escaping any embedded quotes in the variable values. I really should do that, because right now, if a variable value contains both a space and embedded quotes, the output of copyenv will be invalid shell syntax.
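
One common way to quote everything is to wrap each value in single quotes and rewrite every embedded single quote as '\''. The following is a minimal sketch of that idea, not what copyenv currently does; quote_value is a name I made up for illustration.

```shell
# quote_value VALUE: print VALUE wrapped in single quotes, with each
# embedded single quote rewritten as '\'' so a shell can read it back.
# (hypothetical helper, not part of copyenv)
quote_value() {
    printf "'%s'\n" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

quote_value 'two words and "double quotes"'
```

Inside single quotes the shell treats spaces, double quotes, and dollar signs literally, so the single quote itself is the only character that needs special handling.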

Some common environmental variables are complex, such as the TERMCAP variable, which contains embedded newlines. copyenv does not handle this correctly. I can't think of an easy way to correct this, and I've never needed to copy the TERMCAP variable, but it would be nice to handle this correctly, for the sake of completeness.
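
One possible approach, offered as a sketch rather than a tested fix for copyenv: since the entries in an environ file are separated by NUL bytes, real newlines can be swapped for a placeholder byte before the NULs are turned into line breaks, then swapped back after each line has been single-quoted. The function name environ_to_exports is made up, and the sketch assumes no value contains the placeholder byte (octal 001) and that every entry has the NAME=value form.

```shell
# environ_to_exports FILE: turn a NUL-delimited environ-style file into
# "export NAME='value'" lines, keeping newlines embedded in values.
# (hypothetical helper; assumes no value contains the byte \001)
environ_to_exports() {
    tr '\n\000' '\001\n' < "$1" |
        sed -e "s/'/'\\\\''/g" \
            -e "s/^\([^=]*\)=\(.*\)$/export \1='\2'/" |
        tr '\001' '\n'
}

# Example: dump this shell's own environment (Linux only).
if [ -r "/proc/$$/environ" ]; then
    environ_to_exports "/proc/$$/environ" | head -n 3
fi
```

The first tr hides real newlines and makes each variable one line, the sed quotes each line, and the last tr restores the newlines, which are legal inside single quotes.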

Snippet for handling GUI-invoked shell scripts

I write a lot of shell scripts that are meant to get invoked via GNOME's "open with" function. As such, the script needs to handle things I'd normally rely on the commandline shell invocation to handle. For example, say "html_pretty_print" is a tool to rewrite an HTML file, removing extraneous white space and doing proper hierarchical indenting so it's readable. I want to run it on an HTML file, so that I save the original but put the results back into a file with the same name as the original.

Typically, if I were typing at the command line, I'd do something like this:

$ html_pretty_print data.html > data-pretty.html; mv data.html data.html-bak; mv data-pretty.html data.html

Now the original has been backed up and the reformatted version has the original file name. I consider html_pretty_print a well-behaved shell utility because it makes no assumptions about where the user wants to put the output. (In fact, the actual html_pretty_print can also read its input from standard input rather than reading it from the file handed to it on the command line, making it ideal as a pipe-based filter, but that fact isn't relevant to the matter at hand.)

Anyway, the above command line is how I'd do it manually because it captures the state of the process better (in case I walk away in the middle of the process for some reason). If I were writing a script to totally automate the process, I'd probably do it this way:

$ mv data.html data.html-bak; html_pretty_print data.html-bak > data.html;

Fewer moves and no need for a temp file.

But if I'm using GNOME or another window manager's file explorer, I want to right-click on an HTML file and select Open With then html_pretty_print, and have all that work done for me.

When you do an "Open With", your script gets passed the full path name of the file you clicked on as its first argument. However, the current working directory could be anywhere (typically your home directory), so when your script writes out a file, like the temp file or the backup file, it had better be very specific about where to put it. That is, you need to make sure all files are referenced with absolute path names. Usually, the right place is the same directory as the original file.

Here's a code snippet for devising the name of the backup file such that it gets put in the same directory as the original. I could modify html_pretty_print itself, but I generally prefer to do it in a wrapper. That way I can continue to use html_pretty_print as a well-behaved command line utility, and the wrapper approach works even when the command you're using can't be modified.


i="$1"; # $1 will include full path when invoked by GNOME
test -z "$i" && exit 1;
j=`basename "$i"`-bak;
jdir=`dirname "$i"`;
j="$jdir/$j"; # keep the backup in the same directory as the original

mv "$i" "$j"

html_pretty_print "$j" > "$i";

Since $j and $i are fully qualified file names, this wrapper can be called from any working directory. The variable jdir can (presumably) also be used to store any temp files you might want. I'd name this wrapper something like "openwith-html_pretty_print".

Obviously, you can build in a lot more logic if you want a single script to handle both cases. I haven't figured out a reliable way to tell if you've been invoked by Open With yet. I bet you could figure it out if you spend some quality time with some test cases and the "env" command.
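
One heuristic that might serve as a starting point, offered as a guess rather than something I've verified against GNOME: a script started from a command line usually has a terminal on standard input, while one launched from a file manager usually does not. The function name below is made up, and the check will also say "no terminal" for cron jobs and pipelines, so it can't tell those apart from Open With.

```shell
# Succeeds if standard input is attached to a terminal.
# (hypothetical helper; a heuristic, not a reliable Open With detector)
launched_from_terminal() {
    [ -t 0 ]
}

if launched_from_terminal; then
    echo "looks like a command-line invocation"
else
    echo "no terminal on stdin; possibly launched from a GUI"
fi
```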

The snippet uses the "basename" and "dirname" command line utilities common to all Linux and most Unix distributions. Note that "basename" can also strip off file endings, so if you wanted to you could do a variant where instead of turning "data.html" into "data.html-bak", you change it to "data.bak" by replacing the third line with:

j=`basename "$i" .html`.bak

A real example of wanting to do that is if you're, say, converting an AVI file to FLV.
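
To make that concrete, here is a small sketch that combines dirname with the suffix-stripping form of basename to build the output name for such a conversion. The function name target_path and the example path are made up for illustration.

```shell
# target_path FILE OLDSUFFIX NEWSUFFIX: print FILE's full path with
# OLDSUFFIX replaced by NEWSUFFIX, staying in FILE's own directory.
# (hypothetical helper)
target_path() {
    dir=`dirname "$1"`
    base=`basename "$1" "$2"`
    printf '%s/%s%s\n' "$dir" "$base" "$3"
}

target_path /home/user/videos/clip.avi .avi .flv
# prints: /home/user/videos/clip.flv
```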

script-declutter: clean up detritus that shows up when you run script

You can download script-declutter from

Hey, if you aren't a regular Unix command-line user, and don't know what the "script" command is for, you probably won't like this article.

I'll finish up my socat series later. For now, here's a short and clever little script. I'm sure someone has done this before, but it was fun implementing myself.

There's a traditional Unix command called script. According to its manpage,

Script makes a typescript of everything printed on your terminal. It is useful for students who need a hardcopy record of an interactive session as proof of an assignment, as the typescript file can be printed out later with lpr(1).

If the argument file is given, script saves all dialogue in file. If no file name is given, the typescript is saved in the file typescript.

When script was first introduced (some time in 1980), it was born in an environment where the GUI was not ubiquitous, where several users would log in to the same computer concurrently, and where the majority of the software the user ran would be run from the shell command line. The script command, which can be used to save both the input from the user, and the output from the commands, was very useful to create a transcript, sometimes called a typescript, of a user's interactive session.

It's important to realize that script records the characters verbatim. For example, say a user meant to type "ls" but accidentally typed "la". He realizes his mistake, hits the backspace key to erase the "a", then types "s". In this example, the script command would record "la^Hs" in the typescript file. The "^H" is not meant to be a caret ("^") followed by a capital H, but instead is one way to represent the single, otherwise non-printable Control Character (in the ASCII character set) that means "back up over the previous character".
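
You can see the difference between the stored bytes and what a terminal shows without running script at all. This small demonstration uses printf to emit the same four characters and cat -v to display control characters in caret notation.

```shell
# The raw bytes: l, a, backspace, s. A terminal would display this as "ls".
printf 'la\bs\n'

# The same bytes with control characters made visible.
printf 'la\bs\n' | cat -v
# prints: la^Hs
```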

The manpage goes on to say,

Certain interactive commands, such as vi(1), create garbage in the typescript file. Script works best with commands that do not manipulate the screen, the results are meant to emulate a hardcopy terminal.

vi isn't the only thing that can cause escape sequences and control characters to be captured in a typescript file. Users of the bash shell (which includes almost every Linux user) usually have an environment variable called PROMPT_COMMAND set in such a way that it sets the title of an xterm terminal according to the current working directory, username, and hostname. On my Ubuntu 8.04 system,

jdimpson@freedom:~$ echo $PROMPT_COMMAND
echo -ne "\033]0;${USER}@${HOSTNAME}: ${PWD/$HOME/~}\007"

The "\033]0;" and "\007" are Escape Sequences that, when printed to the screen, tell the terminal program not to print something, but to modify the title bar according to what's in between the two escape codes. However, the typescript file created by the script command will record the escape sequences.

The specific details of non-printable characters (both Escape Sequences and Control Characters) are beyond where I want to take this article.

If you use "cat" or "less" or "more" to look at the contents of a script-generated typescript file, you probably won't notice any of these non-printable characters. That's because, on playback, the escape and control commands in the typescript file will get played back just like the regular text. However, if you were to use certain editors (like vim or even Microsoft Wordpad) you'll see the funny characters, although what you see will vary depending on how each editor decides to portray non-printable characters.

If you want to put a typescript file on-line somewhere, or send it as an email attachment, and you can't be sure who or what will be viewing the file, you'll want to strip out the non-printables. On top of that, you'll probably also want to see the final state of the file, after all backspaces and similar control characters have been processed. In other words, you want "la^Hs" to be converted to "ls" not "las". This means resolving all the control character and escape sequence commands. "script-declutter" tries to do this. It ain't perfect, but it's very effective.

Here's a very simple sample using the script command.

jdimpson@artoo:~$ script
Script started, file is typescript
jdimpson@artoo:~$ echo hello
hello
jdimpson@artoo:~$ exit
exit
Script done, file is typescript
jdimpson@artoo:~$ cat typescript
Script started on Sun 19 Apr 2009 02:05:27 AM EDT
jdimpson@artoo:~$ echo hello
hello
jdimpson@artoo:~$ exit
exit

Script done on Sun 19 Apr 2009 02:05:35 AM EDT

You can't see any of the control and escape characters, because I used cat to display the typescript file. But if I use vim, it looks like this:

Script started on Sun 19 Apr 2009 02:05:27 AM EDT
^[]0;jdimpson@artoo: ~^Gjdimpson@artoo:~$ he^H^[[K^H^[[Kecho hello^M
hello^M
^[]0;jdimpson@artoo: ~^Gjdimpson@artoo:~$ exit^M
exit^M

Script done on Sun 19 Apr 2009 02:05:35 AM EDT

Note that when you use cat to view the file, all the control and escape characters get interpreted by the terminal display. Here you can see that while I ultimately ran the command "echo hello", I started out by typing "he", then backspaced and retyped the line. Note also the double printing of the command prompt, "jdimpson@artoo". The first is actually the PROMPT_COMMAND mentioned earlier, with the "^[]0;" being a representation of the output of the "\033]0;" part. "\033" is the octal representation of ESCAPE, and "^[" is a somewhat common alternate way to depict ESCAPE. Note also that every line of the session ends with "^M", which is a way of depicting the carriage return character.

vim is one way to look at the actual file. Another tool you can use is "hexdump -C", which gives you this output:

jdimpson@artoo:~$ hexdump -C typescript
00000000 53 63 72 69 70 74 20 73 74 61 72 74 65 64 20 6f |Script started o|
00000010 6e 20 53 75 6e 20 31 39 20 41 70 72 20 32 30 30 |n Sun 19 Apr 200|
00000020 39 20 30 32 3a 30 35 3a 32 37 20 41 4d 20 45 44 |9 02:05:27 AM ED|
00000030 54 0a 1b 5d 30 3b 6a 64 69 6d 70 73 6f 6e 40 61 |T..]0;jdimpson@a|
00000040 72 74 6f 6f 3a 20 7e 07 6a 64 69 6d 70 73 6f 6e |rtoo: ~.jdimpson|
00000050 40 61 72 74 6f 6f 3a 7e 24 20 68 65 08 1b 5b 4b |@artoo:~$ he..[K|
00000060 08 1b 5b 4b 65 63 68 6f 20 68 65 6c 6c 6f 0d 0a |..[Kecho hello..|
00000070 68 65 6c 6c 6f 0d 0a 1b 5d 30 3b 6a 64 69 6d 70 |hello...]0;jdimp|
00000080 73 6f 6e 40 61 72 74 6f 6f 3a 20 7e 07 6a 64 69 |son@artoo: ~.jdi|
00000090 6d 70 73 6f 6e 40 61 72 74 6f 6f 3a 7e 24 20 65 |mpson@artoo:~$ e|
000000a0 78 69 74 0d 0a 65 78 69 74 0d 0a 0a 53 63 72 69 |xit..exit...Scri|
000000b0 70 74 20 64 6f 6e 65 20 6f 6e 20 53 75 6e 20 31 |pt done on Sun 1|
000000c0 39 20 41 70 72 20 32 30 30 39 20 30 32 3a 30 35 |9 Apr 2009 02:05|
000000d0 3a 33 35 20 41 4d 20 45 44 54 0a |:35 AM EDT.|

Again, you can see the control and escape characters, not only in some human-readable format (in the right-hand column), but also in hexadecimal representation (in the middle two columns). This is often less ambiguous, and is how I figured out all the values for the escape & control characters used in script-declutter, which I'll show you in a minute. Hexdump prints the control characters as plain dots ("."). It does the same for the non-printable parts of the escape sequences (usually the escape character itself). Note that it also prints tabs and newlines as dots.
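
The octal, hexadecimal, and caret spellings of a control character can also be cross-checked straight from the shell. The sketch below uses printf to emit a single byte from its octal code, od to show it in hex, and cat -v to show the caret form; control_table is a name I made up for the example.

```shell
# Print the octal, hex, and caret notation for ESC, BEL, CR, and BS.
# (hypothetical helper for illustration)
control_table() {
    for octal in 033 007 015 010; do
        hex=$(printf "\\$octal" | od -An -tx1 | tr -d ' \n')
        caret=$(printf "\\$octal" | cat -v)
        printf 'octal \\%s = hex %s = %s\n' "$octal" "$hex" "$caret"
    done
}

control_table
```

The first line of output should read "octal \033 = hex 1b = ^[", which matches the 1b bytes that hexdump shows in front of each titlebar sequence.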

Here's how you would use script-declutter to clean up the typescript file:

jdimpson@artoo:~$ script-declutter typescript > out

And here's what it looks like in vim.

Script started on Sun 19 Apr 2009 02:05:27 AM EDT
jdimpson@artoo:~$ echo hello
hello
jdimpson@artoo:~$ exit
exit

Script done on Sun 19 Apr 2009 02:05:35 AM EDT

This looks just like the original file when printed to the screen via cat. Hexdump agrees.

jdimpson@artoo:~$ hexdump -C out
00000000 53 63 72 69 70 74 20 73 74 61 72 74 65 64 20 6f |Script started o|
00000010 6e 20 53 75 6e 20 31 39 20 41 70 72 20 32 30 30 |n Sun 19 Apr 200|
00000020 39 20 30 32 3a 30 35 3a 32 37 20 41 4d 20 45 44 |9 02:05:27 AM ED|
00000030 54 0a 6a 64 69 6d 70 73 6f 6e 40 61 72 74 6f 6f |T.jdimpson@artoo|
00000040 3a 7e 24 20 65 63 68 6f 20 68 65 6c 6c 6f 0a 68 |:~$ echo hello.h|
00000050 65 6c 6c 6f 0a 6a 64 69 6d 70 73 6f 6e 40 61 72 |ello.jdimpson@ar|
00000060 74 6f 6f 3a 7e 24 20 65 78 69 74 0a 65 78 69 74 |too:~$ exit.exit|
00000070 0a 0a 53 63 72 69 70 74 20 64 6f 6e 65 20 6f 6e |..Script done on|
00000080 20 53 75 6e 20 31 39 20 41 70 72 20 32 30 30 39 | Sun 19 Apr 2009|
00000090 20 30 32 3a 30 35 3a 33 35 20 41 4d 20 45 44 54 | 02:05:35 AM EDT|
000000a0 0a |.|

You can see the file is shorter, and lots of characters are gone.

script-declutter is a Perl script. Here's what it looks like.

#!/usr/bin/perl -wp

# clean up control characters and other non-text detritus that shows up
# when you run the "script" command.

BEGIN {
# xterm titlebar escape sequence
$xtermesc = "\x1b\x5d\x30\x3b";

# the occurrence of a backspace event (e.g. ctrl-H, ctrl-W, or ctrl-U)
$backspaceevent = "\x1b\\\x5b\x4b"; # note escaping of third character

# ANSI color escape sequence
$ansiesc = qr/\x1b\[[\d;]*?m/;

# technically, this is arrow-right. For some reason, being used against
# very long backspace jobs. I don't fully understand this, as evidenced
# by the fact that it is off by one sometimes.
$bizarrebs = qr/\x1b\[C/;

# used as part of the xterm titlebar mechanism, or when
# a bell sounds, which might happen when you backspace too much.
$bell = "\x07"; # could use \a

$cr = "\x0d"; # could use \r

$backspace = "\x08"; # could use \b
}

# assumed reconstruction of the plain deletions described in the text:
# the titlebar sequence goes first (it ends with a bell), then the rest.
s/$xtermesc.*?$bell//g;
s/[$cr$bell]//g;
s/$backspaceevent//g;
s/$ansiesc//g;

while (s/(.)(?=$backspace)//) { s/$backspace//; } # frickin' sweet
# For every ^H delete the character immediately left of it, then delete the ^H.
# Perl's RE's aren't R, so I wonder if I could do this in one expression.
while (s/(..)(?=$bizarrebs)//) { s/$bizarrebs//; }

# notes
# ^[[7P has been spotted. Based on "[7P" it appears to be a numbered cursor jump, moving 7 characters (not sure if left or right).

The -w flag, passed to Perl in the first line, turns on warnings, which catches a lot of sloppy mistakes. The -p tells Perl to read in each line of input, run the entire script on the line, then print the now-transformed line.

The script then has a BEGIN { ... } section. This section gets run at startup, but doesn't get run for each line of input. The content of the BEGIN section sets up some variables which will get used in regular expressions in the main part of the script. Specifically, it creates patterns to match the xterm titlebar escape sequence, some control characters that show up when certain keyboard events (like when control-H, control-W, or control-U get pressed) occur, ANSI color escape sequences, some strange backspace pattern that I've seen but which I don't understand, and finally the bell, carriage return, and backspace control characters.

I figured out the hex value for these characters using hexdump. You'll see in the comments that some of the control characters, like carriage return, could have been matched using a couple of characters, like "\r". These are C-style or printf-style special format characters. See printf for a complete list. Not every control & escape character has a special format character, so for consistency I didn't use them in the script. They are in the comments as cross-reference.

Most of the control characters and escape sequences are simply deleted. The fact that the bell control character is also part of the titlebar escape sequence isn't a problem as long as they're dealt with in the right order. But both the regular and strange backspace control characters are handled in what I think is a very clever manner. Here's the relevant line:

while (s/(.)(?=$backspace)//) { s/$backspace//; }

This makes egregious use of C-style side effects, and makes no apologies for it. Obviously, it's a while loop. The test portion is actually a substitution. It says "replace the first character in the current input line that is followed by a backspace control character with nothing". The body of the while loop says "replace the first backspace control character with nothing". Because every time the test portion performs its substitution it also returns true, the while loop will continue looping until there are no more backspace characters. BTW, I realize now that this should work:

while (s/(.$backspace)//) { 1; }

I like the looks of my first version better, although I suspect that it's less efficient. The other backspace handler works the same way, except that it deletes two regular characters for every occurrence of the backspace pattern, and the pattern itself is three characters long rather than only one. I've only seen this happen occasionally. As I've mentioned, I don't understand what this strange backspace variation means, and I seem to recall that it sometimes doesn't line up right, so some control characters or should-be-deleted regular characters get left behind.
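For readers who don't speak Perl, here is a minimal sketch of the same two loops in Python. It is only an illustration, not the script itself: the function names are made up, and it assumes the backspace pattern is the single byte 0x08. (In Python's regex syntax "\b" means a word boundary, so the sketch spells the backspace as "\x08".)

```python
import re

BACKSPACE = "\x08"  # assumed to be what the script's $backspace matches


def erase_two_step(line):
    """Like the first Perl loop: delete the erased character, then the backspace."""
    erased_char = re.compile(r".(?=\x08)")
    while True:
        line, count = erased_char.subn("", line, count=1)
        if count == 0:
            return line
        # body of the while loop: drop the first remaining backspace
        line = line.replace(BACKSPACE, "", 1)


def erase_one_step(line):
    """Like the second Perl loop: delete the character and its backspace together."""
    pair = re.compile(r".\x08")
    while True:
        line, count = pair.subn("", line, count=1)
        if count == 0:
            return line


# "exit", four backspaces, then "ls -l", as in a history lookup at the prompt
sample = "exit\x08\x08\x08\x08ls -l"
print(repr(erase_two_step(sample)))
print(repr(erase_one_step(sample)))
```

Both functions reduce the sample to "ls -l", which is what the terminal would have shown.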

The final comment also illustrates that script-declutter isn't perfect (because my understanding of script and control characters isn't perfect).

Another good example of when a script needs decluttering is when you run the GNU ls command in a typescript. Depending on how it's configured to work by default, or what command line options you give it, or what environmental variables are set, ls will color the names of files according to what kind of file they are (regular, directory, block, character, symlink), and according to the file's extension (.zip, .midi, .deb, .txt, etc). (See also the dircolors command for more info.) ls uses ANSI escape sequences to set colors, which the script command will capture, and script-declutter will remove, like so:

jdimpson@artoo:~/foo$ script
Script started, file is typescript
jdimpson@artoo:~/foo$ ls -l
total 496
-rw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:21 bar.zip
-rw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:21 baz.deb
prw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:20 fifo
drwxr-xr-x 2 jdimpson jdimpson 4096 2009-04-19 11:20 foo
-rw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:20 foo.txt
-rw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:21 typescript
-rw-r--r-- 1 jdimpson jdimpson 496183 2009-04-19 11:21 warty-final-ubuntu.png
jdimpson@artoo:~/foo$ exit
Script done, file is typescript
jdimpson@artoo:~/foo$ cat typescript
Script started on Sun 19 Apr 2009 11:21:47 AM EDT
jdimpson@artoo:~/foo$ ls -l
total 496
-rw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:21 bar.zip
-rw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:21 baz.deb
prw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:20 fifo
drwxr-xr-x 2 jdimpson jdimpson 4096 2009-04-19 11:20 foo
-rw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:20 foo.txt
-rw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:21 typescript
-rw-r--r-- 1 jdimpson jdimpson 496183 2009-04-19 11:21 warty-final-ubuntu.png
jdimpson@artoo:~/foo$ exit

Script done on Sun 19 Apr 2009 11:21:51 AM EDT

The colors don't get picked up when I copy & paste, so here's a screenshot of the directory listing so you can see the colors.

But "vim typescript" shows this:

Script started on Sun 19 Apr 2009 11:25:08 AM EDT
^[]0;jdimpson@artoo: ~/foo^Gjdimpson@artoo:~/foo$ exit^H^H^H^H^[[Kexit^H^H^H^Hls -l^M
^[[00mtotal 496^M
-rw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:21 ^[[01;31mbar.zip^[[00m^M
-rw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:21 ^[[01;31mbaz.deb^[[00m^M
prw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:20 ^[[40;33mfifo^[[00m^M
drwxr-xr-x 2 jdimpson jdimpson 4096 2009-04-19 11:20 ^[[01;34mfoo^[[00m^M
-rw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:20 ^[[00mfoo.txt^[[00m^M
-rw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:25 ^[[00mtypescript^[[00m^M
-rw-r--r-- 1 jdimpson jdimpson 496183 2009-04-19 11:21 ^[[01;35mwarty-final-ubuntu.png^[[00m^M
^[[m^[]0;jdimpson@artoo: ~/foo^Gjdimpson@artoo:~/foo$ exit^M

Script done on Sun 19 Apr 2009 11:25:15 AM EDT

The special characters before and after the file names are the ANSI color escape sequences. Note also the second line of the script has the string "exit" followed by a bunch of ^H's. That's from me using bash's command line history--rather than typing "ls -l", I hit up-arrow twice to find ls -l in the history.

Here's the hexdump -C:

jdimpson@artoo:~/foo$ hexdump -C typescript
00000000 53 63 72 69 70 74 20 73 74 61 72 74 65 64 20 6f |Script started o|
00000010 6e 20 53 75 6e 20 31 39 20 41 70 72 20 32 30 30 |n Sun 19 Apr 200|
00000020 39 20 31 31 3a 32 35 3a 30 38 20 41 4d 20 45 44 |9 11:25:08 AM ED|
00000030 54 0a 1b 5d 30 3b 6a 64 69 6d 70 73 6f 6e 40 61 |T..]0;jdimpson@a|
00000040 72 74 6f 6f 3a 20 7e 2f 66 6f 6f 07 6a 64 69 6d |rtoo: ~/foo.jdim|
00000050 70 73 6f 6e 40 61 72 74 6f 6f 3a 7e 2f 66 6f 6f |pson@artoo:~/foo|
00000060 24 20 65 78 69 74 08 08 08 08 1b 5b 4b 65 78 69 |$ exit.....[Kexi|
00000070 74 08 08 08 08 6c 73 20 2d 6c 0d 0a 1b 5b 30 30 |t....ls -l...[00|
00000080 6d 74 6f 74 61 6c 20 34 39 36 0d 0a 2d 72 77 2d |mtotal 496..-rw-|
00000090 72 2d 2d 72 2d 2d 20 31 20 6a 64 69 6d 70 73 6f |r--r-- 1 jdimpso|
000000a0 6e 20 6a 64 69 6d 70 73 6f 6e 20 20 20 20 20 20 |n jdimpson |
000000b0 30 20 32 30 30 39 2d 30 34 2d 31 39 20 31 31 3a |0 2009-04-19 11:|
000000c0 32 31 20 1b 5b 30 31 3b 33 31 6d 62 61 72 2e 7a |21 .[01;31mbar.z|
000000d0 69 70 1b 5b 30 30 6d 0d 0a 2d 72 77 2d 72 2d 2d |ip.[00m..-rw-r--|
000000e0 72 2d 2d 20 31 20 6a 64 69 6d 70 73 6f 6e 20 6a |r-- 1 jdimpson j|
000000f0 64 69 6d 70 73 6f 6e 20 20 20 20 20 20 30 20 32 |dimpson 0 2|
00000100 30 30 39 2d 30 34 2d 31 39 20 31 31 3a 32 31 20 |009-04-19 11:21 |
00000110 1b 5b 30 31 3b 33 31 6d 62 61 7a 2e 64 65 62 1b |.[01;31mbaz.deb.|
00000120 5b 30 30 6d 0d 0a 70 72 77 2d 72 2d 2d 72 2d 2d |[00m..prw-r--r--|
00000130 20 31 20 6a 64 69 6d 70 73 6f 6e 20 6a 64 69 6d | 1 jdimpson jdim|
00000140 70 73 6f 6e 20 20 20 20 20 20 30 20 32 30 30 39 |pson 0 2009|
00000150 2d 30 34 2d 31 39 20 31 31 3a 32 30 20 1b 5b 34 |-04-19 11:20 .[4|
00000160 30 3b 33 33 6d 66 69 66 6f 1b 5b 30 30 6d 0d 0a |0;33mfifo.[00m..|
00000170 64 72 77 78 72 2d 78 72 2d 78 20 32 20 6a 64 69 |drwxr-xr-x 2 jdi|
00000180 6d 70 73 6f 6e 20 6a 64 69 6d 70 73 6f 6e 20 20 |mpson jdimpson |
00000190 20 34 30 39 36 20 32 30 30 39 2d 30 34 2d 31 39 | 4096 2009-04-19|
000001a0 20 31 31 3a 32 30 20 1b 5b 30 31 3b 33 34 6d 66 | 11:20 .[01;34mf|
000001b0 6f 6f 1b 5b 30 30 6d 0d 0a 2d 72 77 2d 72 2d 2d |oo.[00m..-rw-r--|
000001c0 72 2d 2d 20 31 20 6a 64 69 6d 70 73 6f 6e 20 6a |r-- 1 jdimpson j|
000001d0 64 69 6d 70 73 6f 6e 20 20 20 20 20 20 30 20 32 |dimpson 0 2|
000001e0 30 30 39 2d 30 34 2d 31 39 20 31 31 3a 32 30 20 |009-04-19 11:20 |
000001f0 1b 5b 30 30 6d 66 6f 6f 2e 74 78 74 1b 5b 30 30 |.[00mfoo.txt.[00|
00000200 6d 0d 0a 2d 72 77 2d 72 2d 2d 72 2d 2d 20 31 20 |m..-rw-r--r-- 1 |
00000210 6a 64 69 6d 70 73 6f 6e 20 6a 64 69 6d 70 73 6f |jdimpson jdimpso|
00000220 6e 20 20 20 20 20 20 30 20 32 30 30 39 2d 30 34 |n 0 2009-04|
00000230 2d 31 39 20 31 31 3a 32 35 20 1b 5b 30 30 6d 74 |-19 11:25 .[00mt|
00000240 79 70 65 73 63 72 69 70 74 1b 5b 30 30 6d 0d 0a |ypescript.[00m..|
00000250 2d 72 77 2d 72 2d 2d 72 2d 2d 20 31 20 6a 64 69 |-rw-r--r-- 1 jdi|
00000260 6d 70 73 6f 6e 20 6a 64 69 6d 70 73 6f 6e 20 34 |mpson jdimpson 4|
00000270 39 36 31 38 33 20 32 30 30 39 2d 30 34 2d 31 39 |96183 2009-04-19|
00000280 20 31 31 3a 32 31 20 1b 5b 30 31 3b 33 35 6d 77 | 11:21 .[01;35mw|
00000290 61 72 74 79 2d 66 69 6e 61 6c 2d 75 62 75 6e 74 |arty-final-ubunt|
000002a0 75 2e 70 6e 67 1b 5b 30 30 6d 0d 0a 1b 5b 6d 1b |u.png.[00m...[m.|
000002b0 5d 30 3b 6a 64 69 6d 70 73 6f 6e 40 61 72 74 6f |]0;jdimpson@arto|
000002c0 6f 3a 20 7e 2f 66 6f 6f 07 6a 64 69 6d 70 73 6f |o: ~/foo.jdimpso|
000002d0 6e 40 61 72 74 6f 6f 3a 7e 2f 66 6f 6f 24 20 65 |n@artoo:~/foo$ e|
000002e0 78 69 74 0d 0a 65 78 69 74 0d 0a 0a 53 63 72 69 |xit..exit...Scri|
000002f0 70 74 20 64 6f 6e 65 20 6f 6e 20 53 75 6e 20 31 |pt done on Sun 1|
00000300 39 20 41 70 72 20 32 30 30 39 20 31 31 3a 32 35 |9 Apr 2009 11:25|
00000310 3a 31 35 20 41 4d 20 45 44 54 0a |:15 AM EDT.|

Now, after running "script-declutter typescript > out":

jdimpson@artoo:~/foo$ script-declutter typescript > out
jdimpson@artoo:~/foo$ cat out
Script started on Sun 19 Apr 2009 11:25:08 AM EDT
jdimpson@artoo:~/foo$ ls -l
total 496
-rw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:21 bar.zip
-rw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:21 baz.deb
prw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:20 fifo
drwxr-xr-x 2 jdimpson jdimpson 4096 2009-04-19 11:20 foo
-rw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:20 foo.txt
-rw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:25 typescript
-rw-r--r-- 1 jdimpson jdimpson 496183 2009-04-19 11:21 warty-final-ubuntu.png
jdimpson@artoo:~/foo$ exit

Script done on Sun 19 Apr 2009 11:25:15 AM EDT
jdimpson@artoo:~/foo$ hexdump -C out
00000000 53 63 72 69 70 74 20 73 74 61 72 74 65 64 20 6f |Script started o|
00000010 6e 20 53 75 6e 20 31 39 20 41 70 72 20 32 30 30 |n Sun 19 Apr 200|
00000020 39 20 31 31 3a 32 35 3a 30 38 20 41 4d 20 45 44 |9 11:25:08 AM ED|
00000030 54 0a 6a 64 69 6d 70 73 6f 6e 40 61 72 74 6f 6f |T.jdimpson@artoo|
00000040 3a 7e 2f 66 6f 6f 24 20 6c 73 20 2d 6c 0a 74 6f |:~/foo$ ls -l.to|
00000050 74 61 6c 20 34 39 36 0a 2d 72 77 2d 72 2d 2d 72 |tal 496.-rw-r--r|
00000060 2d 2d 20 31 20 6a 64 69 6d 70 73 6f 6e 20 6a 64 |-- 1 jdimpson jd|
00000070 69 6d 70 73 6f 6e 20 20 20 20 20 20 30 20 32 30 |impson 0 20|
00000080 30 39 2d 30 34 2d 31 39 20 31 31 3a 32 31 20 62 |09-04-19 11:21 b|
00000090 61 72 2e 7a 69 70 0a 2d 72 77 2d 72 2d 2d 72 2d |ar.zip.-rw-r--r-|
000000a0 2d 20 31 20 6a 64 69 6d 70 73 6f 6e 20 6a 64 69 |- 1 jdimpson jdi|
000000b0 6d 70 73 6f 6e 20 20 20 20 20 20 30 20 32 30 30 |mpson 0 200|
000000c0 39 2d 30 34 2d 31 39 20 31 31 3a 32 31 20 62 61 |9-04-19 11:21 ba|
000000d0 7a 2e 64 65 62 0a 70 72 77 2d 72 2d 2d 72 2d 2d |z.deb.prw-r--r--|
000000e0 20 31 20 6a 64 69 6d 70 73 6f 6e 20 6a 64 69 6d | 1 jdimpson jdim|
000000f0 70 73 6f 6e 20 20 20 20 20 20 30 20 32 30 30 39 |pson 0 2009|
00000100 2d 30 34 2d 31 39 20 31 31 3a 32 30 20 66 69 66 |-04-19 11:20 fif|
00000110 6f 0a 64 72 77 78 72 2d 78 72 2d 78 20 32 20 6a |o.drwxr-xr-x 2 j|
00000120 64 69 6d 70 73 6f 6e 20 6a 64 69 6d 70 73 6f 6e |dimpson jdimpson|
00000130 20 20 20 34 30 39 36 20 32 30 30 39 2d 30 34 2d | 4096 2009-04-|
00000140 31 39 20 31 31 3a 32 30 20 66 6f 6f 0a 2d 72 77 |19 11:20 foo.-rw|
00000150 2d 72 2d 2d 72 2d 2d 20 31 20 6a 64 69 6d 70 73 |-r--r-- 1 jdimps|
00000160 6f 6e 20 6a 64 69 6d 70 73 6f 6e 20 20 20 20 20 |on jdimpson |
00000170 20 30 20 32 30 30 39 2d 30 34 2d 31 39 20 31 31 | 0 2009-04-19 11|
00000180 3a 32 30 20 66 6f 6f 2e 74 78 74 0a 2d 72 77 2d |:20 foo.txt.-rw-|
00000190 72 2d 2d 72 2d 2d 20 31 20 6a 64 69 6d 70 73 6f |r--r-- 1 jdimpso|
000001a0 6e 20 6a 64 69 6d 70 73 6f 6e 20 20 20 20 20 20 |n jdimpson |
000001b0 30 20 32 30 30 39 2d 30 34 2d 31 39 20 31 31 3a |0 2009-04-19 11:|
000001c0 32 35 20 74 79 70 65 73 63 72 69 70 74 0a 2d 72 |25 typescript.-r|
000001d0 77 2d 72 2d 2d 72 2d 2d 20 31 20 6a 64 69 6d 70 |w-r--r-- 1 jdimp|
000001e0 73 6f 6e 20 6a 64 69 6d 70 73 6f 6e 20 34 39 36 |son jdimpson 496|
000001f0 31 38 33 20 32 30 30 39 2d 30 34 2d 31 39 20 31 |183 2009-04-19 1|
00000200 31 3a 32 31 20 77 61 72 74 79 2d 66 69 6e 61 6c |1:21 warty-final|
00000210 2d 75 62 75 6e 74 75 2e 70 6e 67 0a 6a 64 69 6d |-ubuntu.png.jdim|
00000220 70 73 6f 6e 40 61 72 74 6f 6f 3a 7e 2f 66 6f 6f |pson@artoo:~/foo|
00000230 24 20 65 78 69 74 0a 65 78 69 74 0a 0a 53 63 72 |$ exit.exit..Scr|
00000240 69 70 74 20 64 6f 6e 65 20 6f 6e 20 53 75 6e 20 |ipt done on Sun |
00000250 31 39 20 41 70 72 20 32 30 30 39 20 31 31 3a 32 |19 Apr 2009 11:2|
00000260 35 3a 31 35 20 41 4d 20 45 44 54 0a |5:15 AM EDT.|

All the control and escape characters are gone.
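To make the escape-sequence side of this concrete, here is a rough Python sketch of the kind of patterns involved. It is not the script's actual Perl: the regular expressions are my approximation of the titlebar and ANSI sequences visible in the hexdump above, and the names are invented for the example. It leaves backspace handling out entirely.

```python
import re

# ESC ] 0 ; <title text> BEL : the xterm titlebar sequence in front of each prompt
TITLEBAR = re.compile(r"\x1b\]0;[^\x07]*\x07")
# ESC [ <digits and semicolons> <letter> : covers the color codes and ESC [ K
ANSI_CSI = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")
# stray bell and carriage return characters
LEFTOVERS = re.compile(r"[\x07\r]")


def strip_escapes(line):
    """Remove titlebar and ANSI sequences, then any leftover bell or CR."""
    # The titlebar sequence ends in a bell, so it has to go before the
    # generic bell removal, otherwise its title text would be left behind.
    line = TITLEBAR.sub("", line)
    line = ANSI_CSI.sub("", line)
    return LEFTOVERS.sub("", line)


# Two lines modelled on the typescript shown above
colored = "prw-r--r-- 1 jdimpson jdimpson 0 2009-04-19 11:20 \x1b[40;33mfifo\x1b[00m\r\n"
prompt = "\x1b[m\x1b]0;jdimpson@artoo: ~/foo\x07jdimpson@artoo:~/foo$ exit\r\n"

print(repr(strip_escapes(colored)))
print(repr(strip_escapes(prompt)))
```

The ordering comment is the point made earlier: the bell is also the terminator of the titlebar sequence, so the order of the substitutions matters.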

So much UDP that it is all over your screen

The last article teaches how to use socat by comparing it first to cat then to netcat. It skimped on socat's UDP-related features, because netcat only implements a subset of them. This article picks up where the last one left off, with respect to UDP. After this article will be one more that discusses advanced socat features.

It turns out there are a lot of subtleties when dealing with UDP, even before multicast is mixed in. We'll abandon the comparisons to netcat, as we've exceeded what netcat can do. But first a quick reminder of one way socat does UDP.

socat as a UDP server on port 11111.

socat STDIO UDP-LISTEN:11111

and then as a UDP client.

socat - UDP:localhost:11111

Recall from the previous article that socat's command-line structure requires two addresses. The first command is the server because it connects its standard I/O to the UDP-LISTEN: (UDP-L for short) address. So this is a UDP server listening on port 11111. The second one connects its standard I/O ("-" is a synonym for "STDIO") to UDP, connecting out to port 11111 of localhost. This is the client. Both read from standard input and send the data over the network, and both print to standard output what they receive from the network.

UDP Connection Behaviours
Most textbooks make a big deal about UDP being connectionless, but I think this tends to make people give up on UDP prematurely. Notionally, there is a limited concept of a connection between the above pair of commands. That is, there's a unique pair of address/port tuples that unambiguously defines whether a UDP packet belongs to this connection. If it has the address and port of the server and the address and port of the client, then it's part of this connection.

Behaviour: Client-Server, single-plexed
To see this, start a pair of socat processes as described above, one using UDP: (client) and the other UDP-LISTEN: (server), and have the client send data. This effectively starts a connection (although a weak one). At first, when you start both the client and the server, the server cannot send any data to the client, because it doesn't know how to talk to the client. The client must send some data so the server learns about it. More significantly, if you kill the client, restart it, then try to send data again (from either client or server), it may return a permission denied error, but regardless the new data won't be received on the other end. This is because the server determines that a connection is established based on the source and destination IP addresses and ports. When you restart the client, it chooses a new source port, so when it sends new data, the server doesn't recognize it as part of the old connection. Because UDP has no connection semantics, the server has no idea the original client was killed, but it still rejects new connections. If you add the fork option to the server, new connections will be accepted, but old connections will hang around indefinitely, and you can't predict which client will receive the data sent from the server. In a script, if you do have to handle multiple connections, you'll often be better off using a loop structure around calls to socat (omitting the fork option).

It's true that there's no way to test if a UDP socket is "connected". So UDP definitely makes no service level guarantees such as ordering, guaranteed delivery, or acknowledgment of existence, but these are qualities of a connection, not part of the definition. Meh, semantics.

(By the way, the reason you may get a permission denied message is that most IP stacks will send ICMP Port Unreachable packets in response to incoming UDP packets that aren't delivered to some receiving application. When the sending system receives the ICMP packets, it tells the sending process that permission is denied. However, such packets are often dropped by firewalls, and may not be required in the IP stack implementation, which is why you might not get the permission denied message.)

Behaviour: client to client
Oddly, if you kill and restart the server, the client has no problem sending data on the new connection. So the client (the UDP: address) has an even looser concept of a connection. This makes sense. A client sending its 500th packet does nothing much different from when it sent the first one. (There's no set-up protocol.) So you could do this: on one system "foo", run "socat STDIO UDP:bar:11111,sourceport=11111" and on system "bar", run "socat STDIO UDP:foo:11111,sourceport=11111". By causing both "clients" to bind on a specific source port, they can act as peers and talk to each other. Either process can be killed and restarted as many times as you want, and they will always resume their conversation.
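Here is a small Python sketch of why the peer arrangement survives restarts. It is an illustration under assumptions, not what socat literally does: both "systems" are sockets on 127.0.0.1, so they use two different free ports instead of both using 11111, and the helper names are invented.

```python
import socket


def bound_udp(port=0):
    """UDP socket bound to 127.0.0.1 on the given port (0 means any free port)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", port))
    s.settimeout(2.0)
    return s


def peer_demo():
    foo = bound_udp()
    bar = bound_udp()
    foo_addr, bar_addr = foo.getsockname(), bar.getsockname()
    # Each peer binds its own fixed port and connects to the other's
    foo.connect(bar_addr)
    bar.connect(foo_addr)

    received = []
    foo.send(b"hello from foo")
    received.append(bar.recv(1024))

    # "Kill and restart" foo: a brand new socket, but the same source port
    foo.close()
    foo = bound_udp(foo_addr[1])
    foo.connect(bar_addr)
    foo.send(b"foo is back")
    received.append(bar.recv(1024))   # bar still accepts it: same address and port
    bar.send(b"welcome back")
    received.append(foo.recv(1024))

    foo.close()
    bar.close()
    return received


if __name__ == "__main__":
    print(peer_demo())
```

Because the restarted peer comes back with the same address and port, the other side cannot tell the difference, which is the whole trick behind sourceport.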

SysCall Reference: connect() & bind()
(Also, for my own reference, the UDP: socat address type creates the socket handle, then waits. When data is available to send, it calls bind() on the socket only if the sourceport option is set, then calls connect() on the socket, attaching it to the destination IP address, then uses read(), write(), and select() to share data. UDP-L:, on the other hand, creates the UDP socket handle, immediately calls bind() to attach it to the listen port, then waits on select(). When data is incoming, it calls recvfrom() with the "MSG_PEEK" option so that it can figure out the source port and IP address, then uses connect() to attach that source IP and port to the socket. It uses read() and write() after that. Because it uses connect(), it can't receive from a new client unless you use the fork option.)
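The UDP-L sequence of bind(), recvfrom() with MSG_PEEK, then connect() can be sketched in Python as follows. This is my reading of the behaviour described above, tried only against a Linux-style socket API on the loopback interface; the function name and the use of a free port instead of 11111 are assumptions for the example.

```python
import socket


def listen_like_udp_l():
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))          # bind() to the listen port right away
    server.settimeout(2.0)
    server_addr = server.getsockname()

    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    first.sendto(b"one", server_addr)
    # Peek at the first datagram only to learn who sent it ...
    _, peer = server.recvfrom(1024, socket.MSG_PEEK)
    # ... then connect() so the socket is tied to that one source
    server.connect(peer)
    seen = [server.recv(1024)]

    # A different client (different source port) is now ignored
    second.sendto(b"two", server_addr)
    server.settimeout(0.5)
    try:
        seen.append(server.recv(1024))
    except socket.timeout:
        seen.append(None)

    # The original client is still heard
    first.sendto(b"three", server_addr)
    seen.append(server.recv(1024))

    for s in (server, first, second):
        s.close()
    return seen


if __name__ == "__main__":
    print(listen_like_udp_l())
```

The None in the middle of the result is the restarted-client problem from earlier: once the socket is connected, datagrams from any other address and port pair never reach it.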

The downside of the peer approach is that they will only talk to each other, like peers, rather than one being open to receiving from anyone else, like a server receiving from a client. Fortunately, socat also has UDP-SENDTO:, UDP-RECVFROM:, and UDP-RECV: addresses.

UDP-SENDTO: doesn't seem to behave any differently from the UDP: address. Perhaps there is a subtle difference that I can't see at the moment. Example: "socat STDIO UDP-SENDTO:foo:11111".

SysCall Reference: sendto()/recvfrom()
(Again, for reference, the UDP-SENDTO address does nothing on startup except make the socket handle. When data is sent, it is sent via the sendto() system call. No call to connect() is made, and bind() is called only if the sourceport or bind options are used. It uses recvfrom() to read data, and select() so that it doesn't deadlock. I understand the theoretical difference between UDP:'s behaviour of connect() followed by write() and UDP-SENDTO:'s behaviour of using only sendto(). But I fail to appreciate a meaningful difference in overall behaviour. Specifically, while in general successive calls to sendto() can be directed at different destination IP addresses, socat has no way of arranging that to happen.)
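The theoretical difference is easier to see outside socat. In this Python sketch (an illustration only; the receivers are two local sockets on free ports, which is my assumption, not anything socat sets up), one unconnected socket uses sendto() to reach two different destinations, something a connected socket cannot do.

```python
import socket


def sendto_demo():
    # Two independent receivers on free local ports
    receivers = []
    for _ in range(2):
        r = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        r.bind(("127.0.0.1", 0))
        r.settimeout(2.0)
        receivers.append(r)

    # One sender, never connect()ed: the destination is chosen per call
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for number, r in enumerate(receivers):
        sender.sendto(b"datagram %d" % number, r.getsockname())

    got = [r.recvfrom(1024)[0] for r in receivers]

    for s in receivers + [sender]:
        s.close()
    return got


if __name__ == "__main__":
    print(sendto_demo())
```

Since a socat command line names exactly one destination, this flexibility never gets exercised there, which fits the observation that UDP: and UDP-SENDTO: look the same in practice.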

Behaviour: Simple Multiplexed Server
UDP-RECVFROM: will wait for incoming data. When it gets a packet, it will then send any number of packets back to whoever sent the incoming one. But it won't ever wait for any more incoming packets. It will only send packets back to the source of the first packet received. This puzzled me, so I checked the man page, and indeed that's exactly what it's supposed to do. The man page goes on to say that this behaviour, when augmented with the fork option, is "similar to typical UDP based servers like ntpd or named." I suppose it's because some UDP-based services are strictly packet-based--a single packet from the client is answered by the server (with one or more response packets), after which the transaction is over (e.g. NTP and DNS). Handling multiple packets in both directions would require an extended application protocol to sort out ordering and retries (or at least acknowledgments) and that's not necessary for every application. It's suitable for very simple and short-messaged client-server applications. You can catch a single message from the above UDP-SENDTO example with "socat STDIO UDP-RECVFROM:11111,fork".

SysCall Reference: socat-specific behaviour
(For reference, UDP-RECVFROM creates the socket, bind()'s to the given listen port, then waits for data using select(). When it receives a single packet, it calls recvfrom(). Then it goes back to select(), but only to wait for more data to be ready to send over the socket. This makes me wonder whether this "one incoming packet only" behaviour is built in to the recvfrom() system call. I tend to think it is not, but rather is a conscious design decision on the part of the socat author(s).)

Behaviour: Data Receiver
The UDP-RECV address will also wait for incoming data, just as UDP-LISTEN and UDP-RECVFROM do. However, UDP-RECV will receive all packets sent to its listen port, from any and all clients. And it cannot send data back to any client. It's suitable for data collector applications. You can use both the UDP and UDP-SENDTO addresses to send to it. It aborts with an error if you try to make it send data. The "-u" option might be useful to prevent trying to use UDP-RECV in the wrong direction, should that be a problem. Example: "socat STDIO UDP-RECV:11111".

SysCall Reference: recvfrom()
(UDP-RECV uses recvfrom() to receive.)
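As a sketch of the data-collector pattern (Python, loopback only, invented names, a free port rather than 11111), a receiver that never connects and never replies hears every client:

```python
import socket


def collector_demo():
    collector = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    collector.bind(("127.0.0.1", 0))
    collector.settimeout(2.0)
    target = collector.getsockname()

    # Two unrelated clients, each with its own source port
    clients = [socket.socket(socket.AF_INET, socket.SOCK_DGRAM) for _ in range(2)]
    clients[0].sendto(b"from a", target)
    clients[1].sendto(b"from b", target)

    # recvfrom() on an unconnected socket accepts datagrams from anyone
    got = [collector.recvfrom(1024) for _ in range(2)]

    for s in clients + [collector]:
        s.close()
    return got


if __name__ == "__main__":
    for payload, source in collector_demo():
        print(payload, "from port", source[1])
```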

Behaviour: Datacast
Finally (sorta), the UDP-DATAGRAM address exists primarily for broadcast and multicast applications, both symmetric and asymmetric. You need to use the broadcast option to make a broadcast address work, otherwise you get an error. You most likely need to use the ip-add-membership= option to make multicast work. (You wouldn't if some other application instructs the OS to do the proper IGMP protocol that makes multicast work.) It also works on standard unicast addresses.

Broadcast example: "socat - UDP-DATAGRAM:,broadcast,sp=11111"
Multicast example: "socat - UDP-DATAGRAM:,bind=:11111,ip-add-membership="
Unicast example: "socat - UDP-DATAGRAM:,sp=11111"

None of these differentiate between sources. They truly are connectionless. On receive, they'll pick up any packet that makes its way to their network interface, assuming the packet is destined for the same port that they are listening on. On send, they transfer packets to the IP address listed. The port they send to is controlled by the "sp=port" option (or "bind=:port" option for multicast). (You could argue the unicast example isn't connectionless because it won't send to just anybody, but it has to put something as the "to" address.) All these examples listen on and send to the same port. You could listen on 11111 and send on 11112, for example, but then to get two-way communication the other side would have to do the opposite, and in a broadcast or multicast example, you'd end up with a very strange partition of nodes, where one portion of nodes can talk to the other portion, but not to other members of their portion. When everyone sends to and listens on one address, everyone can send to & receive from everyone else.

If you had three machines, each running one of the above examples, then the unicast system could send to either the broadcast or multicast system, assuming it was using the right destination unicast address. The unicast system would receive from the broadcast system, but not the multicast one. The multicast system can receive whatever the broadcast system sends, but not the reverse. These behaviours might depend on the OS you're running, and perhaps even the ethernet driver it has. I don't think I would ever count on any of these behaviours for a production system, but they might make good tricks for testing or network investigation.

SysCall Reference: sendto()/recvfrom(), again
(UDP-DATAGRAM calls recvfrom() (with the MSG_PEEK option), then again without that option. It does so continuously, and from all clients that send data. When sending data, it calls sendto().)
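At the socket level, the broadcast and ip-add-membership options correspond to the SO_BROADCAST and IP_ADD_MEMBERSHIP socket options. The Python sketch below shows roughly what has to be set. The helper names are invented, and the group 239.1.2.3 in the usage comment is a made-up example address from the administratively scoped multicast range, not one taken from the examples above.

```python
import socket
import struct


def membership_request(group, interface="0.0.0.0"):
    """Pack the ip_mreq structure that IP_ADD_MEMBERSHIP expects."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def open_multicast_socket(group, port):
    """Bind to the port and ask the OS to join the multicast group."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("", port))
    # This is the step that triggers the IGMP join
    s.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                 membership_request(group))
    return s


def open_broadcast_socket(port):
    """Bind to the port and allow sending to broadcast addresses."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    s.bind(("", port))
    return s


# Hypothetical usage, sending to and listening on the same port:
#   sock = open_multicast_socket("239.1.2.3", 11111)
#   sock.sendto(b"hi", ("239.1.2.3", 11111))
#   data, source = sock.recvfrom(1024)
```

Whether the join succeeds depends on the machine having a multicast-capable interface and route, so treat this as a starting point rather than something guaranteed to run everywhere.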

UDP-DATAGRAM with the same source and destination ports is what you'd most commonly use with multicast applications. It might also be useful with UDP-RECV if you just need to listen (although UDP-DATAGRAM will do that, too), and with UDP and UDP-SENDTO for just sending data (again, UDP-DATAGRAM does that, too). I can't think of any cases where it's useful with UDP-RECVFROM.

Even more fun, if you had three machines, run this on two of them: "socat -d -d UDP-DATAGRAM:,broadcast,sp=11111 UDP-DATAGRAM:,broadcast,sp=11112", and on the third one run this: "echo hi | socat - UDP-SENDTO:ipaddr:11112,sp=11112" (where ipaddr is the IP address of one of the first two machines). Then go home for the day. Oh, what a great joke to play on the Network Support group!

Well, that's all I got for UDP and socat. It should augment the previous article. I've got one more planned, for covering some advanced topics.

Learning socat in terms of netcat

In my previous post on sslrsh I wrote about a script to allow remote shell access over SSL. The script made extensive use of socat. It reminded me of how feature-complete socat is, and has motivated me to capture some socat recipes. Note that these aren't general purpose scripts; they are just snippets of functionality listed here for future reference.

I'm not the only person who has a socat tutorial, but I think this post is unique because it will attempt to describe socat by comparing it to a tool that is doubtless a major inspiration for socat, namely, netcat. Hopefully, it will clarify how to use socat, demonstrate how much more featureful socat is, but also show why you shouldn't go ahead and delete netcat outright.

This is part one of a three part series. This one compares socat with netcat. The next one will delve into UDP with socat, and the last one will get into some advanced topics.

Final comment before we start. As of this writing, socat version 2.0 has entered beta. socat 2.0 addresses a limitation in socat 1.x, which is that "addresses" in socat 1.x are not completely uniform, and they are not layerable. For example, there's no way to run SSL over UDP, even though socat knows about both protocols. Similarly, there's no way to have an SSL connection be tunneled through a web PROXY, meaning you have to resort to the hack found in sslrsh. socat 2.0 addresses these limitations, but uses an enhanced syntax, which means 1) it will be even more complicated to use, and 2) this post may become obsolete rather sooner than expected.

But before we compare socat to netcat, let's compare it to their common namesake, cat.

Use cat to display a file on standard output.

jdimpson@artoo:~$ cat file.txt
This is the content of file.txt

Use socat to display a file on standard output.

jdimpson@artoo:~$ socat FILE:file.txt STDOUT
This is the content of file.txt

In general, socat takes two arguments. Both are called addresses. In the above example, FILE:... is one address, and STDOUT is the second. It's customary but not required to spell the address name in upper case. We'll see lots of address types in this post, as well as in a couple follow-on posts that I've got planned.

If you just run "cat" by itself, it will read from standard input and write to standard output, and you have to press control-D to end.

jdimpson@artoo:~$ cat
hello, world!
hello, world!

The first line after the command is typed in, the second is printed by the command.

Here's the equivalent using socat.

jdimpson@artoo:~$ socat STDIN STDOUT
hello, world!
hello, world!

Apparently, STDIN and STDOUT are both synonyms for STDIO, and socat doesn't care if you send input to the STDOUT address, or read output from the STDIN address. "socat STDOUT STDOUT", "socat STDIN STDIN", and "socat STDIO STDIO" all appear to work identically.

But even here socat can improve the situation. We can add a history, so that we can just hit up arrow to repeat what we've typed in earlier, just like bash can do. It utilizes the GNU Readline library.

jdimpson@artoo:~$ socat -u READLINE STDOUT
hello, world!
hello, world!
hello, world!
hello, world!
hello, world!
hello, world!

To get this output, I first typed "hello, world!" and pressed enter. socat wrote the second line. Then I pressed up arrow to get the third line, and enter to get the fourth. Finally, one more up arrow for the fifth, and again enter for the sixth.

The "-u" flag tells socat to run in unidirectional mode. As we'll see later on, socat usually passes data between the first and second addresses in either direction, something cat does not do. When both addresses end up connecting to the terminal, as is the case here, it's undetermined as to whether the line you type is being read by the first address and sent to the second, or vice versa. It took me a while to figure this out (over a month after I originally posted this!). By forcing unidirectional mode, only the first address reads what you type, and passes it to the second one.

When you quit using control-c or control-d, the terminal gets messed up, and you have to type "reset" (even though you may not be able to see what you're typing) to fix it. The READLINE address has a couple of options, one of which lets you set a history file, which stores the input history across invocations, just like your "~/.bash_history" file. This example isn't how you'd normally use READLINE, but I'm postponing further discussion on READLINE to another post.

Use cat to create a file (then again to display it)

jdimpson@artoo:~$ cat > file.txt
I like writing files using cat and control-D!!
jdimpson@artoo:~$ cat file.txt
I like writing files using cat and control-D!!

Note that, technically, the shell is actually writing the file by virtue of the redirect symbol (greater than sign).

Use socat to create a file (then use cat to display it)

jdimpson@artoo:~$ socat -u STDIN OPEN:file.txt,creat,trunc
socat needs some funny commands to write files!
jdimpson@artoo:~$ cat file.txt
socat needs some funny commands to write files!

Again, a huge difference between socat and cat is that socat, by default, is bidirectional. So both addresses are read from and written to. cat is always unidirectional. And, in socat, when either one of the addresses sends an EOF (End of File), it waits some amount of time and then exits. And again, the "-u" flag tells socat to be unidirectional. Without it, the above socat invocation will read from the file, get EOF, and exit. Or, if the file doesn't exist, it will quit with an error. There would be no time to type anything in. If instead you pipe something in to socat, like this: echo foo | socat STDIN OPEN:file.txt,creat,trunc, the -u isn't needed. Presumably, when invoked within a shell pipe, socat recognizes that fact, knows that pipes are always unidirectional, and behaves as if the -u flag were given.

Note the options used, creat and trunc. You could also use append, and lots of other options available to the open() system call. Also, without the trunc option, socat will write bytes into the file in-place. Omitting trunc and using the seek option, you can change arbitrary bytes in the file. There are also rdonly and wronly options (read-only and write-only, respectively). I had thought that if I used the wronly option, I wouldn't need the -u flag. That didn't work because socat still tried to read from the file, got an error, and exited. Probably the determination of uni- or bi-directionality is done without input from address-specific options. It does work as expected if you pipe input into socat. socat also has a CREATE address based on the creat() system call, but this is equivalent to OPEN with the creat option.
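The effect of trunc versus seek comes straight from the open() flags, which this Python sketch exercises directly. It is an illustration of the system call behaviour, not of socat internals; the helper name and the temporary file are assumptions for the example.

```python
import os
import tempfile


def write_with_flags(path, data, flags, offset=0):
    """open() with the given flags, seek to offset, write, close."""
    fd = os.open(path, flags, 0o644)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        os.write(fd, data)
    finally:
        os.close(fd)


def open_flags_demo():
    path = os.path.join(tempfile.mkdtemp(), "file.txt")

    # Like OPEN:file.txt,creat,trunc : start from an empty file
    write_with_flags(path, b"hello, world!\n",
                     os.O_WRONLY | os.O_CREAT | os.O_TRUNC)

    # No trunc, plus a seek: bytes are overwritten in place
    write_with_flags(path, b"socat", os.O_WRONLY, offset=7)
    with open(path, "rb") as f:
        patched = f.read()

    # trunc again: the old content is thrown away first
    write_with_flags(path, b"new\n", os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    with open(path, "rb") as f:
        truncated = f.read()

    return patched, truncated


if __name__ == "__main__":
    print(open_flags_demo())
```

The second write replaces only the five bytes of "world", leaving the rest of the line alone, which is the in-place behaviour described above.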

That covers the major forms of cat, and how socat emulates them, and in some cases enhances them. I don't suggest ever using socat to do what cat can do, but you should have a better sense for how to invoke socat. Now let's compare socat with netcat.

In netcat, connect to TCP port 80 on localhost, as a poor man's web browser.

jdimpson@artoo:~$ nc localhost 80
User-agent: netcat, baby!

HTTP/1.1 200 OK
Date: Wed, 28 Jan 2009 13:06:43 GMT
Server: Apache/2.2.8 (Ubuntu) PHP/5.2.4-2ubuntu5.4 with Suhosin-Patch mod_ssl/2.2.8 OpenSSL/0.9.8g mod_perl/2.0.3 Perl/v5.8.8
Last-Modified: Sat, 10 Jan 2009 22:01:08 GMT
ETag: "24008-369-4602802c0f100"
Accept-Ranges: bytes
Content-Length: 873
Connection: close
Content-Type: text/html

I typed in the first three lines (third one is an empty line). The rest is output from the server.

In socat, connect to TCP port 80 on localhost, as a poor man's web browser.

jdimpson@artoo:~$ socat - TCP:localhost:80
User-agent: socat, natch!

HTTP/1.1 200 OK
Date: Wed, 28 Jan 2009 13:07:51 GMT
Server: Apache/2.2.8 (Ubuntu) PHP/5.2.4-2ubuntu5.4 with Suhosin-Patch mod_ssl/2.2.8 OpenSSL/0.9.8g mod_perl/2.0.3 Perl/v5.8.8
Last-Modified: Sat, 10 Jan 2009 22:01:08 GMT
ETag: "24008-369-4602802c0f100"
Accept-Ranges: bytes
Content-Length: 873
Connection: close
Content-Type: text/html

Note the "-". That's a shortcut for writing "STDIO", so the above command is equivalent to "socat STDIO TCP:localhost:80".

netcat as a server, listening on TCP port 11111.

nc -l -p 11111

Use "nc localhost 11111" from another window to connect to it. You can type in both windows, and should see each other's input in both.

There are a couple versions of netcat out there, and some versions (like the OpenBSD one) have had the command flag syntax changed. If you get an error running netcat as described above, try it without the -p flag, like this: "nc -l 11111". Some people really don't get the ideas of compatibility and portability--if they want to change the way a program works (presumably because they think they are improving it), fine. But they should also change the name of the program so that hundreds of scripts don't break, and so that future script writers don't have to test for each version. Anyways...

socat as a server, listening on TCP port 11111.

socat STDIO TCP-LISTEN:11111,reuseaddr

Use "nc localhost 11111" from another window to connect to it. You can type in both windows, and should see each other's input in both.

Note that because socat is bidirectional, it doesn't matter which order you put the addresses. The above is equivalent to "socat TCP-LISTEN:11111,reuseaddr STDIO".

TCP-L can be used as a shortcut for TCP-LISTEN. The reuseaddr option lets you quit socat and run it again immediately. netcat does that by default.

netcat as a server, listening on TCP port 11111, handling multiple connections. This one is untested, and created from memory.

nc -L -p 11111

There used to be some versions of netcat that could handle more than one incoming connection when given the "-L" flag, but I can't find a copy of netcat that works that way, nor even any documentation for it. (Maybe I've imagined it!) It's almost equivalent to this shell script snippet: "while true; do nc -l -p 11111; done", except that this snippet only handles one connection at a time, not multiple ones. The OpenBSD variant of netcat has a -k option which works just like the shell snippet, but still doesn't handle multiple simultaneous connections.

socat as a server, listening on TCP port 11111, handling multiple connections.

socat STDIO TCP-L:11111,reuseaddr,fork

Now open two more windows and run "nc localhost 11111" in each. These are clients to your socat server. What you type in each client window gets displayed in the server window. But what you type in the server window only goes to one of the clients. Each line alternates between each client.  The fork option to TCP-L tells socat to fork a new process for each received connection on port 11111.  Each new process then reads and writes on the standard input/output.

TCP: will use IPv4 or IPv6 depending on which type of address you provide. TCP-LISTEN: will listen on all local addresses (IPv4 and IPv6) unless limited by the bind option. There exist TCP4, TCP6, TCP4-LISTEN, and TCP6-LISTEN variations, as well.

netcat as a UDP server on port 11111.

nc -u -l -p 11111

and then as a UDP client.

nc -u localhost 11111

socat as a UDP server on port 11111

socat - UDP-LISTEN:11111

and then as a UDP client.

socat - UDP:localhost:11111

Again, UDP-L can be used instead of UDP-LISTEN. UDP will use IPv4 or IPv6 depending on which type of address you provide. UDP-LISTEN will listen on all local addresses (IPv4 and IPv6) unless limited by the bind option. There exist UDP4, UDP6, UDP4-LISTEN, and UDP6-LISTEN variations, as well.

socat has other UDP-based addresses that implement other communication patterns beyond what netcat can do. I started to enumerate them here, but the UDP subject ended up dominating this article, so I've pulled it out and link to it here, so this one can remain focused on comparison with netcat.

The coolest, and most dangerous, netcat option is -e, which causes netcat to execute a command when it connects out or receives a connection. A simple remote access server looks like this:

nc -l -p 2323 -e /bin/bash

The strict equivalent simple remote access server in socat is:

socat TCP-LISTEN:2323,reuseaddr EXEC:/bin/bash

However, you can improve on this in several ways. First, the argument to -e in netcat has to be the name of an executable program, found somewhere on the disk. It can't be multiple commands, and can't rely on shell behaviours, like variable handling or wildcard expansion. Not a major impediment, because you can always write out your commands into a shell script, but sometimes doing that is inconvenient. But socat has the SYSTEM address, which uses the system() call rather than a call to exec(), which is what -e in netcat and EXEC in socat do. It enables something like this:

socat TCP-LISTEN:2323,reuseaddr SYSTEM:'echo $HOME; ls -la'

As always whenever the system() call is involved, be aware when writing scripts to not allow unchecked input to be invoked by the system() call. If you try the above in netcat ("nc -l -p 2323 -e 'echo $HOME; ls -la'"), you'll get an error like this: "exec echo $HOME; ls -la failed : No such file or directory", because netcat tried to execute a program called, literally, "echo $HOME; ls -la", spaces and all. Some versions of netcat have a "-c" option, which uses system() instead of exec(), which would allow multiple commands and shell behaviours to work. But again, it depends on which version you have.
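The difference between EXEC and SYSTEM can also be seen without any network connection. The following is a minimal sketch, assuming socat is installed; it runs the same command string through both address types and captures what each one prints. The variable names are just for illustration.

```shell
if command -v socat >/dev/null 2>&1; then
    # EXEC runs "echo" directly with the literal argument $HOME; no shell is involved.
    exec_out=$(socat -u EXEC:'echo $HOME' STDOUT)
    # SYSTEM hands the whole string to a shell, which expands $HOME before echo runs.
    system_out=$(socat -u SYSTEM:'echo $HOME' STDOUT)
    printf 'EXEC printed:   %s\n' "$exec_out"
    printf 'SYSTEM printed: %s\n' "$system_out"
else
    echo "socat not found; skipping the demo" >&2
fi
```

The EXEC line should show the variable name untouched, while the SYSTEM line should show your home directory.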

netcat is often employed as a data forwarder, aka a simple proxy, listening for incoming connections only to redirect data to another destination port and/or address. It does so by going in to listen mode with -l, then using -e to invoke itself as a client. Because of the use of exec() instead of system(), you have to put the client call into a shell script. First, the client script, "nc-cli", looks like this:

nc localhost 22

Then the call to netcat looks like this:

nc -l -p 2323 -e "./nc-cli"

This redirects incoming connections to port 2323 around to port 22.

(Sometimes you see inetd or xinetd configured to use netcat to do redirecting.)

Of course, you can implement the exact netcat behaviour with "socat TCP-L:2323 EXEC:./nc-cli", or even "socat TCP-L:2323 SYSTEM:'socat STDIO TCP:localhost:22'". However, there's a better way to do data forwarding with socat, which doesn't need a client shell script or even a recursive call to socat. By now you should have enough information about socat to figure it out yourself, so I'll put the example beneath a cut.
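For readers who want to check their answer, one way the direct forwarding could look is a single socat call with a listening address on one side and a connecting address on the other. This is a sketch pieced together from the addresses and options covered so far, not necessarily the original example; the helper name forward_port is made up.

```shell
# forward_port LISTEN_PORT DEST_HOST DEST_PORT
# Hypothetical helper: accept TCP connections on LISTEN_PORT and relay each
# one to DEST_HOST:DEST_PORT. The fork option serves every client in its
# own process, so the listener keeps accepting new connections.
forward_port() {
    socat "TCP-LISTEN:$1,reuseaddr,fork" "TCP:$2:$3"
}

# Same job as the netcat example: connections to port 2323 end up on port 22.
# forward_port 2323 localhost 22
```

No client script and no second program are needed, because socat simply joins its two addresses.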

And of course, you can replace either address with any other socat address we've already talked about (UDP, UDP-L), or ones we'll talk about in another post (e.g. SSL).

socat can also handle common forwarding requests that netcat doesn't handle. While netcat can bridge between TCP and UDP (insert the -u flag in the above netcat example as appropriate), it can only handle UDP data that is essentially connection-oriented. With socat, any other communication patterns for which UDP is commonly used are also do-able. Just replace the STDIO address in any of the examples in the socat UDP article with TCP or TCP-L addresses as appropriate.

socat can even behave as a socket gender changer! This part might be a bit confusing to understand; there used to be a file called "TCP-IP_GenderChanger_CSNC_V1.0.pdf" that described the problem, but it seems to be absent from its original location. So I shall try to describe it. The "gender" of a socket is, in this analogy, whether it is a client or a server socket. So a socket gender changer allows two client sockets to connect to each other, or two server sockets to connect to each other. In either case, the gender changer must be running on a host reachable by both clients or both servers. It can run on the same host as either pair, or on a third host. netcat can do this, but with some limitations.

Why would you need this? Off-hand, I can't think of any network protocols that would allow two clients or two servers to just start communicating. So it's not a capability in demand as often as an audio cable gender changer is. But there is one case where it may be useful. Say you have a host running a service that's hidden behind a firewall. No one can connect to the service because the firewall prevents incoming connections. It will allow outgoing connections. Now imagine you can run software on a system outside of the firewall. If you run a server-server gender changer on the external host, and a client-client gender changer on the internal host (with one client connecting to the internal service, and the other to one of the server ports on the external host) you have in effect fooled the firewall into allowing access to the internal service despite its access control rules forbidding incoming connections. The above-referenced URL has the specifics of how to use socat to do this. Notice that socat has all sorts of retry and timing options to get the desired behaviour. netcat doesn't have all these options, although you may be able to compensate for their absence with a shell script.
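As a rough illustration of the two halves, the pair could be sketched like this. The function names, port arguments and retry values are all invented for the example, and this is not taken from the referenced document; it only combines the TCP and TCP-LISTEN addresses shown earlier with socat's retry and interval options.

```shell
# Run on the external host: two listeners joined back to back.
# $1: port the internal host dials in to, $2: port outside users connect to
server_server_changer() {
    socat "TCP-LISTEN:$1,reuseaddr" "TCP-LISTEN:$2,reuseaddr"
}

# Run on the internal host: two outgoing connections joined back to back.
# $1: external host, $2: its dial-in port, $3: internal service host, $4: its port
client_client_changer() {
    # retry and interval keep re-dialling the external host if it is not up yet
    # (10 attempts, 5 seconds apart; both values are arbitrary).
    socat "TCP:$1:$2,retry=10,interval=5" "TCP:$3:$4"
}

# Hypothetical usage:
#   external$ server_server_changer 3333 4444
#   internal$ client_client_changer external.example.com 3333 localhost 22
```

Each half handles a single session; wrapping either call in a shell loop would be one way to keep it available for the next one.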

That brings to an end the direct comparison of socat and netcat functionality. There's a lot more that socat can do, which I'll address in another article (one on UDP and multicast, the other on everything else). There are some things netcat can do that I didn't discuss, like how it can do telnet negotiation or port scanning. I really consider those out of place in netcat, because they're too application focused. I tried to point out all the netcat options that are only available in some versions of netcat where appropriate. I didn't talk about the source routing ability of (again, some versions of) netcat. socat can do this too, using the ipoptions option, but it's difficult to use. Mostly, though, I don't know enough about source routing to compare the two; something to add to my list of things to figure out. Don't forget, here's the socat & UDP article.

sslrsh: Remote Shell over SSL using certificate authentication

sslrsh, which stands for SSL Remote Shell, allows you to log in to a remote system over an SSL connection, using X.509 certificates for encryption and for authentication. It's similar to SSH. sslrsh is a shell script. Most of the heavy lifting in the script is done by socat. The same script can run as both client and server.

This script signals a return to my favorite subject, tunneling. My last discussion on this subject got a bit out of hand. My last useful discussion on this subject was based on SSH, and was unique in that it worked without needing root privileges on the remote side of the tunnel. sslrsh is not actually a tunneling tool. It's a remote shell tool. But it's a good introduction for future posts that will use some of these same tools to set up VPN-style tunnels. Before I wrap up this trip through memory lane and get to the point, I want to remind you about mini_ca. We'll be needing some certificate action for this script, and for that we need a Certificate Authority. You can use mini_ca to generate the needed certificates, or you can be difficult and get them some other way.

Here's the usage & license statement:


Usage: sslrsh [-h]
sslrsh [-p port] [-P proxy:port] [-c /path/to/cert] [-a /path/to/cacert] remotehost
sslrsh -s [-p port] [-c /path/to/certificate] [-a /path/to/cacertificate] [ -e shell commands to execute ]

-h This helpful information.
-p Port to use (currently 1479).
-P CONNECT proxy server to use. http_proxy environment variable will be used if set, but will be overridden by this flag.
-c Path to the client or server certificate (must include key)
(currently "sslrsh-client.pem" for client, "sslrsh-server.pem" for server)
-a Path to the signing Certificate Authority's certificate
(currently "sslrsh-cacert.pem")
remotehost System to connect to (client mode)
-s Listen for connections (server mode)
-e Shell command or commands to execute as the server, defaults to "echo Welcome to sslrsh on argentina; /bin/bash --login -i"

sslrsh is copyright 2009 by Jeremy D. Impson <>.
Licensed under the Apache License, Version 2.0 (the License); you
may not use this file except in compliance with the License. You may
obtain a copy of the License at
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations
under the License.

sslrsh needs three files: a Certificate Authority (CA) certificate, a server certificate, and a client certificate. The client and server certificates must also have their private keys embedded within them. The CA certificate must have issued both the server and client certificates. Actually, the server instance of sslrsh needs to have the CA certificate that signed the client's certificate, and the client instance needs to have the CA certificate that signed the server's certificate. Got that? Good.

You can specify the CA Certificate with "-a", and the server/client certificate with "-c". By default the port is 1479, which can be changed with "-p". "-P" will let you specify a web CONNECT proxy for the client. If you want to run as the server use "-s", and supply a remote host destination if you want to run as a client. The server will run a bash shell by default, but you can change what it runs through the "-e" command. NOTE NOTE NOTE: the value specified with "-e" is passed to the system() call, which can have severe security repercussions, especially if in your own script that calls sslrsh, you pass unverified data as the value for "-e". (Hmm. A Taint Mode for the shell would be pretty cool.) Finally, "-h" gets the help/usage and license statement.

Here's an example. It runs the server on host "artoo", and the client on host "argentina", connecting to "artoo" using sslrsh:

On the server

jdimpson@artoo:~$ sslrsh -s
Listening on "1479"
And on the client

jdimpson@argentina:~$ sslrsh artoo
Connecting to "artoo:1479"
welcome to sslrsh

Here's the code:


You can download sslrsh here:

Let's dive in. After setting default values, processing the command line, and providing a usage and license statement, sslrsh gets its certificate material in order. It makes sure the CA certificate is readable, and does the same for the client or server (depending on which mode it runs in) combined certificate and key file. By default, it looks for files in the current directory, with names that just happen to match up with the file names you'd get if you followed the directions below to create them with mini_ca. (What a happy coincidence.) Otherwise, you can use the "-a" flag to direct sslrsh to the CA cert you want. Similarly, the client or server cert can be set with "-c".

Then it figures out what kind of proxying, if any, should be done. If the user has the http_proxy environment variable set, that will be used. If the user specified the "-P" flag, the provided value will be used as the proxy. Regardless of which source the proxy setting comes from, it gets scrubbed and parsed. First, any URL-related text (e.g. "http://") is removed. If it matches the form "server:port", the port is stripped out and assigned to another variable.
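The scrubbing and parsing step can be sketched with plain shell parameter expansion. This is an illustrative reconstruction, not the code from sslrsh itself; the function name, the PROXY_HOST and PROXY_PORT variable names, the example proxy address, and the fallback port of 8080 are all assumptions.

```shell
# parse_proxy VALUE
# Sets PROXY_HOST and PROXY_PORT (hypothetical names) from values such as
# "http://proxy.example.com:3128/" or "localhost:8888".
parse_proxy() {
    p=$1
    p=${p#*://}     # drop a leading scheme such as "http://"
    p=${p%%/*}      # drop a trailing slash or path
    case $p in
        *:*) PROXY_HOST=${p%%:*}; PROXY_PORT=${p##*:} ;;   # "server:port" form
        *)   PROXY_HOST=$p;       PROXY_PORT=8080 ;;       # assumed default port
    esac
}

# Fall back to a made-up example when http_proxy is not set.
parse_proxy "${http_proxy:-http://proxy.example.com:3128/}"
echo "proxy host: $PROXY_HOST, port: $PROXY_PORT"
```

A value given with "-P" would simply be passed to the same function after the environment variable has been handled, so the flag wins.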

With all that out of the way, the script proceeds to figure out if it's meant to run as a client or a server. If as a client, it then checks the proxy settings. If present, the script forks (via the ampersand) an instance of socat that listens on a local TCP port and forwards anything sent to that port on to the specified web CONNECT proxy. It tells the proxy to redirect the connection to the final destination, as given on the command line. This is effectively a proxy to the proxy, because socat's SSL functionality doesn't know how to talk to a web proxy directly.

Then the script executes socat, listening on standard input, connecting it to an SSL socket. The "-,raw,echo=0" argument to socat says: listen on standard input, turn off all processing that the TTY layer normally does, and similarly tell it not to echo input typed by the user back to the user. This is important as we want the server side to receive everything we type, and to present us with everything on the screen. The argument that starts with "SSL:..." controls the SSL connection. If no proxy was configured, the SSL connection will connect to the final destination as given on the command line. If there is a proxy, the SSL connection will connect to the listening port of the above described socat instance. Either way, the SSL connection uses both the CA certificate (for authenticating the server), and the client certificate (to present to the server). The rest of the argument uses a number of options to control the SSL connection. The "-ls" and "-lp" arguments control how socat performs logging.

It's unfortunate that we must run a second instance of socat just to perform this proxying; it would be preferable if the SSL capability within socat could utilize the proxy directly. Apparently socat version 2.0 will be able to rectify this situation.

If running as a server, sslrsh again executes socat, using it to create a listening SSL socket, then fork and exec a shell command, which by default prints a welcome message then runs an instance of the bash shell. The argument to socat that begins "SSL-L:..." tells it to listen for incoming SSL connections (from the client). Every new connection causes the process to fork and run a shell command, as described in the argument that starts "SYSTEM:...". The rest of that argument has a bunch of options ("pty,setsid,setpgid,ctty") that are some Unix/POSIX voodoo necessary to give the client the appropriate interactive shell experience with its own TTY and job control, while "stderr" makes sure the error from the shell command gets sent to the client. The "-ls" and "-lp" arguments control how socat performs logging. The two "-d" flags increase the verbosity level of the logging.
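Put together, the two socat invocations described above might look roughly like the following. This is a hypothetical reconstruction from the description, not the actual sslrsh source: the function names and argument order are invented, the logging flags are left out, and the cert, cafile and verify options are my assumption about how the certificates are handed to socat.

```shell
# sslrsh_client HOST PORT CERT CACERT   (hypothetical helper)
# Raw terminal on one side, an SSL connection on the other.
sslrsh_client() {
    socat "-,raw,echo=0" "SSL:$1:$2,cert=$3,cafile=$4,verify=1"
}

# sslrsh_server PORT CERT CACERT COMMAND   (hypothetical helper)
# Listen for SSL connections and fork a shell command for each one.
# COMMAND must not contain commas or colons here, since socat would
# treat them as option and address separators.
sslrsh_server() {
    socat "SSL-L:$1,reuseaddr,fork,cert=$2,cafile=$3,verify=1" \
          "SYSTEM:$4,pty,setsid,setpgid,ctty,stderr"
}

# Hypothetical usage with the default file names and port:
#   sslrsh_server 1479 sslrsh-server.pem sslrsh-cacert.pem '/bin/bash --login -i'
#   sslrsh_client artoo 1479 sslrsh-client.pem sslrsh-cacert.pem
```

With a proxy in play, the client's SSL address would point at the local port of the helper socat instance instead of the remote host.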

Although I think sslrsh is a neat script, and it has its uses as a lightweight and customizable remote access server, be aware of its limitations. It doesn't do tunneling/port forwarding like SSH does. It may violate the access policy of the system you're running the server on (if you aren't its administrator). It doesn't have robust error checking, especially in the set up of the proxy process, so it's not an easily supportable, enterprise-quality service.

Also, it doesn't matter which client certificate you use to connect to the server, the server will authenticate you as long as the CA created the client certificate. There's no differentiation between clients. A "normal" SSL application would actually read a certificate after validating it. We're not doing that here. It would be nice if the contents of the validated certificate were made available to our server. One way to do this would be to have an option to the SSL function that runs a script after a cert is validated, and is fed the certificate on input or in the environment. The script would return true or false to specify whether the certificate should be accepted. Or, if the SSL function placed the validated certificate, or even just the "Subject" line in the certificate, into an environment variable, our shell (as specified by "-e") could use it to make decisions.

But probably the biggest limitation is that, by default, when you log in to the sslrsh server, the shell you get will be running as the user who started the sslrsh server. Here's an alternative way to run the server which makes it prompt for username and password (in addition to the certificate-based authentication). For this to work, however, the server has to be run as root. It works by replacing the call to the bash shell with a call to the login program. login prompts for username and password, checks them against the server system's password mechanism, then uses setuid() to become whatever username was provided.

On the server

jdimpson@artoo:~$ sudo ./sslrsh -s -e "/bin/login"
Listening on "1479"
On the client, you can see the change

jdimpson@argentina:~$ sslrsh artoo
Connecting to "artoo:1479"
via "localhost:8888" proxy
artoo login: sysadmin
Last login: Wed Nov 26 09:34:29 EST 2008 from on pts/8
Linux artoo 2.6.24-22-generic #1 SMP Mon Nov 24 18:32:42 UTC 2008 i686

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

To access official Ubuntu documentation, please visit:
You have mail.


A note about socat: Its man page describes it like this:

socat - Multipurpose relay (SOcket CAT)

Socat is a command line based utility that establishes two bidirectional
byte streams and transfers data between them. Because the streams can be
constructed from a large set of different types of data sinks and sources
(see address types), and because lots of address options may be applied to
the streams, socat can be used for many different purposes. It might be
one of the tools that one ‘has already needed´.

That's accurate, but doesn't really capture the scope of socat's capability. Just like its namesake cat, socat is all about connecting standard input to standard output. Unlike cat, which primarily operates on files and TTYs, socat can operate on (and create, if necessary) files, TTYs, TCP sockets, UDP sockets, raw sockets, Domain sockets, raw IP frames, child processes, named pipes, arbitrary file handles, as well as standard input, output, and error. Additionally, socat knows about a few application-level protocols like SSL, Web CONNECT Proxy, and SOCKS. It also knows about various network access methods, like IPv4, IPv6, IP Multicast, IP Broadcast, and datagram- and session-style transmit and receive. All of these communication mechanisms are called addresses, and socat has a rich set of options that apply to numerous address types, such as setting device-specific ioctl()s, IP headers, socket options, and security ciphers, and performing functions like fork(), chroot(), and setuid() in order to get various security and performance behaviours.

Folks who know about netcat might wonder how it compares to socat. Given socat, you don't need netcat, although netcat is a lot simpler to use if all you need is basic TCP or UDP streaming. Personally, I plan on keeping netcat in my own personal toolbox, if only because I can bang out a netcat command line without any conscious thought.

A note on certificates: As mentioned above, the client and server need certificates & keys, and they both need access to a Certificate Authority certificate in order to validate each other's certificates. Here's how to use mini_ca to create the necessary certs. (Follow the link to find out where to download mini_ca.)

Create the Certificate Authority, and the CA certificate, like this:


You'll find the CA certificate at "sslrsh-ca/sslrsh-cacert.pem".

Use the CA to create a server certificate/key.


You'll find the server certificate/key at "sslrsh-ca/certs/sslrsh-server.pem".

Use the CA to create a client certificate/key.


You'll find the client certificate/key at "sslrsh-ca/certs/sslrsh-client.pem".

notes: Perl Commandline Primer

Hey, I found some really old notes that I wrote, which are still mostly relevant today! This is a Primer for Perl on the Command Line. It's been available at since I wrote it in 1998. I wrote it for a Linux User Group that I helped start around that time ( I've always loved Perl for its flexibility, and using it on the command line (vs writing out perl scripts) for one-shot administrative tasks is for me a big part of that flexibility. Here's the primer in its entirety, unmodified from how it was originally presented at an early LUG meeting. Apparently, it was presented January 19, 1999. Hmm. According to earlier meeting minutes, the notion of presenting these notes was conceived in early 1999, not in 1998, so that's more likely when this was written, despite the copyright date in the file.


These notes are Copyright 1998 by Jeremy D. Impson <>.
You may copy them in their entirety or in part for your own use (any use)
provided this copyright and credit notice is included in the copy.

- Running perl -

1. Perl on the command line is an incredible administrative tool and a great time saver

2. Normal way to run a perl script is by writing the code into a file (say, "foo") then
running it like this

$ cat foo
print "hello, world!\n";

$ perl foo
hello, world!

3. If the script has "#!/usr/bin/perl" as the first line, then you can run the script
directly as an application:

$ cat foo
#!/usr/bin/perl
print "hello, world!\n";

$ foo
hello, world!

4. Sometimes writing files out is a pain, especially for one shot deals or when you don't
have write permission. The same effects as above can be gained with the '-e' flag, which says
"execute the next chunk as Perl code".

$ perl -e 'print "hello, world!\n";'
hello, world!

- Quotation tips -

5. Must group the argument to -e with either single or double quotes so that the shell will
pass the whole thing as the first argument. Single quotes are recommended, because they prevent
the shell from trying to interpolate various characters that are special to both Perl and the
shell.
To use a single quote within your Perl code, use the "q/string/" notation, which perl
will treat just like single quotes.

$ perl -e 'print q/hello, world\n/;'
hello, world\n

(Single quotes (or q//) prevent \n from being interpreted as a newline.)

Another way to write double quotes is to use the "qq/string/" notation, which perl
will treat just like double quotes.


$ perl -e 'print qq/hello, world\n/;'
hello, world

works _exactly_ like

$ perl -e 'print "hello, world\n";'
hello, world

- Commandline flags -

6. '-e' can be used multiple times. Each script that follows each '-e' will be run
as though it were the next line of the script.

$ perl -e 'print "1\n"; die;' -e 'print "2\n";'
Died at -e line 1.

$ perl -e 'print "1\n"; ' -e 'print "2\n";die;'
Died at -e line 2.

7. '-c' can be used to check the syntax of a file without executing it, to see if there are
any syntactical bugs.

$ cat foo
prunt "hello, world\n";

$ perl -c foo
String found where operator expected at foo line 1, near "prunt "hello, world\n""
(Do you need to predeclare prunt?)
syntax error at foo line 1, near "prunt "hello, world\n""
foo had compilation errors.

(Basically, it is saying that it doesn't know what "prunt" is.)

8. '-n' and '-p' both implicitly add a wrapper around your code. '-n' adds this:

while (<>) {

# your code goes here

}

and '-p' adds this:

while (<>) {

# your code goes here

print $_;

}

The effect is that each line of each file listed on the commandline (or each line of
standard input, if no files are listed) will have your code applied to it. For the case
of '-p', after your code is applied, the line of data is printed.

We'll see how powerful these options are, especially when combined with '-e' and '-i'.

9. '-i' is the sysadmin's best friend. It causes all files opened via the "<>"
construct to be edited in-place. When used with an argument as in '-i.bak',
it is the paranoid sysadmin's best friend, because it makes backups of the file being
edited (taking the original name and appending a ".bak" to it to make the name of the
backup). Note that this is especially useful with the '-n' and '-p' arguments, both of
which implicitly use the "<>" construct.

10. '-M' followed by a module name will cause the perl module to be "use"d. The Shell module
allows shell commands to be used as perl functions, for example.

$ cd /
$ perl -MShell -e 'print "\nThe contents of the current directory ", pwd(), "\n", ls(-la),"\n";'

The contents of the current directory /

total 903
drwxr-xr-x 17 root root 1024 Nov 24 16:16 .
drwxr-xr-x 17 root root 1024 Nov 24 16:16 ..
drwxr-xr-x 2 root root 1024 Jul 13 1998 .automount
drwxr-xr-x 2 root root 2048 Dec 18 23:50 bin
drwxr-xr-x 2 root root 1024 Jan 10 01:39 boot
lrwxrwxrwx 1 root root 11 Sep 17 20:53 cdrom -> /mnt/cdrom/
-rw------- 1 root root 126976 Sep 20 17:34 core
drwxr-xr-x 2 root root 21504 Jan 16 00:15 dev
drwxr-xr-x 28 root root 3072 Jan 16 00:14 etc
drwxr-xr-x 14 root root 1024 Nov 8 17:10 home
drwxr-xr-x 4 root root 2048 Oct 26 01:03 lib
drwxr-xr-x 2 root root 12288 Sep 1 15:34 lost+found
drwxr-xr-x 8 root root 1024 Sep 16 19:46 mnt
dr-xr-xr-x 5 root root 0 Jan 15 18:41 proc
drwxr-xr-x 11 root root 1024 Jan 16 00:01 root
drwxr-xr-x 3 root root 2048 Nov 10 00:00 sbin
drwxrwxrwt 7 root root 1024 Jan 16 14:48 tmp
drwxr-xr-x 23 root root 1024 Jan 8 16:11 usr
drwxr-xr-x 22 root root 1024 Sep 3 00:51 var
-rw-r--r-- 1 root root 369087 Dec 30 01:53 vmlinuz
-rw-r--r-- 1 root root 368229 Dec 30 00:40 vmlinuz.old

11. '-0' (zero) followed by an up-to-three-digit octal value causes the input record separator
to become the character represented by the octal value. We aren't going to get into this too much
here, except for the special case of '-0777'. 777 represents no character, so it causes Perl to
read its input (from Standard Input, usually) in whole. Another special case is '-00', which
causes Perl to read data in one paragraph at a time.

12. '-d' runs your script under the perl debugger. See for simply the _best_ perl debugger tutorial I have
ever written, er, read.

13. '-v' prints the version of perl you are using. USE VERSION 5.004_04, or higher,
fer crying out loud!

- Text manipulation -

14. Perl excels at text mangling. To change the string ''
into '' in a group of files, do this:

perl -pi.bak -e 's/source\.syr\.edu\/~jdimpson/\/phred/g;' *.html


perl -pi.bak -e 's#source\.syr\.edu/;' *.html

15. Another, more complicated example would be to remove all multiple occurrences of the
words "the", "a", "an", and "and".

$ cat foo
a a
an an
an a an
the them the
the the the

$ perl -pi.bak -e 'BEGIN{ $word = "the|a|an|and"; }; s/\b($word)(\s+\1)+\b/$1/g ;' foo

$ cat foo
a
an
an a an
the them the
the

16. Remove comments from C code, foo.c

$ cat foo.c
/* some c code that does nothing */

int main() { /* this is main */

/* nothing here... guaranteed 100% bug free and Y2K-compliant! */

} /* end of main */

/* This software is copyright 1999 by Jeremy D. Impson. You must pay me $1,000,000.54 drachma
in order to use it. */

$ perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c

int main() {

}

17. Print first 80 columns of a file.

$ cat foo

$ perl -pi -e 'chomp(); $_ = substr($_, 0, 80); $_ .= "\n";' foo
$ cat foo

18. Get rid of beginning and ending spaces

$ perl -pi -e 's/^[ \t]+//; s/[ \t]+$//;' foo

- Command automation -

19. Mass renaming of files (change *.htm to *.html):

$ ls *.htm | perl -ne 'chomp(); $file = $_; s/\.htm$/\.html/; print "mv $file $_\n";' | sh

20. Mass renaming of files (uppercase to lowercase):

$ ls
Storable-0.6@3..tar.gz freeamp-1.1.0 ipac-1.00.tar.gz palm lug proj
bin public_html

$ ls *[A-Z]* | perl -ne 'chomp(); $nfile = lc($_); print "mv $_ $nfile\n"' | sh
$ ls
storable-0.6@3..tar.gz freeamp-1.1.0 ipac-1.00.tar.gz palm lug proj
bin public_html

- Miscellaneous -

Nice for printing out path variable:

$ perl -e '@path = split /:/, $ENV{PATH}; map { print $_,"\n"; } @path;'

- References -

The Perl FAQ (available at above URL) section 3 and 4

man perlrun


using flock to protect critical sections in shell scripts

This isn't about a shell script, it's about a really cool technique to apply in shell scripts. Have you ever been worried about multiple instances of a shell script running because they might overwrite or corrupt the data or devices they are working on? Here's a way to prevent that.

There's a Unix system call named flock(2). It's used to apply advisory locks to open files. Without exhausting the subject, it can be used to synchronize access to resources across multiple running processes. Note that I said access to resources, not just access to files. While flock(2) does solely act on files (actually, on file handles), the file itself need not be the resource to which access is being controlled. Instead, the file can be used as a semaphore to control access to a critical section, in which any resource can be accessed without concurrency concerns. (Howdja like that pair of sentences?) Note that flock(2) performs advisory locking, which is another way of saying that all parties accessing the resource in question have to agree to abide by the locking protocol in order for it to work. That's still useful to us. flock(2) is used in this manner to protect critical sections in lots of executable programs.

It turns out that, in addition to the flock(2) system call, there is also a flock(1) command line tool. It's a simple wrapper around the flock(2) system call, making it accessible to shell scripts. It's part of the util-linux-ng package.  You can certainly use it to restrict write access to files, a reasonable thing to do in some shell scripts, but there's a more general technique to be had: flock(1) can be used as a semaphore!

I'll start with the general form, discuss a couple of alternative forms, and then give some practical examples. Here's the general form, which is good for serializing access to a resource. It creates a queue, such that each process waits its turn to utilize the resource:


ME=`basename "$0"`;
LCK="/tmp/$ME.LCK";
exec 8>$LCK;

flock -x 8;
echo "I'm in ($$)";
sleep 20; # XXX: Do something interesting here.
echo "I'm done ($$)";

Everything after the call to flock is the critical section, where you can operate on whatever resource that you need to control access to.

You may be wondering about the use of "exec". Normally "exec" in a shell script is used to turn over control of the script to some other program. But it can also be used to open a file and name a file handle for it. Normally, every script has standard input (file handle 0), standard output (file handle 1) and standard error (file handle 2) opened for it. The call "exec 8>$LCK" will open the file named in $LCK for writing (creating it if necessary), and assign it file handle 8. I picked 8 arbitrarily. The call to "flock -x 8" tells flock to exclusively lock the file referenced by file handle 8. The state of being locked lasts after the flock call, because the file handle is still valid (think of it as still in scope). That state will last until the file handle is closed, typically when the script exits.
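To make the file handle business concrete, here's a minimal sketch (not part of the scripts in this post) that opens a scratch file on handle 8, writes through the handle twice, then closes it. The scratch file and the names DEMO and CONTENTS are made up for the illustration.

```shell
#!/bin/sh
# Sketch only: DEMO is a hypothetical scratch file.
DEMO=$(mktemp)

exec 8>"$DEMO"        # open $DEMO for writing as file handle 8
echo "first" >&8      # write through the handle, not the file name
echo "second" >&8     # the handle is still open, so this lands after "first"
exec 8>&-             # close file handle 8 explicitly

CONTENTS=$(cat "$DEMO")
echo "$CONTENTS"
rm -f "$DEMO"
```

The handle stays usable between commands, which is the same property that keeps a lock held after the flock command itself has returned.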

You can see the locking in action if you run this script twice, ensuring that the second one is started before the first one finishes its call to sleep. I do this in the following example by running the script (called flocktest0) once in the background (by using the "&" to background it), then immediately running it again. Because the script sleeps for 20 seconds, the second call will start before the first one is done (and before it's given up the lock). The output is messed up because the first call is put in the background, but then prints to output, causing it to interfere with the shell's output.

jdimpson@argentina:~$ flocktest0 &
[1] 13978
jdimpson@argentina:~$ I'm in (13978)
I'm done (13978)
I'm in (13982)
I'm done (13982)
[1]+ Done flocktest0

Notice that the second call to flocktest0 doesn't say "I'm in (...)" until after the first call to flocktest0 says "I'm done (...)", even though the second call was started before the first call was finished.

So you can imagine a real script doing something interesting in the critical section rather than "sleep 20", and you can be sure that only one call to that script is doing its thing at a time, even if it's invoked several times in a row, and each call will eventually get its turn.
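If you'd rather watch the queueing happen inside a single script, here's a rough sketch. It starts two workers in the background, each of which takes the lock, logs "in", pretends to work, and logs "done". The names LCK, LOG, RESULT and worker() are mine, and the temp files are just for the demo.

```shell
#!/bin/sh
# Sketch only: LCK, LOG, RESULT and worker() are illustrative names.
LCK=$(mktemp)
LOG=$(mktemp)

worker() {
    exec 8>"$LCK"             # each worker opens the lock file itself
    flock -x 8                # block until the lock is ours
    echo "in $1" >> "$LOG"
    sleep 1                   # stand-in for the interesting work
    echo "done $1" >> "$LOG"
}

worker A &                    # "&" runs each worker in its own subshell,
worker B &                    # so the exec above doesn't leak into this shell
wait

RESULT=$(cat "$LOG")
echo "$RESULT"
rm -f "$LCK" "$LOG"
```

Which worker goes first isn't predictable, but each "in" line should be followed directly by the matching "done" line, never by the other worker's "in".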

Whereas the general form above is for serializing parallel access to some resource by creating a queue, the following alternate form is used when you want only one process to be accessing a resource (or performing some function) at a time, but you don't want to create a queue of processes. Instead, subsequent invocations will exit (or do something else) rather than queue up. Once the initial invocation exits, the next invocation will be allowed to execute. Here's the alternate form.


ME=`basename "$0"`;
LCK="/tmp/$ME.LCK";
exec 8>$LCK;

if flock -n -x 8; then
    echo "I'm in ($$)";
    sleep 20; # XXX: Do something interesting here.
    echo "I'm done ($$)";
else
    echo "I'm rejected ($$)";
fi

The primary difference is the "-n" flag to flock, which tells it not to block, but to exit with an error value if the lock is already held. It's put in an if statement, which will do the "interesting work" (call sleep in this example) in the true clause, and will report that it can't do interesting work in the false clause.

And here's what happens when I invoke it four times in rapid succession, then a fifth time after waiting 20 seconds:

jdimpson@argentina:~$ flocktest1 &
[1] 14644
jdimpson@argentina:~$ I'm in (14644)
I'm rejected (14648)
jdimpson@argentina:~$ flocktest1
I'm rejected (14651)
jdimpson@argentina:~$ flocktest1
I'm rejected (14654)
jdimpson@argentina:~$ I'm done (14644)
I'm in (14657)
I'm done (14657)
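The same rejection can be reproduced without typing fast. In this sketch a background subshell grabs the lock and holds it for a couple of seconds, and the main script then makes a non-blocking attempt on a second file handle. LCK and STATUS are names I made up, and the "sleep 1" is a crude way to let the holder get there first.

```shell
#!/bin/sh
# Sketch only: LCK and STATUS are illustrative names.
LCK=$(mktemp)

# Background holder: takes the lock and keeps it for 2 seconds.
( exec 8>"$LCK"; flock -x 8; sleep 2 ) &
sleep 1                       # crude: give the holder time to take the lock

exec 9>"$LCK"                 # our own handle on the same file
if flock -n -x 9; then
    STATUS="acquired"
else
    STATUS="rejected"         # flock -n returned non-zero instead of waiting
fi
echo "$STATUS"

wait
rm -f "$LCK"
```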

Again, you can imagine a real script that does something more interesting than sleep, which will benefit from the fact that only one invocation will actually do anything, and every other invocation will just exit. Or you could create a reliable modal function by having the false clause send a message (or call "kill") to the first invocation, to make it shut down. So calling the script the first time starts the function, and calling it the second time stops the function. It won't lose track of state.
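Here's one way the modal idea might look. It's a sketch under assumptions: the job is just "sleep 30", its PID is kept in a hypothetical PIDFILE, and toggle(), LCK and STATE are names I invented. A real version would need to care about stale PID files.

```shell
#!/bin/sh
# Sketch only: toggle(), LCK, PIDFILE and STATE are illustrative names.
LCK=$(mktemp)
PIDFILE=$(mktemp)

toggle() {
    (
        exec 8>"$LCK"
        if flock -n -x 8; then
            # "On": start the job and record its PID for a later call.
            sleep 30 &
            echo $! > "$PIDFILE"
            wait                  # hold the lock for as long as the job lives
        else
            # "Off": the lock is held, so stop the recorded job.
            kill "$(cat "$PIDFILE")"
        fi
    )
}

toggle &            # first call switches the job on
sleep 1             # crude: let it take the lock and write the PID
toggle              # second call finds the lock held and switches it off
wait                # the first call returns once its job is gone

# Check the state: if the lock is free, nothing is running.
exec 9>"$LCK"
if flock -n -x 9; then STATE="off"; else STATE="on"; fi
exec 9>&-
echo "$STATE"
rm -f "$LCK" "$PIDFILE"
```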

OK, maybe you need help imagining these things. Here are two realistic examples, one for the general form, and one for the alternate. Both examples center around my use of MythTV, specifically, around the mythfrontend program. MythTV is an open source PVR/DVR application, and mythfrontend is the component of MythTV that plays the videos, and with which the user directly interacts.

MythTV also comes with a simple command line tool called mythtvosd, which can be used to send messages to mythfrontend, which will write the message to the screen by overlaying it over the video being played. I decided it would be cool to display the sender and subject of all email that I receive. I already use procmail to process my incoming email, so it was easy to insert one procmail rule that strips out the sender and the subject and calls mythtvosd with that information, so I could see it on my TV. It's kind of like biff, for those of you who really know your historic Unix applications.

Trouble is, I tend to get two or three email messages at a time, because I use IMAP to download email from my mail servers via a cron job. procmail and mythtvosd are able to process all three messages faster than it takes mythfrontend to scroll the sender and subject strings across the screen. So if procmail calls mythtvosd three times in rapid succession, I will only see on the screen the results for the last email (because subsequent calls to mythtvosd have the effect of canceling the previous one). So I used the general form to create a queue, ensuring that all three emails get scrolled across my screen.

The relevant part of .procmailrc to invoke the following code is

| $HOME/bin/mythemail

And mythemail looks like this:

ME=`basename $0`;
(
     exec 8>/tmp/$ME.LCK;
     flock -x 8
     mythtvosd --template=scroller --scroll_text="mail from $FROM, regarding $SUBJ"
     sleep 10;
) &

The code that sets the values of $FROM and $SUBJ has been removed; it's a complicated hack that doesn't do anything to make my point about flock.

So that I don't create a long queue that backs up all my email just to display notifications on my TV, I use process control ("&" again) to background all the processes blocked on the lock file waiting their turn.

The "sleep 10" is needed because there's no way to know when the text has finished its scroll across the screen, but 10 seconds works well for me. It's actually not long enough for very long from/subject strings (and/or very wide screens), but it's enough to give a sense of how many emails have been received.
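The point of backgrounding the locked section is that the caller (procmail, in my case) gets control back right away while the notifications line up behind the lock. Here's a sketch of that shape with the MythTV parts swapped out: echo stands in for mythtvosd, a one second sleep stands in for the scroll time, and enqueue(), LCK, LOG, ELAPSED and COUNT are made-up names.

```shell
#!/bin/sh
# Sketch only: enqueue(), LCK and LOG are illustrative; echo stands in for mythtvosd.
LCK=$(mktemp)
LOG=$(mktemp)

enqueue() {
    (
        exec 8>"$LCK"
        flock -x 8
        echo "mail from $1" >> "$LOG"   # placeholder for the real notification
        sleep 1                          # placeholder for the scroll time
    ) &
}

START=$(date +%s)
enqueue alice
enqueue bob
enqueue carol
ELAPSED=$(( $(date +%s) - START ))      # time the caller spent waiting

wait                                    # only so the demo can inspect the log
COUNT=$(wc -l < "$LOG")
echo "caller waited ${ELAPSED}s, $COUNT notifications shown"
rm -f "$LCK" "$LOG"
```

The three calls return almost immediately, yet all three entries end up in the log, one per second.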

The other example has to do with my new Logitech LX710 keyboard, which has lots and lots of extra buttons for playing music and starting email clients and so forth (although half the buttons don't work under X/GNOME--showkeys sees them, but xev does not). I mapped some of the buttons to control mythfrontend. One button activates mythfrontend, one pauses the video and others forward and rewind through the video. This time the trouble is when I accidentally press the activate button more than once. Repeated presses of the on button would cause mythfrontend to start multiple times. That wasn't what I wanted. Once it is on, I don't want it to turn on again.

So I used the alternate form to make that happen. On my system, mythfrontend is already a shell script which eventually calls mythfrontend.real, so I only had to add the following code snippet to the top of the script:


ME=`basename $0`;
LOCK="/tmp/$ME.LCK"; # assumed path; the original definition of $LOCK is not shown
exec 8>$LOCK;

if flock -n -e 8; then :
else
    echo "Can't get file lock, mythfrontend already running";
    exit 1;
fi

# XXX: rest of mythfrontend...

I left the true clause blank (just a colon), and put an exit in the false clause. This let me keep all the lock handling stuff at the top, and makes it very easy to insert into the beginning of a shell script, which is nice because I'm inserting this into a script maintained by someone else, so I'll have to re-insert it during every upgrade. I could have created my own script to contain the locking code, then have it call mythfrontend, but then anyone/anything that calls mythfrontend directly wouldn't go through the locking code, and my scheme wouldn't work.

We could make this script even more useful by doing something more interesting in the false clause than just exiting. If I wanted to make my keyboard button work like a modal on/off switch, I would have the false clause shut down the running instance of mythfrontend. However, I don't want that, because my initial problem was that I was pressing the activate button by mistake, and I wouldn't want to interrupt the video accidentally.

Some notes and limitations:

In the general form, there's no guarantee in what order the processes that are queuing up to access the resource will be served. It's probably a function of the scheduling algorithm used on your system, but will also be affected by how long each process holds the lock. Starvation is a possibility. I probably should mention that these locking techniques aren't intended to scale to high demand or for long-running processes. Use an enterprise-quality software framework for that sort of thing. These techniques, like all shell scripts (in my personal scripting philosophy), are for short-lived or infrequently demanded tasks.

The locked file remains locked until it is explicitly unlocked or until the script holding the lock closes the file handle. "flock -u N" will explicitly unlock file handle N. Also, all shell scripts (indeed, all processes) close any file handles that remain open when they exit. Finally, a script can explicitly close file handle N by doing "exec N>&-". I tend to design my scripts so that there's no need to explicitly close file handles or perform the unlock call, for reasons of scalability similar to those in the previous paragraph.
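For completeness, here's a small sketch of the explicit unlock. It opens the same lock file on two handles, locks one, and shows that a non-blocking attempt on the other fails before "flock -u" and succeeds after. This assumes Linux behaviour, where two separate opens of the same file can contend with each other even inside one script. LCK, BEFORE and AFTER are illustrative names.

```shell
#!/bin/sh
# Sketch only: LCK, BEFORE and AFTER are illustrative names.
LCK=$(mktemp)

exec 8>"$LCK"                 # first handle on the lock file
exec 9>"$LCK"                 # second, independent handle on the same file

flock -x 8                    # lock via handle 8
if flock -n -x 9; then BEFORE="acquired"; else BEFORE="blocked"; fi

flock -u 8                    # explicitly unlock handle 8
if flock -n -x 9; then AFTER="acquired"; else AFTER="blocked"; fi

exec 8>&- 9>&-                # close both handles (also drops the lock on 9)
echo "$BEFORE then $AFTER"
rm -f "$LCK"
```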

While I prefer using "exec" to create and name lock file handles, another alternative is to use subshells. Instead of

exec 8>lockfile;
flock -x 8;
# XXX: do interesting work here

you can do

(
flock -x 8;
# XXX: do interesting work here
) 8>lockfile;

I prefer the former, because it has less impact on the overall structure of your script and it keeps most of the lock file handling code in one place. Subshells also do funny things with variable scope, so I don't use them unless I need them. I'm not sure which one is easier to understand; they both use esoteric behaviours of the shell, specifically how you work with file handles.
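The variable scope thing is easy to demonstrate. In this sketch (COUNT, AFTER_SUBSHELL, AFTER_EXEC and LCK are just names for the demo), an assignment made inside the subshell form is gone once the parentheses close, while the same assignment made under the exec form sticks.

```shell
#!/bin/sh
# Sketch only: all variable names here are illustrative.
LCK=$(mktemp)
COUNT=0

# Subshell form: the critical section runs in a child shell.
(
    flock -x 8
    COUNT=1                   # changes only the subshell's copy
) 8>"$LCK"
AFTER_SUBSHELL=$COUNT

# exec form: the critical section runs in this shell.
exec 8>"$LCK"
flock -x 8
COUNT=2                       # same shell, so the change sticks
exec 8>&-                     # close the handle, releasing the lock
AFTER_EXEC=$COUNT

echo "after subshell form: $AFTER_SUBSHELL, after exec form: $AFTER_EXEC"
rm -f "$LCK"
```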

For reasons I don't understand, at least with bash version 3.2.39 as compiled for Ubuntu 8.04.1, you're limited to single digit file handle numbers. You should be able to get up to 255, but I get various errors when I use anything greater than 9. While there are a few command line tools that know about more than stdin, stdout, and stderr file handles, they are rare, and knowledge of how to use file handles in the shell is rarer still, so running out of file handle numbers shouldn't be a problem.

According to POSIX, the flock(2) system call has a limitation in that it isn't required to work over remote mounted Network File Systems (NFS). The flock(1) command, being a wrapper around the system call, inherits this limitation if present for your system. There's another system call, fcntl(2), which will work over NFS. Unfortunately, I don't know of any command line way to utilize fcntl(2).

fcntl(2) also has a mandatory locking capability. But be aware that if you were to use mandatory file locking as a semaphore to control access to another, arbitrary resource, the mandatory quality is not transitive to the arbitrary resource, for the simple fact that the resource can still be accessed by a rogue process that chooses not to use the lock. It's a moot point at the moment, but something to keep in mind should someone make a fcntl(1) command line tool.

Finally, if a locked file gets deleted, subsequent lock attempts will succeed even if something is holding an old lock. So if you're protecting access to an arbitrary resource, be aware of the access control permissions of the lock file. You don't want anyone to delete the file from under you.
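Here's a sketch of that failure mode, so you can see it for yourself. A background subshell holds the lock, the lock file is then deleted, and a newcomer opening the same path gets a brand new file and therefore an uncontested lock. LCK and RESULT are made-up names, and the "sleep 1" is a crude way to let the holder go first.

```shell
#!/bin/sh
# Sketch only: LCK and RESULT are illustrative names.
LCK=$(mktemp)

# Background holder: takes the lock and keeps it for 2 seconds.
( exec 8>"$LCK"; flock -x 8; sleep 2 ) &
sleep 1                       # crude: give the holder time to take the lock

rm -f "$LCK"                  # someone deletes the lock file...
exec 9>"$LCK"                 # ...so this creates a new, different file
if flock -n -x 9; then
    RESULT="acquired"         # two processes now both think they hold the lock
else
    RESULT="rejected"
fi
echo "$RESULT"

wait
rm -f "$LCK"
```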

foreachmail: Run a program on each email in an mbox-style mailbox

Here's one I used today to split out a single mbox file into multiple ones based on the date the email was sent. foreachmail reads an mbox file on standard input, and takes a command (including a shell command line) as an argument. It iterates through the mbox file, finding each individual email message. It then executes the command for each individual email message, sending the email into the command as standard input. It was written in December 2004.

Here's a simple, if trivial, example:

cat mbox-file | foreachmail "grep '^From:' | sed -e 's/^From: //' "

In the end, this is equivalent to "cat mbox-file | grep '^From:' | sed -e 's/^From: //' ", BUT, internally, there is a major difference. foreachmail determines the start and finish of each single email, and runs the given command only on that message. The given command only has to worry about processing the contents of one mail message. That may not be an issue if you can implement your solution in a line-oriented, single pass over the data. But if how you process each message depends on some content in the message, foreachmail makes it easier.

Here's a complex example, which is the one I used today. Not only is it complex, it is for a very specific purpose unique to my environment. Also it relies on another custom command (dateformat) that is out of scope for this article. But I wanted to show another example of how it can be used. I used it to break a giant list of spam into a bunch of smaller spam files. Each of the smaller files contains every spam received on one day.

foreachmail '(cat > /tmp/mytmp; DAY=`cat /tmp/mytmp | egrep "(single-drop| with SMTP|for <.*>)" | sed -e "s/.*; //" | dateformat -t "%Y-%m-%d%n"`; cat /tmp/mytmp >> spam.$DAY ) ' < allspam

I won't explain the example command. It's sufficient to understand that it reads the email message on standard input, figures out the date the email was received, and appends it to a file named for that date.

As you can see, it is possible to write arbitrarily complex scripts in the first argument to foreachmail.

foreachmail 'i=0; while read l; do if [ "$l" = "" ]; then i=1; fi; if [ $i -gt 0 ]; then echo "$l"; fi; done;' < mbox-file

This one strips out the headers from every email in the mbox-file, printing only the bodies. If you didn't have foreachmail, you might try to do something like this:

grep -v '^[a-z0-9A-Z-]*: ' < mbox-file

This omits any line that has a word ending with a colon at the beginning of a line. This is a reasonable heuristic for removing email headers in an mbox file, but it's imperfect. It will also strip out any such lines if they fall within the body of the message. And it won't strip out the initial "From " line of the header (note the space), which is particular to the mbox format. Nor does it realize that some headers can be multi-line, where subsequent lines are indented but don't repeat the header tag (e.g. Return-path:), so it won't strip out every line in multi-line header tags. foreachmail, knowing the structure of email, doesn't have these problems.
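Once a command is only ever handed one message, header stripping boils down to "throw away everything up to the first blank line", which sed can do on its own. Here's a sketch run directly on a made-up message (MSG and BODY are names for the demo; the message text is invented). It deliberately includes a continuation line in the header and a header-looking line in the body.

```shell
#!/bin/sh
# Sketch only: MSG is a made-up single message, as a per-message command would see it.
MSG='From alice@example.com Mon Dec  6 10:00:00 2004
From: Alice <alice@example.com>
Subject: a test
 with a continuation line

Note: this body line looks like a header.
Bye.'

# Delete from line 1 through the first empty line, i.e. the whole header.
BODY=$(printf '%s\n' "$MSG" | sed '1,/^$/d')
echo "$BODY"
```

The "Note:" line survives because it comes after the blank line, and the continuation line goes away with the rest of the header.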

Here's the code.

If you don't give it a script to run, it defaults to running procmail. It would probably be better to not do that, but instead provide usage information and exit.

This skips messages containing the subject "DON'T DELETE THIS MESSAGE -- FOLDER INTERNAL DATA", which might be unique to my environment. It comes from the use of pine and alpine email clients which keep metadata in a bogus email message at the beginning of the mbox file. The bogus message always has this subject line, so it can be used to identify and ignore the message.

The while loop finds the start and end of each email sent from standard input. It then passes the message to the mailproc() subroutine. mailproc() finds certain headers (currently only using the subject header, as discussed in the previous paragraph), then executes the given command and sends it the contents of the current message.
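To give a feel for that loop without the Perl, here's a rough shell rendition of the same idea. It is not the foreachmail code: foreach_msg(), MBOX, PER_MESSAGE and WHOLE_FILE are names invented for this sketch, the mailbox is made up, and it naively treats every line starting with "From " as the start of a new message.

```shell
#!/bin/sh
# Sketch only: a naive shell stand-in, not the actual foreachmail code.
foreach_msg() {
    cmd=$1
    msg=""
    while IFS= read -r line; do
        case $line in
            "From "*)
                # A new message begins: hand the finished one to the command.
                if [ -n "$msg" ]; then printf '%s' "$msg" | sh -c "$cmd"; fi
                msg=""
                ;;
        esac
        msg="$msg$line
"
    done
    # Don't forget the last message in the mailbox.
    if [ -n "$msg" ]; then printf '%s' "$msg" | sh -c "$cmd"; fi
}

# Made-up two-message mailbox.
MBOX='From a@example.com Mon Dec  6 10:00:00 2004
Subject: one

body one
From b@example.com Mon Dec  6 11:00:00 2004
Subject: two

body two'

PER_MESSAGE=$(printf '%s\n' "$MBOX" | foreach_msg "grep -c '^Subject:'")
WHOLE_FILE=$(printf '%s\n' "$MBOX" | grep -c '^Subject:')
echo "$PER_MESSAGE"
echo "$WHOLE_FILE"
```

Run per message, grep counts one Subject line twice; run over the whole file, it counts two once. That's the difference between the command seeing one email at a time and seeing the mailbox as a stream of lines.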

foreachmail avoids having to fork()/exec() and do child process maintenance by taking advantage of one of the many ways Perl lets you do interprocess communication. In this case, it uses the open() call, with a pipe ("|") followed by the command. See the "perlipc" man page for more information.

Note the parentheses following the pipe in the open() call. This creates a subshell, which has the advantage of letting you pass multi-command scripts and have all the standard I/O work the way you'd expect. The significance of this is illustrated in the following example. Assume you run foreachmail like this:

foreachmail "rm /tmp/file; cat > /tmp/file"

You'd be expecting foreachmail to delete the temporary file, then write the contents of each email message into the same file. (There's no reason you'd do exactly this, but you can imagine a more complicated example that would follow this pattern.) Without the parens in the open() command above, the email message gets sent ONLY to the standard input of the "rm" command (which ignores it), and does not get sent to that of the "cat" command (which will therefore block indefinitely waiting for input). With the parentheses, the above command works as expected, because it causes the pipe to pass the message to the standard input of the entire command list, not just to the first command in the list. This lets you do stuff like read the header up to the blank line used to separate header from body, then discard the body.

As it turns out, the "formail" utility, part of the "procmail" package, has a "-s" flag, which works similarly to foreachmail. (Eerie name similarity, too.) Major difference is that formail -s needs to be passed the name of an executable binary, not a string representing a shell script. However, that can be dealt with by giving it the name of a shell binary. So the equivalent to the first example using formail is

cat mbox-file | formail -s bash -c "grep '^From:' | sed -e 's/^From: //' "

So use formail if you have it, but if you don't and do have Perl installed, foreachmail is a good option.

Some good improvements would be a usage statement (instead of defaulting to procmail), some signal handling so interrupts kill the entire program rather than just the running subshell, and some better error handling (that last one is always true).

ra2wav: Convert RealAudio files to WAV files

This is getting silly. But this is the end of the mplayer/mencoder theme.

I've used mencoder, part of the mplayer suite, to do lots of different types of file format conversions. Here's another one, only this time it uses mplayer to do the conversion. Normally, mplayer plays A/V data, while mencoder transcodes A/V data between different formats.

However, mplayer has a "-ao" flag ("audio out"), which is normally used to control which hardware interface to send sound to (and/or how to send the audio data). E.g. It can choose between the ALSA or the OSS interfaces on a Linux system, or the sun interface on a Solaris system, or the OSX interface on a Mac. It is used to specify the specific sound card if more than one is present. It can also choose to send the audio to another abstraction layer such as SDL, esd, arts, or jack (all of which typically end up sending to a real sound card, although sometimes one on another computer). It can even use specialized hardware like that found in those TV capture cards that also output audio and video signals.

For some reason, mplayer's -ao flag has the ability to send audio data to a plain PCM/WAV data file. This data is more or less identical to what it would send to sound hardware (at least, sound hardware found on PC-based computers). It creates a legitimate WAV file, but note that the data generated this way is not compressed.

sysadmin@artoo:~$ ra2wav
Usage: ra2wav .ra


You can see that the resulting wav file is very large. A logical next step would be to use something like lame to convert it to an mp3 or ogg file.

Here's the code


Note that, because mplayer really is a player but we're using it to transcode data, it needs to be told not to use a video codec ("-vc null") or to actually display video ("-vo null"). Note this is also done when using mplayer to identify data about the file using the "-identify" flag. The word "fast" tells the pcm driver to process faster than real (clock) time.
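Putting those flags together, the heart of such a conversion might look something like the sketch below. Treat it as an approximation rather than the script itself: the function layout and the MPLAYER override variable are my inventions for illustration, and I'm assuming the pcm driver's "file=" suboption is how the output name gets passed. File names containing ":" or "," would confuse the suboption parsing.

```shell
#!/bin/sh
# Sketch only: MPLAYER is a hypothetical override, handy for dry runs.
ra2wav() {
    for i in "$@"; do
        j="${i%.ra}.wav"      # song.ra -> song.wav
        # No video decoding, no video output, audio written to a WAV file.
        ${MPLAYER:-mplayer} -vc null -vo null -ao "pcm:fast:file=$j" "$i"
    done
}

# Usage (needs mplayer installed and a real RealAudio file):
#   ra2wav song.ra
```

Setting MPLAYER=echo prints the command line that would be run instead of running it.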

In writing this post, I couldn't remember why I decided to write this using mplayer. But in trying to create the "correct" version using mencoder, I couldn't figure out how to do it. I expected this to work:

mencoder -oac pcm -o "$j" "$i"

However, it doesn't. It says "WARNING: OUTPUT FILE FORMAT IS _AVI_. See -of help.", then "REALAUDIO file format detected.", and finally "Video stream is mandatory!". Apparently mencoder assumes it's working on a video file (an AVI file, by default). I can't figure out how to make it realize it's only to process audio data. There is an output format flag ("-of") that has a "rawaudio" format, but I couldn't make that work--it still insisted that a video stream is mandatory. The man page says "-of" is still in beta; perhaps someday there'll be a "wav" or "pcm" format. If anyone has a suggestion, I'd love to learn how to make mencoder do what I want.