nprlive: Listen to NPR's live audio stream

This is a horrible, horrible hack. I only put it up because it fits in my mplayer theme. It's a hack for several reasons: 1) it relies on the current line-by-line layout of the XML/HTML rather than parsing it as a proper document, and 2) it relies on how NPR's web site currently works (although, surprisingly, it still works over six months after it was originally written). I wrote it one night when I was sick of listening to any of the audiobooks I had, as well as BBC 7 (you'll note that this script is kinda-sorta like my first script posted here, bbc7live).

BEGIN EXAMPLE
END EXAMPLE

Here's the code. I hope it doesn't lower whatever respect you may have for me.
BEGIN CODE
END CODE

The script is an infinite loop, to compensate for any breaks in the program stream. It uses the GET command, which is part of the libwww-perl library. Note that GET is called twice, first within a pair of backticks ("`"), then outside of the backticks. The URL is awful, which is NPR's fault. It could probably be simplified; most URLs that long exist because the programmers are lazy and embed all the state data into the call whether or not it's needed.

But NPR doesn't deserve all (or even much) of the blame for this hack. First I grep for a keyword, then count positions in the grepped line, then do some string replacements, more grep, more counting, and more replacements. The result is an mms:// URL that is passed to mplayer via xargs.
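
In case that description is too abstract, here's roughly the shape of the thing. This is a sketch only: the real URL, the keyword, and the field positions aren't reproduced here, so every specific below is made up.

BEGIN EXAMPLE
#!/bin/sh
# a rough sketch of nprlive, not the real script; the URL, keyword, and
# field positions below are invented placeholders
while true; do
        # first GET (inside backticks) fetches the page and digs out a link
        PAGE_URL=`GET 'http://www.npr.org/SOME-LONG-URL' | grep -i 'keyword' | cut -d'"' -f2`;
        # second GET (outside backticks) fetches that page; the grep/sed dance
        # turns it into the mms:// stream URL, which xargs hands to mplayer
        GET "$PAGE_URL" | grep -i 'mms' | sed -e 's/.*"\(mms:[^"]*\)".*/\1/' | xargs mplayer;
        sleep 1;
done
END EXAMPLE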

It's a hack, and very brittle.

mmplayer_encode: Re-encode AVI files for playing on PDAs with mmplayer

This one is going to be a bit confusing, because it will talk about two different pieces of software with similar names: mplayer and mmplayer.  mplayer is a Linux/Unix movie player and converter.  mmplayer is a movie player designed for PDAs and other handheld devices.  mmplayer is available here: www.mmplayer.com/

I still use a Palm OS-based PDA.  I have a Treo 650 cell phone.  I was going to hold out for Android to become prevalent, but that might take a while, so I'm now eyeing the Palm Centro, assuming it can do Bluetooth Dial-Up Networking (DUN).  Palm lost the PDA/handheld/smart phone wars years ago, around 2000, when they didn't realize that I/O was the key to victory.  But I still have a soft spot for them.  And several hundred lines of custom Perl, shell, and C code that I use daily to synchronize my Palm phone with the rest of my life.  ANYWAYS, I use a tool called mmplayer to watch videos on my Treo.  Actually, I haven't done that since the novelty wore off (about 15 days after purchasing it), but that's not my point.

My point, and I do have one, is that this script, mmplayer_encode, uses mplayer and mencoder to transcode movies into the resolution, framerate, and audio bitrate that I consider near-optimal for my Treo 650.  The Treo has a 320x240 screen resolution, and while mmplayer can auto-scale, doing so requires CPU resources.  Similarly, the built-in speaker is not Bose quality, so a high audio bitrate is wasteful.  This script automatically rescales everything, resulting in a smaller file.  It also changes the video bitrate, but I'm not sure if that's technically changing the frames/second rate.  It might be the same thing.

BEGIN USAGE
jdimpson@artoo:~/bin$ mmplayer_encode
Usage: mmplayer_encode <avi to recode for mmplayer>
END USAGE

On completion, it creates a file with the same name as the original, but replaces *.avi with *_treo650.avi.  One thing to note is that this script really should be called mmplayer_encode_treo650 or something, or even mmplayer_encode_jdimpson, because it's really configured to run for my particular phone.  A more general purpose version would have a lot more tuning options.

BEGIN CODE
END CODE

It starts up with the command line check and usage statement, then defines some variables.  These are the ones that would be reconfigurable with command line switches in a more perfect world.  The SCALE and XSCALE values control the resolution (separated into two variables because they are needed by separate applications).  VBITRATE affects the video bitrate.  ASAMPLE sets the audio sample size, and ABITRATE sets the audio compression ratio.
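
For illustration, the variable block looks something like this.  The values shown here are stand-ins, not the numbers from the real script.

BEGIN EXAMPLE
# illustrative values only; the real script's numbers aren't reproduced here
SCALE="320:240"      # resolution in the form mencoder's scale filter wants
XSCALE="320x240"     # the same resolution in the WxH form another tool wants
VBITRATE=300         # video bitrate, kbit/s
ASAMPLE=22050        # audio sampling parameter (a guess: sample rate in Hz)
ABITRATE=64          # audio bitrate, kbit/s
END EXAMPLE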

Then it defines four bash functions.  I'll skip these for a second.  After the functions are defined, it does the following:
  1. Kicks off one thread to do a two-pass re-encoding of JUST the video.
  2. Waits 5 seconds for the video thread to start up.
  3. Kicks off another thread to dump out, then re-encode, the audio.
  4. Waits for both threads to finish.
  5. Then merges the new audio back with the video.
The two-pass video re-encoding (the first two bash functions defined) consists of two calls to mencoder.  The two passes are part of how the lavc (libavcodec) driver in mencoder optimizes the re-encoding process.  To be honest, I didn't do any exhaustive testing to see what the right re-encoding options were.  I think I just found these two command lines somewhere on the net and incorporated them.

Similarly, I don't recall now why I split the audio out and process it separately.  It's either because one has to do that when using the two-phase video processing, or, more likely, because I wanted to take advantage of my then dual-CPU system (since replaced by a single Dual Core CPU system).  The audio is dumped out with mplayer, then resampled with lame.

The audio and video are merged back together with avimerge.  This is part of the transcode suite of tools, which has a lot of functionality in common with mplayer/mencoder, including the fact that both are hard to use.  I've focused my limited mental faculties on learning mplayer, but to date I haven't figured out how to use mplayer/mencoder to merge audio with video.  I didn't try hard, because I quickly found avimerge, and I'm not a glutton for punishment.  But it would be nice to remove the dependency on the transcode software package.
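
Putting the pieces together, the overall flow looks roughly like the following.  This is a sketch, not the real script: video_pass1 and video_pass2 stand in for the first two bash functions, and the mplayer, lame, and avimerge option strings are assumptions.

BEGIN EXAMPLE
# a sketch of the workflow; function names and options are illustrative
( video_pass1 > video.log 2>&1; video_pass2 >> video.log 2>&1 ) &
VIDPID=$!

sleep 5

( mplayer -novideo -ao pcm:file=audio.wav "$IN" > audio.log 2>&1
  lame -b $ABITRATE audio.wav audio.mp3 >> audio.log 2>&1 ) &
AUDPID=$!

wait $VIDPID $AUDPID

avimerge -i video.avi -p audio.mp3 -o "${IN%.avi}_treo650.avi" >> video.log 2>&1
END EXAMPLE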

BTW, everything gets written to an audio or video log file for debugging.  The result is a file with lower CPU requirements, optimized for 320x240 resolution displays.

flv2avi: Transcode Flash Video files into AVI files (or any other type mplayer understands)

Gonna put the SSH over SSL epic on hold for a while.  Thought I'd do some simple scripts, but ones that follow a theme.  The theme is video processing, and most will use my favorite video playing & manipulating software, mplayer/mencoder.  You've already seen bbc7stream and aviid, both of which used mplayer.  This one converts Flash video into AVI files.  It's a very simple shell script.

BEGIN USAGE
jdimpson@artoo:~/bin$ flv2avi
Usage: flv2avi <Flash Video file>.flv
jdimpson@artoo:~/bin$
END USAGE

BEGIN CODE
END CODE

This is really just a wrapper around mencoder, which is part of the mplayer suite.  mencoder already knows how to read FLV files.  Unfortunately, it doesn't seem to know how to automatically preserve the quality of the video file when converting into another format.  Thus, before starting the transcode, there are two calls to mplayer, in identify mode, to figure out the audio bitrate and the video frames per second.  If you're a regular reader of my blog, you'll realize that mplayer is being used just like it was in my aviid script.
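
In outline, it does something like the following.  This is only a sketch of the approach: the identify fields are the ones mplayer really emits, but the mencoder option choices here are assumptions, not a copy of the script.

BEGIN EXAMPLE
# a sketch of flv2avi's approach; mencoder options are illustrative
IN="$1"
OUT="${IN%.flv}.avi"

# mplayer's identify mode prints ID_VIDEO_FPS and ID_AUDIO_BITRATE lines
FPS=`mplayer -identify -frames 0 -vo null -ao null "$IN" 2>/dev/null | \
        awk -F= '/^ID_VIDEO_FPS/ { print $2 }'`
ABR=`mplayer -identify -frames 0 -vo null -ao null "$IN" 2>/dev/null | \
        awk -F= '/^ID_AUDIO_BITRATE/ { print $2 }'`   # bits per second

# re-encode, preserving framerate and (roughly) audio bitrate; if mencoder
# fails partway through, delete the partial output (as discussed below)
mencoder "$IN" -ofps "$FPS" -ovc lavc -oac mp3lame \
        -lameopts abr:br=$((ABR / 1000)) -o "$OUT" || rm -f "$OUT"
END EXAMPLE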

I test the return codes and such because when mencoder encounters an error midway through transcoding, it doesn't delete the partial output.  So I do that in the script.

By the way, you can convert to any video format, not just avi.  But you'll need to figure out the right syntax to mencoder to make that happen yourself.

Interlude: SSH over SSL and the USSR

One day I needed to tunnel an SSH session through an SSL connection.  "Why?", you ask?  Good question.  The short answer is "Because of Communism".  Here's the long answer....

Everyone who is awesome knows that SSH is awesome.  However, there are some sub-awesome people who wish to prevent you from using SSH, complaining about something to do with X11 forwarding or service tunneling or unregulated VPNs, yadda yadda yadda.  So they block outbound port 22 traffic.  In fact, these same sorts of people tend to block EVERYTHING, including web traffic.  These Apparatchiks eventually realize that there's no sense in paying for an Internet connection if you're going to block all use of it, so they grudgingly set up an HTTP proxy server.  That's great for the unwashed proletariat masses who only use the web, but not so much fun for us elite Internet Power Users who want to log in to a remote system and run VNC to see our desktop.  We need a way to get our SSH traffic past the block, and the HTTP proxy is a good candidate to do that.

The cold war has begun.

HTTP proxies work by making the web client connect to the proxy and issue GET and POST requests.  The proxy then connects to the intended web server and forwards the requests to it.  The responses are forwarded back.  Unfortunately, unencrypted HTTP traffic, although TCP based, is request-response based and has short-lived sessions.  So the GET and POST functions of the proxy are not easily amenable to tunneling (although it isn't impossible; google "httptunnel" some time).

That's OK; most HTTP proxies also have to handle encrypted HTTP, aka HTTPS, aka HTTP over SSL (aka TLS).  Because in HTTPS the encryption occurs at the session level (rather than the message level), a proxy can't interfere with encryption negotiation, else it will fail--the user's web browser will report that the server's hostname doesn't match the hostname listed in the public key, which is either a sign you're being "man-in-the-middled", or that the web server is poorly administered.  So most web proxy servers provide a CONNECT function.  The CONNECT function is the web client's way of telling the proxy to "f** * off".  No wait, I mean to "give me a two-way, persistent tunnel to the following destination...".  Perfect.  So we wrap a CONNECT call around the SSH connection.

I'm sure that I was NOT the first to discover this use of the HTTP Proxy CONNECT function, but I did figure it out independently from anyone else.  My first solution was called "nc-ssh-inetd".  It ran out of inetd or xinetd, listening on port 2222, and using netcat (nc) to connect to the proxy.  It looked like this:

BEGIN nc-ssh-inetd
#!/bin/sh

(
 echo 'CONNECT impson.tzo.com:22 HTTP/1.0';
 echo 'pragma: No-Cache';
 echo 'proxy-connection: Keep-Alive';
 echo 'user-agent: I_OONZ_JU/0.0';
 echo ;

 cat;
) | nc proxy 80 | ( read x; read x; read x; cat )
END nc-ssh-inetd

Then I'd run "ssh -p 2222 localhost", which would connect to this script.  This script connects to the HTTP proxy called "proxy" on port 80.  It would send it the CONNECT command telling it the SSH server & port to connect to, plus some other headers (including my hilarious made-up user-agent) followed by a blank line.  It would read (and throw away) three lines from the proxy, which were the responses to the CONNECT command.  Then it would just read and write data in both directions (the two calls to "cat").  This left the SSH client and server free to do their thing.  An ugly hack, but it worked.

Other people solved this more elegantly.  PuTTY, my second favorite SSH client (after the one provided by OpenSSH), even built the CONNECT command into the product.  Better yet, OpenSSH created the ProxyCommand directive, which lets you specify an arbitrary command to use to create the network socket over which ssh will start the session.  My nc-ssh-inetd script worked unmodified that way, although I parameterized it so that it would read the SSH server & port and proxy server & port from the command line.  Someone even added CONNECT support directly to netcat, but modifying h0bbit's 1.10 version of netcat is sacrilegious and I'll have no part of it.
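
For reference, the ProxyCommand arrangement looks something like this.  Treat it as a sketch: the proxy name and port are placeholders, and the argument order of the parameterized script is a guess.

BEGIN EXAMPLE
# in ~/.ssh/config; "proxy" and 3128 stand in for your local HTTP proxy,
# and the argument order of nc-ssh-inetd is an assumption
Host impson.tzo.com
        ProxyCommand nc-ssh-inetd proxy 3128 %h %p
END EXAMPLE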

But since so many people can now easily tunnel SSH traffic through CONNECT-able HTTP proxies, the practice became widespread, to the point that even the sub-awesome Apparatchiks noticed that it was happening.  (Probably when their proxy servers started running out of free sockets because so many persistent CONNECT connections were simultaneously in use.)  Since most HTTPS traffic is destined to port 443, they tried to put a stop to SSH tunneling by modifying the HTTP proxies to disallow CONNECT calls to port 22.  (More accurately, they probably paid their proxy vendor large amounts of cash to make a very simple functional addition to their proxy product.)

The arms race escalated.

The most elite of our awesome selves, however, have a certain power.  The power of root.  Specifically, the power to arrange for SSH service on a port other than the default port 22.  That comes from either 1) moving the SSH server to another port, 2) running two instances of the SSH server (on 22 and on the new port), or 3) using port redirection rules on a firewall to rewrite incoming packets destined to the new port so that they go to 22 instead.  OpenSSH even lets you make one server listen on multiple ports at once.  And since SSH is so awesome, we know that we'll never use telnet again, so port 23 is a logical choice.  Of course, 23 could also be put on the CONNECT ban list.  So we use another, high-numbered port, say 8443 (a common location for a test HTTPS server).
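
The multiple-ports option, for instance, is just an extra line in the server configuration:

BEGIN EXAMPLE
# in /etc/ssh/sshd_config: one sshd listening on the standard port and on 8443
Port 22
Port 8443
END EXAMPLE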

It might eventually come to a point where the Apparatchiks ban every port but 443.  (If they ban that, no HTTPS traffic will work.)  No problem.  If you aren't already running an HTTPS server, just use 443 for SSH.  If you are, you can make the hard choice to shut down your HTTPS server, OR, if you're so awesome and elite that you already have a featureful firewall in front of your SSH/HTTPS server, you can use a custom port redirection rule that redirects most incoming packets destined to port 443 to your HTTPS server, but sends any incoming port 443 packets from a certain source address (or list of addresses) to your SSH server.  Any packet coming from the Internet-facing address of the proxy server to port 443 gets redirected to port 22.  The rest remain unmodified.
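
On Linux, that redirection rule might look something like this.  A sketch, of course: the address stands in for the proxy's Internet-facing address, and eth0 for the external interface.

BEGIN EXAMPLE
# redirect port 443 traffic from the proxy's address to the local sshd on 22;
# 203.0.113.7 and eth0 are placeholders
iptables -t nat -A PREROUTING -i eth0 -p tcp -s 203.0.113.7 --dport 443 \
        -j REDIRECT --to-ports 22
# everything else destined to 443 is left for the HTTPS server, however you
# normally route it
END EXAMPLE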

This worked for quite a while, to the point that the newfound Glasnost seemed permanent.  Then the hard-liners returned!

All of a sudden, outgoing CONNECT-wrapped SSH connections started blocking on startup, eventually timing out.  Normal SSH behaviour has the client connect to a server and wait for the server to initiate the protocol by sending a string specifying what version of SSH it speaks.  From there, the client and server do their dance to negotiate algorithms, determine shared secrets, verify each other's identity, then set up the login session and any desired tunnels.  Now, the client received nothing from the proxy.  The protocol negotiation couldn't continue.

It wasn't until one bold freedom fighter was able to simultaneously sniff the client and server sides of a connection attempt, using a totally different remote access technique, that the problem was understood.  On the client side, he saw the outgoing connection to the proxy.  On the server side, he saw the related incoming connection from the proxy.  And he saw the appropriate response from the server.  But he didn't see that response make it back to the client!  The client eventually gave up and broke the connection.

The hard-liners had moved beyond a list of banned ports.  They started monitoring traffic at the protocol level.  That's right: censorship!

The primary purpose of the CONNECT proxy function is to support HTTPS, specifically SSL/TLS.  In SSL the client initiates the session by sending the first pieces of information, unlike SSH, where the server speaks first.  So right there is a measurable behaviour, and the hard-liners were using it to once again block democracy-loving SSH sessions from succeeding.  No longer would CONNECT-wrapped SSH connections work.

One potential solution would be to somehow wrap both the SSH client and server in such a way that the client sends arbitrary data first, then starts up normal SSH.  This requires some sort of hack to be developed, or custom modification to both SSH client and server. 

Plus, if the hard-liners have learned anything through this, these new proxies should be able to positively identify and allow only legitimate SSL traffic.  If they can't do this yet, they'll learn it eventually.  May as well give them the benefit of the doubt.

And THAT'S why I need to run SSH over SSL.

(How is less exciting, and will be deferred to a later post.)

mini_ca: Create temporary/low security Certificate Authorities for SSL-based applications

(Updated 2009-01-14)
Like most of my articles, this posting focuses on the script, and assumes you already know the basic technology behind it.  In this case, you need to understand how X.509 certificates work, as well as SSL, and what a Certificate Authority (CA) is (at least from a technology perspective, if not from the process and security side).  You also need to understand HERE documents in the shell, and how (and when) to escape variables and other special shell characters.

I actually run my own Certificate Authority using tinyCA, which I use to create and manage X.509 certificates and keys for my HTTPS, Secure SMTP, and Open VPN servers.  Although I don't lock the Root CA key in a safe, maintain a countersigned chain of custody log, nor did I videotape my key signing ceremony, I do treat it with same concern and circumspection that I apply to my ATM PIN, house keys, and bank account password.

Sometimes I need certificate & key material for purposes where I don't want to use my personal CA key, either because it's not that critical for it to remain secure, it's a one-time/temporary application, or I'm doing it for someone else (like a friend or employer).  Usually, this comes about when playing with stunnel or socat, and I need at least one certificate so I can use SSL/TLS for session encryption. So I finally automated the creation of a Certificate Authority and the request and signing of certificates by that CA.  I realize this has been done lots of times by lots of people, but I haven't seen one that creates a re-usable (but single-purpose) CA, nor one that is so convenient to use (again, for a single-purpose).

I call this tool mini_ca, and it works like this:

BEGIN USAGE

jdimpson@artoo:~$ mini_ca
usage: mini_ca <project_name>
END USAGE

There are two parts to the following example.  The first one creates the CA, called "tunnels_R_us". The second one uses it to create a single certificate for a user called "first_one".

BEGIN EXAMPLE1
END EXAMPLE1


This example actually runs it twice, with the same argument each time.  The second time does NOT overwrite the CA certificate, key, or config file, but does re-write the user certificate script.  (Not sure if that's a bug or a feature.)

The major deliverables of this step are the CA's signing certificate, "tunnels_R_us-cacert.pem", and "create_tunnels_R_us_user_cert.sh", which will be demo'd in Example2.  All the data the new CA needs is in a directory called "tunnels_R_us-ca" in the current directory.  You can move this directory around--although you'll have to edit the appropriate path variable in the create_tunnels_R_us_user_cert.sh file.  It doesn't depend on mini_ca, but it does still depend on openssl.

I tried to make the script proof against spaces in the project name, but I suspect it's not (because the OpenSSL config file that gets created probably won't like it), so don't use spaces.  Note that you have to set a password for the CA's private key.  You gotta remember this password.  I couldn't figure out how to override that behaviour.  Fortunately, it's pretty easy to delete and re-create the CA if you forget it, although if you re-create the CA, none of the old certificates will be valid under the new CA.

On some systems I get an error message from openssl that says "unable to write 'random state'".  According to the OpenSSL FAQ,  "This message refers to the default seeding file (see previous answer). A possible reason is that no default filename is known because neither RANDFILE nor HOME is set. (Versions up to 0.9.6 used file ".rnd" in the current directory in this case, but this has changed with 0.9.6a.)"  I'm ignoring this, because I don't generally have strong security concerns when using this script  (really, because I'm lazy and ignorance is bliss). 


BEGIN EXAMPLE2
END EXAMPLE2
 
Example2 runs what I call the create user cert script.  The script was created in Example1, which put it in the CA directory.  In this case, the script name is "create_tunnels_R_us_user_cert.sh".  To run it, just give it a unique name for the user for whom you want to create a certificate.  It delivers an all-in-one certificate file that contains the user's private key, certificate, and Diffie-Hellman parameters.  This example names it "tunnels_R_us-first_one.pem".  This file, along with the "tunnels_R_us-cacert.pem" from the first example, is what most SSL tools (like socat and stunnel) want to have in order to do SSL.

By "user" I don't necessarily mean a person.  In fact, this tool wouldn't be a good choice for generated certificates for personal use (i.e. personal identification & authentication or for use in encrypted and signed email).  The user is just a way to uniquely ID each certificate generated.

Here's the code.  It makes use of the "openssl" utility, part of the OpenSSL library distribution.  This script is very ugly, because it auto-generates both an OpenSSL configuration file, and the create_user_cert.sh file.  It uses HERE documents to do that, and requires some awkward & confusing escaping in both in order to get the desired results.  So it's ugly and ungainly, but otherwise it's pretty simple, because it isn't much more than a wrapper around the previously mentioned openssl command.

BEGIN CODE
END CODE

As I said, it's very simple.  First, it sets up a directory to put the CA files in.  These include a private directory where the CA's private key will live; a certs directory where (one copy of) the generated certificates will live; and the CA cert password (hardcoded to "1234"!!).  It also does a bit of housekeeping by creating files and setting permissions to make the openssl tool happy.  Second, it generates the OpenSSL configuration file.  I haven't stopped to fully understand all the settings in the OpenSSL config file or in the creation of the user certificate request, but I know they work for creating SSL certs for the Apache web server, stunnel, socat, and OpenVPN.  Third, it creates the CA's key and certificate.  It creates 2048 bit certificates, which you can change by modifying $CERTBITS, and it makes them valid for 100000 days, which you can change by modifying $CADAYS.  Finally, it generates another script that knows how to use the newly created CA to create user certificates.
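
The CA-creation step boils down to a single openssl invocation, something like the following.  This is a sketch, not lifted from the script: the file names and exact options are assumptions, though $CERTBITS, $CADAYS, and the "1234" password come from the description above.

BEGIN EXAMPLE
# a sketch of the CA key/cert creation; paths and exact options are assumptions
openssl req -new -x509 -newkey rsa:$CERTBITS -days $CADAYS \
        -config tunnels_R_us-ca/openssl.cnf \
        -keyout tunnels_R_us-ca/private/cakey.pem \
        -out tunnels_R_us-cacert.pem \
        -passout pass:1234
END EXAMPLE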

This autogenerated script is relatively simple as well.  It generates a private key and a certificate request for the user, and immediately has the CA key sign the request.  It uses the "yes" command to automatically answer "y" when prompted for permission to sign the certificate.  Then it generates the Diffie-Hellman parameters.  Finally, it concatenates the private key, the certificate, and the DH parameters into a single file.  It also saves every individual file in the users directory, but you don't really need them anymore once you have the concatenated file.
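
In other words, the generated script does roughly this.  Again, a sketch under assumptions: the file names, key size, and DH parameter size here are illustrative, not copied from the real create_user_cert.sh.

BEGIN EXAMPLE
# a sketch of what the generated create_user_cert.sh does; names are illustrative
USER="$1"
openssl req -new -newkey rsa:2048 -nodes -config openssl.cnf \
        -keyout "$USER-key.pem" -out "$USER-req.pem"
yes y | openssl ca -config openssl.cnf -passin pass:1234 \
        -out "$USER-cert.pem" -infiles "$USER-req.pem"
openssl dhparam -out "$USER-dh.pem" 1024
cat "$USER-key.pem" "$USER-cert.pem" "$USER-dh.pem" > "tunnels_R_us-$USER.pem"
END EXAMPLE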

Both scripts have a shell function called indent defined.  Indent inserts one or more tabs at the beginning of each line of standard input it receives, then passes the input along to standard output.  I use it to offset the output and error from calls to the openssl tool, so that I can visually distinguish between what openssl says and what my script says.
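
A plausible definition, for the curious (the real one may differ slightly):

BEGIN EXAMPLE
# prefix every line of stdin with a tab; GNU sed understands \t here
indent () {
        sed -e 's/^/\t/';
}
END EXAMPLE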

Now you have a user's concatenated key-cert-dh file that can be authenticated with the CA's certificate.  I'm planning future postings that will require this ability.

You can download mini_ca here: http://impson.tzo.com/~jdimpson/bin/mini_ca.

Edit 2009-01-14: Some updates, including typo fix, adding link to mini_ca, creation of the "indent" function, parameterization of the certificate key bit length, and removal of prompting from openssl for CA password and permission to sign certificates.

pppsshslirp: create a PPP session through SSH to a remote machine to which you don't have root

Just finished a major re-write of this script last night. http://impson.tzo.com/~jdimpson/bin/pppsshslirp

For no particular reason, I've collected a lot of tools and tricks for tunneling one network through another.  In addition to OpenVPN, IPSec, and other dedicated tools designed to provide Virtual Private Network functionality, I've played around a lot with running IP traffic through SSH, SSL, and even plain HTTP tunnels.

This is typically done when you want to get around someone's firewall, but it can also be convenient for bridging local area protocols (such as old school LAN games like DOOM) across the Internet.  Yeah, OK, but mostly it's for piercing firewalls.

At one time I had a script that called a custom piece of C code that would set up a pseudo TTY master/slave pair, then it would fork() to create a child process, running pppd on the parent process/master TTY and an ssh session on the child process/slave TTY.  On the other end of the SSH session it would run another instance of pppd.  The result was basically a hacked up virtual serial port with a Point-to-Point Protocol network link running over it.

Today that approach is superseded in a number of ways.  I use OpenVPN in most situations, because tunneling is what it was designed for, yet it's easier than setting up IPSec, and it has a better interoperability track record.  But there are times where even OpenVPN has too much configuration overhead, especially if I'm looking for a temporary solution.  pppd now has a "pty" option, which let me replace my hacky C program but otherwise follows the same virtual serial port design described above.  Even more recently, OpenSSH's ssh client has the "-w" flag, which in Linux and probably other free OSes instructs the local and remote machines to set up virtual network interfaces ("tun" devices).  socat has a similar ability with the TUN "address specification".  OpenVPN and some other tunneling software use the same kind of "tun" device.  VirtualBox can use "tun" devices to create a virtual network between the virtual computer and the real one.
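
For concreteness, here are sketches of two of those approaches.  Both require root on both ends, the addresses are illustrative, and the option lists are from memory rather than from any script of mine.

BEGIN EXAMPLE
# 1) pppd's pty option: local pppd drives an ssh session that runs pppd remotely
pppd updetach noauth pty "ssh root@remotehost pppd notty noauth" 10.0.0.1:10.0.0.2

# 2) OpenSSH's -w flag: create tun0 on both sides, then configure them as you like
ssh -w 0:0 root@remotehost
END EXAMPLE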

However, until now every tunneling solution I came up with required that I have root access on both sides of the tunnel.  This is usually not a problem, but just for kicks I developed a solution where you only need root access on the local side.  You still need to be able to run a process on the remote machine, but it can be as an unprivileged user. There's very little local configuration.  There's also no configuration on the remote side other than having login access (through SSH) and knowing where the slirp binary is.  

The key piece of this solution is slirp.  slirp is a SLIP/PPP emulator.  (Does anyone still use SLIP?) It receives IP packets from a (virtual or physical) serial link, but converts the data into regular BSD socket system calls--that's why it doesn't need to run as root.  So really, I'm not doing anything too clever, because this is more or less what slirp was designed for.  I just came up with what I think is a pretty slick script for easily setting up a slirp-based PPP session running through an SSH session.
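
Stripped of all the option handling shown in the usage below, the core of the idea is a one-liner along these lines.  A sketch only: the remote slirp invocation ("slirp ppp") and path are assumptions, and the real script adds the sudo/user/key guessing described below.

BEGIN EXAMPLE
# the core of pppsshslirp, roughly: pppd drives a pty whose far end is an ssh
# session running slirp on the remote host (invocation details are assumptions)
pppd updetach noauth pty "ssh jdimpson@foo.com slirp ppp"
END EXAMPLE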

BEGIN EXAMPLE
jdimpson@artoo:~$ sudo ~/bin/pppsshslirp  foo.com
running under sudo, assuming you want to run ssh as jdimpson, not as root
Using interface ppp0
Connect: ppp0 <--> /dev/pts/11
local  IP address 10.0.2.15
remote IP address 192.168.50.3
jdimpson@artoo:~$ sleep 10 && ifconfig ppp0
ppp0      Link encap:Point-to-Point Protocol 
          inet addr:10.0.2.15  P-t-P:192.168.50.3  Mask:255.255.255.255
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1500  Metric:1
          RX packets:12 errors:1 dropped:0 overruns:0 frame:0
          TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:3
          RX bytes:104 (104.0 b)  TX bytes:113 (113.0 b)
jdimpson@artoo:~$
END EXAMPLE

It takes a few seconds to set up, which is why I had it sleep 10 seconds before looking at the ppp0 interface.  The delay is a function of how long the remote login takes.

BEGIN USAGE
jdimpson@artoo:~$ pppsshslirp -h
Usage: pppsshslirp [-h] [-u #] [-d] [-i /path/to/ssh/privkey] [-p ssh_port] [username@]remotehost

        -h              This helpful information.

        -u #            Specify the ppp device number (e.g. ppp0, ppp4, etc).

        -d              Don't let pppd go into the background.

        -i /path/...    Specify the ssh private key.  pppsshslirp will use the
                        default ssh file (~/.ssh/id_rsa) UNLESS run through
        sudo, when it will try to guess the private key of the user calling
        the script, rather than root. If pppsshslirp guesses wrong, use this flag
        to override it.

        -p ssh_port     Tell ssh to use another network port.

        username@...    The user on the remote host to log in as.  If
                        username is not specified, pppsshslirp will use the
        current user UNLESS run through sudo, when it will try to guess
        the user calling the script, rather than root.  If pppsshslirp guesses
        wrong, or you just need to log in as a different user, set the
        username accordingly.

        remotehost      The only required option, the remote host
                        where slirp will be run.

pppsshslirp is copyright 2008 by Jeremy D. Impson <jdimpson@acm.org>. 
Licensed under the Apache License, Version 2.0 (the License); you
may not use this file except in compliance with the License. You may
obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an AS IS BASIS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations
under the License.

END USAGE

You may want to skip to the end of the code, because all the interesting stuff happens down there.

BEGIN CODE
END CODE

tcpdump-remote-server & tcpdump-local-client: Capture packets remotely and view them locally

I have several of those Linksys WRT54Gs that I've hacked and put Linux on (http://www.openwrt.org/).  Each is a combination WiFi access point, ethernet switch, and broadband router.  One of the reasons I hack them is so I can do cool stuff like the following.  This pair of scripts lets you sniff packets on one machine, but watch the sniff in near realtime on another.  I use them to watch all the packet traffic on my home network.

You could just SSH into the remote machine and fire up wireshark (formerly known as ethereal), but in my case I didn't have enough disk space to install wireshark and all of its dependencies on it.  It has tcpdump on it, but I still want the nice wireshark GUI.  I could capture to a file with tcpdump,  transfer the file to my workstation, then open it up with wireshark.  But I want it to be real-time, and I'd like to have the process partially or wholly automated.

So I wrote these two scripts.  The first, tcpdump-remote-server, runs on the remote system that you want to perform a sniff on.  You give it two arguments, both optional (but whose default values may only be useful to me).  The first is the filter rule (in libpcap aka tcpdump format) and the second is the network interface you want to sniff on.  You need permission to open the network interface, which usually requires root access.  You use it like this:

BEGIN EXAMPLE
root@bort:~# ./tcpdump-remote-server '! tcp port 11111' br0
listening on port 11111

END EXAMPLE

Now it's listening on port 11111.  Anyone who connects to it will get a mouthful of pcap file.  Here's the code:

BEGIN tcpdump-remote-server
#!/bin/sh

RULE="$1";
IF=$2;
PORT=11111
LEN=1500
CNT=""  # optional extra tcpdump arguments (e.g. "-c 1000"); empty by default

if [ -z "$RULE" ]; then
        #RULE="! ip net 10.0.0.0/27";
        RULE="! tcp port $PORT";
fi

if [ -z "$IF" ]; then
        IF=br0;
        #IF=vlan1;
        #IF=prism0;
fi

CHILD_FILE=`mktemp /tmp/tcpdump-remote-server-child-XXXXXX`;
trap "rm $CHILD_FILE" 1 2 15 EXIT;
cat > $CHILD_FILE <<HERE
#!/bin/sh
tcpdump -U $CNT -s $LEN -i $IF -w - "$RULE" 2> /dev/null
HERE

chmod a+x $CHILD_FILE

while true; do
        echo listening on port $PORT
        nc -l -p $PORT -e $CHILD_FILE;
        echo done
        sleep 1;
done

END tcpdump-remote-server

This listens on port 11111, waits for a connection to the port, then sends the binary output of tcpdump to whoever connected to the port.  It only handles one connection at a time, but after disconnection it will accept a new connection.

It has a default filter rule which filters out its own traffic.  If you override it on the command line, it would be a good idea to add the default rule back in conjunction with whatever your custom filter is.  It also defaults to the br0 network interface, which happens to be very convenient for me.  Control-C it several times in succession to stop the server, or use the kill or killall commands from another window.

BTW, this requires netcat (nc), specifically a version of netcat compiled to have the -e flag.  (I've noticed there are a lot of versions of netcat around, some even with different, incompatible argument syntax.)  The -e flag causes netcat to execute a command after it initiates or receives a network connection (depending on whether it's being used as a client or server--server, in our case, hence the -l and -p flags).  The -e flag is sometimes considered a security risk, but if someone logged on to your system knows what netcat is, and knows why the -e flag is a security risk (aka fun to play with), then nothing will stop that person from installing their own version of netcat (or anything else).

The code generates a temporary shell script in /tmp (ugly, I know).  This is necessary because the -e flag in netcat is implemented as the execve() rather than the system() system call.  That means it doesn't do any command line parsing.  I try to avoid making shell scripts that call other shell scripts, but there's no way around it here, so at least I have it dynamically create and delete the secondary script.  It does so in as safe a way as the mktemp command is able, which I suspect isn't all that secure.  No guarantees.  I run this on what is essentially an appliance, with no local users around to exploit file name race conditions.  Plus, I'm in denial.

The second script, tcpdump-local-client, will connect to the server and run wireshark, connecting the output of the server to the input of wireshark, which will render the packet capture in all of its GUI wunderfulness.  You run it, and it pops up wireshark.  You don't even need to be root on the local machine!  The first and only argument is the name of the machine running the server.  Here's the code:

BEGIN tcpdump-local-client
#!/bin/sh

BORT="$1"
if [ -z "$BORT" ]; then
        BORT=bort;
fi
PORT=11111;

nc -v $BORT $PORT | wireshark -k -i -

END tcpdump-local-client

It connects to the server and pipes it into wireshark, which immediately starts to "capture" packets from the server.  The only gotcha is that if you stop the capture in wireshark, you can't restart it.  You have to exit wireshark and start the client script again.

Using these on my Linksys WRT54G, I can see the traffic going to and from every system on my network, from any system on my network.  It's worth noting that this is incredibly insecure.  Anyone can see everything on your network.  Better would be to use stunnel or socat to do an authenticated and encrypted SSL connection between client and server rather than unauthenticated & unencrypted netcat, but that requires setting up public/private keys and such.  At the very least, set up some firewall rules to prevent people outside your network from connecting to the server.  Or do what I do and only run it when you need it, and live in denial about the risk!
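
If you do want the SSL version, socat makes it a fairly small change.  A sketch, assuming you've generated certificates with something like the mini_ca tool described earlier (the file names here are placeholders):

BEGIN EXAMPLE
# on the sniffing box, in place of "nc -l -p $PORT -e $CHILD_FILE":
socat OPENSSL-LISTEN:11111,reuseaddr,cert=server.pem,cafile=cacert.pem EXEC:$CHILD_FILE

# on the workstation, in place of "nc -v $BORT $PORT":
socat OPENSSL:bort:11111,cert=client.pem,cafile=cacert.pem - | wireshark -k -i -
END EXAMPLE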

xauth-user: Let another user pop up windows on your X display

I hacked on this last week to make its output more self-explanatory, so it's dated 2008-05-24, but it's actually a year or so older than that.  xauth-user will allow user1 to utilize user2's X Windows Server display.  Once done, user2 had better completely trust user1, else user2 might get some nekkid picshurs popping up on their screen when trying to demonstrate something to the boss...

Gone are the days of running "xhost +" to allow anyone to draw a window on your X display.  Every Linux distribution I've used in the past 5 years requires MIT magic cookies as authentication between an X client and the X Windows Server display.  These cookies are typically stored in ~/.Xauthority.  An X Windows Server display will read this file, and when an X client (say, xterm or firefox) wants to start up and pop up a window on the display, the X Server will challenge the X client.  Only if the X client can also read the .Xauthority file (or gets the same cookie contained therein through some other fashion) will it pass the challenge and be allowed to pop up a window.

BTW, one user's .Xauthority file can have multiple cookies, one for each unique X display that it might utilize, e.g. :0, :1, remotehost:0, etc.
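
You can see them with xauth's list command (the hex keys below are, of course, made up):

BEGIN EXAMPLE
jdimpson@artoo:~$ xauth list
artoo/unix:0  MIT-MAGIC-COOKIE-1  9a1b2c3d4e5f60718293a4b5c6d7e8f9
artoo/unix:1  MIT-MAGIC-COOKIE-1  3f4c5d6e7f8091a2b3c4d5e6f708192a
END EXAMPLE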

Here's an example where I'm running an X Server on display ":1".  User jdimpson is running that Server, but user sysadmin wants to pop up an xterm window on jdimpson's desktop.

BEGIN EXAMPLE
jdimpson@artoo:~$ echo $DISPLAY
:1.0
jdimpson@artoo:~$ xauth-user
usage: /home/jdimpson/bin/xauth-user <username> [X display #]
jdimpson@artoo:~$ xauth-user sysadmin :1
now you can run sysadmin's apps on jdimpson's display :1
jdimpson@artoo:~$ sudo su - sysadmin
sysadmin@artoo:~$ DISPLAY=:1 xterm &
sysadmin@artoo:~$
END EXAMPLE

Note that you have to set the DISPLAY variable to the same one you ran xauth-user on.  (xauth-user assumes :0 if you don't list it explicitly.)

BEGIN xauth-user

#!/bin/sh

if [ $# -lt 1 ]; then
        echo "usage: $0 <username> [X display #]";
        exit 1;
fi
ME=`whoami`
USER=$1;
DISP=$2;
if [ -z "$DISP" ]; then
        DISP=:0
fi

#echo copying magic cookie from $ME to $USER;
xauth nextract - $DISP | sudo su - $USER -c "xauth nmerge -"
echo now you can run $USER\'s apps on $ME\'s display $DISP;

END xauth-user

As its name suggests, it is based on xauth, which is the usual tool to handle X Server authentication (and thus to manage .Xauthority files).  It simply uses xauth on the initial user's side to spit out the correct cookie for the given display, piped into xauth on the secondary user's side.  Note that it requires that the initial user be able to use sudo to su to the secondary user in order to access the secondary user's .Xauthority file.  The actual access to the .Xauthority files is implicit via the default behaviour of xauth.

Be very careful when using xauth-user; if you get the X display wrong, you might overwrite a magic cookie that's active in another X Server, which will make it impossible to run any new X windows.  In the above example, if you ran "xauth-user sysadmin :0" (instead of :1), and if sysadmin already has an unrelated X Server running on :0, then you'll overwrite sysadmin's active :0 cookie with jdimpson's :0 cookie (which may not be valid for anything if jdimpson is running display :1).  Screwed.  If this does happen, check out the generate section of the xauth man page to rectify the situation.  There's probably room to do some sanity checks, if I ever get around to it, at least making sure the current user isn't trying to overwrite his own currently in-use cookie.  Better yet, I could even have it save the old cookie, but that would require something to come along later to restore it.
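
For the record, the man page's generate command looks like this (it still needs some way to reach the affected server, so whether it succeeds depends on your situation):

BEGIN EXAMPLE
# ask the X server on :0 for a fresh cookie; see the "generate" section of xauth(1)
xauth generate :0 . trusted
END EXAMPLE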

Final note.  xauth-user won't work for remote displays the way old "xhost +" used to (i.e. X clients could connect over the network to display on a remote X Server), because 1) it assumes both users have local accounts with local .Xauthority files, and 2) most modern Linux distributions turn off the X Server's remote access capability anyway.  For remote X access I recommend you use SSH, making sure X11Forwarding is turned on.

copysshenv: Extract the ssh-agent environment variables from a running process

Dated 2008-04-20, copysshenv extracts the ssh-agent variables from the environment of any running process.  It may only work on Linux (but might work on Solaris, and any other OS that maintains /proc/<process id>/environ containing the current environment for the process).

ssh-agent is used to cache your SSH keys in memory, so that you don't have to retype your SSH key passphrase each time you run ssh or scp.  It works by setting two environmental variables: SSH_AGENT_PID and SSH_AUTH_SOCK.  Any process (owned by the same user who ran ssh-agent in the first place) can use the contents of these variables to communicate with the running ssh-agent process, specifically to ask it to handle a challenge from an SSH server.  There are a couple of ways to make sure these variables are in your environment (see the ssh-agent man page), but once in a while you find yourself in a situation where your current shell doesn't have the variables set.  This can happen, for instance, when you are logged in remotely to a system, and want to use the ssh-agent already running on that system to log you in to the next system.  The shell you get when you log in to the system won't have the environmental variables set, even though ssh-agent is already running on that machine.
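
(For reference, the usual way the variables get set in the first place is something like this, typically done in a login script:)

BEGIN EXAMPLE
# start an agent, export SSH_AUTH_SOCK/SSH_AGENT_PID into this shell, load a key
eval `ssh-agent`
ssh-add
END EXAMPLE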

BEGIN EXAMPLE

jdimpson@artoo:~/bin$ copysshenv  21527
export SSH_AUTH_SOCK=/tmp/ssh-cxNeNz8550/agent.8550
export SSH_AGENT_PID=8625

END EXAMPLE

For it to work successfully, you need to determine a process that you know has the environmental variables already set.  That's a little beyond the scope of this post, but it's a safe bet that any other shell (e.g. bash) process running might have it.  In the above example, process 21527 was an instance of bash running on my Gnome desktop. 

BEGIN copysshenv

#!/bin/sh

PID=$1;
if [ -z "$PID" ];
then
        PID="$$";
fi

sed -e 's/\x00/\n/g' < /proc/$PID/environ | awk '/SSH/ { print "export " $1}'

END copysshenv

Without any arguments, copysshenv checks its own environment for the variables, which is only useful if you are running it in a shell that already has the right variables.  (And if that were our only requirement, we could have done it like this: env | grep SSH .)  The business with setting PID to $$ makes that happen.

The last line is the main one.  It uses sed to read in the contents of /proc/<proc id>/environ, which is a null-delimited group of strings, each one a key=value pair, one for each environmental variable.  The 's/\x00/\n/g' replaces all the nulls with newlines.  (Effectively, this re-implements the env command, but for any process, not just the current one.)  awk is used to filter for only the lines relevant to SSH, and the print command outputs the variables with "export " prepended, making the output appropriate for "source"ing, like this:

BEGIN EXAMPLE
jdimpson@artoo:~/bin$ `copysshenv  21527`
END EXAMPLE

Now if you check the current environment with env, you'll see the SSH variables have been set.

Note that since /proc/<process id>/environ is owned by the user running the process, you can only get access to the environment of processes that you own.  Similarly, ssh-agent remains secure because the SSH_AUTH_SOCK file is also owned by the user running the ssh-agent process.  So you can't get access to someone else's SSH keys via this method, unless you are root, or the user has foolishly opened up permissions on both those files.

mailsort.pl: Sort mbox style email messages according to date

OK, this post is quite self-indulgent, so it's best to just get this over with.  I wanted to write about my oldest script, which is dated 1998-07-02.  Unfortunately, it's not such a great script.  First off, I never use this, so it's just luck that it's been in my bin directory for almost 10 years.  Secondly, I'm kind of disappointed that I have nothing older--I subtitled this blog scripts & hacks since 1994 because that's the year I entered college and was first introduced to Unix and programming.  At that time I didn't own a computer, so I moved scripts around on floppy disks (then Zip disks, then CD-ROM disks).  It's possible I do have older stuff in an archive somewhere.  Thirdly, while it seems to still work, I can't really recommend it. I'd be very, very wary of how well it identifies the end of one email message and the start of the next one. Similarly, the code to interpret different ways of writing dates is bound to break spectacularly.  Let me know if you use it with any success.

mailsort.pl reads an mbox-style mail file from the command line, sorts the emails by date, and spits them out sorted onto standard output.  It's written in Perl.  I'm sort of proud of this, because while I can see lots of inefficient memory usage, no handling of standard input (which Perl makes trivial to do), and lots of other naive, inefficient, or nonsensical constructs, I did have some reasonable discipline and sense of structure.

Amazingly, I do remember writing this.  I had just graduated from Syracuse University, was taking a couple months off before going to work, and between visiting friends and going to New Orleans, I spent some time organizing the stuff I saved from college.  One of those things was email.  For most of college, I would archive my email each year, because at first I had no computer and had to use the multi-user time sharing Solaris hosts.  We had very limited disk quotas (I want to say 5 MB!).  But after I got my own computer with (wait for it) a 40 MB  hard disk, I realized that offline archiving of email was unnecessary.  So I went about merging my yearly email archives together.  I didn't just restore from external media, I also reorganized email folder structure, which meant some emails got out of chronological order.  Thus, I wrote this script.

mbox-style email is one of the ways Unix/Linux systems store email in files. Basically, each email "folder" is a single file.  Every distinct email begins with the string "\nFrom ".  That's the word "From" at the beginning of a line, followed by a space.  The rest of the line varies, but usually contains who the email went to and the date. Note that this is not the same as the "From:" or "Date:" headers of an email.  Those come later.
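
A tiny illustrative mbox fragment (addresses and dates made up) looks like this:

BEGIN EXAMPLE
From alice@example.com Thu Jul  2 10:15:00 1998
From: Alice <alice@example.com>
Date: Thu, 2 Jul 1998 10:15:00 -0400
Subject: hello

First message body...

From bob@example.com Thu Jul  2 11:00:00 1998
From: Bob <bob@example.com>
Date: Thu, 2 Jul 1998 11:00:00 -0400
Subject: re: hello

Second message body...
END EXAMPLE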

I don't recall the details, but some Unix email software used different formats, ranging from using the Content-length: header to delineate messages, to a ridiculously sane approach of putting each email in its own file, then grouping emails together in subdirectories.  (I know, that's crazy.)

I like the mbox style, not because it's easy to use (it isn't) or because it's more efficient (it's not) but because it's what sendmail uses.  And I am the master of sendmail (http://www.sendmail.org).  But that's another rant entirely, and I'm already egregiously off-topic.

OK, enough of that wank.  This code uses Perl's Mail::Util and Mail::Header modules, which is overkill, but at the time I thought code reuse was the ultimate accomplishment.  Here it is:

BEGIN mailsort.pl
#!/usr/local/bin/perl

use Mail::Util qw(read_mbox);
use Mail::Header;

%nmon = (
        'Jan'   => 1,
        'Feb'   => 2,
        'Mar'   => 3,
        'Apr'   => 4,
        'May'   => 5,
        'Jun'   => 6,
        'Jul'   => 7,
        'Aug'   => 8,
        'Sep'   => 9,
        'Oct'   => 10,
        'Nov'   => 11,
        'Dec'   => 12,
);

die usage() unless @ARGV;

$filein = shift @ARGV;

@msgrefs = Mail::Util::read_mbox ($filein);

$i = 0;
foreach $msgref (@msgrefs) {
        @tmphead = ();
        $date = '';
        foreach $line (@{$msgref}) {

                if (( $new_msg == 0 and $line =~ /^\s*$/ )
                        or $body[$i] ) {
                # start of body

                        $body[$i] .= $line;

                } else {

                        push @tmphead, $line;
                        $new_msg = 0;

                }
        }

        $head[$i] = new Mail::Header (\@tmphead, 'MailFrom' => 'KEEP');
        $new_msg = 1;
        chomp($date = $head[$i]->get ('Date'));
        #print "$i: no date from header obj\n" unless $date =~ /.+/;
        $date[$i] = [ splitmaildate($date) ];
        #print "$i: from splitmaildate($date): @{$date[$i]}\n";
        $nums[$i] = $i;

        $i++;
}

@nums = sort by_date @nums;

foreach $i (@nums) {
        #print "\nSD $i: @{$date[$i]}\n\n";
        $head[$i]->print();
        print "\n", $body[$i];
}

sub by_date {
        my @adate = @{$date[$a]};
        my @bdate = @{$date[$b]};
        my ($afoo, $bfoo);

        foreach $afoo (@adate) {
                $bfoo = shift @bdate;
                $o = ($afoo <=> $bfoo);
                if ($o != 0) { return ($o); }
        }

        0;
}
sub splitmaildate {
        my ($date) = @_;

        if ($date =~
/\s*(\w\w\w,? )?(\d\d?) (\w\w\w) (\d\d+) (\d\d?:\d\d?:\d\d?)\s*(.*)/) {
                $day = $1; $ndate = $2; $mon = $3;
                $year = $4; $time = $5; $foo = $6;
        } elsif ($date =~
/\s*(\w\w\w,? )?(\d\d?) (\w\w\w) (\d\d?:\d\d?:\d\d?) (\d\d+)\s*(.*)/) {
                $day = $1; $ndate = $2; $mon = $3;
                $year = $5; $time = $4; $foo = $6;
        }

        $mon = $nmon{$mon};

        if ($year < 100) { $year += 1900; }

        $time =~ /(\d\d?):(\d\d?):(\d\d?)/;
        $sec = $3; $min = $2; $hour = $1;

        ($year, $mon, $ndate, $hour, $min, $sec);
}

sub usage { "usage: $0 <mail-file-to-be-sorted>\n"; }

END mailsort.pl

All this does is read in the mbox file, dig out the header and body of each message, sort them according to date, and print them out.  The date parsing code is an accident waiting to happen--there's no good solution, because there's no standard for the date format, so any email client can do whatever it wants with it.