Learning socat in terms of netcat

In my previous post on sslrsh I wrote about a script to allow remote shell access over SSL. The script made extensive use of socat. It reminded me of how feature-complete socat is, and has motivated me to capture some socat recipes. Note that these aren't general purpose scripts; they are just snippets of functionality listed here for future reference.

I'm not the only person who has a socat tutorial, but I think this post is unique because it will attempt to describe socat by comparing it to a tool that is doubtless a major inspiration for socat, namely, netcat. Hopefully, it will clarify how to use socat, demonstrate how much more featureful socat is, but also show why you shouldn't go ahead and delete netcat outright.

This is part one of a three-part series. This one compares socat with netcat. The next one will delve into UDP with socat, and the last one will get into some advanced topics.

One final comment before we start. As of this writing, socat version 2.0 has entered beta. socat 2.0 addresses a limitation in socat 1.x, which is that "addresses" in socat 1.x are not completely uniform, and they are not layerable. For example, there's no way to run SSL over UDP, even though socat knows about both protocols. Similarly, there's no way to tunnel an SSL connection through a web proxy, meaning you have to resort to the hack found in sslrsh. socat 2.0 removes these limitations, but uses an enhanced syntax, which means 1) it will be even more complicated to use, and 2) this post may become obsolete rather sooner than expected.

But before we compare socat to netcat, let's compare it to their common namesake, cat.

Use cat to display a file on standard output.

jdimpson@artoo:~$ cat file.txt
This is the content of file.txt


Use socat to display a file on standard output.

jdimpson@artoo:~$ socat FILE:file.txt STDOUT
This is the content of file.txt


In general, socat takes two arguments. Both are called addresses. In the above example, FILE:... is one address, and STDOUT is the second. It's customary but not required to spell the address name in upper case. We'll see lots of address types in this post, as well as in a couple follow-on posts that I've got planned.

If you just run "cat" by itself, it will read from standard input and write to standard output, and you have to press control-D to end.

jdimpson@artoo:~$ cat
hello, world!
hello, world!


The first line after the command is typed in, the second is printed by the command.

Here's the equivalent using socat.

jdimpson@artoo:~$ socat STDIN STDOUT
hello, world!
hello, world!


Apparently, STDIN and STDOUT are both synonyms for STDIO, and socat doesn't care if you send input to the STDOUT address, or read output from the STDIN address. "socat STDOUT STDOUT", "socat STDIN STDIN", and "socat STDIO STDIO" all appear to work identically.


But even here socat can improve the situation. We can add a history, so that we can just hit up arrow to repeat what we've typed in earlier, just like bash can do. It utilizes the GNU Readline library.

jdimpson@artoo:~$ socat -u READLINE STDOUT
hello, world!
hello, world!
hello, world!
hello, world!
hello, world!
hello, world!


To get this output, I first typed "hello, world!" and pressed enter. socat wrote the second line. Then I pressed up arrow to get the third line, and enter to get the fourth. Finally, one more up arrow for the fifth, and again enter for the sixth.

The "-u" flag tells socat to run in unidirectional mode. As we'll see later on, socat usually passes data between the first and second addresses in either direction, something cat does not do. When both addresses end up connecting to the terminal, as is the case here, it's undetermined whether the line you type is being read by the first address and sent to the second, or vice versa. It took me a while to figure this out (over a month after I originally posted this!). By forcing unidirectional mode, only the first address reads what you type, and passes it to the second one.

When you quit using control-c or control-d, the terminal gets messed up, and you have to type "reset" (even though you may not be able to see what you're typing) to fix it. The READLINE address has a couple of options, one of which lets you set a history file, which stores the input history across invocations, just like your "~/.bash_history" file. This example isn't how you'd normally use READLINE, but I'm postponing further discussion on READLINE to another post.
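For reference, here is a minimal sketch of the history file idea. The wrapper name readline_cat and the default file name are made up for illustration; "history=" is the READLINE option that names the history file.

```shell
# Hypothetical wrapper: a line-editing "cat" whose input history survives
# across invocations. The default history file name is an arbitrary choice.
readline_cat() {   # usage: readline_cat [HISTORY_FILE]
    socat -u "READLINE,history=${1:-$HOME/.socat_history}" STDOUT
}

# Example (interactive, end with control-D):
#   readline_cat /tmp/my_history
```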

Use cat to create a file (then again to display it)

jdimpson@artoo:~$ cat > file.txt
I like writing files using cat and control-D!!
jdimpson@artoo:~$ cat file.txt
I like writing files using cat and control-D!!


Note that, technically, the shell is actually writing the file by virtue of the redirect symbol (greater than sign).

Use socat to create a file (then use cat to display it)

jdimpson@artoo:~$ socat -u STDIN OPEN:file.txt,creat,trunc
socat needs some funny commands to write files!
jdimpson@artoo:~$ cat file.txt
socat needs some funny commands to write files!


Again, a huge difference between socat and cat is that socat, by default, is bidirectional. So both addresses are read from and written to. cat is always unidirectional. And, in socat, when either one of the addresses sends an EOF (End of File), it waits some amount of time and then exits. And again, the "-u" flag tells socat to be unidirectional. Without it, the above socat invocation will read from the file, get EOF, and exit. Or, if the file doesn't exist, it will quit with an error. There would be no time to type anything in. If instead you pipe something in to socat, like this: "echo foo | socat STDIN OPEN:file.txt,creat,trunc", the -u isn't needed. Presumably, when invoked within a shell pipe, socat recognizes that fact, knows that pipes are always unidirectional, and behaves as if the -u flag were given.

Note the options used, creat and trunc. You could also use append, and lots of other options available to the open() system call. Also, without the trunc option, socat will write bytes into the file in-place. Omitting trunc and using the seek option, you can change arbitrary bytes in the file. There are rdonly and wronly options (read-only and write-only, respectively). I had thought that if I used the wronly option, I wouldn't need the -u flag. That didn't work because socat still tried to read from the file, got an error, and exited. Probably the determination of uni- or bi-directionality is done without input from address-specific options. It does work as expected if you pipe input into socat. socat also has a CREATE address based on the creat() system call, but this is equivalent to OPEN with the creat option.
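A rough sketch of the append and seek variations follows. The function names are invented for this example, and I'm assuming the seek option takes an absolute byte offset in the form "seek=N"; check your socat man page before relying on it.

```shell
# Hypothetical helper: add one line to the end of file.txt instead of
# truncating it (append replaces trunc).
append_line() {   # usage: append_line TEXT
    printf '%s\n' "$1" | socat -u STDIN OPEN:file.txt,creat,append
}

# Hypothetical helper: overwrite bytes in place, starting at a byte offset.
# No trunc here, so the rest of the file is left alone.
patch_bytes() {   # usage: patch_bytes OFFSET TEXT
    printf '%s' "$2" | socat -u STDIN "OPEN:file.txt,seek=$1"
}

# Example:
#   append_line 'one more line'
#   patch_bytes 4 'XYZ'
```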

That covers the major forms of cat, and how socat emulates them, and in some cases enhances them. I don't suggest ever using socat to do what cat can do, but you should have a better sense for how to invoke socat. Now let's compare socat with netcat.

In netcat, connect to TCP port 80 on localhost, as a poor man's web browser.

jdimpson@artoo:~$ nc localhost 80
HEAD / HTTP/1.0
User-agent: netcat, baby!

HTTP/1.1 200 OK
Date: Wed, 28 Jan 2009 13:06:43 GMT
Server: Apache/2.2.8 (Ubuntu) PHP/5.2.4-2ubuntu5.4 with Suhosin-Patch mod_ssl/2.2.8 OpenSSL/0.9.8g mod_perl/2.0.3 Perl/v5.8.8
Last-Modified: Sat, 10 Jan 2009 22:01:08 GMT
ETag: "24008-369-4602802c0f100"
Accept-Ranges: bytes
Content-Length: 873
Connection: close
Content-Type: text/html



I typed in the first three lines (third one is an empty line). The rest is output from the server.

In socat, connect to TCP port 80 on localhost, as a poor man's web browser.

jdimpson@artoo:~$ socat - TCP:localhost:80
HEAD / HTTP/1.0
User-agent: socat, natch!

HTTP/1.1 200 OK
Date: Wed, 28 Jan 2009 13:07:51 GMT
Server: Apache/2.2.8 (Ubuntu) PHP/5.2.4-2ubuntu5.4 with Suhosin-Patch mod_ssl/2.2.8 OpenSSL/0.9.8g mod_perl/2.0.3 Perl/v5.8.8
Last-Modified: Sat, 10 Jan 2009 22:01:08 GMT
ETag: "24008-369-4602802c0f100"
Accept-Ranges: bytes
Content-Length: 873
Connection: close
Content-Type: text/html



Note the "-". That's a shortcut for writing "STDIO", so the above command is equivalent to "socat STDIO TCP:localhost:80".

netcat as a server, listening on TCP port 11111.

nc -l -p 11111


Use "nc localhost 11111" from another window to connect to it. You can type in both windows, and should see each other's input in both.

There are a couple versions of netcat out there, and some versions (like the OpenBSD one) have had the command flag syntax changed. If you get an error running netcat as described above, try it without the -p flag, like this: "nc -l 11111". Some people really don't get the ideas of compatibility and portability--if they want to change the way a program works (presumably because they think they are improving it), fine. But they should also change the name of the program so that hundreds of scripts don't break, and that future script writers don't have to test for each version. Anyways...

socat as a server, listening on TCP port 11111.

socat STDIO TCP-LISTEN:11111,reuseaddr


Use "nc localhost 11111" from another window to connect to it. You can type in both windows, and should see each other's input in both.

Note that because socat is bidirectional, it doesn't matter which order you put the addresses. The above is equivalent to "socat TCP-LISTEN:11111,reuseaddr STDIO".

TCP-L can be used as a shortcut for TCP-LISTEN. The reuseaddr option lets you quit socat and run it again immediately. netcat does that by default.

netcat as a server, listening on TCP port 11111, handling multiple connections. This one is untested, and created from memory.

nc -L -p 11111


There used to be some versions of netcat that could handle more than one incoming connection when given the "-L" flag, but I can't find a copy of netcat that works that way, nor even any documentation for it. (Maybe I've imagined it!) It's almost equivalent to this shell script snippet: "while true; do nc -l -p 11111; done", except that this snippet only handles one connection at a time, not multiple ones. The OpenBSD variant of netcat has a -k option which works just like the shell snippet, but still doesn't handle multiple simultaneous connections.

socat as a server, listening on TCP port 11111, handling multiple connections.

socat STDIO TCP-L:11111,reuseaddr,fork


Now open two more windows and run "nc localhost 11111" in each. These are clients to your socat server. What you type in each client window gets displayed in the server window. But what you type in the server window only goes to one of the clients. Each line alternates between each client.  The fork option to TCP-L tells socat to fork a new process for each received connection on port 11111.  Each new process then reads and writes on the standard input/output.

TCP: will use IPv4 or IPv6 depending on which type of address you provide. TCP-LISTEN: will listen on all local addresses (IPv4 and IPv6) unless limited by the bind option. There exist TCP4, TCP6, TCP4-LISTEN, and TCP6-LISTEN variations, as well.
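As a small illustration of the bind option and the IPv4-only variant, here is a sketch of a listener that only accepts connections from the local machine. The function name is hypothetical.

```shell
# Hypothetical helper: IPv4-only listener bound to the loopback address,
# so only clients on this machine can connect.
listen_local() {   # usage: listen_local PORT
    socat STDIO "TCP4-LISTEN:$1,bind=127.0.0.1,reuseaddr"
}

# Example:
#   listen_local 11111      (then, in another window: nc 127.0.0.1 11111)
```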

netcat as a UDP server on port 11111.

nc -u -l -p 11111

and then as a UDP client.

nc -u localhost 11111


socat as a UDP server on port 11111.

socat - UDP-LISTEN:11111

and then as a UDP client.

socat - UDP:localhost:11111


Again, UDP-L can be used instead of UDP-LISTEN. UDP will use IPv4 or IPv6 depending on which type of address you provide. UDP-LISTEN will listen on all local addresses (IPv4 and IPv6) unless limited by the bind option. There exist UDP4, UDP6, UDP4-LISTEN, and UDP6-LISTEN variations, as well.

socat has other UDP-based addresses that implement other communication patterns beyond what netcat can do. I started to enumerate them here, but the UDP subject ended up dominating this article, so I've pulled it out and link to it here, so this one can remain focused on comparison with netcat.

The coolest, and most dangerous, netcat option is -e, which causes netcat to execute a command when it connects out or receives a connection. A simple remote access server looks like this:


nc -l -p 2323 -e /bin/bash


The strict equivalent simple remote access server in socat is:

socat TCP-LISTEN:2323,reuseaddr EXEC:/bin/bash


However, you can improve on this in several ways. First, the argument to -e in netcat has to be the name of an executable program, found somewhere on the disk. It can't be multiple commands, and can't rely on shell behaviours, like variable handling or wildcard expansion. Not a major impediment, because you can always write out your commands into a shell script, but sometimes doing that is inconvenient. But socat has the SYSTEM address, which uses the system() call rather than a call to exec(), which is what -e in netcat and EXEC in socat do. It enables something like this:


socat TCP-LISTEN:2323,reuseaddr SYSTEM:'echo $HOME; ls -la'


As always whenever the system() call is involved, be aware when writing scripts to not allow unchecked input to be invoked by the system() call. If you try the above in netcat ("nc -l -p 2323 -e 'echo $HOME; ls -la'"), you'll get an error like this: "exec echo $HOME; ls -la failed : No such file or directory", because netcat tried to execute a program called, literally, "echo $HOME; ls -la", spaces and all. Some versions of netcat have a "-c" option, which uses system() instead of exec(), which would allow multiple commands and shell behaviours to work. But again, it depends on which version you have.
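To make the warning about unchecked input concrete, here is one possible way a wrapper script could vet a value before it ends up inside a SYSTEM address. Both function names are invented for this sketch, and the whitelist is deliberately strict; it is an illustration, not a complete defence.

```shell
# Hypothetical check: accept only plain names made of letters, digits, dot,
# underscore and dash. Reject empty names, names starting with a dash,
# and the special names "." and "..".
is_safe_name() {
    case "$1" in
        ''|-*|.|..|*[!A-Za-z0-9._-]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical server: list one subdirectory of the current directory to
# whoever connects on port 2323, but only if the name passes the check.
serve_listing() {   # usage: serve_listing DIRNAME
    if is_safe_name "$1"; then
        socat TCP-LISTEN:2323,reuseaddr SYSTEM:"ls -la ./$1"
    else
        echo "refusing unsafe directory name: $1" >&2
        return 1
    fi
}
```

With this in place, "serve_listing logs" starts the listener, while "serve_listing 'logs; rm -rf /'" is refused before socat is ever run.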

netcat is often employed as a data forwarder, aka a simple proxy, listening for incoming connections only to redirect data to another destination port and/or address. It does so by going in to listen mode with -l, then using -e to invoke itself as a client. Because of the use of exec() instead of system(), you have to put the client call into a shell script. First, the client script, "nc-cli", looks like this:

#!/bin/sh
nc localhost 22


Then the call to netcat looks like this:

nc -l -p 2323 -e "./nc-cli"


This redirects incoming connections to port 2323 around to port 22.

(Sometimes you see inetd or xinetd configured to use netcat to do redirecting.)

Of course, you can implement the exact netcat behaviour with "socat STDIO EXEC:nc-cli", or even "socat TCP-L:2323 SYSTEM:'socat STDIO TCP:localhost:22'". However, there's a better way to do data forwarding with socat, which doesn't need a client shell script or even a recursive call to socat. By now you should have enough information about socat to figure it out yourself, so I'll put the example beneath a cut.
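The original example is not reproduced here, so what follows is a sketch of what such a forwarder plausibly looks like, built only from addresses and options already covered above: a listening TCP address paired directly with a connecting TCP address. The function name is made up, and fork is included so that more than one connection can be forwarded.

```shell
# Hypothetical helper: forward every connection arriving on LISTEN_PORT
# to DEST_HOST:DEST_PORT, with no client script and no second socat.
forward_port() {   # usage: forward_port LISTEN_PORT DEST_HOST DEST_PORT
    socat "TCP-LISTEN:$1,reuseaddr,fork" "TCP:$2:$3"
}

# Example: redirect port 2323 around to port 22, like the netcat version.
#   forward_port 2323 localhost 22
```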

And of course, you can replace either address with any other socat address we've already talked about (UDP, UDP-L), or ones we'll talk about in another post (e.g. SSL).

socat can also handle common forwarding requests that netcat doesn't handle. While netcat can bridge between TCP and UDP (insert the -u flag in the above netcat example as appropriate), it can only handle UDP data that is essentially connection-oriented. With socat, any other communication patterns for which UDP is commonly used are also do-able. Just replace the STDIO address in any of the examples in the socat UDP article with TCP or TCP-L addresses as appropriate.

socat can even behave as a socket gender changer! This part might be a bit confusing to understand; there used to be a file called "TCP-IP_GenderChanger_CSNC_V1.0.pdf" that described the problem, but it seems to be absent from its original location. So I shall try to describe it. The "gender" of a socket is, in this analogy, whether it is a client or a server socket. So a socket gender changer allows two client sockets to connect to each other, or two server sockets to connect to each other. In either case, the gender changer must be running on a host reachable by both clients or both servers. It can run on the same host as either pair, or on a third host. netcat can do this, but with some limitations.

Why would you need this? Off-hand, I can't think of any network protocols that would allow two clients or two servers to just start communicating. So it's not a capability in demand as often as an audio cable gender changer is. But there is one case where it may be useful. Say you have a host running a service that's hidden behind a firewall. No one can connect to the service because the firewall prevents incoming connections. It will allow outgoing connections. Now imagine you can run software on a system outside of the firewall. If you run a server-server gender changer on the external host, and a client-client gender changer on the internal host (with one client connecting to the internal service, and the other to one of the server ports on the external host), you have in effect fooled the firewall into allowing access to the internal service despite its access control rules forbidding incoming connections. The above-referenced URL has the specifics of how to use socat to do this. Notice that socat has all sorts of retry and timing options to get the desired behaviour. netcat doesn't have all these options, although you may be able to compensate for their absence with a shell script.
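Since the referenced document is gone, here is a bare-bones sketch of the two halves, assuming nothing beyond plain TCP addresses plus socat's retry options (forever and interval). The function names and the ten-second interval are my own choices for illustration. socat opens its first address before its second, so the inside half has to reach the outside half's first port before a user connects to the second one, and this simple form handles a single session.

```shell
# Hypothetical server-server half, run on the external host: wait for the
# inside host on RELAY_PORT, then for a user on USER_PORT, and join them.
gc_outside() {   # usage: gc_outside RELAY_PORT USER_PORT
    socat "TCP-LISTEN:$1,reuseaddr" "TCP-LISTEN:$2,reuseaddr"
}

# Hypothetical client-client half, run behind the firewall: dial out to the
# external host (retrying every 10 seconds until it answers), then connect
# to the internal service and join the two.
gc_inside() {   # usage: gc_inside EXT_HOST RELAY_PORT SERVICE_HOST SERVICE_PORT
    socat "TCP:$1:$2,forever,interval=10" "TCP:$3:$4"
}

# Example:
#   external host:  gc_outside 8080 2222
#   internal host:  gc_inside external.example.com 8080 localhost 22
#   user:           ssh -p 2222 external.example.com
```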

That brings to an end the direct comparison of socat and netcat functionality. There's a lot more that socat can do, which I'll address in another article (one on UDP and multicast, the other on everything else). There are some things netcat can do that I didn't discuss, like how it can do telnet negotiation or port scanning. I really consider those out of place in netcat, because they're too application focused. I tried to point out all the netcat options that are only available in some versions of netcat where appropriate. I didn't talk about the source routing ability of (again, some versions of) netcat. socat can do this too, using the ipoptions option, but it's difficult to use. Mostly, though, I don't know enough about source routing to compare the two; something to add to my list of things to figure out. Don't forget, here's the socat & UDP article.

sslrsh: Remote Shell over SSL using certificate authentication

sslrsh, which stands for SSL Remote Shell, allows you to log in to a remote system over an SSL connection, using X.509 certificates for encryption and for authentication. It's similar to SSH. sslrsh is a shell script. Most of the heavy lifting in the script is done by socat. The same script can run as both client and server.

This script signals a return to my favorite subject, tunneling. My last discussion on this subject got a bit out of hand. My last useful discussion on this subject was based on SSH, and was unique in that it worked without needing root privileges on the remote side of the tunnel. sslrsh is not actually a tunneling tool. It's a remote shell tool. But it's a good introduction for future posts that will use some of these same tools to set up VPN-style tunnels. Before I wrap up this trip through memory lane and get to the point, I want to remind you about mini_ca. We'll be needing some certificate action for this script, and for that we need a Certificate Authority. You can use mini_ca to generate the needed certificates, or you can be difficult and get them some other way.

Here's the usage & license statement:

BEGIN USAGE

Usage: sslrsh [-h]
sslrsh [-p port] [-P proxy:port] [-c /path/to/cert] [-a /path/to/cacert] remotehost
sslrsh -s [-p port] [-c /path/to/certificate] [-a /path/to/cacertificate] [ -e shell commands to execute ]

-h This helpful information.
-p Port to use (currently 1479).
-P CONNECT proxy server to use. http_proxy environment variable will be used if set, but will be overridden by this flag.
-c Path to the client or server certificate (must include key)
(currently "sslrsh-client.pem" for client, "sslrsh-server.pem" for server)
-a Path to the signing Certificate Authority's certificate
(currently "sslrsh-cacert.pem")
remotehost System to connect to (client mode)
-s Listen for connections (server mode)
-e Shell command or commands to execute as the server, defaults to "echo Welcome to sslrsh on argentina; /bin/bash --login -i"

sslrsh is copyright 2009 by Jeremy D. Impson <jdimpson@acm.org>.
Licensed under the Apache License, Version 2.0 (the License); you
may not use this file except in compliance with the License. You may
obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations
under the License.
END USAGE

sslrsh needs three files: a Certificate Authority (CA) certificate, a server certificate, and a client certificate. The client and server certificates must also have their private keys embedded within them. The CA certificate must have issued both the server and client certificates. Actually, the server instance of sslrsh needs to have the CA certificate that signed the client's certificate, and the client instance needs to have the CA certificate that signed the server's certificate. Got that? Good.

You can specify the CA Certificate with "-a", and the server/client certificate with "-c". By default the port is 1479, which can be changed with "-p". "-P" will let you specify a web CONNECT proxy for the client. If you want to run as the server use "-s", and supply a remote host destination if you want to run as a client. The server will run a bash shell by default, but you can change what it runs through the "-e" command. NOTE NOTE NOTE: the value specified with "-e" is passed to the system() call, which can have severe security repercussions, especially if in your own script that calls sslrsh, you pass unverified data as the value for "-e". (Hmm. A Taint Mode for the shell would be pretty cool.) Finally, "-h" gets the help/usage and license statement.

Here's an example. It runs the server on host "artoo", and the client on host "argentina", connecting to "artoo" using sslrsh:

BEGIN EXAMPLE
On the server

jdimpson@artoo:~$ sslrsh -s
Listening on "1479"
And on the client

jdimpson@argentina:~$ sslrsh artoo
Connecting to "artoo:1479"
welcome to sslrsh
jdimpson@artoo:~$
END EXAMPLE

You can download the sslrsh script here: http://impson.tzo.com/~jdimpson/bin/sslrsh.

Let's dive in. After setting default values, processing the command line, and providing a usage and license statement, sslrsh gets its certificate material in order. It makes sure the CA certificate is readable, and does the same for the client or server (depending on which mode it runs in) combined certificate and key file. By default, it looks for files in the current environment, with names that just happen to match up with the file names you'd get if you followed the directions below to create them with mini_ca. (What a happy coincidence.) Otherwise, you can use the "-a" flag to direct sslrsh to the CA cert you want. Similarly, the client or server cert can be set with "-c".

Then it figures out what kind of proxying, if any, should be done. If the user has the http_proxy environment variable set, that will be used. If the user specified the "-P" flag, the provided value will be used as the proxy. Regardless of which source the proxy setting comes from, it gets scrubbed and parsed. First, any URL-related text (e.g. "http://") is removed. If it matches the form "server:port", the port is stripped out and assigned to another variable.
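The scrubbing and parsing step can be sketched with plain POSIX parameter expansion. This is not the code from sslrsh; the function and variable names are invented, and it makes no attempt to handle IPv6 literals.

```shell
# Hypothetical parser: strip URL decoration from a proxy setting and split
# it into proxy_host and proxy_port (proxy_port is empty if none was given).
parse_proxy() {   # usage: parse_proxy "$http_proxy"
    proxy=$1
    proxy=${proxy#*://}      # drop a leading scheme such as "http://"
    proxy=${proxy%%/*}       # drop a trailing slash or path
    case "$proxy" in
        *:*) proxy_host=${proxy%:*}; proxy_port=${proxy##*:} ;;
        *)   proxy_host=$proxy;      proxy_port= ;;
    esac
}

# Example:
#   parse_proxy "http://proxy.example.com:3128/"
#   echo "$proxy_host $proxy_port"
```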

With all that out of the way, the script proceeds to figure out if it's meant to run as a client or a server. If as a client, it then checks the proxy settings. If present, the script forks (via the ampersand) an instance of socat that listens on a local TCP port and forwards anything sent to that port on to the specified web CONNECT proxy. It tells the proxy to redirect the connection to the final destination, as given on the command line. This is effectively a proxy to the proxy, because socat's SSL functionality doesn't know how to talk to a web proxy directly.
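A helper of that kind might look roughly like the following. This is a sketch rather than the actual sslrsh code: the function name, the variable helper_pid, and the choice to bind to the loopback address are assumptions. PROXY is socat's address for talking to a web CONNECT proxy, and proxyport sets the proxy's port.

```shell
# Hypothetical helper: listen on a local port and relay whatever arrives
# through a CONNECT proxy to the final destination. Runs in the background.
start_proxy_helper() {   # usage: LOCAL_PORT PROXY_HOST PROXY_PORT DEST_HOST DEST_PORT
    socat "TCP-LISTEN:$1,bind=127.0.0.1,reuseaddr" \
          "PROXY:$2:$4:$5,proxyport=$3" &
    helper_pid=$!
}

# Example: local port 8888 leads, via the proxy, to artoo:1479.
#   start_proxy_helper 8888 proxy.example.com 3128 artoo 1479
```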

Then the script executes socat, listening on standard input, connecting it to an SSL socket. The "-,raw,echo=0" argument to socat says: listen on standard input, turn off all processing that the TTY layer normally does, and similarly tell it not to echo input typed by the user back to the user. This is important as we want the server side to receive everything we type, and to present us with everything on the screen. The argument that starts with "SSL:..." controls the SSL connection. If no proxy was configured, the SSL connection will connect to the final destination as given on the command line. If there is a proxy, the SSL connection will connect to the listening port of the above described socat instance. Either way, the SSL connection uses both the CA certificate (for authenticating the server), and the client certificate (to present to the server). The rest of the argument uses a number of options to control the SSL connection. The "-ls" and "-lp" arguments control how socat performs logging.
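Stripped of logging flags and error handling, the client side could be approximated like this. It is a sketch, not the sslrsh source: the function name is invented, and cert, cafile and verify are the socat SSL options I'm assuming cover "client certificate", "CA certificate" and "authenticate the server". The file names are the defaults from the usage statement.

```shell
# Hypothetical client: raw terminal on one side, an SSL connection that
# presents a client certificate and verifies the server on the other.
sslrsh_client_sketch() {   # usage: sslrsh_client_sketch HOST PORT
    socat -,raw,echo=0 \
        "SSL:$1:$2,cert=sslrsh-client.pem,cafile=sslrsh-cacert.pem,verify=1"
}

# Example:
#   sslrsh_client_sketch artoo 1479
```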

It's unfortunate that we must run a second instance of socat just to perform this proxying; it would be preferable if the SSL capability within socat could utilize the proxy directly. Apparently socat version 2.0 will be able to rectify this situation.

If running as a server, sslrsh again executes socat, using it to create a listening SSL socket, then fork and exec a shell command, which by default prints a welcome message then runs an instance of the bash shell. The argument to socat that begins "SSL-L:..." tells it to listen for incoming SSL connections (from the client). Every new connection causes the process to fork and run a shell command, as described in the argument that starts "SYSTEM:...". The rest of that argument has a bunch of options ("pty,setsid,setpgid,ctty") that are some Unix/POSIX voodoo necessary to give the client the appropriate interactive shell experience with its own TTY and job control, while "stderr" makes sure the error from the shell command gets sent to the client. The "-ls" and "-lp" arguments control how socat performs logging. The two "-d" flags increase the verbosity level of the logging.
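Putting that description together, the server invocation might look something like the sketch below. It is an approximation, not the sslrsh source: the function name, the "sslrsh" log name given to -lp, and the cert, cafile and verify options are assumptions, while the SYSTEM option list is copied from the description above.

```shell
# Hypothetical server: listen for SSL connections, require a client
# certificate signed by our CA, and give each connection its own shell
# command on a pseudo-terminal.
# Note: socat splits address options on commas, so COMMANDS must not
# contain unescaped commas.
sslrsh_server_sketch() {   # usage: sslrsh_server_sketch PORT 'COMMANDS'
    socat -d -d -ls -lp sslrsh \
        "SSL-L:$1,reuseaddr,fork,cert=sslrsh-server.pem,cafile=sslrsh-cacert.pem,verify=1" \
        "SYSTEM:$2,pty,setsid,setpgid,ctty,stderr"
}

# Example:
#   sslrsh_server_sketch 1479 'echo Welcome to sslrsh; /bin/bash --login -i'
```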

Although I think sslrsh is a neat script, and it has its uses as a lightweight and customizable remote access server, be aware of its limitations. It doesn't do tunneling/port forwarding like SSH does. It may violate the access policy of the system you're running the server on (if you aren't its administrator). It doesn't have robust error checking, especially in the set up of the proxy process, so it's not an easily supportable, enterprise-quality service.

Also, it doesn't matter which client certificate you use to connect to the server, the server will authenticate you as long as the CA created the client certificate. There's no differentiation between clients. A "normal" SSL application would actually read a certificate after validating it. We're not doing that here. It would be nice if the contents of the validated certificate were made available to our server. One way to do this would be to have an option to the SSL function that runs a script after a cert is validated, and is fed the certificate on input or in the environment. The script would return true or false to specify whether the certificate should be accepted. Or, if the SSL function placed the validated certificate, or even just the "Subject" line in the certificate, into an environment variable, our shell (as specified by "-e") could use it to make decisions.

But probably the biggest limitation is that, by default, when you log in to the sslrsh server, the shell you get will be running as the user who started the sslrsh server. However, here's an alternative way to run the server which makes it prompt for username and password (in addition to the certificate-based authentication). For this to work, the server has to be run as root. It works by replacing the call to the bash shell with a call to the login program. login prompts for username and password, checks them against the server system's password mechanism, then uses setuid() to become whatever username was provided.

BEGIN EXAMPLE
On the server

jdimpson@artoo:~$ sudo ./sslrsh -s -e "/bin/login"
Listening on "1479"
On the client, you can see the change

jdimpson@argentina:~$ sslrsh artoo
Connecting to "artoo:1479"
via "localhost:8888" proxy
artoo login: sysadmin
Password:
Last login: Wed Nov 26 09:34:29 EST 2008 from argentina.apt.net on pts/8
Linux artoo 2.6.24-22-generic #1 SMP Mon Nov 24 18:32:42 UTC 2008 i686

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

To access official Ubuntu documentation, please visit:
http://help.ubuntu.com/
You have mail.
sysadmin@artoo:~$

END EXAMPLE

A note about socat: Its man page describes it like this:

socat - Multipurpose relay (SOcket CAT)

Socat is a command line based utility that establishes two bidirectional
byte streams and transfers data between them. Because the streams can be
constructed from a large set of different types of data sinks and sources
(see address types), and because lots of address options may be applied to
the streams, socat can be used for many different purposes. It might be
one of the tools that one ‘has already needed´.

That's accurate, but doesn't really capture the scope of socat's capability. Just like its namesake cat, socat is all about connecting standard input to standard output. Unlike cat, which primarily operates on files and TTYs, socat can operate on (and create, if necessary) files, TTYs, TCP sockets, UDP sockets, raw sockets, Domain sockets, raw IP frames, child processes, named pipes, arbitrary file handles, as well as standard input, output, and error. Additionally, socat knows about a few application-level protocols like SSL, Web CONNECT Proxy, and SOCKS. It also knows about various network access methods, like IPv4, IPv6, IP Multicast, IP Broadcast, and datagram- and session-style transmit and receive. All of these communication mechanisms are called addresses, and socat has a rich set of options that apply to numerous address types, such as setting device-specific ioctl()s, IP headers, socket options, and security ciphers, and performing functions like fork(), chroot(), and setuid() in order to get various security and performance behaviours.

Folks who know about netcat might wonder how it compares to socat. Given socat, you don't need netcat, although netcat is a lot simpler to use if all you need is basic TCP or UDP streaming. Personally, I plan on keeping netcat in my own personal toolbox, if only because I can bang out a netcat command line without any conscious thought.

A note on certificates: As mentioned above, the client and server need certificates & keys, and they both need access to a Certificate Authority certificate in order to validate each other's certificates. Here's how to use mini_ca to create the necessary certs. (Follow the link to find out where to download mini_ca.)

Create the Certificate Authority, and the CA certificate, like this:

BEGIN EXAMPLE
END EXAMPLE

You'll find the CA certificate at "sslrsh-ca/sslrsh-cacert.pem".

Use the CA to create a server certificate/key.

BEGIN EXAMPLE
END EXAMPLE

You'll find the server certificate/key at "sslrsh-ca/certs/sslrsh-server.pem".

Use the CA to create a client certificate/key.

BEGIN EXAMPLE
END EXAMPLE

You'll find the client certificate/key at "sslrsh-ca/certs/sslrsh-client.pem".

Interlude: SSH over SSL and the USSR

One day I needed to tunnel an SSH session through an SSL connection.  "Why?", you ask?  Good question.  The short answer is "Because of Communism".  Here's the long answer....

Everyone who is awesome knows that SSH is awesome.  However, there are some sub-awesome people who wish to prevent you from using SSH, complaining about something to do with X11 forwarding or service tunneling or unregulated VPNs, yadda yadda yadda.  So they block outbound port 22 traffic.  In fact, these same sorts of people tend to block EVERYTHING, including web traffic.  These Apparatchiks eventually realize that there's no sense in paying for an Internet connection if you're going to block all use of it, so they grudgingly set up an HTTP proxy server.  That's great for the unwashed proletariat masses who only use the web, but not so much fun for us elite Internet Power Users who want to login to a remote system and run VNC to see our desktop.  We need a way to get our SSH traffic past the block, and the HTTP proxy is a good candidate to do that.

The cold war has begun.

HTTP proxies work by making the web client connect to the proxy and issue GET and POST requests.  The proxy then connects to the intended web server and forwards the requests to it.  The responses are forwarded back.  Unfortunately, unencrypted HTTP traffic, although TCP based, is request-response based and has short-lived sessions.  So the GET and POST functions of the proxy are not easily amenable to tunneling (although it isn't impossible; google "httptunnel" some time).

That's OK, most HTTP proxies also have to handle encrypted HTTP, aka HTTPS, aka HTTP over SSL (aka TLS).  Because in HTTPS the encryption occurs at the session level (rather than the message level), a proxy can't interfere with encryption negotiation, else it will fail--the user's web browser will report that the server's hostname doesn't match the hostname listed in the certificate, which is either a sign you're being "man-in-the-middled", or that the web server is poorly administered.  So most web proxy servers provide a CONNECT function.  The CONNECT function is the web client's way of telling the proxy to "f** * off".  No wait, I mean to "give me a two-way, persistent tunnel to the following destination...".  Perfect.  So we wrap a CONNECT call around the SSH connection.

I'm sure that I was NOT the first to discover this use of the HTTP Proxy CONNECT function, but I did figure it out independently from anyone else.  My first solution was called "nc-ssh-inetd".  It ran out of inetd or xinetd, listening on port 2222, and using netcat (nc) to connect to the proxy.  It looked like this:

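BEGIN CODE nc-ssh-inetd
#!/bin/sh
# Runs from inetd/xinetd, so stdin/stdout are the local SSH client's socket.

(
 # Send the CONNECT request and headers, ending with a blank line.
 echo 'CONNECT impson.tzo.com:22 HTTP/1.0';
 echo 'pragma: No-Cache';
 echo 'proxy-connection: Keep-Alive';
 echo 'user-agent: I_OONZ_JU/0.0';
 echo ;

 # Then relay everything the SSH client sends.
 cat;
 # On the way back, discard the proxy's three response lines and relay the rest.
) | nc proxy 80 | ( read x; read x; read x; cat )
END CODE nc-ssh-inetd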

Then I'd run "ssh -p 2222 localhost", which would connect to this script.  This script connects to the HTTP proxy called "proxy" on port 80.  It would send it the CONNECT command telling it the SSH server & port to connect to, plus some other headers (including my hilarious made-up user-agent) followed by a blank line.  It would read (and throw away) three lines from the proxy, which were the responses to the CONNECT command.  Then it would just read and write data in both directions (the two calls to "cat").  This left the SSH client and server free to do their thing.  An ugly hack, but it worked.

Other people solved this more elegantly.  PuTTY, my second favorite SSH client (after the one provided by OpenSSH), even built the CONNECT command into the product.  Better yet, OpenSSH created the ProxyCommand directive, which lets you specify an arbitrary command to use to create the network socket over which ssh will start the session.  My nc-ssh-inetd script worked unmodified that way, although I parameterized it so that it would read the SSH server & port and proxy server & port from the command line.  Someone even added CONNECT support directly to netcat, but modifying Hobbit's 1.10 version of netcat is sacrilegious and I'll have no part of it.
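
For reference, here's roughly what a parameterized version could look like. This is a sketch, not the script I actually used: the function names and the argument order (proxy, proxy port, SSH server, SSH port) are invented for illustration, and instead of assuming exactly three response lines it throws away everything up to the first blank line.

```shell
#!/bin/sh
# Hypothetical parameterized CONNECT wrapper for use with ProxyCommand.
# Usage (assumed): nc-ssh-proxy PROXY PROXY_PORT SSH_HOST SSH_PORT

# Print the CONNECT request for HOST:PORT, ending with the blank line
# that terminates the headers.
connect_request() {
    printf 'CONNECT %s:%s HTTP/1.0\r\n' "$1" "$2"
    printf 'proxy-connection: Keep-Alive\r\n'
    printf '\r\n'
}

# Discard the proxy's status line and headers (everything up to the
# first blank line), then relay the rest of the stream untouched.
skip_response() {
    cr=$(printf '\r')
    while IFS= read -r line; do
        line=${line%"$cr"}          # strip the trailing carriage return
        [ -z "$line" ] && break     # blank line: end of proxy response
    done
    cat
}

# Wire it together: request plus client data go to the proxy via nc,
# and the proxy's reply (minus its response headers) comes back.
tunnel() {
    { connect_request "$3" "$4"; cat; } | nc "$1" "$2" | skip_response
}

# Only start the tunnel when all four arguments are given.
if [ "$#" -eq 4 ]; then
    tunnel "$@"
fi
```

Saved as an executable called, say, nc-ssh-proxy, it would be used as "ssh -o ProxyCommand='nc-ssh-proxy proxy 80 %h %p' user@impson.tzo.com".  OpenSSH substitutes %h and %p with the destination host and port.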

But since so many people can now easily tunnel SSH traffic through CONNECT-able HTTP proxies, the practice became widespread, to the point that even the sub-awesome Apparatchiks noticed that it was happening.  (Probably when their proxy servers started running out of free sockets because so many persistent CONNECT connections were simultaneously in use.)  Since most HTTPS traffic is destined to port 443, they tried to put a stop to SSH tunneling by modifying the HTTP proxies to disallow CONNECT calls to port 22.  (More accurately, they probably paid their proxy vendor large amounts of cash to make a very simple functional addition to their proxy product.)

The arms race escalated.

The most elite of our awesome selves, however, have a certain power.  The power of root.  Specifically, the power to arrange for SSH service on a port other than the default port 22.  This comes from either 1) moving the SSH server to another port, 2) running two instances of the SSH server (on 22 and on the new port), or 3) using port redirection rules on a firewall, to rewrite incoming packets destined to the new port to instead go to 22.  OpenSSH even lets you make one server listen on multiple ports at once.  And since SSH is so awesome, we know that we'll never use telnet again, so port 23 is a logical choice.  Of course, 23 could also be put on the CONNECT ban list.  So we use another, high-numbered port, say, 8443 (a common location for a test HTTPS server).

It might eventually come to a point where the Apparatchiks ban every port but 443.  (If they ban that, no HTTPS traffic will work.)  No problem.  If you aren't already running an HTTPS server, just use 443 for SSH.  If you are, you can make a hard choice to shut down your HTTPS server, OR if you're so awesome and elite that you already have a featureful firewall in front of your SSH/HTTPS server, you can use a custom port redirection rule that will redirect most incoming packets destined to port 443 to your HTTPS server, but any incoming port 443 packets from a certain source address (or list of addresses) to your SSH server.  Any packet coming from the Internet-facing address of the proxy server to port 443 gets redirected to port 22.  The rest remain unmodified.
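
The redirection rules could look something like the following sketch. It assumes a Linux firewall using iptables, running on the same host as sshd (a separate firewall box would use DNAT with --to-destination instead of REDIRECT). The function names and the IPTABLES variable are made up here; setting IPTABLES=echo just prints the rule arguments instead of applying them.

```shell
#!/bin/sh
# Assumptions: Linux, iptables, run as root on the host where sshd
# listens on port 22. Set IPTABLES=echo for a dry run.
IPTABLES=${IPTABLES:-iptables}

# redirect_port NEW_PORT
# Send everything arriving on NEW_PORT (e.g. 8443) to sshd on 22.
redirect_port() {
    "$IPTABLES" -t nat -A PREROUTING -p tcp --dport "$1" \
        -j REDIRECT --to-ports 22
}

# redirect_443_from PROXY_ADDR
# Send port 443 traffic to sshd only when it comes from the proxy's
# Internet-facing address; everyone else still reaches the HTTPS server.
redirect_443_from() {
    "$IPTABLES" -t nat -A PREROUTING -p tcp -s "$1" --dport 443 \
        -j REDIRECT --to-ports 22
}

# The no-firewall alternative is to let one sshd listen on several
# ports, with multiple Port lines in sshd_config:
#   Port 22
#   Port 8443
```

For example, "redirect_port 8443" covers the spare-port case, and "redirect_443_from 192.0.2.10" covers the share-443 case, where 192.0.2.10 stands in for the proxy's address.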

This worked for quite a while, to the point that the newfound Glasnost seemed permanent.  Then the hard-liners returned!

All of a sudden, outgoing CONNECT-wrapped SSH connections started blocking on startup, to eventually time out.  Normal SSH behaviour has the client connect to a server and wait for the server to initiate the protocol by sending a string specifying what version of SSH it speaks.  From there, the client and server do their dance to negotiate algorithms, determine shared secrets, verify each other's identity, then set up the login session and any desired tunnels.  Now, the client received nothing from the proxy.  The protocol negotiation couldn't continue.

It wasn't until one bold freedom fighter was able to simultaneously sniff the client and server sides of a connection attempt, using a totally different remote access technique, that the problem was understood.  On the client side, he saw the outgoing connection to the proxy.  On the server side, he saw the related incoming connection from the proxy.  And he saw the appropriate response from the server.  But he didn't see that response make it back to the client!  The client eventually gave up and broke the connection.

The hard-liners had moved beyond a list of banned ports.  They started monitoring traffic at the protocol level.  That's right: censorship!

The primary purpose of the CONNECT proxy function is to support HTTPS, specifically SSL/TLS.  In SSL the client initiates the session by sending the first pieces of information, not the server as in SSH.  So right there is a measurable behaviour, and the hard-liners were using it to once again block democracy-loving SSH sessions from succeeding.  No longer would CONNECT-wrapped SSH connections work.

One potential solution would be to somehow wrap both the SSH client and server in such a way that the client sends arbitrary data first, then starts up normal SSH.  This requires some sort of hack to be developed, or custom modification to both SSH client and server. 

Plus, if the hard-liners have learned anything through this, these new proxies should be able to positively identify and allow only legitimate SSL traffic.  If they can't do this yet, they'll learn it eventually.  May as well give them the benefit of the doubt.
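
I don't know what check the proxies actually perform, but telling the two protocols apart doesn't take much. An SSH endpoint opens with an identification string that starts with "SSH-", while an SSL/TLS client opens with a handshake record whose first byte is 0x16, followed by a 0x03 version byte. Here's a sketch of that kind of test; the function name guess_protocol is invented, and a real proxy would presumably look much deeper than the first few bytes.

```shell
#!/bin/sh
# guess_protocol: read the first four bytes on stdin and guess what is
# speaking. Purely illustrative; not what any particular proxy does.
guess_protocol() {
    # Hex-dump the first four bytes as one string, e.g. "5353482d".
    first=$(dd bs=1 count=4 2>/dev/null | od -An -tx1 | tr -d ' \n')
    case "$first" in
        5353482d) echo ssh ;;      # ASCII "SSH-" identification string
        1603*)    echo tls ;;      # handshake record, version 3.x
        *)        echo unknown ;;
    esac
}

# Example:
#   printf 'SSH-2.0-OpenSSH\r\n' | guess_protocol
```

Anything that opens a CONNECT tunnel with "SSH-" is trivially spotted this way, which is one more reason to make the tunnel contents genuine SSL.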

And THAT'S why I need to run SSH over SSL.

(How is less exciting, and will be deferred to a later post.)