Motivation

I just spent some time cleaning up my SSH config files, for example by combining a bunch of similar "Host" sections under one heading to avoid duplicating config stanzas.

I also decided to clean up known_hosts to only contain what is in my config file.  I want it to be easy to securely seed new systems, I like the "clean" feeling of knowing there is no old cruft in there, and I think having cruft is a security hole anyway.  Maybe I once accepted a certain host key, assessing that it was OK in one context, or for a temporary throwaway account – but if the host is not in my config file, I probably don't want its key permanently in my known_hosts.  That's why I want to easily know what's in it, and manipulate its contents.

My Configuration

I keep an ssh config file with hostnames, some nonstandard ports, and pointers to encrypted authentication keys I load into an SSH authentication agent. I just put it into a local git repository today, because I have found that useful for other configuration files.

Why Use HashKnownHosts?

HashKnownHosts is supposed to prevent address-harvesting attacks. The address (and port number, if non-standard) is hashed cryptographically with a salt, to make it more difficult for worms or crackers to slurp up targets.
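To make the hashing concrete, here is a minimal Python sketch of how I understand OpenSSH builds a hashed host field: a random 20-byte salt is used as the key for an HMAC-SHA1 over the host string, and both are base64-encoded into "|1|salt|digest". A non-standard port is folded into the host string as "[host]:port". The function names hash_host and matches are my own, not anything from OpenSSH.

```python
import base64
import hashlib
import hmac
import os


def hash_host(host, salt=None):
    """Return a hashed known_hosts host field for `host` (hypothetical helper)."""
    if salt is None:
        salt = os.urandom(20)  # fresh random salt for each entry
    digest = hmac.new(salt, host.encode("ascii"), hashlib.sha1).digest()
    return "|1|{}|{}".format(
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(digest).decode("ascii"),
    )


def matches(hashed_field, candidate):
    """Check whether `candidate` is the name hidden behind `hashed_field`."""
    _, magic, salt_b64, digest_b64 = hashed_field.split("|")
    if magic != "1":
        raise ValueError("unknown hash scheme")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    actual = hmac.new(salt, candidate.encode("ascii"), hashlib.sha1).digest()
    return hmac.compare_digest(expected, actual)
```

Because the salt is stored right next to the digest, anyone who can guess a hostname can confirm the guess. The hash only hides names an attacker cannot guess.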

Why Not Use HashKnownHosts?

HashKnownHosts makes it difficult to maintain a known_hosts file. Want to edit the file to remove a certain host? You can't do it by hand; you need the ssh-keygen tool to do it for you (ssh-keygen -R hostname).

Want to filter out outdated entries? You'll either have to crack the hashes, or enumerate every entry you want to keep and remove the rest by elimination (which is like cracking the hashes with a dictionary attack). This is what I had to do to recover the keys I wanted to keep, to check in a nice, clean file to git.
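The elimination approach can be sketched roughly like this in Python. It assumes the "|1|salt|digest" HMAC-SHA1 layout for hashed host fields, and the names hashed_field_matches and unhash_wanted are hypothetical. You pass in the lines of a hashed known_hosts file and the list of names you want to keep (for example, the hostnames from your config file, with "[host]:port" for non-standard ports), and get back plaintext entries for just those hosts.

```python
import base64
import hashlib
import hmac


def hashed_field_matches(field, candidate):
    """True if `candidate` is the name behind a |1|salt|digest host field."""
    _, _, salt_b64, digest_b64 = field.split("|")
    salt = base64.b64decode(salt_b64)
    actual = hmac.new(salt, candidate.encode("ascii"), hashlib.sha1).digest()
    return hmac.compare_digest(base64.b64decode(digest_b64), actual)


def unhash_wanted(lines, wanted):
    """Return plaintext versions of the hashed entries that match `wanted`.

    Entries matching none of the candidate names are dropped, which is the
    "by elimination" part. Comments and unhashed lines are skipped here.
    """
    kept = []
    for line in lines:
        if line.lstrip().startswith("#"):
            continue
        fields = line.split()
        # The hashed host field is first, or second after an @marker.
        hashed = [f for f in fields[:2] if f.startswith("|1|")]
        if not hashed:
            continue
        for name in wanted:
            if hashed_field_matches(hashed[0], name):
                # Swap the opaque field for the name we just confirmed.
                kept.append(line.replace(hashed[0], name, 1).strip())
                break
    return kept
```

The cost is one HMAC per (entry, candidate) pair, which is trivial for a personal config file. That is also exactly why the hashing does little against an attacker with a decent list of guesses.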

Want to merge hashed known_hosts files from different hosts or user accounts? You'll have to (or at least may want to, for file cleanliness) de-duplicate entries that hash the same hostname with different salts.
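A plain text-level de-duplication will not catch these, because two entries for the same host look completely different once the salts differ. Here is a rough Python sketch of a merge that does, again assuming the "|1|salt|digest" HMAC-SHA1 layout and a list of candidate names you supply. The names resolve and merge_hashed are made up for illustration, and the sketch ignores @cert-authority and @revoked lines.

```python
import base64
import hashlib
import hmac


def resolve(field, names):
    """Return the candidate name a hashed host field stands for, or None."""
    _, _, salt_b64, digest_b64 = field.split("|")
    salt = base64.b64decode(salt_b64)
    digest = base64.b64decode(digest_b64)
    for name in names:
        guess = hmac.new(salt, name.encode("ascii"), hashlib.sha1).digest()
        if hmac.compare_digest(digest, guess):
            return name
    return None


def merge_hashed(files, names):
    """Merge lists of hashed known_hosts lines, one line per (host, key)."""
    seen = set()
    merged = []
    for lines in files:
        for line in lines:
            fields = line.split()
            if len(fields) < 3 or not fields[0].startswith("|1|"):
                continue  # this sketch only handles plain hashed entries
            name = resolve(fields[0], names)
            # Entries we cannot resolve are keyed on the hashed field itself,
            # so they are kept but only merged with byte-identical copies.
            identity = (name or fields[0], fields[1], fields[2])
            if identity not in seen:
                seen.add(identity)
                merged.append(line)
    return merged
```

Keying on the key type and key blob as well as the name means a host whose key differs between the two files shows up twice, which is something you would want to look at by hand anyway.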

Want to just know what's in your file? You'll have to dictionary-attack it.

After all that inconvenience, I'm not sure hiding the connection history is even all that valuable. In my configuration, I only use cryptographic keys for authentication, which (AFAICT) requires a unique entry in my config file for every host, listing the hostname, what port to connect to, and what key to use. That's right: machine-readable instructions on how to connect to every host I connect to, with potentially more information than known_hosts has, such as a username. (The keys are encrypted, though.) That is pretty much the exact same situation as an unhashed known_hosts: connection information with just the last authentication factor (a password, or private key encryption passphrase) missing.

Maybe that means the names in the config file should be hashed, too. Besides the annoyance that would bring, this still just seems to obscure how bad it is to have your config or known_hosts files accessible to an attacker. You could just as well have a backdoored ssh in your PATH that logs everything you do, or another machine on the network sniffing for SSH packets and logging where they are going. Your shell history file may have connection details. If you're relying on unique hostnames or non-standard ports to bandaid over weak passwords or whatever, you've got worse problems.

And as I mentioned, an opaque, difficult-to-audit trusted key database could be a security liability of its own.  This file isn't some MRU auto-completion nicety, it's a security mechanism to prevent man-in-the-middle attacks.  You wouldn't want your browser keeping only the hashes of URLs you'd added SSL certificate exceptions for (such as for self-signed certs), would you?

Is it worth the extra administrative burden to obscure connection information that will probably be obtainable from other sources anyway?  I'm thinking not, so for now I've disabled HashKnownHosts (it was enabled by default in Ubuntu). Luckily, re-hashing later if I want to (ssh-keygen -H does it in one step) is easier than unhashing was.