Design

How the git client uses SSH

When a user tells git to clone, fetch from, or push to a remote over SSH, the client connects via SSH and calls the appropriate server command (one of upload-pack, receive-pack or upload-archive). These commands make the full repository available with read or write access.

githome interception

Upon receiving a connection attempt, OpenSSH consults the authorized_keys file for the user account on the server (by convention, a special user named git is often created for serving git repositories). Inside the authorized_keys file, it is possible to instruct OpenSSH to ignore the requested command and execute a different one instead, passing on the original command in the SSH_ORIGINAL_COMMAND environment variable.
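As a rough illustration of such an entry, the sketch below builds one authorized_keys line with a forced command. The command="..." option and the no-* restrictions are standard OpenSSH options; the client path /usr/local/bin/gh_client, the idea of passing a numeric key id as its argument, and the function name are assumptions made for this example, not githome's actual format.

```python
# Standard OpenSSH options that keep the key from being used for
# anything other than the forced command.
RESTRICTIONS = "no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty"


def authorized_keys_line(key_id: int, public_key: str,
                         client: str = "/usr/local/bin/gh_client") -> str:
    """Build one authorized_keys entry that forces `client` to run for this key.

    The client path and the key-id argument are hypothetical.
    """
    public_key = public_key.strip()
    if "\n" in public_key or "\r" in public_key:
        # A second line would become a separate, unrestricted entry.
        raise ValueError("public key must be a single line")
    forced = f"{client} {int(key_id)}"
    if '"' in forced:
        raise ValueError("double quotes would need escaping inside command=")
    return f'command="{forced}",{RESTRICTIONS} {public_key}'
```

Whatever command the git client asked for is ignored by OpenSSH in favour of the forced one; the forced command can then inspect the original request through SSH_ORIGINAL_COMMAND.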

githome rewrites the authorized_keys file each time a key or user is added, setting up one line for each specific key. When a connection is made, githome is called and checks whether the user associated with the connecting key is allowed to access the requested repository. Any invalid or unknown commands are rejected as well.

If authorization is granted, githome will replace itself with a call to the appropriate git server process.
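The command check can be sketched roughly as follows. This is not githome's code: the function and exception names are made up for the example, and it assumes the client sends the command in the usual form git uses over SSH, such as git-upload-pack 'alice/project.git'. Anything that is not exactly one whitelisted command plus one plain repository path is rejected.

```python
import shlex
from typing import Tuple

# The only server commands that may ever be run (as sent by git over SSH).
ALLOWED_COMMANDS = {"git-upload-pack", "git-receive-pack", "git-upload-archive"}


class Rejected(Exception):
    """Raised when the requested command must not be executed."""


def parse_original_command(original: str) -> Tuple[str, str]:
    """Split the requested command into (command, repository) or raise Rejected."""
    try:
        parts = shlex.split(original)
    except ValueError as exc:  # e.g. unbalanced quotes
        raise Rejected("malformed command") from exc

    # Exactly one command and one argument; never pass on extra parameters.
    if len(parts) != 2:
        raise Rejected("expected a command and a repository path")

    command, repo = parts
    if command not in ALLOWED_COMMANDS:
        raise Rejected("command not allowed")
    # Refuse anything that git could mistake for an option, and any
    # attempt to climb out of the repository root.
    if not repo or repo.startswith("-") or ".." in repo.split("/"):
        raise Rejected("invalid repository path")
    return command, repo
```

A real implementation would follow this with the per-user permission lookup for the parsed repository before handing control to git.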

Security

To be as secure as possible with as little work as possible, githome offloads as many security-relevant tasks as it can to established software. The only daemon facing the outside world is OpenSSH, and to gain any sort of access, a user must present a valid key for one of the githome user accounts. As long as OpenSSH is secure (and configured correctly), you are safe from access by foreign (i.e. unauthenticated) users.

Once a user is authenticated, control is passed over to githome, which authorizes or denies access based on its own rules: it checks whether the connected user has access to the repository they want to access. Since any critical bug in githome at this point could result in privilege escalation on repository access or in code execution, githome takes great care never to pass on unknown parameters and uses strictly whitelist-based access to git subsystems.

Once authentication (by OpenSSH) and authorization (by githome) are done, githome passes control to one of the git binaries via execlp(), leaving the rest of the work to git and OpenSSH.
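The hand-over itself is small. The following is a minimal sketch using Python's os.execlp(); the function name replace_process is an assumption for this example. On success the call never returns, because the calling process image is replaced by the new program while keeping the same process id and the same stdin/stdout that OpenSSH connected.

```python
import os
from typing import Sequence


def replace_process(argv: Sequence[str]) -> None:
    """Replace the current process with the program named by argv[0].

    Returns only by raising: OSError if the program cannot be executed,
    ValueError if argv is empty.
    """
    if not argv:
        raise ValueError("argv must contain at least the program name")
    # First argument: program to look up on PATH.
    # Remaining arguments: the new process's argv, starting with argv[0].
    os.execlp(argv[0], *argv)
```

After a successful authorization this might be called as replace_process(["git-upload-pack", "/srv/git/alice/project.git"]), where the path is purely illustrative. The arguments are passed as a list and never go through a shell, so no quoting or interpolation is involved.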

Performance

githome was created when I wanted to host git repositories on a Raspberry Pi 1 and could not coax gitlab into working well enough on it. Here’s why githome runs well on a single-core 800 MHz ARM system with 512 MB RAM:

  • It lets SSH do the heavy crypto lifting. No number crunching is done by githome to de- or encrypt stuff.
  • execlp() is used to replace the running githome process with the appropriate git binary once all authorization checks have completed. At this point no extra in-memory copying is done and things run as if githome had never been run as an intermediary.

Two clients

When a user connects via SSH, a client has to be launched to perform githome authorization. There are two available: githome shell and gh_client. The githome shell works without any additional setup and authorizes the requests itself. It has one drawback: the full Python interpreter and the SQLAlchemy library must be loaded, which can take up to 5 seconds on a slow SD card. This feels very slow.

By default, the alternative gh_client is enabled. It needs a running githome server to go with it. The server is the heavier Python application, which is meant to be run as a daemon. Upon connection, only gh_client, a very small and fast client written in C, needs to be launched. It connects to the running server via a UNIX domain socket and waits for an OK or an authorization error.

If no error occurs, it calls execvp() to replace itself with the appropriate git server process.
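The exchange between client and server can be pictured with the sketch below. The real gh_client is written in C and its wire format is not described here, so everything about the protocol is an assumption for illustration: a single request line of the form "<key id> TAB <original command>", answered with a line that is either OK or an error message. The function name and socket path are likewise hypothetical.

```python
import socket


def request_authorization(socket_path: str, key_id: int,
                          original_command: str, timeout: float = 5.0) -> bool:
    """Ask the daemon listening on a UNIX domain socket whether access is allowed.

    Hypothetical protocol: one request line, one reply line ("OK" or an error).
    """
    if "\n" in original_command or "\t" in original_command:
        # Would break the line-based framing assumed here; deny outright.
        return False
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(socket_path)
        sock.sendall(f"{int(key_id)}\t{original_command}\n".encode("utf-8"))
        # Read a single reply line from the server.
        with sock.makefile("rb") as reply_stream:
            reply = reply_stream.readline()
    # Anything other than an explicit OK counts as a denial.
    return reply.strip() == b"OK"
```

The point of the split is that the per-connection process only needs to open a socket, write one line and read one line; the database access stays in the long-running daemon.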

Alternate design

githome previously favored a different architectural approach [1], using paramiko to provide a complete SSH daemon itself instead of relying on OpenSSH. The advantages of this were less “clutter” in the system, no crutches in the form of hooks to rewrite files like .ssh/authorized_keys, and more fine-grained control over the process. It also allowed different ways for users to log in, password-based logins if desired, and the ability to assign the same key to different users.

Ultimately, the approach failed though. While writing the software, some issues were uncovered with paramiko. While these could be solved, they made the author nervous about using it for a critical server implementation. General paramiko development seems to be much more focused on the client side.

As an example, if one follows the sparse docs on key handling, it's fairly easy to confuse the abstract base class for public keys with an actual implementation. Paramiko quietly accepts this and performs a key check that always compares empty keys, rendering all authentication moot.

To avoid any errors of this kind, githome is for now based on OpenSSH. This also reduced complexity quite a bit, as gevent or other libraries for parallel processing were no longer required, avoiding another host of problems.

While the paramiko-based design is in some ways more interesting, it requires a lot more auditing effort and presents a larger attack surface. It was shelved for this reason.

[1] Which can be seen up until revision fbe3f35.