SipHash is designed as a non-cryptographic hash function. Although it can be used to ensure security, SipHash is fundamentally different from cryptographic hash functions like Secure Hash Algorithms (SHA) in that it is only suitable as a message authentication code: a keyed hash-function-like hash message authentication code (HMAC). That is, SHA is designed so that it is difficult for an attacker to find two messages X and Y such that SHA(X) = SHA(Y), even though anyone may compute SHA(X). SipHash instead guarantees that, having seen Xi and SipHash(Xi, k), an attacker who does not know the key k cannot find (any information about) k or SipHash(Y, k) for any message Y ∉ {Xi} which they have not seen before.
Overview
SipHash computes a 64-bitmessage authentication code from a variable-length message and 128-bit secret key. It was designed to be efficient even for short inputs, with performance comparable to non-cryptographic hash functions, such as CityHash;[4]: 496 [2]
this can be used to prevent denial-of-service attacks against hash tables ("hash flooding"),[5] or to authenticate network packets. A variant was later added which produces a 128-bit result.[6]
An unkeyed hash function such as SHA is collision-resistant only if the entire output is used. If used to generate a small output, such as an index into a hash table of practical size, then no algorithm can prevent collisions; an attacker need only make as many attempts as there are possible outputs.
For example, suppose a network server is designed to be able to handle up to a million requests at once. It keeps track of incoming requests in a hash table with two million entries, using a hash function to map identifying information from each request to one of the two million possible table entries. An attacker who knows the hash function need only feed it arbitrary inputs; one out of two million will have a specific hash value. If the attacker now sends a few hundred requests all chosen to have the same hash value to the server, that will produce a large number of hash collisions, slowing (or possibly stopping) the server with an effect similar to a packet flood of many million requests.[7]
By using a key unknown to the attacker, a keyed hash function like SipHash prevents this sort of attack. While it is possible to add a key to an unkeyed hash function (HMAC is a popular technique), SipHash is much more efficient.
Functions in SipHash family are specified as SipHash-c-d, where c is the number of rounds per message block and d is the number of finalization rounds. The recommended parameters are SipHash-2-4 for best performance, and SipHash-4-8 for conservative security. A few languages use Siphash-1-3 for performance at the risk of yet-unknown DoS attacks.[8]
^ ab"SipHash: a fast short-input PRF". 2016-08-01. Archived from the original on 2017-02-02. Retrieved 2017-01-21. Intellectual property: We aren't aware of any patents or patent applications relevant to SipHash, and we aren't planning to apply for any. The reference code of SipHash is released under CC0 license, a public domain-like license.
^Aumasson, Jean-Philippe (veorq) (Nov 12, 2015). "Comment on: change Siphash to use one of the faster variants of the algorithm (Siphash13, Highwayhash) · Issue #29754 · rust-lang/rust". GitHub. Retrieved 28 February 2024. SipHash designer here, haven't changed my opinion about SipHash-1-3 :-) [...] There's a "distinguisher" on 4 rounds[...], or in simplest terms a statistical bias that shows up given a specific difference pattern in the input of the 4-round sequence. But you can't inject that pattern in SipHash-1-3 because you don't control all the state. And even if you could inject that pattern the bias wouldn't be exploitable anyway.