Basic cryptography: hash, digital signature, MAC, symmetric keys

Message hash

What’s a hash? It is the output of a one-way (non-reversible) mathematical function that takes input data of arbitrary length and returns a fixed-length bit sequence. Examples of hash algorithms: MD5, SHA1, SHA256. An MD5 hash is 128 bits long, a SHA1 hash is 160 bits long and a SHA256 hash is 256 bits long.
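As a small illustration of the fixed output length, the following sketch hashes the same input with the three algorithms named above, using Python's standard hashlib module. The sample message and the variable names are only assumptions made for the example.

```python
import hashlib

message = b"hello world"  # sample input, chosen only for this example

# Each algorithm maps input of any length to a digest of fixed size.
digest_bits = {}
for name in ("md5", "sha1", "sha256"):
    digest = hashlib.new(name, message).digest()
    digest_bits[name] = len(digest) * 8
    print(name, digest_bits[name], "bits:", digest.hex())

# A much longer input still produces a digest of the same size.
long_digest = hashlib.sha256(b"a" * 1_000_000).digest()
print("sha256 of 1 MB:", len(long_digest) * 8, "bits")
```

Whatever the input size, the digest length depends only on the algorithm.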

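Given the hash output, you cannot get back to the original message. But the same message (or data in general) always gives the same hash when hashed with the same algorithm. In addition, with a secure hash function it is computationally infeasible to find two different inputs that produce the same hash (a “collision”). Collisions must exist in theory, because infinitely many possible inputs map to a fixed number of outputs, but for an algorithm such as SHA256 nobody knows how to find one in practice. Practical collisions have been found for MD5 and SHA1, which is why these two are no longer recommended for security purposes. So, if you receive a message together with a hash, you only need to calculate the hash independently with the same algorithm. If the 2 hashes match, the message is intact; otherwise it has been tampered with in some way. Note that this works only if the hash itself reaches you through a channel that an attacker cannot modify, since anyone can recompute the hash of an altered message.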
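The check described above could look roughly like this. It is a minimal sketch that assumes SHA256 and invents the function names sha256_hex and is_intact and the sample messages for the purpose of the example.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA256 hash of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

def is_intact(data: bytes, expected_hex: str) -> bool:
    """Recompute the hash independently and compare it with the received one."""
    return sha256_hex(data) == expected_hex

original = b"Pay Bob 10 EUR"
received_hash = sha256_hex(original)  # hash sent along with the message

print(is_intact(original, received_hash))             # unchanged message
print(is_intact(b"Pay Bob 1000 EUR", received_hash))  # altered message
```

The first check succeeds and the second fails, because the altered message no longer hashes to the received value.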

Another important property of a hash is that a small change to the message changes the resulting hash value significantly. This, together with the above, means that a hash can be used to check the integrity of a message.
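To make this property concrete, the sketch below counts how many of the 256 output bits of SHA256 differ between two messages that differ by a single letter. The helper name bit_difference and the sample sentences are assumptions made for the example.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bits that differ between the SHA256 hashes of a and b."""
    hash_a = int.from_bytes(hashlib.sha256(a).digest(), "big")
    hash_b = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(hash_a ^ hash_b).count("1")

# The two inputs differ by one letter only ("fox" vs "fax").
changed = bit_difference(b"The quick brown fox", b"The quick brown fax")
print(changed, "of 256 bits differ")
```

For a good hash function roughly half of the output bits are expected to flip, even for a one-letter change.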

Encryption

What does it mean to use encryption? It means replacing the real text of a message (the plaintext) with a different one (the ciphertext) obtained from an algorithm and a key, in such a way that deciphering it without the right key is impossible or would take an impractically long time.

This means that encryption ensures the secrecy of a message, but not anonymity: anyone can see that a message has been sent by the sender, but no one except the receiver is able to read its contents.

There are 2 types of encryption: symmetric and asymmetric.

Symmetric encryption

Symmetric encryption is the faster of the two. It uses only 1 key, for both encryption and decryption. Both the sender and the recipient must know the key and keep it secret; no one else must know it.
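The idea of one shared key for both directions can be sketched with a toy cipher. Python's standard library has no AES, so this example swaps in a simple XOR with a random key as long as the message (a one-time pad). It is not one of the algorithms discussed here, and the names xor_bytes and shared_key are assumptions made for the example.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the matching byte of key."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))

plaintext = b"meet me at noon"

# The single secret key that sender and recipient must both hold.
shared_key = secrets.token_bytes(len(plaintext))

ciphertext = xor_bytes(plaintext, shared_key)   # sender encrypts
recovered = xor_bytes(ciphertext, shared_key)   # recipient decrypts

print(ciphertext.hex())
print(recovered)
```

The same function and the same key are used in both directions, which is exactly what makes sharing the key the delicate part.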

The risky and delicate part of symmetric encryption is sharing the key between the two participants in the conversation: there is only one key, and if it is compromised the whole conversation is compromised. Several mathematical methods have been invented to let the two ends of a conversation agree on a secret key over an insecure medium (e.g. the Diffie-Hellman key exchange).
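A Diffie-Hellman exchange can be sketched in a few lines. The parameters below (the prime P, the base G) and the names alice and bob are assumptions chosen for illustration only; real deployments use standardized, much larger groups and also authenticate the exchanged values.

```python
import secrets

# Toy public parameters, known to everyone (illustration only).
P = 2**127 - 1   # a prime modulus, far too small for real use
G = 3            # public base

def make_keypair() -> tuple[int, int]:
    """Pick a random private exponent and derive the public value."""
    private = secrets.randbelow(P - 3) + 2   # in the range [2, P - 2]
    public = pow(G, private, P)
    return private, public

alice_private, alice_public = make_keypair()
bob_private, bob_public = make_keypair()

# Only the public values travel over the insecure medium.
alice_secret = pow(bob_public, alice_private, P)
bob_secret = pow(alice_public, bob_private, P)

print(alice_secret == bob_secret)
```

Both sides arrive at the same number without ever sending it, and that number can then be turned into a symmetric key.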

Examples of symmetric encryption algorithms: AES, DES (DES is now considered insecure because of its short 56-bit key).

Asymmetric encryption

Asymmetric encryption is slower than symmetric encryption. Each user has 2 keys (a public/private pair), so a conversation between two users involves 4 keys in total. The 2 keys of a pair are mathematically linked. The most important property is that what is encrypted with the public key can only be decrypted with the corresponding private key, and (in algorithms such as RSA) vice versa.
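The link between the two keys of a pair can be shown with textbook RSA on very small numbers. This is only a sketch: the primes, the helper name apply_key and the sample value are assumptions made for the example, and real RSA uses keys of thousands of bits together with padding.

```python
# Textbook RSA with tiny primes; illustration only, never use for real data.
p, q = 61, 53
n = p * q                  # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)

public_key = (n, e)
private_key = (n, d)

def apply_key(value: int, key: tuple[int, int]) -> int:
    """Raise value to the key's exponent modulo the key's modulus."""
    modulus, exponent = key
    return pow(value, exponent, modulus)

message = 65  # a message encoded as a number smaller than n

# Encrypt with the public key, decrypt with the private key.
ciphertext = apply_key(message, public_key)
decrypted = apply_key(ciphertext, private_key)

# The reverse direction also works, which is what signatures rely on.
transformed = apply_key(message, private_key)
recovered = apply_key(transformed, public_key)

print(decrypted, recovered)
```

Both round trips return the original number, but only the holder of the private key can perform the private-key step.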

The public key, as the name suggests, is not a secret. The owner of the key pair can publish his public key on his website or anywhere else, and in fact he must do so in order to receive encrypted messages that can only be decrypted with his private key. It’s like an address or a mailbox. The private key, as the name suggests, must be kept secret (it is the personal cryptographic secret). If the owner’s private key is compromised, whoever obtains it can read the messages that were encrypted for him. In case of compromise a new key pair must be generated.

To communicate with asymmetric encryption, simply exchange public keys through any medium, even an insecure one. Unlike with symmetric encryption, you can do this in the open: no one can read your messages if all they know is your public keys.

If you want to send a message, use the other person’s public key to encrypt it and send it to him. Only he can unlock the message with his private key and read it.

If you want to receive a message, the other person must use your public key to encrypt it, which means that only your private key can decrypt it.

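Examples of asymmetric encryption algorithms: RSA, ElGamal. (DSA is also an asymmetric algorithm, but it can only be used for digital signatures, not for encryption.)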

Of course, most of the time this is done automatically via email or other forms of messaging. You may have already communicated with asymmetric or symmetric encryption without manually encrypting and decrypting a message.

Integrity of digitally signed messages

When we talk about the integrity of digitally signed messages, we mean giving the recipient guarantees of: 1) authentication, 2) integrity and 3) non-repudiation. A digital signature is produced by encrypting a hash of the entire message with the sender’s private key. The message is sent together with this digital signature.

In practice you create a hash for your message with a certain hashing algorithm (message digest) and then this hash is encrypted with your private key.

The recipient of your message uses your public key to decrypt the signature and so obtains the message digest. He then independently applies the same hashing algorithm that was used by the sender (e.g. MD5 or SHA1) to the message he received. If the hash he calculated matches the hash he obtained by decrypting your digital signature with your public key, it means not only 1) that the message was sent by you, since only you have the private key that produced the signature, but also 2) that the message he received is the original one you wanted to send, since its hash matches the hash in the signature.
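The sign-and-verify procedure described above can be sketched with textbook RSA and SHA256. Everything here is a simplified assumption for illustration: the two primes are small known primes, the hash is reduced modulo N so that it fits the toy key, and the function names are invented for the example. Real signature schemes use much larger keys and a padding scheme.

```python
import hashlib

# Toy RSA key pair built from two known primes (illustration only).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent (Python 3.8+)

def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce the hash so it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Sender: transform the message hash with the private key."""
    return pow(digest_as_int(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Recipient: undo the signature with the public key and compare hashes."""
    return pow(signature, E, N) == digest_as_int(message)

message = b"I owe Alice 5 EUR"
signature = sign(message)

print(verify(message, signature))                # original message
print(verify(b"I owe Alice 0 EUR", signature))   # altered message
```

Verification needs only the public values N and E, while producing a valid signature requires the private exponent D.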

Because only your private key could have produced the signature, you cannot later deny having sent the message (non-repudiation). Third goal reached.

But there’s one more thing to solve: how can others be sure that the public key you publish (i.e. on an insecure public medium) hasn’t been altered, or that someone hasn’t generated their own key pair and declared that its public key is yours? In that case they could decipher the messages that are meant for you, and you couldn’t. That’s where digital certificates come in.

Digital certificates

Digital certificates ensure that the public key published in your name is actually certified as your public key by a trusted third party. Obviously this is a trust relationship and not a mathematical one.

Digital certificates include information about the public key, information about the identity of its owner (called the subject) and the digital signature of an entity that verified the certificate’s content, a third party (called the issuer). If the signature is valid, and the software examining the certificate trusts the issuer, then it can use that public key to communicate securely with the subject of the digital certificate.
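The structure described above can be sketched as a small data class plus a check against a list of trusted issuers. This is a toy model, not the X.509 format used in practice: the field names, the "Example CA" issuer, the toy RSA key and the functions issue and is_trusted are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass

# Toy RSA key pair of the issuer (illustration only).
ISSUER_N = (2**61 - 1) * (2**89 - 1)
ISSUER_E = 65537
ISSUER_D = pow(ISSUER_E, -1, (2**61 - 2) * (2**89 - 2))

@dataclass(frozen=True)
class Certificate:
    subject: str                         # identity of the key owner
    subject_public_key: tuple[int, int]  # the key being certified
    issuer: str                          # third party that verified the content
    signature: int                       # issuer's signature over the fields above

def content_hash(subject: str, public_key: tuple[int, int], issuer: str) -> int:
    """Hash the certificate content, reduced to fit the toy modulus."""
    encoded = f"{subject}|{public_key[0]}|{public_key[1]}|{issuer}".encode()
    return int.from_bytes(hashlib.sha256(encoded).digest(), "big") % ISSUER_N

def issue(subject: str, public_key: tuple[int, int]) -> Certificate:
    """Issuer signs the content hash with its private key."""
    issuer = "Example CA"
    signature = pow(content_hash(subject, public_key, issuer), ISSUER_D, ISSUER_N)
    return Certificate(subject, public_key, issuer, signature)

# Issuer public keys that the examining software has decided to trust.
TRUSTED_ISSUERS = {"Example CA": (ISSUER_N, ISSUER_E)}

def is_trusted(cert: Certificate) -> bool:
    """Accept only a valid signature from an issuer we already trust."""
    issuer_key = TRUSTED_ISSUERS.get(cert.issuer)
    if issuer_key is None:
        return False
    n, e = issuer_key
    expected = content_hash(cert.subject, cert.subject_public_key, cert.issuer)
    return pow(cert.signature, e, n) == expected

cert = issue("alice.example", (3233, 17))
forged = Certificate("alice.example", (9999, 17), cert.issuer, cert.signature)
print(is_trusted(cert), is_trusted(forged))
```

Swapping in a different public key invalidates the issuer's signature, so the forged certificate is rejected.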

Integrity of messages with HMAC

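But what if you want to guarantee the integrity of the message, but you don’t need to prove its authorship to third parties and you just want the processing to be faster? This is where HMAC comes in. Unlike a digital signature, an HMAC does not use a private key: it mixes a shared symmetric key into the calculation of the message hash, so only someone who knows the key can produce or check the resulting code. Strictly speaking nothing is encrypted; the key is simply part of the input to the hash calculation.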

Because the code is computed with a symmetric key, the authorship of the message cannot be proven to belong to you: you are not the only one who has access to that key. The recipient has the same key and could have created the message himself. On the other hand, the two of you are the only entities that have the key. So, unless the key has been compromised, if you receive a message with a valid code and you know you didn’t write it, you can be sure it came from the other person who holds the key.

The H in HMAC stands for hash and MAC stands for message authentication code, a code that guarantees both the integrity and the authenticity of the data by allowing verifiers who hold the secret key to detect any changes to the message content. A MAC scheme usually has 3 parts: a key generation algorithm, a signing algorithm (which produces the code, also called a tag) and a verification algorithm.
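The three parts of a MAC map naturally onto Python's standard hmac module. The sketch below assumes HMAC with SHA256 and a 32-byte key; the function names generate_key, make_tag and verify_tag are assumptions made for the example.

```python
import hashlib
import hmac
import secrets

def generate_key() -> bytes:
    """Key generation: a random secret shared by sender and recipient."""
    return secrets.token_bytes(32)

def make_tag(key: bytes, message: bytes) -> bytes:
    """Signing: compute the HMAC-SHA256 tag of the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Verification: recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

shared_key = generate_key()
message = b"transfer 10 EUR to Bob"
tag = make_tag(shared_key, message)

print(verify_tag(shared_key, message, tag))                     # intact
print(verify_tag(shared_key, b"transfer 99 EUR to Bob", tag))   # altered
print(verify_tag(generate_key(), message, tag))                 # wrong key
```

Only a holder of the shared key can produce a tag that verifies, and since both parties hold it, the tag cannot prove to an outsider which of the two wrote the message.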

Summarizing: