ITS 2110 - Introduction to Network Security

Chapter 3 - Basic Cryptography

Objectives:

This lesson begins our discussion of cryptography. Objectives important to this lesson:

  1. Cryptography
  2. Hashing algorithms
  3. Symmetric algorithms
  4. Asymmetric algorithms
  5. File and file system cryptography
  6. Whole disk encryption
Concepts:

Chapter 3 begins with some stories about encrypting laptops, and the dangers of not doing so, but the text does not return to this particular topic for many pages. The author backs up a few steps to discuss cryptography in general. Cryptography (hidden writing) is defined on page 101 as the scrambling of a message. This does not explain the idea.

A classic example, the cipher used by Julius Caesar is clearer. It will be helpful first to define a cipher as a substitution of characters. In the example in the text, each letter in the original message (called cleartext or plaintext) is changed to another character by a specific algorithm (method), resulting in an encrypted message that displays in ciphertext. A Caesar cipher is easy to break, since the same cipher letter is substituted each time the same plaintext letter occurs in the message. The cipher is even more vulnerable due to its simple algorithm: add three to each plaintext letter, wrap to front of alphabet as needed (for x, y, and z). It would be harder to break if a random substitution of letters had been used. The example on page 103 is not a classis cipher, since the author is subtituting two digits for each letter. Ciphers have evolved a good bit, as we will see.

Messages can also be scrambled by steganography. (covered writing?) The text briefly reviews a few methods of doing this: hiding a message in unused parts of a file, hiding it in metadata, hiding it where the uninformed would not look, and hiding it in images. The short form is that an image typically has three bytes (RGB) of color information for each pixel in it. It is unlikely that anyone just looking at an image could tell the difference between pixels that are true to color and those that have had each of their least significant color bits changed as needed to hide/provide data. If you change one bit per color, you can hide one byte every three pixels.

Imagine that the table below represents a series of pixels. I have used cells in a table to make the idea more visual. I have put a reference color in the first cell: hex code 58C314 stands for 111, because I chose that color as the key. I have modified the color in each of the other cells in the second row to indicate three bits. The bits are indicated by the color's deviation from the key color.

58c314
refcolor

57c313

58c213

58c313

58c314

57c313

57c214
111 010 100 110 111 010 001


58c213

58c214

57c314

58c214

58c213

58c313

57c313

58c314
100 101 011 101 100 110 010 111

The binary code for that sequence, which would have taken 15 pixels, is:

  • first three ones are reference
  • 010 100 11
  • 0 111 010 0
  • 01 100 101
  • 011 101 10
  • 0 110 010 1
  • last two ones are padding

This example used seven variations on one color. The sender could send an image in which every pixel was modified if the receiver already had a reference copy to the image for comparison. I have done this by hand: an application that encrypts a message in an image or audio file would be much faster.

On page 189, the text lists five kinds of security that cryptography might provide, and notes that not all kinds of cryptography provide all five features. (The next kind discussed in the text provides only one.)

  • confidentiality - only those with the decryption algorithm can read it
  • integrity - the message is protected from changes (unless a man in the middle has the encryption algorithm)
  • availability - anyone with the decryption algorithm can read it
  • authenticity - a secure algorithm proves the message was sent by a trusted party
  • non-repudiation - the encryption may prove who did something and when it was done
Hashing

The text continues with a general discussion of hashing, which then branches into specific types. Hashing is defined on page 190 as creating a unique encrypted result from a data set. The encrypted result is called a hash, a signature, or a digest, all of which mean the same thing. The hash should not resemble the plaintext in appearance or in length. Note that the text also refers to this method as a one-way hash, which means that the hash resulting from the algorithm is not meant to be unencrypted. The reason for this becomes clear in the example of its use with an ATM card.

Assume you have an ATM card that is assigned a Personal ID Number (PIN). The card provider used a hash algorithm to create a hash from your PIN which is stored on the card on a magnetic stripe. Sounds less than secure, right? When the card is used, it is inserted into a slot on a machine, and the machine asks for the PIN. The user is expected to enter the PIN, which the machine uses to create a hash, which is then compared to the hash stored on the card. The user is not allowed to use the card if the two hashes do not match.

This makes the use of the ATM card an example of two level security. The card alone is not enough. The user must provide the PIN (something you know) which is used by the machine to create a hash, which is compared to the hash on the card (something you have). Of course, the system breaks down if the hash algorithm becomes known to hackers. Feel like changing your PIN now?

The text lists four characteristics of a secure hashing algorithm on page 191:

  • the hash has a fixed size regardless of the size of the cleartext
  • different sets of data will always result different hashes
  • it is impossible to create a data set that will result in a specific defined hash (This one worries me.)
  • the hash algorithm cannot be reversed to decrypt the original input (This one worries me, too.)

Hash algorithms are typically used to encrypt passwords when users log in to networks. Like the ATM example, the password is never stored in plaintext, only in hash form. The system creates a hash from the password the user enters, and this is compared to the hash for that user ID in the password file. This brings us to the table on page 192 that indicates the only security feature provided by hashing: integrity. The other features are defeated by the fact that the hash cannot be decrypted, and the fact that another user may use a stolen password (or PIN).

An experienced hacker could use rainbow tables to compare to a captured hash. A rainbow table holds the hash values of known words and numbers. If the hacker finds a match, the password is no longer secret.

Message Digest (MD) Hashing

The text discusses three versions of this hashing algorithm. The short story is that none of them are recommended as being secure any longer.

  • Message Digest 2 (MD2) - a 16 bit (processor size) algorithm from 1989; creates a hash that is 128 bits long. Messages are processed as 128 bit sections. If a message is not a multiple of 128 bits, the last section is padded with characters to make 128 bits. A 128 bit (16 byte) checksum is added to the message, padded or not, before hashing. Considered too slow to use now.
  • Message Digest 4 (MD4) - a 32 bit (processor size) algorithm from 1990; creates a hash that is 128 bits long. Messages are processed as 512 bit sections. The text notes that this algorithm can create collisions (identical hashes).
  • Message Digest 5 (MD5) - a 32 bit (processor size) algorithm from 1991; uses 4 rotating 32 bit variables; cracked around the turn of the century

Secure Hash Algorithm (SHA)

SHA has two major types, but the second one has four subtypes. It was created by the National Security Agency and the National Institute of Standards and Technology.

  • SHA-1 - the text says this works like MD4, but creates a hash that is 160 bits long. Wikipedia says it is like MD5. It should not be considered secure.
  • SHA-2 has four subtypes, whose names indicate the length of their resulting hashes:
    • SHA-224
    • SHA-256
    • SHA-384
    • SHA-512
  • SHA-3 is currently considered secure (fall of 2014).

Whirlpool

The text mentions the Whirlpool hash function, but does not elaborate on it. See the linked article on Wikipedia for more details. Note that it produces a 512 bit hash (digest), and that it has been endorsed by ISO.

Password Hashes

As previously discussed, local or domain passwords entered on a computer running Windows are converted by a hash program and compared to a stored hashed version of the user's current password. The text discusses the two versions used in common versions of Windows: LM hash and NTLM hash. We learn that LM hash is not considered a "real" hash because its result is cryptographic(character substitution) instead of numeric (hexadecimal digits). It was used in versions of Windows before Windows NT. NTLM is used in later versions, but we have already been informed that passwords shorter than 14 characters are stored both ways.

Linux systems may use an MD5 hash. The text says that Apple OS X systems use SHA-1, but use NTLM as well if Windows sharing is implemented. This post indicates that OS X is using SHA-512 instead.

Symmetric Cryptographic Algorithms

Unlike hashes, cryptographic algorithms are typically meant to be used for encryption and decryption. The methods in this group use the same key to encrypt and to decrypt, which is why they are called symmetric. They are also called private key algorithms because the key must remain private to the users of the system or there is no security. (This seems like an obvious point, but we will consider another system where it is not true.)

The text divides symmetric algorithms into two groups. Stream ciphers encrypt one character at a time (from the flowing stream of data). Block ciphers divide the message into blocks of a specific size, then encrypt each block as a unit. The text discusses some methods used by stream ciphers:

  • substitution - each character in the plaintext may be changed to a specific character in the ciphertext, or the cipher may rotate between several cipher algorithms, allowing each character in the plaintext to have several ciphertext representations
  • transposition - the plaintext is reordered based on a key, then the reordered text may be processed against the plaintext in an XOR operation to create the final ciphertext; the text mentions that the reordered text may be processed against a One Time Pad (a random key string)

These are just examples, of course. Many variations exist on the methods discussed. The text lists three symmetric algorithms to be aware of:

  • DES - Data Encryption Standard
  • 3DES - Triple Data Encryption Standard
  • AES - Advanced Encryption Standard
Asymmetric Cryptographic Algorithms

It should be obvious that asymmetric (not symmetric) algorithms will use different keys. These algorithms are also called public key cryptography. This name does not describe the method well. A person must have two keys in this system, a public key and a private key. They are created so that whatever is encrypted with one must be decrypted with the other. The owner of the keys gives the public key to anyone who wants it, but keeps the private key safe from anyone else.

This is how SSL encryption on a web site works. I connect to a vendor's web site. I obtain the vendor's public key by making the secure connection. My browser encrypts my credit card data with the public key and sends the ciphertext to the vendor. If the vendor's private key is secure, the vendor is the only one who can decrypt the data sent through the public key. In this way, a key is made available to anyone who wants it, but using it makes the data unintelligible to everyone who does not have the private key.

On page 201, the author describes a process to use a private key to make a digital signature. Note that the user's private key stays in his possession during this transaction, which makes use of the fact that data encrypted with the private key can be decrypted with the public key. This does not make that data secure, but it does prove that the data stream was encrypted with the matching key.

The text lists three asymmetric algorithms to be aware of:

  • RSA - named for its creators, so there is no acronym meaning
  • Elliptic Curve Cryptography - read the description on page 203, then shake your head; just know that it exists, and that it uses a curve and a point on it
  • NTRUEncrypt - uses a lattice of points, and vectors to them from a reference point (see page 204)

The text lists four solutions for key exchanges, which is one of the more difficult moments in setting up a system. Secure exchange of keys is critical to preserve any kind of secrecy.

  • Diffie-Hellman - named for its creators; it is only used to allow two users to share a key, enabling them to use symmetric cryptography
  • Diffie-Hellman Ephemeral - like the one above, but it uses different keys each time (ephemeral means temporary)
  • Elliptic Curve Diffie-Hellman - uses an elliptic curve instead of prime numbers
  • Perfect forward secrecy - a public key system that generates new public keys each time a new session is started
Cryptography on Files and Disks

The text discusses two systems that can be used to encrypt files.

Pretty Good Privacy (PGP)

Although the heading is about PGP, a commercial product, the discussion is also about GPG (Gnu Privacy Guard), which is an Open Source product. The two products are typically compatible.

Everyone who uses PGP will have a public key that is freely available, a private key that remains secure, and everyone can generate new keys as needed.

  1. When I want to send a message to you, I generate a new symmetric key for that message.
  2. I encrypt the message with the symmetric key.
  3. I encrypt the symmetric key with your public key.
  4. I send the encrypted message and the encrypted key to you.
  5. You are the only person who can decrypt the encrypted key, by using your private key.
  6. You then use the decrypted symmetric key to decrypt the message.

In this way, PGP (and GPG) can use both symmetric and asymmetric keys.

Microsoft Windows Encrypting File System (EFS)

This file encryption system is transparent to the user who turns it on for a folder or file on his/her system. Note the bullets on page 207 that list features and concerns for this system.

Windows BitLocker

This is a total disk encryption system that is available on some (not all) versions of Windows Vista and Windows 7. It encrypts an entire hard drive, which makes the system more secure against theft. The drive becomes accessible when a user authorized for the device logs in. The drive remains encrypted, but a decryption key is loaded into RAM, making the system transparent to an authorized user.

Note the potential vulnerability in BitLocker mentioned on page 208. First, a computer encrypted in this way is vulnerable if the attacker gains access while a user is logged in. Second, the device may be accessible if an attacker can harvest the decryption key while the device is in sleep or hibernation (not locked) mode. Advice? Don't walk away from that computer without locking it or turning it off. Hibernation and sleep modes rarely work well to begin with, so don't use them.

Trusted Platform Module (TPM)

This is a hardware technology that supports hard drive encryption. It features a real random number generator in a chip that is also used for encryption and decryption.