CSS 211 - Introduction to Network Security

Lesson 4 - Basic Cryptography, Applying Cryptography

Objectives:

This lesson covers chapters 5 and 6 in the text. It discusses cryptography algorithms and applications. Objectives important to this lesson:

  1. Cryptography
  2. Hashing algorithms
  3. Symmetric algorithms
  4. Asymmetric algorithms
  5. File and file system cryptography
  6. Whole disk encryption
  7. Digital certificates
  8. Use of digital certificates
  9. Public Key Infrastructure
  10. Key management
  11. Transport protocols
Concepts:
Cryptography

Chapter 5 begins with some stories about encrypting laptops, and the dangers of not doing so, but the text does not return to this particular topic for many pages. The author backs up a few steps to discuss cryptography in general. Cryptography (hidden writing) is defined on page 186 as the scrambling of a message. This does not explain the idea.

The example on page 186 of the cipher used by Julius Caesar is clearer. It will be helpful first to define a cipher as a substitution of characters. In the example in the text, each letter in the original message (called cleartext or plaintext) is changed to another character by a specific algorithm (method), resulting in an encrypted message that displays in ciphertext. The Caesar cipher is easy to break, since the same cipher letter is substituted each time the same plaintext letter occurs in the message. The cipher is even more vulnerable due to its simple algorithm: add three to each plaintext letter, wrap to front of alphabet as needed (for x, y, and z). It would be harder to break if a random substitution of letters had been used. Ciphers have evolved a good bit, as we will see.

Messages can also be scrambled by steganography. (covered writing?) The text briefly reviews a few methods of doing this: hiding a message in unused parts of a file, hiding it in metadata, hiding it where the uninformed would not look, and hiding it in images. The short form is that an image typically has three bytes (RGB) of color information for each pixel in it. It is unlikely that anyone just looking at an image could tell the difference between pixels that are true to color and those that have had each of their least significant color bits changed as needed to hide/provide data. If you change one bit per color, you can hide one byte every three pixels.

Imagine that the table below represents a series of pixels. I have used cells in a table to make the idea more visual. I have put a reference color in the first cell: hex code 58C314 stands for 111, because I chose that color as the key. I have modified the color in each of the other cells in the second row to indicate three bits. The bits are indicated by the color's deviation from the key color.

58c314
refcolor

57c313

58c213

58c313

58c314

57c313

57c214
111 010 100 110 111 010 001


58c213

58c214

57c314

58c214

58c213

58c313

57c313

58c314
100 101 011 101 100 110 010 111

The binary code for that sequence, which would have taken 15 pixels, is:

  • first three ones are reference
  • 010 100 11
  • 0 111 010 0
  • 01 100 101
  • 011 101 10
  • 0 110 010 1
  • last two ones are padding

This example used seven variations on one color. The sender could send an image in which every pixel was modified if the receiver already had a reference copy to the image for comparison. I have done this by hand: an application that encrypts a message in an image or audio file would be much faster.

On page 189, the text lists five kinds of security that cryptography might provide, and notes that not all kinds of cryptography provide all five features. (The next kind discussed in the text provides only one.)

  • confidentiality - only those with the decryption algorithm can read it
  • integrity - the message is protected from changes (unless a man in the middle has the encryption algorithm)
  • availability - anyone with the decryption algorithm can read it
  • authenticity - a secure algorithm proves the message was sent by a trusted party
  • non-repudiation - the encryption may prove who did something and when it was done
Hashing

The text continues with a general discussion of hashing, which then branches into specific types. Hashing is defined on page 190 as creating a unique encrypted result from a data set. The encrypted result is called a hash, a signature, or a digest, all of which mean the same thing. The hash should not resemble the plaintext in appearance or in length. Note that the text also refers to this method as a one-way hash, which means that the hash resulting from the algorithm is not meant to be unencrypted. The reason for this becomes clear in the example of its use with an ATM card.

Assume you have an ATM card that is assigned a Personal ID Number (PIN). The card provider used a hash algorithm to create a hash from your PIN which is stored on the card on a magnetic stripe. Sounds less than secure, right? When the card is used, it is inserted into a slot on a machine, and the machine asks for the PIN. The user is expected to enter the PIN, which the machine uses to create a hash, which is then compared to the hash stored on the card. The user is not allowed to use the card if the two hashes do not match.

This makes the use of the ATM card an example of two level security. The card alone is not enough. The user must provide the PIN (something you know) which is used by the machine to create a hash, which is compared to the hash on the card (something you have). Of course, the system breaks down if the hash algorithm becomes known to hackers. Feel like changing your PIN now?

The text lists four characteristics of a secure hashing algorithm on page 191:

  • the hash has a fixed size regardless of the size of the cleartext
  • different sets of data will always result different hashes
  • it is impossible to create a data set that will result in a specific defined hash (This one worries me.)
  • the hash algorithm cannot be reversed to decrypt the original input (This one worries me, too.)

Hash algorithms are typically used to encrypt passwords when users log in to networks. Like the ATM example, the password is never stored in plaintext, only in hash form. The system creates a hash from the password the user enters, and this is compared to the hash for that user ID in the password file. This brings us to the table on page 192 that indicates the only security feature provided by hashing: integrity. The other features are defeated by the fact that the hash cannot be decrypted, and the fact that another user may use a stolen password (or PIN).

An experienced hacker could use rainbow tables to compare to a captured hash. A rainbow table holds the hash values of known words and numbers. If the hacker finds a match, the password is no longer secret.

Message Digest (MD) Hashing

The text discusses three versions of this hashing algorithm. The short story is that none of them are recommended as being secure any longer.

  • Message Digest 2 (MD2) - a 16 bit (processor size) algorithm from 1989; creates a hash that is 128 bits long. Messages are processed as 128 bit sections. If a message is not a multiple of 128 bits, the last section is padded with characters to make 128 bits. A 128 bit (16 byte) checksum is added to the message, padded or not, before hashing. Considered too slow to use now.
  • Message Digest 4 (MD4) - a 32 bit (processor size) algorithm from 1990; creates a hash that is 128 bits long. Messages are processed as 512 bit sections. The text notes that this algorithm can create collisions (identical hashes).
  • Message Digest 5 (MD5) - a 32 bit (processor size) algorithm from 1991; uses 4 rotating 32 bit variables; cracked around the turn of the century

Secure Hash Algorithm (SHA)

SHA has two major types, but the second one has four subtypes. It was created by the National Security Agency and the National Institute of Standards and Technology.

  • SHA-1 - the text says this works like MD4, but creates a hash that is 160 bits long. Wikipedia says it is like MD5. It should not be considered secure.
  • SHA-2 has four subtypes, whose names indicate the length of their resulting hashes:
    • SHA-224
    • SHA-256
    • SHA-384
    • SHA-512
  • SHA-3 is currently considered secure (fall of 2014).

Whirlpool

The text mentions the Whirlpool hash function, but does not elaborate on it. See the linked article on Wikipedia for more details. Note that it produces a 512 bit hash (digest), and that it has been endorsed by ISO.

Password Hashes

As previously discussed, local or domain passwords entered on a computer running Windows are converted by a hash program and compared to a stored hashed version of the user's current password. The text discusses the two versions used in common versions of Windows: LM hash and NTLM hash. We learn that LM hash is not considered a "real" hash because its result is cryptographic(character substitution) instead of numeric (hexadecimal digits). It was used in versions of Windows before Windows NT. NTLM is used in later versions, but we have already been informed that passwords shorter than 14 characters are stored both ways.

Linux systems may use an MD5 hash. The text says that Apple OS X systems use SHA-1, but use NTLM as well if Windows sharing is implemented. This post indicates that OS X is using SHA-512 instead.

Symmetric Cryptographic Algorithms

Unlike hashes, cryptographic algorithms are typically meant to be used for encryption and decryption. The methods in this group use the same key to encrypt and to decrypt, which is why they are called symmetric. They are also called private key algorithms because the key must remain private to the users of the system or there is no security. (This seems like an obvious point, but we will consider another system where it is not true.)

The text divides symmetric algorithms into two groups. Stream ciphers encrypt one character at a time (from the flowing stream of data). Block ciphers divide the message into blocks of a specific size, then encrypt each block as a unit. The text discusses some methods used by stream ciphers:

  • substitution - each character in the plaintext may be changed to a specific character in the ciphertext, or the cipher may rotate between several cipher algorithms, allowing each character in the plaintext to have several ciphertext representations
  • transposition - the plaintext is reordered based on a key, then the reordered text may be processed against the plaintext in an XOR operation to create the final ciphertext; the text mentions that the reordered text may be processed against a One Time Pad (a random key string)

These are just examples, of course. Many variations exist on the methods discussed. The text lists three symmetric algorithms to be aware of:

  • DES - Data Encryption Standard
  • 3DES - Triple Data Encryption Standard
  • AES - Advanced Encryption Standard
Asymmetric Cryptographic Algorithms

It should be obvious that asymmetric (not symmetric) algorithms will use different keys. These algorithms are also called public key cryptography. This name does not describe the method well. A person must have two keys in this system, a public key and a private key. They are created so that whatever is encrypted with one must be decrypted with the other. The owner of the keys gives the public key to anyone who wants it, but keeps the private key safe from anyone else.

This is how SSL encryption on a web site works. I connect to a vendor's web site. I obtain the vendor's public key by making the secure connection. My browser encrypts my credit card data with the public key and sends the ciphertext to the vendor. If the vendor's private key is secure, the vendor is the only one who can decrypt the data sent through the public key. In this way, a key is made available to anyone who wants it, but using it makes the data unintelligible to everyone who does not have the private key.

On page 201, the author describes a process to use a private key to make a digital signature. Note that the user's private key stays in his possession during this transaction, which makes use of the fact that data encrypted with the private key can be decrypted with the public key. This does not make that data secure, but it does prove that the data stream was encrypted with the matching key.

The text lists three asymmetric algorithms to be aware of:

  • RSA - named for its creators, so there is no acronym meaning
  • Elliptic Curve Cryptography - read the description on page 203, then shake your head; just know that it exists, and that it uses a curve and a point on it
  • NTRUEncrypt - uses a lattice of points, and vectors to them from a reference point (see page 204)

The text lists four solutions for key exchanges, which is one of the more difficult moments in setting up a system. Secure exchange of keys is critical to preserve any kind of secrecy.

  • Diffie-Hellman - named for its creators; it is only used to allow two users to share a key, enabling them to use symmetric cryptography
  • Diffie-Hellman Ephemeral - like the one above, but it uses different keys each time (ephemeral means temporary)
  • Elliptic Curve Diffie-Hellman - uses an elliptic curve instead of prime numbers
  • Perfect forward secrecy - a public key system that generates new public keys each time a new session is started
Cryptography on Files and Disks

The text discusses two systems that can be used to encrypt files.

Pretty Good Privacy (PGP)

Although the heading is about PGP, a commercial product, the discussion is also about GPG (Gnu Privacy Guard), which is an Open Source product. The two products are typically compatible.

Everyone who uses PGP will have a public key that is freely available, a private key that remains secure, and everyone can generate new keys as needed.

  1. When I want to send a message to you, I generate a new symmetric key for that message.
  2. I encrypt the message with the symmetric key.
  3. I encrypt the symmetric key with your public key.
  4. I send the encrypted message and the encrypted key to you.
  5. You are the only person who can decrypt the encrypted key, by using your private key.
  6. You then use the decrypted symmetric key to decrypt the message.

In this way, PGP (and GPG) can use both symmetric and asymmetric keys.

Microsoft Windows Encrypting File System (EFS)

This file encryption system is transparent to the user who turns it on for a folder or file on his/her system. Note the bullets on page 207 that list features and concerns for this system.

Windows BitLocker

This is a total disk encryption system that is available on some (not all) versions of Windows Vista and Windows 7. It encrypts an entire hard drive, which makes the system more secure against theft. The drive becomes accessible when a user authorized for the device logs in. The drive remains encrypted, but a decryption key is loaded into RAM, making the system transparent to an authorized user.

Note the potential vulnerability in BitLocker mentioned on page 208. First, a computer encrypted in this way is vulnerable if the attacker gains access while a user is logged in. Second, the device may be accessible if an attacker can harvest the decryption key while the device is in sleep or hibernation (not locked) mode. Advice? Don't walk away from that computer without locking it or turning it off. Hibernation and sleep modes rarely work well to begin with, so don't use them.

Trusted Platform Module (TPM)

This is a hardware technology that supports hard drive encryption. It features a real random number generator in a chip that is also used for encryption and decryption.

Chapter 6 continues the discussion of cryptography. On page 229, it returns to the topic of digital certificates. As discussed above, it can be based on the public key system, but it has a problem. Consider the discussion in the text of the way it should work: If you and I both have public and private keys, I can encrypt something with my private key that you can decrypt with my public key. Anyone else could decrypt it as well, but this step proves that the encrypted form came from me, or at least from the person whose public key you have.

In the example in the text, I can send a message to you encrypted with your public key (which only you can decrypt) along with a hash made from the message and encrypted with my private key. The encryption of the hash proves that it came from me: you can decrypt it with my public key. The hash itself will match a hash you can make from the decrypted message, proving that it has not changed. This is a secure system as long as our private keys remain private, and our public keys are not available except by a secure method. As the book explains, the system fails if you think you have my public key, but you really have the key of an impostor who sent the message in the first place. In other words, you know that the message is unchanged since it left the person who sent it, but how can you be sure that it came from the person you think it came from? How do you know my public key is really mine?

You need to get my public key from a trusted source, like Verisign, which is now affiliated with Symantec/Norton. If I register my public key with a trusted source (typically a certificate authority), or have them create it in the first place, they provide me with a copy of the key that is endorsed by them, which makes it a digital certificate. When I send this certificate to someone, they can verify that it is from an authority that can be trusted. Note the properties of a digital certificate listed on page 231:

  • name or alias of the public key owner
  • public key
  • name of the certificate issuer
  • digital signature of the issuer
  • serial number of the certificate
  • expiration date of the certificate (technically, not of the key, but they may be the same)

This leads us to an industry that supports verified identities by the use of digital certificates. Some terms associated with it:

  • certificate authority (CA) - an agency that issues digital certificates; the CA may create key pairs for users or accept them from users who create them; the CA is responsible for verifying the identity of the key owner; organizations with the proper software can act as their own internal CA
  • registration authority (RA) - an entity that takes on some of the tasks of a CA and operates on their behalf
  • certificate revocation list (CRL) - a list of certificates that are no longer valid for various reasons
  • certificate repository (CR) - a public directory of valid certificates; may contain invalid certificates as well or may link to a CRL

The text moves on to discuss different types of digital certificates.

  • personal digital certificates (class 1) - for individuals to use as needed
  • server digital certificates (class 2) - for web servers, mail servers, and other type servers to confirm identity of the operator; also used by servers to confirm secure SSL sessions
  • software publisher digital certificates (class 3) - to confirm data integrity of published software
  • X.509 - a standard defined by the ITU; features are listed on page 239

The certificates described so far have been single-sided, which means they only confirm identity. Dual-sided certificates are actually two certificates: one to confirm identity (signing certificate), and the other to confirm encryption (encryption certificate).

Public Key Infrastructure

The Public Key Infrastructure is the system (or a specific system) that includes all the concepts listed so far in this chapter. The text discusses three aspects of it: standards, trust models, and key management.

Standards

The text lists fifteen standards (page 241) that apply to pubic key cryptography. The same standards are available on the Internet, such as this version at Wikipedia. Note that numbers 2 and 4 are no longer in force, having been merged into number 1. As the text explains, these standards actually belong to the RSA Data Security company, but they are "widely accepted".

Trust Models

The text lists five trust models, but discusses three of them. One that is only mentioned is the web of trust, which is the model that would exist without any Certificate Authorities. The text describes this situation as one in which all entities know and trust each other directly (direct trust). The other models involve third party trust, in which Certificate Authorities are used.

  • Hierarchical Trust Model - one CA (the root) grants all certificates with a single key, and all entities in the system trust it; there is a single point of failure and all work falls on the one CA
  • Distributed Trust Model - multiple CAs exist; a CA can delegate certificate duties to other CAs; a single failure will not break the entire system
  • Bridge Trust Model - a model that is actually a hybrid of the other two; a bridge CA links CAs in the other kinds of models

Managing PKI

The text describes three concepts to manage the PKI entities better. More acronyms to recognize.

  • Certificate Policy (CP) - a set of rules that dictate what the CAs, RAs, and other entities must do to operate
  • Certificate Practice Statement (CPS) - if a CP is a set of rules, a CPS is a set of procedures for implementing those rules; the CPS states how entities request and perform services
  • Certificate Life Cycle - certificates should have a planned creation and removal process, so this concept addresses four life events for a certificate:
    • creation - includes the procedures to follow to verify identity, grant and issue a certificate, and publish a certificate
    • suspension - when a certificate should go out of use temporarily, the suspension process is used
    • revocation - when a certificate is found to be invalid because it has been compromised, the owner is out of business, or for other reasons, the certificate is revoked and moved to the CRL
    • expiration - the normal end of a certificate's life; one might wonder if this is to sell the owner another one, but it is also sensible to end certificates regularly in case they have neither been suspended nor revoked but should have been and the CA knew nothing about it

The text continues with a discussion of the care of keys used in PKI. Three concepts and more subconcepts once again:

  • Key storage - the text recommends storing public keys in digital certificates, or on hardware devices from the CA; it recommends storing private keys on hardware like smart cards or tokens, since they can be vulnerable to attackers when left in file storage
  • Key usage - this concerns how to use multiple key pairs to maintain as much security as possible
  • Key handling procedures - this is similar to the life cycle concept for certificates; it could have been called a key life cycle but it is called key handling instead
    • escrow - a third party, such as a CA, holds the keys
    • expiration - keys expire so that a stolen key is not good forever
    • renewal - this is an extension to the life of a key about to expire; this invalidates the security aspect of expiration
    • revocation - turning off the functionality of a key before the expiration date; there is no reinstatement from this
    • recovery - although it is not listed as a handling method, a key can be archived, and removing it from that status is recovery
    • suspension - a temporary status to remove the key from use
    • destruction - a permanent status that erases the information about the key and the user from the CA, which is not done for expiration or revocation
Cryptographic Transport Protocols

The last major concept for the chapter discusses types of protocols associated with encrypted data flowing across a network:

  • File Transfer Protocol (FTP) - FTP does what it sounds like, it moves or copies files in a TCP/IP environment; the text describes its problem: no encryption, which makes it vulnerable to man-in-the-middle attacks
    The text recommends using a secure protocol instead when updating web sites or moving sensitive data. Secure FTP (SFTP) can use either of the next two protocols as a basis.
  • Secure Sockets Layer (SSL)/Transport Layer Security (TLS) - You have probably used SSL every time you have made a purchase across the Internet. The vendor site sends a public key to your computer, your computer encrypts the transaction, and the resulting SFTP data stream can only be decrypted with the vendor's private key. (You should know this concept very well by now.)
    The text explains that TLS is an improvement over SSL. This link goes to a Wikipedia article that discusses the same point, as well as the uses of TLS which include email.
  • Secure Shell (SSH) - SSH is another method used to implement SFTP. This is a Unix based protocol, that can be used to replace Telnet, and it is used to provide secure login, file operations, and command line operations on the remote server.
  • Web protocols - such as HTTPS and SHTTP, provide SSL versions of HTTP. Note the discussion of ports: HTTP typically uses port 80, while HTTPS typically uses port 443.
  • IP Security (IPsec) - IPsec is described as a preferred protocol because it is implemented at a lower layer in the ISO network model (Network layer) than PGP (Application layer), Kerberos, or SSL (both at the Session layer). As such, it is more transparent to processes that occur at higher layers, to users, and to software running on the workstation. The text discusses the virtues of IPsec for several pages.
  • S/MIME - uses digital certificates to protect email. This protocol is built into most email applications.