ITS 4550 - Fraud Prevention and Deterrence


Chapter 3, Cryptographic Concepts

This lesson presents material from chapter 3. Objectives important to this lesson:

  1. Background on cryptography
  2. Algorithms and ciphers
  3. Symmetric encryption
  4. Asymmetric encryption
  5. Public key infrastructure
  6. Hashing
  7. Common systems
  8. Cryptanalysis
  9. Possible future forms
Concepts:
Chapter 3

The chapter begins with a reference to the cipher used by Julius Caesar. It will be helpful to define a cipher as a substitution of characters. A code is actually something else. Codes do not translate individual characters. They often use short "words" or symbols to represent real sentences or sequences of actual words. In the example in the text, each letter in the original message (called cleartext or plaintext) is changed to another character by a specific algorithm (method), resulting in an encrypted message that displays in ciphertext. The Caesar cipher is easy to break, since the same cipher letter is substituted each time the same plaintext letter occurs in the message. The cipher is even more vulnerable due to its simple algorithm: add three to each plaintext letter, wrap to front of alphabet as needed (for x, y, and z). It would be harder to break if a random substitution of letters had been used.
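
The add-three rule described above can be sketched in a few lines of Python. This is only an illustration: the function names `caesar` and `brute_force` are my own, not from the text, and I have assumed an uppercase English alphabet with spaces and punctuation passed through unchanged.

```python
import string

ALPHABET = string.ascii_uppercase  # the 26 letters A through Z


def caesar(text: str, shift: int = 3) -> str:
    """Shift each letter by `shift` places, wrapping past Z back to A."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)  # leave spaces and punctuation alone
    return "".join(result)


def brute_force(ciphertext: str) -> list[str]:
    """Return every possible decryption: one candidate per shift."""
    return [caesar(ciphertext, -shift) for shift in range(26)]


if __name__ == "__main__":
    secret = caesar("ATTACK AT DAWN XYZ")
    print(secret)              # encrypted with a shift of three
    print(caesar(secret, -3))  # shifting back recovers the plaintext
```

Because there are only 26 shifts to try, `brute_force` returns a list short enough for a person to scan by eye, which is one way to see why the cipher is easy to break.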

More history appears on pages 54 and 55. It compares Egyptian hieroglyphics to the Caesar Cipher, calling them another example of a substitution cipher. In case you didn't know, most hieroglyphic symbols stood for sounds in the Egyptian language, what we might call phonemes. Someone who could read the symbols as sounds could speak the message on the wall, column, or stone where it was carved. More modern substitution ciphers (like Julius Caesar's) substitute a particular character for a letter in the alphabet of the message, which takes less space on the page.

Page 55 shows part of a more elaborate substitution cipher, based on a Vigenère square for English. The square shows a Caesar cipher variant for each letter of the alphabet, as you can see in the image on the right. The row for the letter G, for example, starts with G and ends with F. If that were all there was to it, this would simply be a set of 26 Caesar ciphers. There is more to it.

The person encrypting the message needs to know a key word. For instance, the key could be Caesar. That would mean we would use the six letters in that word to control which row in the table is used to encrypt or decrypt the next letter in the message. "Caesar" would mean we would use row C, row A, row E, row S, row A again, and row R on a rotating basis as we encrypt the whole message.

The method is more difficult to break if the key is not an actual word. A long series of randomly chosen letters could be the key, making it hard for a human to use, and also hard to communicate a new key whenever it is needed. Computers make this one more workable.
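
The rotating-row idea can be sketched as follows. This is a minimal illustration, assuming uppercase English letters only; the function name `vigenere` and its `decrypt` flag are my own choices, not from the text. Choosing "row C" in the square is the same as shifting by the position of C in the alphabet, so the code uses shifts instead of storing the whole square.

```python
from itertools import cycle
import string

ALPHABET = string.ascii_uppercase


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Encrypt (or decrypt) text, using the key letters in rotation to pick the row."""
    key_letters = cycle(key.upper())  # C, A, E, S, A, R, C, A, ...
    out = []
    for ch in text.upper():
        if ch not in ALPHABET:
            out.append(ch)  # non-letters pass through and do not consume a key letter
            continue
        shift = ALPHABET.index(next(key_letters))  # row C shifts by 2, row A by 0, ...
        if decrypt:
            shift = -shift
        out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
    return "".join(out)


if __name__ == "__main__":
    secret = vigenere("HELLO WORLD", "CAESAR")
    print(secret)
    print(vigenere(secret, "CAESAR", decrypt=True))
```

Notice that the two Ls in HELLO come out as different ciphertext letters, because each one is encrypted with a different row. That is the improvement over a single Caesar shift.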

Regarding the cipher in the first line of the table, note that there is no offset. A stands for A and so on. This row appears in the square to make the table symmetrical. It would never be used in an actual transmitted message unless you were rotating rows for each character. As a single row cipher, it would accomplish nothing, since the result would be identical to the original message.

A mechanical device that incorporated changing ciphers was the Enigma machine. There were many models, including commercial versions dating back to 1923. Modifications were made for the versions used by the German military in World War II. Major sections of the device are indicated in the image on the right. Follow the link above for a very good article that describes the operation and functions of various models.

One could argue that the Enigma mechanism was invented for security alone, but it seems likely that it was also created to eliminate human errors in what was a process done by hand before its invention. In fact, such a device allows a rotating cipher system to be more elaborate, faster to use, and more accurate.

The text mentions using a concealment cipher, which is a plainer name for steganography (literally, concealed writing). It can be done several ways. The text only mentions using the first letter of each word in a message, which is not very concealed. Other methods include hiding a message in unused parts of a file, hiding it in metadata, hiding it where the uninformed would not look, and hiding it in images. An image typically has three bytes (RGB) of color information for each pixel in it. It is unlikely that anyone just looking at an image could tell the difference between pixels that are true to color and those that have had each of their color bytes nudged by the smallest possible amount to carry data. If each color byte carries one bit, each pixel carries three bits, so three pixels carry nine bits: enough for one byte, with a bit to spare. In the scheme I will use, where a color byte matches the reference, it means 1. Where a color byte is one less than the reference, it means 0. Let me show you what I mean.


Imagine that the table below represents a series of pixels. I have used cells in a table to make the idea more readable. I have put a reference color in the first cell: hex code 58C314 stands for 111, because I chose that color as the key. I have modified the color in each of the other cells in the second row to indicate three bits. Remember, those cells represent consecutive pixels in an image that are supposed to be the same color as the first one. The bits are indicated by the pixel's color deviation from the key color. By the way, the colors displayed are as accurate as your browser can make them. Real image files would be exchanged by real information passers.

  Pixel  Color   Bits  How the bits are read
  1      58c314  111   reference color
  2      57c313  010   57 is 1 less than 58, so 0; c3 is a match, so 1; 13 is 1 less than 14, so 0
  3      58c213  100   58 is a match, so 1; c2 is 1 less than c3, so 0; 13 is 1 less than 14, so 0
  4      58c313  110
  5      58c314  111
  6      57c313  010
  7      57c214  001
  8      58c213  100
  9      58c214  101
  10     57c314  011
  11     58c214  101
  12     58c213  100
  13     58c313  110
  14     57c313  010
  15     58c314  111

The binary code for that sequence, which would have taken 15 pixels, is:

  • first three ones are reference
  • 010 100 11
  • 0 111 010 0
  • 01 100 101
  • 011 101 10
  • 0 110 010 1
  • last two ones are padding

This scheme uses one base color and up to seven variations of it. The sender could send an image in which every pixel was modified if the receiver already had a reference copy of the image for comparison. This would avoid the need for a sequence of pixels that were meant to be the same color. I have done this operation by hand: an application that hides a message in an image or audio file would be much faster. This is not a foolproof system, but it has an advantage. Most people would not know to look for it, and would not notice the difference between the eight possible colors being used in this process.
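
For anyone who would rather not do this by hand, here is a minimal sketch of the decoding step in Python. It uses the fifteen pixel colors from the example above. The function names are my own, and treating each group of eight bits as one ASCII character is my assumption about how the receiver reads the bytes.

```python
REFERENCE = "58c314"  # the key color: every channel matches, so it reads as 111

# The fifteen pixels from the example, in order. The first is the reference.
PIXELS = [
    "58c314", "57c313", "58c213", "58c313", "58c314", "57c313", "57c214",
    "58c213", "58c214", "57c314", "58c214", "58c213", "58c313", "57c313",
    "58c314",
]


def channels(hex_color: str) -> list[int]:
    """Split a six-digit hex color into its red, green, and blue byte values."""
    return [int(hex_color[i:i + 2], 16) for i in (0, 2, 4)]


def pixel_bits(hex_color: str, reference: str = REFERENCE) -> str:
    """One bit per channel: 1 if it matches the reference, 0 if it differs."""
    return "".join(
        "1" if value == ref else "0"
        for value, ref in zip(channels(hex_color), channels(reference))
    )


def decode(pixels: list[str]) -> str:
    """Skip the reference pixel, join the bits, and read them eight at a time."""
    bits = "".join(pixel_bits(p) for p in pixels[1:])
    usable = len(bits) - len(bits) % 8  # leftover bits at the end are padding
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("ascii")  # assumption: the hidden bytes are ASCII text


if __name__ == "__main__":
    print(decode(PIXELS))
```

Fourteen data pixels give 42 bits, so the sketch reads five whole bytes and ignores the last two bits, just as in the list above.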

In the image on the right, I have made eight rows, each of which is filled with a single color. The colors are as described above, and they are true. They do not suffer from your web browser approximating the intended color. Trust me, there are eight colors in it. Or, don't trust me. Download the image and examine it in an art program. Which is the base color, the stripe at the top of the image, or the one at the bottom? Or have I arranged them differently?

Let's move on to page 57, and the discussion of symmetric keys. The methods in this group use the same key to encrypt and to decrypt, which is why they are called symmetric. They are also called private key algorithms because the key must remain private to the users of the system or there is no security. (This seems like an obvious point, but we will consider another system where it is not true.) In an FYI box on page 58, the text mentions that we can consider the length of a key as part of the strength of a cipher, but we have to consider the complexity of the algorithm that will use the key as well. A very long key is not a guarantee of security if the algorithm that uses it is relatively easy to crack.
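
To make the "same key both ways" idea concrete, here is a toy sketch in Python. It uses a repeating XOR, which I chose only because it is short. It is not DES or AES, it is not secure, and the names `xor_cipher` and `shared_key` are my own.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the key, repeating the key as needed.

    Running the same function again with the same key undoes it,
    which is what makes this a symmetric (same-key) scheme.
    """
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


if __name__ == "__main__":
    shared_key = b"shared-secret"  # both parties must hold this, and no one else
    ciphertext = xor_cipher(b"Meet at noon", shared_key)
    print(ciphertext)
    print(xor_cipher(ciphertext, shared_key))     # the right key recovers the message
    print(xor_cipher(ciphertext, b"wrong-key-1"))  # a wrong key gives garbage
```

This also illustrates the FYI box: the key here could be made very long, but a repeating XOR is such a weak algorithm that a long key would not save it.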

The text remarks that Data Encryption Standard (DES) was well named when it came out in 1976. It was a standard symmetric encryption algorithm for a long time in the lifespan of such things. A DES key was recovered by brute force in a public demonstration in 1998. NIST asked for a replacement, and the chosen algorithm, Rijndael, became the Advanced Encryption Standard (AES) in 2001. Just considering the calendar, it may be time for a new standard soon. Several are listed on pages 59 and 60.

The text continues with a discussion of asymmetric key systems, which typically mean that one key is used for encrypting a message, but another must be used to decrypt that message. The most used system is public key cryptography, in which a pair of keys is created for each entity. One is called the private key, and is kept private and secure. The other is called the public key, which is provided to anyone wishing to send encrypted data to the first entity. The principle is that messages encrypted with my public key can only be decrypted with my private key, which works as long as no one else has a copy of my private key. The video below summarizes the main points in the discussions of symmetric and asymmetric systems.



The text provides more detail about how the world uses the public key method: Public Key Infrastructure. Public Key Infrastructure is not the only cipher system used in business or government, but it is widely used by both, and by individuals to protect personal or sensitive information. There is a difference between PKI and public key cryptography.

  • Public key cryptography is a system in which each entity has two cryptographic keys, each of which is the only means to decrypt what was encrypted by the other.
  • Public Key Infrastructure is a system of using public key cryptography, distributing keys through trusted sources, and revoking keys that have been compromised.
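
The key-pair idea can be sketched with a toy version of RSA. The primes below are far too small to be secure, and the message is just a number. The helper names `encrypt` and `decrypt` are my own. The modular inverse form of `pow` needs Python 3.8 or later.

```python
# Toy RSA key pair, built from two tiny primes (real keys use enormous ones).
p, q = 61, 53
n = p * q                # the modulus, part of both keys
phi = (p - 1) * (q - 1)  # used only while creating the keys
e = 17                   # public exponent, chosen to share no factor with phi
d = pow(e, -1, phi)      # private exponent: the modular inverse of e

public_key = (e, n)      # handed out to anyone
private_key = (d, n)     # kept secret by the owner


def encrypt(message: int, key: tuple[int, int]) -> int:
    """Raise the message to the key's exponent, modulo n."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, key: tuple[int, int]) -> int:
    """The same arithmetic with the other key of the pair reverses it."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


if __name__ == "__main__":
    ciphertext = encrypt(65, public_key)
    print(ciphertext)
    print(decrypt(ciphertext, private_key))  # recovers 65
    print(decrypt(ciphertext, public_key))   # the public key cannot undo itself
```

The last line is the point of the whole system: knowing the public key lets you lock a message, but not unlock it.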

Public key cryptography is at the heart of how SSL (now TLS) encryption on a web site works. I connect to a vendor's web site. I obtain the vendor's public key by making the secure connection. In practice, my browser uses that public key to authenticate the vendor and to help set up a shared session key, and then encrypts my credit card data with the session key, because symmetric encryption is much faster for bulk data. If the vendor's private key is secure, the vendor is the only one who can complete that setup and decrypt the data I send.

That's the way it is supposed to work in a perfect world. However, attackers have created a need for a security net around the process. In a way, PKI is the success story of businesses that have grown up around this technology. Components of public key infrastructure:

  • Certificate authority - An entity, typically a company, that creates digital certificates, which are verified statements of a public key and its owner. They may also create the key pair for the customer, and are responsible for storing and providing certificates as needed.
  • Registration authority - An entity that receives requests for certificates, verifies the requests are from recognized users (such as merchants processing credit cards), and forwards the requests to certificate authorities.
  • Certificate server - A service, or the device that runs the service, that responds to certificate requests.
  • Certificate repository - A database for storing digital certificates, sometimes including records of revoked certificates.
  • Certificate revocation list - A list of certificates that are no longer valid for various reasons.
  • Certificate validation - A process used to make sure that a request submitted for certificate creation actually came from the organization it appears to come from, and that the key submitted in the request is theirs.
  • Key Recovery Service - A service that stores and recovers encryption keys in case they should be lost, for example in a system crash or attack.
  • Time server - A service that provides a standard time reference, used to mark the time of requests and responses. Timestamps may be used to judge whether requests are being processed by the entity we expect to process them.
  • Signing server - In a system that is increasingly automated, this is a central control over related services.
Basic Encryption and PKI

The text lists several encryption systems that have been used. You should be aware of some examples of symmetric and asymmetric systems.

The text lists three symmetric algorithms to be aware of:

  • DES - Data Encryption Standard
  • 3DES - Triple Data Encryption Standard
  • AES - Advanced Encryption Standard

It also lists some asymmetric algorithms:

  • RSA - named for its creators (Rivest, Shamir, and Adleman), so the letters are initials rather than a descriptive acronym
  • Diffie-Hellman - also named for its creators; does not seem to belong in this group, since it is only used to allow two users to share a key, enabling them to use symmetric cryptography
  • Elliptic Curve Cryptography - the link takes you to an Ars Technica article that reviews all three methods, and may hurt to read; just know that it exists, and that it predicted the end of RSA within 5 years. The article was written in 2013. Well, that has not happened yet, but it is still a good article.
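
Since Diffie-Hellman is the odd one out in the list above, a small sketch may help show what "sharing a key" means. The numbers here are tiny textbook values that I picked for illustration, and the variable names are my own. Real exchanges use very large primes and randomly chosen secrets.

```python
# Public values: both users, and any eavesdropper, may know these.
p = 23  # a small prime modulus (assumed for illustration)
g = 5   # the generator, or base

# Private values: each user picks one and never sends it.
alice_secret = 6
bob_secret = 15

# Each user computes a public value from the secret and sends it to the other.
alice_public = pow(g, alice_secret, p)
bob_public = pow(g, bob_secret, p)

# Each user combines the other's public value with their own secret.
alice_shared = pow(bob_public, alice_secret, p)
bob_shared = pow(alice_public, bob_secret, p)

if __name__ == "__main__":
    print(alice_public, bob_public)  # the only numbers that cross the wire
    print(alice_shared, bob_shared)  # both arrive at the same value
```

Both users end up with the same number without ever transmitting it. They could then use that number as the key for a symmetric cipher, which is why the method does not encrypt messages by itself.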

Algorithms draw keys from a set of possible values and encrypt messages with those keys. The set of all possible keys is the keyspace. A larger keyspace means more possible keys for the algorithm, which is what makes it harder to guess the actual key. Think about that. We might like to rely on secrecy about the algorithm as well as on the size of the keyspace, but unless we do something special with the algorithm, most are publicly known. So an attacker only has to learn the key, and know which algorithm was used, to decrypt a message sent in a symmetric key system. Are you worried now?
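
To get a feel for keyspace sizes, here is a rough back-of-the-envelope sketch. The guessing rate of one trillion keys per second is purely an assumption of mine, as are the function names. The 56-bit and 128-bit figures are the key lengths of DES and of the shortest AES key.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 60 * 60
GUESSES_PER_SECOND = 1e12  # assumed attacker speed, for illustration only


def keyspace_size(key_bits: int) -> int:
    """Number of possible keys for a key of the given length in bits."""
    return 2 ** key_bits


def years_to_try_all(keys: int, guesses_per_second: float = GUESSES_PER_SECOND) -> float:
    """Time to try every key once at the assumed guessing rate."""
    return keys / guesses_per_second / SECONDS_PER_YEAR


EXAMPLES = {
    "Caesar cipher (25 useful shifts)": 25,
    "Random letter substitution (26! orderings)": math.factorial(26),
    "DES (56-bit key)": keyspace_size(56),
    "AES (128-bit key)": keyspace_size(128),
}

if __name__ == "__main__":
    for name, keys in EXAMPLES.items():
        print(f"{name}: {keys:.3e} keys, {years_to_try_all(keys):.3e} years to try them all")
```

Keep in mind that this only measures guessing every key. A random letter substitution has an enormous keyspace, yet it still falls to letter frequency counting, so keyspace size alone does not tell you how strong a cipher is.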

Digital Certificates

A very useful concept is a digital certificate, which contains data about a keyholder, and is provided by a certificate authority. The most critical feature is the public key, but the other factors are required by the X.509 standard. The link to Wikipedia tells us that X.509 is an international standard for PKI. Some of the elements included in that standard are:

  • Version Number
  • Serial Number
  • Signature Algorithm ID
  • Issuer Name
  • Validity period
    • Not Before
    • Not After
  • Subject name
  • Subject Public Key Info
    • Public Key Algorithm
    • Subject Public Key
  • Issuer Unique Identifier (optional)
  • Subject Unique Identifier (optional)
  • Extensions (optional)

You should know that keys are destroyed when they are compromised and when they reach the end of their intended life. This is more about private keys than public keys. Note that lifetime should be related to the sensitivity of the use the key serves. More sensitive equals shorter life.

This concept is easily confused with certificate revocation. When certificates are compromised (stolen and used by other entities) those certificates are moved to a Certificate Revocation List (CRL) which may or may not be part of the certificate repository.
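
The difference between an expired certificate and a revoked one can be sketched like this. The `Certificate` class and `is_usable` function are hypothetical, written only for this note. They borrow a few field names from the X.509 list above, but they do not parse real certificates and do not check signatures, which real validation must do.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Certificate:
    """A hypothetical, stripped-down stand-in for a few X.509 fields."""
    serial_number: int
    issuer_name: str
    subject_name: str
    not_before: datetime
    not_after: datetime


def is_usable(cert: Certificate, revoked_serials: set[int], now: datetime) -> bool:
    """Reject a certificate that is outside its validity period or on the CRL."""
    if not (cert.not_before <= now <= cert.not_after):
        return False  # not yet valid, or past the end of its intended life
    if cert.serial_number in revoked_serials:
        return False  # revoked early, for example after a compromise
    return True


if __name__ == "__main__":
    cert = Certificate(
        serial_number=1001,
        issuer_name="Example CA",      # made-up names for illustration
        subject_name="Example Vendor",
        not_before=datetime(2024, 1, 1, tzinfo=timezone.utc),
        not_after=datetime(2025, 1, 1, tzinfo=timezone.utc),
    )
    today = datetime(2024, 6, 1, tzinfo=timezone.utc)
    print(is_usable(cert, revoked_serials=set(), now=today))
    print(is_usable(cert, revoked_serials={1001}, now=today))
```

The two checks are independent: a certificate can be well inside its validity period and still be unusable because it appears on the revocation list.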

What PKI is and is not

PKI can provide security, integrity, and nonrepudiation. It is used for financial transactions and downloaded file integrity. PKI is meant to be one layer of security.

It does not include authorization functions. It does not prove the identity of someone who is only using the public key in a key pair. If encrypted data is desired in both directions in a session, both entities should have a key pair, and each will need the other's public key.

Also, PKI does not prove that someone who has a key pair is trustworthy. Unless you already trust a vendor, the fact that they use a public key pair does not by itself mean you should trust them. Consider the difference between a vendor who we already trust and respect, and a vendor we know nothing about. The transaction from each may be trustworthy, but the vendor in each case either deserves our trust or does not. PKI provides an outside authority to vouch for the vendors they represent.

Page 73 presents a list of common computer protocols that use encryption. Several have been discussed before:

  • SSH
  • SSL
  • TLS
  • IPSec
  • Password Authentication Protocol (PAP) should use encryption, but it does not. Not recommended. Use CHAP instead.
  • PPTP, L2TP, and SSTP, three tunneling protocols for VPNs

The discussion of attacks at the end of the chapter is not useful, so we will skip that.


Assignments

This week you need to complete Lab 2. It is due next week, which is week 4.
Assignment 2 and Part 2 of an ongoing course project are due in week 5.