Chapter 7, Securing Web Applications

ITS 4050 - Internet and Web Security

This lesson presents some material from chapter 7. Objectives important to this lesson:

User input issues
Web site technologies and systems
Layered security
Secure and unsecure protocols
SSL
Access control

Concepts:

Chapter 7

The chapter begins with a review list of attacks that can affect our systems while web browsers are connected to web applications:

Cross-site scripting
SQL injection
Directory traversal
URL redirection
XML injection
XQuery injection

The text follows this with a bullet list of advice:

Client-side validation of data to be sent to our web application cannot be trusted. It is mainly there for the users who are real customers. Hackers will work around it.
Use server-side validation. Data that is received across the Internet should be validated, sanitized, and accepted or rejected. Don't trust the user to answer the question that was asked. Assume that any received data has a flaw or an attack payload in it. Find it and get rid of it.
Regarding received data, consider using whitelists and blacklists when appropriate. A whitelist, in this case, is a table of all acceptable data strings for a form. If you use a whitelist, only strings that appear on it will be allowed. A blacklist is a table of data strings that are not allowed. If the user's input is on the blacklist, it will be rejected, and any other string will be accepted. Blacklisting is more dangerous: it allows anything except the problems we know about. Whitelisting is more protective, but it rejects any input it does not recognize.
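
As an illustration of the difference, here is a minimal Python sketch of server-side checks. It is not taken from the text. The field values, the blocked fragments, the username pattern, and the function names are all assumptions I made up for the example.

```python
import re

# Hypothetical whitelist: the only values a "shipping method" field may take.
ALLOWED_SHIPPING = {"ground", "two-day", "overnight"}

# Hypothetical blacklist: fragments we already know to be dangerous.
BLOCKED_FRAGMENTS = ("<script", "' or ", "../")

# A whitelist expressed as a pattern: 3 to 20 letters, digits, or underscores.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,20}")


def whitelist_ok(value: str) -> bool:
    """Accept only values that appear on the whitelist."""
    return value.strip().lower() in ALLOWED_SHIPPING


def blacklist_ok(value: str) -> bool:
    """Reject values containing a known-bad fragment; accept everything else."""
    lowered = value.lower()
    return not any(fragment in lowered for fragment in BLOCKED_FRAGMENTS)


def username_ok(value: str) -> bool:
    """Accept only usernames that match the whole pattern."""
    return USERNAME_PATTERN.fullmatch(value) is not None
```

In this sketch, blacklist_ok("<img src=x onerror=alert(1)>") returns True because that payload is not on the list of known problems, while whitelist_ok("teleport") returns False simply because the value is not recognized. That is the trade-off described above.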

The text appears to turn away from its subject for a moment on page 184. It recommends that we review some RFC (Request for Comments) documents. These are documents created by or sent to groups of engineers working on Internet functionality. In this case, the author is referring us to current and potential future Internet standards. These standards are being offered as guides to accepted or more useful data formats, and methods of validating client data. The RFC archives can be searched and reviewed at http://www.faqs.org/rfcs/.

Turning to technologies that make a web site work, the text begins with a short lesson on HTML on page 185. For those who have never used it, HTML is called a markup language because it uses tag pairs with text between them. Each tag pair specifies a command to the browser about how to present the enclosed text. On the original web page, this sentence, for example, is enclosed by a pair of tags (such as <b> and </b>) that say to begin and end using a style with a bold font. It is up to the browser to display that requested style as best it can.

The text points out that it is easy to add scripts to a web page which will run on the user's computer when the page is loaded. Typically, an attack will come from a web page the attacker has written and uploaded to some web server. As a harmless example, if you are viewing this page from my web site, you should see a change in the appearance of the navigation buttons in the upper left corner of the page whenever you hover over one with your mouse pointer. I could have written a script that would harvest information about you and send it to me. The harmless script in this case is written in JavaScript. A variation on this technique can also happen when an attacker submits code through a form that is then added to a page that other users will see or load. This kind of attack has been seen on social networking sites.
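
One common defense against the form-submission variation is to escape user-supplied text before it is placed in a page. The sketch below is my own, not the text's. It uses html.escape from Python's standard library, and the function name render_comment and the page fragment it builds are assumptions for illustration.

```python
import html


def render_comment(author: str, comment: str) -> str:
    """Build an HTML fragment with user-supplied text escaped.

    html.escape replaces &, <, >, and quote characters with entities,
    so a submitted <script> tag is displayed as text instead of running.
    """
    safe_author = html.escape(author)
    safe_comment = html.escape(comment)
    return f"<p><b>{safe_author}</b>: {safe_comment}</p>"
```

With this in place, a comment of <script>steal()</script> reaches other users as the harmless text &lt;script&gt;steal()&lt;/script&gt;.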

The text explains that requests from a web interface to a database (common on e-commerce sites) are often done through an intermediate layer. This layer is often a Common Gateway Interface (CGI) program. The CGI program would then make its request to an SQL Database Back-end, a database that is meant to be accessed by the intermediate program. The point here is that the CGI program can be written with the same suspicion we use in our data validation measures in our web applications. Adding another layer of security here supports the idea of defense in depth, and will serve as a good barrier if our attacker subverts the validation in our web pages or applications.
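
Here is a minimal sketch of what that suspicion can look like in the intermediate layer, assuming a made-up customers table and Python's built-in sqlite3 module standing in for the SQL back-end. The table, the sample rows, and the function names are mine, not the text's.

```python
import sqlite3


def demo_connection() -> sqlite3.Connection:
    """Create an in-memory database with two made-up customers."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, "Ann", "ann@example.com"), (2, "Bob", "bob@example.com")],
    )
    return conn


def find_customer(conn: sqlite3.Connection, email: str):
    """Look up a customer using a parameterized query."""
    # The ? placeholder keeps the input as data; it is never parsed as SQL.
    cursor = conn.execute(
        "SELECT id, name FROM customers WHERE email = ?", (email,)
    )
    return cursor.fetchall()
```

Because the input is bound as a parameter, a classic injection string such as ' OR '1'='1 is treated as an email address that matches no one, even if it slipped past the validation in the web page.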

This takes us to a brief discussion of the steps in the waterfall version of system development, shown on page 188. The text advises us to think about security while we are still designing a system. I hope this is something that would occur to every student in this program.

The text returns to the idea of layered security on page 189. It explains that firewalls cannot do the job alone. For example, they are unlikely to work against social engineering or against injection attacks. The first can occur on any part of our network that can be compromised with user cooperation, and the second on devices where we invite user input. That being said, the text presents several techniques (layers) by which security can be applied.

perimeter security - measures are applied to areas on the outer edge of our network
host-based security mechanisms - placing a line of defense in each host, and potentially in each application, that secures it directly
end-user education - teaching the users to trust less, to doubt more, and to protect what they can
authentication and access management - where authentication has not been defeated, this is a necessary line of defense
input validation - web pages and applications must be taught to reject user input that does not meet reasonable expectations
vulnerability management - patching, updating, and improving code to meet the challenges of newly discovered exploits and vulnerabilities

The text returns to its discussion of adding security concerns to application design for a few pages. It considers doing so at multiple steps of the waterfall model. It calls some steps by names other than those in the image on this page, but they mean the same things.

On page 193, the text begins a discussion of protocols. It tells us that HTTP, for example, sends HTML pages in cleartext. We should prefer to use HTTPS (HTTP Secure) instead, because HTTPS sends encrypted text through an SSL connection (or, on current systems, its successor, TLS). On the next page we see a list of several protocols that we also saw in chapter 2. This is the version of the chart I gave you then:

Old Protocol	Replacement Protocol
FTP - File Transfer Protocol	SFTP - SSH File Transfer Protocol (often called Secure File Transfer Protocol)
HTTP - Hypertext Transfer Protocol	HTTPS - Hypertext Transfer Protocol Secure
Telnet - Teletype Network (connect to a remote system)	SSH - Secure Shell (allows a secure connection to the remote device)
rsh - Unix protocol to run a Remote Shell	SSH - Secure Shell
RCP - Remote Copy Protocol	SCP - Secure Copy Protocol
SNMPv1 and v2 - Simple Network Management Protocol	SNMPv3 - Simple Network Management Protocol; this version allows encryption and authentication
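
To make the preference for the right-hand column concrete for HTTP and HTTPS, here is a small sketch of my own using Python's standard library. The function names are assumptions. ssl.create_default_context is a real standard library call that returns client settings with certificate and host name checking turned on.

```python
import ssl
from urllib.parse import urlsplit


def require_https(url: str) -> str:
    """Return the URL unchanged if it uses HTTPS; otherwise raise ValueError."""
    if urlsplit(url).scheme.lower() != "https":
        raise ValueError(f"refusing cleartext or unknown scheme: {url}")
    return url


def strict_tls_context() -> ssl.SSLContext:
    """Return a client context that verifies certificates and host names."""
    return ssl.create_default_context()
```

A program could call require_https on every address it is about to contact and pass strict_tls_context() to its HTTP library, so that a cleartext request fails loudly instead of going out unencrypted.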

Secure Sockets Layer (SSL) is discussed in terms of its use of public key cryptography to pass a symmetric key to a session partner that allows encrypted data to be passed in either direction. The text points out that MD5, long used as a hashing algorithm, has weaknesses, which means we should stop using it. Several substitutes are discussed on page 196. Skip the first two, and you will probably be using a secure method.
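
As a sketch of what moving away from MD5 can look like, the Python below uses SHA-256 from the standard hashlib module. SHA-256 is my choice for the example, and I am not claiming it is one of the specific substitutes on the text's list. The function names are also mine.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def matches_digest(data: bytes, expected_hex: str) -> bool:
    """Check data against a published digest using a constant-time compare."""
    return hmac.compare_digest(sha256_hex(data), expected_hex)
```

A typical use is checking a downloaded file against the digest its publisher posted: if matches_digest returns False, the file was changed or corrupted somewhere along the way.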

The last section of the chapter begins with a discussion of access controls and related mechanisms.

Identification takes place when a user tells a system who they are, typically by entering a recognized user ID.
Authentication takes place when a user proves they are the person that a user ID stands for, typically by entering the password linked to that ID.
Access control is the process of allowing or denying access to assets based on the permissions that have been granted to the ID for which the user has authenticated. Access control is done by the system. Permissions must have already been set up for the ID by someone with permission to do so, or the access control will not have any effect.
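
The three steps can be sketched in Python as three separate checks. This is a teaching sketch under my own assumptions: the user ID, the password, the permission table, and the function names are all made up, and a real system would keep these records in a protected store, not in the program.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash so the password itself is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


# Hypothetical records, set up in advance by someone with permission to do so.
_SALT = os.urandom(16)
USERS = {"jdoe": {"salt": _SALT, "hash": hash_password("correct horse", _SALT)}}
PERMISSIONS = {"jdoe": {"reports": {"read"}}}


def identify(user_id: str) -> bool:
    """Identification: is this a user ID the system recognizes?"""
    return user_id in USERS


def authenticate(user_id: str, password: str) -> bool:
    """Authentication: does the password match the one linked to the ID?"""
    if not identify(user_id):
        return False
    record = USERS[user_id]
    candidate = hash_password(password, record["salt"])
    return hmac.compare_digest(candidate, record["hash"])


def access_allowed(user_id: str, resource: str, action: str) -> bool:
    """Access control: was this action on this resource granted to the ID?"""
    return action in PERMISSIONS.get(user_id, {}).get(resource, set())
```

Note that access_allowed returns False for any ID with no entry in PERMISSIONS, which matches the point above: until permissions have been set up, the access control has nothing to grant.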

The text introduces the idea that access control applies to programs and processes on a system as well as to users. Whether a requester is a user or some process, that requester is called a subject by the access control system. The resource that the subject is attempting to access is called the object of the request.

The text lists four types of access control models, each of which has a different approach to using ACLs (Access Control Lists).

Discretionary Access Control (DAC) - Rights that are granted to a subject under this system may be granted by that subject to other subjects in the system. This means that the owner of an object can assign rights to other subjects (users) without needing the intervention of an administrator.
Mandatory Access Control (MAC) - In this one, there is more restriction. Objects are assigned to security classes, and subjects (users) are assigned security clearance levels. The result is that a user who has a clearance, for example, only for Confidential (and below) information cannot be assigned rights to an object classified as Secret or Top Secret.
Rule-based Access Control (RBAC, just like the next one, darn it) - This one is based on rules, like those in a firewall, that allow or deny based on the protocol of network traffic, the source or destination IP address, or a combination of these. (This author does a better job of explaining this type than most authors.)
Role-based Access Control (RBAC) - Roles are like groups. Users can be assigned to either and rights can be inherited from the group or role by the user. A user can be assigned to multiple roles. In this system:

subjects must be assigned to roles, or they will have no rights,
the role a subject is assigned to must be allowed (authorized) for the subject, which is not usually done with groups,
and transactions must be authorized for the role a subject is in, or else the subject cannot perform them.
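
To compare three of these models side by side, here is a short Python sketch of the decision each one makes. Everything in it is an assumption for illustration: the level numbers (including an Unclassified level I added at the bottom), the addresses, the rule order, the user names, and the role tables.

```python
from ipaddress import ip_address, ip_network

# Mandatory: a clearance must be at least as high as the classification.
LEVELS = {"Unclassified": 0, "Confidential": 1, "Secret": 2, "Top Secret": 3}


def mac_can_read(clearance: str, classification: str) -> bool:
    return LEVELS[clearance] >= LEVELS[classification]


# Rule-based: firewall-style rules; the first match wins, the default is deny.
RULES = [
    ("deny", "tcp", ip_network("10.0.5.0/24")),
    ("allow", "tcp", ip_network("10.0.0.0/8")),
]


def rule_allows(protocol: str, source: str) -> bool:
    for action, rule_protocol, network in RULES:
        if protocol == rule_protocol and ip_address(source) in network:
            return action == "allow"
    return False


# Role-based: transactions belong to roles, and roles must be authorized.
ROLE_TRANSACTIONS = {
    "clerk": {"create_order"},
    "manager": {"create_order", "approve_refund"},
}
AUTHORIZED_ROLES = {"pat": {"clerk"}, "sam": {"clerk", "manager"}}
ACTIVE_ROLES = {"pat": {"clerk"}, "sam": {"manager"}, "lee": set()}


def role_allows(user: str, transaction: str) -> bool:
    # Only roles that are both assigned and authorized for the user count.
    usable = ACTIVE_ROLES.get(user, set()) & AUTHORIZED_ROLES.get(user, set())
    return any(
        transaction in ROLE_TRANSACTIONS.get(role, set()) for role in usable
    )
```

In this sketch the user lee has no role and so has no rights, pat cannot approve a refund because that transaction is not authorized for the clerk role, and a subject cleared for Confidential cannot read a Secret object no matter who would like to grant it.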

On page 204, the text presents a short table of attack types and recommended mitigations for them. In case you are in need of a summary of the ideas, consider the first sentence in the mitigation column of the table: "Assume all input is harmful." This is good advice when accepting input from users.

Assignments

Continue the reading assignments for the course.
Download the new lab handouts as they become available, and submit your work on them.
Access the labs on the publisher's web site to perform their required labs.