Web application security

In this tutorial we're going to add passwords to our users and let them log in to the site. We'll also learn the basic tools for securing our website against common attacks.

Topics we will cover:

  • HTTP vs HTTPS
  • storing passwords securely
  • HTTP cookies
  • CSRF attacks and mitigation
  • XSS attacks and mitigation
  • security headers: CSP, HSTS, X-Frame-Options

HTTP vs HTTPS

When the browser connects to the server, the data they exchange passes through several computers on the way. Plain HTTP sends everything as unencrypted text. That means the machines between the browser and the server can read everything that is sent and even secretly change it. This is an obvious security issue: a malicious intermediary can steal users' passwords and other sensitive data.

HTTPS was invented to guard against such attacks. With HTTPS, the server and the browser create an encrypted connection that the intermediaries cannot decode or change (without being detected). To use HTTPS, the server must be configured with an encryption key and a digital certificate. The encryption key is used to set up the secure channel and it can be easily generated by anyone. The certificate tells the browser which domain (e.g. github.com) the encryption key belongs to.

The browser only trusts certificates issued by certain organizations called Certificate Authorities (CAs). The CAs' job is to make sure that a random person cannot get a valid certificate for a domain they don't own.

We will now set up HTTPS for our web application. The first step is to generate a new encryption key. Run the following command on the command line. Choose a random password when asked, leave everything else empty, and finally reply "yes" to the "Is ... correct?" prompt at the end.

keytool -genkey -alias htmlbasics -storetype PKCS12 -keyalg RSA -keysize 2048 -keystore keystore.p12 -validity 3650

This should create a file named keystore.p12. The file contains the generated encryption key and a self-signed certificate. The self-signed certificate will work for creating the connection, but the browser will not trust it because it's not issued by a trusted CA. When setting up a real site, you must get a real certificate from a CA such as Let's Encrypt, or your visitors will always see a security warning.

The next step is to switch our application to HTTPS. Open application.properties and add the following (replace the password with whatever you chose earlier):

server.port=8443
server.ssl.key-store=keystore.p12
server.ssl.key-store-password=your-keystore-password
server.ssl.key-store-type=PKCS12
server.ssl.key-alias=htmlbasics

Congratulations! HTTP should now be disabled and HTTPS enabled. You can connect to your application at https://localhost:8443/.

Storing passwords securely

Now that we have a secure connection to the server, we can start sending sensitive information to it. We want to add passwords to our users so they can log in.

Storing passwords requires some special considerations. Whatever we do, there's a chance that someone will hack our server and steal data from our database. We should ensure that the attacker cannot log into our users' accounts even after reading our database. Also, users often reuse their passwords on different websites - the attacker shouldn't be able to use the stolen data to hijack our users' accounts on other sites. This effectively means we can't store the passwords in our database.

At first this seems contradictory, because we need to check the password when the user logs in. Fortunately, we don't need to store the plain text password in the database. When storing a password, we first hash it and store only the hash value. A hash is a deterministic one-way function that converts arbitrary input to a fixed-size output. With a good hash function, converting the hash back to the original input should be practically impossible. When the user tries to log in, we hash the provided password and compare it with the hash stored in the database. If the hashes match, then for all practical purposes it is the same password (even though we never stored the actual password).

There are different hash functions and some are better suited for passwords than others. We will now discuss two main issues with hashing.

Rainbow tables

The attacker's objective is to discover the original input for a given hash value as fast as possible. In theory this should be impossible because hashing is a one-way function. However, nothing stops the attacker from building, in advance, a database that contains the hash of every possible short password. The English alphabet has 26 letters, so there are only 26^8 (about 209 billion) possible passwords made of 8 lowercase letters. A database containing all of their hashes would be less than 10 TB and would fit on just a few hard drives. If the hash -> input pairs are sorted by hash, then looking up a password by its hash is trivial. Rainbow tables are a more compact variant of such a precomputed database.
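
The arithmetic above can be checked with a few lines of Java. This is only a rough sketch: the class and method names are made up for illustration, and the entry layout (the 8-character password plus a 32-byte hash such as SHA-256) is an assumption, not something the attack requires.

```java
final class LookupTableSize {

    // Number of distinct passwords of exactly `length` characters
    // over an alphabet of `alphabetSize` symbols.
    static long candidateCount(int alphabetSize, int length) {
        long count = 1;
        for (int i = 0; i < length; i++) {
            count = Math.multiplyExact(count, (long) alphabetSize); // throws on overflow
        }
        return count;
    }

    // Assumed storage cost per entry: the password itself plus a raw hash of `hashBytes` bytes.
    static long tableBytes(int alphabetSize, int length, int hashBytes) {
        return Math.multiplyExact(candidateCount(alphabetSize, length), (long) (length + hashBytes));
    }

    public static void main(String[] args) {
        System.out.println(candidateCount(26, 8)); // 208827064576
        System.out.println(tableBytes(26, 8, 32)); // 8353082583040 bytes, roughly 8.4 TB
    }
}
```

Under these assumptions the table stays below 10 TB, which matches the estimate in the text.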

The solution to rainbow tables is salting. Instead of storing hash(password), we generate a random string (the salt) and store hash(salt+password). The salt can be stored in the database next to the password hash; it does not need to be secret. Its only purpose is to make each hashed input long and unique enough that a precomputed table would be too large to build. Note that each password should have its own salt, otherwise a single table computed for that one salt would work against every user.
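
To make the effect of the salt visible, here is a minimal sketch in plain Java. It uses SHA-256 only because it ships with the JDK; a fast hash like this is not suitable for real password storage. The class name, the method names and the 16-byte salt size are assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

final class SaltDemo {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generates a fresh random salt (16 bytes is an assumed size for this sketch).
    static byte[] newSalt() {
        byte[] salt = new byte[16];
        RANDOM.nextBytes(salt);
        return salt;
    }

    // Computes hash(salt + password) with SHA-256 and returns it as lowercase hex.
    static String saltedHash(byte[] salt, String password) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            digest.update(salt);
            digest.update(password.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    public static void main(String[] args) {
        // Two users pick the same password but get different salts,
        // so the values stored in the database differ.
        System.out.println(saltedHash(newSalt(), "my-password"));
        System.out.println(saltedHash(newSalt(), "my-password"));
    }
}
```

Hashing the same password with the same salt always gives the same result, which is what makes the login check possible.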

Brute forcing with the cloud

Even with a decent salted hash, the attacker can still guess a password by trying all possible inputs. The attacker can rent thousands of processors from a cloud provider such as Amazon and have them all guess passwords in parallel. Most hash functions are designed to be computed very quickly, so with the cloud it is possible to try billions of passwords per second.

This problem also has a few solutions:

  1. Use a slow hash function, such as bcrypt or scrypt
  2. Apply a fast hash function thousands of times:
    result = salt + password
    for (1..100000)
      result = hash(result)
    return result
    
    Search for PBKDF2 (Password-Based Key Derivation Function) to see how to do it securely
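
For the second option, the JDK already ships a PBKDF2 implementation, so the loop does not have to be written by hand. The sketch below is one possible way to call it; the class and method names are hypothetical, and the 256-bit output length is an assumption chosen for the example.

```java
import java.security.NoSuchAlgorithmException;
import java.security.spec.InvalidKeySpecException;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

final class Pbkdf2Demo {

    // Derives a 256-bit value from the password and salt using the given number of iterations.
    static byte[] derive(char[] password, byte[] salt, int iterations) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, 256);
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
            return factory.generateSecret(spec).getEncoded();
        } catch (NoSuchAlgorithmException | InvalidKeySpecException e) {
            throw new IllegalStateException("PBKDF2WithHmacSHA256 is not available", e);
        } finally {
            spec.clearPassword(); // wipe the spec's internal copy of the password
        }
    }
}
```

A call such as Pbkdf2Demo.derive("my-password".toCharArray(), salt, 100000) mirrors the iteration count of the loop above. The salt and the iteration count have to be stored alongside the result so that the same computation can be repeated at login.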

BCrypt in practice

We will use the BCrypt hash for storing our passwords. This is how it works:

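import org.springframework.security.crypto.bcrypt.BCrypt;

String salt = BCrypt.gensalt(); // new random salt for this password
String hashed = BCrypt.hashpw("my-password", salt); // store this value in the database
boolean matches = BCrypt.checkpw("my-password", hashed); // true when the password is correct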

Note that the output of hashpw already contains the salt, so you don't have to store it separately. The output looks something like $2a$10$59jVznraEgiehQk4m1qmWeVL1st/WXY1Gmz8CTPV5BdYG3wdUMfTe. The pom.xml in this repository contains the necessary dependency spring-security-crypto.

Adding the passwords

Update your forum application to support storing passwords:

  • ask for password on the registration page
  • store the password hash in the User class (should be stored in the database with the rest of the data)

HTTP cookies

Now that we have passwords, it's time to create a login page. The user will enter the email address and the password. The server can check whether the password is right. But what happens next - how can the server "remember" that the user is logged in?

The solution is cookies. The server can ask the browser to store a name-value pair (the cookie). Both the name and the value are plain strings (a few characters, such as semicolons and whitespace, are not allowed in them). From then on, whenever the browser sends a request to the server that set the cookie, it includes the cookie in the request.

The server will add the Set-Cookie header to a response to set a cookie. The browser will include the previously set cookies in the Cookie header when sending a request. The server can set several cookies (not limited to one).
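
To make the header exchange concrete, the following sketch plays the browser's part using the JDK's java.net.HttpCookie class: it parses a Set-Cookie response header and builds the Cookie request header that would be sent back. The class name, the method name and the cookie name login-token are made up for this example; a real browser also applies rules (domain, path, expiry) that are ignored here.

```java
import java.net.HttpCookie;
import java.util.List;

final class CookieHeaderDemo {

    // Parses a Set-Cookie response header and returns the matching Cookie request header.
    static String cookieHeaderFor(String setCookieHeader) {
        List<HttpCookie> cookies = HttpCookie.parse(setCookieHeader);
        StringBuilder header = new StringBuilder("Cookie: ");
        for (int i = 0; i < cookies.size(); i++) {
            if (i > 0) {
                header.append("; ");
            }
            HttpCookie cookie = cookies.get(i);
            // Only the name and value travel back to the server;
            // attributes such as Secure and HttpOnly stay in the browser.
            header.append(cookie.getName()).append('=').append(cookie.getValue());
        }
        return header.toString();
    }

    public static void main(String[] args) {
        System.out.println(cookieHeaderFor("Set-Cookie: login-token=abc123; Secure; HttpOnly"));
        // prints "Cookie: login-token=abc123"
    }
}
```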

Note that the cookies are stored in the browser, which means that the user can add, change and delete them. You can see the cookies in the browser developer tools and/or the privacy settings, depending on the browser.

In the server you can set the cookies like this:

@RequestMapping
public void cookieWriter(HttpServletResponse response) {
  Cookie cookie = new Cookie("cookie-name", "cookie-value");
  cookie.setSecure(true); // include only in HTTPS requests
  cookie.setHttpOnly(true); // don't allow javascript to read this cookie
  response.addCookie(cookie);
}

You can read the cookies like this (two options):

@RequestMapping
public void cookieReader(@CookieValue(name = "cookie-name", required = false) Cookie cookie) {
  if (cookie != null) {
    // use the cookie
  }
}

@RequestMapping
public void cookieReader(HttpServletRequest request) {
  Cookie[] cookies = request.getCookies();
  if (cookies != null) {
    // find the right cookie
  }
}

So how can we implement login with cookies? One option is to set a cookie named current-user whose value is the email of the user who just logged in with a correct password. The user's browser would then send the current-user cookie with each following request and the server would know which logged-in user is contacting it. However, there's a problem: an attacker could use the browser's developer tools to add the cookie manually and "become logged in" as any user they want.

A better solution is to use login tokens. When a user logs in, generate a long random string (30 characters?) with a cryptographically secure random number generator and store it in a cookie. Additionally, store the login token in the database, so that it's clear which user the token belongs to. When the user sends a request to the server, the server can match the token to the logged-in user. An attacker, however, has a negligible chance of guessing a valid token.
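
One way to generate such a token is sketched below. The class and method names are hypothetical; the choice of 24 random bytes is an assumption that happens to produce a 32-character token, close to the length suggested above.

```java
import java.security.SecureRandom;
import java.util.Base64;

final class LoginTokens {
    // SecureRandom is designed for security-sensitive values; java.util.Random is not.
    private static final SecureRandom RANDOM = new SecureRandom();

    // 24 random bytes encode to exactly 32 URL-safe characters, so no padding is needed.
    static String newToken() {
        byte[] bytes = new byte[24];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }
}
```

The URL-safe Base64 alphabet (letters, digits, "-" and "_") keeps the token free of characters that would need escaping in a cookie value.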

Note that the user could log in from different browsers. Each browser should have a different login token. This way if the user logs out from one browser, you could remove the token from the database and the other browsers would still be logged in.
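
The idea of one token per browser can be sketched with an in-memory map standing in for the database. Everything here (class name, method names, identifying users by email) is an assumption made for illustration; in the forum application the tokens would live in the User entity instead.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

final class LoginTokenStore {
    // token -> email of the user who owns it; several tokens may map to the same user
    private final Map<String, String> emailByToken = new ConcurrentHashMap<>();

    // Called after a successful login from some browser.
    void remember(String token, String email) {
        emailByToken.put(token, email);
    }

    // Called on every request: finds the user for the token sent in the cookie, if any.
    Optional<String> userFor(String token) {
        if (token == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(emailByToken.get(token));
    }

    // Called on logout: only this browser's token is removed.
    void forget(String token) {
        emailByToken.remove(token);
    }
}
```

Because logging out removes a single token, a session in another browser keeps working until its own token is removed.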

Adding login/logout

Time to add the login functionality. Create a new page that the user can use to log in. When the user logs in successfully, create the login token and store it both in the database and in a cookie.

Next, create another page that can be used to log out. When the user is logged in, the page should display the logged in user's display name. Additionally, there should be a button to log out the user. Pressing the button should cause the login token to be deleted from the database.

Hints:

  • Use @ElementCollection to store the login tokens in the User class.
  • Add findByLoginTokensEquals(String token) to the users repository. Spring Data understands it as "find any users whose loginTokens collection contains the exact string token".
  • The log out button should do a POST request, not a GET request. Recall that GET is a "safe" method.

Authenticate creating posts

Currently, when a user creates a new forum post, an email address field is used to identify the user. Remove the email field from the new post form. Set the post's author to the currently logged-in user. Hide the form when no user is logged in.

CSRF attacks and mitigation

Our login system with the login tokens is already quite secure, but there is still one major security issue we need to take care of. Remember that when a server sets a cookie, the cookie is included in all future requests to the server that set the cookie. Traditionally, browsers do not require that the request come from a web page served by that same server. (The newer SameSite cookie attribute can restrict this, but we will not rely on it here.)

Consider this scenario:

  • The user logs in to our forum and the login token is stored in a cookie.
  • The user leaves our site and visits another site (e.g. evil.com).
  • evil.com contains a button labeled "Get free stuff", implemented like this:
    <form action="https://your-forum-address/threads" method="post">
      <input type="hidden" name="threadName" value="check out my cool new program at evil.com/virus.exe" />
      <input type="submit" value="Get free stuff" />
    </form>
  • The user clicks the free stuff button, which submits the form to your forum site. Because the request is sent to your server, the login token cookie is included. The new evil forum thread is created with the currently logged-in user as the author.

This attack is called Cross-Site Request Forgery (CSRF). The issue is that we cannot prevent the browser from including the cookie when sending requests from other sites, yet the cookie alone is enough to authenticate the user. The solution to the CSRF problem is to change our code so that the login token alone is not enough for a POST request to be accepted.

This is how it works in detail:

  • When the user first visits our site, we generate a long random string (30 characters?) that will be our CSRF token.
  • We store the CSRF token in a separate cookie.
  • We add a hidden field containing the CSRF token to all our HTML forms.
  • When we receive a POST request, we check that the form contains the same CSRF token as the cookie.

The solution works because the browser will ensure that each site can only read and change its own cookies. Therefore, evil.com cannot read the csrf token in our cookie and cannot include it in its form.
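
The comparison step can be sketched in plain Java as follows. The class and method names are hypothetical. MessageDigest.isEqual is used instead of String.equals as a precaution, since it is meant to take the same time regardless of where the inputs differ.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

final class CsrfCheck {

    // Returns true only when both tokens are present and identical.
    static boolean tokensMatch(String cookieToken, String formToken) {
        if (cookieToken == null || formToken == null || cookieToken.isEmpty()) {
            return false; // a missing token must never count as a match
        }
        return MessageDigest.isEqual(
                cookieToken.getBytes(StandardCharsets.UTF_8),
                formToken.getBytes(StandardCharsets.UTF_8));
    }
}
```

A form submitted from evil.com would arrive with the cookie but without the matching hidden field, so tokensMatch would return false and the request could be rejected.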

Request filters

Including the CSRF token in every form is tedious, but it is a reliable defence. Adding the CSRF token check to the beginning of every @RequestMapping method would be madness, though. Thankfully, request filters offer an easier way.

This is what a filter looks like:

@Component
public class CsrfFilter extends GenericFilterBean {

  @Override
  public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse, FilterChain filterChain) throws IOException, ServletException {
    System.out.println("before controller");
    filterChain.doFilter(servletRequest, servletResponse);
    System.out.println("after controller");
  }
}

Filters allow you to run code before and after a request has been processed by the controller. When the request arrives, the server will call the first filter's doFilter. The filter can do whatever it wants with the request and then call filterChain.doFilter to pass the request to the next filter. When the final filter calls filterChain.doFilter the request is passed to the controller. After the controller has finished, the doFilter method will return and the filter can do whatever it wants again.
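
The calling order is easier to see in a tiny model that does not depend on the servlet API. The Chain and Filter interfaces below are simplified stand-ins invented for this sketch, not the real javax.servlet types; the "request" is just a list that records what ran.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FilterChainDemo {
    interface Chain { void proceed(List<String> log); }
    interface Filter { void doFilter(List<String> log, Chain next); }

    // A filter that records when it runs, before and after the rest of the chain.
    static Filter logging(String name) {
        return (log, next) -> {
            log.add("before " + name);
            next.proceed(log);
            log.add("after " + name);
        };
    }

    // Builds the chain back to front so the first filter runs first and the controller last.
    static List<String> run(List<Filter> filters) {
        Chain chain = log -> log.add("controller");
        for (int i = filters.size() - 1; i >= 0; i--) {
            Filter filter = filters.get(i);
            Chain next = chain;
            chain = log -> filter.doFilter(log, next);
        }
        List<String> trace = new ArrayList<>();
        chain.proceed(trace);
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(logging("A"), logging("B"))));
        // prints [before A, before B, controller, after B, after A]
    }
}
```

A filter that decides not to call next.proceed stops the request from reaching the controller at all, which is how a failed CSRF check can block a request.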

Implement the filter

Find the app.services.CsrfFilter class and add the following:

  • Check if we already have a CSRF token generated and stored in a cookie.
  • Generate and store the CSRF token if needed.
  • Check if the request needs CSRF checking: all requests except GET and HEAD should contain the CSRF token.
  • Check if the request contains the CSRF token and it's the same as the one stored in the cookie. If the check fails, throw an exception.
  • Store the CSRF token in the request attributes using request.setAttribute(...). This way you can access it in the Thymeleaf templates using ${attributeName}.
  • Add the CSRF token to all your HTML forms: <input type="hidden" ... />
  • Change your login and logout code so that these operations delete the CSRF cookie. This forces the CsrfFilter to generate a new CSRF token on the next request. The purpose is to avoid CSRF token fixation attacks (advanced stuff).
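
As a starting point, the "does this request need checking" rule from the list could be isolated in a small helper like the one below. The class and method names are hypothetical; in the filter the method name would typically come from HttpServletRequest.getMethod().

```java
import java.util.Locale;

final class CsrfRules {

    // GET and HEAD are treated as safe; every other method must carry the CSRF token.
    static boolean requiresCsrfCheck(String httpMethod) {
        String method = httpMethod.toUpperCase(Locale.ROOT);
        return !(method.equals("GET") || method.equals("HEAD"));
    }
}
```

Listing the safe methods and checking everything else means that any method added later is protected by default.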

See also the article "Cross-Site Request Forgery is dead!", which discusses how CSRF could be prevented on the browser side in the future.

Security headers

With the CSRF filter in place, the forum is reasonably well protected. Only a few tweaks remain.

Cross Site Scripting (XSS) attacks

The forum allows the users to post any content and then shows it in the threads. There's nothing stopping the users from posting bad stuff, such as scripts. For example, an attacker could post the following code snippet:

<script>document.location.assign('http://evil.com');</script>

Every user that would open the thread containing that post would be redirected to evil.com.

Luckily, Thymeleaf already escapes our HTML when we output text with th:text (it replaces < with &lt;, > with &gt; and so on). This causes the script tag to be rendered as text instead of being added as an HTML element. As an extra safety net against such scripting attacks, we will add the CSP header to our application.
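
To illustrate what escaping does, here is a minimal hand-written version. This is not Thymeleaf's actual implementation, only a sketch of the idea with a hypothetical class name; in the application the template engine should keep doing this job.

```java
final class HtmlEscape {

    // Replaces the characters that have a special meaning in HTML with entities.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("<script>alert(1)</script>"));
        // prints "&lt;script&gt;alert(1)&lt;/script&gt;"
    }
}
```

The browser displays the escaped text exactly as the attacker typed it, but never interprets it as a tag.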

The CSP (Content Security Policy) header tells the browser where it is allowed to load scripts and other files from. We will use CSP to tell the browser to only allow loading CSS files from our own server and to block all JavaScript (we're not using it anyway).

Create a new filter class SecurityHeaderFilter and have it add the following header to all responses that our server sends:

  • header name "Content-Security-Policy"
  • header value "default-src 'none'; style-src 'self';"

HTTP Strict-Transport-Security

At the start of this tutorial we set up HTTPS. The HSTS (HTTP Strict-Transport-Security) header lets us tell the browser to always use HTTPS with our site. This helps when a link to our page accidentally uses http:// instead of https://, or when the user types our address into the browser's URL bar without https://.

Change the SecurityHeaderFilter and add this header to all responses:

  • header name "Strict-Transport-Security"
  • header value "max-age=31536000; includeSubDomains"

X-Frame-Options

It is possible to show other pages inside your own page using the <iframe> tag (like a small embedded browser window). This enables an annoying attack called clickjacking. The attacker embeds your page in their own page so that your page fills the entire screen. Next, they place an invisible layer over your page and cover some of your input fields with their own.

For example, the attacker can show the victim your login page with the username and password fields covered by identical-looking fake fields. The user is easily tricked into entering their login information into the fake fields.

The fix is to use the X-Frame-Options header that tells the browser "please don't let this page be placed in an iframe". Add it to your SecurityHeaderFilter:

  • header name "X-Frame-Options"
  • header value "DENY"
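
The three headers from this section can be kept together in one place. The sketch below avoids the servlet API so that it stays self-contained; the class and method names are hypothetical, and the setHeader parameter stands in for a call such as HttpServletResponse.setHeader(name, value).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

final class SecurityHeaders {

    // The header names and values exactly as listed in this tutorial.
    static Map<String, String> all() {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Content-Security-Policy", "default-src 'none'; style-src 'self';");
        headers.put("Strict-Transport-Security", "max-age=31536000; includeSubDomains");
        headers.put("X-Frame-Options", "DENY");
        return headers;
    }

    // Hands every header to the given setter, one name/value pair at a time.
    static void apply(BiConsumer<String, String> setHeader) {
        all().forEach(setHeader);
    }
}
```

Inside a filter this could be used as SecurityHeaders.apply(response::setHeader) before passing the request on, assuming response is an HttpServletResponse.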

All done

The security features we added to our forum should protect it from a wide range of attacks. Feel free to read the linked articles and search for more information. Remember that security is not optional and should be built into the site from the start. It's harder to add it later and it's not nice to leave your users unprotected in the meantime.