layout | title | sidebar |
---|---|---|
page | HTTP, Yeah You Know Me! | true |
In this project we'll begin to introduce HTTP, the protocol that runs the web, and build a functioning web server to put that understanding into action.
- Practice breaking a workflow into a system of coordinating components
- Practice using TDD at the unit, integration, and acceptance levels
- Understand how the HTTP request/response cycle works
- Practice implementing basic HTTP requests and responses
The internet, which for most people is the web...how does that work?
HTTP (HyperText Transfer Protocol) is the protocol used for sending data from your browser to a web server then getting data back from the server. As protocols go, it's actually a very simple one.
Imagine that you're requesting information from a penpal (old school with paper, envelopes, etc). The protocol would go something like this:
- You write a letter requesting information
- You wrap that letter in an envelope
- You add an address that uniquely identifies the destination of the letter
- You hand the sealed envelope to your mail person
- It travels through a network of people, machines, trucks, planes, etc
- Assuming the address is correct, it arrives at your penpal's mailbox
- Your penpal opens the envelope and reads the letter
- Assuming they understand your question, your penpal writes a letter of their own back to you
- They wrap it in an envelope and add an address that uniquely identifies you (which they got from the return address on your envelope)
- They hand their letter to their mail person, it travels through a series of machines and people, and eventually arrives back at your mailbox
- You open the envelope and do what you see fit with the information contained in there.
Metaphor aside, let's run through the protocol as executed by computers:
- You open your browser and type in a web address like http://turing.io and hit enter. The URL (or "address") that you entered is the core of the letter.
- The browser takes this address and builds a request, the envelope. It uniquely identifies the machine (or server) out there on the internet that the message is intended for. It includes a return address and other information about the requestor.
- The request is handed off to your Internet Service Provider (ISP) (like CenturyLink or Comcast) and they send it through a series of wires and fiber optic cables towards the server
- The request arrives at the server. The server reads the precisely formatted request to figure out (a) who made the request and (b) what they requested
- The server fetches or calculates the requested information and prepares a response. The response wraps the requested information in an envelope that has the destination address on it (your machine).
- The server hands the response off to their ISP and it goes through the internet to arrive at your computer
- Your browser receives that response, unwraps it, and displays the data on your machine.
That's HTTP. You can read more in the Wikipedia article or the IETF specification.
Here is what an actual request looks like. Note that it's just a single highly-formatted string:
GET / HTTP/1.1
Host: 127.0.0.1:9292
Connection: keep-alive
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
The parts we're most interested in are:
- The first line, GET / HTTP/1.1, which specifies the verb, path, and protocol, which we'll pick apart later
- Host, which is where the request is sent to
- Accept, which specifies what format of data the client wants back in the response
With those pieces of information a typical server can generate a response.
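As a rough sketch of how a server might pull those pieces out of the raw request, the snippet below splits the first line into verb, path, and protocol and collects the remaining lines into a hash of headers. The method name parse_request and the shape of the returned hash are assumptions made for illustration, not part of the project spec.

```ruby
# Hypothetical helper: turn the lines of a request into a hash of its parts.
def parse_request(request_lines)
  # The first line holds the verb, the path, and the protocol.
  verb, path, protocol = request_lines.first.split(" ", 3)

  # Every other line is a "Name: value" header.
  headers = request_lines.drop(1).each_with_object({}) do |line, result|
    name, value = line.split(":", 2)
    result[name.strip] = value.to_s.strip
  end

  { verb: verb, path: path, protocol: protocol, headers: headers }
end

request_lines = [
  "GET / HTTP/1.1",
  "Host: 127.0.0.1:9292",
  "Accept: text/html,application/xhtml+xml"
]
request = parse_request(request_lines)
puts request[:verb]
puts request[:headers]["Host"]
```

Splitting each header on the first colon only matters for Host, whose value contains a colon of its own before the port.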
The Server generates and transmits a response that looks like this:
HTTP/1.1 200 OK
date: Sun, 1 Nov 2015 17:25:48 -0700
server: ruby
content-type: text/html; charset=iso-8859-1
content-length: 27

The response body goes here
The parts we're most interested in are:
- The first line, HTTP/1.1 200 OK, which has the protocol and the response code
- The unmarked lines at the end which make up the body of the response
- content-length, which tells the client when to stop listening
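To make those parts concrete, here is a minimal sketch of how a client could take a raw response apart. The helper name parse_response and its return format are hypothetical; the only behavior it relies on is that a blank line separates the headers from the body and that content-length counts the bytes of the body.

```ruby
# Hypothetical helper: split a raw response into status, headers, and body.
def parse_response(raw)
  # A blank line (two line endings in a row) separates headers from body.
  head, body = raw.split("\r\n\r\n", 2)
  status_line, *header_lines = head.split("\r\n")
  protocol, code, reason = status_line.split(" ", 3)

  headers = header_lines.each_with_object({}) do |line, result|
    name, value = line.split(":", 2)
    result[name.strip.downcase] = value.to_s.strip
  end

  # content-length says how many bytes of body belong to this response.
  length = headers.fetch("content-length", body.to_s.bytesize).to_i
  { protocol: protocol, code: code.to_i, reason: reason,
    headers: headers, body: body.to_s.byteslice(0, length) }
end

raw = "HTTP/1.1 200 OK\r\n" \
      "content-type: text/html\r\n" \
      "content-length: 27\r\n" \
      "\r\n" \
      "The response body goes here"
response = parse_response(raw)
puts response[:code]
puts response[:body]
```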
While working on this project you're going to need to make a lot of HTTP requests. There are many tools that can help you with that, but we recommend you use the following:
- Web Browser -- if you need to make HTTP GET requests you can use the browser, but it's the weakest of these three tools.
- Postman -- a Chrome extension which gives you amazing control and the ability to make any kind of request. Use this for your manual testing and experimentation.
- Faraday -- a Ruby library for making requests and parsing responses. Use this for your automated testing, basically like a scripted version of the request/response cycles you could do with Postman.
Ruby has handy built-in libraries for dealing with most of the low-level networking details about running a server. Let's write a short program that can start up, listen for a request, print that request out to the screen, then shut down.
First, we need to "open a port" which basically means "tell the computer that network requests addressed to a specific port should belong to this program".
On your computer there are dozens of programs that are using the network connection at any given time. If the messages in and out of those programs were all happening through the same channel then it'd be confusing which message belongs to which program. Think of the port like a mailbox in an apartment building: all the residents (aka programs) share the same street address (your computer) but each have their own mailbox (or port).
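To see the "one mailbox per program" idea in action, this small sketch tries to listen on the same port twice. It assumes a typical Linux or macOS machine, where the second attempt is refused with Errno::EADDRINUSE. Binding to port 0 is a choice made for this sketch: it asks the operating system for any free port, so nothing collides with a server you already have running.

```ruby
require 'socket'

# Port 0 asks the operating system to pick any free port for us.
first = TCPServer.new("127.0.0.1", 0)
port = first.addr[1]

begin
  # A second listener on the same address and port should be refused.
  TCPServer.new("127.0.0.1", port)
  second_claim = :succeeded
rescue Errno::EADDRINUSE
  second_claim = :address_in_use
ensure
  first.close
end

puts "Second attempt to listen on port #{port}: #{second_claim}"
```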
Let's start our server instance and have it listen on port 9292:
require 'socket'
tcp_server = TCPServer.new(9292)
client = tcp_server.accept
We can read the request from the client object, which is what we call an IO stream. Here's a snippet to keep reading from that stream until the input is a blank line and store all the request lines in an array request_lines:
puts "Ready for a request"
request_lines = []
while line = client.gets and !line.chomp.empty?
  request_lines << line.chomp
end
Note that when the program runs it'll hang on the accept method call, waiting for a client to connect. Once a request arrives, gets reads it line by line into request_lines. Then let's print it to the console for debugging:
puts "Got this request:"
puts request_lines.inspect
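Because the client is just an IO stream, the read loop can be tried without any network at all. The sketch below wraps the loop in a hypothetical method, read_request_lines, and feeds it a StringIO that stands in for the client. Both the method name and the fake request are assumptions for illustration.

```ruby
require 'stringio'

# Hypothetical helper wrapping the read loop from the walkthrough.
def read_request_lines(io)
  request_lines = []
  # Stop at the first blank line, or when the stream runs out (gets returns nil).
  while (line = io.gets) && !line.chomp.empty?
    request_lines << line.chomp
  end
  request_lines
end

# A StringIO behaves like the client stream but needs no socket.
fake_client = StringIO.new("GET / HTTP/1.1\r\nHost: 127.0.0.1:9292\r\n\r\nbody is not read")
lines = read_request_lines(fake_client)
puts lines.inspect
```

Anything after the blank line is left unread in the stream, which is where the body of a request lives.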
Then it's time to build a response. For this example let's just print out the request data as the response:
puts "Sending response."
response = "<pre>" + request_lines.join("\n") + "</pre>"
output = "<html><head></head><body>#{response}</body></html>"
headers = ["HTTP/1.1 200 OK",
           "date: #{Time.now.strftime('%a, %e %b %Y %H:%M:%S %z')}",
           "server: ruby",
           "content-type: text/html; charset=iso-8859-1",
           "content-length: #{output.length}\r\n\r\n"].join("\r\n")
client.puts headers
client.puts output
And close up the server:
puts ["Wrote this response:", headers, output].join("\n")
client.close
puts "\nResponse complete, exiting."
Save that file and run it. Open your web browser and enter the address http://127.0.0.1:9292. If everything worked then your browser should show all the details of your request. Flip over to the terminal where your Ruby program was running and you should see the request printed there.
You just built a web server.
Having trouble? Check out the whole file here.
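If you'd rather check the server without a browser, one option is to let Ruby play both roles. The sketch below is a variation on the walkthrough under a few assumptions: the server logic is wrapped in a hypothetical method serve_one_request, it runs in a background thread, it listens on port 0 so the operating system picks a free port, and a plain TCPSocket stands in for the browser.

```ruby
require 'socket'

# Hypothetical helper: accept one connection and echo the request back as HTML.
def serve_one_request(tcp_server)
  client = tcp_server.accept
  request_lines = []
  while (line = client.gets) && !line.chomp.empty?
    request_lines << line.chomp
  end

  output = "<html><head></head><body><pre>#{request_lines.join("\n")}</pre></body></html>"
  headers = ["HTTP/1.1 200 OK",
             "content-type: text/html; charset=iso-8859-1",
             "content-length: #{output.bytesize}"].join("\r\n")

  # print writes exactly these bytes, so content-length matches the body.
  client.print headers, "\r\n\r\n", output
  client.close
  request_lines
end

# Port 0 (an assumption for this sketch) lets the OS pick a free port.
tcp_server = TCPServer.new("127.0.0.1", 0)
port = tcp_server.addr[1]
server_thread = Thread.new { serve_one_request(tcp_server) }

# Play the part of the browser with a plain TCP socket.
socket = TCPSocket.new("127.0.0.1", port)
socket.print "GET / HTTP/1.1\r\nHost: 127.0.0.1:#{port}\r\n\r\n"
raw_response = socket.read
socket.close
server_thread.join
tcp_server.close

puts raw_response
```

The same shape (start the server in a thread, send a request, inspect the response) is roughly what an automated acceptance test would do.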
You're going to build a web application capable of:
- Receiving a request from a user
- Comprehending the request's intent and source
- Generating a response
- Sending the response to the user
Build a web application/server that:
- listens on port 9292
- responds to HTTP requests
- responds with a valid HTML response that displays the words Hello, World! (0), where the 0 increments each request until the server is restarted
Let's start to rip apart that request and output it in your response. In the body of your response, include a block of HTML like this including the actual information from the request:
<pre>
Verb: POST
Path: /
Protocol: HTTP/1.1
Host: 127.0.0.1
Port: 9292
Origin: 127.0.0.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
</pre>
Keep the code that outputs this block at the bottom of all your future outputs to help with your debugging.
Now let's react to the path that the user specifies:
- If they request the root, aka /, respond with just the debug info from Iteration 1.
- If they request /hello, respond with "Hello, World! (0)" where the 0 increments each time the path is requested, but not when any other path is requested.
- If they request /datetime, respond with today's date and time in this format: 11:07AM on Sunday, November 1, 2015.
- If they request /shutdown, respond with "Total Requests: 12" where 12 is the aggregate of all requests. It also causes the server to exit / stop serving requests.
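One possible way to organize this, sketched under assumptions: a small object that counts requests and picks a body based on the path. The class name Router, the method body_for, and the injectable now argument are all hypothetical, and the sketch leaves out the debug block and the actual shutting down of the server.

```ruby
# Hypothetical router for the paths in this iteration.
class Router
  attr_reader :hello_count, :total_count

  def initialize
    @hello_count = 0
    @total_count = 0
  end

  # Returns the response body for a path. `now` can be passed in so the
  # date format is easy to test.
  def body_for(path, now = Time.now)
    @total_count += 1
    case path
    when "/hello"
      body = "Hello, World! (#{@hello_count})"
      @hello_count += 1
      body
    when "/datetime"
      # %-l is the 12-hour hour without padding, %-d the day without padding.
      now.strftime("%-l:%M%p on %A, %B %-d, %Y")
    when "/shutdown"
      "Total Requests: #{@total_count}"
    else
      "" # the root only needs the debug block, which this sketch omits
    end
  end
end

router = Router.new
puts router.body_for("/hello")
puts router.body_for("/datetime", Time.new(2015, 11, 1, 11, 7))
```

Keeping the counters inside one object means the code that talks to the socket never has to know how a body was chosen.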
Often we want to supply some data with a request. For instance, if you were submitting a search, that'd typically be a GET request that has a parameter. When we use parameters in GET requests they are embedded in the URL like this:
http://host:port/path?param=value&param2=value2
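As a sketch of how a server might separate the path from its parameters, the snippet below splits on the first question mark and hands the rest to URI.decode_www_form from Ruby's standard library. The helper name split_path_and_params is an assumption for illustration.

```ruby
require 'uri'

# Hypothetical helper: split a request target into its path and parameters.
def split_path_and_params(target)
  path, query = target.split("?", 2)
  # decode_www_form returns [name, value] pairs; to_h turns them into a hash.
  params = URI.decode_www_form(query.to_s).to_h
  [path, params]
end

path, params = split_path_and_params("/path?param=value&param2=value2")
puts path
puts params.inspect
```

A target with no question mark simply yields an empty hash of parameters.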
You know your computer has a dictionary built in, right? It's stored in a special file on your machine located at /usr/share/dict/words. Let's use this information to write an "endpoint" that works like this:
- The path is /word_search
- The verb will always be a GET
- The parameter will be named word
- The value will be a possible word fragment
In your HTML response page, output one of these:
WORD is a known word
WORD is not a known word
Where WORD is the parameter from the URL.
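A minimal sketch of the lookup itself, with a few assumptions labeled: the method names are hypothetical, the comparison is exact (whether to ignore case is left up to you), and the usage example runs against a tiny stand-in list rather than the real dictionary file so that it works on any machine.

```ruby
require 'set'

# Hypothetical helper: read the dictionary file into a Set for fast lookups.
def load_dictionary(path = "/usr/share/dict/words")
  Set.new(File.readlines(path, chomp: true))
end

# Hypothetical helper: build the line of output for a searched word.
def word_search_message(word, dictionary)
  if dictionary.include?(word)
    "#{word} is a known word"
  else
    "#{word} is not a known word"
  end
end

# Stand-in list (an assumption) so the example does not depend on the file.
dictionary = Set.new(%w[pizza pizzeria pizzicato])
puts word_search_message("pizza", dictionary)
puts word_search_message("pizz", dictionary)
```

Loading the file once when the server starts, rather than on every request, keeps each lookup cheap.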
The path is the main way that the user specifies what they're requesting, but the secondary tool is the verb. There are several official verbs, but the only two typical servers use are GET and POST.
We use GET to fetch information. We typically use POST to send information to the server. When we submit parameters in a POST they're in the body of the request rather than in the URL.
Changing the verb to POST and submitting parameters in the body instead of the URL can both be done in Postman.
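Since the body comes after the blank line that ends the headers, the server has to keep reading past the point where the earlier read loop stopped. The sketch below assumes the body is form-encoded (application/x-www-form-urlencoded; Postman can also send other formats) and that a Content-Length header says how many bytes to read. The helper name read_post_params and the path /some_path are hypothetical.

```ruby
require 'stringio'
require 'uri'

# Hypothetical helper: read headers up to the blank line, then read exactly
# Content-Length bytes of body and decode them as form parameters.
def read_post_params(io)
  content_length = 0
  while (line = io.gets) && !line.chomp.empty?
    name, value = line.chomp.split(":", 2)
    content_length = value.to_i if name.casecmp?("Content-Length")
  end
  body = io.read(content_length).to_s
  URI.decode_www_form(body).to_h
end

raw = "POST /some_path HTTP/1.1\r\n" \
      "Host: 127.0.0.1:9292\r\n" \
      "Content-Type: application/x-www-form-urlencoded\r\n" \
      "Content-Length: 8\r\n" \
      "\r\n" \
      "guess=42"
params = read_post_params(StringIO.new(raw))
puts params.inspect
```

Reading exactly Content-Length bytes matters on a real socket: the client keeps the connection open, so reading "until the end" would hang.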
Let's write a simple guessing game that works like this:
A POST request to /start_game begins a game. The response says Good luck!
A GET request to /game tells us:
- a) how many guesses have been taken.
- b) if a guess has been made, it tells what the guess was and whether it was too high, too low, or correct
This is how we make a guess. The request includes a parameter named guess. The server stores the guess and sends the user a redirect response, causing the client to make a GET to /game.
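The game logic can live in a plain Ruby object that knows nothing about HTTP, which makes it easy to unit test. This is only a sketch: the class name GuessingGame, the wording of the status message, and the range of 0 to 100 for the secret number are all assumptions, since the project doesn't specify them.

```ruby
# Hypothetical game object; names and the 0..100 range are illustrative.
class GuessingGame
  attr_reader :guesses

  # The answer can be passed in so tests don't depend on a random number.
  def initialize(answer = rand(0..100))
    @answer = answer
    @guesses = []
  end

  # Store a guess; the status is reported separately.
  def guess(number)
    @guesses << number
  end

  # Describe how many guesses were taken and how the latest one did.
  def status
    return "Guesses taken: 0" if @guesses.empty?

    last = @guesses.last
    verdict =
      if last > @answer
        "too high"
      elsif last < @answer
        "too low"
      else
        "correct"
      end
    "Guesses taken: #{@guesses.size}. Your guess of #{last} was #{verdict}."
  end
end

game = GuessingGame.new(50)
game.guess(75)
puts game.status
```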
We use the HTTP response code as a shorthand way to explain the result of the request. Here are the most common HTTP status codes:
200 OK
301 Moved Permanently
401 Unauthorized
403 Forbidden
404 Not Found
500 Internal Server Error
Let's modify your game from Iteration 4 to use status codes:
- Most requests, unless listed below, should respond with a 200.
- When you submit the POST to /start_game and there is no game in progress, it should start one and respond with a 301 redirect.
- When you submit the POST to /start_game but there is already a game in progress, it should respond with 403.
- If an unknown path is requested, like /fofamalou, the server responds with a 404.
- If the server generates an error, then it responds with a 500. Within the response let's present the whole stack trace. Since you don't write bugs, create a /force_error endpoint which just raises a SystemError exception.
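Here is a rough sketch of how responses with different status codes might be assembled. The names REASONS, build_response, and respond_safely are hypothetical. One fact worth knowing: a redirect response tells the client where to go next through a location header, which is why the 301 example includes one.

```ruby
# Reason phrases for the common status codes.
REASONS = {
  200 => "OK", 301 => "Moved Permanently", 401 => "Unauthorized",
  403 => "Forbidden", 404 => "Not Found", 500 => "Internal Server Error"
}.freeze

# Hypothetical helper: assemble a full response for a status code.
def build_response(code, body = "", extra_headers = {})
  headers = { "content-type" => "text/html",
              "content-length" => body.bytesize }.merge(extra_headers)
  lines = ["HTTP/1.1 #{code} #{REASONS.fetch(code)}"]
  lines += headers.map { |name, value| "#{name}: #{value}" }
  lines.join("\r\n") + "\r\n\r\n" + body
end

# Hypothetical helper: run a block that produces a body; if it raises,
# respond with a 500 whose body is the error message and stack trace.
def respond_safely
  build_response(200, yield)
rescue StandardError => error
  trace = ([error.message] + error.backtrace.to_a).join("\n")
  build_response(500, "<pre>#{trace}</pre>")
end

puts build_response(301, "", { "location" => "/game" })
puts respond_safely { raise "forced error" }
```

Note that rescuing StandardError only catches exceptions that inherit from it, so whatever exception class your /force_error endpoint raises needs to fall under it for this approach to work.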
The HTTP-Accept parameter is used to specify what kind of data the client wants in response. Modify your /word_search path so that if the HTTP-Accept starts with application/json then they are sent a JSON body like the following.
A search for pizza returns this JSON:
{"word":"pizza","is_word":true}
A search for pizz returns JSON with possible matches like this:
{"word":"pizz","is_word":false,"possible_matches":["pizza","pizzeria","pizzicato"]}
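A sketch of how the response body could switch format based on the Accept value, using Ruby's built-in json library. The helper name word_search_body is hypothetical, the word list is a tiny stand-in, and the sketch assumes a fragment search reports the fragment itself with is_word set to false alongside the possible matches.

```ruby
require 'json'

# Hypothetical helper: choose JSON or plain text based on the Accept value.
def word_search_body(word, dictionary, accept)
  known = dictionary.include?(word)
  if accept.to_s.start_with?("application/json")
    result = { word: word, is_word: known }
    # Assumption: only fragments that are not words list possible matches.
    unless known
      result[:possible_matches] = dictionary.select { |entry| entry.start_with?(word) }
    end
    result.to_json
  else
    known ? "#{word} is a known word" : "#{word} is not a known word"
  end
end

dictionary = %w[pizza pizzeria pizzicato] # stand-in list (assumption)
puts word_search_body("pizza", dictionary, "application/json")
puts word_search_body("pizz", dictionary, "application/json")
puts word_search_body("pizza", dictionary, "text/html")
```

A JSON response should also say so in its own content-type header, so the client knows how to read the body.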
What happens if your web server gets more than one request at a time? Let's experiment with Threads. To be continued.
The project will be assessed with the following rubric:
- 4: Application implements all five iterations and at least one extension
- 3: Application implements iterations 0 - 4
- 2: Application implements iterations 0 - 3
- 1: Application implements through iteration 2 or less
- 4: Application demonstrates excellent knowledge of Ruby syntax, style, and refactoring
- 3: Application shows some effort toward organization but still has 6 or fewer long methods (> 8 lines) and needs some refactoring.
- 2: Application runs but the code has many long methods (>8 lines) and needs significant refactoring
- 1: Application generates syntax error or crashes during execution
- 4: Application is broken into components which are well tested in both isolation and integration
- 3: Application uses tests to exercise core functionality and some edge cases, but fails to break out component objects/tests.
- 2: Application uses tests to exercise core functionality but leaves many common edge cases untested.
- 1: Application does not demonstrate strong use of TDD
- 4: Application effectively breaks logical components apart with clear intent and usage
- 3: Application has multiple components with defined responsibilities but there is some leaking of responsibilities
- 2: Application has some logical components but divisions of responsibility are inconsistent or unclear and/or there is a "God" object taking too much responsibility
- 1: Application logic shows poor decomposition with too much logic mashed together
There is content from previous versions not germane to the assignment above available here.