Coding an HTTP/1.1 server in C++ 98, according to RFC 7230 through 7235 [05/2021-08/2021] [Ubuntu18]

42_webserv

42's webserv, team rturcey-esoulard! [05/2021-08/2021] [Ubuntu18]

GOAL: write an HTTP server in C++ 98 (HTTP 1.1). When in doubt, compare the program's behaviour with nginx.

LINKS:

RESOURCES: difficulty is rated from 1 to 3. 1 is pretty understandable, 2 means you're not gonna have the best time but it needs to be done, kinda like a very awkward family dinner, and 3 is as obscure as the deepest, darkest pits of hell.

BEFORE STARTING, DO TESTS WITH THESE TO SEE WHAT OUR SERVER SHOULD DO:

  • telnet: /!\ telnet is an unencrypted and therefore insecure protocol.
  • nginx: nginx is said to be pretty secure but steps can be taken in its configuration to secure it further.

KEY NOTIONS:

  • Hypertext Transfer Protocol (HTTP) : an application protocol for distributed, collaborative, hypermedia information systems. Was developed to facilitate hypertext and the World Wide Web. Communication between client and server uses HTTP. HTTP also includes ways of receiving content from clients. This feature is used for submitting web forms, including uploading of files.
  • HTTP session : a sequence of network request–response transactions. An HTTP client initiates a request by establishing a Transmission Control Protocol (TCP) connection to a particular port on a server (typically port 80, occasionally port 8080). An HTTP server listening on that port waits for a client's request message. Upon receiving the request, the server sends back a status line, such as "HTTP/1.1 200 OK", and a message of its own. The body of this message is typically the requested resource, although an error message or other information may also be returned. With HTTP/1.1, a connection can be reused for more than one request. The body can contain data in any format that server and client both know how to handle: plain text, pictures, HTML, XML...
  • HTTP header : name-value fields placed between the start line and the body of each request and response, describing the message (for example Host or Content-Length). See HTTP_header_walkthrough.txt
  • World Wide Web : an information system in which hypertext documents include hyperlinks to other resources that the user can easily access.
  • Web server : stores, processes and delivers web pages to clients. It is a regular network application that listens on a specific port (80 by default for HTTP, 443 for HTTPS). The HTTP request must be addressed to THAT port.
  • HTML : Pages delivered are most frequently HTML documents, which may include images, style sheets and scripts in addition to the text content.
  • User agent : commonly a web browser or web crawler. Initiates communication by making a request for a specific resource using HTTP and the server responds with the content of that resource or an error message if unable to do so. The resource is typically a real file on the server’s secondary storage, but this is not necessarily the case and depends on how the web server is implemented.
  • cURL: a very popular HTTP client. The project includes both a standalone command-line program (curl) and a library (libcurl) that can be used from various programming languages.
  • CGI: Common Gateway Interface: interface specification that enables web servers to execute an external program, typically to process user requests. A typical use case occurs when a Web user submits a Web form on a web page that uses CGI. The form's data is sent to the Web server within an HTTP request with a URL denoting a CGI script. The Web server then launches the CGI script in a new computer process, passing the form data to it. The output of the CGI script, usually in the form of HTML, is returned by the script to the Web server, and the server relays it back to the browser as its response to the browser's request. For more info on CGI
  • Socket : mechanism that most popular operating systems provide to give programs access to the network. It allows messages to be sent and received between applications (unrelated processes) on different networked machines.
  • What's the deal with select()
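The request/response exchange described under "HTTP session" can be sketched as below. This is a minimal illustration, not the project's actual code: `RequestLine`, `parse_request_line()` and `build_response()` are hypothetical names, the parser is more lenient about whitespace than RFC 7230 strictly allows, and the response always uses `text/plain`.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper type (not from the project): the three parts of
// a request line such as "GET /index.html HTTP/1.1".
struct RequestLine
{
    std::string method;
    std::string target;
    std::string version;
};

// Splits a request line into method, target and version.
// Returns false unless the line holds exactly three whitespace-separated
// tokens. Assumption: this is lenient parsing; RFC 7230 asks for single
// spaces between the tokens, which is not enforced here.
bool parse_request_line(const std::string &line, RequestLine &out)
{
    std::string trimmed = line;

    // Drop the trailing CRLF (or lone LF) that ends the line.
    while (!trimmed.empty()
        && (trimmed[trimmed.size() - 1] == '\r'
            || trimmed[trimmed.size() - 1] == '\n'))
        trimmed.erase(trimmed.size() - 1);

    std::istringstream iss(trimmed);
    std::string extra;
    if (!(iss >> out.method >> out.target >> out.version))
        return false;
    if (iss >> extra) // a fourth token means the line is malformed
        return false;
    return true;
}

// Builds a complete response: status line, two header fields,
// an empty line, then the body.
std::string build_response(int code, const std::string &reason,
                           const std::string &body)
{
    std::ostringstream oss;
    oss << "HTTP/1.1 " << code << " " << reason << "\r\n"
        << "Content-Type: text/plain\r\n"
        << "Content-Length: " << body.size() << "\r\n"
        << "\r\n"
        << body;
    return oss.str();
}
```

Calling `build_response(200, "OK", "hi")` produces the status line "HTTP/1.1 200 OK" followed by the headers and the two-byte body. A real server would also validate the method and version, and would pick the status code by comparing against what nginx returns for the same request.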

PROGRAM ARGUMENTS

MINDMAPS:

  • Main modules of our program (careful: each save generates a new link that we need to update here, otherwise the update will be lost!)
  • of more precise modules (TBD)

TEAM NAMING FORMATS CONVENTION (TBC):

  • function_name()
  • variable_name
  • _private_variable
  • ClassName
  • s_struct t_struct
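As an illustration of the conventions listed above, here is a small sketch. Every identifier in it is made up for the example, and reading "s_struct t_struct" as "struct tag prefixed with s_, typedef prefixed with t_" is an assumption, since the convention is still marked TBC.

```cpp
#include <string>

// Assumed reading of "s_struct t_struct": s_ for the struct tag,
// t_ for its typedef. All names below are hypothetical.
typedef struct s_listen
{
    std::string host;
    int         port;
}   t_listen;

// ClassName: capitalised words, no underscores.
class ServerConfig
{
public:
    ServerConfig(const std::string &server_name, int port)
        : _server_name(server_name)
    {
        _listen.host = "127.0.0.1"; // placeholder value for the example
        _listen.port = port;
    }

    // function_name(): lowercase words separated by underscores.
    const std::string &get_server_name() const { return _server_name; }
    int get_port() const { return _listen.port; }

private:
    // _private_variable: leading underscore for private members.
    std::string _server_name;
    t_listen    _listen;
};
```

A local variable would follow the same lowercase pattern as functions, for example `ServerConfig default_config("example", 8080);`.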

ALLOWED TOOLS:

  • Everything in C and C++ 98 but no external library.

RESOURCES ON SPECIFIC ISSUES

BEFORE HANDING IT IN, DO TESTS!!:

  • The included tester in project page
  • Compare returns with nginx
  • Do a stress test
  • Test with several programs (different languages are allowed)
  • acoudert tester

GIT BRANCHING