Request life cycle

What happens when you enter a URL in the browser

Concepts
- Browser
- Server
Procedure
- DNS lookup
- ARP process
  - IPv4
  - IPv6
- TCP three-way handshake
  - Flags
  - Values
  - Procedure
- TLS handshake
- HTTP
TODO:
References

Concepts

Browser

Cache
- The server can specify the cache policy using the Cache-Control header, but the browser may not behave as the developer expects, e.g. when the user clicks the back button, the browser may load everything from cache without sending any request ²
- Reportedly, Cache-Control: no-cache, max-age=0, must-revalidate, no-store solves the back button problem ³
- TODO: forward and backward cache
- TODO: push state, I think I have encountered the term when using AngularJS
Connection
- After an HTTP request finishes, the browser can keep the connection open and reuse it for later requests.
Port
- TODO: does the browser use a new local port when it opens a new connection to the server?
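A small experiment for the port question, as a sketch rather than a statement about any particular browser: a TCP connection is identified by (source IP, source port, destination IP, destination port), so two simultaneous connections from one host to the same server address have to differ in their local port. The helper name `local_ports_for_two_connections` is made up for this note, and a loopback listener stands in for the server.

```python
import socket

def local_ports_for_two_connections():
    """Open two TCP connections to one loopback listener; return both client-side ports."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
        server.listen(2)                   # the kernel completes handshakes in the backlog
        server_addr = server.getsockname()
        with socket.create_connection(server_addr) as first, \
             socket.create_connection(server_addr) as second:
            # getsockname() on the client socket returns (local_ip, local_port)
            return first.getsockname()[1], second.getsockname()[1]

port_a, port_b = local_ports_for_two_connections()
print(port_a, port_b, port_a != port_b)
```

A request sent over a reused keep-alive connection keeps the port of that connection; only a newly opened connection gets a new ephemeral port from the OS.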

Server

Cache
- The server can check the validator sent by the browser (e.g. the ETag ⁴ echoed back in the If-None-Match header) to determine whether the browser's cached copy is still valid; if so, it returns 304 with an empty body
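The ETag check can be sketched as follows. This is a simplified illustration, not how any specific server implements it: `make_etag` and `respond` are hypothetical names, the ETag is assumed to be a truncated SHA-256 of the body, and only a single exact value in If-None-Match is handled (real servers also accept lists, `*`, and weak validators).

```python
import hashlib

def make_etag(body: bytes) -> str:
    # Assumption: derive the ETag from the content; servers are free to choose other schemes.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(request_headers: dict, body: bytes):
    """Return (status, headers, payload) for a GET, honouring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""           # the cached copy is still valid, no body sent
    return 200, headers, body

# First request: nothing cached yet, so the full body comes back with an ETag.
status, headers, payload = respond({}, b"hello")
# Second request: the browser echoes the ETag it stored.
status2, _, payload2 = respond({"If-None-Match": headers["ETag"]}, b"hello")
print(status, status2)
```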
TODO: Ethernet address (also called a MAC address)

Procedure

DNS lookup

This part is mainly from ¹

If the user enters an IP address, no domain name lookup is needed

The browser checks its own DNS cache, e.g. chrome://net-internals/#dns
If not found, the browser calls the gethostbyname library function (varies by OS) to do the lookup
That function first looks in the hosts file, e.g. /etc/hosts
If still not found, it queries the DNS server, which is configured by DHCP by default but can be set manually by the user to avoid a certain level of DNS pollution ⁵, e.g. by using 8.8.8.8 ⁶
TODO: If the DNS server is on the same subnet, the network library follows the ARP process below for the DNS server
TODO: If the DNS server is on a different subnet, the network library follows the ARP process below for the default gateway IP

TLS and SSH key exchange differ: TLS authenticates the server's public key through a certificate signed by a trusted CA, while SSH normally trusts the server's host key directly

ARP process

The Address Resolution Protocol (ARP) is a telecommunication protocol used for resolution of Internet layer addresses into link layer addresses ⁹, e.g. from an IPv4 address to a physical address such as an Ethernet address (also called a MAC address)

IPv4

Check the ARP cache; if the target IP is found, the following steps are skipped
Check the route table to see whether the target IP address is on any of the subnets in the local route table. If it is, the library uses the interface associated with that subnet. If it is not, the library uses the interface that has the subnet of our default gateway. ¹
Look up the MAC address of the selected network interface ¹
Send a Layer 2 (data link layer of the OSI model) ARP request:

Sender MAC: interface:mac:address:here
Sender IP: interface.ip.goes.here
Target MAC: FF:FF:FF:FF:FF:FF (Broadcast)
Target IP: target.ip.goes.here

The router (or whichever host owns the target IP) gives an ARP reply:

Sender MAC: target:mac:address:here
Sender IP: target.ip.goes.here
Target MAC: interface:mac:address:here
Target IP: interface.ip.goes.here
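The request side can be sketched in Python to make the fields concrete. Assumptions in this sketch: the names `arp_target` and `build_arp_request` are invented for this note, the addresses are examples, the broadcast address goes in the Ethernet destination, and the target hardware address inside the ARP payload is left as zeros because it is not yet known.

```python
import ipaddress
import struct

BROADCAST_MAC = bytes.fromhex("ffffffffffff")

def arp_target(dst_ip: str, local_net: str, gateway_ip: str) -> str:
    """Pick which IP to ARP for: the destination itself if on-link, else the default gateway."""
    if ipaddress.ip_address(dst_ip) in ipaddress.ip_network(local_net):
        return dst_ip
    return gateway_ip

def build_arp_request(sender_mac: str, sender_ip: str, target_ip: str) -> bytes:
    """Build an Ethernet frame carrying an ARP request for target_ip."""
    sha = bytes.fromhex(sender_mac.replace(":", ""))
    spa = ipaddress.IPv4Address(sender_ip).packed
    tpa = ipaddress.IPv4Address(target_ip).packed
    # Ethernet header: destination (broadcast), source, EtherType 0x0806 (ARP)
    ethernet = BROADCAST_MAC + sha + struct.pack("!H", 0x0806)
    # ARP header: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # address lengths 6 and 4, operation 1 (request)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1) + sha + spa + bytes(6) + tpa
    return ethernet + arp

target = arp_target("8.8.8.8", "192.168.1.0/24", "192.168.1.1")
frame = build_arp_request("aa:bb:cc:dd:ee:ff", "192.168.1.20", target)
print(target, len(frame))
```

Actually sending such a frame needs a raw socket and elevated privileges, which is left out here.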

How does the router come up with the reply? Does it have to ask other routers if the target is outside its network?
ARP probe: tests whether an address is already used by someone else
- TODO: what if some bad guy always says it is in use?

IPv6

For IPv6, the functionality of ARP is provided by the Neighbor Discovery Protocol (NDP)

TCP three-way handshake

Flags

SYN: synchronize sequence numbers. Only the first packet sent from each end should have this flag set. Some other flags and fields change meaning based on this flag; some are only valid when it is set, and others when it is clear. ⁷
ACK: indicates that the Acknowledgment field is significant. All packets after the initial SYN packet sent by the client should have this flag set. ⁷
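For reference, the flags are single bits in the TCP header. A minimal sketch of decoding them, with `TCP_FLAGS` and `decode_flags` as names made up for this note (only the six classic flags are listed):

```python
# Bit positions of the classic TCP flags within the flags byte.
TCP_FLAGS = {
    "FIN": 0x01,
    "SYN": 0x02,
    "RST": 0x04,
    "PSH": 0x08,
    "ACK": 0x10,
    "URG": 0x20,
}

def decode_flags(flags_byte: int) -> list:
    """Return the names of the flags that are set in flags_byte."""
    return [name for name, bit in TCP_FLAGS.items() if flags_byte & bit]

print(decode_flags(0x02))  # first packet from the client
print(decode_flags(0x12))  # reply from the server
print(decode_flags(0x10))  # every later packet carries ACK
```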

Values

Sequence number
- ISN: if the SYN flag is set (1), this is the initial sequence number
- SEQ: if the SYN flag is clear (0), this is the accumulated sequence number of the first data byte of this segment for the current session
Acknowledgment number: if the ACK flag is set, the value of this field is the next sequence number that the sender is expecting. This acknowledges receipt of all prior bytes (if any). The first ACK sent by each end acknowledges the other end's initial sequence number itself, but no data.

Procedure

SYN: Client
- the client chooses an ISN, A
- sets the SYN bit to indicate it is setting the ISN
- sends the packet to the server
SYN-ACK: Server
- the server chooses an ISN, B
- sets the SYN bit to indicate it is setting the ISN
- sets the ACK bit to indicate it is acknowledging receipt of the first packet
- sets the acknowledgment number to A+1
- sends the packet to the client
ACK: Client
- sets SEQ to A+1
- sets the ACK bit
- sets the acknowledgment number to B+1
- sends the packet to the server
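The arithmetic of the three steps above can be checked with a small simulation. It only models the sequence and acknowledgment numbers, not real packets; `three_way_handshake` is a name invented for this note, and the modulo reflects that both fields are 32 bits wide.

```python
SEQ_MOD = 2 ** 32  # sequence and acknowledgment numbers are 32-bit and wrap around

def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments of the handshake as dictionaries."""
    syn = {"from": "client", "flags": {"SYN"},
           "seq": client_isn, "ack": None}
    syn_ack = {"from": "server", "flags": {"SYN", "ACK"},
               "seq": server_isn, "ack": (client_isn + 1) % SEQ_MOD}
    ack = {"from": "client", "flags": {"ACK"},
           "seq": (client_isn + 1) % SEQ_MOD, "ack": (server_isn + 1) % SEQ_MOD}
    return [syn, syn_ack, ack]

for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment["from"], sorted(segment["flags"]), segment["seq"], segment["ack"])
```

With A = 1000 and B = 5000 the server acknowledges 1001, and the client then sends SEQ 1001 and acknowledges 5001.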

TLS handshake

Transport Layer Security (TLS) and its predecessor, Secure Sockets Layer (SSL), both frequently referred to as "SSL", are cryptographic protocols that provide communications security over a computer network. ⁸

The following is a mix of ¹ and ⁸

The client asks the server to use TLS (e.g. by connecting to the HTTPS port 443)
The client provides the server with a list of cipher suites it supports
The server picks a cipher suite it also supports and sends it back with its certificate, which contains its public key and is signed by a CA (Certificate Authority)
The client verifies the digital certificate against its list of trusted CAs before proceeding
Both sides generate the symmetric key used to encrypt the following communication, either
- using Diffie-Hellman as SSH does, so that if the server's private key is disclosed in the future, it cannot be used to decrypt the current session (recorded by a third party), or
- the client uses the server's public key to encrypt a random number and sends it to the server
- the server decrypts it using its private key to get the random number and generates its own copy of the symmetric key
- the client generates a hash of its own copy of the symmetric key and sends a Finished message with that hash to the server, encrypted using the symmetric key
- the server generates a hash of its own copy of the symmetric key and compares it with the decrypted client-sent hash. If they match, the server sends a Finished message to the client, encrypted using the symmetric key
The following communication is encrypted using the symmetric key
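The Diffie-Hellman option and the final hash comparison can be illustrated with a toy example. This is not real TLS: the prime 23 and generator 5 are far too small to be secure, the key derivation is a bare SHA-256, and `dh_keypair`, `derive_key`, and `finished_hash` are hypothetical helpers. Real handshakes use large groups or elliptic curves and hash the handshake messages.

```python
import hashlib
import hmac
import secrets

P = 23  # toy prime modulus, insecure, for illustration only
G = 5   # toy generator

def dh_keypair():
    """Pick a random private exponent and compute the matching public value."""
    private = secrets.randbelow(P - 2) + 1    # in the range 1..P-2
    public = pow(G, private, P)
    return private, public

def derive_key(own_private: int, peer_public: int) -> bytes:
    """Both sides compute the same shared secret and hash it into a symmetric key."""
    shared = pow(peer_public, own_private, P)
    return hashlib.sha256(shared.to_bytes(2, "big")).digest()

def finished_hash(key: bytes) -> bytes:
    # Stand-in for the hash carried in the Finished message.
    return hashlib.sha256(b"finished" + key).digest()

client_private, client_public = dh_keypair()
server_private, server_public = dh_keypair()
# Only the public values cross the network; the private values never leave each side.
client_key = derive_key(client_private, server_public)
server_key = derive_key(server_private, client_public)
handshake_ok = hmac.compare_digest(finished_hash(client_key), finished_hash(server_key))
print(handshake_ok)
```

Because the private exponents are generated per session and then discarded, a later leak of the server's long-term key does not reveal this session's symmetric key, which is the property the Diffie-Hellman bullet describes.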

More detail can be found in the Wikipedia article ⁸
If the CA is compromised, then using the server's public key to generate the symmetric key is not safe.

HTTP

TODO:

HTTP
HTTP2, how does the browser know the server supports HTTP2
Encapsulation and Decapsulation
The Stack Overflow one is just way too simple http://stackoverflow.com/questions/2092527/what-happens-when-you-type-in-a-url-in-browser
SSL != TLS https://en.wikipedia.org/wiki/Public-key_cryptography#Weaknesses
When I use arp I only find two entries, me and the router. But other people do have multiple entries.

References

¹. https://github.com/alex/what-happens-when
². https://madhatted.com/2013/6/16/you-do-not-understand-browser-history
³. http://blog.55minutes.com/2011/10/how-to-defeat-the-browser-back-button-cache/
⁴. https://en.wikipedia.org/wiki/HTTP_ETag
⁵. https://en.wikipedia.org/wiki/DNS_spoofing
⁶. https://developers.google.com/speed/public-dns/
⁷. https://en.wikipedia.org/wiki/Transmission_Control_Protocol
⁸. https://en.wikipedia.org/wiki/Transport_Layer_Security
⁹. https://en.wikipedia.org/wiki/Address_Resolution_Protocol
