Tracking Users: Cookies, Sessions, and Tokens
Learning Objectives
- You know of the three main ways of tracking users: cookies, sessions, and tokens.
- You know of regulations regarding tracking and privacy.
Tracking users
For authentication and authorization to work in practice, there is a need for a mechanism for keeping track of the user.
HTTP has been designed as a stateless protocol to allow scalability. This means that every request made to the server is independent of other requests. If the same client makes two requests to the server, nothing in the requests themselves allows the server to determine whether they were made by the same client. For example, the two requests below are identical, whether they come from one client or from two.
GET / HTTP/1.1
Host: myserver.net
GET / HTTP/1.1
Host: myserver.net
To solve this, mechanisms have been created for keeping track of users across requests. These mechanisms allow the server to identify the client making the request and thus, for example, to determine whether the client is logged in.
Currently, there are three main ways to do this: using cookies, using sessions, and using token-based authentication. We will look into each of these mechanisms next.
Cookies
Cookies, initially proposed in the mid-1990s, are a mechanism for storing small amounts of data on the client, which the client then sends to the server on every request. This way, on every request, the server can check whether the request contains a cookie and, if it does, read the cookie’s value.
Cookies are implemented using HTTP headers. When the server wants the client to store a cookie, it adds a Set-Cookie header to its response. The following Set-Cookie header, for example, creates a cookie with the name visits and the value 0.
Set-Cookie: visits=0
Now, when a client (typically a browser) receives a response that contains the cookie, it automatically stores the cookie. Then, on every request to the same application, the browser adds the cookie to the request in a header called Cookie.
Cookie: visits=0
The basic flow of using cookies is shown in Figure 1 below.
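The exchange above can be sketched in Python using the standard library's http.cookies module. This is a minimal illustration, not a full server: the function name next_visits_cookie and the counting logic are assumptions made for this example.

```python
from http.cookies import SimpleCookie
from typing import Optional


def next_visits_cookie(cookie_header: Optional[str]) -> str:
    """Return the value of the Set-Cookie header for the next response.

    `cookie_header` is the raw value of the request's Cookie header,
    or None if the client sent no cookies (hypothetical helper).
    """
    jar = SimpleCookie()
    if cookie_header:
        jar.load(cookie_header)

    visits = 0  # first visit, or an unusable cookie value
    if "visits" in jar:
        try:
            # The value comes from the client, so treat it as untrusted input.
            visits = int(jar["visits"].value) + 1
        except ValueError:
            visits = 0

    response = SimpleCookie()
    response["visits"] = str(visits)
    # OutputString() yields only the part that follows "Set-Cookie: ".
    return response["visits"].OutputString()


print("Set-Cookie:", next_visits_cookie(None))
print("Set-Cookie:", next_visits_cookie("visits=0"))
```

The first call models a client without a cookie, the second a client that sends back the cookie it received earlier.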
Additional attributes can be set for a cookie. These include the maximum age of the cookie in seconds (Max-Age), a path or a part of a path where the cookie is valid (Path), the domain or part of a domain where the cookie is valid (Domain), a version of the cookie (Version, which comes from older cookie specifications and is ignored by current browsers), whether the cookie should be sent only over a secure connection, i.e. only over HTTPS (Secure), and whether the cookie should be inaccessible to JavaScript in the browser (HttpOnly).
The attributes are added to the Set-Cookie header after the key-value pair and are separated with semicolons.
For example, the below Set-Cookie header would ask the client to store a cookie with the name name and value anonymous. The cookie is valid for 3600 seconds, and is used for the domain aalto.fi.
Set-Cookie: name=anonymous; Max-Age=3600; Domain=aalto.fi
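A header like the one above does not need to be assembled by hand. The sketch below builds one with Python's http.cookies module. The function name build_name_cookie and the extra Path, Secure, and HttpOnly attributes are assumptions added for illustration.

```python
from http.cookies import SimpleCookie


def build_name_cookie() -> str:
    """Build the value of a Set-Cookie header with several attributes."""
    cookie = SimpleCookie()
    cookie["name"] = "anonymous"
    cookie["name"]["max-age"] = 3600      # valid for one hour
    cookie["name"]["domain"] = "aalto.fi"
    cookie["name"]["path"] = "/"          # assumption: valid for every path
    cookie["name"]["secure"] = True       # send only over HTTPS
    cookie["name"]["httponly"] = True     # hide from JavaScript in the browser
    # Attributes are joined with semicolons after the key-value pair.
    return cookie["name"].OutputString()


print("Set-Cookie:", build_name_cookie())
```

The order in which the library emits the attributes may differ from the order in which they were set; the order carries no meaning.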
Cookies are stored within a register in the browser, from where they are retrieved whenever a request is made. For cookies that have an expiration set (for example with Max-Age), the register is stored in a file, which means that they persist even if the browser or the computer is restarted. Cookies without an expiration are typically discarded when the browser is closed.
Cookies are sent as plain text and can be read and modified by anyone who has access to the client. This means that values from cookies must not be trusted. For example, the server should not identify a user solely by a plain-text user identifier in a cookie, as the cookie could be modified to impersonate another user.
Sessions
Sessions are a mechanism that builds on cookies. When using sessions, the value of the cookie is a long random string that is passed back and forth between the server and the client. On the server, the cookie is resolved to an object stored on the server, which contains data related to the particular cookie and client.
Using a long random string as the value of a cookie makes it difficult to guess the value of a cookie to impersonate another user.
In essence, sessions are a mechanism that allows storing information about the client on the server. The client is still identified with a cookie, but the cookie contains a hard-to-guess reference to the data stored on the server. Such a cookie can be used to track the user across requests, while keeping sensitive data stored on the server.
Sessions are a way to store data on the server, while using cookies to identify the client.
The flow of using sessions is shown in Figure 2 below.
One of the downsides of sessions is that the server needs to store the session data. If the application has multiple servers, the user always needs to be directed to the same server, as the session data is stored on that server. Alternatively, the session data needs to be shared between the servers, for example by using a database.
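The session mechanism described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the in-memory dictionary SESSIONS, the cookie name sid, and the function names are hypothetical, and a real application would normally rely on its web framework's session support.

```python
import secrets
from http.cookies import SimpleCookie
from typing import Dict, Optional

# Hypothetical in-memory store: session ID -> data kept on the server.
# With several servers, this would have to live in a shared database.
SESSIONS: Dict[str, dict] = {}


def create_session(data: dict) -> str:
    """Store data on the server and return a hard-to-guess session ID."""
    session_id = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text
    SESSIONS[session_id] = dict(data)
    return session_id


def load_session(cookie_header: Optional[str]) -> Optional[dict]:
    """Resolve the session cookie of a request to the stored data."""
    jar = SimpleCookie()
    if cookie_header:
        jar.load(cookie_header)
    morsel = jar.get("sid")
    if morsel is None:
        return None
    # An unknown or guessed ID resolves to nothing.
    return SESSIONS.get(morsel.value)


sid = create_session({"user": "alice"})
print("Set-Cookie: sid=" + sid + "; HttpOnly")
print(load_session("sid=" + sid))
```

Only the random identifier travels to the client; the data itself stays in SESSIONS on the server.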
Token-based authentication
Users can also be identified using a token. Tokens are created by the server and sent to the client, which then includes the token in subsequent requests. The server can then use the token to authenticate and identify the user.
Sessions and cookies also rely on tokens (such as session IDs), but these are stateful — the server stores data associated with the token. In contrast, token-based authentication is often stateless, since all information needed to verify the user is contained within the token itself.
Tokens are typically passed in an Authorization header in the request. The value of the header usually has the form Bearer <token>, where <token> is the actual token. This means that, on the client, the token needs to be explicitly added to the request.
The Bearer prefix is added to indicate the authentication scheme being used. The Bearer scheme indicates that the token is a bearer token, meaning that whoever possesses the token can use it to access the associated resources.
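As a small sketch of how the header could be handled, the first function below builds the header on the client and the second extracts the token on the server. Both function names are assumptions made for this example.

```python
from typing import Dict, Optional


def auth_headers(token: str) -> Dict[str, str]:
    """Client side: the token has to be added to the request explicitly."""
    return {"Authorization": "Bearer " + token}


def extract_bearer_token(authorization: Optional[str]) -> Optional[str]:
    """Server side: return the token, or None if the header is unusable."""
    if not authorization:
        return None
    scheme, _, token = authorization.partition(" ")
    token = token.strip()
    # The scheme name is compared case-insensitively.
    if scheme.lower() != "bearer" or not token:
        return None
    return token


headers = auth_headers("abc.def")
print(headers)
print(extract_bearer_token(headers["Authorization"]))
```

Extracting the token is only the first step; the server still has to verify it before trusting it.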
Tokens can be stored on the client either in cookies or in the browser’s localStorage. Unlike with sessions, in the stateless case the server does not keep any data related to the token itself. Instead, the token (for example, a JWT) contains all the information required to identify and verify the user, while the server holds a secret key used to sign the token and to verify its integrity.
Privacy and tracking regulation
When discussing tracking users, it is also important to mention privacy regulations.
The complete guide to General Data Protection Regulation (GDPR) compliance provides materials for individuals and businesses on the legislation regarding data and privacy. Regarding tracking, the ePrivacy directive notes that consent must be asked for and received before using any cookies or similar tracking mechanisms, except for strictly necessary ones. In addition, users should be given information on what the tracking is used for and should be provided the means to withdraw consent.
The regulations differ between countries. In the EU, tracking is regulated by EU law (GDPR and ePrivacy). In the US, there are no comprehensive country-wide privacy laws, but several state and sectoral laws apply. For example, California has its own data protection regulation, the California Consumer Privacy Act (CCPA, as amended by the CPRA). The CCPA grants users the right to know what information is collected, the right to delete most of it, the right to opt out of data sharing, the right to correct information, and the right to limit the use of sensitive data.
In practice, websites that use tracking mechanisms inform users about the collected data and its purpose, typically through a consent banner or pop-up when visiting a site for the first time.
Summary
In summary:
- HTTP is a stateless protocol, meaning each request is independent. Tracking mechanisms are needed to identify users across requests.
- Cookies store small amounts of data on the client that are automatically sent with each request. They are simple but can be read and modified by the client.
- Sessions use cookies to store a random identifier that references data stored on the server, providing better security for sensitive information.
- Token-based authentication uses cryptographically signed tokens (like JWTs) that contain user information and can be verified without server-side storage.
- Privacy regulations like GDPR (EU) require informing users about tracking and obtaining consent for non-essential cookies.