In this article, I’m going to discuss a novel way to distribute the network load of a highly distributed, web-based application service.
The idea is to replace the load balancer with client-side load balancing implemented in a service worker. I was struck by this idea, so I decided to gather my thoughts in this article. The proposed architecture would suit a distributed system with significant network load that the existing load balancer is unable to handle, for example due to hardware limitations.
The neat mechanism is the service worker’s ability to route requests directly to the desired application instance.
Before we dive in, please note that the architecture decisions and ideas discussed here are my own thoughts and opinions. If you are going to apply them to a new or existing distributed system, please consider all angles, such as:
- Whether the load balancer is actually causing a network bottleneck
- Whether maintaining the load balancer is too expensive given the volume of network traffic
- How complex it is to replace the load balancer
Why a load balancer?
First, let’s clarify what a load balancer is and how it is used.
Imagine a simple web based server-client architecture.
Server-client architecture
Here we have one server that receives all the requests, processes them, and sends the responses back to the clients. Processing each request costs CPU time. As the number of clients grows and they send multiple CPU-heavy requests, the application server slows down.
When we realize the app server cannot process all the requests in a timely manner because it is overloaded, the natural move is to add more app servers, which is called horizontal scaling. With more app servers we can distribute the requests, and hence the CPU load, among them. This introduces another element to route the requests: the load balancer. Put simply, a load balancer is a proxy that knows about its app servers.
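To make this concrete, here is a minimal sketch of the round-robin strategy a typical load balancer applies; the server addresses here are hypothetical placeholders:

```ts
// A minimal round-robin selector, the core idea behind many load balancers.
// These app server addresses are hypothetical.
const appServers = [
  "http://10.0.0.1:8080",
  "http://10.0.0.2:8080",
  "http://10.0.0.3:8080",
];

let next = 0;

// Each incoming request is forwarded to the next server in the cycle,
// spreading CPU load evenly across the instances.
function pickServer(): string {
  const server = appServers[next];
  next = (next + 1) % appServers.length;
  return server;
}
```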
Server-client with load balancer
As seen in the previous diagram, the load balancer decides which app server each request is sent to.
When network and CPU load is high
Let’s assume there are millions of clients that the application service needs to serve. As a result, the number of requests, the CPU load to process them, and the network bandwidth used to carry them have all increased dramatically.
For this scenario, it turns out we would need at least 100 app server instances to distribute the CPU load so that the requests can be processed in a timely manner. The problem now, however, is that all the network traffic flows through the load balancer, and we notice that the load balancer has become a network bottleneck.
What can we do?
What can we do to solve the network bottleneck at the load balancer? My proposal is to use a service worker to distribute the network requests. In other words, the clients are made smart: each client is configured to determine which app server to send its requests to.
In this proposed architecture there is a new server that is responsible for sending application server information to clients on request. We will call this the ‘server manager’.
Here is the process in steps (a sketch of the service worker side follows the list):
- Client hits the website for the first time
- The initial request is sent to the server manager, which sends back the application boot (a simple web app that will initialize the client state), including the service worker
- The service worker gets installed
- Once the service worker is installed, it requests the app server details from the server manager
- The response from the server manager includes the details the client needs to send requests to the app servers directly
- The service worker refreshes the page
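As a rough sketch of these steps from the service worker’s side, assuming the server manager exposes a hypothetical /app-servers endpoint that returns a JSON array of app server origins:

```ts
// sw.ts - installed by the application boot via
// navigator.serviceWorker.register("/sw.js").
declare const self: ServiceWorkerGlobalScope;

let appServers: string[] = [];

self.addEventListener("install", (event) => {
  // Step: request the app server details from the server manager.
  event.waitUntil(
    fetch("/app-servers")
      .then((res) => res.json())
      .then((servers: string[]) => {
        appServers = servers;
        return self.skipWaiting(); // activate without waiting
      })
  );
});

self.addEventListener("activate", (event) => {
  // Step: take control of the open pages and refresh them so that
  // their requests go through the worker from now on.
  event.waitUntil(
    self.clients.claim().then(async () => {
      const clients = await self.clients.matchAll({ type: "window" });
      for (const client of clients) {
        (client as WindowClient).navigate(client.url);
      }
    })
  );
});
```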
The service worker will now intercept the requests from the client and direct them to a selected app server. How the application instance is selected depends on your system architecture. For example, some applications benefit from knowing which app server instances are closest to the client, so that response latency stays low.
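Continuing the sw.ts sketch, the interception could look like the following. The random pick, the /api/ path prefix, and the GET-only restriction are illustrative assumptions; a geography- or latency-aware selection strategy could be substituted:

```ts
self.addEventListener("fetch", (event) => {
  const url = new URL(event.request.url);

  // Only reroute API calls aimed at our own origin; navigations and
  // static assets still load normally. The /api/ prefix is an assumption,
  // and the sketch handles GET requests only to keep it simple.
  if (url.origin !== self.location.origin) return;
  if (!url.pathname.startsWith("/api/")) return;
  if (event.request.method !== "GET") return;
  if (appServers.length === 0) return;

  // Selection strategy: a simple random pick for this sketch.
  const target = appServers[Math.floor(Math.random() * appServers.length)];

  // Rebuild the request against the chosen app server. Note the app
  // servers must allow cross-origin (CORS) requests from the site.
  event.respondWith(
    fetch(target + url.pathname + url.search, {
      headers: event.request.headers,
    })
  );
});
```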
With this approach we eliminate the network bottleneck at the load balancer, as the clients now talk directly to the application instances via the service worker.
The service worker script should also cache the app server details sent by the server manager and only request fresh details when necessary. For example, the service worker can fetch the details periodically so that it knows which app servers are active at any given moment.
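An in-memory variable like appServers from the sketch above is lost whenever the browser stops the idle worker, so a more robust sketch persists the list with the Cache API and refreshes it when stale. The cache name, endpoint, and one-minute TTL are illustrative choices:

```ts
const LIST_CACHE = "app-server-list";
const LIST_TTL_MS = 60_000; // refresh the list roughly every minute
let lastFetched = 0;

async function getAppServers(): Promise<string[]> {
  const cache = await caches.open(LIST_CACHE);

  // Ask the server manager for a fresh list when the copy is stale.
  if (Date.now() - lastFetched > LIST_TTL_MS) {
    try {
      const fresh = await fetch("/app-servers");
      await cache.put("/app-servers", fresh.clone());
      lastFetched = Date.now();
      return fresh.json();
    } catch {
      // Network failure: fall back to the cached copy below.
    }
  }

  const cached = await cache.match("/app-servers");
  return cached ? cached.json() : [];
}
```

The fetch handler would then call getAppServers() instead of reading the in-memory variable directly.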
Another interesting property of the suggested architecture is that we can now be independent of any single cloud provider for certain decisions. For example, we could run some app server instances in AWS and some in Google Cloud if that made sense for your system architecture.
Wrapping up…
In conclusion, I hope you have got a sense of my proposed architecture and are at least inspired by it, or provoked into some interesting thoughts. Service workers are a powerful tool for web applications, so be sure to check out their capabilities.