Reactive scaling

Machine learning is making predictive scaling more accurate, but you still have to deal with sudden traffic spikes that no forecast anticipates, and for those you depend on reactive scaling. Such unexpected traffic can be as much as 10 times the regular traffic. It usually results from a sudden surge in demand or, for example, from a first attempt at running a sales event, where you cannot be sure how much traffic will arrive.

Let's take an example where you are launching a flash deal on your e-commerce website. You will have a large amount of traffic on your home page, and from there, users will go to the page for the specific flash deal product. Some users will want to buy the product, so they will go on to the add-to-cart page.

In this scenario, each page will have a different traffic pattern, so you need to understand your existing architecture and its traffic patterns, along with an estimate of the expected traffic. You also need to understand the navigation path of the website. For example, if users have to log in to buy a product, the login page will also receive more traffic.
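A per-page estimate of this kind is simple arithmetic, and a minimal sketch may help make it concrete. The baseline rate, the spike multiplier, and the share of visitors reaching each page are all hypothetical numbers chosen for illustration; in practice they would come from your own analytics.

```python
# Hypothetical figures for illustration only; replace with measured values.
BASELINE_HOME_RPS = 200   # assumed regular home-page requests per second
SPIKE_MULTIPLIER = 10     # the "10 times the regular traffic" case

# Assumed percentage of home-page visitors who reach each page
# along the navigation path.
FUNNEL_PERCENT = [
    ("home", 100),
    ("flash_deal_product", 60),
    ("login", 30),
    ("add_to_cart", 20),
]


def estimate_page_rps(home_rps, funnel_percent):
    """Return the expected requests per second for each page."""
    return {page: home_rps * pct // 100 for page, pct in funnel_percent}


peak = estimate_page_rps(BASELINE_HOME_RPS * SPIKE_MULTIPLIER, FUNNEL_PERCENT)
for page, rps in peak.items():
    print(f"{page}: {rps} requests/second")
```

Even a rough estimate like this shows which pages need the most capacity and which ones, such as the login page, sit on the critical path to a purchase.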

To plan the scaling of your server resources, you need to answer the following questions about your traffic patterns:

  • Which web pages are read-only and can be cached?
  • Which user queries only read data, rather than writing or updating anything in the database?
  • Do users frequently request the same data, such as their own user profile?
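One way to start answering these questions is to summarize a sample of request logs. The following is a rough sketch, not a complete analysis: the log format, the paths, and the repeat threshold are assumptions made up for this example, and treating GET and HEAD as read-only relies on the application following HTTP conventions.

```python
from collections import Counter

# Hypothetical request log entries: (HTTP method, path, user id).
REQUESTS = [
    ("GET", "/", "u1"),
    ("GET", "/", "u2"),
    ("GET", "/deal/42", "u1"),
    ("GET", "/profile", "u1"),
    ("GET", "/profile", "u1"),
    ("POST", "/cart", "u1"),
    ("GET", "/profile", "u2"),
]

# Assumption: the application only uses these methods for reads.
READ_METHODS = {"GET", "HEAD"}


def summarize(requests, repeat_threshold=2):
    """Summarize read-only paths, read/write counts, and repeated reads."""
    read_paths = {p for m, p, _ in requests if m in READ_METHODS}
    write_paths = {p for m, p, _ in requests if m not in READ_METHODS}
    reads = sum(1 for m, _, _ in requests if m in READ_METHODS)
    # Count how often the same user reads the same path.
    per_user_path = Counter(
        (p, u) for m, p, u in requests if m in READ_METHODS
    )
    repeated = sorted(
        key for key, n in per_user_path.items() if n >= repeat_threshold
    )
    return {
        "read_only_paths": sorted(read_paths - write_paths),
        "reads": reads,
        "writes": len(requests) - reads,
        "repeated_reads": repeated,
    }


summary = summarize(REQUESTS)
print(summary)
```

Paths that never receive a write are candidates for page caching, a high ratio of reads to writes suggests read offloading, and repeated reads by the same user point to data worth caching per user.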

Once you understand these patterns, you can plan how to offload work from your architecture so that it can handle excessive traffic. To offload web-layer traffic, you can move static content, such as images and videos, from your web server to a content distribution network (CDN). You will learn more about the Cache Distribution pattern in Chapter 6, Solution Architecture Design Patterns.
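A CDN generally decides what it may cache from the HTTP Cache-Control header that your origin server sends. The sketch below shows one possible way to choose that header per request; the list of static file extensions, the one-day lifetime, and the conservative "no-store" default for everything else are assumptions for illustration, not recommendations for every site.

```python
from pathlib import PurePosixPath

# Assumed set of static file types that are safe to cache at the edge.
STATIC_EXTENSIONS = {".jpg", ".png", ".gif", ".mp4", ".css", ".js"}


def cache_control_for(path, static_max_age=86400):
    """Return a Cache-Control header value for the given URL path."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in STATIC_EXTENSIONS:
        # Shared caches, such as a CDN, may keep this for static_max_age seconds.
        return f"public, max-age={static_max_age}"
    # Dynamic pages such as the cart are never stored in this sketch.
    return "no-store"


print(cache_control_for("/images/banner.JPG"))
print(cache_control_for("/cart"))
```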

At the server fleet level, you need a load balancer to distribute traffic, and you need auto-scaling to increase or decrease the number of servers, which is how you apply horizontal scaling. To reduce the database load, use the right database for each need: a NoSQL database to store user sessions and review comments, and a relational database for transactions. In addition, apply caching to serve frequent queries.
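The core of a reactive scaling decision can be sketched as a proportional rule: compare the observed load per server with a target, scale the server count by that ratio, and clamp the result to fleet limits. This is a simplified illustration under stated assumptions, not the algorithm of any particular auto-scaling service; the function name, the metric, and the default limits are hypothetical.

```python
import math


def desired_server_count(current_servers, observed_load, target_load,
                         min_servers=2, max_servers=50):
    """Return how many servers are needed to bring the load back to target.

    observed_load and target_load are per-server averages of the same
    metric, for example requests per second or CPU utilization in percent.
    """
    if current_servers < 1:
        raise ValueError("current_servers must be at least 1")
    if target_load <= 0:
        raise ValueError("target_load must be positive")
    # Proportional rule: grow or shrink the fleet by the load ratio.
    raw = math.ceil(current_servers * observed_load / target_load)
    # Clamp so that a spike cannot exceed the budget and a lull
    # cannot shrink the fleet below a safe minimum.
    return max(min_servers, min(max_servers, raw))


# A 10x spike on a 4-server fleet sized for 60 requests/second per server.
print(desired_server_count(4, observed_load=600, target_load=60))
```

Real auto-scaling systems typically add cooldown periods and averaging windows around a rule like this, so that short-lived fluctuations do not cause the fleet to grow and shrink repeatedly.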

In this section, you learned about the scaling patterns and methods used to handle the scaling needs of your application, in the form of predictive scaling and reactive scaling. In Chapter 6, Solution Architecture Design Patterns, you will learn about the different types of design patterns in detail, and how to apply them to scale your architecture.