Scaling architecture strategy: Comparing vertical vs. horizontal vs. serverless
Hello everyone! In this post, I want to share a comparison of three scaling strategies: vertical and horizontal scaling on a traditional architecture, and a serverless architecture.
We will go through the following:
- How scaling is implemented in different architectural approaches
- Benefits and downsides of each architecture
- Analysis of implementation costs
- The operational complexity of each
- Which approach fits best in different environments
The application
Design overview
As a starting point, I’ll use an e-commerce application, since I have already analyzed different architectural approaches for it. You can read it on Medium (Monolithic vs. Microservices vs. AWS Serverless Architecture for e-commerce).
Let’s define a client’s user story just as a reminder of the core features for the app:
Our e-commerce application will allow users to register and log in to the application. Each user will have their own personal data, with the most important being the email and the delivery address. They can view a list of available products, add them to shopping carts, store those carts, and also buy the cart. The application must have the business logic for validating stock, user sessions, and storing the orders.
High-level architecture
What are the minimum elements we need to consider for this application?
- A DNS server: To direct the browser to our site when the user writes the URL.
- A web server: For the front end or presentation tier. We will need a user interface for our customers.
- An application server: To handle the business logic tier.
- A database: Our data layer for storing and retrieving the data related to products, stocks, etc.
A very simplistic diagram gives you a general idea.
But wait. You may be thinking, “This is a generic three-tier architecture, just like the one you showed previously in your post Conceptual architecture design for a web application.”
Yes, you are right. But now we will make concrete decisions for a specific use case: Our e-commerce application.
Just sit back and keep reading!
Scaling considerations
For our purposes of comparing different scalable architectures, we need to justify the analysis. Is there anything better than a user story to describe the context? I don’t think so.
Our client would request something like this:
Our company is putting all-in for this project, so we are working with an agency for all the advertisement on social media. They promise us a dramatic increase in registered users in the first 3 months from site launch. Then, there will be new users, but in a paced manner. Also, we hope our sales will grow accordingly with user usage. As we add a significant amount of new products to our store, visits and sales will also increase.
From this user story, we can conclude several things:
- Our client is very optimistic. ;-)
- We will have to manage a burst of new users at the beginning.
- There will be an increase in added products.
- This will generate more traffic and more sales.
With these elements in mind, we can improve the architecture.
First approach — Vertical scaling
With vertical scaling, we need to size every component for the worst case. Let’s analyze each component in terms of its CPU, memory, and storage requirements.
- We can assume there will be a large number of DNS requests. The browser DNS cache will help us a bit; also, DNS requests are small, and the protocol (UDP) is “lighter.” This server will mostly need plenty of memory.
- Probably, we will need a huge web server for managing a large number of HTTP requests, along with static content requests related to product metadata such as images and specification documents. Therefore, we must consider high memory, high CPU, and enough storage space.
- A massive application server would fit the application’s needs, as it will be responsible for user registration, authentication, and handling the business logic related to product lists, validations, and other functions. This server will mostly need a lot of CPU and memory.
- Finally, we will need to consider an enormous database server. Why? Because in an e-commerce application, we need to manage transactions; we will constantly be running queries, possibly complex ones (we can assume a relational database), and constantly performing CRUD (Create, Read, Update, and Delete) operations. As the amount of stored data grows, the server needs more resources for searching and storing it.
An overview of the architecture will be as follows.
Yes, I know there could be some adjustments we could make. We should consider caching, read-replicas, in-memory databases, etc.
But in my humble opinion, those elements should be considered in a second approach and not as part of the initial architectural design.
I mean, we are not patching problems here; we want an architecture that is optimized from the start.
But don’t get me wrong, I am not saying that caching is a bad practice. In this specific case, though, it would probably be premature, as there are plenty of better architectures to consider before reaching for a cache.
Let’s analyze this approach: its advantages, the challenges it could face, and when it is recommended.
Advantages
- Minimal operational complexity. It will be easy to manage the environment.
Challenges
- The cost is constant whether one user or thousands are using the app.
- If the resources are not enough to handle an increase in platform usage, we will need to increase the size of our servers.
- Single point of failure on each layer.
- We will need to make many assumptions about the worst-case scenario, considering traffic patterns, data transfer, etc.
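To put the constant-cost point into numbers, here is a minimal sketch in Python. The server price, the capacity, and the request counts are hypothetical values I made up for illustration; only the arithmetic matters.

```python
def cost_per_request(monthly_server_cost: float, monthly_requests: int) -> float:
    """Spread a fixed monthly cost over the requests actually served."""
    if monthly_requests <= 0:
        raise ValueError("monthly_requests must be positive")
    return monthly_server_cost / monthly_requests


def utilization(average_load: float, provisioned_capacity: float) -> float:
    """Fraction of the worst-case capacity that is really being used."""
    if provisioned_capacity <= 0:
        raise ValueError("provisioned_capacity must be positive")
    return average_load / provisioned_capacity


# Hypothetical figures: a server sized for 1,000 requests/s costing 2,000 per month.
FIXED_MONTHLY_COST = 2_000.0

print(cost_per_request(FIXED_MONTHLY_COST, 1_000))       # a very quiet month
print(cost_per_request(FIXED_MONTHLY_COST, 10_000_000))  # a busy month
print(utilization(average_load=50, provisioned_capacity=1_000))
```

The bill is the same in both months; what changes is how much each request ends up costing and how much of the provisioned capacity sits idle.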
Use cases
This architecture is probably not suitable for most cases.
However, if for some reason we are extremely sure about continuous usage of the app, it would fit.
Also, we can consider it for fast development or for a very small team without cloud architects, needing to set up production as fast as possible without any overhead.
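For our specific use case, and according to the client’s requirements, it will not be the smartest solution, whether the project succeeds or fails. If it succeeds, the fixed servers may fall short during the initial burst; if it fails, we keep paying for capacity nobody uses.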
Horizontal Scaling
When considering horizontal scaling for this type of architecture, we need to take into account several key concepts:
- Application Load Balancer (ALB): We need a specific element responsible for distributing traffic to different services or servers. The ALB serves this purpose effectively.
- Independent Servers: Each server should be able to operate independently, allowing the system to parallelize requests without losing data integrity or causing interference between requests. The communication between the ALB and the servers is crucial in this setup.
- Auto-scaling Groups: The number of servers needed at each layer or architectural component must track the application’s load. Auto-scaling groups manage the number of available servers and need to be configured to scale out or scale in based on defined parameters, such as the minimum and maximum number of servers and the acceptable range for metrics like CPU or memory utilization, among others.
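The scaling decision itself can be sketched in a few lines of Python. This is a simplified proportional rule of my own, not the exact algorithm of any cloud provider; the function name, the metric, and the limits are assumptions for illustration.

```python
import math


def desired_servers(current: int, metric: float, target: float,
                    minimum: int, maximum: int) -> int:
    """Hypothetical proportional scaling rule, clamped to the group limits.

    `metric` is the observed average (for example CPU %), `target` is the
    value we want to hold. Above the target we scale out, below it we scale in.
    """
    if current < 1 or target <= 0:
        raise ValueError("current must be >= 1 and target must be > 0")
    wanted = math.ceil(current * metric / target)
    return max(minimum, min(maximum, wanted))


# Four servers averaging 90% CPU with a 60% target: scale out.
print(desired_servers(current=4, metric=90.0, target=60.0, minimum=2, maximum=10))
# Four servers averaging 15% CPU: scale in, but never below the minimum.
print(desired_servers(current=4, metric=15.0, target=60.0, minimum=2, maximum=10))
```

The minimum keeps a base of servers always running, and the maximum protects the budget if traffic (or a bug) pushes the metric far above the target.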
An overview of a specific layer of our architecture will look like the following diagram.
What about other layers and application components?
- DNS Server: The DNS server needs to scale as well. For the sake of simplicity, we will represent it as a simple box. It’s crucial, however, to ensure that it can manage a large volume of DNS requests efficiently.
- Web and Logic Tiers: Both the web and logic tiers can scale following the same horizontal scaling pattern we’ve discussed. This ensures load distribution and redundancy.
- Data Tier: The data tier requires a different approach for optimal scaling. We will use database Read Replicas. These allow us to generate copies of our data on different, independent servers, enabling horizontal scaling. However, these replicas are available only for reading purposes, which aligns with their name. This approach is relevant as it ensures data integrity by having a single point for writing data and one or multiple replicas for reading. This makes sense for an e-commerce application, where reading actions typically exceed writing actions. We need to take into consideration that our application tier will have to manage different endpoints for read and write.
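Handling separate read and write endpoints in the application tier could look roughly like this in Python. The class, the hostnames, and the rule "SELECT goes to a replica" are assumptions for illustration; a real application would usually rely on its database driver or ORM for this.

```python
import itertools


class DatabaseRouter:
    """Sends writes to the primary and spreads reads across the replicas."""

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        # With no replicas configured, reads fall back to the primary.
        self._readers = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql: str) -> str:
        """Pick an endpoint from the first keyword of the statement."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary


# Hypothetical hostnames.
router = DatabaseRouter(
    primary="db-primary.example.internal",
    replicas=["db-replica-1.example.internal", "db-replica-2.example.internal"],
)
print(router.endpoint_for("SELECT * FROM products"))
print(router.endpoint_for("INSERT INTO orders (user_id) VALUES (1)"))
```

Since replicas can lag slightly behind the primary, reads that must see the latest write (for example, the stock check at checkout) may still need to go to the primary.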
Our diagram will look as follows.
Advantages
- The e-commerce application will adapt to usage patterns, leading to better performance and cost optimization.
- Single points of failure are removed in most cases, increasing reliability and uptime.
Challenges
- The database will still need to be scaled vertically to handle write throughput, since read replicas only offload reads.
- This approach requires more management than vertical scaling, necessitating a robust architecture.
- It’s important to consider that the base architecture needs to always be available, which translates to a minimum ongoing cost.
- The scaling process is not immediate. Therefore, there could be issues during traffic bursts, and downsizing might leave some costly resources unused. This necessitates careful tuning to implement the best approach based on traffic needs and analyzed patterns.
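To illustrate why the start-up delay matters during a burst, here is a toy simulation in Python. The traffic figures, the per-server capacity, and the delay are invented numbers, and the model only scales out (it never scales in), so take it as a sketch of the idea and not as a model of a real auto-scaling group.

```python
def simulate(traffic: list[int], per_server: int, start_servers: int,
             delay: int) -> list[int]:
    """Return the requests left unserved at each time step.

    Extra servers are requested as soon as load exceeds capacity, but they
    only join the pool `delay` steps later.
    """
    servers = start_servers
    pending: list[tuple[int, int]] = []  # (step when ready, number of servers)
    unserved = []
    for step, load in enumerate(traffic):
        # Servers whose start-up time has elapsed become available.
        servers += sum(n for ready, n in pending if ready <= step)
        pending = [(ready, n) for ready, n in pending if ready > step]

        unserved.append(max(0, load - servers * per_server))

        # Request enough extra servers to cover the current load.
        in_flight = sum(n for _, n in pending)
        needed = -(-load // per_server)  # ceiling division
        missing = needed - servers - in_flight
        if missing > 0:
            pending.append((step + delay, missing))
    return unserved


# Hypothetical burst: traffic jumps from 100 to 500 requests per step.
burst = [100, 100, 500, 500, 500, 500]
print(simulate(burst, per_server=100, start_servers=1, delay=2))
print(simulate(burst, per_server=100, start_servers=5, delay=2))
```

Starting with more servers removes the gap, but those servers are paid for even when traffic is low, which is exactly the tuning trade-off described above.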
Use Cases
- Compared to vertical scaling, horizontal scaling is likely the best approach for almost all scenarios.
- It will fit the client’s requirements for the e-commerce application, as it adapts to varying traffic levels.
Serverless approach
The serverless architecture can be seen in the previous post I mentioned, Monolithic vs. Microservices vs. AWS Serverless Architecture for e-commerce. The implementation is the same as shown there; the only difference is that the focus of that post was the microservice implementation of the solution.
In this post, we will analyze the scaling behavior and compare it to other scaling approaches.
As a reminder, this is the diagram from the mentioned post.
Things to Note
- Amazon Route 53 scales automatically for managing DNS server requests.
- Amazon CloudFront will help us with HTTP requests, connected to Amazon S3, where our static HTML page will be stored.
- Amazon Cognito will help us with user pools without having to manage servers or scaling.
- Amazon API Gateway will manage request traffic to AWS Lambda functions that will perform the business logic.
- The database read and write processes and scaling will be automatically managed by Amazon DynamoDB.
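As a rough idea of what one of those Lambda functions could look like, here is a minimal stock-validation handler in Python. It assumes the request arrives with the cart as a JSON string in `event["body"]` and that every item carries a `sku` and a `quantity`. The `STOCK` dictionary is a hypothetical stand-in for the database, so the sketch runs without any AWS dependency.

```python
import json

# Hypothetical in-memory stock table; a real function would query the database.
STOCK = {"sku-1": 5, "sku-2": 0}


def handler(event, context):
    """Validate stock for a cart sent as the JSON body of the request."""
    try:
        cart = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}

    # Collect every item whose requested quantity exceeds the available stock.
    out_of_stock = [
        item["sku"]
        for item in cart.get("items", [])
        if STOCK.get(item["sku"], 0) < item.get("quantity", 1)
    ]
    return {
        "statusCode": 409 if out_of_stock else 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"out_of_stock": out_of_stock}),
    }


# Local call with a hand-written event, no AWS needed.
sample = {"body": json.dumps({"items": [{"sku": "sku-1", "quantity": 2},
                                         {"sku": "sku-2", "quantity": 1}]})}
print(handler(sample, None))
```

The function keeps no state between calls, which is what lets the platform run as many copies in parallel as the traffic requires.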
Scaling Behavior
What differentiates this approach from the previous architectures?
As we reviewed, we do not need to manage scaling ourselves. Each serverless component will handle its own scaling needs, and we only need to take care of the proper configuration for elements like the amount of memory available for Lambda or the capacity mode and limits for DynamoDB.
Is it that easy? Yes, and as we will see, there will be multiple benefits to this approach. However, there are also some things to consider.
Advantages
- The scaling process will be seamless.
- We reduce the operational overhead.
- You pay for what you use. No traffic means (almost) no cost.
Challenges
- In some cases, it could cost more than other types of architectures, as serverless services tend to cost more per unit of usage.
- There could be specific cases where this approach would not fit, like legacy or migrated applications, where the cost and operational overhead of refactoring the application will be higher than a more traditional approach.
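The trade-off between "you pay for what you use" and "it could cost more" comes down to a break-even point. Here is a small Python sketch; the prices are hypothetical and are not real AWS pricing.

```python
def pay_per_use_cost(requests: int, price_per_million: float) -> float:
    """Monthly cost when every request is billed individually."""
    return requests / 1_000_000 * price_per_million


def break_even_requests(fixed_monthly_cost: float, price_per_million: float) -> float:
    """Monthly request volume at which pay-per-use equals a fixed server bill."""
    if price_per_million <= 0:
        raise ValueError("price_per_million must be positive")
    return fixed_monthly_cost / price_per_million * 1_000_000


# Hypothetical prices: 300 per month for always-on servers, 5 per million requests.
FIXED = 300.0
PRICE = 5.0

print(pay_per_use_cost(0, PRICE))             # no traffic, no usage cost
print(pay_per_use_cost(100_000_000, PRICE))   # heavy traffic
print(break_even_requests(FIXED, PRICE))      # volume where both bills match
```

Below the break-even volume the pay-per-use model is cheaper; above it, always-on servers start to win, which is why steady, heavy traffic is the case where serverless can cost more.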
Use Cases
- This approach will likely fit most types of applications, and it will solve most of the performance, reliability, and scaling problems if the architecture is properly implemented.
Key Takeaways
- Vertical scaling involves provisioning big servers with enough resources for the worst-case scenario of traffic and memory usage.
- Horizontal scaling is focused on distributing the load across multiple servers.
- Serverless components will handle their own scaling automatically based on demand.
- Also, serverless uses a pay-per-use model and can be more cost-effective than other approaches, especially when traffic is low or irregular.
Well, this is the end of this post. Please leave a comment with any questions or opinions, and follow me to stay tuned for future posts.