One of the best things about the serverless is its ability to scale even in case of huge traffic spikes. But unfortunately, scaling is not free both financially and technically. That’s why developers need to control their applications’ scalability. Here the main reasons you will need a rate limiting mechanism in your serverless application:
1- Protect your resources: If you’re providing a public API, traffic spikes can degrade the quality of the service, and may lead to a service outage for all your users. You need to protect your system against such cascading failures as well as self-inflicted Ddos incidents. A bug in your application can trigger such problems in your system. An internal process which retries an endpoint indefinitely in case of a failure can easily exhaust your resources.
2- Manage user quotas: You may want to define quotas for your users for fair use of your services. Also you may need quotas if you provide your services in different pricing tiers.
3- Control the cost: There are many real life examples how an uncontrolled system can cause large bills. This is quite a risk for serverless applications thanks its highly scalable nature. Rate limiting will help you control these costs.