RateLimiter limiter = GetLimiter();
using RateLimitLease lease = limiter.Acquire(permitCount: 1);
if (lease.IsAcquired)
{
    // Do action that is protected by limiter
}
else
{
    // Error handling or add retry logic
}
In the example above we attempt to acquire 1 permit using the synchronous Acquire method. We also use a using declaration to make sure we dispose the lease once we are done with the resource. The lease is then checked to see whether the permit we requested was acquired. If it was, we can use the protected resource; otherwise we may want some logging or error handling to inform the user or app that the resource wasn’t used because a rate limit was hit.
The other method for trying to acquire permits is WaitAsync. This method allows queuing permit requests and waiting for the permits to become available if they aren’t already. Let’s show another example to explain the queuing concept.
RateLimiter limiter = new ConcurrencyLimiter(
new ConcurrencyLimiterOptions(permitLimit: 2, queueProcessingOrder: QueueProcessingOrder.OldestFirst, queueLimit: 2));
// thread 1:
using RateLimitLease lease = limiter.Acquire(permitCount: 2);
if (lease.IsAcquired) { }
// thread 2:
using RateLimitLease lease = await limiter.WaitAsync(permitCount: 2);
if (lease.IsAcquired) { }
Here we show our first example of using one of the built-in rate limiting implementations, ConcurrencyLimiter. We create the limiter with a maximum permit limit of 2 and a queue limit of 2. This means that a maximum of 2 permits can be acquired at any time, and we allow queuing WaitAsync calls with up to 2 total permit requests.
The queueProcessingOrder parameter determines the order in which items in the queue are processed. It can be QueueProcessingOrder.OldestFirst (FIFO) or QueueProcessingOrder.NewestFirst (LIFO). One interesting behavior to note is that using QueueProcessingOrder.NewestFirst when the queue is full will complete the oldest queued WaitAsync calls with a failed RateLimitLease until there is space in the queue for the newest queue item.
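To make the difference between the two orders concrete, here is a minimal standalone sketch of a bounded wait queue. It does not use System.Threading.RateLimiting at all: the Enqueue helper, the tuple layout, and the string request IDs are hypothetical names invented for illustration, and the built-in limiters complete leases rather than returning lists.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: tries to queue a waiter that needs `permits` permits and returns
// the IDs of the waiters that would be completed with a failed lease as a result.
static List<string> Enqueue(LinkedList<(string Id, int Permits)> queue, int queueLimit,
    bool newestFirst, string id, int permits)
{
    var failed = new List<string>();
    int queued = 0;
    foreach (var waiter in queue) queued += waiter.Permits;

    if (queued + permits > queueLimit)
    {
        // OldestFirst (or a request that could never fit): the newcomer is rejected.
        if (!newestFirst || permits > queueLimit)
        {
            failed.Add(id);
            return failed;
        }

        // NewestFirst: fail the oldest waiters until the newcomer fits.
        while (queued + permits > queueLimit)
        {
            var oldest = queue.First!.Value;
            queue.RemoveFirst();
            queued -= oldest.Permits;
            failed.Add(oldest.Id);
        }
    }

    queue.AddLast((id, permits));
    return failed;
}

var fifo = new LinkedList<(string Id, int Permits)>();
Enqueue(fifo, 2, false, "a", 1);
Enqueue(fifo, 2, false, "b", 1);
Console.WriteLine(string.Join(",", Enqueue(fifo, 2, false, "c", 1))); // c

var lifo = new LinkedList<(string Id, int Permits)>();
Enqueue(lifo, 2, true, "a", 1);
Enqueue(lifo, 2, true, "b", 1);
Console.WriteLine(string.Join(",", Enqueue(lifo, 2, true, "c", 1))); // a
```

With OldestFirst the newest request is the one that fails, while with NewestFirst the oldest queued request is failed to make room.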
In this example there are 2 threads trying to acquire permits. If thread 1 runs first it will acquire the 2 permits successfully, and the WaitAsync in thread 2 will be queued waiting for the RateLimitLease in thread 1 to be disposed. Additionally, if another thread tries to acquire permits using either Acquire or WaitAsync, it will immediately receive a RateLimitLease with an IsAcquired property equal to false, because the permitLimit and queueLimit are already used up.
If thread 2 runs first it will immediately get a RateLimitLease with IsAcquired equal to true, and when thread 1 runs next (assuming the lease in thread 2 hasn’t been disposed yet) it will synchronously get a RateLimitLease with an IsAcquired property equal to false, because Acquire does not queue and the permitLimit is used up by the WaitAsync call.
So far we’ve seen the ConcurrencyLimiter. There are 3 other limiters we provide in-box: TokenBucketRateLimiter, FixedWindowRateLimiter, and SlidingWindowRateLimiter, all of which derive from the abstract class ReplenishingRateLimiter, which itself derives from RateLimiter. ReplenishingRateLimiter introduces the TryReplenish method as well as a couple of properties for observing common settings on the limiter. TryReplenish will be explained after showing some examples of these rate limiters.
RateLimiter limiter = new TokenBucketRateLimiter(new TokenBucketRateLimiterOptions(tokenLimit: 5, queueProcessingOrder: QueueProcessingOrder.OldestFirst,
queueLimit: 1, replenishmentPeriod: TimeSpan.FromSeconds(5), tokensPerPeriod: 1, autoReplenishment: true));
using RateLimitLease lease = await limiter.WaitAsync(5);
// will complete after ~5 seconds
using RateLimitLease lease2 = await limiter.WaitAsync();
Here we show the TokenBucketRateLimiter, which has a few more options than the ConcurrencyLimiter. The replenishmentPeriod is how often new tokens (same concept as permits, just a better name in the context of token bucket) are added back to the limit. In this example tokensPerPeriod is 1 and the replenishmentPeriod is 5 seconds, so every 5 seconds 1 token is added back to the bucket, up to the tokenLimit of 5. And lastly, autoReplenishment is set to true, which means the limiter will create a Timer internally to handle the replenishment of tokens every 5 seconds.
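The replenishment arithmetic for these settings can be sketched without the library. This is only a rough model under the assumption that nothing else consumes tokens in the meantime; Replenish and EstimateWait are hypothetical helper names, not part of System.Threading.RateLimiting, and the real limiter is driven by a timer rather than by explicit calls.

```csharp
using System;

// Hypothetical helper: one replenishment tick adds tokensPerPeriod tokens, capped at tokenLimit.
static int Replenish(int available, int tokensPerPeriod, int tokenLimit) =>
    Math.Min(tokenLimit, available + tokensPerPeriod);

// Hypothetical helper: estimated wait until `requested` tokens are available,
// assuming no other caller takes tokens while we wait.
static TimeSpan EstimateWait(int available, int requested, int tokensPerPeriod, TimeSpan period)
{
    int missing = Math.Max(0, requested - available);
    int periods = (missing + tokensPerPeriod - 1) / tokensPerPeriod; // ceiling division
    return TimeSpan.FromTicks(period.Ticks * periods);
}

int tokens = 5;
tokens -= 5; // the first WaitAsync(5) drains the bucket

// The second WaitAsync() needs 1 token, which arrives with the next replenishment.
Console.WriteLine(EstimateWait(tokens, 1, 1, TimeSpan.FromSeconds(5))); // 00:00:05

tokens = Replenish(tokens, 1, 5);
Console.WriteLine(tokens); // 1
```

This is why the second WaitAsync call in the example completes after roughly 5 seconds: the bucket is empty and refills at 1 token per period.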
If autoReplenishment is set to false, then it is up to the developer to call TryReplenish on the limiter. This is useful when managing multiple ReplenishingRateLimiter instances and wanting to lower the overhead by creating a single Timer instance and managing the replenish calls yourself, instead of having each limiter create its own Timer.
ReplenishingRateLimiter[] limiters = GetLimiters();
Timer rateLimitTimer = new Timer(static state =>
{
    var replenishingLimiters = (ReplenishingRateLimiter[])state!;
    foreach (var limiter in replenishingLimiters)
    {
        limiter.TryReplenish();
    }
}, limiters, TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(1));
FixedWindowRateLimiter has a window option which defines how long it takes for the window to update.
new FixedWindowRateLimiter(new FixedWindowRateLimiterOptions(permitLimit: 2,
queueProcessingOrder: QueueProcessingOrder.OldestFirst, queueLimit: 1, window: TimeSpan.FromSeconds(10), autoReplenishment: true));
And SlidingWindowRateLimiter has a segmentsPerWindow option, in addition to window, which specifies how many segments there are and how often the window will slide.
new SlidingWindowRateLimiter(new SlidingWindowRateLimiterOptions(permitLimit: 2,
queueProcessingOrder: QueueProcessingOrder.OldestFirst, queueLimit: 1, window: TimeSpan.FromSeconds(10), segmentsPerWindow: 5, autoReplenishment: true));
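To illustrate how window and segmentsPerWindow interact, here is a small single-threaded model using the same settings as above. It is a sketch of the general sliding-window idea, not the library's implementation: TryAcquire and Slide are hypothetical local functions, and Slide stands in for the replenishment tick that happens once per segment.

```csharp
using System;

int permitLimit = 2;
int segmentsPerWindow = 5;
TimeSpan window = TimeSpan.FromSeconds(10);

// The window slides once per segment, so every 2 seconds with these settings.
TimeSpan segmentLength = TimeSpan.FromTicks(window.Ticks / segmentsPerWindow);
Console.WriteLine(segmentLength); // 00:00:02

int available = permitLimit;
int[] usedPerSegment = new int[segmentsPerWindow];
int current = 0;

// Hypothetical helper: takes permits and records them against the current segment.
bool TryAcquire(int permits)
{
    if (permits > available) return false;
    available -= permits;
    usedPerSegment[current] += permits;
    return true;
}

// Hypothetical helper: moves to the next segment and returns the permits that were
// taken one full window ago, because that segment has now slid out of the window.
void Slide()
{
    current = (current + 1) % segmentsPerWindow;
    available += usedPerSegment[current];
    usedPerSegment[current] = 0;
}

Console.WriteLine(TryAcquire(2)); // True
Console.WriteLine(TryAcquire(1)); // False, the limit of 2 is used up

for (int i = 0; i < segmentsPerWindow - 1; i++) Slide();
Console.WriteLine(TryAcquire(1)); // False, only 8 seconds have passed

Slide();
Console.WriteLine(TryAcquire(1)); // True, 10 seconds have passed and the permits are back
```

Permits taken in a segment only become available again once that segment leaves the window, which is one full window later.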
Going back to the mention of metadata earlier, let’s show an example of where metadata might be useful.
class RateLimitedHandler : DelegatingHandler
{
    private readonly RateLimiter _rateLimiter;
    public RateLimitedHandler(RateLimiter limiter) : base(new HttpClientHandler())
    {
        _rateLimiter = limiter;
    }
    protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        using RateLimitLease lease = await _rateLimiter.WaitAsync(1, cancellationToken);
        if (lease.IsAcquired)
        {
            return await base.SendAsync(request, cancellationToken);
        }
        // The permit was not acquired: answer locally with a 429 instead of calling the downstream resource
        var response = new HttpResponseMessage(System.Net.HttpStatusCode.TooManyRequests);
        if (lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
        {
            response.Headers.Add(HeaderNames.RetryAfter, ((int)retryAfter.TotalSeconds).ToString(NumberFormatInfo.InvariantInfo));
        }
        return response;
    }
}
RateLimiter limiter = new TokenBucketRateLimiter(new TokenBucketRateLimiterOptions(tokenLimit: 5, queueProcessingOrder: QueueProcessingOrder.OldestFirst,
    queueLimit: 1, replenishmentPeriod: TimeSpan.FromSeconds(5), tokensPerPeriod: 1, autoReplenishment: true));
HttpClient client = new HttpClient(new RateLimitedHandler(limiter));
await client.GetAsync("https://example.com");
In this example we are making a rate limited HttpClient, and if we fail to acquire the requested permit we want to return a failed HTTP response with a 429 status code (Too Many Requests) instead of making an HTTP request to our downstream resource. Additionally, 429 responses can contain a “Retry-After” header that lets the consumer know when a retry might be successful. We accomplish this by looking for metadata on the RateLimitLease using TryGetMetadata and MetadataName.RetryAfter. We also use the TokenBucketRateLimiter because it can estimate when the requested number of tokens will be available, as it knows how often it replenishes tokens. The ConcurrencyLimiter, by contrast, has no way of knowing when permits will become available, so it wouldn’t provide any RetryAfter metadata.
MetadataName is a static class that provides a couple of pre-created MetadataName<T> instances: MetadataName.RetryAfter, which we just saw and which is typed as MetadataName<TimeSpan>, and MetadataName.ReasonPhrase, which is typed as MetadataName<string>. There is also a static MetadataName.Create<T>(string name) method for creating your own strongly-typed named metadata keys. RateLimitLease.TryGetMetadata has 2 overloads: one takes the strongly-typed MetadataName<T> and has an out T parameter, and the other accepts a string for the metadata name and has an out object parameter.
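As a sketch of how the two TryGetMetadata overloads relate, the snippet below defines a failed lease that carries metadata and reads it back both ways. RejectedLease and the "QUEUE_LENGTH" key are hypothetical and exist only for this example; the snippet assumes the System.Threading.RateLimiting package is referenced and that the abstract members of RateLimitLease are the ones overridden later in this post (IsAcquired, MetadataNames, and the string-based TryGetMetadata).

```csharp
using System;
using System.Collections.Generic;
using System.Threading.RateLimiting;

// Hypothetical custom key; "QUEUE_LENGTH" is a name invented for this example.
MetadataName<int> queueLength = MetadataName.Create<int>("QUEUE_LENGTH");

using RateLimitLease lease = new RejectedLease(new Dictionary<string, object?>
{
    [MetadataName.RetryAfter.Name] = TimeSpan.FromSeconds(5),
    [queueLength.Name] = 3,
});

// Strongly-typed overload: the out parameter is already a TimeSpan.
if (lease.TryGetMetadata(MetadataName.RetryAfter, out TimeSpan retryAfter))
{
    Console.WriteLine($"Retry after {retryAfter.TotalSeconds} seconds");
}

// String overload: the out parameter is an object that the caller has to cast.
if (lease.TryGetMetadata("QUEUE_LENGTH", out object? boxed))
{
    Console.WriteLine($"Queue length: {(int)boxed!}");
}

// Hypothetical lease type that always reports failure and exposes whatever metadata it was given.
sealed class RejectedLease : RateLimitLease
{
    private readonly Dictionary<string, object?> _metadata;

    public RejectedLease(Dictionary<string, object?> metadata) => _metadata = metadata;

    public override bool IsAcquired => false;

    public override IEnumerable<string> MetadataNames => _metadata.Keys;

    public override bool TryGetMetadata(string metadataName, out object? metadata) =>
        _metadata.TryGetValue(metadataName, out metadata);
}
```

A custom limiter only has to implement the string-based overload; the strongly-typed overload is layered on top of it through the key's name.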
Let’s now look at another API being introduced to help with more complicated scenarios, the PartitionedRateLimiter!
PartitionedRateLimiter
Also contained in the System.Threading.RateLimiting NuGet package is PartitionedRateLimiter<TResource>. This is an abstraction that is very similar to the RateLimiter class, except that its methods accept a TResource instance as an argument. For example, Acquire is now Acquire(TResource resourceID, int permitCount = 1). This is useful for scenarios where you might want to change rate limiting behavior depending on the TResource that is passed in. This can be something such as independent concurrency limits for different TResource values, or more complicated scenarios like grouping X and Y under the same concurrency limit but having W and Z under a token bucket limit.
To assist with common usages, we have included a way to construct a PartitionedRateLimiter<TResource> via PartitionedRateLimiter.Create<TResource, TPartitionKey>(...).
enum MyPolicyEnum
{
    One,
    Two,
    Admin,
    Default
}
PartitionedRateLimiter<string> limiter = PartitionedRateLimiter.Create<string, MyPolicyEnum>(resource =>
{
    if (resource == "Policy1")
        return RateLimitPartition.Create(MyPolicyEnum.One, key => new MyCustomLimiter());
    else if (resource == "Policy2")
        return RateLimitPartition.CreateConcurrencyLimiter(MyPolicyEnum.Two, key =>
            new ConcurrencyLimiterOptions(permitLimit: 2, queueProcessingOrder: QueueProcessingOrder.OldestFirst, queueLimit: 2));
    else if (resource == "Admin")
        return RateLimitPartition.CreateNoLimiter(MyPolicyEnum.Admin);
    // Fallback for every other resource
    return RateLimitPartition.CreateTokenBucketLimiter(MyPolicyEnum.Default, key =>
        new TokenBucketRateLimiterOptions(tokenLimit: 5, queueProcessingOrder: QueueProcessingOrder.OldestFirst,
            queueLimit: 1, replenishmentPeriod: TimeSpan.FromSeconds(5), tokensPerPeriod: 1, autoReplenishment: true));
});
RateLimitLease lease = limiter.Acquire(resourceID: "Policy1", permitCount: 1);
// ...
RateLimitLease lease = limiter.Acquire(resourceID: "Policy2", permitCount: 1);
// ...
RateLimitLease lease = limiter.Acquire(resourceID: "Admin", permitCount: 12345678);
// ...
RateLimitLease lease = limiter.Acquire(resourceID: "other value", permitCount: 1);
PartitionedRateLimiter.Create has 2 generic type parameters. The first one represents the resource type, which will also be the TResource in the returned PartitionedRateLimiter<TResource>. The second generic type is the partition key type; in the above example we use MyPolicyEnum as our key type. The key is used to differentiate a group of TResource instances with the same limiter, which is what we are calling a partition. PartitionedRateLimiter.Create accepts a Func<TResource, RateLimitPartition<TPartitionKey>>, which we call the partitioner. This function is called every time the PartitionedRateLimiter is interacted with via Acquire or WaitAsync, and a RateLimitPartition<TKey> is returned from the function. RateLimitPartition<TKey> contains a Create method, which is how the user specifies what identifier the partition will have and what limiter will be associated with that identifier.
In our first block of code above, we are checking the resource for equality with “Policy1”. If they match, we create a partition with the key MyPolicyEnum.One and return a factory for creating a custom RateLimiter. The factory is called once and then the rate limiter is cached, so future accesses for the key MyPolicyEnum.One will use the same rate limiter instance.
Looking at the first else if condition, we similarly create a partition when the resource equals “Policy2”. This time we use the convenience method CreateConcurrencyLimiter to create a ConcurrencyLimiter. We use a new partition key of MyPolicyEnum.Two for this partition and specify the options for the ConcurrencyLimiter that will be generated. Now every Acquire or WaitAsync for “Policy2” will use the same instance of ConcurrencyLimiter.
Our third condition is for our “Admin” resource. We don’t want to limit our admin(s), so we use CreateNoLimiter, which will have no limits applied. We also assign the partition key MyPolicyEnum.Admin for this partition.
Finally, we have a fallback for all other resources to use a TokenBucketRateLimiter instance, and we assign the key of MyPolicyEnum.Default to this partition. Any request to a resource not covered by our if conditions will use this TokenBucketRateLimiter. It’s generally a good practice to have a non-noop fallback limiter in case you didn’t cover all conditions or add new behavior to your application in the future.
In the next example, let’s combine the PartitionedRateLimiter with our customized HttpClient from earlier. We’ll use HttpRequestMessage as our resource type for the PartitionedRateLimiter, which is the type we get in the SendAsync method of DelegatingHandler, and a string for our partition key, as we are going to be partitioning based on URL paths.
PartitionedRateLimiter<HttpRequestMessage> limiter = PartitionedRateLimiter.Create<HttpRequestMessage, string>(resource =>
{
    if (resource.RequestUri?.IsLoopback == true)
        return RateLimitPartition.CreateNoLimiter("loopback");
    string[]? segments = resource.RequestUri?.Segments;
    // At least 3 segments are needed because segments[2] is read below
    if (segments?.Length >= 3 && segments[1] == "api/")
    {
        // segments will be [] { "/", "api/", "next_path_segment", etc.. }
        return RateLimitPartition.CreateConcurrencyLimiter(segments[2].Trim('/'), key =>
            new ConcurrencyLimiterOptions(permitLimit: 2, queueProcessingOrder: QueueProcessingOrder.OldestFirst, queueLimit: 2));
    }
    return RateLimitPartition.Create("default", key => new MyCustomLimiter());
});
class RateLimitedHandler : DelegatingHandler
{
    private readonly PartitionedRateLimiter<HttpRequestMessage> _rateLimiter;
    public RateLimitedHandler(PartitionedRateLimiter<HttpRequestMessage> limiter) : base(new HttpClientHandler())
    {
        _rateLimiter = limiter;
    }
    protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        using RateLimitLease lease = await _rateLimiter.WaitAsync(request, 1, cancellationToken);
        if (lease.IsAcquired)
        {
            return await base.SendAsync(request, cancellationToken);
        }
        var response = new HttpResponseMessage(System.Net.HttpStatusCode.TooManyRequests);
        if (lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
        {
            response.Headers.Add(HeaderNames.RetryAfter, ((int)retryAfter.TotalSeconds).ToString(NumberFormatInfo.InvariantInfo));
        }
        return response;
    }
}
Looking closely at the PartitionedRateLimiter in the above example, our first check is for localhost. We’ve decided that if the user is doing things locally we don’t want to limit them, since they won’t be using the upstream resource that we are trying to protect. The next check is more interesting: we are looking at the URL path and finding any requests to an /api/<something> endpoint. If the request matches, we grab the <something> part of the path and create a partition for that specific path. What this means is that any requests to /api/apple/* will use one instance of our ConcurrencyLimiter, while any requests to /api/orange/* will use a different instance of our ConcurrencyLimiter. This is because we use a different partition key for those requests, and so our limiter factory generates a new limiter for the different partitions. And finally, we have a fallback limit for any requests that aren’t for localhost or an /api/* endpoint.
Also shown is the updated RateLimitedHandler, which now accepts a PartitionedRateLimiter<HttpRequestMessage> instead of a RateLimiter and passes request to the WaitAsync call. Otherwise the rest of the code remains the same.
There are a few things worth pointing out in this example. We may potentially create many partitions if lots of unique /api/* requests are made, which would result in memory usage growing in our PartitionedRateLimiter. The PartitionedRateLimiter returned from PartitionedRateLimiter.Create does have some logic to remove limiters once they haven’t been used for a while to help mitigate this, but application developers should also be aware of creating unbounded partitions and try to avoid that when possible. Additionally, we use segments[2].Trim('/') for our partition key. The Trim call avoids using a different limiter for /api/apple and /api/apple/, as those produce different segments when using Uri.Segments.
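The effect of the Trim call is easy to check in isolation. The sketch below prints what Uri.Segments returns for both forms of the URL; GetApiPartitionKey is a hypothetical helper that mirrors the key selection of the partitioner above and returns null for anything outside /api/<something>.

```csharp
using System;

// Hypothetical helper mirroring the partitioner's key selection.
static string? GetApiPartitionKey(Uri uri)
{
    string[] segments = uri.Segments;
    if (segments.Length >= 3 && segments[1] == "api/")
    {
        return segments[2].Trim('/');
    }
    return null;
}

// Without the trailing slash the last segment is "apple"; with it, the segment is "apple/".
Console.WriteLine(string.Join(" | ", new Uri("https://example.com/api/apple").Segments));  // / | api/ | apple
Console.WriteLine(string.Join(" | ", new Uri("https://example.com/api/apple/").Segments)); // / | api/ | apple/

// After trimming, both URLs (and anything below them) map to the same partition key.
Console.WriteLine(GetApiPartitionKey(new Uri("https://example.com/api/apple")));    // apple
Console.WriteLine(GetApiPartitionKey(new Uri("https://example.com/api/apple/")));   // apple
Console.WriteLine(GetApiPartitionKey(new Uri("https://example.com/api/apple/1")));  // apple
```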
Custom PartitionedRateLimiter<T> implementations can also be written without using the PartitionedRateLimiter.Create method. Below is an example of a custom implementation using a concurrency limit for each int resource. So resource 1 has its own limit, 2 has its own limit, etc. This has the advantage of being more flexible and potentially more efficient, at the cost of higher maintenance.
public sealed class PartitionedConcurrencyLimiter : PartitionedRateLimiter<int>
{
    private ConcurrentDictionary<int, int> _keyLimits = new();
    private int _permitLimit;
    private static readonly RateLimitLease FailedLease = new Lease(null, 0, 0);
    public PartitionedConcurrencyLimiter(int permitLimit)
    {
        _permitLimit = permitLimit;
    }
    public override int GetAvailablePermits(int resourceID)
    {
        if (_keyLimits.TryGetValue(resourceID, out int value))
            return value;
        // No permits have been taken for this resource yet, so the full limit is available
        return _permitLimit;
    }
    protected override RateLimitLease AcquireCore(int resourceID, int permitCount)
    {
        if (_permitLimit < permitCount)
            return FailedLease;
        bool wasUpdated = false;
        _keyLimits.AddOrUpdate(resourceID, (key) =>
        {
            wasUpdated = true;
            return _permitLimit - permitCount;
        }, (key, currentValue) =>
        {
            if (currentValue >= permitCount)
            {
                wasUpdated = true;
                currentValue -= permitCount;
            }
            return currentValue;
        });
        if (wasUpdated)
            return new Lease(this, resourceID, permitCount);
        return FailedLease;
    }
    protected override ValueTask<RateLimitLease> WaitAsyncCore(int resourceID, int permitCount, CancellationToken cancellationToken)
    {
        // No queuing support: behaves exactly like AcquireCore
        return new ValueTask<RateLimitLease>(AcquireCore(resourceID, permitCount));
    }
    private void Release(int resourceID, int permitCount)
    {
        _keyLimits.AddOrUpdate(resourceID, _permitLimit, (key, currentValue) =>
        {
            currentValue += permitCount;
            return currentValue;
        });
    }
    private sealed class Lease : RateLimitLease
    {
        private readonly int _permitCount;
        private readonly int _resourceId;
        private PartitionedConcurrencyLimiter? _limiter;
        public Lease(PartitionedConcurrencyLimiter? limiter, int resourceId, int permitCount)
        {
            _limiter = limiter;
            _resourceId = resourceId;
            _permitCount = permitCount;
        }
        public override bool IsAcquired => _limiter is not null;
        public override IEnumerable<string> MetadataNames => throw new NotImplementedException();
        public override bool TryGetMetadata(string metadataName, out object? metadata)
        {
            throw new NotImplementedException();
        }
        protected override void Dispose(bool disposing)
        {
            if (_limiter is null)
                return;
            _limiter.Release(_resourceId, _permitCount);
            _limiter = null;
        }
    }
}
PartitionedRateLimiter<int> limiter = new PartitionedConcurrencyLimiter(permitLimit: 10);
// both will be successful acquisitions as they use different resource IDs
RateLimitLease lease = limiter.Acquire(resourceID: 1, permitCount: 10);
RateLimitLease lease2 = limiter.Acquire(resourceID: 2, permitCount: 7);
This implementation does have some issues, such as never removing entries in the dictionary, not supporting queuing, and throwing when accessing metadata, so please use it as inspiration for implementing a custom PartitionedRateLimiter<T> and don’t copy it into your code without modifications.
Now that we’ve gone over the main APIs, let’s take a look at the RateLimiting middleware in ASP.NET Core that makes use of these primitives.
RateLimiting middleware
This middleware is provided via the Microsoft.AspNetCore.RateLimiting NuGet package. The main usage pattern is to configure some rate limiting policies and then attach those policies to your endpoints. A policy is a named Func<HttpContext, RateLimitPartition<TPartitionKey>>, which is the same as what the PartitionedRateLimiter.Create method took, where TResource is now HttpContext and TPartitionKey is still a user defined key. There are also extension methods for the 4 built-in rate limiters for when you want to configure a single limiter for a policy without needing different partitions.
var app = WebApplication.Create(args);
app.UseRateLimiter(new RateLimiterOptions()
    .AddConcurrencyLimiter(policyName: "get", new ConcurrencyLimiterOptions(permitLimit: 2, queueProcessingOrder: QueueProcessingOrder.OldestFirst, queueLimit: 2))
    .AddNoLimiter(policyName: "admin")
    .AddPolicy(policyName: "post", partitioner: httpContext =>
    {
        if (!StringValues.IsNullOrEmpty(httpContext.Request.Headers["token"]))
        {
            return RateLimitPartition.CreateTokenBucketLimiter("token", key =>
                new TokenBucketRateLimiterOptions(tokenLimit: 5, queueProcessingOrder: QueueProcessingOrder.OldestFirst,
                    queueLimit: 1, replenishmentPeriod: TimeSpan.FromSeconds(5), tokensPerPeriod: 1, autoReplenishment: true));
        }
        return RateLimitPartition.Create("default", key => new MyCustomLimiter());
    }));
app.MapGet("/get", context => context.Response.WriteAsync("get")).RequireRateLimiting("get");
app.MapGet("/admin", context => context.Response.WriteAsync("admin")).RequireRateLimiting("admin").RequireAuthorization("admin");
app.MapPost("/post", context => context.Response.WriteAsync("post")).RequireRateLimiting("post");
app.Run();
This example shows how to add the middleware, configure some policies, and apply the different policies to different endpoints. Starting at the top, we add the middleware to our middleware pipeline using UseRateLimiter. Next we add some policies to our options using the convenience methods AddConcurrencyLimiter and AddNoLimiter for 2 of the policies, named "get" and "admin" respectively. Then we use the AddPolicy method, which allows configuring different partitions based on the resource passed in (HttpContext for the middleware). Finally, we use the RequireRateLimiting method on our various endpoints to let the Rate Limiting middleware know what policy to run on what endpoint. (Note that the RequireAuthorization usage on the /admin endpoint doesn’t do anything in this minimal sample; imagine that authentication and authorization are configured.)
The AddPolicy method also has 2 more overloads that use IRateLimiterPolicy<TPartitionKey>. This interface exposes an OnRejected callback, the same as RateLimiterOptions which I’ll describe below, and a GetPartition method that takes the HttpContext as an argument and returns a RateLimitPartition<TPartitionKey>. The first overload of AddPolicy takes an instance of IRateLimiterPolicy, and the second takes an implementation of IRateLimiterPolicy as a generic argument. The generic overload will use dependency injection to call the constructor and instantiate the IRateLimiterPolicy for you.
public class CustomRateLimiterPolicy : IRateLimiterPolicy<string>
{
    private readonly ILogger _logger;
    public CustomRateLimiterPolicy(ILogger<CustomRateLimiterPolicy> logger)
    {
        _logger = logger;
    }
    public Func<OnRejectedContext, CancellationToken, ValueTask>? OnRejected
    {
        get => (context, cancellationToken) =>
        {
            context.HttpContext.Response.StatusCode = 429;
            _logger.LogDebug("Request rejected");
            return new ValueTask();
        };
    }
    public RateLimitPartition<string> GetPartition(HttpContext httpContext)
    {
        if (!StringValues.IsNullOrEmpty(httpContext.Request.Headers["token"]))
        {
            return RateLimitPartition.CreateTokenBucketLimiter("token", key =>
                new TokenBucketRateLimiterOptions(tokenLimit: 5, queueProcessingOrder: QueueProcessingOrder.OldestFirst,
                    queueLimit: 1, replenishmentPeriod: TimeSpan.FromSeconds(5), tokensPerPeriod: 1, autoReplenishment: true));
        }
        return RateLimitPartition.Create("default", key => new MyCustomLimiter());
    }
}
var app = WebApplication.Create(args);
var logger = app.Services.GetRequiredService<ILogger<CustomRateLimiterPolicy>>();
app.UseRateLimiter(new RateLimiterOptions()
    .AddPolicy("a", new CustomRateLimiterPolicy(logger))
    .AddPolicy<string, CustomRateLimiterPolicy>("b"));
Other configuration on RateLimiterOptions includes RejectionStatusCode, which is the status code that will be returned if a lease fails to be acquired; by default a 503 is returned. For more advanced usages there is also the OnRejected function, which will be called after RejectionStatusCode is used and receives OnRejectedContext as an argument.
new RateLimiterOptions()
{
    OnRejected = (context, cancellationToken) =>
    {
        context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
        return new ValueTask();
    }
};
And last but not least, RateLimiterOptions allows configuring a global PartitionedRateLimiter<HttpContext> via RateLimiterOptions.GlobalLimiter. If a GlobalLimiter is provided, it will run before any policy specified on an endpoint. For example, if you wanted to limit your application to handle 1000 concurrent requests no matter what endpoint policies were specified, you could configure a PartitionedRateLimiter with those settings and set the GlobalLimiter property.
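A possible shape for that 1000-concurrent-requests setup is sketched below. It reuses only the API shapes that appear in the examples in this post (which may differ in other versions of the packages), and it assumes GlobalLimiter can be assigned in an object initializer; the "global" partition key and the queueLimit of 0 are choices made for this illustration.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.RateLimiting;
using System.Threading.RateLimiting;

var app = WebApplication.Create(args);

app.UseRateLimiter(new RateLimiterOptions
{
    // Every request maps to the same partition key, so one ConcurrencyLimiter
    // is shared by the whole application regardless of the endpoint policy.
    GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
        RateLimitPartition.CreateConcurrencyLimiter("global", key =>
            new ConcurrencyLimiterOptions(permitLimit: 1000,
                queueProcessingOrder: QueueProcessingOrder.OldestFirst,
                queueLimit: 0))) // assumption: reject immediately instead of queuing
});

app.MapGet("/", context => context.Response.WriteAsync("hello"));
app.Run();
```

Because the partitioner always returns the same key, the limiter factory runs once and the single ConcurrencyLimiter caps concurrency across all endpoints.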
Summary
Please try Rate Limiting out and let us know what you think! For the RateLimiting APIs in the System.Threading.RateLimiting namespace, use the NuGet package System.Threading.RateLimiting and provide feedback in the Runtime GitHub repo. For the RateLimiting middleware, use the NuGet package Microsoft.AspNetCore.RateLimiting and provide feedback in the AspNetCore GitHub repo.
This can be accomplished using the PartitionedRateLimiter. Using the built in PartitionedRateLimiter.Create method and defining a partition per IP/UserID, similar to the RateLimitedHandler example shown above that uses Paths instead of IP/UserID. Or with a custom implementation similar to PartitionedConcurrencyLimiter but for IPs/UserID.
429 is primarily meant for “a single client/user is making too many requests”. This is why the RateLimitedHandler example above is using 429 when creating the HttpResponse because it is a client side rate limiter so in that case the single client is making too many requests. On the server side we don’t know that it’s a single client that’s tripping the rate limit (although you can add this logic yourself) so the default shouldn’t use 429, otherwise we are improperly using the status code. You can of course change the default if you don’t care about the semantic differences or if you know it’s a single client.
Built in way to enable/disable rate limiting at runtime would be nice. So you can keep it disabled most of time, and enable (per policy or globally) only when you have high load. Right now I suppose it’s possible only when you use generic AddPolicy and manually fallback to empty limiter.
Something like a singleton-registered implementation of an interface with 2 methods: CheckEnabled(HttpContext context) to check for global on/off, and CheckEnabled(string policy, HttpContext context) to check after the global check has passed and the policy name is resolved.
Not sure this is something we want to have built-in. We don’t want to encourage users to “turn off” rate limiting, and if your app is under low load your rate limits shouldn’t be hit anyways.
This is easy to implement yourself though, as shown above, you can write an IRateLimiterPolicy implementation that accepts your singleton service and does those checks for you and returns “no limit” when turned off. Otherwise, it returns your partitions like normal.
I’m a little confused about why WaitAsync() can return if the lease hasn’t actually been acquired. I guess the use case is to prevent having too many clients waiting at once?
Also, not a fan of you guys making a class called “MetadataName”. This seems way too generic a name to put in the framework and fairly likely to collide with user code. It may even get confusing with other parts of the framework/runtime where things are also called metadata. Maybe “HttpMetadataName”? Or if it’s not only for Http, maybe “RateLimiterMetadataName”?
Correct, the primary case when WaitAsync returns a failed lease is when the queue is filled (configurable for all the built-in limiters). There are a couple more edge cases where a failed lease can occur as well; you ask for more permits than can ever be allowed, you use QueueProcessingOrder.NewestFirst and a newer request kicks a prior WaitAsync from the queue.
I’m interested in this too. I’m hoping that I can integrate with my Redis cache to share rate-limiting state across multiple edge services. In my use-case, inbound traffic is currently round-robined to available services within my cluster by my container management system to spread the load, and we scale horizontally if we see more load. If this is pre-built, then that would be fantastic, but happy to roll my own implementation if there are one or more interfaces I can swap out using IOC.
We are currently experimenting with how distributed rate limiting implementations would work. Here is a prototype of a fault-tolerant distributed rate limiter built on Orleans: https://github.com/ReubenBond/DistributedRateLimiting.Orleans
A Redis-based implementation is also being looked into. There are no plans to ship any production ready implementations in .NET 7, there isn’t enough time! But we may ship an experimental package or provide examples in the meantime.
This is definitely an area we are interested in making work well.
One security thing I’m still missing from IIS or .Net core is a way to ban bots by IP that try to find vulnerable web apps (like WordPress or other PHP apps) based on the type of requests they make. For example, ban an IP address when it makes x amount of requests for /wp-include/*. Can this be done with RateLimiting too?
The problem with such an approach is that simple static code is not sufficient for bot detection and banning. Having spent a lot of time fighting bots on our sites, I can assure you you’ll spend more time than the malicious users will trying to get this right. Bot prevention is better handled through a higher level provider like Azure, AWS or a specialized provider like Imperva. They have the AI tools and better experience to handle this kind of stuff than anything you might try to implement (or even IIS).
This is actually my same view on this library. I see it as a “poor man’s” solution to solving a problem when you don’t have (or need) a more robust solution. If you need a simple solution then this might be it but honestly it isn’t going to scale well IMO and is mostly useless for anything beyond that. The problem is that those requests still have to come to your IIS box and still have to be processed as requests. It is only after running your code that it’ll realize that rate limiting should kick in and block the request. The purpose of rate limiting is to prevent callers from calling you too many times, generally for perf reasons. If your server is still executing code to determine that it needs to block then you aren’t really getting the perf benefits. Furthermore if somebody blasts your server with 10/100/1000K requests by accident, your server is still going to be processing those requests.
Personally I might use such a library to test out rate limiting for some code but if I need rate limiting in my app then I would go with a higher level provider that handles that before it gets to my server at all. Azure has such features and I’m sure other providers do as well.
What you’re looking for is called a Web Application Firewall, like the open source ModSecurity that works with IIS. You can also create middleware for ASP.NET Core like Kestrel WAF to block requests based on rules. As already mentioned the best solution is to use a managed service deployed on the network edge, but if the cost is prohibitive then using static rules is still worthwhile.
We are working on adding some attributes to influence the behavior of the middleware for endpoints that might need that. It will be something like [RateLimit(“policyName”)] and [DisableRateLimit] (names TBD).
For now, you can still configure the GlobalLimiter in the middleware which will run for all endpoints. It will be a bit harder to specify logic for specific Controller actions, but it is feasible.
Do I understand correctly that a built-in rate limiting solution is an option only for these setups that can’t have a designated rate-limiting software doing the job in front of the web server?
I mean, if I have nginx, haproxy which do it for me would there be any reason to embed rate-limiting in the web server?
Even better example is having WAF attached to Load Balancer in front of auto-scaled set of servers (eg in containers). This eliminates the problem of distributed rate-limiting.
So my point is: as long as you can have any option mentioned by me, there is no reason to configure rate-limiting inside my netcore webserver. Please prove me wrong! Thanks
This is a great feature! The community has been asking for it for a couple years, as evidenced by a couple long discussions in Polly’s issues. Also, it’s great timing to coincide with the recent announcement of HallPass.dev, which seeks to help client applications respect the rate limits of the APIs to which they make requests. Once this official .NET package is released, it would be great if HallPass reworked their stuff to use as much of these core elements as possible in their implementations.