-
-
Notifications
You must be signed in to change notification settings - Fork 835
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Memory leak when using types in collectible AssemblyLoadContexts #1324
Comments
This is an interesting use-case, albeit one we don't actually support; I've always really struggled to get an ALC to actually unload personally, they're usually pretty stubborn. It's certainly not a "bug", in that I wouldn't have expected it to work if someone asked me "does Autofac lose any references to types when the ALC unloads" . Autofac's reflection caching behaviour predates (by quite a long way) the idea of unloadable types. I'd say that the safety of your reflection-based cache clear is up-for-debate; it's entirely not supported, and is liable to break between releases. It's also incomplete (I guess because only those caches are being used in your code). There's also:
Also bear in mind that if you reset those caches using How often are you loading/unloading ALCs? Is the memory leak severe? |
Thanks for the quick response; I figured this might be the case. The world of unloadable types is new and rocky, and there are similar issues in ASPNET Core (e.g. dotnet/aspnetcore#21744). .NET itself had an issue until the latest service release (dotnet/runtime#1388) that prevented using things like Fluent NHibernate in the manner I've described. What I can say is that I've managed to cobble together a Kestrel-based runtime that can dynamically load, register, use, dispose, and collect assemblies which are used to service both web requests and Quartz jobs. The default MS IOC container appears not to leak; even with Autofac fitted, the default IOC mechanism is being used to resolve ASPNET controllers (but not their constructor args) which are dynamically loaded and unloaded succesfully. I could fall back to the default builder/container but I prefer Autofac's featureset. On top of that I'm dealing with a fairly large existing codebase that uses Autofac extensively, and many of the plugin types are externally maintained and use Autofac internally (and expose modules), so form part of the overall plugin contract, which I very much do not want to change. I'm 100% OK with my hack breaking between releases, and understand any time I'm using reflection to violate field and class privacy I'm way off in "that's not supported" land. I guess what I'm really looking for is an indication of where I am on the scale from "you might be OK if test thoroughly and maintain your hacks" to "you are completely crazy and this will never work for a bunch of Autofac-specific reasons you have overlooked". Thanks for identifying those other two collections. My test was exclusively constructor-based, so I didn't encounter those two. As for how often I'm loading / unloading assemblies, this will vary pretty wildly depending on environment, with the expectation that it'll happen say 5 times a year in production environments and 30-200 times an hour in some development situations. While wiping those caches will result in a loss of performance it's nothing like restarting the whole process and reinit'ing the whole environment. As for the severity of the leak, that really depends on how many assemblies are dragged into the collectible ALCs. I haven't pieced this all together in a realistic environment yet, but I'm expecting the leak to be in the single-digit megabytes per cycle at a minimum. This is based on the assumption that if even a single type can't unload, neither can the ALC, so all assemblies / types within it will remain. As you point out, Autofac's caching logic significantly predates the idea of unloadable types. Times are changing, though, and the possibility of managing ephemeral types in nested lifetime scopes is pretty exciting, and potentially extremely powerful. Would the team be interested in considering this as a future feature? I'm not asking for the work to be done, just looking for an indication of whether the feature would be welcome. I'd understand if it wasn't (especially on the grounds of simplicity). If it's something that people feel would be welcome in Autofac's future, and my hacks are not deemed the work of a madman, I'd be looking to stick with the hacks for now with a view to contributing to a supported solution in the future. |
I'm not opposed to supporting some way of clearing internal caches. Scenarios where Autofac supports a plugin architecture, and the ability to load plugins is a pretty common usecase. Unloading is just less so. I'm a little nervous about the idea of "scoping" caches to a particular lifetime scope; that would probably introduce a lot of cache duplication where nested lifetime scopes can't use the cache entries in their owned scope, and some overhead of creating new cache objects at nested scopes, for a scenario that's not super common. Noodling on the problem though, I do wonder if there might be a relatively simple way to allow external code in advanced scenarios to say "hey, Autofac, can you please clear your various reflection caches?" Something like a public delegate bool ReflectionCacheShouldClearPredicate(MemberInfo member);
public delegate void ReflectionCacheRegistrationCallback(ReflectionCacheShouldClearPredicate? predicate);
public static class ReflectionCache
{
private static List<ReflectionCacheRegistrationCallback> _cacheRegistrations = new();
// Make this public so Autofac code in integration libraries can register for reflection cache clearing.
public static void Register(ReflectionCacheRegistrationCallback registrationCallback)
{
lock (_cacheRegistrations)
{
_cacheRegistrations.Add(registrationCallback);
}
}
// Allow conditional clearing of reflection cache data, so only a single
// ALC's type info has to be dropped (for example).
public static void Clear(ReflectionCacheShouldClearPredicate? predicate = null)
{
lock (_cacheRegistrations)
{
foreach (var reg in _cacheRegistrations)
{
reg(predicate);
}
}
}
}
internal class ReflectionCacheDictionary<TKey, TValue> : ConcurrentDictionary<TKey, TValue>
where TKey : MemberInfo
{
public ReflectionCacheConcurrentDictionary()
{
ReflectionCache.Register(CacheClear);
}
private void CacheClear(ReflectionCacheShouldClearPredicate? predicate)
{
if (predicate is null)
{
Clear();
return;
}
foreach (var kvp in this)
{
if (predicate(kvp.Key))
{
TryRemove(kvp.Key, out _);
}
}
}
}
// ...
internal static class TypeExtensions
{
private static readonly ReflectionCacheDictionary<Type, bool> IsGenericEnumerableInterfaceCache = new();
// ...
} By delegating responsibility for which types to remove from the cache, it means we don't need to understand how AssemblyLoadContext unloadingAlc = // from somewhere
ReflectionCache.Clear(t => unloadingAlc.Assemblies.Contains(t.Module.Assembly)); There are downsides to the above approach, in that it would mandate that you not resolve anything from that assembly load context while the cache is being cleared. We wouldn't synchronise, because it would interfere with other cache hits, so this functionality would be firmly in the "advanced" camp. From a maintainer viewpoint, we would have to make sure we use that |
I'm not entirely sure that would fix everything. I haven't gone spelunking yet, but I'm thinking about stuff like...
This is a lot of why the container is effectively immutable. In some cases it isn't just about flushing the cache, it's about, say, some factory resolving This has come up a lot in the past with folks wanting to do UWP apps that dynamically load and unload components. However, there's no structure in that framework really for having nested-upon-nested lifetime scopes, so even just wiring it up is a challenge; and the root "application" object (from what I understand) is where folks end up "holding" the set of app components - an app level singleton that holds types that need to be unloaded. I'm not saying this is literally technologically insurmountable, but maybe that even if it's supported, it'll be amazingly hard to use correctly. Re: the default MS container and controllers - it's not that the default MS container is used to resolve controllers, it's the |
I'm reasonably confident (80%?) that if a lifetime scope is configured in Equally, since the created lifetime scope has it's own instance of the services tracker, computed sets of So all shared instances and computed registration data for a lifetime scope that has had additional modules loaded into it in |
Maybe that More brainstorming: I keep thinking about the synchronization problems here. To unload/update the ALC you'd have to cordon the app, drain all the traffic, ensure there are no requests going on, kill the ALC lifetime scope, create a new one, and uncordon the app so it can take requests again. During that time, you'd need to drop the ALC lifetime scope first and then flush the caches, in that order, and without doing any resolutions that might incorrectly update the caches between. Admittedly "not our problem," but not obvious or easy. (Some of this was already mentioned before, but I think it bears repeating.) I'm also thinking about weird inadvertent captive dependencies we don't have control over, like someone holding an invalid Hmmm. I guess if the whole solution on our end is to provide something to flush caches and let app devs figure the rest out for themselves... yeah, OK. I think I'll want to really, really clearly explain in the docs that you are so amazingly on your own here it's not even funny. I don't want this to become a support nightmare. I can't tell you how much I'm absolutely not interested in spending hours trying to solve someone's problems with their custom app code in this territory. Like, to the point that if someone can't literally point to the line of code that needs to be fixed and provide a PR to fix it... issue immediately closed. That not interested. On the |
As you say, we are absolutely not going to get involved in figured out how to "drain" the resolver. But, worth noting that if someone disposes the outer lifetime scope before clearing the cache, they will not be able to resolve any objects from any more-nested per-request scopes after that Dispose call (if you recall in v6, I think, I added the check that walks up the scope hierarchy checking that no parent scopes have been disposed prior to resolve). So, if someone disposes of their "scope with custom types" before calling ReflectionCache, we will only fall foul of in-progress requests, but yeah, it's completely with the developers to make sure everything is stopped, and put big warning cones out around this stuff in the documentation. Regarding un-registration, my expectation is that We don't currently hold any |
Fair enough, I'm convinced. Allowing the caches to be cleared seems like a reasonable solution to the problem. |
I appreciate the correction regarding controller creation; I had incorrectly interpreted what I'd read and thought it was the underlying, built-in container used to resolve the controller, rather than just reflection. My most recent test setups don't include dynamic controller loading and unloading, but some of my earlier ones do, and I don't recall these leaking. I'll recheck. This is an actual use case for me (I need to support dynamically adding and removing controllers). I'm aware of AddControllersAsServices, and have been experimenting with and without this, as I haven't yet determined whether I'd like (or need) it turned on or not. You're completely correct regarding the synchronization problems: I'll have to cordon off the app and drain the request queue. I'll also need to stop the Quartz schedulers and wait for jobs to complete. My situation is even worse: in production environments there will be multiple servers drawing plugins from a shared source, and I never want these to be out of sync, either, so I'll need to build a mechanism to coordinate bringing them down together and forcing a reload. None of this is Autofac's problem; I only mention it because I need the mechanism anyway for other reasons and it'll handle the Autofac-specific part of the sync problem as a by-product. Your warnings regarding resolving IEnumerables in instances that outlive the plugin container are well-taken, and this is something I'll need to be extremely careful of. I ran afoul of that parent-not-disposed check as part of my proof-of-concept work on this. It's a completely reasonable limitation, but I needed to find a way to dispose the expired plugin container from within a request created from a lifetime scope nested from it. I ended up stuffing the expired container in a disposable bound to the nested / request lifetime scope. This is obviously dangerous if other requests are in-flight, but as discussed above I need a queue-drainer anyway. Because I have the luxury of having schedulers in the background, I can easily delegate this to a cleanup job that runs periodically and locates and cleans up expired lifetime scopes. But coordinating that with a type cache clear might be a problem. But that brings me to my next point. I'm a bit confused about something: having taken a quick look at the code again, all these caches are used in "GetOrAdd" scenarios. My read of this is that it's always safe to wipe these caches at any point in any request / use; all it'll mean is that a new entry is calc'd and added on the next miss. Am I overlooking some instance-equality checks of the funcs that are stored in these? Two otherwise-identical types from different ALCs are not considered the same type, so no entangling can occur. I should clarify that by "safe" I mean "will not cause in-flight logic to fail / crash"; clearing the cache during a request might immediately put the cache entry back, preventing unloading, and needlessly costing performance. If it's always safe, I could use a different approach in development environments where there's typically only one in-flight request. While I was hoping for a centralized and supported (with the caveat that synchronization and non-type references are not part of this support) solution like the cache clearing logic described above in the short term, what I guess I'm really hoping for at some point in the future is a move away from these statics, with the responsibility for type caching moved into the lifetime scopes. I understand this is not simple, though, and would come with at least minor performance costs as multiple cache tiers would have to be inspected rather than just one. Which probably means it should be optional. Which means more complexity. Is this something that might be considered in a future release, or is it just too niche? |
You point about As for your long-term hopes, the idea of tying generic type resolve caches to a particular life-time scope, sounds nice, I get it, but I'm strongly suspicious of the complicated changes needed to make it work, and the performance impact for making sure we use the right cache, and passing the cache reference down through the resolve/build pipeline, etc. As you say, this requirement is a little niche; and I'm not sure the added maintenance complexity is going to be worth it for 99% of users. I think adding a general "clean-up" option that can be run periodically to assist with preventing memory leaks is fine, but not sure I'd want to go beyond it. That said, if you can think of a way to make it work that doesn't add loads of maintenance complexity or drag down performance, I'd be happy to look at a PR for it. |
Thanks for the help and guidance; it's good to know that it's possible to use collectible ALCs and Autofac together. I'll stick with my relection-based resets for now, and will cut across to any official reset mechanism that appears in the future. I'd love to spend some time tackling the in-lifetime-scope solution, but can't at present. Would you like this issue left open (for the commit of the cache clear logic), or closed in favour of a better-organised feature request? |
It turns out my tests were too simple, and as they're expanding I'm running into other snags (i.e. references). I don't have clear details yet, but it's definitely more than just the reflection caches, unfortunately. I'll continue to investigate and share details in the hope they may help anyone attempting to do something similar. |
Are you saying that you are finding references still left behind inside Autofac when unloading? Or more generally in your app? |
I think so, yes, but I'm far from certain. The code is extremely sensitive to where exactly references reside, far more so than just regular closure capturing, and it has caused many false-positives. What I can tell at the moment, though, is that if I reuse a root container to create nested scopes that are used to resolve types from an unloadable ALC, things don't unload properly, but if the nested containers are created from equally-short-lived, non-shared parents then things unload correctly. I'll need to pick through some more snapshot diffs to find the culprit(s). |
Okay; let us know what you find. I've got a PR ready-to-go with the |
I've found the other source of unloading issues. When resolving a service from a nested lifetime scope, an entry is added to the ComponentRegistry of the nested lifetime scope and also that of the parent container. Specifically, the ComponentRegistry._registeredServicesTracker._serviceInfo of the parent container is left holding references to collectible types. Unlike the reflection statics, clearing this collection in entirety is not safe, but selectively kicking any entry referencing a type from a collectible assembly does the trick for me. I'm assuming this propagation to the parent container is deliberate, and is done for performance reasons. Is this the case? Assuming the behaviour is intentional, are you interested in going down a similar path with respect to providing an "advanced" method somewhere to allow people to clear problematic entries from a lifetime scope? |
Can you tell what the types are of that container.BeginLifetimeScope(b => b.RegisterType<T>()); ...then it shouldn't be passing things back up, I don't think. The type there is something like Or does registration info get passed up the stack regardless? |
For the root container it's a DefaultRegisteredServicesTracker, and for the nested lifetime scope it's a ScopeRestrictedRegisteredServicesTracker. I'm using a lambda to create the nested lifetime scope. If you do something like: var builder = new ContainerBuilder();
var container = builder.Build();
var nested = container.BeginLifetimeScope(s =>
{
s.RegisterType<SomeType>();
});
nested.Resolve<SomeType>(); then inspect container.ComponentRegistry._registeredServicesTracker._service info, you'll see there's a Service instance in there referencing SomeType. |
See, this is why I'm pleased I gave myself 20% leeway to be wrong. 🤦♂️ Instances of the types themselves from This is...a tricky one. The services tracker is really intended to be immutable to external entities, and a couple of components I believe make assumptions about the fact that an initialised The only real solution here is to never ask the parent about a given service. I think there is a way to do this, but my proof of concept that requires no Autofac changes means rather than a nested scope, having a separate container, and a custom registration source. I present the below having tested it "a little", but it does prevent the problem you're seeing. public class SharedComponent
{
}
public class LoadedComponent
{
public LoadedComponent(SharedComponent shared)
{
Shared = shared;
}
public SharedComponent Shared { get; }
}
[Fact]
public void BridgedContainers()
{
var outerBuilder = new ContainerBuilder();
outerBuilder.RegisterType<SharedComponent>().SingleInstance();
var outerContainer = outerBuilder.Build();
var innerBuilder = new ContainerBuilder();
// Add a source here to *optionally* pull services from the outer container.
// Defining an example predicate filter here that in reality would check whether the type is in
// the AssemblyLoadContext.Default.
// Note that the predicate will be called *once* for each service type.
innerBuilder.RegisterSource(new FilteringExternalScopeSource(
outerContainer.Resolve<ILifetimeScope>(),
s => s is TypedService ts && ts.ServiceType != typeof(LoadedComponent)));
innerBuilder.RegisterType<LoadedComponent>();
using var innerContainer = innerBuilder.Build();
var internalComponent = innerContainer.Resolve<LoadedComponent>();
// No entry for the `InternalComponent` type in the outerContainer registry source.
// But the shared instance is the same as the outer container singleton.
Assert.Equal(outerContainer.Resolve<SharedComponent>(), internalComponent.Shared);
}
// Custom registration source that invokes a predicate to know whether to check with the parent.
private class FilteringExternalScopeSource : IRegistrationSource
{
private readonly ISharingLifetimeScope _sharingScope;
private readonly Func<Service, bool> _predicate;
public FilteringExternalScopeSource(ILifetimeScope owningScope, Func<Service, bool> predicate)
{
if (owningScope is not ISharingLifetimeScope sharingScope)
{
throw new InvalidOperationException("Cannot use a scope that does not implement ISharingLifetimeScope.");
}
_sharingScope = sharingScope;
_predicate = predicate;
}
public IEnumerable<IComponentRegistration> RegistrationsFor(Service service, Func<Service, IEnumerable<ServiceRegistration>> registrationAccessor)
{
if (_predicate(service))
{
foreach (var registration in _sharingScope.ComponentRegistry.RegistrationsFor(service))
{
// Don't include adapters.
if (registration.Target == registration)
{
yield return new FilteredExternalRegistration(service, _sharingScope, registration);
}
}
}
}
public bool IsAdapterForIndividualComponents => false;
}
// This is a variant of the internal ExternalComponentRegistration class.
internal class FilteredExternalRegistration : ComponentRegistration
{
public FilteredExternalRegistration(Service service, ISharingLifetimeScope owningScope, IComponentRegistration target)
: base(
target.Id,
new NoOpActivator(target.Activator.LimitType),
new OverrideComponentLifetime(owningScope, target.Lifetime),
target.Sharing,
target.Ownership,
new[] { service },
target.Metadata,
target)
{
}
protected override IResolvePipeline BuildResolvePipeline(IComponentRegistryServices registryServices, IResolvePipelineBuilder pipelineBuilder)
{
// Just use the external pipeline.
return Target.ResolvePipeline;
}
private class NoOpActivator : IInstanceActivator
{
public NoOpActivator(Type limitType)
{
LimitType = limitType;
}
public Type LimitType { get; }
public void ConfigurePipeline(IComponentRegistryServices componentRegistryServices, IResolvePipelineBuilder pipelineBuilder)
{
// Should never be invoked.
throw new InvalidOperationException();
}
public void Dispose()
{
// Do not do anything here.
}
}
private class OverrideComponentLifetime : IComponentLifetime
{
private readonly ISharingLifetimeScope _startScopeOverride;
private readonly IComponentLifetime _targetLifetime;
public OverrideComponentLifetime(ISharingLifetimeScope startScopeOverride, IComponentLifetime targetLifetime)
{
_startScopeOverride = startScopeOverride;
_targetLifetime = targetLifetime;
}
public ISharingLifetimeScope FindScope(ISharingLifetimeScope mostNestedVisibleScope)
{
// The owning scope is in another container.
return _targetLifetime.FindScope(_startScopeOverride);
}
}
} Like I said, the above example where there's a separate container shouldn't be the "recommended" way to do things; it's more of a proof-of-concept. We need the separate container here because Autofac always adds an We could add another overload to |
Equally, going a little further we could add a method only in .NET5+ called Then you can create a nested scope from that isolated one with the custom registrations in it. But...that does make Autofac aware of |
I think it's interesting to support this ALC construct if we don't have to go into stuff like I'm getting a little close to the "sorry this isn't working, but we'll accept a PR" area. Maybe it'd be good to start a list of all the things that'd have to change to make this work. Blue sky, anything is possible, what needs to change? I don't know about others, but at least for me, seeing a summarized list sometimes helps frame the issue better than having to mentally aggregate across comments.
Hopefully I phrased that last one correctly. What else? Something is tickling the back of my brain saying something about registration sources maybe needing to be handled differently. Like, some, but not all, registration sources make it from parent to child scope and I wonder if it'd have to change to be all of them. Might be an opportunity to obsolete that It seems like actually fixing this could be big, the way changing the resolve mechanism to be a pipeline was big. Interesting, possibly valuable improvement to make lifetime scopes more isolated, but not five minutes' worth of work, either. Cache lookups would need to be able to fall back to parent lifetime scopes. I'm guessing there'd be a potentially non-trivial perf hit since every lifetime scope has the potential to need to refresh the scope-level cache with data that might have already been populated earlier if it was shared. It might be nice in some respects because I know having two different |
I'm not sure if I'm using it correctly, but replacing the nested containers with bridged containers didn't go well in my attempt to use the provided logic. It resolved fine, but the resultant setup doesn't track instances properly. Specifically, if the no-longer-nested container called through to the no-longer-root container to ask for an IDisposable, the disposable didn't get disposed when the no-longer-nested container was disposed. Apologies if this is a limitation that should have been obvious to me. More generally, as much as I want this stuff to work, I don't want to make Autofac more complicated or, worse, significantly slower in order to support it. Sure, it'd be great if this worked in vanilla Autofac, but from my perspective anything that works is acceptible, no matter how ugly. This includes violating privacy with reflection. If, after more extensive testing, what I'm doing at the moment continues to function correctly for my use cases, I'm happy taking it to production. If we wanted to find a home for attempts at this kind of thing, spinning up something like Autofac.Extras.Collectible might be a good way to concentrate efforts and gather additional info on where problems are encountered. .NET 6+ only. No reservations about knowledge of AssemblyLoadContexts. Thoughts? I should say I certainly do not want to talk anyone out of more ambitious changes that might solve multiple problems at once, including this one. But I'm unlikely to be in a position where I'll be able to contribute PRs for this kind of thing. |
A note on this issue; in #1341 we're moving all of our reflection caches from static fields to a single When the last I'm investigating a way to prevent services only present in nested scopes from being referenced in parent scope trackers, but no immediate promises on that one. |
Thanks for the heads up. Unfortunately, I don't think the described change will be helpful for my use case. The design I'm working with will keep a base / root container alive and referenced indefinitely, so the container count will never hit zero. How big a change would it be to allow the cache object for a given container to be set at construction time? I ask because I've got three layers of container: root, plugin, and request. I'm ok with the request containers "polluting" the plugin cache (I can manage this closely so it doesn't cause a problem), and I'm ok with paying the overhead of double-caching anything that winds up in both the root and plugin contexts. If such a dual-cache approach was feasible it would remove the need for cache surgery during plugin container replacement. |
I started out passing a cache object all the way down through the container, but unfortunately it resulted in some pretty extreme breaking changes to a bunch of public interfaces in a way that exposed way too much about our reflection caching to external users, so I pulled it back to the shared cache object. The PR in play means you can now call However, to be honest even what you describe is going to be a problem as long as the root container scope is holding a reference to services that were registered in child scopes. I'm investigating a mechanism where we can 'isolate' a scope, then everything can sit in one container , but it's a bit thorny, so will take a little longer, and won't overlap much with |
I've run into something that is either a memory leak or an unsupported use case, and I'm looking for guidance on which.
My application has a plugin model that loads assemblies into collectible AssemblyLoadContexts. It has a base lifetime scope built from core types loaded into the default ALC (including the Autofac types), and a nested lifetime scope built from types in assemblies in the collectible ALC. Requests are served from a third level of short-lived lifetime scopes based off the "current" plugin lifetime scope. Under certain conditions a new collectible ALC and plugin lifetime scope are constructed to replace the existing one, which is then disposed. Unfortunately, this leaks.
The source of the leaks is Autofac's reflection caches:
Autofac.Core.Activators.Reflection.ConstructorBinder.FactoryCache
Autofac.Core.Activators.Reflection.DefaultConstructorFinder.DefaultPublicConstructorsCache
Autofac.Util.TypeExtensions.IsGenericEnumerableInterfaceCache
Autofac.Util.TypeExtensions.IsGenericListOrCollectionInterfaceTypeCache
Autofac.Util.TypeExtensions.IsGenericTypeDefinedByCache
There may well be others, but using reflection to clear these caches fixes the leak in my test harness, and my ALCs successfully unload. To elaborate slightly, these are all static ConcurrentDictionaries that cache reflection information from the collectible ALCs, holding references and preventing unloading.
Is the intention that Autofac be usable with types in collectible ACLs? Is my hacky fix safe?
I'm happy to provide a test or test harness if this is considered a bug.
The text was updated successfully, but these errors were encountered: