Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Memory leak when using types in collectible AssemblyLoadContexts #1324

Closed
mburnell opened this issue Apr 8, 2022 · 25 comments · Fixed by #1350
Closed

Memory leak when using types in collectible AssemblyLoadContexts #1324

mburnell opened this issue Apr 8, 2022 · 25 comments · Fixed by #1350

Comments

@mburnell
Copy link

mburnell commented Apr 8, 2022

I've run into something that is either a memory leak or an unsupported use case, and I'm looking for guidance on which.

My application has a plugin model that loads assemblies into collectible AssemblyLoadContexts. It has a base lifetime scope built from core types loaded into the default ALC (including the Autofac types), and a nested lifetime scope built from types in assemblies in the collectible ALC. Requests are served from a third level of short-lived lifetime scopes based off the "current" plugin lifetime scope. Under certain conditions a new collectible ALC and plugin lifetime scope are constructed to replace the existing one, which is then disposed. Unfortunately, this leaks.

The source of the leaks is Autofac's reflection caches:

Autofac.Core.Activators.Reflection.ConstructorBinder.FactoryCache
Autofac.Core.Activators.Reflection.DefaultConstructorFinder.DefaultPublicConstructorsCache
Autofac.Util.TypeExtensions.IsGenericEnumerableInterfaceCache
Autofac.Util.TypeExtensions.IsGenericListOrCollectionInterfaceTypeCache
Autofac.Util.TypeExtensions.IsGenericTypeDefinedByCache

There may well be others, but using reflection to clear these caches fixes the leak in my test harness, and my ALCs successfully unload. To elaborate slightly, these are all static ConcurrentDictionaries that cache reflection information from the collectible ALCs, holding references and preventing unloading.

Is the intention that Autofac be usable with types in collectible ACLs? Is my hacky fix safe?

I'm happy to provide a test or test harness if this is considered a bug.

@alistairjevans
Copy link
Member

alistairjevans commented Apr 8, 2022

This is an interesting use-case, albeit one we don't actually support; I've always really struggled to get an ALC to actually unload personally, they're usually pretty stubborn.

It's certainly not a "bug", in that I wouldn't have expected it to work if someone asked me "does Autofac lose any references to types when the ALC unloads" . Autofac's reflection caching behaviour predates (by quite a long way) the idea of unloadable types.

I'd say that the safety of your reflection-based cache clear is up-for-debate; it's entirely not supported, and is liable to break between releases. It's also incomplete (I guess because only those caches are being used in your code). There's also:

  • AutowiringPropertyInjector.PropertySetters
  • AutowiringPropertyInjector.InjectableProperties

Also bear in mind that if you reset those caches using Clear the performance of Autofac will 'reset', as you may have guessed, and you'll have additional overheads each time you resolve a subsequent service anywhere.

How often are you loading/unloading ALCs? Is the memory leak severe?

@mburnell
Copy link
Author

mburnell commented Apr 8, 2022

Thanks for the quick response; I figured this might be the case. The world of unloadable types is new and rocky, and there are similar issues in ASPNET Core (e.g. dotnet/aspnetcore#21744). .NET itself had an issue until the latest service release (dotnet/runtime#1388) that prevented using things like Fluent NHibernate in the manner I've described.

What I can say is that I've managed to cobble together a Kestrel-based runtime that can dynamically load, register, use, dispose, and collect assemblies which are used to service both web requests and Quartz jobs. The default MS IOC container appears not to leak; even with Autofac fitted, the default IOC mechanism is being used to resolve ASPNET controllers (but not their constructor args) which are dynamically loaded and unloaded succesfully.

I could fall back to the default builder/container but I prefer Autofac's featureset. On top of that I'm dealing with a fairly large existing codebase that uses Autofac extensively, and many of the plugin types are externally maintained and use Autofac internally (and expose modules), so form part of the overall plugin contract, which I very much do not want to change.

I'm 100% OK with my hack breaking between releases, and understand any time I'm using reflection to violate field and class privacy I'm way off in "that's not supported" land. I guess what I'm really looking for is an indication of where I am on the scale from "you might be OK if test thoroughly and maintain your hacks" to "you are completely crazy and this will never work for a bunch of Autofac-specific reasons you have overlooked".

Thanks for identifying those other two collections. My test was exclusively constructor-based, so I didn't encounter those two.

As for how often I'm loading / unloading assemblies, this will vary pretty wildly depending on environment, with the expectation that it'll happen say 5 times a year in production environments and 30-200 times an hour in some development situations. While wiping those caches will result in a loss of performance it's nothing like restarting the whole process and reinit'ing the whole environment.

As for the severity of the leak, that really depends on how many assemblies are dragged into the collectible ALCs. I haven't pieced this all together in a realistic environment yet, but I'm expecting the leak to be in the single-digit megabytes per cycle at a minimum. This is based on the assumption that if even a single type can't unload, neither can the ALC, so all assemblies / types within it will remain.

As you point out, Autofac's caching logic significantly predates the idea of unloadable types. Times are changing, though, and the possibility of managing ephemeral types in nested lifetime scopes is pretty exciting, and potentially extremely powerful. Would the team be interested in considering this as a future feature? I'm not asking for the work to be done, just looking for an indication of whether the feature would be welcome. I'd understand if it wasn't (especially on the grounds of simplicity).

If it's something that people feel would be welcome in Autofac's future, and my hacks are not deemed the work of a madman, I'd be looking to stick with the hacks for now with a view to contributing to a supported solution in the future.

@alistairjevans
Copy link
Member

I'm not opposed to supporting some way of clearing internal caches. Scenarios where Autofac supports a plugin architecture, and the ability to load plugins is a pretty common usecase. Unloading is just less so.

I'm a little nervous about the idea of "scoping" caches to a particular lifetime scope; that would probably introduce a lot of cache duplication where nested lifetime scopes can't use the cache entries in their owned scope, and some overhead of creating new cache objects at nested scopes, for a scenario that's not super common.

Noodling on the problem though, I do wonder if there might be a relatively simple way to allow external code in advanced scenarios to say "hey, Autofac, can you please clear your various reflection caches?"

Something like a ReflectionCache.Clear method perhaps, and we hook notifications into that to clear our caches.

public delegate bool ReflectionCacheShouldClearPredicate(MemberInfo member);

public delegate void ReflectionCacheRegistrationCallback(ReflectionCacheShouldClearPredicate? predicate);

public static class ReflectionCache
{
    private static List<ReflectionCacheRegistrationCallback> _cacheRegistrations = new();

    // Make this public so Autofac code in integration libraries can register for reflection cache clearing.
    public static void Register(ReflectionCacheRegistrationCallback registrationCallback)
    {
        lock (_cacheRegistrations)
        {
            _cacheRegistrations.Add(registrationCallback);
        }
    }

    // Allow conditional clearing of reflection cache data, so only a single
    // ALC's type info has to be dropped (for example).
    public static void Clear(ReflectionCacheShouldClearPredicate? predicate = null)
    {
        lock (_cacheRegistrations)
        {
            foreach (var reg in _cacheRegistrations)
            {
                reg(predicate);
            }
        }
    }
}

internal class ReflectionCacheDictionary<TKey, TValue> : ConcurrentDictionary<TKey, TValue>
    where TKey : MemberInfo
{
    public ReflectionCacheConcurrentDictionary()
    {
        ReflectionCache.Register(CacheClear);
    }

    private void CacheClear(ReflectionCacheShouldClearPredicate? predicate)
    {
        if (predicate is null)
        {
            Clear();
            return;
        }

        foreach (var kvp in this)
        {
            if (predicate(kvp.Key))
            {
                TryRemove(kvp.Key, out _);
            }
        }
    }
}

// ... 

internal static class TypeExtensions
{
    private static readonly ReflectionCacheDictionary<Type, bool> IsGenericEnumerableInterfaceCache = new();

  // ...
}

By delegating responsibility for which types to remove from the cache, it means we don't need to understand how AssemblyLoadContext works in Autofac itself, but your code could call:

AssemblyLoadContext unloadingAlc = // from somewhere
ReflectionCache.Clear(t => unloadingAlc.Assemblies.Contains(t.Module.Assembly));

There are downsides to the above approach, in that it would mandate that you not resolve anything from that assembly load context while the cache is being cleared. We wouldn't synchronise, because it would interfere with other cache hits, so this functionality would be firmly in the "advanced" camp.

From a maintainer viewpoint, we would have to make sure we use that ReflectionCacheDictionary in all the right places, which is some extra overhead, but we don't tend to add new caches that often, so I don't have a huge problem with it.

@tillig
Copy link
Member

tillig commented Apr 8, 2022

I'm not entirely sure that would fix everything. I haven't gone spelunking yet, but I'm thinking about stuff like...

  • Singleton of a type in an ALC being unloaded (or which has a reference to a type that's being unloaded). The singleton cache would also need to be flushed, or folks would just have to "know" that they can inadvertently cause their own memory leaks by doing this. In the use case described above where the ALC is kept isolated to its own child lifetime scope this wouldn't be an issue, but it could be one of those foot-gun situations if the feature isn't described in excruciating detail in docs.
  • The cache of things being resolved from registration sources. We store the list of things that get resolved from a registration source so we don't have to re-query the source. That's all stored in an IRegisteredServicesTracker, which, by default, is stored with at the container level. (That's also the general cache of service-to-implementation mappings, IIRC.)

This is a lot of why the container is effectively immutable. In some cases it isn't just about flushing the cache, it's about, say, some factory resolving IEnumerable<IService> where there's a mix of IService implementations that should be kept and some that are going to be flushed when the ALC gets dropped. It somewhat invalidates the state of the thing holding that IEnumerable<IService>, which very well could invalidate the state of the whole application. Once it's resolved it's outside our control.

This has come up a lot in the past with folks wanting to do UWP apps that dynamically load and unload components. However, there's no structure in that framework really for having nested-upon-nested lifetime scopes, so even just wiring it up is a challenge; and the root "application" object (from what I understand) is where folks end up "holding" the set of app components - an app level singleton that holds types that need to be unloaded.

I'm not saying this is literally technologically insurmountable, but maybe that even if it's supported, it'll be amazingly hard to use correctly.

Re: the default MS container and controllers - it's not that the default MS container is used to resolve controllers, it's the DefaultControllerActivator is using reflection to create the controller and independently resolving the constructor args. If you use AddControllersAsServices at app startup you get a different activator that actually does resolve the controller. The reflection-based creation is why it's not holding onto the type references, though they do cache the lambdas that invoke the constructors. I'm guessing there's no memory leak going on because the controllers and interface-based dependencies you're using are all in a context that's not getting unloaded. If any controllers are using dependencies that don't get unloaded, I'd bet you'd start seeing that memory leak again.

@alistairjevans
Copy link
Member

alistairjevans commented Apr 8, 2022

I'm reasonably confident (80%?) that if a lifetime scope is configured in BeginLifetimeScope using the callback, then a new ScopeRestrictedRegisteredServicesTracker (derived from DefaultRegisteredServicesTracker is created), which ensures that any singletons registered just for that scope are only held until the lifetime scope is disposed.

Equally, since the created lifetime scope has it's own instance of the services tracker, computed sets of IEnumerable<TService> include services from the parent scope, but those computed lists are discarded along with the generated scope.

So all shared instances and computed registration data for a lifetime scope that has had additional modules loaded into it in BeginLifetimeScope should be discarded when the scope is. I'm pretty sure any other behaviour would be a bug.

@tillig
Copy link
Member

tillig commented Apr 8, 2022

Maybe that ScopeRestrictedRegisteredServicesTracker is the key. I suppose if you're doing ALC disposal you'd have to register the newly loaded things at the time of scope creation with that lambda. As long as you don't intermingle things you want to unload when you build the base container you would probably be OK.

More brainstorming: InstancePerMatchingLifetimeScope issues? Thinking maybe no more than what we'd already have.

I keep thinking about the synchronization problems here. To unload/update the ALC you'd have to cordon the app, drain all the traffic, ensure there are no requests going on, kill the ALC lifetime scope, create a new one, and uncordon the app so it can take requests again. During that time, you'd need to drop the ALC lifetime scope first and then flush the caches, in that order, and without doing any resolutions that might incorrectly update the caches between. Admittedly "not our problem," but not obvious or easy. (Some of this was already mentioned before, but I think it bears repeating.)

I'm also thinking about weird inadvertent captive dependencies we don't have control over, like someone holding an invalid IEnumerable<T> somewhere that they resolved from the ALC lifetime scope and then try to unload. That's the stuff that invalidates the application state. Again, "not our problem."

Hmmm. I guess if the whole solution on our end is to provide something to flush caches and let app devs figure the rest out for themselves... yeah, OK.

I think I'll want to really, really clearly explain in the docs that you are so amazingly on your own here it's not even funny. I don't want this to become a support nightmare. I can't tell you how much I'm absolutely not interested in spending hours trying to solve someone's problems with their custom app code in this territory. Like, to the point that if someone can't literally point to the line of code that needs to be fixed and provide a PR to fix it... issue immediately closed. That not interested.

On the ReflectionCache, is there a need to allow any caches to self-unregister? That is, after the cache is cleared it should be removed from the list of caches so we're not holding references to them? Any caches that are at the lifetime scope level? I ask because something is tickling the back of my mind - at one point I tried setting up tracking between parent and child lifetime scopes such that disposing the parent would dispose the child, but holding that reference (of course) ended up being a leak. (Yeah, we started looking at WeakReference and similar, but it got really complicated really quickly.)

@alistairjevans
Copy link
Member

As you say, we are absolutely not going to get involved in figured out how to "drain" the resolver.

But, worth noting that if someone disposes the outer lifetime scope before clearing the cache, they will not be able to resolve any objects from any more-nested per-request scopes after that Dispose call (if you recall in v6, I think, I added the check that walks up the scope hierarchy checking that no parent scopes have been disposed prior to resolve).

So, if someone disposes of their "scope with custom types" before calling ReflectionCache, we will only fall foul of in-progress requests, but yeah, it's completely with the developers to make sure everything is stopped, and put big warning cones out around this stuff in the documentation.

Regarding un-registration, my expectation is that ReflectionCache.Register should only be used by static readonly fields in the Autofac codebase, and nothing else; hence the lack of an Unregister. I was tempted to make the Register call an internal method, but I figured it might be useful for integrations to be able to register, so made it public.

We don't currently hold any MemberInfo/Type-keyed caches at an instance scope level that reference types used in a nested scope, so any other stores of type data will be discarded when the scope is. It's just those extension methods (and similar global helpers) that made the (at the time very reasonable) assumption that types will never be unloaded, so they hung onto that type metadata at a global level.

@tillig
Copy link
Member

tillig commented Apr 8, 2022

Fair enough, I'm convinced. Allowing the caches to be cleared seems like a reasonable solution to the problem.

@mburnell
Copy link
Author

mburnell commented Apr 8, 2022

I appreciate the correction regarding controller creation; I had incorrectly interpreted what I'd read and thought it was the underlying, built-in container used to resolve the controller, rather than just reflection. My most recent test setups don't include dynamic controller loading and unloading, but some of my earlier ones do, and I don't recall these leaking. I'll recheck. This is an actual use case for me (I need to support dynamically adding and removing controllers). I'm aware of AddControllersAsServices, and have been experimenting with and without this, as I haven't yet determined whether I'd like (or need) it turned on or not.

You're completely correct regarding the synchronization problems: I'll have to cordon off the app and drain the request queue. I'll also need to stop the Quartz schedulers and wait for jobs to complete. My situation is even worse: in production environments there will be multiple servers drawing plugins from a shared source, and I never want these to be out of sync, either, so I'll need to build a mechanism to coordinate bringing them down together and forcing a reload. None of this is Autofac's problem; I only mention it because I need the mechanism anyway for other reasons and it'll handle the Autofac-specific part of the sync problem as a by-product.

Your warnings regarding resolving IEnumerables in instances that outlive the plugin container are well-taken, and this is something I'll need to be extremely careful of.

I ran afoul of that parent-not-disposed check as part of my proof-of-concept work on this. It's a completely reasonable limitation, but I needed to find a way to dispose the expired plugin container from within a request created from a lifetime scope nested from it. I ended up stuffing the expired container in a disposable bound to the nested / request lifetime scope. This is obviously dangerous if other requests are in-flight, but as discussed above I need a queue-drainer anyway. Because I have the luxury of having schedulers in the background, I can easily delegate this to a cleanup job that runs periodically and locates and cleans up expired lifetime scopes. But coordinating that with a type cache clear might be a problem. But that brings me to my next point.

I'm a bit confused about something: having taken a quick look at the code again, all these caches are used in "GetOrAdd" scenarios. My read of this is that it's always safe to wipe these caches at any point in any request / use; all it'll mean is that a new entry is calc'd and added on the next miss. Am I overlooking some instance-equality checks of the funcs that are stored in these? Two otherwise-identical types from different ALCs are not considered the same type, so no entangling can occur. I should clarify that by "safe" I mean "will not cause in-flight logic to fail / crash"; clearing the cache during a request might immediately put the cache entry back, preventing unloading, and needlessly costing performance. If it's always safe, I could use a different approach in development environments where there's typically only one in-flight request.

While I was hoping for a centralized and supported (with the caveat that synchronization and non-type references are not part of this support) solution like the cache clearing logic described above in the short term, what I guess I'm really hoping for at some point in the future is a move away from these statics, with the responsibility for type caching moved into the lifetime scopes. I understand this is not simple, though, and would come with at least minor performance costs as multiple cache tiers would have to be inspected rather than just one. Which probably means it should be optional. Which means more complexity. Is this something that might be considered in a future release, or is it just too niche?

@alistairjevans
Copy link
Member

You point about GetOrAdd is accurate, clearing the cache isn't going to break any in-progress resolves (but might slow them down). But Autofac cannot guarantee the cache will be cleared after ReflectionCache.Clear exits, which is why the big warning signs will be needed; less aware consumers of the API might make faulty assumptions about it, and raise bugs saying "I still can't unload my ALC after calling ReflectionCache.Clear".

As for your long-term hopes, the idea of tying generic type resolve caches to a particular life-time scope, sounds nice, I get it, but I'm strongly suspicious of the complicated changes needed to make it work, and the performance impact for making sure we use the right cache, and passing the cache reference down through the resolve/build pipeline, etc.

As you say, this requirement is a little niche; and I'm not sure the added maintenance complexity is going to be worth it for 99% of users.

I think adding a general "clean-up" option that can be run periodically to assist with preventing memory leaks is fine, but not sure I'd want to go beyond it.

That said, if you can think of a way to make it work that doesn't add loads of maintenance complexity or drag down performance, I'd be happy to look at a PR for it.

@mburnell
Copy link
Author

Thanks for the help and guidance; it's good to know that it's possible to use collectible ALCs and Autofac together.

I'll stick with my relection-based resets for now, and will cut across to any official reset mechanism that appears in the future. I'd love to spend some time tackling the in-lifetime-scope solution, but can't at present.

Would you like this issue left open (for the commit of the cache clear logic), or closed in favour of a better-organised feature request?

@mburnell
Copy link
Author

It turns out my tests were too simple, and as they're expanding I'm running into other snags (i.e. references). I don't have clear details yet, but it's definitely more than just the reflection caches, unfortunately. I'll continue to investigate and share details in the hope they may help anyone attempting to do something similar.

@alistairjevans
Copy link
Member

Are you saying that you are finding references still left behind inside Autofac when unloading? Or more generally in your app?

@mburnell
Copy link
Author

mburnell commented Apr 11, 2022

I think so, yes, but I'm far from certain. The code is extremely sensitive to where exactly references reside, far more so than just regular closure capturing, and it has caused many false-positives. What I can tell at the moment, though, is that if I reuse a root container to create nested scopes that are used to resolve types from an unloadable ALC, things don't unload properly, but if the nested containers are created from equally-short-lived, non-shared parents then things unload correctly. Disposing Collecting the shared root container allows collection of everything.

I'll need to pick through some more snapshot diffs to find the culprit(s).

@alistairjevans
Copy link
Member

Okay; let us know what you find. I've got a PR ready-to-go with the ReflectionCache.Clear changes, but I'm reluctant to add that if it doesn't actually solve the problem of allowing unload.

@mburnell
Copy link
Author

I've found the other source of unloading issues. When resolving a service from a nested lifetime scope, an entry is added to the ComponentRegistry of the nested lifetime scope and also that of the parent container. Specifically, the ComponentRegistry._registeredServicesTracker._serviceInfo of the parent container is left holding references to collectible types. Unlike the reflection statics, clearing this collection in entirety is not safe, but selectively kicking any entry referencing a type from a collectible assembly does the trick for me.

I'm assuming this propagation to the parent container is deliberate, and is done for performance reasons. Is this the case?

Assuming the behaviour is intentional, are you interested in going down a similar path with respect to providing an "advanced" method somewhere to allow people to clear problematic entries from a lifetime scope?

@tillig
Copy link
Member

tillig commented Apr 11, 2022

Can you tell what the types are of that _registeredServicesTracker holding references? We use two different ones, and if you are starting a lifetime scope with a lambda...

container.BeginLifetimeScope(b => b.RegisterType<T>());

...then it shouldn't be passing things back up, I don't think. The type there is something like ScopeRestrictedRegisteredServicesTracker . But if you aren't using the lambda, it's just DefaultRegisteredServicesTracker and that one does get shared. This is the stuff I was mentioning earlier about singletons or instance-per-scope things inadvertently getting references and @alistairjevans was thinking it might not be a problem. And... it might not be a problem, but it does imply the use of that lambda registration at each level.

Or does registration info get passed up the stack regardless?

@mburnell
Copy link
Author

mburnell commented Apr 11, 2022

For the root container it's a DefaultRegisteredServicesTracker, and for the nested lifetime scope it's a ScopeRestrictedRegisteredServicesTracker. I'm using a lambda to create the nested lifetime scope. If you do something like:

var builder = new ContainerBuilder();
var container = builder.Build();
var nested = container.BeginLifetimeScope(s =>
{
    s.RegisterType<SomeType>();
});

nested.Resolve<SomeType>();

then inspect container.ComponentRegistry._registeredServicesTracker._service info, you'll see there's a Service instance in there referencing SomeType.

@alistairjevans
Copy link
Member

I'm reasonably confident (80%?)...

See, this is why I'm pleased I gave myself 20% leeway to be wrong. 🤦‍♂️ Instances of the types themselves from nested are not held in container, however, when the nested scope's registry tracker determines the set of implementations for SomeType, it consults the ExternalRegistrySource to see if the parent registry is meant to provide any implementations for SomeType. The source checks the container registry for registrations, and finds none, but the container's DefaultRegisteredServicesTracker holds a dictionary entry keyed on SomeType indicating that it found no implementations of that service. Worth noting that if the service type and component type are different, only the service type would be held in the parent scope.

This is...a tricky one. The services tracker is really intended to be immutable to external entities, and a couple of components I believe make assumptions about the fact that an initialised ServiceRegistrationInfo does not mutate after creation.

The only real solution here is to never ask the parent about a given service.

I think there is a way to do this, but my proof of concept that requires no Autofac changes means rather than a nested scope, having a separate container, and a custom registration source. I present the below having tested it "a little", but it does prevent the problem you're seeing.

public class SharedComponent
{
}

public class LoadedComponent
{
  public LoadedComponent(SharedComponent shared)
  {
      Shared = shared;
  }

  public SharedComponent Shared { get; }
}

[Fact]
public void BridgedContainers()
{
  var outerBuilder = new ContainerBuilder();

  outerBuilder.RegisterType<SharedComponent>().SingleInstance();

  var outerContainer = outerBuilder.Build();

  var innerBuilder = new ContainerBuilder();

  // Add a source here to *optionally* pull services from the outer container.
  // Defining an example predicate filter here that in reality would check whether the type is in
  // the AssemblyLoadContext.Default.
  // Note that the predicate will be called *once* for each service type.
  innerBuilder.RegisterSource(new FilteringExternalScopeSource(
      outerContainer.Resolve<ILifetimeScope>(),
      s => s is TypedService ts && ts.ServiceType != typeof(LoadedComponent)));

  innerBuilder.RegisterType<LoadedComponent>();

  using var innerContainer = innerBuilder.Build();

  var internalComponent = innerContainer.Resolve<LoadedComponent>();

  // No entry for the `InternalComponent` type in the outerContainer registry source.
  // But the shared instance is the same as the outer container singleton.
  Assert.Equal(outerContainer.Resolve<SharedComponent>(), internalComponent.Shared);
}

// Custom registration source that invokes a predicate to know whether to check with the parent.
private class FilteringExternalScopeSource : IRegistrationSource
{
  private readonly ISharingLifetimeScope _sharingScope;
  private readonly Func<Service, bool> _predicate;

  public FilteringExternalScopeSource(ILifetimeScope owningScope, Func<Service, bool> predicate)
  {
      if (owningScope is not ISharingLifetimeScope sharingScope)
      {
          throw new InvalidOperationException("Cannot use a scope that does not implement ISharingLifetimeScope.");
      }

      _sharingScope = sharingScope;
      _predicate = predicate;
  }

  public IEnumerable<IComponentRegistration> RegistrationsFor(Service service, Func<Service, IEnumerable<ServiceRegistration>> registrationAccessor)
  {
      if (_predicate(service))
      {
          foreach (var registration in _sharingScope.ComponentRegistry.RegistrationsFor(service))
          {
              // Don't include adapters.
              if (registration.Target == registration)
              {
                  yield return new FilteredExternalRegistration(service, _sharingScope, registration);
              }
          }
      }
  }

  public bool IsAdapterForIndividualComponents => false;
}

// This is a variant of the internal ExternalComponentRegistration class.
internal class FilteredExternalRegistration : ComponentRegistration
{
  public FilteredExternalRegistration(Service service, ISharingLifetimeScope owningScope, IComponentRegistration target)
      : base(
            target.Id,
            new NoOpActivator(target.Activator.LimitType),
            new OverrideComponentLifetime(owningScope, target.Lifetime),
            target.Sharing,
            target.Ownership,
            new[] { service },
            target.Metadata,
            target)
  {
  }

  protected override IResolvePipeline BuildResolvePipeline(IComponentRegistryServices registryServices, IResolvePipelineBuilder pipelineBuilder)
  {
      // Just use the external pipeline.
      return Target.ResolvePipeline;
  }

  private class NoOpActivator : IInstanceActivator
  {
      public NoOpActivator(Type limitType)
      {
          LimitType = limitType;
      }

      public Type LimitType { get; }

      public void ConfigurePipeline(IComponentRegistryServices componentRegistryServices, IResolvePipelineBuilder pipelineBuilder)
      {
          // Should never be invoked.
          throw new InvalidOperationException();
      }

      public void Dispose()
      {
          // Do not do anything here.
      }
  }

  private class OverrideComponentLifetime : IComponentLifetime
  {
      private readonly ISharingLifetimeScope _startScopeOverride;
      private readonly IComponentLifetime _targetLifetime;

      public OverrideComponentLifetime(ISharingLifetimeScope startScopeOverride, IComponentLifetime targetLifetime)
      {
          _startScopeOverride = startScopeOverride;
          _targetLifetime = targetLifetime;
      }

      public ISharingLifetimeScope FindScope(ISharingLifetimeScope mostNestedVisibleScope)
      {
          // The owning scope is in another container.
          return _targetLifetime.FindScope(_startScopeOverride);
      }
  }
}

Like I said, the above example where there's a separate container shouldn't be the "recommended" way to do things; it's more of a proof-of-concept.

We need the separate container here because Autofac always adds an ExternalRegistrySource to every configured nested lifetime scope, which we actually need to swap for a filtering one.

We could add another overload to BeginLifetimeScope to take the predicate for filtering in additional to the configuration callback, but I will confess to being a little wary of adding another overload to BeginLifetimeScope.

@alistairjevans
Copy link
Member

Equally, going a little further we could add a method only in .NET5+ called ILifetimeScope BeginIsolatedLifetimeScope(AssemblyLoadContext forLoadContext), or something, which filters out any services inside that are from the provided context from leaking to the parent scope.

Then you can create a nested scope from that isolated one with the custom registrations in it.

But...that does make Autofac aware of AssemblyLoadContext, which I don't love.

@tillig
Copy link
Member

tillig commented Apr 12, 2022

I think it's interesting to support this ALC construct if we don't have to go into stuff like BeginIsolatedLifetimeScope. I am super not interested in wholly new code paths dedicated to this. I kinda figured dumping the whole container would be the only way to "unload" plugins, so having two containers like that new example makes sense. Nuke it from orbit, it's the only way to be sure.

I'm getting a little close to the "sorry this isn't working, but we'll accept a PR" area.

Maybe it'd be good to start a list of all the things that'd have to change to make this work. Blue sky, anything is possible, what needs to change? I don't know about others, but at least for me, seeing a summarized list sometimes helps frame the issue better than having to mentally aggregate across comments.

  • Ability to clear caches or isolate caches at the lifetime scope level:
    • Autofac.Core.Activators.Reflection.AutowiringPropertyInjector.PropertySetters
    • Autofac.Core.Activators.Reflection.AutowiringPropertyInjector.InjectableProperties
    • Autofac.Core.Activators.Reflection.ConstructorBinder.FactoryCache
    • Autofac.Core.Activators.Reflection.DefaultConstructorFinder.DefaultPublicConstructorsCache
    • Autofac.Util.TypeExtensions.IsGenericEnumerableInterfaceCache
    • Autofac.Util.TypeExtensions.IsGenericListOrCollectionInterfaceTypeCache
    • Autofac.Util.TypeExtensions.IsGenericTypeDefinedByCache
  • The IRegisteredServiceTracker implementation can hold references to types from parent lifetime scopes, but parent lifetime scopes can't hold references to child scope types - it needs to flow one way, hierarchically, just like lifetime scopes.

Hopefully I phrased that last one correctly.

What else?

Something is tickling the back of my brain saying something about registration sources maybe needing to be handled differently. Like, some, but not all, registration sources make it from parent to child scope and I wonder if it'd have to change to be all of them. Might be an opportunity to obsolete that IsAdapterForIndividualComponents property which is so poorly named. I never do remember what that's actually for without spelunking.

It seems like actually fixing this could be big, the way changing the resolve mechanism to be a pipeline was big. Interesting, possibly valuable improvement to make lifetime scopes more isolated, but not five minutes' worth of work, either. Cache lookups would need to be able to fall back to parent lifetime scopes. I'm guessing there'd be a potentially non-trivial perf hit since every lifetime scope has the potential to need to refresh the scope-level cache with data that might have already been populated earlier if it was shared.

It might be nice in some respects because I know having two different IRegisteredServiceTracker implementations has been a painful thing in the past - since you get a different one based on the BeginLifetimeScope call you make, we've seen behavior differences where sometimes something works with BeginLifetimeScope() but doesn't work with BeginLifetimeScope(Action<ContainerBuilder>). If the solution ended up unifying the tracker mechanism, it'd remove some of that complexity [in favor of different complexity?].

@mburnell
Copy link
Author

I'm not sure if I'm using it correctly, but replacing the nested containers with bridged containers didn't go well in my attempt to use the provided logic. It resolved fine, but the resultant setup doesn't track instances properly. Specifically, if the no-longer-nested container called through to the no-longer-root container to ask for an IDisposable, the disposable didn't get disposed when the no-longer-nested container was disposed. Apologies if this is a limitation that should have been obvious to me.

More generally, as much as I want this stuff to work, I don't want to make Autofac more complicated or, worse, significantly slower in order to support it. Sure, it'd be great if this worked in vanilla Autofac, but from my perspective anything that works is acceptible, no matter how ugly. This includes violating privacy with reflection. If, after more extensive testing, what I'm doing at the moment continues to function correctly for my use cases, I'm happy taking it to production.

If we wanted to find a home for attempts at this kind of thing, spinning up something like Autofac.Extras.Collectible might be a good way to concentrate efforts and gather additional info on where problems are encountered. .NET 6+ only. No reservations about knowledge of AssemblyLoadContexts. Thoughts?

I should say I certainly do not want to talk anyone out of more ambitious changes that might solve multiple problems at once, including this one. But I'm unlikely to be in a position where I'll be able to contribute PRs for this kind of thing.

@alistairjevans
Copy link
Member

A note on this issue; in #1341 we're moving all of our reflection caches from static fields to a single ReflectionCacheSet.Shared object.

When the last Container is GC'ed, all type references it uses will also go away, so at the very least we will be able to say "once the Container has been collected, you should be able to unload any AssemblyLoadContexts that were used".

I'm investigating a way to prevent services only present in nested scopes from being referenced in parent scope trackers, but no immediate promises on that one.

@mburnell
Copy link
Author

Thanks for the heads up. Unfortunately, I don't think the described change will be helpful for my use case. The design I'm working with will keep a base / root container alive and referenced indefinitely, so the container count will never hit zero. How big a change would it be to allow the cache object for a given container to be set at construction time? I ask because I've got three layers of container: root, plugin, and request. I'm ok with the request containers "polluting" the plugin cache (I can manage this closely so it doesn't cause a problem), and I'm ok with paying the overhead of double-caching anything that winds up in both the root and plugin contexts. If such a dual-cache approach was feasible it would remove the need for cache surgery during plugin container replacement.

@alistairjevans
Copy link
Member

I started out passing a cache object all the way down through the container, but unfortunately it resulted in some pretty extreme breaking changes to a bunch of public interfaces in a way that exposed way too much about our reflection caching to external users, so I pulled it back to the shared cache object.

The PR in play means you can now call ReflectionCacheSet.Shared.Clear to empty (either entirely or conditionally) the set of reflection caches, but you will need to explicitly say "clear" when you unload a plugin (right now).

However, to be honest even what you describe is going to be a problem as long as the root container scope is holding a reference to services that were registered in child scopes.

I'm investigating a mechanism where we can 'isolate' a scope, then everything can sit in one container , but it's a bit thorny, so will take a little longer, and won't overlap much with
#1341.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
3 participants