Skip to content

WeakReference to free up memory automatically #7998

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Open
bOmBeLq opened this issue Jan 21, 2020 · 8 comments
Open

WeakReference to free up memory automatically #7998

bOmBeLq opened this issue Jan 21, 2020 · 8 comments
Labels

Comments

@bOmBeLq
Copy link

bOmBeLq commented Jan 21, 2020

Feature Request

Q A
New Feature yes
RFC yes
BC Break yes (or no if configurable)

Summary

So php 7.4 introduces WeakReference. AFAIK UnitOfWork keeps references to all fetched objects and that's why if you want to free up memory you have to explicitly detach them or clear the entity manager. I understand that this is to increase performance and not query for objects if they ware already fetched which totally makes sense.

I'm wondering if it would make any sense to use WeakReference to keep reference to fetched objects. As long as object is used anywhere outside of ORM it will be kept in memory and re-used from the ref. But if app context does not use entities anymore they would be automatically removed from memory.
Do you see any potential problems with such approach?

Auto-detaching will probably lead to performance drop in some specific cases - should this be considered BC break in that case? This feature could be configurable (on/off) as well.

@jvasseur
Copy link
Contributor

I thought of that to, but freeing references a long as an object isn't used anymore would work because the unit of work needs to keep a reference of all objects to generate the changeset on flush.

For example

$entity = $em->find(...);

$entity->setFoo($newFoo);

unset($entity);

$em->flush();

If the unit of work "forgot" about $entity when we call unset on it, the flush won't save any changes.

@mvorisek
Copy link
Contributor

mvorisek commented Feb 2, 2020

@jvasseur This can be handled by __destruct of some wrapper class. If object is dereferenced (by refcount = 0), wrapper destoy method will be called and it will check if flush is needed (i.e. needs to stay in the memory) - if yes, the wrapper object will be recreated and the reference to the wrapper instance will be saved (otherwise it will be destroyed immediatelly), if no, it will be simply freed.

Related with: php/php-src#4882 (comment) probably a RFC for register_object_destroy_callback will be much better choise which will allow any managers like EntityManager to register a destroy callback on any object, allow to attach (i.e. add reference to) the object and cancel so processing of the destruct if needed.

@jvasseur
Copy link
Contributor

jvasseur commented Feb 2, 2020

@mvorisek that's a possible solution but it has some performances implication since it would means changes will be tracked in modified objects even if we never flush them.

@bOmBeLq
Copy link
Author

bOmBeLq commented Feb 3, 2020

If the unit of work "forgot" about $entity when we call unset on it, the flush won't save any changes.

I would say that this can be expected behavior. A tradeoff for auto-detach and a BC break for sure in that case.
If flush is called on objects which are not used in app context anymore then I suppouse this usually is some app design mistake.

$this->someService->updateSomeThingsButDontFlush();
$this->em->flush(); // what am I flushing again?

In above example service itself should flush the changes.
Eventually this may be turned off explictly if needed

$this->em->disableGarbageCollector();
$this->someService->updateSomeThingsButDontFlush();
$this->em->flush(); // just flushing complicated changes made in someService

That being said I'm still not sure if that is good idea. Just considering.

@mvorisek
Copy link
Contributor

mvorisek commented Feb 3, 2020

@bOmBeLq I do not think that auto GC any changed entities which should / can be flushed is a good idea. Also GC not changed entities should be turned off by default as if these entities are loaded again, they can contain different values (if they are not loaded in same transaction and repeatable read or better transaction isolation).

@jvasseur
Copy link
Contributor

jvasseur commented Feb 3, 2020

$this->someService->updateSomeThingsButDontFlush();
$this->em->flush(); // what am I flushing again?

I'm using this pattern in a lot of cases, basically services are responsible for business logic but never commit (flush) the changes that I consider the responsibility of the controller that can then batch multiple change in the same flush to ensure they are done at the same time (or can decide to not flush them if needed).

That being said I'm still not sure if that is good idea. Just considering.

Same here, I'm not sure if either doing it or not is a good idea, I'm just listing problems I see in going in this direction so that everyone can make the best decision possible.

@beberlei
Copy link
Member

We have a started discussion internally about a similar approach using WeakMap and WeakReference for different parts of the UnitOfWork and also documented the downsides with the lost update in case a changed update goes out of scope (for example during a batch update loop).

Nothing has been decided yet, but its something we have on our radar.

@mpdude
Copy link
Contributor

mpdude commented Feb 29, 2024

Currently debugging a batch processing script with excessive memory usage, so I came across this.

Challenges: We cannot use WeakMap for the identity map directly. It may happen that an entity is updated, but no other references kept in userland (outside the ORM). In that case we must not simply drop the reference, but keep it until the flush() operation unless instructed to do otherwise. From the identity map/UoW point of view, we also cannot easily see whether an entity has been changed, and we also do not notice when it changes (to put some kind of lock in place or similar).

Next challenge: What if we have one-to-many associations? If the identity map used a WeakMap, but both the collection on the "one" side as well as all the owning side backreferences use normal references, would this suffice of make sure the entities are destructed? Maybe the GC could handle this case (weakmap reference to a cyclic "normal" reference situation), but I'd need to check.

Update: If we have cyclic references between to objects, and this group is referred to only through a WeakReference, then gc_collect_cycles() can collect it.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants