-
Notifications
You must be signed in to change notification settings - Fork 190
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Ensure symlinked realpaths are not loaded twice #3189
base: master
Are you sure you want to change the base?
Ensure symlinked realpaths are not loaded twice #3189
Conversation
closes oracle#3138 Use a hash to cache realpath lookups and another to cache realpath -> feature similar to what was done in CRuby: https://github.com/ruby/ruby/blob/499eb3990faeaac2603787f2a41b2d9625e180dc/load.c#L358-L402 https://github.com/ruby/ruby/blob/499eb3990faeaac2603787f2a41b2d9625e180dc/load.c#L1195-L1289
1df367c
to
89e47ef
Compare
Design-wise that change looks pretty hacky to me. |
@@ -140,6 +143,16 @@ def self.find_feature_or_file(feature, use_feature_provided = true) | |||
end | |||
end | |||
|
|||
def self.load_unless_realpath_loaded(feature, path) | |||
# TODO: does this need to be synchronized? | |||
realpath = (@features_realpath_cache[path] ||= File.realpath(path) || path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK File.realpath never returns false|nil, always a String, or an exception if some part does not exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That did seem to be the case for me but I wasn't sure if it was true on all platforms.
I think I was looking at this C code and it made me want to program defensively (the semantics are probably different, though).
https://github.com/ruby/ruby/blob/7ce4b716bdb5bcfc8b30ffcd034ce7aded1f72b9/load.c#L87-L88
@@ -140,6 +143,16 @@ def self.find_feature_or_file(feature, use_feature_provided = true) | |||
end | |||
end | |||
|
|||
def self.load_unless_realpath_loaded(feature, path) | |||
# TODO: does this need to be synchronized? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes of course, you could use a TruffleRuby::ConcurrentMap
.
But I think this code would need to move to Java anyway, because we cannot have an extra backtrace entry for require
, everything must happen in Primitive.load_feature
(or another Primitive reusing it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i was wondering if it should be under with_synchronized_features
.
That's a good point, though, might be able to do the whole change in java 🤔
@@ -35,6 +35,9 @@ module FeatureLoader | |||
# A snapshot of $LOADED_FEATURES, to check if the @loaded_features_index cache is up to date. | |||
@loaded_features_version = -1 | |||
|
|||
@features_realpath_cache = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this cache is correct, for instance it seems broken if CWD changes or $LOAD_PATH changes.
Probably that part shouldn't be cached.
You could cache absolute_path->realpath though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, we probably want absolute path as the key.
@@ -292,13 +305,24 @@ def self.get_loaded_features_index | |||
unless @loaded_features_version == $LOADED_FEATURES.version | |||
raise '$LOADED_FEATURES is frozen; cannot append feature' if $LOADED_FEATURES.frozen? | |||
@loaded_features_index.clear | |||
previous_realpaths = @features_realpath_cache.dup | |||
@features_realpath_cache.clear | |||
@loaded_features_realpaths.clear |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this cache never needs to be cleared.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it was necessary to satisfy the specs, and I saw the CRuby code doing something simliar:
https://github.com/ruby/ruby/blob/7ce4b716bdb5bcfc8b30ffcd034ce7aded1f72b9/load.c#L363-L364
https://github.com/ruby/ruby/blob/7ce4b716bdb5bcfc8b30ffcd034ce7aded1f72b9/load.c#L377-L384
If #3138 is not really an issue in practice I would probably delay this. |
In my testing it seemed that the first time the realpath was stored it was saved to LOADED_FEATURES as whatever was actually requested. I tried to follow what the ruby C code was doing, but it's probably more complicated than what I have so far here.
Might be worth a discussion 👍 |
Action item for me here is I need to file a CRuby issue to suggest using realpath for everything, it would be so much simpler and consistent, and have that discussed in a dev meeting. Also would be useful to get an answer to #3138 (comment), cc @nirvdrum |
closes #3138
Use a hash to cache realpath lookups
and another to cache realpath -> feature.
This is based off of what I was reading in the CRuby code:
https://github.com/ruby/ruby/blob/499eb3990faeaac2603787f2a41b2d9625e180dc/load.c#L358-L402
https://github.com/ruby/ruby/blob/499eb3990faeaac2603787f2a41b2d9625e180dc/load.c#L1195-L1289
but I could easily be wrong about something here.
Let me know if you see problems with the approach or a better way to do something here.
I did have a few questions remaining:
Should the body of the
load_unless_realpath_loaded
method be synchronized? Does it make sense to wrap it inwith_synchronized_features
?There was a commit in CRuby:
ruby/ruby@972f274#:~:text=To%20avoid%20concurrency%20issues%20when%20rebuilding%20the%20loaded%20features%0Aindex%2C%20the%20building%20of%20the%20index%20itself%20is%20left%20alone%2C%20and%0Aafterwards%2C%20a%20separate%20loop%20is%20done%20on%20a%20copy%20of%20the%20loaded%20feature%0Asnapshot%20in%20order%20to%20rebuild%20the%20realpaths%20hash.
Do we not need to worry about that here because it's already being called inside
with_synchronized_features
? Or would that still be a concern?