-
Notifications
You must be signed in to change notification settings - Fork 76
Conversation
…ates through the file path and finds a file that matches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your fix, I have left some comments :P
private Map<String, FilePath> createSourceFileMapping(FilePath workspace, TaskListener listener) { | ||
try { | ||
return Arrays | ||
.stream(workspace.list("**/*.java")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This plugin also supports other programming languages like JavaScript, C#, C++, so this might not work for these languages.
Maybe we can use another optimization method here. We used FilePath(workspace).list("**/" + sourceFilePath)
to search source file for each source file before, it is really slow and does a lot of repeated work.
But, most of the source files have the same parent directory, we can use the wildcard to search the source file if we are unable to find it directly under the workspace and possiblePaths
. Once we successfully find the source file by using the wildcard, we will add its parent path to possiblePaths
so that we can reuse it to avoid search next time.
What do you think of this change? Thanks~
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cizezsy Ahh sorry, I totally mean to change that from **/*.java
to **/*
, but i'm happy to make the changes you've suggested if you think that's preferable? Just to verify what it is you're looking for there: in tryFindSourceFile
, instead of looking at the wildcard every time, we check to find the file in our list of possible paths. If we can't find it there, we load it with the wildcard and add the parent path to the list of possible paths?
Does doing that also resolve your other comment about doing the calculating in the agent instead of master? I guess it does?
It feels like that wouldn't give us anywhere near the performance benefits that the above change does (assuming I change **/*.java
to **/*
. What are your thoughts on that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to verify what it is you're looking for there: in tryFindSourceFile, instead of looking at the wildcard every time, we check to find the file in our list of possible paths. If we can't find it there, we load it with the wildcard and add the parent path to the list of possible paths?
Yes.
Does doing that also resolve your other comment about doing the calculating in the agent instead of master? I guess it does?
Yes :)
It feels like that wouldn't give us anywhere near the performance benefits that the above change does (assuming I change **/.java to **/. What are your thoughts on that?
Yeah, you are right, that will not increase the performance compare with your changes. But, in your implementation, you used the file name as the key of sourceFileMapping
. And sourceFilePath
might not be a file name instead of a path of the source file(like Cobertura). You won't get a correct source file form the map in this situation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cizezsy This change should have no impact on Corbertura at all. Previously at the end of tryFindSourceFile
, we were just returning null. That will be exactly the same if the source file mapping isn't found in our map. Also, tryFindSourceFile
has lots of code to handle the case where the source file is absolute, so Corbertura shouldn't need the fallback anyway I don't think. We have about 7000 classes, and this change has improved performance from about 10 minutes to 5 seconds. We'd really like it if we could keep those performance benefits 🤞
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bgreen1005 Sounds reasonable to me :p
I will test the changes on my computer. If there are no other problems, we can get this merged. Thanks~
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cizezsy Perfect! Thank you 🙂 Hopefully no issues 🤞
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cizezsy How's it looking? Have you had chances to check this? 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bgreen1005 I will try it this weekend. Sorry for the delay, I am very busy this week :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cizezsy no problem 🙂 Thanks for the heads up.
@@ -78,8 +80,10 @@ public void resolveSourceFiles(Run<?, ?> run, FilePath workspace, TaskListener l | |||
|
|||
listener.getLogger().printf("%d source files need to be copied.%n", paints.size()); | |||
|
|||
final Map<String, FilePath> sourceFileMapping = createSourceFileMapping(workspace, listener); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is better to do calculating in the agent to reduce the Jenkins master workload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cizezsy Sorry for the stupid question, but would you mind explaining how you can tell which of this is run on master and which is run on agents?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for your contribution~
(I think we need another PR to improve the case which two file have the same name, but we can get this merged now :P
Great thanks! Yeah I agree. We could do with being able to differentiate between potential duplicates. How often do you release? |
@bgreen1005 I will perform a release this week :) |
private Map<String, FilePath> createSourceFileMapping(FilePath workspace, TaskListener listener) { | ||
try { | ||
return Arrays | ||
.stream(workspace.list("**/*")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why list all files in the workspace (which would include compiled binaries, node_modules, etc.) rather than only those referenced by the coverage report(s) being processed? See #150 for something that seems to be caused by this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you look at this more closely, you will see that it is executed if the source file we are looking for was not found in the usual way.
And this code resolves source files not only for Cobertura, but also for formats like Jacoco, which does not have such a direct path to the source files.
I'll remove the warnings to don't disturb people
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, as a fallback mechanism specifically -- and one that seems to be deliberately using just the filenames without paths -- it doesn't seem worth the noise to have all these duplicates get logged. Otherwise you're logging warnings for a map that might not even get used (and ideally would not, if all went well).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm looking in the code and don't see this as a fallback, but directly called. Am I missing something.
I'm investigating why this plugin is taking over 6 hours in CoverageProcessor.convertToResults, while the Cobertura plugin runs in only 21 seconds. ( I know it's this part of the code per the log out time start of "Publishing Coverage report..." vs the next output of "A total of X reports were found".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that mapping is always constructed -- which may be the cause of your problem -- but is only used if it cannot find a match from the file names/paths in the coverage reports. I believe that's why it was being described as a "fallback".
140: This commit fixes issue 140 (https://github.com/jenkinsci/code-coverage-api-plugin/issues/140) by adding back in the code that iterates through the file path and finds a file that matches. I have also optimized the lookup by getting all of the file paths in one go. This drastically improves performance for jacoco. One thing to note, I've added some code to log duplicates. This is due to the fact that Jacoco only reports the class name, not the directory and so duplicates are possible.
I noticed a massive performance improvement with these changes. Generating our code coverage report has dropped from about 8 minutes to 5 or 6 seconds.
I faced a few issues with the POM file, and had to update some dependencies. I wasn't sure if that was just a local problem, so haven't checked in the changes but I can do if need be.
I've also added a few fixmes due to unnecessary code in DefaultSourceFileResolver. Happy to fix if you'd like :)