Introduce `StreamParsingStrategy` to support builds with large logs #40

sghill · 2022-10-30T07:27:36Z

This implementation is currently opt-in: set the system property `hudson.plugins.logparser.ParsingStrategy` to `hudson.plugins.logparser.StreamParsingStrategy` on agents. It would be simpler to replace the current strategy outright.
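
To illustrate the opt-in, here is a minimal sketch of choosing a strategy from a system property. It is not the plugin's actual selection code: the `StrategySelector` class and the `"stream"`/`"classic"` return values are hypothetical, and only the property name and value come from this PR. On a JVM, such a property is typically passed as `-Dhudson.plugins.logparser.ParsingStrategy=hudson.plugins.logparser.StreamParsingStrategy`.

```java
// Hypothetical sketch: illustrates an opt-in via system property,
// not the plugin's real selection logic.
final class StrategySelector {
    // Property name and value as given in this PR description.
    static final String PROPERTY = "hudson.plugins.logparser.ParsingStrategy";
    static final String STREAM = "hudson.plugins.logparser.StreamParsingStrategy";

    /** Returns "stream" when the property names the stream strategy, otherwise "classic". */
    static String select() {
        String configured = System.getProperty(PROPERTY);
        return STREAM.equals(configured) ? "stream" : "classic";
    }

    public static void main(String[] args) {
        // With no -D flag set, the default (classic) strategy is chosen.
        System.out.println(select());
    }
}
```

Falling back to the existing behavior when the property is unset keeps the change non-breaking for anyone who does not opt in.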

The new strategy relies on Java 8 features and improves throughput with less code. I'm happy to make that change, but thought it'd be worth discussing first.

📝 Description

We have a job using this plugin that intermittently hangs. I was able to reproduce the scenario by creating a job that logs many lines, which causes an OutOfMemoryError on the agent. This PR adds a new strategy that lets the plugin parse builds with larger logs than the current approach can handle.
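
As a rough illustration of why streaming helps with memory, the sketch below reads a log one line at a time with Java 8 streams and keeps only the lines that match a rule. It is an assumption-laden example, not the code in this PR: `StreamingMatchSketch` and `matchingLines` are hypothetical names, and a single regex stands in for a rules file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch: stream a log lazily and retain only matching lines.
final class StreamingMatchSketch {

    /** Reads the log line by line and returns only the lines the rule matches. */
    static List<String> matchingLines(Reader log, Pattern rule) {
        try (BufferedReader reader = new BufferedReader(log)) {
            // lines() is lazy: unmatched lines are discarded as they are read,
            // so memory use tracks the number of matches, not the log size.
            return reader.lines()
                    .filter(line -> rule.matcher(line).find())
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String log = "> Task :assemble 0\n"
                + "VersionNumber.java:551: warning: no @param for idx 0\n"
                + "> Task :javadoc 0\n";
        // Prints only the line containing "warning".
        matchingLines(new StringReader(log), Pattern.compile("warning"))
                .forEach(System.out::println);
    }
}
```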

Parsing time is a function of the number of lines in the file and the number of rules. To test this, I defined one rule and ran it on a job that created a set number of lines. 10% of the lines matched the rule. I expect 100 rules run against 100 lines would scale similarly to 10 rules run on 1,000 lines or one rule on 10,000 lines.
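
The scaling expectation above can be made concrete with a small worked example. This sketch assumes the worst case where every rule is tried against every line, so the work is the product of lines and rules. `LineRuleCost` is a hypothetical helper, not part of the plugin.

```java
// Hypothetical sketch: upper bound on rule evaluations, assuming every
// rule is checked against every line.
final class LineRuleCost {

    /** Returns lines * rules, failing loudly on overflow. */
    static long evaluations(long lines, long rules) {
        return Math.multiplyExact(lines, rules);
    }

    public static void main(String[] args) {
        // All three configurations yield the same number of evaluations.
        System.out.println(evaluations(100, 100));
        System.out.println(evaluations(1_000, 10));
        System.out.println(evaluations(10_000, 1));
    }
}
```

Under that assumption, each of the three configurations performs 10,000 evaluations, which is why the results are reported against a single line-rules figure.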

| Line-Rules | 2.3.0 (average seconds over 10 runs) | `StreamParsingStrategy` |
| --- | --- | --- |
| 100 | 0 | 0 |
| 1,000 | 0 | 0 |
| 10,000 | 0 | 0 |
| 100,000 | 0 | 0 |
| 1,000,000 | 5 | 3.1 |
| 10,000,000 | 75.4 | 31.9 |
| 20,000,000 | Failed (OOM) | 66.3 |
| 50,000,000 | Failed (OOM) | 220 |

I'm using the already-existing log statement in LogParserParser to get this data. For small log files, both are under a second.

💎 Type of change

Bug fix (non-breaking change which fixes an issue)

🚦 How Has This Been Tested?


I extracted the current code into a ClassicParsingStrategy and introduced a new StreamParsingStrategy. The new classes have unit tests that verify behavior.

For testing large logs, I deployed Jenkins 2.361.2 on an Azure Standard D2s v3 (2 vcpus, 8 GiB memory) and an agent with one executor on another Standard D2s v3. I installed v2.3.0 from the update center to gather the baseline data. I built the plugin locally at this commit, uploaded, and set the system property for the comparison data.

I created a job that logs 10 lines per loop iteration, and a rules file with one rule.

build script

echo "warn /warning/" > rules.txt

set +x
for i in $(seq 0 10)
do
echo "> Task :javadocJar $i"
echo "> Task :sourcesJar $i"
echo "> Task :assemble $i"
echo "> Task :codenarcMain $i"
echo "> Task :codenarcTest $i"
echo "> Task :compileTestKotlin NO-SOURCE $i"
echo "> Task :pluginUnderTestMetadata $i"
echo "> Task :compileTestJava NO-SOURCE $i"
echo "> Task :javadoc $i"
echo "/home/runner/work/gradle-jpi-plugin/gradle-jpi-plugin/src/main/java/shaded/hudson/util/VersionNumber.java:551: warning: no @param for idx $i"
done

🏁 Checklist:

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I have commented my code, particularly in hard-to-understand areas

This strategy does not allocate an executor service or split/aggregate for processing

hypery2k · 2022-10-31T05:17:08Z

nice, I really like that one. Thx

sghill added 6 commits October 28, 2022 19:52

Extract LineToStatus function from LogParserThread

4f40da8

Extract ParsingStrategy

8b018d7

Introduce StreamParsingStrategy

8ad7e6e

This strategy does not allocate an executor service or split/aggregate for processing

Add javadoc to new classes

38bc5c3

Add LineToStatusTest to document current behavior

71ef883

Do not send back unmatched lines

6073239

hypery2k approved these changes Oct 31, 2022


hypery2k merged commit 6b4b6e6 into jenkinsci:develop Oct 31, 2022

sghill mentioned this pull request Nov 16, 2022

Only copy the log to java.io.tmpdir in the ClassicParsingStrategy #53

Merged
