Reduce memory allocations and improve performance in JSONObject #68

Merged 1 commit into kordamp:master on Nov 15, 2024

basil (Contributor) commented Nov 14, 2024

Parsing https://updates.jenkins.io/update-center.json is extremely slow (hundreds of times slower than jq, for example). It consistently takes about 8 seconds and allocates about 170 GiB of memory in total over the course of parsing. Profiling showed a lot of regular expression compilation, such as

  java.lang.Thread.State: RUNNABLE
	at java.util.regex.Pattern.compile(java.base@11.0.5/Pattern.java:1757)
	at java.util.regex.Pattern.<init>(java.base@11.0.5/Pattern.java:1428)
	at java.util.regex.Pattern.compile(java.base@11.0.5/Pattern.java:1068)
	at net.sf.json.regexp.JdkRegexpMatcher.<init>(JdkRegexpMatcher.java:38)
	at net.sf.json.regexp.JdkRegexpMatcher.<init>(JdkRegexpMatcher.java:31)
	at net.sf.json.regexp.RegexpUtils.getMatcher(RegexpUtils.java:39)
	at net.sf.json.util.JSONTokener.matches(JSONTokener.java:111)
	at net.sf.json.JSONObject._fromJSONTokener(JSONObject.java:912)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:156)
	at net.sf.json.util.JSONTokener.nextValue(JSONTokener.java:348)
	at net.sf.json.JSONArray._fromJSONTokener(JSONArray.java:1131)
	at net.sf.json.JSONArray.fromObject(JSONArray.java:125)
	at net.sf.json.util.JSONTokener.nextValue(JSONTokener.java:351)
	at net.sf.json.JSONObject._fromJSONTokener(JSONObject.java:955)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:156)
	at net.sf.json.util.JSONTokener.nextValue(JSONTokener.java:348)
	at net.sf.json.JSONObject._fromJSONTokener(JSONObject.java:955)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:156)
	at net.sf.json.util.JSONTokener.nextValue(JSONTokener.java:348)
	at net.sf.json.JSONObject._fromJSONTokener(JSONObject.java:955)
	at net.sf.json.JSONObject._fromString(JSONObject.java:1145)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:162)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:132)

and string allocation like

   java.lang.Thread.State: RUNNABLE
	at java.lang.String.<init>(String.java:207)
	at java.lang.String.substring(String.java:1933)
	at net.sf.json.util.JSONTokener.matches(JSONTokener.java:110)
	at net.sf.json.JSONObject._fromJSONTokener(JSONObject.java:912)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:156)
	at net.sf.json.util.JSONTokener.nextValue(JSONTokener.java:348)
	at net.sf.json.JSONArray._fromJSONTokener(JSONArray.java:1131)
	at net.sf.json.JSONArray.fromObject(JSONArray.java:125)
	at net.sf.json.util.JSONTokener.nextValue(JSONTokener.java:351)
	at net.sf.json.JSONObject._fromJSONTokener(JSONObject.java:955)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:156)
	at net.sf.json.util.JSONTokener.nextValue(JSONTokener.java:348)
	at net.sf.json.JSONObject._fromJSONTokener(JSONObject.java:955)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:156)
	at net.sf.json.util.JSONTokener.nextValue(JSONTokener.java:348)
	at net.sf.json.JSONObject._fromJSONTokener(JSONObject.java:955)
	at net.sf.json.JSONObject._fromString(JSONObject.java:1145)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:162)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:132)

There are two issues here: repeatedly compiling a pattern where a simple .startsWith("null") would have sufficed, and repeatedly copying a massive string just to search a few characters in it. See flame graphs before and after.

[Flame graph: before]

[Flame graph: after]
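
For readers who want to see the two issues side by side, here is a minimal sketch. It is not the json-lib source: the class name, the method names, and the exact regular expression are assumptions made for illustration. Only the general shape (a `substring` copy followed by a pattern compilation, versus an in-place `startsWith` check) is taken from the stack traces and description above.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; names and the pattern are assumed, not taken from json-lib.
public class NullLiteralCheck {

    // Slow variant: copies the remaining input and compiles a pattern on every call.
    public static boolean startsWithNullSlow(String source, int index) {
        String rest = source.substring(index);              // copies the rest of the input
        return Pattern.compile("^null.*", Pattern.DOTALL)   // recompiled on each call
                .matcher(rest)
                .matches();
    }

    // Fast variant: compares a few characters in place, with no copy and no regex.
    public static boolean startsWithNullFast(String source, int index) {
        return source.startsWith("null", index);
    }

    public static void main(String[] args) {
        String json = "{\"a\":null,\"b\":[null,1]}";
        // Both variants should agree at every offset of the input.
        for (int i = 0; i < json.length(); i++) {
            if (startsWithNullSlow(json, i) != startsWithNullFast(json, i)) {
                throw new AssertionError("mismatch at index " + i);
            }
        }
    }
}
```

When such a check runs at many offsets of a multi-megabyte document, the slow variant does work proportional to the remaining input each time, while the fast variant does a constant amount of work per call.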

I added a new unit test. This change has also been shipping in production to Jenkins users, in our fork of json-lib, since 2.456 in May without any reported issues.

aalmiray added this to the 3.2.0 milestone Nov 15, 2024
aalmiray merged commit db76e69 into kordamp:master Nov 15, 2024
1 check passed
aalmiray (Collaborator) commented

Thank you 😄

basil deleted the starts-with branch November 15, 2024 20:49