Fix continuation memory leak in Ares.query #31

Merged
yim-lee merged 4 commits into apple:main from the memory-leak-continuation branch
Feb 21, 2024

Conversation

dieb
Contributor

@dieb dieb commented Feb 21, 2024

Motivation

#29

Modifications

Frankly, I'm not sure exactly why this fixes the leak (Swift noob here), but it may be that pointer.deallocate() only frees the pointer's memory and does not deinitialize the underlying initialized value.
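
A minimal sketch of the suspected mechanism, not the PR's actual code. `Payload` and `Handler` are hypothetical stand-ins for a reference-counted member (such as a continuation) and the handler struct, and `payloadSurvives` is an illustrative helper. It uses a weak reference to observe whether the stored object is released when `deinitialize(count: 1)` is or is not called before `deallocate()`.

```swift
// Hypothetical stand-in for a reference-counted member such as a continuation.
final class Payload {}

// Hypothetical stand-in for a handler struct that stores a reference.
struct Handler {
    let payload: Payload
}

// Returns true if the Payload is still alive after the pointer is freed.
func payloadSurvives(skipDeinitialize: Bool) -> Bool {
    weak var observer: Payload?
    let pointer = UnsafeMutablePointer<Handler>.allocate(capacity: 1)
    do {
        let payload = Payload()
        observer = payload
        // The pointer's memory now holds its own strong reference to payload.
        pointer.initialize(to: Handler(payload: payload))
    } // the local strong reference ends with this scope

    if !skipDeinitialize {
        // Destroys the stored Handler, releasing its reference to Payload.
        pointer.deinitialize(count: 1)
    }
    // Frees the memory only. Skipping deinitialize above breaks the documented
    // requirement that the memory be uninitialized (or trivial) at this point,
    // and the stored reference is never released.
    pointer.deallocate()
    return observer != nil
}

print(payloadSurvives(skipDeinitialize: false))
print(payloadSurvives(skipDeinitialize: true))
```

Under these assumptions, the first call is expected to report that the payload was released, and the second that it is still alive, which would match a leak that disappears once `deinitialize(count: 1)` is added.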

Result

QueryReplyHandler gets deallocated properly and so does the continuation that was leaking.

Leaks from this are gone when running A/AAAA queries.

Test Plan

Ran the Xcode Leaks instrument in my app; some code examples are in #29.

@yim-lee
Member

yim-lee commented Feb 21, 2024

@swift-server-bot add to allowlist

Member

@yim-lee yim-lee left a comment

@dieb Would you be comfortable making similar changes to DNSSD.query and QueryReplyHandler as well? If not, it's ok.

let pointer = handlerPointer.assumingMemoryBound(to: QueryReplyHandler.self)
let handler = pointer.pointee
defer {
pointer.deinitialize(count: 1)
Member

Cool! I was making similar changes as well but was missing this line of code.

@@ -276,11 +280,6 @@ extension Ares {
}
Member

Can we also change struct QueryReplyHandler to class QueryReplyHandler, please? With that change combined with pointer.deinitialize(count: 1), I see deinit getting called.

Contributor Author

Absolutely, just pushed a new commit to reflect those changes.

I'm interested in understanding the rationale behind class vs struct here. Could you please share more about your thought process? Swift noob eager to learn here 😄

My best guess is that class deinit can free up its members, though my tests indicate no leaks with struct.

Member

@yim-lee yim-lee Feb 21, 2024

I was told that the address of a struct is not stable (we were thinking deallocate might not be working on the right address because of this), plus we are doing all these pointer manipulations already so class probably makes more sense.
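
A rough sketch of the observation discussed in this thread, under stated assumptions: `ReplyHandler` is a hypothetical, reduced stand-in for `QueryReplyHandler`, and `simulatedCallback` stands in for a C callback that receives the handler as an opaque context pointer. It only illustrates that a class's `deinit` runs once `deinitialize(count: 1)` destroys the stored reference.

```swift
// Hypothetical stand-in for the handler class, reduced to a deinit hook.
final class ReplyHandler {
    private let onDeinit: () -> Void

    init(onDeinit: @escaping () -> Void) {
        self.onDeinit = onDeinit
    }

    deinit {
        onDeinit()
    }
}

// Stand-in for a C callback that gets the handler as an opaque context pointer.
func simulatedCallback(_ context: UnsafeMutableRawPointer) {
    let pointer = context.assumingMemoryBound(to: ReplyHandler.self)
    // Reading pointee copies the reference; ownership stays with the pointer.
    _ = pointer.pointee
}

// Returns true if the handler's deinit ran during cleanup.
func runQuery() -> Bool {
    var deinitCalled = false
    let pointer = UnsafeMutablePointer<ReplyHandler>.allocate(capacity: 1)
    pointer.initialize(to: ReplyHandler(onDeinit: { deinitCalled = true }))

    simulatedCallback(UnsafeMutableRawPointer(pointer))

    // Releases the stored reference, which triggers deinit on the class instance.
    pointer.deinitialize(count: 1)
    pointer.deallocate()
    return deinitCalled
}

print(runQuery())
```

With a class, `deinit` gives a direct hook for confirming that cleanup happened, which a struct does not offer.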

@yim-lee
Member

yim-lee commented Feb 21, 2024

@swift-server-bot add to allowlist

@yim-lee
Member

yim-lee commented Feb 21, 2024

@swift-server-bot test this please

@yim-lee
Member

yim-lee commented Feb 21, 2024

Something is wrong with the webhook so CI is not getting triggered. Will ask someone for help looking into it in the morning.

@yim-lee
Member

yim-lee commented Feb 21, 2024

@swift-server-bot test this please

@dieb
Contributor Author

dieb commented Feb 21, 2024

> @dieb Would you be comfortable making similar changes to DNSSD.query and QueryReplyHandler as well? If not, it's ok.

Absolutely, thanks for the opportunity. It should be a quick change.

@dieb dieb force-pushed the memory-leak-continuation branch from b0e8c14 to 6445386 Compare February 21, 2024 15:25
@dieb
Contributor Author

dieb commented Feb 21, 2024

@yim-lee I made the change for DNSSD, but I was unable to test it locally because test_concurrency hangs. I'm also getting errors on CAresDNSResolverTests test_queryTXT and test_concurrency. I'll rely on you folks to review whether the last commit is appropriate.

Move defer deallocation block to after initialization.

Use class instead of struct for DNSSD.QueryReplyHandler.
@yim-lee
Member

yim-lee commented Feb 21, 2024

> but I was unable to test locally, test_concurrency hangs.

@dieb Does this happen even before these changes?

> I'm also getting errors on CAresDNSResolverTests test_queryTXT and test_concurrency.

Are these errors due to no results?

@dieb
Contributor Author

dieb commented Feb 21, 2024

Same errors are happening in main and 1f5d6f4 (before #30), so I'm guessing it's something to do with my local environment.

I'm running swift test directly from the command line, without the docker-compose setup, and I noticed that only mDNSResponder shows up in ps | grep dns. I'm not sure whether I need anything else for DNSSD.

Edit: errors are connection refused and test_concurrency hanging.

Test log
➜  swift-async-dns-resolver git:(main) ✗ swift test
Building for debugging...
[6/6] Linking swift-async-dns-resolverPackageTests
Build complete! (0.72s)
Test Suite 'All tests' started at 2024-02-21 14:31:48.762.
Test Suite 'swift-async-dns-resolverPackageTests.xctest' started at 2024-02-21 14:31:48.762.
Test Suite 'AresChannelTests' started at 2024-02-21 14:31:48.762.
Test Case '-[AsyncDNSResolverTests.AresChannelTests test_init]' started.
Test Case '-[AsyncDNSResolverTests.AresChannelTests test_init]' passed (0.001 seconds).
Test Suite 'AresChannelTests' passed at 2024-02-21 14:31:48.763.
	 Executed 1 test, with 0 failures (0 unexpected) in 0.001 (0.001) seconds
Test Suite 'AresErrorTests' started at 2024-02-21 14:31:48.763.
Test Case '-[AsyncDNSResolverTests.AresErrorTests test_initFromCode]' started.
Test Case '-[AsyncDNSResolverTests.AresErrorTests test_initFromCode]' passed (0.000 seconds).
Test Suite 'AresErrorTests' passed at 2024-02-21 14:31:48.764.
	 Executed 1 test, with 0 failures (0 unexpected) in 0.000 (0.000) seconds
Test Suite 'AresOptionsTests' started at 2024-02-21 14:31:48.764.
Test Case '-[AsyncDNSResolverTests.AresOptionsTests test_AresOptions_socketStateCallback]' started.
Test Case '-[AsyncDNSResolverTests.AresOptionsTests test_AresOptions_socketStateCallback]' passed (0.000 seconds).
Test Case '-[AsyncDNSResolverTests.AresOptionsTests test_flags]' started.
Test Case '-[AsyncDNSResolverTests.AresOptionsTests test_flags]' passed (0.000 seconds).
Test Case '-[AsyncDNSResolverTests.AresOptionsTests test_OptionsToAresOptions]' started.
Test Case '-[AsyncDNSResolverTests.AresOptionsTests test_OptionsToAresOptions]' passed (0.000 seconds).
Test Case '-[AsyncDNSResolverTests.AresOptionsTests test_rotate]' started.
Test Case '-[AsyncDNSResolverTests.AresOptionsTests test_rotate]' passed (0.000 seconds).
Test Suite 'AresOptionsTests' passed at 2024-02-21 14:31:48.765.
	 Executed 4 tests, with 0 failures (0 unexpected) in 0.001 (0.001) seconds
Test Suite 'CAresDNSResolverTests' started at 2024-02-21 14:31:48.765.
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_concurrency]' started.
/Users/dieb/Projects/Terran/OSS/swift-async-dns-resolver/Tests/AsyncDNSResolverTests/c-ares/CAresDNSResolverTests.swift:128: error: -[AsyncDNSResolverTests.CAresDNSResolverTests test_concurrency] : failed: caught error: "connection refused: "
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_concurrency]' failed (0.071 seconds).
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_queryAAAA]' started.
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_queryAAAA]' passed (0.022 seconds).
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_queryA]' started.
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_queryA]' passed (0.013 seconds).
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_queryCNAME]' started.
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_queryCNAME]' passed (0.025 seconds).
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_queryMX]' started.
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_queryMX]' passed (0.023 seconds).
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_queryNAPTR]' started.
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_queryNAPTR]' passed (0.023 seconds).
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_queryNS]' started.
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_queryNS]' passed (0.024 seconds).
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_queryPTR]' started.
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_queryPTR]' passed (0.013 seconds).
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_querySOA]' started.
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_querySOA]' passed (0.021 seconds).
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_querySRV]' started.
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_querySRV]' passed (0.025 seconds).
Test Case '-[AsyncDNSResolverTests.CAresDNSResolverTests test_queryTXT]' started.
error: Exited with signal code 13

// The handler might be called multiple times so don't deallocate inside `callback`
defer {
let pointer = handlerPointer.assumingMemoryBound(to: QueryReplyHandler.self)
pointer.deinitialize(count: 1)
Member

Without this line of code, we have these leaks:

[image: screenshot of the reported leaks]

The above is from modifying test_queryAAAA to run self.resolver.queryAAAA 100K times.

With this line there were no leaks.
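
The repeated-query check can be approximated without Instruments. The sketch below is an illustration under assumptions, not the test from this PR: `LiveCounter`, `CountedHandler`, `simulateQuery`, and `liveHandlers` are hypothetical names, and the simulated callback replaces the real resolver. It keeps cleanup in the caller's `defer` (registered after initialization) because the callback may fire more than once, then counts how many handlers are still alive after many queries.

```swift
// Tracks how many handlers are alive; a hypothetical helper for this sketch.
final class LiveCounter {
    private(set) var count = 0
    func increment() { count += 1 }
    func decrement() { count -= 1 }
}

// Hypothetical handler that registers itself with the counter.
final class CountedHandler {
    private let counter: LiveCounter

    init(counter: LiveCounter) {
        self.counter = counter
        counter.increment()
    }

    deinit {
        counter.decrement()
    }
}

// Stand-in for one query. The callback may run several times, so nothing is
// freed inside it; cleanup happens once, in the caller's defer.
func simulateQuery(counter: LiveCounter, callbackInvocations: Int) {
    let pointer = UnsafeMutablePointer<CountedHandler>.allocate(capacity: 1)
    pointer.initialize(to: CountedHandler(counter: counter))
    // Registered after initialization so it never touches uninitialized memory.
    defer {
        pointer.deinitialize(count: 1)
        pointer.deallocate()
    }

    let context = UnsafeMutableRawPointer(pointer)
    for _ in 0..<callbackInvocations {
        // Each simulated callback only reads the handler.
        _ = context.assumingMemoryBound(to: CountedHandler.self).pointee
    }
}

// Runs many simulated queries and reports how many handlers are still alive.
func liveHandlers(afterQueries queries: Int) -> Int {
    let counter = LiveCounter()
    for _ in 0..<queries {
        simulateQuery(counter: counter, callbackInvocations: 3)
    }
    return counter.count
}

print(liveHandlers(afterQueries: 100_000))
```

A result of zero live handlers would correspond to the "no leaks" outcome described above. Removing the `deinitialize(count: 1)` call from the `defer` should leave one live handler per query.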

@yim-lee yim-lee requested a review from ktoso February 21, 2024 17:43
Member

@yim-lee yim-lee left a comment

Thanks @dieb.

Tests pass locally on macOS for me and I ran Instruments on modified test_queryAAAA (basically run the query many times in a loop) with and without this changeset and could see that leaks are fixed, so I think these changes are good.

@ktoso I would appreciate it if you could review this as well. Thanks in advance. 🙏

Member

@ktoso ktoso left a comment

This looks correct, thanks a lot for the detective work!

@yim-lee yim-lee merged commit b7079b7 into apple:main Feb 21, 2024
@dieb dieb deleted the memory-leak-continuation branch February 22, 2024 03:06