Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

fix(CORE) Add mutex protection to _pendingTimeSyncRequests map access #21285

Closed
wants to merge 7 commits into from

Conversation

blinkysc
Copy link
Contributor

WorldSession::SendTimeSync() function was experiencing a segmentation fault when accessing the _pendingTimeSyncRequests map from multiple threads. Added mutex synchronization to prevent concurrent map access that was causing seg fault

Changes Proposed:

This PR proposes changes to:

  • Core (units, players, creatures, game systems).
  • Scripts (bosses, spell scripts, creature scripts).
  • Database (SAI, creatures, etc).

Issues Addressed:

SOURCE:

The changes have been validated through:

  • Live research (checked on live servers, e.g Classic WotLK, Retail, etc.)
  • Sniffs (remember to share them with the open source community!)
  • Video evidence, knowledge databases or other public sources (e.g forums, Wowhead, etc.)
  • The changes promoted by this pull request come partially or entirely from another project (cherry-pick). Cherry-picks must be committed using the proper --author tag in order to be accepted, thus crediting the original authors, unless otherwise unable to be found

Tests Performed:

This PR has been:

  • Tested in-game by the author.
  • Tested in-game by other community members/someone else other than the author/has been live on production servers.
  • This pull request requires further testing and may have edge cases to be tested.

How to Test the Changes:

  • This pull request can be tested by following the reproduction steps provided in the linked issue
  • This pull request requires further testing. Provide steps to test your changes. If it requires any specific setup e.g multiple players please specify it as well.

Known Issues and TODO List:

  • [ ]
  • [ ]

How to Test AzerothCore PRs

When a PR is ready to be tested, it will be marked as [WAITING TO BE TESTED].

You can help by testing PRs and writing your feedback here on the PR's page on GitHub. Follow the instructions here:

http://www.azerothcore.org/wiki/How-to-test-a-PR

REMEMBER: when testing a PR that changes something generic (i.e. a part of code that handles more than one specific thing), the tester should not only check that the PR does its job (e.g. fixing spell XXX) but especially check that the PR does not cause any regression (i.e. introducing new bugs).

For example: if a PR fixes spell X by changing a part of code that handles spells X, Y, and Z, we should not only test X, but we should test Y and Z as well.

@github-actions github-actions bot added CORE Related to the core file-cpp Used to trigger the matrix build labels Jan 28, 2025
@blinkysc
Copy link
Contributor Author

@nl-saw look good?

@Takenbacon
Copy link
Contributor

Why would this need to be locked? Where do you see this being accessed by multiple threads?

@blinkysc
Copy link
Contributor Author

Why would this need to be locked? Where do you see this being accessed by multiple threads?

#4 WorldSession::SendTimeSync (this=this@entry=0x62d2089d600) at /root/env/chromiecraft/azerothcore-wotlk/src/server/game/Server/WorldSession.cpp:1641
#5 0x00005c50383281e2 in WorldSession::Update (this=0x62d2089d600, diff=18, updater=...) at /root/env/chromiecraft/azerothcore-wotlk/src/server/game/Server/WorldSession.cpp:494
#6 0x00005c5038254581 in Map::Update (this=0x62d97045e80, t_diff=0, s_diff=18) at /root/env/chromiecraft/azerothcore-wotlk/src/server/game/Maps/Map.cpp:753
#7 0x00005c503826aacd in MapUpdateRequest::call (this=0x6275e115340) at /root/env/chromiecraft/azerothcore-wotlk/src/server/game/Maps/MapUpdater.cpp:44
#8 0x00005c503826a428 in MapUpdater::WorkerThread (this=0x5c5038fc7a00 MapMgr::instance()::instance+216) at /root/env/chromiecraft/azerothcore-wotlk/src/server/game/Maps/MapUpdater.cpp:158
#9 0x0000062e3b2dc253 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6

@Takenbacon
Copy link
Contributor

Why would this need to be locked? Where do you see this being accessed by multiple threads?

#4 WorldSession::SendTimeSync (this=this@entry=0x62d2089d600) at /root/env/chromiecraft/azerothcore-wotlk/src/server/game/Server/WorldSession.cpp:1641 #5 0x00005c50383281e2 in WorldSession::Update (this=0x62d2089d600, diff=18, updater=...) at /root/env/chromiecraft/azerothcore-wotlk/src/server/game/Server/WorldSession.cpp:494 #6 0x00005c5038254581 in Map::Update (this=0x62d97045e80, t_diff=0, s_diff=18) at /root/env/chromiecraft/azerothcore-wotlk/src/server/game/Maps/Map.cpp:753 #7 0x00005c503826aacd in MapUpdateRequest::call (this=0x6275e115340) at /root/env/chromiecraft/azerothcore-wotlk/src/server/game/Maps/MapUpdater.cpp:44 #8 0x00005c503826a428 in MapUpdater::WorkerThread (this=0x5c5038fc7a00 MapMgr::instance()::instance+216) at /root/env/chromiecraft/azerothcore-wotlk/src/server/game/Maps/MapUpdater.cpp:158 #9 0x0000062e3b2dc253 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6

That is the standard process flow of the server, there is nothing to indicate a threading crash or a necessity for locking.

Maybe we need to add this to a wiki or something as there is often misunderstanding the threading design of the server but the general rule of thumb is:

Every player/worldsession can only belong to one map thread at a time (obviously), thus member values within a player or worldsession object 99.99% of the time are thread safe, which is why we don't have a billion arbitrary locks everywhere.

There are two slight exceptions to this that are special cases:

Every session belongs to a network thread that feeds the incoming (locked) packet queue. Anything the network thread touches on the worldsession must be locked, there are only a couple like the packet queue and latency values.

Finally, and this is generally the problem most of the time, we have cross map interaction between objects. Because Maps are updated in parallel we shouldn't be accessing another player/session object from another map during the Map update. It's not often but sometimes you'll run into something that violates this rule. Something like iterating and accessing group members or casting a spell on a player on another map is a big no-no which leads to all sorts of odd and semi rare segfaults. The main point is this tends to be a fundamental design flaw in that section of code and needs to be reworked rather than just needlessly slapping locks everywhere instead of addressing the bigger problem.

@blinkysc
Copy link
Contributor Author

Why would this need to be locked? Where do you see this being accessed by multiple threads?

#4 WorldSession::SendTimeSync (this=this@entry=0x62d2089d600) at /root/env/chromiecraft/azerothcore-wotlk/src/server/game/Server/WorldSession.cpp:1641 #5 0x00005c50383281e2 in WorldSession::Update (this=0x62d2089d600, diff=18, updater=...) at /root/env/chromiecraft/azerothcore-wotlk/src/server/game/Server/WorldSession.cpp:494 #6 0x00005c5038254581 in Map::Update (this=0x62d97045e80, t_diff=0, s_diff=18) at /root/env/chromiecraft/azerothcore-wotlk/src/server/game/Maps/Map.cpp:753 #7 0x00005c503826aacd in MapUpdateRequest::call (this=0x6275e115340) at /root/env/chromiecraft/azerothcore-wotlk/src/server/game/Maps/MapUpdater.cpp:44 #8 0x00005c503826a428 in MapUpdater::WorkerThread (this=0x5c5038fc7a00 MapMgr::instance()::instance+216) at /root/env/chromiecraft/azerothcore-wotlk/src/server/game/Maps/MapUpdater.cpp:158 #9 0x0000062e3b2dc253 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6

That is the standard process flow of the server, there is nothing to indicate a threading crash or a necessity for locking.

Maybe we need to add this to a wiki or something as there is often misunderstanding the threading design of the server but the general rule of thumb is:

Every player/worldsession can only belong to one map thread at a time (obviously), thus member values within a player or worldsession object 99.99% of the time are thread safe, which is why we don't have a billion arbitrary locks everywhere.

There are two slight exceptions to this that are special cases:

Every session belongs to a network thread that feeds the incoming (locked) packet queue. Anything the network thread touches on the worldsession must be locked, there are only a couple like the packet queue and latency values.

Finally, and this is generally the problem most of the time, we have cross map interaction between objects. Because Maps are updated in parallel we shouldn't be accessing another player/session object from another map during the Map update. It's not often but sometimes you'll run into something that violates this rule. Something like iterating and accessing group members or casting a spell on a player on another map is a big no-no which leads to all sorts of odd and semi rare segfaults. The main point is this tends to be a fundamental design flaw in that section of code and needs to be reworked rather than just needlessly slapping locks everywhere instead of addressing the bigger problem.

Ok cool. Yeah guess would be locks everywhere if you wanted to catch edge cases. I can close PR. Thanks for exp @Takenbacon

@blinkysc blinkysc closed this Jan 28, 2025
@blinkysc blinkysc deleted the ProtectSendTimeSync branch January 29, 2025 14:38
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
CORE Related to the core file-cpp Used to trigger the matrix build
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WordSession::Update() Crash
2 participants