Adding transcribe mode #92

Merged (4 commits) on Oct 1, 2024

@fosterbrereton fosterbrereton commented Sep 25, 2024

When upgrading from one version of hyde to another, the underlying clang tooling commonly outputs a different name for the same symbol (e.g., a namespace may be added or removed). Although the symbol is unchanged, its expected name differs from the have name, so hyde considers them different symbols, removes the old name, and inserts the new one. This wipes out any documentation under the old name that should have been migrated to the new name.

The solution here is very specialized. For the "overloads" key only, we gather the name of each overload in both the have and expected sets. We then pair them up according to how well they match one another (using the Myers string diff algorithm; two strings with less "patchwork" between them are considered a better match). Ideally this results in key pairs that represent the same symbol, just with different names. Then we call the proc with have[old_name] and expected[new_name], which migrates any documentation from the old name to the new.
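As a rough illustration of the pairing step (a sketch, not the code in this PR): the score below is the number of single-character insertions and deletions separating two names, which is the length of the shortest edit script a Myers diff would find. For brevity it is computed with a plain LCS table rather than the Myers algorithm itself. The names `edit_script_length` and `pair_names` are hypothetical, and the greedy "closest remaining pair first" strategy is an assumption about how the matching could be done.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Insertions plus deletions needed to turn `a` into `b` (less "patchwork" is a
// better match). Computed via LCS length: cost = |a| + |b| - 2 * LCS(a, b).
std::size_t edit_script_length(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1, 0), cur(b.size() + 1, 0);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            cur[j] = (a[i - 1] == b[j - 1]) ? prev[j - 1] + 1
                                            : std::max(prev[j], cur[j - 1]);
        }
        std::swap(prev, cur);  // `prev` now holds the row for a[0..i)
    }
    return a.size() + b.size() - 2 * prev[b.size()];
}

// Hypothetical helper: pair each old ("have") name with a new ("expected") name,
// always taking the cheapest pair whose two names are both still unmatched.
std::vector<std::pair<std::string, std::string>> pair_names(
    const std::vector<std::string>& have,
    const std::vector<std::string>& expected) {
    // Transcription assumes a 1:1 mapping, so the counts must agree.
    assert(have.size() == expected.size());

    struct Candidate {
        std::size_t cost, have_index, expected_index;
    };
    std::vector<Candidate> candidates;
    for (std::size_t i = 0; i < have.size(); ++i) {
        for (std::size_t j = 0; j < expected.size(); ++j) {
            candidates.push_back(
                Candidate{edit_script_length(have[i], expected[j]), i, j});
        }
    }
    // Cheapest first; ties are broken by index so the result is deterministic.
    std::sort(candidates.begin(), candidates.end(),
              [](const Candidate& x, const Candidate& y) {
                  return std::tie(x.cost, x.have_index, x.expected_index) <
                         std::tie(y.cost, y.have_index, y.expected_index);
              });

    std::vector<bool> have_used(have.size(), false);
    std::vector<bool> expected_used(expected.size(), false);
    std::vector<std::pair<std::string, std::string>> result;
    for (const Candidate& c : candidates) {
        if (have_used[c.have_index] || expected_used[c.expected_index]) continue;
        have_used[c.have_index] = true;
        expected_used[c.expected_index] = true;
        result.emplace_back(have[c.have_index], expected[c.expected_index]);
    }
    return result;
}
```

With have names `void foo(int)` and `void foo(const std::string&)` and expected names that only gain an `ns::` qualifier, each old name pairs with its namespaced counterpart, because the cross pairings need far more edits.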

This capability assumes the overload count is the same in both have and expected. If any functions are added or removed between upgrades in the clang driver (e.g., a new compiler-generated routine is created and documented), that will have to be managed manually. Assuming the count is the same, it also assumes there is a 1:1 mapping from the set of old names to the set of new names. This implies the transcription mode should be run as a separate step from an update. In other words, a transcription assumes the documentation is actually the same between the have and expected sets and that only the overload names have changed, so it maps the old-named documentation to the new-named documentation as reasonably as possible.

This PR also applies a similar technique to class folder names, especially those that may have been mangled or hashed, since changing the symbols causes those folder names to differ. Using the same Myers diff, the old folder name that most closely resembles the new folder being created is found and renamed. The reconciliation process then continues as normal.
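A minimal sketch of the folder lookup, under the same caveats: `closest_folder` and the sample folder names are hypothetical, and the score is an LCS-based count of insertions and deletions standing in for the Myers diff used by the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Insertions plus deletions needed to turn `a` into `b`, via LCS length.
std::size_t edit_script_length(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1, 0), cur(b.size() + 1, 0);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            cur[j] = (a[i - 1] == b[j - 1]) ? prev[j - 1] + 1
                                            : std::max(prev[j], cur[j - 1]);
        }
        std::swap(prev, cur);
    }
    return a.size() + b.size() - 2 * prev[b.size()];
}

// Hypothetical helper: index of the existing folder name that needs the fewest
// edits to become `wanted`. The first candidate wins a tie.
std::size_t closest_folder(const std::vector<std::string>& existing,
                           const std::string& wanted) {
    assert(!existing.empty());
    std::size_t best = 0;
    std::size_t best_cost = edit_script_length(existing[0], wanted);
    for (std::size_t i = 1; i < existing.size(); ++i) {
        const std::size_t cost = edit_script_length(existing[i], wanted);
        if (cost < best_cost) {
            best = i;
            best_cost = cost;
        }
    }
    return best;
}
```

For existing folders `Bar.4d5e6f` and `Foo.1a2b3c` and a new folder `ns.Foo.9z8y7x`, this picks `Foo.1a2b3c`: the shared `Foo.` prefix outweighs the changed hash suffix. The chosen folder could then be renamed with `std::filesystem::rename` before reconciliation proceeds.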

@fosterbrereton fosterbrereton changed the title from "Fosterbrereton/hyde transcribe" to "Adding transcribe mode" Sep 25, 2024
@@ -36,11 +36,21 @@ FetchContent_Declare(
SOURCE_SUBDIR llvm
)

FetchContent_Declare(
diff
GIT_REPOSITORY https://github.com/fosterbrereton/diff.git

Review comment from fosterbrereton (author) on this line:

The code in this repository is a fork of a fork + additional modernizations. The header was released under the Apache 2.0 license. Although it seemed OK to me, IANAL, so I thought it best to isolate the code into its own repository and keep it separate from hyde.

@fosterbrereton fosterbrereton merged commit 3fdcc77 into master Oct 1, 2024
2 checks passed
@fosterbrereton fosterbrereton deleted the fosterbrereton/hyde-transcribe branch October 1, 2024 15:41