Add changelog parsing code #11

Merged: 36 commits, Jan 17, 2025

@ericphanson (Collaborator) commented Dec 27, 2024

based on #10

closes #8

This adds code to parse changelogs into a simple in-memory representation, which can then be queried for changes.

Some choices I made, which may or may not be optimal:

  • parse into fixed concrete structs, whose properties are public API
    • wanted to keep it simple and avoid a nest of getters
  • not aiming to have roundtrippable changelogs
    • i.e., we don't preserve formatting, things we don't parse out are just dropped
    • this way we have a very simple output, rather than needing to store that info
  • main struct named SimpleLog
    • in my code, it was named Changelog, but that clashes with the module name
    • I think this helps capture that it isn't a full representation, but a simplified one
  • if a version contains sections ("Added", "Breaking", etc), then any unsectioned-notes are placed under "General"
    • alternatively, we could have sectioned-changes and unsectioned-changes always (i.e. two separate fields sectioned_changes::OrderedDict{String,Vector{String}} and unsectioned_changes::Vector{String}).
    • or, alternatively, we could have only changes::OrderedDict{String,Vector{String}} and when there are no sections, use "General"
    • not really sure which is best
  • I use CommonMark for parsing, not Markdown stdlib, since I remember a lot of weird edge cases with the stdlib
    • I based the code on MarkdownAST, so the particular reader should be easily swappable
  • I build my own tree representation off of MarkdownAST's tree, and parse that tree
    • I found it hard to work off the raw MarkdownAST tree: the parser needs to keep track of which section it is "within", but being "in" a section isn't represented in MarkdownAST's child/parent relationships (a heading and the content below it are siblings)
    • this adds another abstraction layer (string -> CommonMark AST -> MarkdownAST -> MarkdownHeadingTree), and it is somewhat leaky (we drop down to MarkdownAST's nodes frequently)
  • I try to support a range of heading and date formats rather than being strict
    • I would like to use this at the ecosystem level, and I think being permissive about inputs is good here
    • We could probably support quite a few more formats by extending the header regex and the dateformats
  • I've checked in some large Markdown changelogs from JuMP and Documenter as test inputs. The tests are still fast to run, and I'd like to have some real in-the-wild tests.
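
To illustrate the heading-tree bullet above: a minimal sketch of regrouping a flat, document-order sequence of blocks into a tree nested by heading level. It uses plain `(level, text)` tuples rather than MarkdownAST nodes, and `HeadingNode` and `build_heading_tree` are hypothetical names for illustration, not the types or functions added in this PR.

```julia
# Hypothetical stand-in for a heading-nested tree; not the package's actual type.
struct HeadingNode
    level::Int                      # 0 for the synthetic root, 1-6 for headings
    title::String
    content::Vector{String}         # non-heading blocks directly under this heading
    children::Vector{HeadingNode}   # sub-headings
end
HeadingNode(level, title) = HeadingNode(level, title, String[], HeadingNode[])

# `blocks` is a flat list of (level, text) pairs in document order.
# Level 0 marks non-heading content (paragraphs, list items, ...).
function build_heading_tree(blocks)
    root = HeadingNode(0, "")
    stack = [root]                  # path from the root to the current section
    for (level, text) in blocks
        if level == 0
            # Content belongs to whichever section is currently open.
            push!(last(stack).content, text)
        else
            # A new heading closes every open section at the same or deeper level.
            while last(stack).level >= level
                pop!(stack)
            end
            node = HeadingNode(level, text)
            push!(last(stack).children, node)
            push!(stack, node)
        end
    end
    return root
end

blocks = [
    (1, "Changelog"),
    (2, "v1.1.0"),
    (3, "Added"),
    (0, "- new feature"),
    (2, "v1.0.0"),
    (0, "- initial release"),
]
tree = build_heading_tree(blocks)
```

Here `tree.children[1]` is the `Changelog` heading, with the two version headings as its children, so "which section am I in" becomes an ordinary parent/child lookup.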
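
To illustrate the bullet on permissive heading and date formats above: a rough sketch of trying several date formats in turn behind one tolerant heading regex. The pattern, the format list, and the names `HEADING_RE`, `DATE_FORMATS`, `parse_date`, and `parse_version_heading` are all assumptions for illustration; the regex and date formats actually used in this PR may differ.

```julia
using Dates

# Hypothetical pattern: optional brackets and "v" prefix around the version,
# then an optional date after "-" or inside parentheses.
const HEADING_RE =
    r"^\[?v?(?<version>\d+\.\d+\.\d+)\]?(?:\s*[-(]\s*(?<date>[^)]+?)\)?)?\s*$"

# Hypothetical list of accepted formats; supporting more is just a longer tuple.
const DATE_FORMATS = (
    dateformat"yyyy-mm-dd",
    dateformat"U d, yyyy",
    dateformat"d U yyyy",
)

# Try each format in turn; return `nothing` if none of them fits.
function parse_date(str)
    for df in DATE_FORMATS
        d = tryparse(Date, str, df)
        d === nothing || return d
    end
    return nothing
end

# Returns a NamedTuple (version, date), or `nothing` for non-version headings.
function parse_version_heading(text)
    m = match(HEADING_RE, strip(text))
    m === nothing && return nothing
    date = m[:date] === nothing ? nothing : parse_date(m[:date])
    return (version = String(m[:version]), date = date)
end

bracketed = parse_version_heading("[1.2.0] - 2024-12-27")
parenthesized = parse_version_heading("v1.2.0 (December 27, 2024)")
undated = parse_version_heading("1.0.0")
```

Headings that don't look like a version (for example "Unreleased") return `nothing` in this sketch, so the caller can decide how to treat them.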

@ericphanson (Collaborator, Author) commented:

> if a version contains sections ("Added", "Breaking", etc), then any unsectioned-notes are placed under "General"
> alternatively, we could have sectioned-changes and unsectioned-changes always (i.e. two separate fields sectioned_changes::OrderedDict{String,Vector{String}} and unsectioned_changes::Vector{String}).
> or, alternatively, we could have only changes::OrderedDict{String,Vector{String}} and when there are no sections, use "General"
> not really sure which is best

After some thought, I think it's better to have two fields, toplevel_changes and sectioned_changes. Introducing an artificial "General" category may confuse users, and it adds complexity to the implementation.
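
A minimal sketch of what the two-field layout could look like. Only the field names `toplevel_changes` and `sectioned_changes` come from the comment above; the struct name `VersionEntry`, the helper functions, and the use of a `Vector` of `Pair`s (a stdlib-only stand-in for the `OrderedDict{String,Vector{String}}` mentioned earlier) are assumptions for illustration.

```julia
# Hypothetical struct; only the two `*_changes` field names come from the PR.
struct VersionEntry
    version::String
    # Notes that appear directly under the version heading, outside any section.
    toplevel_changes::Vector{String}
    # Ordered section => changes. A Vector of Pairs keeps insertion order using
    # only Base; the PR discusses an OrderedDict{String,Vector{String}} instead.
    sectioned_changes::Vector{Pair{String,Vector{String}}}
end

# All changes for a version, top-level notes first, then each section in order.
function all_changes(v::VersionEntry)
    out = copy(v.toplevel_changes)
    for (section, changes) in v.sectioned_changes
        append!(out, changes)
    end
    return out
end

# Changes in one named section, or an empty list if the section is absent.
function section_changes(v::VersionEntry, name)
    for (section, changes) in v.sectioned_changes
        section == name && return changes
    end
    return String[]
end

entry = VersionEntry(
    "1.1.0",
    ["Minor cleanups."],
    ["Added" => ["New parser."], "Breaking" => ["Dropped old API."]],
)
```

With this layout a version that has no sections simply has an empty `sectioned_changes`, and nothing needs to be filed under an invented "General" heading.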

I've also renamed SimpleLog -> SimpleChangelog. I don't like how SimpleLog looks or sounds: it suggests some kind of logging utility rather than a changelog, and I want "Changelog" in the name, since that's the name of the package. (It's too late to have Changelogs.jl with a Changelog struct, unfortunately.)

@ericphanson ericphanson merged commit 791e67a into JuliaDocs:master Jan 17, 2025
3 checks passed
@ericphanson ericphanson deleted the eph/parser branch January 17, 2025 00:00
Linked issue closed by this pull request: scope question: add changelog parsing code?