Commit
Add data migration to delete old link check reports
Whitehall introduced the link checker API report concept in Nov 2017: 03c4734.

Link check reports are never deleted, even ones associated with superseded editions, and editions can also be associated with multiple link check reports. Consequently, at time of writing, there are over 12 million reports in Whitehall, making refactoring the modelling challenging.

This PR makes the case that keeping all of these historic link check reports around is not valuable. The old reports are never surfaced to users, and even if they were, how valuable is it to be able to know that 'at the time' a given link was ok (or not)? No, there is only really value in 'recent' reports. One could even argue that a report generated yesterday isn't that useful, as what was a good link yesterday may have become 'bad' overnight. Users can trivially generate new link check reports by hitting the relevant button in Whitehall's UI.

All that being said, we see little value in retaining any link check reports generated before 2025. Even if we have logic that, say, prevents publication of a document that is missing a link check report, and we have an edge case where a document was scheduled in 2024 for publication in 2025, that should be mitigated by the fact that we're planning to kick off a batch of new link check reports for all draft/published editions, as part of https://trello.com/c/tmnht4P1/.

Before: 11787828 records
After: 446009 records (locally, from slightly stale data)
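
To illustrate the shape of such a clean-up, here is a minimal sketch in plain Ruby of deleting everything created before a cutoff in fixed-size batches. It is not the migration from this commit: the `Report` struct, its `created_at` field, the `delete_old_reports` helper and the batch size are all hypothetical stand-ins, and an in-memory array takes the place of the database table.

```ruby
# Hypothetical stand-in for a link check report row. The real model and its
# column names are not shown in the commit message.
Report = Struct.new(:id, :created_at)

# Assumed cutoff: anything generated before 2025 is considered disposable.
CUTOFF = Time.utc(2025, 1, 1)

# Removes reports created strictly before the cutoff, one batch at a time,
# and returns the number of records removed. Mutates the array in place.
def delete_old_reports(reports, cutoff: CUTOFF, batch_size: 1000)
  deleted = 0
  loop do
    # Pick the next batch of stale reports.
    batch = reports.select { |r| r.created_at < cutoff }.first(batch_size)
    break if batch.empty?

    ids = batch.map(&:id)
    reports.reject! { |r| ids.include?(r.id) }
    deleted += batch.size
  end
  deleted
end
```

Working in batches keeps each delete small, which matters when roughly 11.3 million of 11.8 million rows (about 96%, going by the counts above) are being removed. In a Rails data migration the same idea would more likely be expressed against an ActiveRecord relation, for example with `in_batches` followed by `delete_all`, but that is an assumption about the approach rather than a description of this commit.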