Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

research & discuss database migration strategies #1692

Closed
yusefnapora opened this issue Mar 22, 2022 · 7 comments
Closed

research & discuss database migration strategies #1692

yusefnapora opened this issue Mar 22, 2022 · 7 comments
Labels
effort/days Estimated to take multiple days, but less than a week kind/architecture Core architecture of project kind/discussion Topical discussion; usually not changes to codebase pi/support-0 stack/quality-velocity topic/devexp Developer Experience topic/infra Infrastructure

Comments

@yusefnapora
Copy link
Contributor

This issue is just to track the discussion around DB migrations - not the implementation, which will happen in a few weeks once we've decided what we want to do.

Where we are

DB schemas are checked into git, but schema changes need to be manually applied to deployed environments, which is stressful and scary.

Where we want to be

Schema changes are automatically applied in CI, with tests to validate the change. The same migrations that run in local development are applied on staging and production databases, and we're confident that success in local and staging envs correlates with success in production.

Possible solutions

So far we're looking at two ends of the "magic spectrum":

  • Sqitch is unopinionated and "non-magical"... it bills itself as a "change management system," where you supply the SQL (or whatever) that needs to be applied, and it makes sure it happens.
  • Prisma is a full ORM, so it's all about the magic. You define your data model in a custom DSL, and it derives DB migrations from it. It also includes type safe DB "clients" for querying and so on. It can introspect an existing DB schema to produce a prisma data model for the initial migration. Future migrations happen by changing the prisma model.

There are other tools that fit somewhere in between these two philosophies... if anyone has a favorite, please chime in.

@yusefnapora yusefnapora added kind/architecture Core architecture of project kind/discussion Topical discussion; usually not changes to codebase effort/days Estimated to take multiple days, but less than a week topic/infra Infrastructure topic/devexp Developer Experience pi/support-0 stack/quality-velocity labels Mar 22, 2022
@vasco-santos
Copy link
Contributor

There are 5 requirements I could think of:

  • apply latest migration script version from repo path
  • apply migration script on staging DB on merge to main
  • apply migration script on production DB on release on API
  • support rollbacks
  • support writes disabled

For the first three, we need to hook up github action workflows to perform them with one of the mentioned tools. Flyway also seems a good candidate and they already have a github action.

(probably not needed for first iteration) Supporting rollbacks also seems important, both for staging and production. We will merge to main and get the migration to run in Staging DB. In the event of something not working properly, we should be able to roll it back. Same for production, as production environment has a different scale that sometimes might raise issues that we don't replicate in staging. For this part, we should probably have a manual github action workflow job that could be triggered via Github Actions UI to go back.

(probably not needed for first iteration) Regarding support for writes disabled. Sometimes we will need to perform migrations that require writes to be disables temporarily. For instance, the backup urls migration that meant locking Upload table for a while. We would probably need to have a marker that marked need for this type of release. It would be great if we could have all this release setup to accommodate this, but this might also be complex to achieve.

We currently need to do everything manually, so probably we can iterate with first 3 items first, then we can add the remaining two items and decrease the needed overhead over time.

@olizilla
Copy link
Contributor

olizilla commented Jun 7, 2022

I'd need some convincing to adopt an external service to handle db migrations. We already have a custom cli for handling more complicated buiild issues... I think we should explore the CI side for testing on staging

  • Store migrations as sql file with a version number
  • Store db schema version number in schema
  • Script to copy a snapshot of the staging db for rollback if required.
  • Apply the migrations to take schema x to schema y
  • Run db tests

For going to production I would like to be able to run the migration by running a single cli command. For most migrations I wouldn't be comfortable leaving it to CI, as folks don't always follow along with the the github actions logs to see what happened. I feel like prod db migrations, should be attended by a human, but we should package up the step to run as a single command.

@vasco-santos
Copy link
Contributor

vasco-santos commented Jun 7, 2022

For going to production I would like to be able to run the migration by running a single cli command.

I am ok with that, but behind a github action with a manual trigger where we can grant permissions to more people in the team without complete access to Database

@vasco-santos
Copy link
Contributor

Other detail: I noticed that is difficult to "manage" different PRs in flight with migration scripts. Happened a few times two PRs from two different people, reviewed also by two different people resulting into two SQL migration scripts with same number.

We need to take into consideration this and guarantee that nothing bad happens for these cases.

@yusefnapora
Copy link
Contributor Author

@vasco-santos maybe we should use timestamps instead of sequence numbers for versioning migrations?

@yusefnapora
Copy link
Contributor Author

Script to copy a snapshot of the staging db for rollback if required.

This is a great idea and hopefully gets rid of the need for explicit "undo migrations" to reverse things. Which I never quite trust anyway, since they seem like some of the least likely things to get tested in conditions approximating production (or whatever state production is in after the failed migration, which you probably didn't plan for when you were writing the rollback code).

In other words, if the bug makes it past the staging tests and still causes a production issue, you probably want to take a minute to figure out how to "roll forward" instead of trusting a script you wrote before you understood the issue 😄

@vasco-santos
Copy link
Contributor

@vasco-santos maybe we should use timestamps instead of sequence numbers for versioning migrations?

A timestamp would be cool. Probably making it responsibility of developer will be complex and not avoid different order in the end. It would be super cool if we had a workflow job that on merge to migrations folder would rename file and prepend timestamp

--

After spending some time thinking, I also agree that we can rely on a custom setup to run SQL scripts that are added. We need to see what migration files were added from latest release and run them with a manual github action trigger. Probably we do the same for staging to require human input and guarantee someone sees if the migration worked fine?

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
effort/days Estimated to take multiple days, but less than a week kind/architecture Core architecture of project kind/discussion Topical discussion; usually not changes to codebase pi/support-0 stack/quality-velocity topic/devexp Developer Experience topic/infra Infrastructure
Projects
None yet
Development

No branches or pull requests

4 participants