Special Things About OSC
Facebook's OnlineSchemaChange (aka OSC) isn't meant to replace every existing online schema change solution. Instead, we built it to fill the gaps in existing approaches that matter at Facebook.
At Facebook we value data consistency above everything else. Any human-written tool is vulnerable to bugs, so it is very important that OSC can bail out before corrupting existing data, something not every existing online schema change tool supports.
Apart from that, Facebook internally has a limited variety of workloads compared with those found in the MySQL community. Running a data consistency check before bringing the new schema into production is also our way of taking responsibility for the data of community users who use this tool.
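To make the bail-out idea concrete, here is a minimal sketch of a consistency check that aborts before cutover. It is not OSC's actual implementation: the names `table_checksum`, `verify_before_cutover` and `ChecksumMismatch` are hypothetical, and rows are assumed to be already fetched into memory as tuples, which a real tool would avoid for large tables.

```python
import hashlib
from typing import Iterable, Sequence


class ChecksumMismatch(Exception):
    """Hypothetical error raised to abort the schema change before cutover."""


def table_checksum(rows: Iterable[Sequence[object]]) -> str:
    """Return an order-independent SHA-256 digest over all rows."""
    digest = hashlib.sha256()
    # Sort on the textual form so that row order does not affect the result.
    for row in sorted(rows, key=repr):
        digest.update(repr(tuple(row)).encode("utf-8"))
    return digest.hexdigest()


def verify_before_cutover(old_rows, new_rows) -> None:
    """Raise instead of swapping tables when the two copies disagree."""
    if table_checksum(old_rows) != table_checksum(new_rows):
        raise ChecksumMismatch("shadow table differs from the original; aborting")


# Same data in a different order passes the check.
verify_before_cutover([(1, "a"), (2, "b")], [(2, "b"), (1, "a")])
```

The important design point is that a mismatch raises before anything is swapped, so the original table stays untouched.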
The native ALTER TABLE has some limitations and cannot cover every schema change scenario, for example ignoring duplicates while adding a unique index on InnoDB. We want OSC to cover all of our daily schema change requests, so we implemented a superset of the operations supported by the MySQL server.
Using asynchronous reload and replay logic internally, OSC can test a new schema against the replication stream before putting it into use. As you may know, MySQL supports replicating into a different table schema as long as certain rules are followed. This lets us perform schema changes on replicas first as a test, but it also makes it possible to break the replicas with an improper schema change operation. Using OSC instead of a direct ALTER TABLE gives you a chance to detect this before breaking production.
For write-intensive workloads, it is very important that OSC replays changes faster than new writes arrive. Otherwise the shadow table will always lag behind the table in use. We use grouping logic during the replay stage to speed up the catch-up. Thanks to this optimization, for certain workloads OSC can replay changes 4x faster than the replication stream, which makes schema changes on heavily written tables possible.
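One way to picture the grouping idea is to merge consecutive captured changes of the same type into a batch, so that a single statement can replay many rows. The sketch below is an illustration under assumptions, not OSC's actual code: the `(id, type)` change format, the `group_changes` function and the `max_batch` parameter are all hypothetical.

```python
from itertools import groupby
from typing import Iterable, Iterator, List, Tuple

# Assumed change record: (change id, DML type such as "insert" or "delete").
Change = Tuple[int, str]


def group_changes(
    changes: Iterable[Change], max_batch: int = 3
) -> Iterator[Tuple[str, List[int]]]:
    """Yield (dml_type, [ids]) batches while preserving the original order."""
    # groupby only merges adjacent items, so changes are never reordered
    # across a different DML type.
    for dml_type, run in groupby(changes, key=lambda change: change[1]):
        ids = [change_id for change_id, _ in run]
        # Cap each batch so a single replay statement stays small.
        for start in range(0, len(ids), max_batch):
            yield dml_type, ids[start:start + max_batch]


changes = [
    (1, "insert"), (2, "insert"), (3, "delete"),
    (4, "insert"), (5, "insert"), (6, "insert"), (7, "insert"),
]
batches = list(group_changes(changes, max_batch=3))
```

Here seven recorded changes collapse into four batches. Grouping only adjacent changes keeps the replay order intact, which matters when an insert and a delete touch the same row.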
Since MyRocks has been widely adopted at Facebook, OSC has also proven itself successful with the new MySQL engine.
Shelling out to execute a binary tool isn't a good way to write automation. That's why we made OSC's main logic a standalone Python class. If you want to integrate OSC deeply into your infrastructure and you happen to use Python, interacting with OSC will be much easier than with any other schema change tool.
By using OSC as a Python class, you get more insight into OSC's progress, more control over how OSC runs, and so on.