
Exporting the database

Jorge Castro edited this page Apr 7, 2022 · 31 revisions

1. Creating a dump of a locally hosted DB instance (Logical backup)

MySQL Shell has three utility functions for dumping tables, schemas, and instances:

  • util.dumpTables()

  • util.dumpSchemas()

  • util.dumpInstance()

For loading:

  • util.loadDump()

Here I will be focusing on the instance dump. This function accepts an array of options, granular enough to specify exactly what we do or do not need in our instance dump. Dump Instance dumps the routines, views, the data, and users. The mysql, information_schema, performance_schema, and sys schemas are ignored.

Before running the dump, we have the option to do a dry run, which will only show us what actions would be performed when we run the utility for real with the options we select.

util.dumpInstance("C:/Users/jcast/testdump/my_instance_dump", {dryRun: true})
ℹ️
  • MySQL Shell must be run as Administrator; otherwise we get an error message:

    `Util.dumpSchemas: Could not create directory
    \\?\C:\Program Files\MySQL\MySQL Shell 8.0\bin\sak-aws:
    Access is denied. (RuntimeError)`
  • The dump and load utilities must be run in JavaScript or Python mode.

  • The minimum required set of privileges of the user account used to run the utility are: BACKUP_ADMIN, EVENT, RELOAD, SELECT, SHOW VIEW, and TRIGGER.

  • Depending on the use case, if the binary log is required to be included in the dump, the user running the utility needs the REPLICATION CLIENT privilege. Otherwise the dump continues as normal but without the binary log, and a warning message is displayed:

    WARNING: Could not fetch the binary log information: MySQL Error 1227 (42000):
    Access denied; you need (at least one of) the SUPER, REPLICATION CLIENT privilege(s) for this operation
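For reference, the minimum set of privileges listed above could be granted to a dedicated backup account along these lines (the account name backup_user and its password are placeholders for illustration, not part of the original setup):

```sql
-- Hypothetical backup account; adjust name, host, and password to your environment.
CREATE USER 'backup_user'@'localhost' IDENTIFIED BY 'choose_a_strong_password';

-- Minimum privileges required to run the dump utility.
GRANT EVENT, RELOAD, SELECT, SHOW VIEW, TRIGGER ON *.* TO 'backup_user'@'localhost';
GRANT BACKUP_ADMIN ON *.* TO 'backup_user'@'localhost';

-- Optional: only needed if the binary log information should be included in the dump.
GRANT REPLICATION CLIENT ON *.* TO 'backup_user'@'localhost';
```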

The first argument of the function is the path where the dump will be stored. If only a name is specified rather than a path, a new directory with the given name is created at C:\Program Files\MySQL\MySQL Shell 8.0\bin. Note that a directory is created, not a single file. Next we can specify options controlling how we want the dump created.

  • The process of the dump:

    • The process uses multiple threads to perform its tasks. Each thread opens its own connection to the target server and can be dumping data, dumping DDL, or splitting table data into chunks.

    • A global read lock is set by the global Shell session using the FLUSH TABLES WITH READ LOCK[1] statement to get the GTID position[2] and open all the connections. This ensures a consistent view of the data.

    • The DDL and SQL files are created. There is one .sql file and one .json metadata file for every table.

    • Chunks of data are created and compressed by default (zstd compression).[3]


\connect backup_user@localhost
util.dumpInstance("C:/Users/jcast/data/my_instance_dump", {compatibility: ["strip_definers"]})

Data files produced by the MySQL Shell dump utilities include DDL files for creating the schema structure, compressed .tsv files that contain the data, and .json metadata files.

Unlike the mysqldump utility, which outputs only one file, the dump schema, instance, and table utilities generate a dump directory containing several files, so the dump is divided into several chunks.

The utility does this to increase data-streaming performance. Instead of one file with CREATE TABLE statements executed one after the other, the tables are spread across different chunks and loaded in parallel, all at the same time, with a high number of simultaneous thread connections to the new instance.
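To take advantage of this parallelism on the receiving side, the dump can be loaded with util.loadDump(), which also accepts a threads option. A minimal sketch, assuming the dump directory from the earlier example and an illustrative admin account:

```js
\connect admin_user@localhost
// Preview the load first; remove dryRun to perform it for real.
util.loadDump("C:/Users/jcast/data/my_instance_dump", {threads: 8, dryRun: true})
```

Note that loading requires the local_infile system variable to be enabled on the target server.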


1.1. Creating a dump directly onto a cloud storage bucket

The MySQL Shell dump utilities support exporting directly into an Oracle Cloud Infrastructure Object Storage bucket. However, exporting a dump to an AWS S3 bucket is currently not supported:

⚠️
Util.dumpInstance: Directory handling for s3 protocol is not supported. (ArgumentError)
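For completeness, a dump to an OCI Object Storage bucket is requested with the osBucketName and osNamespace options; the bucket name, namespace, and config-file path below are placeholders, not values from the original setup:

```js
// Dump straight into an OCI bucket instead of a local directory.
util.dumpInstance("my_instance_dump", {
    osBucketName: "my-bucket",
    osNamespace: "my-namespace",
    ociConfigFile: "C:/Users/jcast/.oci/config"
})
```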

2. Options

Here is a recap of the options available for this function. For more details on these options, see the documentation for the Dump Instance Utility linked at the footer of this page.

Dump: Options for Dump Control

| Option | Value | Default | Example | Description |
| --- | --- | --- | --- | --- |
| dryRun | true / false | false | `util.dumpInstance("C:/data/instance_dump", {dryRun: true})` | Display information about what would be dumped with the specified set of options |
| threads | integer | 4 | `util.dumpInstance("C:/data/inst_dump", {threads: 88})` | The number of parallel threads to use to dump chunks of data from the MySQL instance |
| defaultCharacterSet | "string" | utf8mb4 | `util.dumpInstance("C:/data/inst_dump", {defaultCharacterSet: "utf8mb4"})` | The character set to be used during the session connections that are opened by MySQL Shell to the server for the dump |


Dump: Options for filtering

| Option | Value | Default | Example | Description |
| --- | --- | --- | --- | --- |
| dataOnly | true / false | false | `util.dumpInstance("C:/data/instance_dump", {dataOnly: true})` | Setting this option to true includes only the data files for the dumped items in the dump, and does not include DDL files |
| users | true / false | true | `util.dumpInstance("C:/data/inst_dump", {users: false})` | Include (true) or exclude (false) users and their roles and grants in the dump |
| excludeUsers | array of strings | ---- | `util.dumpInstance("C:/data/inst_dump", {excludeUsers: ["'user_name'@'host_name'"]})` | Exclude the named user accounts from the dump files |
| includeUsers | array of strings | ---- | `util.dumpInstance("C:/data/inst_dump", {includeUsers: ["'user_name'@'host_name'"]})` | Include only the named user accounts in the dump files |
| excludeSchemas | array of strings | ---- | `util.dumpInstance("C:/data/inst_dump", {excludeSchemas: ["sakila", "bank"]})` | Exclude the named schemas from the dump. Note that the information_schema, mysql, ndbinfo, performance_schema, and sys schemas are always excluded from an instance dump |
| includeSchemas | array of strings | ---- | `util.dumpInstance("C:/data/inst_dump", {includeSchemas: ["sakila", "bank"]})` | Include only the named schemas in the dump. If you want to dump one or more of these schemas, you can also use the schema dump utility util.dumpSchemas() |
| events | true / false | true | `util.dumpInstance("C:/data/inst_dump", {events: false})` | Include (true) or exclude (false) events for each schema in the dump |
| routines | true / false | true | `util.dumpInstance("C:/data/inst_dump", {routines: false})` | Include (true) or exclude (false) functions and stored procedures for each schema in the dump |

The following options apply to dumps intended to be loaded into MySQL Database Service on Oracle Cloud Infrastructure. They also apply to AWS RDS.

| Option | Value | Default | Example | Description |
| --- | --- | --- | --- | --- |
| compatibility | strip_definers | ---- | `util.dumpInstance("C:/Users/jcast/data/my_instance_dump", {threads: 44, compatibility: ["strip_definers"]})` | Remove the DEFINER clause from views, routines, events, and triggers, so these objects are created with the default definer (the user invoking the schema), and change the SQL SECURITY clause for views and routines to specify INVOKER instead of DEFINER. Cloud databases require special privileges to create these objects with a definer other than the user loading the schema. If your security model requires that views and routines have more privileges than the account querying or calling them, you must manually modify the schema before loading it. |
| compatibility | strip_restricted_grants | ---- | `util.dumpInstance("C:/Users/jcast/data/my_instance_dump", {compatibility: ["strip_definers", "strip_restricted_grants"]})` | Remove specific privileges that are restricted by MySQL Database Service from GRANT statements, so users and their roles cannot be given these privileges (which would cause user creation to fail). |

3. Creating a dump of the entire database instance (Total of 10 schemas. 60 million rows)

To create this dump I used a localhost account with the following privileges:

SELECT, RELOAD, SHOW VIEW, EVENT, TRIGGER, BACKUP_ADMIN, REPLICATION CLIENT.

The options for the dump I used:

  • "strip_definers": This option removes the DEFINER clause from routines, events, triggers in order for these objects to be created with the user making the dump rather than leaving them with the user that initially created them (the DEFINER)




1. `FLUSH TABLES WITH READ LOCK` closes all open tables and locks all tables for all databases with a global read lock.
2. Traditional MySQL replication is based on relative coordinates — each replica keeps track of its position with respect to its current primary’s binary log files. GTID enhances this setup by assigning a unique identifier to every transaction, and each MySQL server keeps track of which transactions it has already executed. This permits “auto-positioning,” the ability for a replica to be pointed at a primary instance without needing to specify a binlog filename or position in the CHANGE REPLICATION SOURCE TO statement.
3. Zstandard is a fast compression algorithm, providing high compression ratios. It also offers a special mode for small data, called dictionary compression. The reference library offers a very wide range of speed / compression trade-off, and is backed by an extremely fast decoder. Zstandard library is provided as open source software using a BSD license.