sql: DROP DATABASE takes a looooong time #14279

Closed
danhhz opened this issue Mar 20, 2017 · 7 comments

@danhhz
Contributor

danhhz commented Mar 20, 2017

We have a TeamCity build that backs up a production cluster and restores it into a new database, _full. It then takes an incremental backup on top of the first one and uses both to restore into a second new database, _inc. Finally, it drops the two databases, _full and _inc. In a recent run the backups each took about 1m, the restores took 10m, and the first DROP DATABASE took 62 minutes. (As of the time of writing, the second one hadn't finished.) The amount of data being dropped was about 2.8GB.

https://teamcity.cockroachdb.com/viewLog.html?buildId=188347

@knz knz added this to the 1.0 milestone Mar 20, 2017
@knz knz added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-performance Perf of queries or internals. Solution not expected to change functional behavior. labels Mar 20, 2017
@vivekmenezes
Contributor

We should look at using RocksDB's DB::DeleteRange for faster deletion of both tables and indexes.

@tamird
Contributor

tamird commented Mar 20, 2017

I don't think that's right. Deleting tables and indexes needs to go through our MVCC layer; we can't just obliterate the data with RocksDB's DeleteRange.

@bdarnell
Contributor

Well, we currently go through the MVCC layer, and this allows us to support ongoing queries (or new time-travel queries) across DROP TABLE boundaries, but in many cases this is not a requirement, and it might be nice to at least have the ability to opt in to a faster DROP TABLE that did not go through the MVCC layer (except to fix up the MVCC stats after the DeleteRange). This would both improve the performance of the drop itself and reduce the time before the disk space is freed (instead of waiting for a 24h GC cycle).
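
To make the trade-off concrete, here is a minimal sketch in Go of a toy versioned store. Everything in it (`toyStore`, `mvccDeleteRange`, `fastDeleteRange`, integer timestamps) is hypothetical and invented for illustration; it is not CockroachDB's MVCC layer or RocksDB's API. It only shows why tombstone-based deletion keeps time-travel reads working while a raw range deletion does not.

```go
package main

import "sort"

// version is one timestamped version of a key. A tombstone marks a deletion.
type version struct {
	ts        int64
	value     string
	tombstone bool
}

// toyStore is a hypothetical in-memory versioned store (illustration only).
type toyStore struct {
	data map[string][]version // versions per key, ascending by ts
}

func newToyStore() *toyStore {
	return &toyStore{data: make(map[string][]version)}
}

// add appends a version and keeps the per-key history sorted by timestamp.
func (s *toyStore) add(key string, v version) {
	vs := append(s.data[key], v)
	sort.SliceStable(vs, func(i, j int) bool { return vs[i].ts < vs[j].ts })
	s.data[key] = vs
}

func (s *toyStore) put(key, value string, ts int64) {
	s.add(key, version{ts: ts, value: value})
}

// get reads the newest version at or below ts; tombstones read as "not found".
func (s *toyStore) get(key string, ts int64) (string, bool) {
	vs := s.data[key]
	for i := len(vs) - 1; i >= 0; i-- {
		if vs[i].ts <= ts {
			if vs[i].tombstone {
				return "", false
			}
			return vs[i].value, true
		}
	}
	return "", false
}

// mvccDeleteRange writes one tombstone per live key in [start, end).
// Older versions stay readable; the cost is one write per key.
func (s *toyStore) mvccDeleteRange(start, end string, ts int64) int {
	n := 0
	for k := range s.data {
		if k < start || k >= end {
			continue
		}
		if _, live := s.get(k, ts); live {
			s.add(k, version{ts: ts, tombstone: true})
			n++
		}
	}
	return n
}

// fastDeleteRange discards every version of every key in [start, end).
// History is gone immediately, so reads at old timestamps find nothing.
func (s *toyStore) fastDeleteRange(start, end string) int {
	n := 0
	for k := range s.data {
		if k >= start && k < end {
			delete(s.data, k)
			n++
		}
	}
	return n
}
```

After `mvccDeleteRange`, a `get` at a timestamp before the deletion still returns the old value; after `fastDeleteRange`, the same read finds nothing. The real fast path would also need the MVCC stats fix-up mentioned above, which this sketch does not model.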

@tamird
Contributor

tamird commented Mar 20, 2017

That's fine, but I'm not sure that's what this issue is about. It seems that the DROP TABLE operation is much slower than the restore, suggesting that there is lower-hanging fruit to be picked here.

@bdarnell
Contributor

I'm sure there's a lot of room for improvement while still using MVCC, but restore is as fast as it is because it bypasses the MVCC layer.

@vivekmenezes
Contributor

While we can work on the actual performance of DROP TABLE, a user-preferred solution would be to declare the drop done as soon as the name is available for reuse and run the actual drop in the background. This can be implemented by making the schema changer aware of whether it is associated with a session, so that it runs only the "release name" part of the code when run from the session and the rest of the drop table code from the async schema changer.
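
As a rough sketch of that split, assuming a simplified in-memory catalog: the `Catalog` type and its methods below are hypothetical names made up for illustration, not the schema changer's real code. `DropTable` does only the "release name" step and returns; `RunBackgroundGC` stands in for the asynchronous path that later deletes the data.

```go
package main

import (
	"errors"
	"sync"
)

// Catalog is a hypothetical stand-in for the table name -> table ID mapping.
type Catalog struct {
	mu      sync.Mutex
	names   map[string]int // table name -> table ID
	pending []int          // IDs whose data still awaits background deletion
}

func NewCatalog() *Catalog {
	return &Catalog{names: make(map[string]int)}
}

// CreateTable claims a name for the given table ID.
func (c *Catalog) CreateTable(name string, id int) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, ok := c.names[name]; ok {
		return errors.New("table name already in use")
	}
	c.names[name] = id
	return nil
}

// DropTable performs only the "release name" step: the name becomes
// reusable immediately and the table ID is queued for later cleanup.
func (c *Catalog) DropTable(name string) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	id, ok := c.names[name]
	if !ok {
		return errors.New("table does not exist")
	}
	delete(c.names, name)
	c.pending = append(c.pending, id)
	return nil
}

// RunBackgroundGC drains the queue and calls deleteData for each dropped
// table ID. It returns how many tables were processed.
func (c *Catalog) RunBackgroundGC(deleteData func(id int)) int {
	c.mu.Lock()
	ids := c.pending
	c.pending = nil
	c.mu.Unlock()

	// The slow work happens outside the lock so new DDL is not blocked.
	for _, id := range ids {
		deleteData(id)
	}
	return len(ids)
}
```

With this shape, a client can drop a table and immediately create a new one under the same name, while the old table's data is removed whenever the background pass runs.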

@spencerkimball
Member

Perhaps it was a mistake to make DROP TABLE work like TRUNCATE TABLE. I agree with @vivekmenezes that we should start by removing the table name -> table ID mapping in the schema and let clients continue. But instead of doing the MVCC deletion in the background, it seems considerably more efficient to have a different path for the actual deletion of the table, which would schedule DeleteRange calls to delete the underlying data according to the zone config TTL.
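
One way to picture the scheduling part, as a sketch only: record when each table was dropped along with its TTL, and periodically ask which tables are past their deadline. The `droppedTable` type and `dueForDeletion` function are hypothetical, and the TTL is assumed to have been read from the zone config elsewhere; the actual range deletion is left to the caller.

```go
package main

import (
	"sort"
	"time"
)

// droppedTable records a dropped table awaiting deletion of its data.
// TTL is assumed to come from the table's zone config (hypothetical field).
type droppedTable struct {
	ID       int
	DropTime time.Time
	TTL      time.Duration
}

// deadline is the earliest time at which the table's data may be deleted.
func (d droppedTable) deadline() time.Time {
	return d.DropTime.Add(d.TTL)
}

// dueForDeletion returns the IDs of tables whose TTL has elapsed at now,
// ordered by deadline so the longest-overdue table comes first.
func dueForDeletion(tables []droppedTable, now time.Time) []int {
	var due []droppedTable
	for _, tbl := range tables {
		if !now.Before(tbl.deadline()) {
			due = append(due, tbl)
		}
	}
	sort.Slice(due, func(i, j int) bool {
		return due[i].deadline().Before(due[j].deadline())
	})
	ids := make([]int, 0, len(due))
	for _, tbl := range due {
		ids = append(ids, tbl.ID)
	}
	return ids
}
```

A background loop could call `dueForDeletion` on a timer and issue one range deletion per returned ID, covering that table's key span.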

@vivekmenezes vivekmenezes modified the milestones: 1.1, 1.0 Apr 19, 2017
vivekmenezes added a commit to vivekmenezes/cockroach that referenced this issue Jul 17, 2017
The DROP TABLE is deemed complete as soon as the table name is no
longer in use. The table data GC cleanup is executed asynchronously
through the asynchronous schema change path, and can be made more
performant later.

fixes cockroachdb#14279

related to cockroachdb#2003