This repository has been archived by the owner on Mar 9, 2019. It is now read-only.

Compact db #460

Merged
merged 2 commits into from
Sep 6, 2016

Conversation

vincent-petithory
Contributor

This adds a compact command to the bolt tool. It recursively inspects a bolt database and copies its keys to a new database.
This usually results in a smaller database, especially if many keys with large values had been deleted. See #308

Thanks for taking a look at this!

  compact rewrites a bolt db, recursively walking its keys
  in byte order.
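
For readers unfamiliar with the approach, the recursive walk-and-copy described above can be sketched on a toy in-memory tree. This is only an illustration: the `node` type and the `walk` and `compact` functions below are hypothetical stand-ins and are not bolt's API or the code in this PR. The sketch assumes each key holds either a value or a nested bucket, and it visits keys in byte order, creating every bucket in the destination before copying the keys inside it.

```go
package main

import "sort"

// node is a toy stand-in for an entry in a bolt bucket (NOT bolt's API):
// a key maps either to a value or to a nested bucket.
type node struct {
	value  []byte
	bucket map[string]*node // non-nil means this key is a nested bucket
}

// walkFn is called once per key. path names the enclosing buckets.
type walkFn func(path []string, key string, value []byte, isBucket bool)

// walk visits every key under b in byte order and descends into nested
// buckets right after reporting them, so parents are always seen first.
func walk(b map[string]*node, path []string, fn walkFn) {
	keys := make([]string, 0, len(b))
	for k := range b {
		keys = append(keys, k)
	}
	sort.Strings(keys) // Go compares strings bytewise
	for _, k := range keys {
		n := b[k]
		if n.bucket != nil {
			fn(path, k, nil, true)
			// The full slice expression forces append to copy, so sibling
			// branches never share a backing array.
			walk(n.bucket, append(path[:len(path):len(path)], k), fn)
			continue
		}
		fn(path, k, n.value, false)
	}
}

// compact rebuilds src key by key into a fresh tree.
func compact(src map[string]*node) map[string]*node {
	dst := map[string]*node{}
	walk(src, nil, func(path []string, key string, value []byte, isBucket bool) {
		cur := dst
		for _, name := range path {
			cur = cur[name].bucket // already created when the parent was visited
		}
		if isBucket {
			cur[key] = &node{bucket: map[string]*node{}}
			return
		}
		cur[key] = &node{value: append([]byte{}, value...)}
	})
	return dst
}
```

Because the destination is built only from live keys, nothing left behind by deleted keys is carried over, which is where the size reduction would come from in a real database.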
@jacek99

jacek99 commented Jan 6, 2016

This is moderately useful, but forces downtime as the app has to go down and be restarted pointing to a new DB.

It would be great if this could be done on the current DB file while it is being used, to enable high availability (similar to the way Cassandra does compaction while staying online all the time).

@chrsm

chrsm commented Jan 6, 2016

See also #423

@mdlayher
Contributor

Cool stuff, would love to see this integrated into the tool, at least.

Online compaction would be great, but I certainly think this is a nice feature to have regardless.

@tg

tg commented Jul 28, 2016

Just ran this on my 7GB db file and it was successfully compacted to 2GB. Happy days! A few comments:

  1. I think the source DB should be opened in read-only mode; otherwise the source DB gets modified (when testing on a 167MB sample DB I actually saw the source DB inflate to 256MB). This is an issue with all bolt CLI commands, such as check or stats, so maybe it should be a separate issue.
  2. I would suggest that the output DB share the same file permissions as the source DB rather than 0666. We already call os.Stat(path), so this is as simple as taking Mode().Perm() from the result and passing it to bolt.Open.
  3. Would it make sense to have a sensible non-zero default for -tx-max-size? I bet this would save a few souls from killing their machines while running compaction on very large DBs without reading the help carefully. If someone wanted to optimise the process by using a single transaction, they would need to do so explicitly.
  4. As I mentioned in "Question: Shrink BoltDB File?" #423 (comment), this compaction approach doesn't copy bucket sequence numbers, and as far as I can see there is no way to do this using the public interface. I guess most people don't use this functionality anyway, so we should at least add a warning to the help message, or maybe detect buckets with non-zero sequence numbers and fail the compaction.

@benbjohnson benbjohnson mentioned this pull request Sep 5, 2016
@benbjohnson
Member

Sorry for the ridiculously long delay. I posted a pull request which includes the commits from this PR as well as some additional features for copying bucket sequences. The new pull request is #590.

6 participants