Pack multiple stdin outputs into a single snapshot #2133
Comments
Maybe it makes more sense to use a |
After looking at the code for a while, I can see one possible way to go: clone the fs_reader code to provide a directory of fake files with their commands, or enhance the fs_reader object to support multiple fake files, optionally fed by the stdout of a command instead of stdin. The scanner would get a list of all fake files defined in the commands file and hand them over to the Archiver. When the Archiver code calls Open() (or OpenFile()?) and an fs.Command is set for that fake file entry, we would execute the given command and pipe cmd.stdout to the reader, so the archiver can store the output until the command reaches EOF. The scanner would then continue with the next fake entry and call the next Open(). It sounds possible without too much change, apart from messing with the fs.Reader for fake stdin files. Any ideas? |
Thanks for taking the time to submit this idea. To be honest, I'm not convinced this is a good thing to add for restic. It makes the (already complex) code for reading something from stdin even more complicated. After all, restic is intended to be a tool to backup files. Would you mind elaborating why the straightforward way (running I'm sorry if my comment comes across as negative, it's not meant that way. We are a Free Software project for which most people (at least me) work in their spare time, and our development/maintenance/debugging time is very limited. So we're trying to keep restic's scope as small as possible. This also applies to #1873. :) |
Hi fd0, thanks for taking the time to comment on this. I know that restic's strength is backing up files. The ability to back up data from stdin also makes it a great tool for non-file-based data. The main reason why I prefer backups via The reason why I would like some way of backing up multiple stdin streams is restic's memory usage and index loading time. It takes quite some time before restic starts backing up data (~10s, but it depends on the size of the repo). If you want to back up several hundred stdin commands (say, mysqldumps) and avoid a lot of write I/O, you end up executing restic several hundred times while the index loading wait stacks up. Sure a Thanks in advance. Maybe I can get my head around it, but I've never used Go before. |
Okay, thanks for taking the time to describe your use case. |
@fd0 We are brainstorming our backup strategies internally, and it turns out that the use case "back up files plus large streamed output of multiple commands" is a very common one. Streaming the command outputs to files and backing them up along with all the regular files is a workaround, but it has massive drawbacks in performance and space usage. For example, we back up apps containing small local data plus large elasticsearch dumps (which would not even fit on a single disk of the system that runs restic). Having all of this together in one snapshot would be great for restore consistency. I'd appreciate it if you'd consider supporting this use case. Best regards, |
Related to #4804 |
Output of
restic version
restic 0.9.3 compiled with go1.11.1 on linux/amd64
What are you trying to do?
I would like to run several hundred mysqldump commands (backing up separate tables instead of the whole db), and I run into several issues that make backups very impractical and hard to use.
What should restic do differently? Which functionality do you think we should add?
I would like to propose an alternative way to back up multiple stdin outputs into a single snapshot:
--stdin-commands-file <file>
to the backup task
--stdin-commands-file would contain a list of backup jobs/commands with the resulting filename as the first argument: <filename><whitespace><command whose output should be saved><newline>
(one filename + command per line). A config file with a different syntax would also be okay.
Example:
--stdin-filename
parameter
This way, a command file can be prepared before restic runs, and a single restic instance could save multiple stdins into a single snapshot. Not only does this make mysqldump backup jobs (or similar) easier to handle, it also runs faster (the index is loaded once instead of several hundred times) and finishes in minutes rather than hours.
Maybe this could help with #1873 as well. In my case I would like to avoid piping mysqldumps to disk before backing them up as a single snapshot.
Did restic help you or made you happy in any way?
Sure. I'm about to switch to restic for my private servers (from rsnapshot), and I'm already using restic in a different environment to back up 100+ servers, but I'm struggling with database dumps and other performance-related things (like index loading, memory usage, etc.), though this has become far better in the last year.