
Suggestion: Distribute SubT models through multiple smaller docker layers #700



Closed
AravindaDP opened this issue Nov 11, 2020 · 7 comments
Labels
enhancement New feature or request

Comments

AravindaDP commented Nov 11, 2020

It would be helpful to have multiple smaller layers rather than one gigantic layer when running docker pull over a slow and unreliable connection. (It's not trivial to resume if a layer download breaks in the middle, but it is at least possible to resume from the layers that have already completed, so having multiple smaller layers helps in that situation.)

So I suggest at least breaking the 6 GB+ layer containing all the models into 3 separate layers, one with the resources for each circuit.

It would also be beneficial to clone the subt repo after the model layers (assuming the models change relatively infrequently), so the model layers don't have to be pulled again every time the subt repo changes.
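
Something along these lines (just a sketch; download_circuit_models.bash and the git clone step are placeholders I made up, not the actual Dockerfile contents):

# One RUN per circuit, so each circuit's models end up in their own layer and an
# interrupted pull can resume from the layers that already completed.
RUN ./download_circuit_models.bash tunnel
RUN ./download_circuit_models.bash urban
RUN ./download_circuit_models.bash cave

# Clone the subt repo only after the (hopefully stable) model layers, so a new
# commit invalidates this layer but not the model layers above it.
RUN git clone https://github.com/osrf/subt.git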

@zbynekwinkler

I believe the images are built here:

subt/docker/release.bash

Lines 73 to 84 in ad0c3e1

if [[ "$COMMAND" == "--build" ]] || [[ "$COMMAND" == "--all" ]]; then
echo -e "\e[1;34m## Building cloudsim_sim image\e[0m\n"
./build.bash cloudsim_sim --no-cache
echo -e "\e[1;34m## Building cloudsim_bridge image\e[0m\n"
./build.bash cloudsim_bridge --no-cache
# Only build the subt_sim_entry if building for the public dockerhub registry
if [[ "$repo" != *"private" ]]; then
echo -e "\e[1;34m## Building subt_sim_entry image\e[0m\n"
./build.bash subt_sim_entry --no-cache
fi
fi

Note the --no-cache. I believe we get no caching benefits for subsequent releases.


AravindaDP commented Nov 11, 2020

My knowledge of docker is minimal. I thought layer hashes are based on the content of the layer and the preceding layers (like in a blockchain), so I assumed we'd still get the same set of base layers as long as the base image / ROS package versions stay the same, even with caching disabled. Maybe I'm mistaken.

Anyway, my main point is the gigantic monolithic layer. Even though it's a very painful experience, I'm still fine with having to download a dozen GB as long as it's split into small layers, so I can resume if the connection is lost in the middle.

There is probably also some replication of resources (like shader images etc.) among tiles that could be optimized for resource usage, but I guess resolving that would require major refactoring.


peci1 commented Nov 18, 2020

Nope, the --no-cache only affects building the image: it tells Docker not to use any cached layers during the build. But if you pull the final image to another machine, it will of course use the layer cache (unless you tell it not to). Because Docker's layer caching is keyed on the text of the command after RUN, not on the results or side effects of that command, --no-cache is basically needed to tell Docker to rebuild the whole thing. Without it, the apt update would only be run the first time you build the image, and all subsequent builds would consider it cached and not call it. That's often not what you want.
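
For illustration (this line is made up, not taken from the subt Dockerfiles):

# Docker keys the cache for this layer on the literal RUN text, not on what apt
# actually fetched, so a rebuild without --no-cache reuses the old (possibly
# stale) layer and never runs apt-get update again.
RUN apt-get update && apt-get install -y some-dependency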

The models are actually downloaded before cloning the repo: https://github.com/osrf/subt/blob/ad0c3e1e6cf13236116e8f1bc4f2fc71800835b6/docker/cloudsim_sim/Dockerfile . I think the problem is that because the release script uses --no-cache, each rebuild pulls some updated dependency package, which then invalidates all following layers. To get the caching behavior you're asking for, you'd need to install ign fuel with its minimal dependency set, then download the models, and only then install the rest, and hope that the ign-fuel dependencies don't get updated too often.
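
Roughly like this (the package names, the Fuel cache location and the exact download command are my guesses, not the real Dockerfile contents):

# Install only what's needed to download from Fuel (assumed package names).
RUN apt-get update && apt-get install -y ignition-tools libignition-fuel-tools4-dev

# Download the models early, so this big layer sits above the frequently changing parts.
RUN ign fuel download -v 4 \
    -u "https://fuel.ignitionrobotics.org/1.0/OpenRobotics/collections/SubT Tech Repo"

# Only now install the rest of the simulation stack and clone the repo, so their
# updates invalidate these layers but not the model layer above.
RUN apt-get install -y ros-melodic-desktop-full
RUN git clone https://github.com/osrf/subt.git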

Of course there are even better ways, e.g. building a resource-only image using a multi-stage docker build, which could create a layer that contains only the model files and not the programs needed to download them. That could be a pretty stable and infrequently changing image.
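
A minimal multi-stage sketch (the base image, package names and Fuel cache path are assumptions):

# Stage 1: a throwaway image whose only job is to fetch the models.
FROM ubuntu:bionic AS models
RUN apt-get update && apt-get install -y ignition-tools libignition-fuel-tools4-dev
RUN ign fuel download -v 4 \
    -u "https://fuel.ignitionrobotics.org/1.0/OpenRobotics/collections/SubT Tech Repo"

# Stage 2: the real image copies only the downloaded files; the download tooling
# and its dependencies never end up in the final layers.
FROM ubuntu:bionic
COPY --from=models /root/.ignition/fuel /root/.ignition/fuel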

As a last option, you can always build the docker images yourself using the referenced release script. When building the image, the resources get downloaded one by one, which might be easier on your connection. However, I still think that if the connection drops for more than the few hundred milliseconds it takes to re-negotiate a download, the whole download step will fail anyway and the already downloaded models won't get cached. That's when you can start playing with editing the Dockerfile, or running it without the --no-cache switch.

nkoenig added the enhancement label on Nov 18, 2020
@AravindaDP
Author

@peci1 Yes, I agree with you on the importance of using --no-cache to make sure the build isn't reusing a cached layer from an earlier build that contains older apt packages.

I overlooked the order of the model download and the repo cloning. Yes, I'm aware that the model layer would get a different hash if any of the layers above it changed between builds (OS image, ign and any of their dependencies), which is very likely unless we push it as far up the layer list as possible or use a multi-stage build as you suggested.

I have also noticed another avenue for optimization. A lot of models share the same set of resources, especially texture images. When time permits, I'll check whether we could symlink them to a single copy after download without breaking the simulation. (I'm not suggesting changing the models on the ign fuel server, only consolidating them after downloading them.)
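
A rough sketch of what I'd try (the download script name, the cache path and the use of rdfind are assumptions; whether the simulation tolerates symlinked textures is exactly what needs testing first):

# The dedup has to happen in the same RUN as the download: a later layer cannot
# remove bytes that an earlier layer has already committed to the image.
RUN apt-get update && apt-get install -y rdfind \
 && ./download_models.bash \
 && rdfind -makesymlinks true /root/.ignition/fuel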

This is probably a problem that only I'm complaining about. Maybe I should just restrict local development to a catkin installation and use a cloud-based workflow for building and testing the docker images.


nkoenig commented Dec 17, 2020

The models have been broken out of the source code into a separate docker image. See PR #718.

The layer size for the models has been reduced to 3.82GB.


peci1 commented Jan 12, 2021

I guess #718 could close this issue? I know the OP requested more model layers, but it seems that now only the tech repo gets downloaded, so that should do it. What happened to the other models like niosh that were previously downloaded separately?

@AravindaDP
Author

I also believe the newer refactoring of the docker images fulfills the intended requirement; I just haven't been able to test the latest build yet. Maintainers, feel free to close this at your discretion.
