Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Daemon to support network restore and Macvlan driver exits experimental #23524

Merged
merged 3 commits into from
Jun 15, 2016
Merged

Daemon to support network restore and Macvlan driver exits experimental #23524

merged 3 commits into from
Jun 15, 2016

Conversation

aboch
Copy link
Contributor

@aboch aboch commented Jun 14, 2016

This PR brings:

@thaJeztah
Copy link
Member

ping @crosbymichael PTAL

Signed-off-by: Alessandro Boch <aboch@docker.com>
@icecrime
Copy link
Contributor

Quick question: is it possible that I run a daemon, initialize a Swarm, change my daemon config to add --live-restore, and reload this configuration? That would be a loophole in the validation process. (cc @crosbymichael)

@aboch
Copy link
Contributor Author

aboch commented Jun 14, 2016

@icecrime Need to think about it. But existing validation logic would have same issue.
Being this to cover a corner case, would this be a good candidate for a bug fix. Do not want to rush a decision on the validation now.

@mavenugo
Copy link
Contributor

@aboch no. I added a check if daemon.clusterProvider != nil { in reloadClusterDiscovery to handle this case for --cluster-store configruation.

I think you should do a similar check for --live-restore as well

@aboch
Copy link
Contributor Author

aboch commented Jun 14, 2016

@icecrime @mavenugo Ah, then the same would work with the new check.

I just tested the scenario depicted by @icecrime:

  • start daemon w/o live-restore
  • init swarm
  • stop daemon and restart with --live-restore

daemon start fails with:

FATA[0002] Error creating cluster component: swarm mode is incompatible with --live-restore option

@icecrime
Copy link
Contributor

@aboch The scenario I was mentioning was about hot-reloading the configuration, not restarting the daemon!

@aboch
Copy link
Contributor Author

aboch commented Jun 14, 2016

@icecrime Verified the config reload leads to the same validation:

DEBU[0045] Calling GET /v1.24/containers/json           
DEBU[0049] aboch-W540: Initiating bulk sync with nodes [] 
INFO[0071] Got signal to reload configuration, reloading from: /home/alessandro/config.json 
ERRO[0071] Error reconfiguring the daemon: --cluster-store and --cluster-advertise daemon configurations are incompatible with swarm mode 
DEBU[0079] aboch-W540: Initiating bulk sync with nodes [] 

@mavenugo mavenugo added this to the 1.12.0 milestone Jun 14, 2016
@mavenugo
Copy link
Contributor

@aboch

ERRO[0071] Error reconfiguring the daemon: --cluster-store and --cluster-advertise daemon configurations are incompatible with swarm mode

It should be --live-restore in your case.

@aboch
Copy link
Contributor Author

aboch commented Jun 14, 2016

LOL, I didn't realize the same error message was being printed in different places...
I assumed the same validation function was being executed from different paths

Will take care of it shortly

@aboch
Copy link
Contributor Author

aboch commented Jun 14, 2016

@mavenugo @icecrime Thanks, PTAL, the inner compatibility check is now a property of the config and same function is used for validation in the two paths.

Follows the daemon logs for two hot reloads after the swarm init: In the first one the json config specifies the live restore. In the second one it specifies the cluster store config:

INFO[0053] Got signal to reload configuration, reloading from: /home/alessandro/config.json 
ERRO[0053] Error reconfiguring the daemon: --live-restore daemon configuration is incompatible with swarm mode 

INFO[0068] Got signal to reload configuration, reloading from: /home/alessandro/config.json 
ERRO[0068] Error reconfiguring the daemon: --cluster-store and --cluster-advertise daemon configurations are incompatible with swarm mode 
DEBU[0071] aboch-W540: Initiating bulk sync with nodes [] 

@mlaventure
Copy link
Contributor

@aboch

23:35:56 daemon\daemon.go:380: daemon.configStore.isSwarmCompatible undefined (type *Config has no field or method isSwarmCompatible)
23:35:56 daemon\daemon.go:935: config.isSwarmCompatible undefined (type *Config has no field or method isSwarmCompatible)

Windows failure is genuine.

@aboch
Copy link
Contributor Author

aboch commented Jun 14, 2016

Thanks @mlaventure. Updated.

Signed-off-by: Lei Jitang <leijitang@huawei.com>
Signed-off-by: Lei Jitang <leijitang@huawei.com>
@icecrime
Copy link
Contributor

LGTM 👍

@icecrime
Copy link
Contributor

One failure on janky (related?):

17:12:10 FAIL: docker_cli_links_test.go:32: DockerSuite.TestLinksPingLinkedContainers
17:12:10 
17:12:10 docker_cli_links_test.go:46:
17:12:10     // 3. Ping by hostname
17:12:10     dockerCmd(c, append(runArgs, fmt.Sprintf(pingCmd, "fred", "wilma"))...)
17:12:10 /go/src/github.com/docker/docker/pkg/integration/dockerCmd_utils.go:42:
17:12:10     c.Assert(err, check.IsNil, check.Commentf(quote+"%v"+quote+" failed with errors: %s, %v", strings.Join(args, " "), out, err))
17:12:10 ... value *exec.ExitError = &exec.ExitError{ProcessState:(*os.ProcessState)(0xc8231a3320), Stderr:[]uint8(nil)} ("exit status 137")
17:12:10 ... "run --rm --link container1:alias1 --link container2:alias2 busybox sh -c ping -c 1 fred -W 1 && ping -c 1 wilma -W 1" failed with errors: , exit status 137

@aboch
Copy link
Contributor Author

aboch commented Jun 15, 2016

@icecrime
Not sure, that test pass on my machine:

---> Making bundle: .ensure-nnp-test (in bundles/1.12.0-dev/test-integration-cli)
+ go test -check.f DockerSuite.TestLinksPingLinkedContainers -check.v -check.timeout=5m -test.timeout=360m github.com/docker/docker/integration-cli
INFO: Testing against a local daemon
PASS: docker_cli_links_test.go:32: DockerSuite.TestLinksPingLinkedContainers    1.735s
PASS: docker_cli_links_test.go:50: DockerSuite.TestLinksPingLinkedContainersAfterRename 1.393s
PASS: docker_cli_exec_test.go:365: DockerSuite.TestLinksPingLinkedContainersOnRename    0.640s
OK: 3 passed
PASS
coverage: 10.4% of statements
---> Making bundle: .integration-daemon-stop (in bundles/1.12.0-dev/test-integration-cli)

Let me look at it closer

@icecrime
Copy link
Contributor

Hopefully it's unrelated. @crosbymichael will be testing the PR tonight.

@crosbymichael
Copy link
Contributor

LGTM

@tonistiigi
Copy link
Member

Ping test failure in ci, not sure if related.

I would prefer IsSwarmCompatible configs checks to be handled in a layer above cluster and daemon instead of using executor callback that is not designed for these kinds of things. But this would probably require a larger refactor. LGTM

@icecrime
Copy link
Contributor

Weird failure in docker-py:

TestNetworks.test_create_network_with_host_driver_fails ____________
02:21:40 /docker-py/tests/integration/network_test.py:91: in test_create_network_with_host_driver_fails
02:21:40     self.client.create_network(net_name, driver='host')
02:21:40 E   Failed: DID NOT RAISE

@icecrime
Copy link
Contributor

The failure in docker-py is legitimate, but @aboch is aware and will fix in a followup PR. Amazing that docker-py suite caught it :-)

@icecrime icecrime merged commit 9119795 into moby:master Jun 15, 2016
@aboch
Copy link
Contributor Author

aboch commented Jun 15, 2016

Yes, it is a genuine failure, good docker-py test.
Not a big deal, docker will allow user to create a host network when the one provisioned by the daemon is already present. Second creation will fail.
Will open an issue, should be a simple fix.

@coolljt0725
Copy link
Contributor

👏

@mavenugo mavenugo changed the title Daemon to support network restore Daemon to support network restore and Macvlan driver exits experimental Jun 17, 2016
@dreamcat4
Copy link

Thanks guys - really appreciate this.

@@ -120,3 +121,13 @@ func (config *Config) GetAllRuntimes() map[string]types.Runtime {
config.reloadLock.Unlock()
return rts
}

func (config *Config) isSwarmCompatible() error {
if config.IsValueSet("cluster-store") || config.IsValueSet("cluster-advertise") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you change this code from config.ClusterStore != "" to config.IsValueSet("cluster-store")?
As I know, config.IsValueSet() checks only configs which were set in the configuration file.
So this code always returns false even if using --cluster-store option.

@aboch
Copy link
Contributor Author

aboch commented Jun 28, 2016

@ncoolz Thank you for spotting this. In my initial diff I was looking for the config parameter value and tested with that, later I moved to the IsValueSet(), that is a mistake.

Please let me know if you can push a patch to fix this, otherwise I will take care of it.

Thanks again.

@ncoolz
Copy link
Contributor

ncoolz commented Jun 28, 2016

@aboch I will send a PR immediately. Thank you.

@thaJeztah
Copy link
Member

Thanks @ncoolz!

@eungjun-yi
Copy link
Contributor

./experimental/vlan-networks.md still say that macvlan driver is in experimental mode. I think this should be updated.

The Macvlan and Ipvlan drivers are currently in experimental mode in order to incubate Docker users use cases and vet the implementation to ensure a hardened, production ready driver in a future release.

-- https://github.com/docker/docker/blob/master/experimental/vlan-networks.md#user-content-getting-started

# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.