This repository has been archived by the owner on Feb 1, 2021. It is now read-only.

swarm cluster provisioned with docker-machine does not seem to distribute workloads #428

Closed
chanezon opened this issue Feb 27, 2015 · 7 comments

Comments

@chanezon

Maybe I'm getting the doc wrong.

I provisioned a 3-machine cluster following the docker-machine documentation.

docker $(docker-machine config --swarm pat-swarm-master-3) info
Containers: 9
Nodes: 3
 pat-swarm-master-3: pat-swarm-master-3.cloudapp.net:2376
  └ Containers: 7
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 1.639 GiB
 pat-swarm-node-01: pat-swarm-node-01.cloudapp.net:2376
  └ Containers: 1
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 1.639 GiB
 pat-swarm-node-00: pat-swarm-node-00.cloudapp.net:2376
  └ Containers: 1
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 1.639 GiB

If I try to do

docker $(docker-machine config --swarm pat-swarm-master-3) run -d -p 80:80 nginx

it blocks

Talking to the docker daemon on master directly works.

docker $(docker-machine config pat-swarm-master-3) run -d -p 80:80 nginx
620f8c204d80f3ab5bdd7dab0b3bf02938462ef84b46c65710d3a2d6027542a5

But then I get no benefit from swarm: a second invocation, instead of being scheduled on another node, fails with a port conflict.

docker $(docker-machine config pat-swarm-master-3) run -d -p 80:80 nginx
eef7e6da4168e94c2843566a57f02517c3acf5c5928345c5f8a2c3cd976bbd48
FATA[0003] Error response from daemon: Cannot start container eef7e6da4168e94c2843566a57f02517c3acf5c5928345c5f8a2c3cd976bbd48: Bind for 0.0.0.0:80 failed: port is already allocated 

swarm-master runs 2 containers: swarm-agent on port 2376 and swarm manage on port 3376.
I understand I should be talking to swarm manage.

docker $(docker-machine config pat-swarm-master-3) ps
CONTAINER ID        IMAGE               COMMAND                CREATED             STATUS              PORTS                              NAMES
620f8c204d80        nginx:latest        "nginx -g 'daemon of   2 minutes ago       Up 2 minutes        443/tcp, 0.0.0.0:80->80/tcp        furious_mcclintock   
fd85fb5a33ae        swarm:latest        "/swarm join --addr    8 hours ago         Up 8 hours          2375/tcp                           swarm-agent          
5f509668bd99        swarm:latest        "/swarm manage --tls   8 hours ago         Up 8 hours          2375/tcp, 0.0.0.0:3376->3376/tcp   swarm-agent-master 

The logs are not very helpful:

docker $(docker-machine config pat-swarm-master-3) logs --tail all 5f509668bd99
time="2015-02-26T18:08:26Z" level=info msg="Listening for HTTP" addr="0.0.0.0:3376" proto=tcp 
time="2015-02-26T23:07:51Z" level=info msg="HTTP request received" method=GET uri="/v1.17/info" 
time="2015-02-27T00:36:51Z" level=info msg="HTTP request received" method=GET uri="/v1.17/info" 
time="2015-02-27T00:37:04Z" level=info msg="HTTP request received" method=POST uri="/v1.17/containers/create" 
time="2015-02-27T00:37:43Z" level=error msg="Flagging node as dead. Updated state failed: Get https://pat-swarm-node-01.cloudapp.net:2376/v1.15/containers/json?all=1&size=0: dial tcp 191.236.104.249:2376: i/o timeout" id="2HH5:U4L5:L3GB:5DWJ:3PKB:NZQ3:5X3Q:WAEC:4AZV:D33B:A7NW:P4MY" name=pat-swarm-node-01 
time="2015-02-27T00:38:14Z" level=info msg="Node came back to life. Hooray!" id="2HH5:U4L5:L3GB:5DWJ:3PKB:NZQ3:5X3Q:WAEC:4AZV:D33B:A7NW:P4MY" name=pat-swarm-node-01 
time="2015-02-27T00:40:37Z" level=info msg="HTTP request received" method=GET uri="/v1.17/info" 
time="2015-02-27T00:43:32Z" level=info msg="HTTP request received" method=GET uri="/v1.17/containers/json" 
time="2015-02-27T00:44:42Z" level=info msg="HTTP request received" method=GET uri="/v1.17/info" 
time="2015-02-27T00:45:20Z" level=info msg="HTTP request received" method=GET uri="/v1.17/containers/json" 
time="2015-02-27T00:45:34Z" level=info msg="HTTP request received" method=GET uri="/v1.17/containers/json" 
time="2015-02-27T00:45:39Z" level=info msg="HTTP request received" method=GET uri="/v1.17/containers/json" 
time="2015-02-27T00:45:42Z" level=info msg="HTTP request received" method=GET uri="/v1.17/containers/json" 
time="2015-02-27T00:45:51Z" level=info msg="HTTP request received" method=GET uri="/v1.17/info" 
time="2015-02-27T00:46:04Z" level=info msg="HTTP request received" method=GET uri="/v1.17/info" 
time="2015-02-27T00:46:23Z" level=info msg="HTTP request received" method=GET uri="/v1.17/info" 
time="2015-02-27T00:46:42Z" level=info msg="HTTP request received" method=GET uri="/v1.17/info" 
time="2015-02-27T00:46:59Z" level=info msg="HTTP request received" method=GET uri="/v1.17/info" 
time="2015-02-27T00:47:09Z" level=info msg="HTTP request received" method=GET uri="/v1.17/info" 
time="2015-02-27T00:53:36Z" level=info msg="HTTP request received" method=GET uri="/v1.17/info" 
time="2015-02-27T00:59:20Z" level=info msg="HTTP request received" method=GET uri="/v1.17/info" 
time="2015-02-27T00:59:33Z" level=info msg="HTTP request received" method=GET uri="/v1.17/containers/json" 
time="2015-02-27T01:04:58Z" level=info msg="HTTP request received" method=GET uri="/v1.17/containers/json" 
time="2015-02-27T01:26:59Z" level=info msg="HTTP request received" method=GET uri="/v1.17/containers/json" 
time="2015-02-27T01:27:05Z" level=info msg="HTTP request received" method=POST uri="/v1.17/containers/create" 
time="2015-02-27T01:29:55Z" level=info msg="HTTP request received" method=GET uri="/v1.17/info" 
time="2015-02-27T01:30:09Z" level=info msg="HTTP request received" method=GET uri="/v1.17/containers/json" 
time="2015-02-27T01:38:22Z" level=info msg="HTTP request received" method=GET uri="/v1.17/containers/json" 
time="2015-02-27T01:38:51Z" level=info msg="HTTP request received" method=GET uri="/v1.17/containers/json" 
time="2015-02-27T01:47:02Z" level=info msg="HTTP request received" method=POST uri="/v1.17/containers/create" 
time="2015-02-27T01:55:27Z" level=info msg="HTTP request received" method=GET uri="/v1.17/containers/json" 
time="2015-02-27T01:55:42Z" level=info msg="HTTP request received" method=GET uri="/v1.17/containers/json" 
time="2015-02-27T01:55:47Z" level=info msg="HTTP request received" method=POST uri="/v1.17/containers/create" 
time="2015-02-27T01:56:27Z" level=info msg="HTTP request received" method=POST uri="/v1.17/containers/create" 
time="2015-02-27T01:56:52Z" level=info msg="HTTP request received" method=POST uri="/v1.17/containers/create" 
time="2015-02-27T02:03:36Z" level=info msg="HTTP request received" method=POST uri="/v1.17/containers/create" 
time="2015-02-27T02:04:36Z" level=info msg="HTTP request received" method=DELETE uri="/v1.17/containers/75a6a8f54ca3" 
time="2015-02-27T02:07:24Z" level=info msg="HTTP request received" method=POST uri="/v1.17/containers/create" 
time="2015-02-27T02:11:46Z" level=info msg="HTTP request received" method=GET uri="/v1.17/containers/json" 
time="2015-02-27T02:11:52Z" level=info msg="HTTP request received" method=GET uri="/v1.17/info" 
time="2015-02-27T02:12:09Z" level=info msg="HTTP request received" method=GET uri="/v1.17/info" 
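
Most of the log above is routine `level=info` request tracing; the one entry that points at a problem is the `level=error` line about pat-swarm-node-01 timing out on port 2376. A minimal sketch for narrowing such a log to its error entries, assuming the `level=` field format shown above (the function name `swarm_log_errors` is hypothetical):

```shell
# Print only error-level entries from a swarm manage log read on stdin.
# Assumes the `level=error` field format seen in the log above.
swarm_log_errors() {
  # `|| true` keeps the exit status at 0 when no error lines are present.
  grep -F 'level=error' || true
}

# Hypothetical usage, with the container ID from the `ps` output above:
# docker $(docker-machine config pat-swarm-master-3) logs 5f509668bd99 2>&1 | swarm_log_errors
```

The `2>&1` in the usage line is there in case the daemon writes its log to stderr rather than stdout.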

Are secure swarm clusters based on docker-machine deployment something that is not yet fully implemented?
Or is there something missing in the docs?
Or am I missing something in my config?

@vieux
Contributor

vieux commented Feb 27, 2015

@ehazlett can you take a look?

@chanezon
Author

Seems to be Azure-specific: the Crate team was successful on GCE.
https://crate.io/blog/deploying-crate-with-docker-machine-swarm/
Not sure what that container on port 3376 on Azure is used for, but that's the port docker-machine env produces.

@chanezon
Author

Cc @jeffmendoza

@nmackenzie

Caveat: I haven't looked at this yet.

However,

  1. Is there a reason you have deployed each of these nodes into its own cloud service?
    pat-swarm-master-3.cloudapp.net:2376
    pat-swarm-node-01.cloudapp.net:2376
    pat-swarm-node-00.cloudapp.net:2376

  2. Did you create an appropriate port 2376 port-forwarded endpoint on each cloud service?

  3. Ideally, all the nodes would be deployed into a single cloud service, so that you would not need to open a public endpoint to talk to them, or even access them over a VNET. If you have to use multiple cloud services, then using a VNET would provide visibility among the nodes without the need for an endpoint on the public internet.
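
Question 2 can be checked directly by trying to open a TCP connection to port 2376 on each host. A minimal sketch, assuming bash (for its `/dev/tcp` redirection) and the coreutils `timeout` command are available; the function name `check_port` is hypothetical:

```shell
# Return 0 if a TCP connection to host:port opens within 5 seconds.
# Assumes bash (for /dev/tcp) and the coreutils `timeout` command.
check_port() {
  # Inside the inner bash, $0 is the host and $1 is the port.
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Hypothetical usage against the three hosts listed above:
# for h in pat-swarm-master-3 pat-swarm-node-01 pat-swarm-node-00; do
#   check_port "$h.cloudapp.net" 2376 && echo "$h: reachable" || echo "$h: unreachable"
# done
```

This only shows reachability from the machine where it runs. To see what the swarm manager sees, it would need to run on the master itself.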

@chanezon
Author

@nmackenzie
Yes, that's how I would deploy it in production.
But here I'm just trying the out-of-the-box experience with docker-machine (which creates one cloud service per machine) and swarm.

@bfirsh
Contributor

bfirsh commented Feb 27, 2015

How long did it block for? It's quite possible it's blocking pulling the image.

When you create a container on a Swarm, it pulls the image for that container but doesn't report back pulling status with progress bars like usual – it just hangs. See #349 for a potential fix for this.
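
One way to test whether the hang is just an image pull is to pull the image on each node's Docker daemon directly before creating the container through swarm. A minimal sketch, assuming the machine names are known; `prepull_image` and the `DRY_RUN` switch are hypothetical helpers, and only `docker-machine config` and `docker pull` are real commands:

```shell
# Pull an image on each named machine by talking to its Docker daemon directly,
# so a later create through swarm does not wait silently on the pull.
# Set DRY_RUN=1 to print the commands instead of running them.
prepull_image() {
  image=$1; shift
  for machine in "$@"; do
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "docker \$(docker-machine config $machine) pull $image"
    else
      # Unquoted on purpose: the config output is a list of separate flags.
      docker $(docker-machine config "$machine") pull "$image"
    fi
  done
}

# Hypothetical usage:
# prepull_image nginx pat-swarm-master-3 pat-swarm-node-00 pat-swarm-node-01
```

Pulling directly shows the usual progress output, which makes a slow pull easy to tell apart from a real scheduling problem.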

@chanezon
Author

I tested this again, with a cluster provisioned with machine 0.2, and it works.
You can close this bug.

docker $(docker-machine config --swarm pat-swarm-master-0505) run -d -p 80:80 nginx
7f0841428f7b787e093d3ea0e3c26c19058603671c7a7015cd8ad4aca157b338
docker $(docker-machine config --swarm pat-swarm-master-0505) ps
CONTAINER ID        IMAGE               COMMAND                CREATED             STATUS              PORTS                                NAMES
7f0841428f7b        nginx:latest        "nginx -g 'daemon of   45 seconds ago      Up 19 seconds       191.237.91.115:80->80/tcp, 443/tcp   pat-swarm-node-0505-00/backstabbing_wozniak   
