This is a set of MongoDB containers for creating clusters with Docker
Check the specific images used in this example at:
- http://github.com/stutzlab/mongo-cluster-router
- http://github.com/stutzlab/mongo-cluster-configsrv
- http://github.com/stutzlab/mongo-cluster-shard
For a more complete example, check docker-compose.yml
-
In this example we will create a cluster with:
- 2 routers
- 3 config servers
- 2 shards, each with 2 replicas
-
We had some issues running the shards in Docker for Mac: some shards would freeze and Docker had to be restarted. Use a VirtualBox VM on your local machine if needed.
-
Although this example starts with two-replica shards, it is recommended to use at least three replicas to ensure automatic failover in case one of the nodes goes down. See more at https://docs.mongodb.com/manual/core/replica-set-architecture-three-members/
-
Create docker-compose.yml
version: '3.5'

services:

  mongo-express:
    image: mongo-express:0.54.0
    ports:
      - 8081:8081
    environment:
      - ME_CONFIG_MONGODB_SERVER=router
      # - ME_CONFIG_BASICAUTH_USERNAME=
      # - ME_CONFIG_BASICAUTH_PASSWORD=
    restart: always

  router:
    image: stutzlab/mongo-cluster-router
    environment:
      - CONFIG_REPLICA_SET=configsrv
      - CONFIG_SERVER_NODES=configsrv1
      - ADD_SHARD_NAME_PREFIX=shard
      - ADD_SHARD_1_NODES=shard1a
      - ADD_SHARD_2_NODES=shard2a

  configsrv1:
    image: stutzlab/mongo-cluster-configsrv
    environment:
      - CONFIG_REPLICA_SET=configsrv
      - INIT_CONFIG_NODES=configsrv1

  shard1a:
    image: stutzlab/mongo-cluster-shard
    environment:
      - SHARD_REPLICA_SET=shard1
      - INIT_SHARD_NODES=shard1a

  shard2a:
    image: stutzlab/mongo-cluster-shard
    environment:
      - SHARD_REPLICA_SET=shard2
      - INIT_SHARD_NODES=shard2a
docker-compose up
-
This may take a while. Watch the logs; the cluster is ready when they settle down.
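One way to follow the startup from the command line (a minimal sketch; the exact "ready" message depends on the image version, but a stock mongos prints "Waiting for connections"):

```sh
# Follow the router logs until mongos reports it is accepting connections.
# The grep pattern is an assumption based on the standard mongod/mongos log message.
docker-compose logs -f router | grep -i "waiting for connections"
```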
-
Connect to mongo-express and see some internal collections
- open your browser at http://localhost:8081
-
Show cluster status
docker-compose exec router mongo --port 27017
sh.status()
- Enable sharding of a collection in a database
docker-compose exec router mongo --port 27017
>
//create sampledb
use sampledb
//switch to admin db for sharding operation
use admin
//enable sharding for database
sh.enableSharding("sampledb")
//enable sharding for collection 'sample-collection'
db.adminCommand( { shardCollection: "sampledb.collection1", key: { mykey: "hashed" } } )
db.adminCommand( { shardCollection: "sampledb.collection2", key: { _id: "hashed" } } )
//if collection already has data, you may need to create the sharded index manually before calling the command above
//db['collection2'].createIndex({ "_id" : "hashed" })
//add some data
for(i=0;i<1000;i++) {
  db.collection2.insert({"name": _rand(), "nbr": i})
}
//show details about the number of records per shard
db.collection2.find().explain(true)
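//alternatively, getShardDistribution() summarizes documents and data size per shard
//(a mongo shell helper available on sharded collections)
db.collection2.getShardDistribution()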
//inspect shard status
sh.status()
- Explore shard structure
docker-compose exec shard1a mongo --port 27017
>
//verify shard replication nodes/configuration
rs.conf()
-
mongo-cluster-configsrv volume is at "/data"
- Mount volumes as "myvolume:/data"
-
mongo-cluster-shard volume is at "/data"
- Mount volumes as "myvolume:/data"
-
The original mongo image declares volumes at /data/db and /data/configdb, but they are not used by this image: those paths are used differently depending on whether the instance is a shard or a configsrv, so we simplified it to a single "/data" volume on both types of instance to avoid catastrophic errors (once I mapped just /data and Swarm created individual per-instance volumes for /data/db, and I lost my data - luckily it was a test cluster!). Swarm will still create those volumes per instance (because they are declared in the parent Dockerfile), but you can ignore them.
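For reference, a minimal compose sketch of the "myvolume:/data" mapping mentioned above (shown for a shard service; the same applies to configsrv):

```yml
services:
  shard1a:
    image: stutzlab/mongo-cluster-shard
    environment:
      - SHARD_REPLICA_SET=shard1
      - INIT_SHARD_NODES=shard1a
    volumes:
      - myvolume:/data

volumes:
  # named volume managed by Docker and mounted at "/data" inside the container
  myvolume:
```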
-
Like every good database, MongoDB tries to use all the available memory of the host it runs on. BE AWARE of resource exhaustion when placing two mongo container instances on the same VM, because both will compete for memory.
-
To minimize this risk, this container image configures mongo to limit its internal cache to 1/2 of the available memory (according to the container's cgroup limit). But this only works if you set a resource limit on the container (deploy: resources: limits: memory in docker-compose, etc).
-
For example, if you have an 8GB VM and place two instances of the shard container on it with the memory resource limit set to 3.5GB each, you will probably be safe, because the container will automatically set the WiredTiger cache limit to 1.25GB for each one (mongod will use more memory than this for other tasks, but the biggest consumer here is the cache).
-
If you want fine control over the size of the WiredTiger cache, you can set the value through the WIRED_TIGER_CACHE_SIZE_GB environment variable. It is available on shard and configsrv instances.
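A sketch combining both settings for a shard service in a Swarm deployment (the 3.5GB limit and the 1.5GB cache value are just example numbers):

```yml
services:
  shard1a:
    image: stutzlab/mongo-cluster-shard
    environment:
      - SHARD_REPLICA_SET=shard1
      - INIT_SHARD_NODES=shard1a
      # optional: override the automatic cache sizing (value in GB)
      - WIRED_TIGER_CACHE_SIZE_GB=1.5
    deploy:
      resources:
        limits:
          # cgroup limit used by the image to compute the default cache size
          memory: 3500M
```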
-
See more at https://docs.mongodb.com/manual/reference/program/mongod/#bin.mongod
- Add the new services to docker-compose.yml
...

  router:
    image: stutzlab/mongo-cluster-router
    environment:
      - CONFIG_SERVER_NAME=configsrv
      - CONFIG_SERVER_NODES=configsrv1
      - ADD_SHARD_NAME_PREFIX=shard
      - ADD_SHARD_1_NODES=shard1a
      - ADD_SHARD_2_NODES=shard2a
      - ADD_SHARD_3_NODES=shard3a,shard3b

  shard3a:
    image: stutzlab/mongo-cluster-shard
    environment:
      - SHARD_NAME=shard3
      - INIT_SHARD_NODES=shard3a,shard3b

  shard3b:
    image: stutzlab/mongo-cluster-shard
    environment:
      - SHARD_NAME=shard3

...
- Start new shard
docker-compose up shard3a shard3b
- Add the new shard to the cluster
docker-compose up router
- Some data from shard1 and shard2 will be migrated to shard3 now. This may take a while.
- Check that everything is OK with "sh.status()" on the router
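A quick way to watch the migration from the router (sh.isBalancerRunning() indicates whether chunks are currently being moved):

```sh
#check shard and chunk distribution
docker-compose exec router mongo --eval "sh.status()"
#check whether the balancer is currently migrating chunks
docker-compose exec router mongo --eval "sh.isBalancerRunning()"
```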
-
This is the case when you mount Block Storage directly on the VM that runs the node, using a "placement" constraint to force the container to run only on that host, and you want to move it to another node (probably to expand the cluster). If using NFS or another distributed volume manager, you don't need to worry about this.
-
Deactivate mongo nodes:
docker service scale mongo_shard1c=0
-
Detach the Block Storage from the current VM and mount it on the new VM
- If using local storage, just copy the volume contents to the new VM using SCP (see the sketch below)
- Check your cloud provider's documentation for the commands required to mount the volume on the new VM's filesystem
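A hedged sketch of the copy step, assuming the default local Docker volume path; the volume name "mongo_shard1c" and the target mount point "/mnt/mongo1_shard1c" are just examples - adjust them to your environment:

```sh
# Scale the service to 0 first (see above), then copy the volume contents.
# /var/lib/docker/volumes/<volume>/_data is the default path for local Docker volumes.
scp -r /var/lib/docker/volumes/mongo_shard1c/_data/. user@new-vm:/mnt/mongo1_shard1c/
```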
-
Change the container placement in docker-compose.yml to point to the new host. Ex:
  shard1c:
    image: stutzlab/mongo-cluster-shard:4.4.0.8
    environment:
      - SHARD_REPLICA_SET=shard1
    deploy:
      placement:
        constraints: [node.hostname == server13]
    networks:
      - mongo
    volumes:
      - /mnt/mongo1_shard1c:/data
-
Update service definitions
docker stack deploy --compose-file docker-compose.yml mongo
-
Check that the scale of the moved services is '1'
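For example (the service name is the one used earlier; adjust it to your stack):

```sh
#list services and their replica counts
docker service ls
#scale the moved service back up if it is still at 0
docker service scale mongo_shard1c=1
```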
-
Check that the node went online successfully
- Enter the newly instantiated container and execute:
mongo
rs.status()
- Verify that the current node is healthy
- Create the new shard replica services in docker-compose.yml
...

  shard3a:
    image: stutzlab/mongo-cluster-shard
    environment:
      - SHARD_REPLICA_SET=shard3
      - INIT_SHARD_NODES=shard3a,shard3b
    volumes:
      - shard3a:/data

  shard3b:
    image: stutzlab/mongo-cluster-shard
    environment:
      - SHARD_REPLICA_SET=shard3
    volumes:
      - shard3b:/data

...
-
Create the new volumes and mount them on the host that will run the shard (if using Swarm, don't forget to add a placement constraint if needed); see the sketch below
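If you use Docker named volumes (as in the snippet above), declare them at the top level of docker-compose.yml; for Swarm, the placement constraint follows the same pattern shown earlier (constraints: [node.hostname == ...]):

```yml
volumes:
  # named volumes referenced by the shard3a and shard3b services above
  shard3a:
  shard3b:
```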
-
Change the router service and add the new shard's configuration environment variables
  router:
    image: stutzlab/mongo-cluster-router:4.4.0.8
    environment:
      - CONFIG_REPLICA_SET=configsrv
      - CONFIG_SERVER_NODES=configsrv1,configsrv2,configsrv3
      - ADD_SHARD_REPLICA_SET_PREFIX=shard
      - ADD_SHARD_1_NODES=shard1a,shard1b,shard1c
      - ADD_SHARD_2_NODES=shard2a,shard2b,shard2c
      - ADD_SHARD_3_NODES=shard3a,shard3b
- Instantiate the new replica nodes and add the shard to the cluster
docker-compose up
#check replicaset status
docker-compose exec shard3a mongo --eval "rs.status()"
#check shard status
docker-compose exec router mongo --eval "sh.status()"
-
This should end up adding the new shard and its replicas automatically
-
(OPTIONAL) If you want to perform the above operations step by step, do:
#create replicaset instances for shard3
docker-compose up shard3a shard3b
#check new replicaset status
docker-compose exec shard3a mongo --eval "rs.status()"
#change router to add the new shard to configsrv
docker-compose up router
#check if the new shard was added successfully
docker-compose exec router mongo --eval "sh.status()"
- Add the new service to docker-compose.yml
...

  shard1c:
    image: stutzlab/mongo-cluster-shard
    environment:
      - SHARD_REPLICA_SET=shard1
    volumes:
      - shard1c:/data

...
```sh
docker-compose up shard1c
```
-
Add the new node to an existing shard (new replica node)
- Discover which node is currently the master (primary) by running:
docker-compose exec shard1a mongo --eval "rs.isMaster()"
#look in response for "ismaster: true"
docker-compose exec shard1b mongo --eval "rs.isMaster()"
#look in response for "ismaster: true"
- Execute the "add" command on the master node. If shard1b is the master:
docker-compose exec shard1b mongo --eval "rs.add( { host: \"shard1c\", priority: 0, votes: 0 } )"
-
When a shard has only two replicas and one goes down, no primary will be elected and the database will freeze until you take action to force the use of the remaining node's state despite the other copy (no consensus takes place, and you may lose data if the remaining node was behind the node that went down).
-
Create a new shard replica node service in docker-compose, and "up" it
-
Enter the mongo CLI on the last remaining shard node (probably not the primary one), then reconfigure the entire replica set with the nodes you want to be present now, adding the new shard node and using "force: true".
docker-compose exec shard1d mongo --eval "rs.reconfig( { \"_id\": \"shard1\", members: [ { \"_id\": 0, host: \"shard1a\" }, { \"_id\": 4, host: \"shard1d\"} ]}, {force:true} )"
- Make sure you already have a "hashed" index on the collection
use edidatico
//"_id" is the collection attribute key
db['testes-estudantes'].createIndex({ "_id" : "hashed" })
//now an index with name "_id_hashed" was created and will be used on the next step
db.adminCommand( { shardCollection: "edidatico.testes-estudantes", key: { _id: "hashed" } })
- Create database dump to a file
mongodump -u admin --authenticationDatabase admin --archive=/tmp/sample1-dump.db --db=sample1
- Restore database from a dump file (same target database name)
mongorestore -u admin --authenticationDatabase admin --archive=/tmp/sample1-dump.db
- Restore database from a dump file to a different database name
mongorestore -u admin --authenticationDatabase admin --archive=/tmp/sample1-dump.db --nsFrom "sample1.*" --nsTo "sample1copy.*"
- Check that all indexes were copied too; in some cases they are not.
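One way to compare them in the mongo shell (the collection name "collection1" is just an example):

```js
//list indexes on a collection in the restored database and compare with the source
db.getSiblingDB("sample1copy").collection1.getIndexes()
```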
#create the user in the admin database so that it is global, granting access to the "sample1" database
use admin
db.createUser( { user: "app", pwd: "anypass", roles: [{role: "readWrite", db: "sample1"}] } )
#add admin roles to the user
use admin
db.grantRolesToUser(
  "user1",
  [ { role: "readAnyDatabase", db: "admin" } ]
)
- When you take a dump to move a database somewhere else, remember to deny application access while the move is in progress, to avoid losing data that users write to the "old" database
use admin
db.revokeRolesFromUser( "user1",
[ { role: "readAnyDatabase", db: "admin" } ]
)
-
Pay attention to the write and read concern levels (the isolation of write and read operations) so that you get the best balance of performance vs data integrity for your application. Take a look at the MongoDB documentation on write concern and read concern.
-
Knowledge about the CAP Theorem is useful to help you decide on this.
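For illustration, a sketch of setting these per operation in the mongo shell (using the collection1 example from earlier; the values shown are not a recommendation):

```js
//wait for acknowledgement from a majority of the replica set members (up to 5s)
db.collection1.insertOne({ name: "sample" }, { writeConcern: { w: "majority", wtimeout: 5000 } })
//read only data that has been acknowledged by a majority of the members
db.collection1.find({ name: "sample" }).readConcern("majority")
```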
- Log in to the container shell
- Execute:
mongo
db.printCollectionStats()
db.printReplicationInfo()
db.printShardingStatus()
db.printSlaveReplicationInfo()
- Slow queries are automatically logged to the in-memory log
- Connect to mongos and execute
use admin
db.adminCommand( { getLog: "global" } )
-
Look for "slow query" entries and view the executed query along with its time
-
In order to send some metrics from the in-memory logs to Prometheus, use http://github.com/stutzlab/mongo-log-exporter
Enter the console on the primary container of:
- configsrv
- shard1
- shard2
On each node, enable free monitoring:
mongo
> db.enableFreeMonitoring()
Get the URL provided in the log output and open it in a browser
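If you missed it in the logs, the mongo shell can print the reporting URL again:

```js
//shows the free monitoring state, including the reporting URL
db.getFreeMonitoringStatus()
```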
-
We had a situation where all resources were exhausted and containers kept being restarted by the OOM killer.
-
Some locks were left on the volumes, so some containers would not start correctly (this is expected behavior).
-
The Docker daemon was misbehaving (Swarm was not keeping the desired number of service instances), so we restarted the VMs.
-
Scale the services that are failing to restart down to 0 (scale=0)
-
Delete /mongod.lock from each mongo cluster volume
-
Scale the services back up to 1 (scale=1)
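A sketch of that sequence for one node; the service name and volume mount path are examples - adjust them to your stack:

```sh
#stop the affected service
docker service scale mongo_shard1a=0
#remove the stale lock from the volume mounted on the host
rm /mnt/mongo1_shard1a/mongod.lock
#start it again
docker service scale mongo_shard1a=1
```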
-
Another option is to stop all the Docker services and start them again (not strictly needed, but things were too ugly, so we solved it this way and it worked!)
docker stack rm mongo
- Be SURE all volumes are mounted OUTSIDE the instances
- Remove the locks
- reboot VMs
docker stack deploy --compose-file docker-compose-mongo.yml mongo
- this is the same procedure as restoring a backup (!)