MongoDB Read From Secondary When Secondary Down

In previous posts of our "Become a MongoDB DBA" series, we covered Deployment, Configuration, Monitoring (part 1), Monitoring (part 2), backup and restore. From this blog post onwards, we shift our focus to the scaling aspects of MongoDB.

One of the cornerstones of MongoDB is that it is built with high availability and scaling in mind. Scaling can be done either vertically (bigger hardware) or horizontally (more nodes). Horizontal scaling is what MongoDB is good at, and it is not much more than spreading the workload across multiple machines. In effect, we're making use of multiple low-cost commodity hardware boxes, rather than upgrading to a more expensive high performance server.

MongoDB offers both read and write scaling, and we will uncover the differences between these two strategies for you. Whether to choose read or write scaling depends on the workload of your application: if your application tends to read more often than it writes data, you will probably want to make use of the read scaling capabilities of MongoDB. Today we will cover MongoDB read scaling.

Read Scaling Considerations

With read scaling, we will scale out our read capacity. If you have used MongoDB before, or have followed this blog series, you may be aware that all reads end up on the primary by default. Even if your replicaSet contains nine nodes, your read requests still go to the primary. Why was this done deliberately?

In principle, there are a few considerations to make before you start reading from a secondary node directly. First of all: the replication is asynchronous, so not all secondaries will give the same results if you read the same data at the same point in time. Secondly: if you distribute read requests to all secondaries and use up too much of their capacity, then when one of them becomes unavailable, the other secondaries may not be able to cope with the extra workload. Thirdly: on sharded clusters you should never bypass the shard router, as data may be out-of-date or may have been moved to another shard. Even if you do use the shard router and set the read preference correctly, it may still return incorrect data due to incomplete or terminated chunk migrations.
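The second consideration is easy to check with some arithmetic. The following is a minimal sketch, not a sizing tool: the helper name, the read load and the per-node capacity are all made-up assumptions, and it assumes reads are spread evenly over the surviving secondaries.

```javascript
// Hypothetical helper: utilisation of each secondary when the read load is
// spread evenly over the secondaries that are still available.
function utilisationPerSecondary(totalReadsPerSec, capacityPerNode, secondaries) {
  if (secondaries <= 0) {
    throw new Error("no secondaries left to serve reads");
  }
  return totalReadsPerSec / secondaries / capacityPerNode;
}

const totalReads = 9000; // assumed read load in operations per second
const capacity = 4000;   // assumed capacity of one secondary in operations per second

const healthy = utilisationPerSecondary(totalReads, capacity, 3);
const degraded = utilisationPerSecondary(totalReads, capacity, 2);

console.log(`3 secondaries: ${(healthy * 100).toFixed(1)}% utilised`);  // 75.0%
console.log(`2 secondaries: ${(degraded * 100).toFixed(1)}% utilised`); // 112.5%
```

With these assumed numbers, three secondaries run at 75% each, but losing one pushes the remaining two past 100%, which is exactly the situation described above.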

As you have seen, these are serious considerations you should make before scaling out your read queries on MongoDB. In general, unless your primary is not able to cope with the read workload it is receiving, we would advise against reading from secondaries. The price you pay for inconsistency is relatively high, compared to the benefits of offloading work from the primary.

The main issue here seems to be the eventual consistency of MongoDB, so your application needs to be able to work around that. On the other hand, if you have an application that is not bothered by stale data, analytics for example, you could benefit greatly from using the secondaries.

Reading From a Secondary

There are two things that are necessary to make reading from a secondary possible: tell the MongoDB client driver that you actually wish to read from a secondary (if possible) and tell the MongoDB secondary server that it is okay to read from this node.

Setting read preference

For the driver, all you have to do is set the read preference. When reading data you simply set the read preference to read from a secondary. Let's go over each and every read preference and explain what it does:

primary: Always read from the primary (default)
primaryPreferred: Always read from the primary, read from a secondary if the primary is unavailable
secondary: Always read from a secondary
secondaryPreferred: Always read from a secondary, read from the primary if no secondary is available
nearest: Always read from the node with the lowest network latency

It is clear the default mode is the least preferred if you wish to scale out reads. PrimaryPreferred is not much better, as it will pick the primary 99.999% of the time. However, if the primary becomes unavailable you will have a fallback for read requests.

Secondary should work fine for scaling reads, but as you leave out the primary the reads will never have a fallback if no secondary is available. SecondaryPreferred is slightly better, but the reads will hit the secondaries almost all of the time, which still causes an uneven spread of reads. Also, if no secondaries are available, in most cases there will no longer be a cluster and the primary will demote itself to a secondary. Only when an arbiter is part of the cluster does the secondaryPreferred mode make sense.

Nearest should always select the node with the lowest network latency. Even though this sounds great from an application perspective, it will not guarantee you get an even spread in read operations. But it will work very well in multi-region deployments where latency is high and delays are noticeable. In such cases, reading from the nearest node means your application will be able to serve out data with the minimum latency.
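To make the fallback behaviour of the five modes concrete, here is a simplified model in plain JavaScript. It is a sketch of the rules as described above, not the driver's actual server selection code: the `candidates` function, the node objects and the latencies are hypothetical, and real drivers also take latency windows, staleness and tags into account. The example set has one primary, one secondary that is down, and an arbiter.

```javascript
// Simplified model: which nodes may serve a read for a given mode.
// Arbiters hold no data, so they are never candidates.
function candidates(mode, nodes) {
  const data = nodes.filter((n) => n.up && n.role !== "arbiter");
  const primary = data.filter((n) => n.role === "primary");
  const secondaries = data.filter((n) => n.role === "secondary");

  switch (mode) {
    case "primary":
      return primary;
    case "primaryPreferred":
      return primary.length > 0 ? primary : secondaries;
    case "secondary":
      return secondaries;
    case "secondaryPreferred":
      return secondaries.length > 0 ? secondaries : primary;
    case "nearest": {
      // Lowest latency wins, regardless of role.
      if (data.length === 0) return [];
      const best = Math.min(...data.map((n) => n.latencyMs));
      return data.filter((n) => n.latencyMs === best);
    }
    default:
      throw new Error(`unknown read preference mode: ${mode}`);
  }
}

// Hypothetical replicaSet: the only secondary is down, the arbiter keeps the primary elected.
const replicaSet = [
  { host: "host1:27017", role: "primary", up: true, latencyMs: 12 },
  { host: "host2:27017", role: "secondary", up: false, latencyMs: 3 },
  { host: "host3:27017", role: "arbiter", up: true, latencyMs: 5 },
];

for (const mode of ["primary", "primaryPreferred", "secondary", "secondaryPreferred", "nearest"]) {
  const hosts = candidates(mode, replicaSet).map((n) => n.host);
  console.log(mode, hosts.length > 0 ? hosts.join(", ") : "(no node available)");
}
```

In this scenario the secondary mode has no node to read from, while secondaryPreferred falls back to the primary.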

Filtering Nodes with Tags

In MongoDB you can tag nodes in a replicaSet. This allows you to make groupings of nodes and use them for many purposes, including filtering them when reading from secondary nodes.

An example of a replicaSet with tagging can be:

          {
              "_id" : "myrs",
              "version" : 2,
              "members" : [
                  {
                      "_id" : 0,
                      "host" : "host1:27017",
                      "tags" : {
                          "dc": "1",
                          "rack": "e3"
                      }
                  }, {
                      "_id" : 1,
                      "host" : "host2:27017",
                      "tags" : {
                          "dc": "1",
                          "rack": "b2"
                      }
                  }, {
                      "_id" : 2,
                      "host" : "host3:27017",
                      "tags" : {
                          "dc": "2",
                          "rack": "q1"
                      }
                  }
              ]
          }

This tagging allows us to limit our secondary reads to, for example, our first datacenter:

          db.getMongo().setReadPref('secondaryPreferred', [ { "dc": "1" } ] )        

Naturally the tags can be used with all read preference modes, except primary.
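How a list of tag documents narrows down the members can be sketched in a few lines of JavaScript. This is an illustration only, assuming the tag documents are tried in order and an empty document matches every member; the function names are hypothetical and the member list mirrors the example replicaSet above.

```javascript
// True when every key/value pair in tagDoc is present in the member's tags.
function matchesTags(member, tagDoc) {
  return Object.entries(tagDoc).every(([key, value]) => member.tags[key] === value);
}

// Tag documents are tried in order; the first one that matches at least one
// member decides the result. An empty document {} matches every member.
function filterByTagSets(members, tagSets) {
  if (tagSets.length === 0) return members;
  for (const tagDoc of tagSets) {
    const matched = members.filter((m) => matchesTags(m, tagDoc));
    if (matched.length > 0) return matched;
  }
  return [];
}

const members = [
  { host: "host1:27017", tags: { dc: "1", rack: "e3" } },
  { host: "host2:27017", tags: { dc: "1", rack: "b2" } },
  { host: "host3:27017", tags: { dc: "2", rack: "q1" } },
];

// Only members in the first datacenter.
const inDc1 = filterByTagSets(members, [{ dc: "1" }]).map((m) => m.host);
// No member is tagged dc "3", so the empty document acts as a catch-all.
const withFallback = filterByTagSets(members, [{ dc: "3" }, {}]).map((m) => m.host);

console.log(inDc1);        // [ 'host1:27017', 'host2:27017' ]
console.log(withFallback); // all three hosts
```

Ending the list with an empty document is a way to avoid having no eligible node at all when the preferred datacenter is unreachable.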

Enabling Secondary Queries

Apart from setting the read preference in the client driver, there is another limitation. By default MongoDB disables reads from a secondary on the server side, unless you specifically tell the host to allow read operations. Changing this is relatively easy: all you have to do is connect to the secondary and run this command:

          myset:SECONDARY> rs.slaveOk()        

This will enable reads on this secondary. Note that the shell helper applies to the connection that runs it, not to every incoming connection. You can also run this command in your application, but that would then imply your application is aware it could encounter a server that did not enable secondary reads.

Reading From a Secondary in a Shard

It is also possible to read from a secondary node in MongoDB sharded clusters. The MongoDB shard router (mongos) will obey the read preference set in the request and forward the request to a secondary in the shard(s). This also means you will have to enable reads from a secondary on all hosts in the sharded environment.
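Since mongos obeys the read preference that comes with the request, one way for a client to state it is in the connection string. The sketch below only builds such a string; it does not connect to anything. The `buildUri` helper and the router host names are hypothetical, and the `readPreference` and `readPreferenceTags` options are used here on the assumption that your driver version supports them.

```javascript
// Hypothetical mongos addresses; replace them with your own shard routers.
const routers = ["mongos1:27017", "mongos2:27017"];

// Builds a connection string that carries the read preference and optional tags.
function buildUri(hosts, readPreference, tags = {}) {
  const params = [`readPreference=${readPreference}`];
  const tagList = Object.entries(tags)
    .map(([key, value]) => `${key}:${value}`)
    .join(",");
  if (tagList) params.push(`readPreferenceTags=${tagList}`);
  return `mongodb://${hosts.join(",")}/?${params.join("&")}`;
}

const uri = buildUri(routers, "secondaryPreferred", { dc: "1" });
console.log(uri);
```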

And as said before: an issue that may arise with reading from secondaries in a sharded environment is that it might be possible to receive wrong data from a secondary. Due to the migration of data between shards, data may be in transit from one shard to another. Reading from a secondary may then return incomplete data.

Adding More Secondary Nodes

Adding more secondary nodes to a MongoDB replicaSet would imply more overhead for replication. However, unlike MySQL, syncing the oplog on secondaries is not limited to the primary node. MongoDB can also sync the oplog from other secondaries, as long as they are up to date with the primary. This means oplog servicing is also possible from other secondaries, and we thus automatically have "intermediate" primaries. In theory, this means that if you add more secondaries, the performance impact will be limited. Keep in mind that this also means the data trickles down to the other nodes a bit slower, as the oplog entries have to pass at least two nodes now.
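The "trickle down" effect can be pictured as the number of hops an oplog entry makes before it reaches a node. The following is a toy model under stated assumptions: the sync-source map and the `hopsFromPrimary` function are hypothetical and are not read from a real replicaSet.

```javascript
// Hypothetical sync-source map: each node points at the node it pulls the
// oplog from. The primary has no sync source (null).
const syncSource = {
  "host1:27017": null,          // primary
  "host2:27017": "host1:27017", // syncs directly from the primary
  "host3:27017": "host2:27017", // syncs from another secondary
};

// Counts the replication hops between the primary and the given host.
function hopsFromPrimary(host, sources) {
  let hops = 0;
  let current = host;
  const seen = new Set();
  while (sources[current] !== null) {
    // Guard against unknown hosts and against loops in the map.
    if (!(current in sources) || seen.has(current)) {
      throw new Error(`invalid sync chain at ${current}`);
    }
    seen.add(current);
    current = sources[current];
    hops += 1;
  }
  return hops;
}

for (const host of Object.keys(syncSource)) {
  console.log(host, "hops from primary:", hopsFromPrimary(host, syncSource));
}
```

In this map host3 sits two hops away from the primary, so every write reaches it only after host2 has applied it first.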

Conclusion

We have described what the impact is on MongoDB when reading from its secondaries, and what caveats to be aware of. If you don't necessarily need to scale your reads, it is better not to perform this pre-emptive optimization. However, if you think your setup would benefit from offloading the primary, make sure you work around the issues described in this post.


Source: https://severalnines.com/blog/become-mongodb-dba-how-scale-reads
