Russell Bateman 7 June 2013 last update:
Sharding
Should I shard? One way is to create a three by three system with the idea of
standing up another set as you grow.
mongos is "almost" just an extension of the drivers.
Sharding is transparent to the application, but if you're doing updates, you
issue an update, what rows, etc., but default it will...
mongos will get update and one by one ask the shards until it finds
one that matches the criteria. You'll get failure if you issue an update
without a specific shard key or include MULTI=TRUE .
Different errors come back from mongos than from mongod .
If you've got retry logic that's error-specific, you could be unpleasantly
sruprised.
Version upgrades when sharding
Replace the binary and restart the process (mongos ). For all sharding
upgrades:
Disable the balancer and wait for any in-progress chunk migrations to
finish before beginning the actual upgrade.
Do a "rolling upgrade", secondaries and arbiters first.
Ensure secondary has caught up before upgrading it.
Step down the original primary, allow election to occur.
Upgrade the original primary.
Memory size...
...for dedicated VM running mongod . Rule of thumb...
15% of total document size for index. If index is totally resident, then a
100Gb data set would require DB nodes run about 16Gb memory plus whatever the
file system and other processes need. Plan on 24Gb memory per VM to cover this
very well.
Replica sets
The oplog is a ring buffer logging what's going on. Idempotent.
Don't share memory (don't let hypervisor share memory), but sharing CPUs is
less critical because MongoDB isn't compute-bound.
Here's a data center illustration:
WriteConcern
WriteConcern.FSYNC_SAFE , what to use that's better.
MongoClient , etc.
MMS
Download from 10gen, place on one server, discovers the other members of the
replica set/sharding cluster.
Collects samples every minute, can look at 5-minute samplings, etc.
Recommendations
- Don't use Ext3 because of zeroing out at file creation.
- Don't use ReiserFs.
- Use Ext4.
- See
http://docs.mongodb.org/manual/administration/production-notes/ .
- RAID10.
Fail-over
This is how to set up two data centers.
DC1
primary
secondary
arbiter (2 votes)
DC2
secondary
secondary
arbiter (1 vote)
This equals 7 votes (4 for a quorum).
Story
When DC1 goes down, rig Chef to deploy another arbiter (using force option) to
enable DC2 to elect itself a primary (because it then has 4 votes or a quorum).
Do this in the case where a human determines that DC1 isn't coming back very
soon and the down-streamers can't wait.
Java driver
Query retry: catch exception and try again. Be careful not to lock up the
system. Why is the query failing in the first place?
Primary reads, fail-over, takes several seconds for an election. Do we get an
exception? Can we do smart retries?
Problem is that sometimes get not-is-master. When primary election, old primary
steps down, disposes of its sockets. What can happen is the exception is
wrapped-IO connection (peer closed or somehthing). In general, trying to detect
reliably is hard because it doesn't look any different from a transient network
failure.
Driver will create new connections—won't tell you that the primary
changed.
Examples?
Return after 3 times on a MongoNetworkingException ?
No, the sublasses of MongoException are too generic and the actual
error depends hugely on whether returned from mongod or
mongos . mongos is a pipe binding with the socket, so no help
there.
Any network or server error is recoverable. DuplicateKey isn't
recoverable. InternalException is driver's SOL error.
NullPointerException shouldn't lead to retry. Most people retry
quickly, then wait longer and longer, etc.
Election could take 10 seconds.
- If DB is unavailable, save query for later time (if such an
action is applicable).
Mongo options
By default, timeout is 0, meaning never time out.
Close cursors
Will have a beneficent effect on garbage collection. Check Morphia out for
access to this cursor!
Random notes last morning...
Can make call on connection to get replica set status stuff if need be. Based
on that, you could make decisions on increasing or decreasing the
WriteConcern level, Etc.
Discussed
http://docs.mongodb.org/ecosystem/drivers/java-concurrency/ (but I
drew no conclusions on this.)
WriteResult.getN() gets the number of documents updated.