The ZooKeeper cluster had been running well for half a year. But today, after I reconfigured it and ran the startup command,
it failed to start and reported:
log4j:WARN No appenders could be found for logger (org.apache.zookeeper.server.quorum.QuorumPeerConfig).
log4j:WARN Please initialize the log4j system properly.
Invalid config, exiting abnormally
The key part is the last line, “Invalid config” (the log4j lines are only warnings), so I reviewed zoo.cfg many times but found no mistake at all.
After checking every configuration file, I eventually found the problem: the “myid” file was missing. After I added the “myid” file, ZooKeeper started correctly.
echo [server id] > /var/log/zookeeper/myid (The directory is 'dataDir' in zoo.cfg; the id is the number N from this host's server.N line in zoo.cfg, not a hostname or IP)
It seems ZooKeeper's error message is misleading: it says the config is invalid, but the real cause is a missing file that zoo.cfg depends on.
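Since the error message does not name the missing file, a quick check before starting the server can save some time. The following is only a minimal sketch: check_myid is a hypothetical helper (not part of ZooKeeper), and it assumes that zoo.cfg contains a plain dataDir=... line with no surrounding spaces.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with ZooKeeper: given the path to a
# zoo.cfg, report whether the dataDir it names contains a myid file.
# Assumes a plain "dataDir=/some/path" line without surrounding spaces.
check_myid() {
    cfg="$1"
    # Take the value of the last dataDir= line in the config file.
    data_dir=$(sed -n 's/^dataDir=//p' "$cfg" | tail -n 1)
    if [ -z "$data_dir" ]; then
        echo "no dataDir in $cfg"
        return 2
    fi
    if [ -f "$data_dir/myid" ]; then
        echo "myid found: $(cat "$data_dir/myid")"
        return 0
    fi
    echo "myid missing in $data_dir"
    return 1
}
```

Calling check_myid with the path to zoo.cfg prints either the id stored in myid or a note that the file is missing, and the return code can be used in a startup script.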
To tolerate the failure of up to four servers, we assumed that a five-server ZooKeeper cluster would be enough. After studying Paxos for a while, a question occurred to me: the majority of a five-server cluster is three servers, so how could ZooKeeper elect a new leader if more than two servers are down? So I ran a test and found that ZooKeeper does indeed stop working when more than two servers are shut down.
The correct size for a ZooKeeper cluster that can tolerate the failure of four servers is nine, because after four servers shut down, the five survivors still form a majority of the nine-server cluster. In general, tolerating f failures requires 2f + 1 servers.
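The majority arithmetic above can be written down as a few small shell functions. This is just an illustration of the counting rule; the function names are made up for this post and are not ZooKeeper commands.

```shell
#!/bin/sh
# Illustrative helpers for majority-quorum arithmetic (names are
# hypothetical, not part of ZooKeeper).

# majority N: smallest number of servers that is more than half of N.
majority() {
    echo $(( $1 / 2 + 1 ))
}

# tolerated_failures N: how many of N servers may fail while a
# majority is still alive.
tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

# ensemble_size_for F: smallest cluster that survives F failures.
ensemble_size_for() {
    echo $(( 2 * $1 + 1 ))
}
```

For example, tolerated_failures 5 gives 2, which matches the test result above, and ensemble_size_for 4 gives 9. Note that tolerated_failures 6 is also 2, so adding a sixth server to a five-server cluster does not buy any extra fault tolerance.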