Cb Response: Response 6.1 upgrade, Startup halts after "Waiting for cb-datagrid to initialize..." in cluster
Carbon Black Response
On startup after an upgrade to 6.1 or later, the cb-datagrid component reports an OK startup, but then shows "Waiting for cb-datagrid to initialize..." and startup eventually halts.
The /var/log/cb/datagrid/debug.log file (on the master or a minion) may contain messages like the following:
Wrong bind request from [<ip>]:5701! This node is not requested endpoint: [<ip>]:5701
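A quick way to look for this message is to search the log for its fixed prefix. The following is a minimal sketch: the helper name count_bind_errors is hypothetical, and only the log path and message text come from this article.

```shell
# Hypothetical helper: count "Wrong bind request" lines in a datagrid debug log.
count_bind_errors() {
    # grep -c prints the number of matching lines. It exits non-zero when the
    # count is 0, so "|| true" keeps the function usable under "set -e".
    grep -c 'Wrong bind request' "$1" || true
}

# Example (run on the master or a minion):
#   count_bind_errors /var/log/cb/datagrid/debug.log
```

A non-zero count on any node suggests the cause described below is worth checking.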
This behavior occurs when /etc/cb/cluster.conf is not consistent across cluster nodes, or when port 5701 is not open between the master and minions. An inconsistent configuration is a problem in 6.x because the datagrid component reads this file to determine which servers are allowed to connect.
Confirm port 5701 is open and accessible on Master and Minion nodes.
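One possible way to test reachability without extra tools is bash's built-in /dev/tcp redirection. This is a sketch under assumptions: the helper name port_open and the hostname in the example are placeholders, and the check requires bash and the coreutils timeout command.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port can be opened.
# Uses bash's /dev/tcp redirection; gives up after 3 seconds.
port_open() {
    # Inside the inner shell, $0 is the host and $1 is the port.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example, run from the master against each minion and from each minion
# against the master ("minion1.example.com" is a placeholder hostname):
#   port_open minion1.example.com 5701 && echo "5701 reachable" || echo "5701 not reachable"
```

A failed connection does not by itself prove a firewall problem: it can also mean that nothing is currently listening on the port on the remote node, so interpret the result alongside the firewall rules.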
Check the content of /etc/cb/cluster.conf on the master and all minions and confirm that the files are exactly the same.
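The comparison can be done in several ways. As one illustrative sketch, each minion's file can be copied to the master and compared byte for byte with cmp. The helper name same_config, the hostname, and the /tmp path are all placeholders, not part of the product.

```shell
# Hypothetical helper: succeed only if two copies of cluster.conf are byte-identical.
same_config() {
    # cmp -s is silent; exit status 0 means identical, non-zero means
    # the files differ or one of them could not be read.
    cmp -s "$1" "$2"
}

# Example, run on the master ("minion1.example.com" is a placeholder):
#   scp minion1.example.com:/etc/cb/cluster.conf /tmp/cluster.conf.minion1
#   same_config /etc/cb/cluster.conf /tmp/cluster.conf.minion1 \
#       && echo "identical" || echo "DIFFERENT"
```

A byte-for-byte comparison is deliberately strict: it also flags a file that names the same host by IP address in one copy and by FQDN in another.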
Note: If an FQDN is used in one cluster.conf, it must be used consistently across all minions. Do not mix IP addresses and FQDNs across the cluster.conf files on the master and minions.
If the files are not exactly the same, sync all minions with the master's /etc/cb/cluster.conf, then shut down and restart the cluster as per the following procedure.
Once the cluster shutdown completes, execute the following on the master and all minions:
killall -KILL -u cb
WARNING: This also kills any integrations that run as the 'cb' user; these will need to be restarted afterward.
Then start the cluster from the master and confirm whether the issue is resolved.