Environment
Symptoms
After a sensor group client certificate is revoked, the sensors in that group go offline.
Cause
Revoking a sensor group client cert is intended for cases where a bad actor is suspected, so sensors that present the revoked cert can no longer connect. This is expected behavior.
Resolution
If many sensors are now offline and need to be reconnected, perform the following steps:
- Find the cert ID for your sensor group ID
psql -p 5002 cb -c "select id from sensor_client_certs where sensor_group_id = <group_id>;"
- Clear the revocation_time on the cert in Postgres
psql -p 5002 cb -c "update sensor_client_certs set revocation_time = null where id = '<cert_id_here>';"
- Reload the cert into cb-datagrid
/usr/share/cb/cbdatagrid evict SensorClientCert <cert_id_here>
- Allow time for the sensors to connect and get the updated cert, then revoke the old cert again
/usr/share/cb/cbssl sensor_certs --revoke --cert-id=<cert_id_here>
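The un-revoke, reload, and re-revoke steps above all take the same cert ID, so it can help to generate the three commands together and review them before running anything. The following is a minimal sketch, not a supported tool: the function name print_recovery_commands is hypothetical, and it only prints the commands from this article with the cert ID filled in. It does not run them, because the wait before the final revoke has to be judged manually.

```shell
#!/bin/sh
# Sketch only: print (do not execute) the recovery commands for one cert ID.
# print_recovery_commands is a hypothetical helper name; the commands it
# prints are copied from the steps in this article.
print_recovery_commands() {
    cert_id="$1"
    if [ -z "$cert_id" ]; then
        echo "usage: print_recovery_commands <cert_id>" >&2
        return 1
    fi
    # printf repeats the format for each argument, giving one command per line.
    printf '%s\n' \
        "psql -p 5002 cb -c \"update sensor_client_certs set revocation_time = null where id = '${cert_id}';\"" \
        "/usr/share/cb/cbdatagrid evict SensorClientCert ${cert_id}" \
        "/usr/share/cb/cbssl sensor_certs --revoke --cert-id=${cert_id}"
}
```

After sourcing the snippet, call print_recovery_commands with the ID returned by the first query. Run the first two printed commands, wait for the sensors to check in, and only then run the third.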
Additional Notes
To prevent sensors from going offline when a cert is revoked, perform the following steps:
- Move all sensors out of the group whose cert will be revoked and into another group, then give them time to check in and get the update (a few hours to be safe)
- Revoke the cert
- Move the sensors back