Cluster Backlog Growing Due to Redis Communication Issue

Version

5.2.x

Issue

The cluster has a growing backlog. Sensors are checking in but not submitting data.

Symptoms

/var/log/cb/enterprise/enterprise.log:

2016-03-01 23:31:01 [14244] <err>  cb.enterprise.tasks.throttle_calc - Error updating sensor data throttle.  Trying again in 10 seconds.

Traceback (most recent call last):

  File "/usr/lib/python2.6/site-packages/cb/enterprise/tasks/throttle_calc.py", line 45, in work

  File "/usr/lib/python2.6/site-packages/cb/core/throttle/throttler.py", line 28, in update

  File "/usr/lib/python2.6/site-packages/cb/core/throttle/throttle_rates.py", line 22, in update

  File "/opt/jenkins/builds/workspace/build-cbent-release-5.1.1/code/coreservices/src/cb/core/throttle/rate.py", line 24, in update

  File "/opt/jenkins/builds/workspace/build-cbent-release-5.1.1/code/coreservices/src/cb/core/throttle/rate.py", line 45, in update

  File "/usr/lib/python2.6/site-packages/cb/core/throttle/pressured_rate.py", line 22, in _calc_rate

  File "/opt/jenkins/builds/workspace/build-cbent-release-5.1.1/code/coreservices/src/cb/core/throttle/rate_pressure.py", line 33, in update

  File "/opt/jenkins/builds/workspace/build-cbent-release-5.1.1/code/coreservices/src/cb/core/throttle/pressures/datastore_cache_pressure.py", line 34, in _update

  File "/opt/jenkins/builds/workspace/build-cbent-release-5.1.1/code/coreservices/src/cb/core/throttle/pressures/datastore_cache_pressure.py", line 53, in _get_stats

  File "/opt/jenkins/builds/workspace/build-cbent-release-5.1.1/code/coreservices/src/cb/core/stats/query.py", line 43, in collect_stats

  File "/usr/lib/python2.6/site-packages/cb/core/stats/query_source.py", line 114, in _collect_stats

  File "/usr/lib/python2.6/site-packages/cb/core/stats/query_source.py", line 117, in _query_redis

  File "/usr/lib/python2.6/site-packages/redis/client.py", line 1387, in hgetall

    return self.execute_command('HGETALL', name)

  File "/usr/lib/python2.6/site-packages/redis/client.py", line 397, in execute_command

    connection.send_command(*args)

  File "/usr/lib/python2.6/site-packages/redis/connection.py", line 306, in send_command

    self.send_packed_command(self.pack_command(...

2016-03-01 23:31:01 [14244] <err>  cb.enterprise.tasks.throttle_calc -

                                            ...*args))

  File "/usr/lib/python2.6/site-packages/redis/connection.py", line 288, in send_packed_command

    self.connect()

  File "/usr/lib/python2.6/site-packages/redis/connection.py", line 235, in connect

    raise ConnectionError(self._error_message(e))

ConnectionError: Error 111 connecting 10.107.74.45:6379. Connection refused.

2016-03-01 23:32:11 [14244] <err>  cb.enterprise.tasks.throttle_calc - Error updating sensor data throttle.  Trying again in 10 seconds.

Traceback (most recent call last):

  File "/usr/lib/python2.6/site-packages/cb/enterprise/tasks/throttle_calc.py", line 45, in work

  File "/usr/lib/python2.6/site-packages/cb/core/throttle/throttler.py", line 28, in update

  File "/usr/lib/python2.6/site-packages/cb/core/throttle/throttle_rates.py", line 22, in update

  File "/opt/jenkins/builds/workspace/build-cbent-release-5.1.1/code/coreservices/src/cb/core/throttle/rate.py", line 24, in update

  File "/opt/jenkins/builds/workspace/build-cbent-release-5.1.1/code/coreservices/src/cb/core/throttle/rate.py", line 45, in update

  File "/usr/lib/python2.6/site-packages/cb/core/throttle/pressured_rate.py", line 22, in _calc_rate

  File "/opt/jenkins/builds/workspace/build-cbent-release-5.1.1/code/coreservices/src/cb/core/throttle/rate_pressure.py", line 33, in update

  File "/usr/lib64/python2.6/contextlib.py", line 34, in __exit__

    self.gen.throw(type, value, traceback)

  File "/opt/jenkins/builds/workspace/build-cbent-release-5.1.1/code/coreservices/src/cb/core/throttle/rate_pressure.py", line 63, in _all_stop_context

  File "/opt/jenkins/builds/workspace/build-cbent-release-5.1.1/code/coreservices/src/cb/core/throttle/rate_pressure.py", line 33, in update

  File "/opt/jenkins/builds/workspace/build-cbent-release-5.1.1/code/coreservices/src/cb/core/throttle/pressures/datastore_cache_pressure.py", line 34, in _update

  File "/opt/jenkins/builds/workspace/build-cbent-release-5.1.1/code/coreservices/src/cb/core/throttle/pressures/datastore_cache_pressure.py", line 53, in _get_stats

  File "/opt/jenkins/builds/workspace/build-cbent-release-5.1.1/code/coreservices/src/cb/core/stats/query.py", line 43, in collect_stats

  File "/usr/lib/python2.6/site-packages/cb/core/stats/query_source.py", line 114, in _collect_stats

  File "/usr/lib/python2.6/site-packages/cb/core/stats/query_sources/datastore_solr_client_stats.py", line 8, in next_time_interval

  File "/usr/lib/python2.6/site-packages/cb/core/stats/query_source.py", line 41, in next_time_interval

KeyError: 'num_bytes_pushed'
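
The following is a minimal sketch for checking whether a server's log shows the same symptoms. It assumes Python 3 is available to run it (the product's own runtime in the traceback is Python 2.6) and that the log lives at the path shown above. The function names and signature labels are hypothetical; only the matched strings come from the log excerpt.

```python
import re
from collections import Counter

# Strings copied from the log excerpt in this article; the labels are illustrative.
SIGNATURES = {
    "throttle_update_failed": re.compile(r"Error updating sensor data throttle"),
    "redis_connection_refused": re.compile(r"ConnectionError: Error 111 connecting"),
    "stats_keyerror": re.compile(r"KeyError: 'num_bytes_pushed'"),
}


def count_signatures(lines):
    """Count how many lines match each symptom signature."""
    counts = Counter()
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


def scan_log(path="/var/log/cb/enterprise/enterprise.log"):
    """Scan a log file line by line and return the signature counts."""
    with open(path, errors="replace") as handle:
        return count_signatures(handle)
```

A nonzero `stats_keyerror` count that keeps growing after the connection-refused errors have stopped would be consistent with the throttle being stuck, although this sketch does not compare timestamps.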

Cause

The sensor eventlog throttle may end up in a permanently bad state if the Redis connection is lost and then restored.
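
To tell whether Redis is still unreachable or the connection has already come back, a plain TCP check against the Redis port can help. This is a sketch only, assuming Python 3 and the default port 6379 seen in the log. The function name is hypothetical, and it tests TCP reachability, not Redis health.

```python
import socket


def redis_port_open(host, port=6379, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused (Error 111 on Linux), timeouts,
        # and unreachable or unresolvable hosts.
        return False
```

For example, `redis_port_open("10.107.74.45")` would probe the address from the log above. A `True` result does not mean the throttle has recovered: as described here, the bad state can persist after the connection is restored.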

Solution

Workaround

Restart services:

  1. Standalone server:
    service cb-enterprise restart
  2. Cluster:
    /usr/share/cb/cbcluster stop
    /usr/share/cb/cbcluster start
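
If the restart needs to be scripted, the steps above can be sketched as follows. This assumes Python 3.5 or later, root privileges, and that the commands are run on the standalone server or on the cluster node where `cbcluster` is normally used. The function names and the `dry_run` flag are hypothetical; only the commands themselves come from this article.

```python
import subprocess


def restart_commands(cluster):
    """Return the restart commands from this article as argv lists."""
    if cluster:
        return [
            ["/usr/share/cb/cbcluster", "stop"],
            ["/usr/share/cb/cbcluster", "start"],
        ]
    return [["service", "cb-enterprise", "restart"]]


def restart_services(cluster, dry_run=True):
    """Print the commands, or run them in order when dry_run is False."""
    for argv in restart_commands(cluster):
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True stops at the first command that fails.
            subprocess.run(argv, check=True)
```

Calling `restart_services(cluster=True)` only prints the two cluster commands; pass `dry_run=False` to execute them.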

Resolution

This issue is resolved in 5.3.1 and 6.x. See the 5.3.1 release notes for more information on CB-8472.

Creation Date: 04-13-2017