Kafka_consumer.yaml (python style) and more

Hi,

As a followup to the article i posted earlier ( https://log-it.ro/2019/03/15/get-the-info-you-need-from-consumer-group-python-style/ ) , you can use that info to put in into kafka_consumer.yaml for Datadog integration.

It’s not elegant by any means, but it works. As an advise, please don’t over complicate thinks more than they need.

In the last example i figured i wanted to create a list of GroupInfo objects for each line that was returned from consumer group script. Bad idea as you shall see below

So, in addition to what i wrote in the last article, now it’s not just printing the dictionary but order it, by partition.

def constructgroupdict():
 groupagregate = {}
 group_list = getgroups()
 for group in group_list:
    groupagregate[group] = getgroupinfo(group)
 
 for v in groupagregate.values():
    v.sort(key = lambda re: int(re.partition))
 
 return groupagregate

def printgroupdict():
 groupdict = constructgroupdict()
 infile = open('/etc/datadog-agent/conf.d/kafka_consumer.d/kafka_consumer.yaml.template','a')
 for k,v in groupdict.items():
    infile.write('      '+k+':\n')
    topics = []
    testdict = {}
    for re in v:
        if re.topic not in topics:
           topics.append(re.topic)
    for x in topics:
        partitions = []
        for re in v:
           if (re.topic == x):
              partitions.append(re.partition)
        testdict[x] = partitions
    for gr,partlst in testdict.items():
        infile.write('        '+gr+': ['+', '.join(partlst)+']\n')
 infile.close()
 os.rename('/etc/datadog-agent/conf.d/kafka_consumer.d/kafka_consumer.yaml.template','/etc/datadog-agent/conf.d/kafka_consumer.d/kafka_consumer.yaml')
  
printgroupdict()

And after that, it’s quite hard to get only the unique value for the topic name.

The logic i chose to grab all the data per consumer group is related to the fact that querying the cluster takes a very long time, so if i wanted to grab another set of data filtered by topic, i would have been very time costly.

In the way that is written now, there are a lot of for loop, that should become challenging in care there are too many records to process. Fortunately, this should not be the case for consumer groups in a normal case.

The easiest way to integrate the info in kafka_consumer.yaml, in our case is to create a template called kafka_consumer.yaml.template

init_config:
  # Customize the ZooKeeper connection timeout here
  # zk_timeout: 5
  # Customize the Kafka connection timeout here
  # kafka_timeout: 5
  # Customize max number of retries per failed query to Kafka
  # kafka_retries: 3
  # Customize the number of seconds that must elapse between running this check.
  # When checking Kafka offsets stored in Zookeeper, a single run of this check
  # must stat zookeeper more than the number of consumers * topic_partitions
  # that you're monitoring. If that number is greater than 100, it's recommended
  # to increase this value to avoid hitting zookeeper too hard.
  # https://docs.datadoghq.com/agent/faq/how-do-i-change-the-frequency-of-an-agent-check/
  # min_collection_interval: 600
  #
  # Please note that to avoid blindly collecting offsets and lag for an
  # unbounded number of partitions (as could be the case after introducing
  # the self discovery of consumer groups, topics and partitions) the check
  # will collect at metrics for at most 200 partitions.


instances:
  # In a production environment, it's often useful to specify multiple
  # Kafka / Zookeper nodes for a single check instance. This way you
  # only generate a single check process, but if one host goes down,
  # KafkaClient / KazooClient will try contacting the next host.
  # Details: https://github.com/DataDog/dd-agent/issues/2943
  #
  # If you wish to only collect consumer offsets from kafka, because
  # you're using the new style consumers, you can comment out all
  # zk_* configuration elements below.
  # Please note that unlisted consumer groups are not supported at
  # the moment when zookeeper consumer offset collection is disabled.
  - kafka_connect_str:
      - localhost:9092
    zk_connect_str:
      - localhost:2181
    # zk_iteration_ival: 1  # how many seconds between ZK consumer offset
                            # collections. If kafka consumer offsets disabled
                            # this has no effect.
    # zk_prefix: /0.8

    # SSL Configuration

    # ssl_cafile: /path/to/pem/file
    # security_protocol: PLAINTEXT
    # ssl_check_hostname: True
    # ssl_certfile: /path/to/pem/file
    # ssl_keyfile: /path/to/key/file
    # ssl_password: password1

    # kafka_consumer_offsets: false
    consumer_groups:

It’s true that i keep only one string for connectivity on Kafka and Zookeeper, and that things are a little bit more complicated once SSL is configured (but this is not our case, yet).

  - kafka_connect_str:
      - localhost:9092
    zk_connect_str:
      - localhost:2181

And append the info at the bottom of it after which it is renamed. Who is putting that template back? Easy, that would be puppet.

It works, it has been tested. One last thing that i wanted to warn you about.

There is a limit of metrics that can be uploaded per machine, and that is 350. Please be aware and think very serious if you want to activate it.

Than would be all for today.

Cheers