Kafka

A list of how-to guides for managing Kafka

1 - Create Kafka Topic

How to create a Kafka topic from the GitHub config repository

Create Kafka Topic

A Topic is a category/feed name to which records are stored and published. Each topic has a unique name across the entire Kafka cluster. Topics are partitioned, meaning a topic is spread over a number of “buckets” located on different Kafka brokers.

For a more detailed explanation of Kafka topics and architecture, please see What is Apache Kafka? from Confluent, or watch the Apache Kafka Fundamentals video on YouTube.

Prerequisites

  • Write access to the rhodes repository
  • An understanding of how topic partitions and replicas work

Step 1 - Create a new domain/folder to store KafkaTopic manifest

We use domains to categorize Kafka topics; each domain is represented by a folder in the rhodes repository.

All Kafka topics are stored in deployment/{environment}/topic/{domain}/{topic-name}.yaml. Skip this step if the domain folder already exists.

Register the new folder as an ArgoCD app by updating apps/{environment}/kafka-topic-set.application.yaml:

# kafka-topic-set.application.yaml

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
...
spec:
...
  generators:
  - list:
      elements:
      ...
      - domain: new-domain
        # Insert environment here {dev,integration,stg,prod}
        namespace: dev
        branch: master
    ...

Then create the new domain folder:

mkdir -p deployment/{environment}/topic/{domain}

# Example from rhodes root directory
mkdir -p deployment/dev/topic/new-domain

Step 2 - Create KafkaTopic manifest

Create a new KafkaTopic manifest in the domain folder, named {topic-name}.yaml:

# topic-name.yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: insert-topic-name-here
  labels:
    strimzi.io/cluster: dev
spec:
  partitions: 10
  replicas: 3
  topicName: topic-name-if-different-from-metadata-name
  config:
    cleanup.policy: compact

IMPORTANT: Don’t forget to set the metadata.labels."strimzi.io/cluster" value to the name of the environment you want to deploy to.

Step 3 - PR and merge manifest to master branch

Create a branch and push your changes. Request a review from other engineers, and merge once approved.

Step 4 - Check Topic in kafka-ui

After the PR is merged, check that your topic was created via kafka-ui.

List URL for kafka-ui
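If you have kubectl access to the cluster, you can also inspect the KafkaTopic resource directly. A sketch, assuming the dev namespace and the example topic name from Step 2:

```shell
# List KafkaTopic resources in the dev namespace (namespace is an assumption)
kubectl get kafkatopics -n dev

# Wait until the Strimzi Topic Operator marks the topic Ready
kubectl wait kafkatopic/insert-topic-name-here -n dev \
  --for=condition=Ready --timeout=60s
```

If the topic never becomes Ready, check the Topic Operator logs for validation errors in the manifest.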

FAQ

Spec value for KafkaTopic manifest

partitions and replicas value

Kafka Partition and replica explanation from Stack Overflow

We recommend setting replicas: 3 for each topic to ensure durability and high availability. Three replicas ensure that the topic is replicated across three different availability zones in GCP.

For partitions, we recommend partitions: 10 as a minimum, or more (it’s okay to overprovision partitions). This value is related to throughput on the consumer side. You can use the Sizing Calculator for Apache Kafka to calculate the number of partitions you need.
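As a rough sketch of that sizing arithmetic (both throughput numbers below are illustrative assumptions about your workload, not measurements):

```shell
# Rough partition sizing: enough partitions so consumers can keep up.
TARGET_MBPS=50        # assumed peak throughput the consumer group must handle
PER_PARTITION_MBPS=5  # assumed throughput one consumer can process per partition

# Ceiling division: partitions = ceil(TARGET / PER_PARTITION)
echo $(( (TARGET_MBPS + PER_PARTITION_MBPS - 1) / PER_PARTITION_MBPS ))  # -> 10
```

With these numbers the sketch prints 10, matching the recommended minimum; when in doubt, round up rather than down, since partitions are cheap to overprovision but hard to reduce later.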

topicName value

The KafkaTopic manifest creates a topic with the same name as the metadata.name value. However, Kubernetes object names are limited to alphanumeric characters and the - character. To create a topic whose name contains other characters, use the spec.topicName parameter.

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: topic-name-with-dot
  labels:
    strimzi.io/cluster: dev
spec:
  topicName: topic.name.with.dot
  ...

config value

Please read the Confluent Topic Configurations docs to see all the topic configurations (except for confluent.*; we don’t use Confluent Platform).

cleanup.policy

A string that is either “delete” or “compact” or both. This string designates the retention policy to use on old log segments. The default policy (“delete”) will discard old segments when their retention time or size limit has been reached. The “compact” setting will enable log compaction on the topic.

min.insync.replicas

When a producer sets acks to “all” (or “-1”), this configuration specifies the minimum number of replicas that must acknowledge a write for the write to be considered successful. If this minimum cannot be met, then the producer will raise an exception (either NotEnoughReplicas or NotEnoughReplicasAfterAppend).

When used together, min.insync.replicas and acks allow you to enforce greater durability guarantees. A typical scenario would be to create a topic with a replication factor of 3, set min.insync.replicas to 2, and produce with acks of “all”. This will ensure that the producer raises an exception if a majority of replicas do not receive a write.
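A manifest sketch applying this pattern, following the same shape as the examples below (the topic name is illustrative; producers must separately set acks=all for min.insync.replicas to take effect):

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: durable-topic   # illustrative name
  labels:
    strimzi.io/cluster: dev
spec:
  partitions: 10
  replicas: 3
  config:
    # With acks=all, a write succeeds only once 2 of the 3 replicas have it
    min.insync.replicas: 2
```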

Examples

Create topic with special character

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: topic-name-with-dot
  labels:
    strimzi.io/cluster: dev
spec:
  partitions: 10
  replicas: 3
  topicName: topic.name.with.dot
  config:
    cleanup.policy: compact

Create topic with 7 days retention in production

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: retention-topic
  labels:
    strimzi.io/cluster: prod
spec:
  partitions: 10
  replicas: 3
  config:
    cleanup.policy: delete
    retention.ms: 604800000