Kafka
1 - Create Kafka Topic
Create Kafka Topic
A Topic is a category/feed name to which records are stored and published. Each topic has a unique name across the entire Kafka cluster. Topics are partitioned, meaning a topic is spread over a number of “buckets” located on different Kafka brokers.
For a more detailed explanation of Kafka topics and architecture, see What is Apache Kafka? from Confluent, or watch the Apache Kafka Fundamentals video on YouTube.
Prerequisites
- Have write access to the rhodes repository
- Understand how topic partitions and replicas work
Step 1 - Create a new domain/folder to store KafkaTopic manifest
We use domains to categorize Kafka topics; each domain is represented by a folder in the rhodes repository.
All Kafka topic manifests should be stored in deployment/{environment}/topic/{domain}/{topic-name}.yaml. Skip this step if the domain folder already exists.
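For illustration, a hypothetical payment domain with one topic in the dev environment would be laid out like this (the domain and topic names are made up):

```text
deployment/
└── dev/
    └── topic/
        └── payment/
            └── payment-events.yaml
```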
Register the new folder as an ArgoCD app by updating apps/{environment}/kafka-topic-set.application.yaml:
# kafka-topic-set.application.yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
...
spec:
  ...
  generators:
  - list:
      elements:
      ...
      - domain: new-domain
        # Insert environment here {dev,integration,stg,prod}
        namespace: dev
        branch: master
      ...
Then create a new domain folder
mkdir -p deployment/{environment}/topic/{domain}
# Example from rhodes root directory
mkdir -p deployment/dev/topic/new-domain
Step 2 - Create KafkaTopic manifest
Create a new KafkaTopic manifest in the domain folder, named {topic-name}.yaml:
# topic-name.yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: insert-topic-name-here
  labels:
    strimzi.io/cluster: dev
spec:
  partitions: 10
  replicas: 3
  topicName: topic-name-if-different-from-metadata-name
  config:
    cleanup.policy: compact
IMPORTANT: Don’t forget to set the metadata.labels.strimzi.io/cluster value to the name of the environment you want to deploy to.
Step 3 - PR and merge manifest to the master branch
Create a branch and push your changes. Request a review from other engineers. Merge after approval.
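The branch-and-push flow can be sketched as below. To make the commands exercisable end to end, this sets up a throwaway local "remote"; in a real rhodes clone you would run only the checkout/add/commit/push lines. The branch name, file contents, and user identity here are illustrative.

```shell
set -e
tmp=$(mktemp -d)
# Throwaway stand-in for the rhodes remote (illustrative only)
git init -q --bare "$tmp/rhodes.git"
git clone -q "$tmp/rhodes.git" "$tmp/rhodes"
cd "$tmp/rhodes"
git config user.email "you@example.com"
git config user.name "Your Name"
git checkout -q -B master
git commit -q --allow-empty -m "init"
git push -q origin master

# The actual workflow: branch, add the manifest, commit, push, then open a PR.
git checkout -q -b add-new-domain-topic
mkdir -p deployment/dev/topic/new-domain
printf 'apiVersion: kafka.strimzi.io/v1beta2\nkind: KafkaTopic\n' \
  > deployment/dev/topic/new-domain/topic-name.yaml
git add deployment/dev/topic/new-domain/topic-name.yaml
git commit -q -m "Add KafkaTopic manifest for new-domain (dev)"
git push -q origin add-new-domain-topic
echo "pushed: $(git rev-parse --abbrev-ref HEAD)"
```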
Step 4 - Check Topic in kafka-ui
After the PR is merged, verify that your topic has been created via kafka-ui.
List URL for kafka-ui
FAQ
Spec values for the KafkaTopic manifest
partitions and replicas values
Kafka Partition and replica explanation from Stack Overflow
We recommend setting replicas: 3 for each topic created, to ensure durability and high availability. Three replicas ensure that the topic is replicated across three different availability zones in GCP.
For partitions, we recommend setting partitions: 10 at minimum, or more (it’s okay to overprovision partitions). This value relates to throughput on the consumer side. You can use the Sizing Calculator for Apache Kafka to calculate the number of partitions you need.
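As a rough illustration of the throughput-based sizing such a calculator performs (the formula and the numbers below are a common rule of thumb, not the calculator's exact model):

```python
import math

def estimate_partitions(target_mb_s: float,
                        per_producer_mb_s: float,
                        per_consumer_mb_s: float,
                        minimum: int = 10) -> int:
    """Rule of thumb: enough partitions that neither the producer nor the
    consumer side becomes the bottleneck, floored at the recommended
    minimum of 10."""
    needed = max(target_mb_s / per_producer_mb_s,
                 target_mb_s / per_consumer_mb_s)
    return max(minimum, math.ceil(needed))

# e.g. 50 MB/s target, 10 MB/s per producer, 4 MB/s per consumer:
# consumers are the bottleneck, ceil(50/4) = 13 partitions
print(estimate_partitions(50, 10, 4))  # → 13
```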
topicName value
A KafkaTopic manifest creates a topic with the same name as the metadata.name value. However, Kubernetes object names are limited to alphanumeric characters and -. To create a topic whose name contains other characters, use the spec.topicName parameter.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: topic-name-with-dot
  labels:
    strimzi.io/cluster: dev
spec:
  topicName: topic.name.with.dot
  ...
config values
Please read the Confluent Topic Configurations docs for all available topic configurations (except confluent.*; we don’t use Confluent Platform).
cleanup.policy
A string that is either “delete” or “compact” or both. This string designates the retention policy to use on old log segments. The default policy (“delete”) will discard old segments when their retention time or size limit has been reached. The “compact” setting will enable log compaction on the topic.
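For example, both policies can be combined by listing them in the value (the quoted comma-separated form is Kafka's config syntax; placing it under spec.config is a sketch following the manifests above):

```yaml
spec:
  config:
    # compact the log, and also delete segments past the retention limits
    cleanup.policy: "compact,delete"
```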
min.insync.replicas
When a producer sets acks to “all” (or “-1”), this configuration specifies the minimum number of replicas that must acknowledge a write for the write to be considered successful. If this minimum cannot be met, then the producer will raise an exception (either NotEnoughReplicas or NotEnoughReplicasAfterAppend).
When used together, min.insync.replicas and acks allow you to enforce greater durability guarantees. A typical scenario would be to create a topic with a replication factor of 3, set min.insync.replicas to 2, and produce with acks of “all”. This will ensure that the producer raises an exception if a majority of replicas do not receive a write.
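Putting that scenario into a manifest, a sketch of a durable production topic might look like the following (the topic name is illustrative; the producer must still set acks to "all" for the guarantee to apply):

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: durable-topic
  labels:
    strimzi.io/cluster: prod
spec:
  partitions: 10
  replicas: 3
  config:
    # with replicas: 3, writes succeed only when a majority acknowledge
    min.insync.replicas: 2
```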
Examples
Create a topic with special characters
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: topic-name-with-dot
  labels:
    strimzi.io/cluster: dev
spec:
  partitions: 10
  replicas: 3
  topicName: topic.name.with.dot
  config:
    cleanup.policy: compact
Create a topic with 7-day retention in production
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: retention-topic
  labels:
    strimzi.io/cluster: prod
spec:
  partitions: 10
  replicas: 3
  config:
    cleanup.policy: delete
    retention.ms: 604800000