Log segments are stored at /data/kafka. Within the message logs, individual segment files accrue as messages are produced.
To keep the log from growing without bound, you can enable log compaction, which retains only the most recent message for each key.
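As a rough illustration of what compaction keeps, here is a minimal sketch that retains only the newest record per key. The `(offset, key, value)` tuple layout and the `compact` function are hypothetical and for illustration only. The sketch ignores tombstones and the fact that a real broker compacts in the background, segment by segment.

```python
def compact(records):
    """Keep only the newest record for each key.

    `records` is an iterable of (offset, key, value) tuples in offset order.
    This tuple layout is an assumption made for this sketch.
    """
    latest = {}
    for offset, key, value in records:
        # A later record with the same key overwrites the earlier one.
        latest[key] = (offset, key, value)
    # Surviving records keep their original offsets and relative order.
    return sorted(latest.values())


if __name__ == "__main__":
    log = [(0, "a", "1"), (1, "b", "2"), (2, "a", "3")]
    print(compact(log))
```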
You can also delete the index files; the broker regenerates them automatically.
The segment we are currently writing to is called the active segment. By default, the active segment is closed and a new one is opened when it reaches 1 GB or is one week old (whichever comes first). The active segment itself is never deleted.
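The size-or-age threshold above can be modeled as a small predicate. This is a simplified sketch, not the broker's actual logic. The function name, parameter names, and the choice of bytes and milliseconds as units are assumptions. Only the 1 GB and one-week defaults come from the text.

```python
# Defaults taken from the text: 1 GB or one week, whichever comes first.
DEFAULT_SEGMENT_BYTES = 1024 ** 3
DEFAULT_SEGMENT_AGE_MS = 7 * 24 * 60 * 60 * 1000


def segment_limit_reached(size_bytes, age_ms,
                          max_bytes=DEFAULT_SEGMENT_BYTES,
                          max_age_ms=DEFAULT_SEGMENT_AGE_MS):
    """Return True once the active segment hits either limit."""
    return size_bytes >= max_bytes or age_ms >= max_age_ms


if __name__ == "__main__":
    one_day_ms = 24 * 60 * 60 * 1000
    print(segment_limit_reached(512 * 1024 ** 2, one_day_ms))      # small and young
    print(segment_limit_reached(512 * 1024 ** 2, 8 * one_day_ms))  # small but old
```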
The broker keeps an open file handle to every segment in every partition, including inactive segments. Make sure the operating system's open-file limit is high enough for that many files.
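One way to sanity-check this is to count segment files on disk and compare the total with the open-file limit. The sketch below assumes a Unix-like system and assumes that segment files end in `.log`. The helper names and the `headroom` parameter are hypothetical. It reads the limit of the Python process that runs it, which may differ from the limit of the broker process.

```python
import os
import resource  # Unix-only standard library module


def count_segment_files(log_dir):
    """Count files ending in .log anywhere under log_dir."""
    count = 0
    for _root, _dirs, files in os.walk(log_dir):
        count += sum(1 for name in files if name.endswith(".log"))
    return count


def fits_open_file_limit(log_dir, headroom=0.5):
    """Return True if the segment count stays below a fraction of the soft limit.

    `headroom` is an illustrative safety margin, not a Kafka setting.
    """
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return count_segment_files(log_dir) <= soft_limit * headroom


if __name__ == "__main__":
    print(count_segment_files("/data/kafka"), fits_open_file_limit("/data/kafka"))
```

Index files and network sockets also consume handles, so treat the result as a lower bound on what the broker needs.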
A stream is a sequence of events. Because Kafka does not rely on an external processing framework, any type of stream can exist.
Multi-Cluster architectures include:
MirrorMaker can be used for replicating data between two data centers. It is a collection of consumers in a consumer group.
The group reads data from the set of topics you specify. MirrorMaker runs one thread per consumer; each consumer reads events from the source cluster, and a shared producer sends them to the target cluster.
For configuration, MirrorMaker has a few options. You typically install it as a service and run it at the destination data center. The biggest task is monitoring lag to ensure the destination cluster is not falling behind the source.
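Lag here means how far the consumers' committed offsets trail the latest offsets on the source cluster. The following is a minimal sketch of that arithmetic, assuming the offsets have already been collected into dictionaries keyed by `(topic, partition)`. The function names and data shapes are hypothetical, and fetching the offsets from a real cluster is out of scope.

```python
def partition_lag(log_end_offsets, committed_offsets):
    """Return lag per partition as log end offset minus committed offset.

    Both arguments map (topic, partition) -> offset.
    Assumption: a partition with no committed offset is treated as offset 0.
    """
    lag = {}
    for partition, end_offset in log_end_offsets.items():
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end_offset - committed, 0)
    return lag


def total_lag(log_end_offsets, committed_offsets):
    """Sum the lag across all partitions."""
    return sum(partition_lag(log_end_offsets, committed_offsets).values())


if __name__ == "__main__":
    end = {("orders", 0): 120, ("orders", 1): 75}
    committed = {("orders", 0): 100, ("orders", 1): 75}
    print(partition_lag(end, committed), total_lag(end, committed))
```

A total that keeps growing over successive samples suggests the destination cluster is falling behind the source.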
Metrics can be accessed from:
get /brokers/ids/3
Use an Intelligent Platform Management Interface (IPMI) to monitor hardware health.
Problems:
Metric | Question it answers |
---|---|
ACTIVE CONTROLLER COUNT | Is the broker the controller? |
REQUEST HANDLER IDLE RATIO | How much load is the broker under? |
ALL TOPICS BYTES IN | Do I need to scale up the number of brokers? |
ALL TOPICS BYTES OUT | How high is consumer traffic? |
ALL TOPICS MESSAGES IN | How many messages per second? |
PARTITION COUNT | How many partitions are assigned to this broker? |
LEADER COUNT | How many partitions is this broker a leader for? |
OFFLINE PARTITIONS | How many partitions have no leader? |
REQUEST METRICS | How many requests are going to the broker? |
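
Two of the metrics above lend themselves to simple cluster-wide checks: exactly one broker should report itself as the active controller, and the offline partition count should be zero. This is a minimal sketch, assuming the metric values have already been collected into plain dictionaries. The key names `active_controller_count` and `offline_partitions` and the function `cluster_warnings` are hypothetical, not Kafka's own metric names.

```python
def cluster_warnings(brokers):
    """Return a list of warning strings for a snapshot of broker metrics.

    `brokers` maps broker id -> dict with the (hypothetical) keys
    "active_controller_count" and "offline_partitions".
    """
    warnings = []

    # Across the whole cluster, exactly one broker should be the controller.
    controllers = sum(m["active_controller_count"] for m in brokers.values())
    if controllers != 1:
        warnings.append(f"expected exactly 1 active controller, found {controllers}")

    # Any partition without a leader is unavailable to clients.
    for broker_id, metrics in sorted(brokers.items()):
        if metrics["offline_partitions"] > 0:
            warnings.append(
                f"broker {broker_id} reports {metrics['offline_partitions']} offline partitions"
            )
    return warnings


if __name__ == "__main__":
    snapshot = {
        1: {"active_controller_count": 1, "offline_partitions": 0},
        2: {"active_controller_count": 0, "offline_partitions": 0},
    }
    print(cluster_warnings(snapshot))
```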