Metrics are provided either for an individual node or for all nodes in a cluster or cluster data centre. The set of available metrics will expand as we build out this API.
The possible values for the metrics parameter are listed below:
n::cpuUtilization: Current CPU utilisation as a percentage of total available.
    percentage (ic_node_cpu_utilization)
n::osload: Current OS load.
    last_one_minute (ic_node_osload): Average metric value over 1 minute.
    last_five_minutes (ic_node_osload): Average metric value over 5 minutes.
    last_fifteen_minutes (ic_node_osload): Average metric value over 15 minutes.
n::diskUtilization: Total disk space utilisation, by Cassandra, as a percentage of total available.
    percentage (ic_node_disk_utilization)
n::diskAvailable: Disk space available in bytes.
    value (ic_node_disk_available)
n::diskUsed: Disk space used in bytes.
    value (ic_node_disk_used)
n::cpuguestpercent: Time spent running a virtual CPU for guest operating systems under control of the kernel.
    percentage (ic_node_cpuguestpercent)
n::cpuguestnicepercent: Niced processes executing in user mode in virtual OS.
    percentage (ic_node_cpuguestnicepercent)
n::cpusystempercent: Percentage of processes executing in kernel mode.
    percentage (ic_node_cpusystempercent)
n::cpuidlepercent: Percentage of time when one or more kernel threads are executing with the run queue empty and/or no I/O operations are currently cycling.
    percentage (ic_node_cpuidlepercent)
n::cpuiowaitpercent: CPU time the I/O thread spent waiting for a socket ready for reads or writes as a percent.
    percentage (ic_node_cpuiowaitpercent)
n::cpuirqpercent: Number of hardware interrupts the kernel is servicing.
    percentage (ic_node_cpuirqpercent)
n::cpunicepercent: Percentage of processes executing in user mode which have a positive nice value.
    percentage (ic_node_cpunicepercent)
n::cpusoftirqpercent: Number of software interrupts the kernel is servicing.
    percentage (ic_node_cpusoftirqpercent)
n::cpustealpercent: Percentage of time the hypervisor allocated to other tasks external to the one run on the current virtual CPU.
    percentage (ic_node_cpustealpercent)
n::cpuuserpercent: Processes executing in user mode, including application processes.
    percentage (ic_node_cpuuserpercent)
n::memavailable: Estimate of how much memory is available to start new applications without swap, taking into account page cache and reclaimability of slab.
    value (ic_node_memavailable)
n::networkindelta: Delta count of bytes received.
    value (ic_node_networkindelta)
n::networkoutdelta: Delta count of bytes transmitted.
    value (ic_node_networkoutdelta)
n::networkin: Count of bytes received.
    value (ic_node_networkin)
n::networkout: Count of bytes transmitted.
    value (ic_node_networkout)
n::networkinerrorsdelta: Delta count of receive errors detected.
    value (ic_node_networkinerrorsdelta)
n::networkouterrorsdelta: Delta count of transmit errors detected.
    value (ic_node_networkouterrorsdelta)
n::networkindroppeddelta: Delta count of receive packets dropped.
    value (ic_node_networkindroppeddelta)
n::networkoutdroppeddelta: Delta count of transmit packets dropped.
    value (ic_node_networkoutdroppeddelta)
n::filedescriptorlimit: Maximum number of open files limit for the node OS.
    value (ic_node_filedescriptorlimit)
n::filedescriptoropencount: Current number of open files in the node OS.
    value (ic_node_filedescriptoropencount)
n::tcpestablished: Number of open TCP connections.
    value (ic_node_tcpestablished)
n::tcptimewait: Number of TCP sockets waiting for enough time to pass to be sure the remote TCP received the acknowledgment of its connection termination request.
    value (ic_node_tcptimewait)
n::tcplistening: Number of TCP sockets waiting for a connection request from any remote TCP and port.
    value (ic_node_tcplistening)
n::tcpall: Total number of TCP connections in all states.
    value (ic_node_tcpall)
n::tcpclosewait: Number of TCP sockets whose connection is in the process of being closed.
    value (ic_node_tcpclosewait)

Additional information on troubleshooting Cassandra metrics is available here.
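As a rough illustration of how the identifiers in this list might be assembled into a value for the metrics parameter, here is a minimal Python sketch. The helper names (table_metric, metrics_parameter) and the example keyspace and table are hypothetical, and the comma-separated format is an assumption that should be checked against the request documentation. The cf::{keyspace}::{table}:: pattern is taken from the table-level metric names in this list.

```python
from typing import Iterable, List

# Pattern used by the table-level metric names in this list.
TABLE_METRIC_TEMPLATE = "cf::{keyspace}::{table}::{metric}"


def table_metric(keyspace: str, table: str, metric: str) -> str:
    """Fill in the cf::{keyspace}::{table}::<metric> pattern (hypothetical helper)."""
    return TABLE_METRIC_TEMPLATE.format(keyspace=keyspace, table=table, metric=metric)


def metrics_parameter(names: Iterable[str]) -> str:
    """Join metric identifiers into a single parameter value.

    Assumption: the API accepts a comma-separated list of identifiers.
    """
    unique: List[str] = []
    for name in names:
        # Every identifier in the list carries a prefix such as n:: or cf::.
        if "::" not in name:
            raise ValueError(f"not a metric identifier: {name!r}")
        if name not in unique:  # drop duplicates, keep first-seen order
            unique.append(name)
    return ",".join(unique)


if __name__ == "__main__":
    value = metrics_parameter([
        "n::cpuUtilization",
        "n::osload",
        table_metric("my_keyspace", "my_table", "reads"),
        "n::osload",  # duplicate, dropped
    ])
    print(value)
```

Keeping the template in one constant means a typo in the cf:: prefix or separator only has to be fixed once.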
n::compactions: Number of pending compactions.
    pendingtasks (ic_node_compactions): Number of pending tasks.
n::reads: Reads per second by Cassandra. Returns single partition reads per second with count_per_second, and all reads (Single Partition + Multi Partition + CAS) per second with total_count_per_second.
    total_count_per_second (ic_node_reads)
    count_per_second (ic_node_reads)
n::writes: Writes per second by Cassandra. Returns writes per second with count_per_second and all writes (including CAS) per second with total_count_per_second.
    total_count_per_second (ic_node_writes)
    count_per_second (ic_node_writes)
n::rangeSlices: Range Slice reads by Cassandra.
    count_per_second (ic_node_range_slices)
n::casReads: Compare and Set reads by Cassandra.
    count_per_second (ic_node_cas_reads)
n::casWrites: Compare and Set writes by Cassandra.
    count_per_second (ic_node_cas_writes)
n::clientRequestReadV2: Offers the percentile distribution and average latency per client read request (i.e. the period from when a node receives a client request, gathers the records and responds to the client).
    99thPercentile (ic_node_client_request_read_v2_microseconds): 99th percentile distribution of the metric.
    latency_per_operation (ic_node_client_request_read_v2): Average latency per operation.
    999thPercentile (ic_node_client_request_read_v2_microseconds): 99.9th percentile distribution of the metric.
    95thPercentile (ic_node_client_request_read_v2_microseconds): 95th percentile distribution of the metric.
n::clientRequestWrite: Offers the percentile distribution and average latency per client write request (i.e. the period from when a node receives a client request, gathers the records and responds to the client).
    latency_per_operation (ic_node_client_request_write): Average latency per operation.
    99thPercentile (ic_node_client_request_write_microseconds): 99th percentile distribution of the metric.
    95thPercentile (ic_node_client_request_write_microseconds): 95th percentile distribution of the metric.
n::clientRequestRangeSlice: Offers the percentile distribution and average latency per client range slice read request (i.e. the period from when a node receives a client request, gathers the records and responds to the client).
    latency_per_operation (ic_node_client_request_range_slice): Average latency per operation.
    99thPercentile (ic_node_client_request_range_slice_microseconds): 99th percentile distribution of the metric.
    95thPercentile (ic_node_client_request_range_slice_microseconds): 95th percentile distribution of the metric.
n::clientRequestCasRead: Offers the percentile distribution and average latency per client CAS read request (i.e. the period from when a node receives a client request, gathers the records and responds to the client).
    latency_per_operation (ic_node_client_request_cas_read): Average latency per operation.
    99thPercentile (ic_node_client_request_cas_read_microseconds): 99th percentile distribution of the metric.
    95thPercentile (ic_node_client_request_cas_read_microseconds): 95th percentile distribution of the metric.
n::clientRequestCasWrite: Offers the percentile distribution and average latency per client CAS write request (i.e. the period from when a node receives a client request, gathers the records and responds to the client).
    latency_per_operation (ic_node_client_request_cas_write): Average latency per operation.
    99thPercentile (ic_node_client_request_cas_write_microseconds): 99th percentile distribution of the metric.
    95thPercentile (ic_node_client_request_cas_write_microseconds): 95th percentile distribution of the metric.
n::pausedConnections: Monitors requests paused (back-pressure applied) due to the node being overloaded, from clients that have started with THROW_ON_OVERLOAD left as default or set to False.
    value (ic_node_paused_connections)
n::requestDiscarded: Monitors requests discarded due to the node being overloaded, from clients that have started with THROW_ON_OVERLOAD set to True.
    one_minute_rate (ic_node_request_discarded): One minute rate of the measured metric.
    count (ic_node_request_discarded)
n::slalatency: Monitors our SLA latency and alerts when it is above a threshold level.
    sla_write (ic_node_slalatency_microseconds): Synthetic write queries against an Instaclustr canary table.
    sla_read (ic_node_slalatency_microseconds): Synthetic read queries against an Instaclustr canary table.
n::readstage: The Read Stage metric represents Cassandra conducting reads from the local disk or cache.
    active_tasks_max (ic_node_readstage): Maximum number of active tasks.
    total_blocked_tasks_max (ic_node_readstage): Maximum number of blocked tasks in total.
    pending_tasks_max (ic_node_readstage): Maximum number of pending tasks.
n::mutationstage: The Mutation Stage metric is responsible for writes (excluding materialised view and counter writes).
    active_tasks_max (ic_node_mutationstage): Maximum number of active tasks.
    total_blocked_tasks_max (ic_node_mutationstage): Maximum number of blocked tasks in total.
    pending_tasks_max (ic_node_mutationstage): Maximum number of pending tasks.
n::nativetransportrequest: The Native Transport Request metric represents client CQL requests. If the requests are blocked by other Cassandra operations, this metric will display abnormal values.
    currently_blocked_tasks_max (ic_node_nativetransportrequest): Maximum number of currently blocked tasks.
    active_tasks_max (ic_node_nativetransportrequest): Maximum number of active tasks.
    total_blocked_tasks_max (ic_node_nativetransportrequest): Maximum number of blocked tasks in total.
    total_blocked_tasks_per_second_max (ic_node_nativetransportrequest): Maximum number of blocked tasks per second in total.
    pending_tasks_max (ic_node_nativetransportrequest): Maximum number of pending tasks.
    total_blocked_tasks_differential (ic_node_nativetransportrequest): Deprecated.
n::rpcthread: The maximum number of concurrent requests from clients.
    currently_blocked_tasks_max (ic_node_rpcthread): Maximum number of currently blocked tasks.
    total_blocked_tasks_max (ic_node_rpcthread): Maximum number of blocked tasks in total.
    pending_tasks_max (ic_node_rpcthread): Maximum number of pending tasks.
    active_tasks_max (ic_node_rpcthread): Maximum number of active tasks.
n::countermutationstage: Responsible for counter writes.
    active_tasks_max (ic_node_countermutationstage): Maximum number of active tasks.
    total_blocked_tasks_max (ic_node_countermutationstage): Maximum number of blocked tasks in total.
    pending_tasks_max (ic_node_countermutationstage): Maximum number of pending tasks.
n::viewmutationstage: The View Mutation Stage metric is responsible for materialised view writes.
    active_tasks_max (ic_node_viewmutationstage): Maximum number of active tasks.
    total_blocked_tasks_max (ic_node_viewmutationstage): Maximum number of blocked tasks in total.
    pending_tasks_max (ic_node_viewmutationstage): Maximum number of pending tasks.
n::droppedmessage: The Dropped Messages metric represents the total number of dropped messages from all stages in the SEDA.
    differential_total_count (ic_node_droppedmessage): Deprecated.
    total_count (ic_node_droppedmessage)
    total_count_per_second_max (ic_node_droppedmessage): Maximum total count per second.
n::hintsSucceeded: Number of hints successfully delivered.
    count (ic_node_hints_succeeded)
    count_per_second_max (ic_node_hints_succeeded): Maximum count per second.
    differential_count (ic_node_hints_succeeded): Deprecated.
n::hintsFailed: Number of hints that failed delivery.
    count (ic_node_hints_failed)
    count_per_second_max (ic_node_hints_failed): Maximum count per second.
    differential_count (ic_node_hints_failed): Deprecated.
n::hintsTimedOut: Number of hints that timed out during delivery.
    count (ic_node_hints_timed_out)
    count_per_second_max (ic_node_hints_timed_out): Maximum count per second.
    differential_count (ic_node_hints_timed_out): Deprecated.
n::hintsTotal: Number of hint messages written to the node from the time Cassandra service starts.
    differential_value (ic_node_hints_total): Deprecated.
    value_per_second_max (ic_node_hints_total): Maximum value per second.
    value (ic_node_hints_total)
n::load: Size, in bytes, of the on-disk data this node manages.
    value (ic_node_load_bytes)
n::offheapsizeallmemtables: The total amount of data stored in the memtables, including secondary indexes and pending flush memtables, that resides off-heap.
    value (ic_node_offheapsizeallmemtables_bytes)
n::offheapsizememtable: The total amount of data stored in the memtable that resides off-heap, including column related overhead and partitions overwritten.
    value (ic_node_offheapsizememtable_bytes)
n::offheapmemoryusedbloomfilter: The off-heap memory used by the bloom filter.
    value (ic_node_offheapmemoryusedbloomfilter_bytes)
n::offheapmemoryusedcompressionmetadata: The off-heap memory used by compression metadata.
    value (ic_node_offheapmemoryusedcompressionmetadata_bytes)
n::offheapmemoryusedindexsummary: The off-heap memory used by the index summary.
    value (ic_node_offheapmemoryusedindexsummary_bytes)
n::maxPartitionSize: MaxPartitionSize is the size of the largest compacted partition where partition size is measured by the number of cells (values) that are stored in the partition.
    value (ic_node_max_partition_size_bytes)
n::garbagecollectionparnewcollectioncount: The total number of garbage collections that have occurred.
    count (ic_node_garbagecollectionparnewcollectioncount)
n::garbagecollectionparnewcollectiontime: The approximate accumulated garbage collection elapsed time.
    value (ic_node_garbagecollectionparnewcollectiontime_milliseconds)
n::garbagecollectionparnewlastduration: The elapsed time of the last garbage collection.
    value (ic_node_garbagecollectionparnewlastduration_milliseconds)
n::garbagecollectiong1collectioncount: The total number of garbage collections that have occurred.
    count (ic_node_garbagecollectiong1collectioncount)
n::garbagecollectiong1collectiontime: The approximate accumulated garbage collection elapsed time.
    value (ic_node_garbagecollectiong1collectiontime_milliseconds)
n::garbagecollectiong1lastduration: The elapsed time of the last garbage collection.
    value (ic_node_garbagecollectiong1lastduration_milliseconds)
n::heapmemorycommitted: The amount of memory that is committed for the Java Virtual Machine to use.
    value (ic_node_heapmemorycommitted_bytes)
n::heapmemoryinit: The amount of memory that the Java Virtual Machine initially requests from the operating system for memory management.
    value (ic_node_heapmemoryinit_bytes)
n::heapmemorymax: The maximum amount of memory that can be used for memory management.
    value (ic_node_heapmemorymax_bytes)
n::heapmemoryused: The amount of used memory.
    value (ic_node_heapmemoryused_bytes)
n::schemaversioncount: Number of active schema versions.
    value (ic_node_schemaversioncount)
n::connectedNativeClients: The number of connected clients to the Cassandra node.
    value (ic_node_connected_native_clients)
n::readall: Reads per second at the ALL consistency level.
    count_per_second (ic_node_readall)
n::readany: Reads per second at the ANY consistency level.
    count_per_second (ic_node_readany)
n::readeachquorum: Reads per second at the Each-Quorum consistency level.
    count_per_second (ic_node_readeachquorum)
n::readlocalone: Reads per second at the Local-One consistency level.
    count_per_second (ic_node_readlocalone)
n::readlocalquorum: Reads per second at the Local-Quorum consistency level.
    count_per_second (ic_node_readlocalquorum)
n::readlocalserial: Reads per second at the Local-Serial consistency level.
    count_per_second (ic_node_readlocalserial)
n::readone: Reads per second at the One consistency level.
    count_per_second (ic_node_readone)
n::readquorum: Reads per second at the Quorum consistency level.
    count_per_second (ic_node_readquorum)
n::readserial: Reads per second at the Serial consistency level.
    count_per_second (ic_node_readserial)
n::readthree: Reads per second at the Three consistency level.
    count_per_second (ic_node_readthree)
n::readtwo: Reads per second at the Two consistency level.
    count_per_second (ic_node_readtwo)
n::droppedMessageRead: Reads that were dropped by the node.
    count_per_second (ic_node_dropped_message_read)
n::writeall: Writes per second at the All consistency level.
    count_per_second (ic_node_writeall)
n::writeany: Writes per second at the Any consistency level.
    count_per_second (ic_node_writeany)
n::writeeachquorum: Writes per second at the Each Quorum consistency level.
    count_per_second (ic_node_writeeachquorum)
n::writelocalone: Writes per second at the Local One consistency level.
    count_per_second (ic_node_writelocalone)
n::writelocalquorum: Writes per second at the Local Quorum consistency level.
    count_per_second (ic_node_writelocalquorum)
n::writelocalserial: Writes per second at the Local Serial consistency level.
    count_per_second (ic_node_writelocalserial)
n::writeone: Writes per second at the One consistency level.
    count_per_second (ic_node_writeone)
n::writequorum: Writes per second at the Quorum consistency level.
    count_per_second (ic_node_writequorum)
n::writeserial: Writes per second at the Serial consistency level.
    count_per_second (ic_node_writeserial)
n::writethree: Writes per second at the Three consistency level.
    count_per_second (ic_node_writethree)
n::writetwo: Writes per second at the Two consistency level.
    count_per_second (ic_node_writetwo)
n::droppedMessageMutation: Writes that were dropped by the node.
    count_per_second (ic_node_dropped_message_mutation)
cf::{keyspace}::{table}::reads: General measurements of local read latency for the table, on the individual node.
    latency_per_operation (ic_table_reads): Average latency per operation.
    count_per_second (ic_table_reads)
cf::{keyspace}::{table}::writes: General measurements of local write latency for the table, on the individual node.
    latency_per_operation (ic_table_writes): Average latency per operation.
    count_per_second (ic_table_writes)
cf::{keyspace}::{table}::writeLatencyDistribution: Metrics for local write latency for the table, on the individual node.
    50thPercentile (ic_table_write_latency_distribution_microseconds): 50th percentile distribution of the metric.
    99thPercentile (ic_table_write_latency_distribution_microseconds): 99th percentile distribution of the metric.
    75thPercentile (ic_table_write_latency_distribution_microseconds): 75th percentile distribution of the metric.
    95thPercentile (ic_table_write_latency_distribution_microseconds): 95th percentile distribution of the metric.
cf::{keyspace}::{table}::diskUsed: Live and total disk used by the table.
    totaldiskspaceused (ic_table_disk_used_bytes): Disk used by both live cells and tombstones.
    livediskspaceused (ic_table_disk_used_bytes): Disk used by live cells.
cf::{keyspace}::{table}::sstablesPerRead: SSTables accessed per read of the table on the individual node.
    average (ic_table_sstables_per_read): Average value of the metric.
    max (ic_table_sstables_per_read): Maximum value of the metric.
cf::{keyspace}::{table}::liveCellsPerRead: Live cells accessed per read of the table on the individual node.
    average (ic_table_live_cells_per_read): Average value of the metric.
    max (ic_table_live_cells_per_read): Maximum value of the metric.
cf::{keyspace}::{table}::tombstonesPerRead: Tombstoned cells accessed per read of the table on the individual node.
    average (ic_table_tombstones_per_read): Average value of the metric.
    max (ic_table_tombstones_per_read): Maximum value of the metric.
cf::{keyspace}::{table}::partitionSize: The size of partitions in the specified table in KB.
    average (ic_table_partition_size): Average value of the metric.
    max (ic_table_partition_size): Maximum value of the metric.
cf::{keyspace}::{table}::offHeapSizeAllMemtables: The total amount of data stored in the memtables, including secondary indexes and pending flush memtables, that resides off-heap (in bytes).
    value (ic_table_off_heap_size_all_memtables_bytes)
cf::{keyspace}::{table}::offHeapSizeMemtable: The total amount of data stored in the memtable that resides off-heap, including column related overhead and partitions overwritten (in bytes).
    value (ic_table_off_heap_size_memtable_bytes)
cf::{keyspace}::{table}::offHeapMemoryUsedBloomFilter: The off-heap memory used by the bloom filter (in bytes).
    value (ic_table_off_heap_memory_used_bloom_filter_bytes)
cf::{keyspace}::{table}::offHeapMemoryUsedCompressionMetadata: The off-heap memory used by compression metadata (in bytes).
    value (ic_table_off_heap_memory_used_compression_metadata_bytes)
cf::{keyspace}::{table}::offHeapMemoryUsedIndexSummary: The off-heap memory used by the index summary (in bytes).
    value (ic_table_off_heap_memory_used_index_summary_bytes)
cf::{keyspace}::{table}::estimatedPartitionCount: The estimated count of partitions for a table.
    count (ic_table_estimated_partition_count)
cf::{keyspace}::{table}::keyCacheHitRate: The key cache hit rate for the specified table.
    percentage (ic_table_key_cache_hit_rate)
    value (ic_table_key_cache_hit_rate)
cf::{keyspace}::{table}::readLatencyV2: Measurement of local read latency for the table, on the individual node.
    999thPercentile (ic_table_read_latency_v2_microseconds): 99.9th percentile distribution of the metric.
    95thPercentile (ic_table_read_latency_v2_microseconds): 95th percentile distribution of the metric.
    75thPercentile (ic_table_read_latency_v2_microseconds): 75th percentile distribution of the metric.
    99thPercentile (ic_table_read_latency_v2_microseconds): 99th percentile distribution of the metric.
    count_per_second (ic_table_read_latency_v2)
    50thPercentile (ic_table_read_latency_v2_microseconds): 50th percentile distribution of the metric.
    latency_per_operation (ic_table_read_latency_v2): Average latency per operation.
cf::{keyspace}::{table}::sstablesPerReadDistribution: SSTables accessed per read of the table on the individual node.
    99thPercentile (ic_table_sstables_per_read_distribution): 99th percentile distribution of the metric.
    95thPercentile (ic_table_sstables_per_read_distribution): 95th percentile distribution of the metric.
cf::{keyspace}::{table}::tombstonesPerReadDistribution: Tombstoned cells accessed per read of the table on the individual node.
    99thPercentile (ic_table_tombstones_per_read_distribution): 99th percentile distribution of the metric.
    95thPercentile (ic_table_tombstones_per_read_distribution): 95th percentile distribution of the metric.
cf::{keyspace}::{table}::compressionRatio: The ratio of compressed to uncompressed SSTable storage size; lower is better.
    value (ic_table_compression_ratio)
csp::shotoverTransformFailuresCount: The number of transform failures.
    value (ic_node_shotover_transform_failures_count)
csp::shotoverTransformTotalCount: The number of transforms used.
    value (ic_node_shotover_transform_total_count)
csp::shotoverTransformPushedTotalCount: The number of transforms used to process messages without a corresponding request (events).
    value (ic_node_shotover_transform_pushed_total_count)
csp::shotoverTransformPushedFailuresCount: The number of transform failures while processing messages without a corresponding request (events).
    value (ic_node_shotover_transform_pushed_failures_count)
csp::shotoverTransformLatencySeconds0th: 0th percentile latency for running the transform.
    value (ic_node_shotover_transform_latency_seconds0th)
csp::shotoverTransformLatencySeconds50th: 50th percentile latency for running the transform.
    value (ic_node_shotover_transform_latency_seconds50th)
csp::shotoverTransformLatencySeconds90th: 90th percentile latency for running the transform.
    value (ic_node_shotover_transform_latency_seconds90th)
csp::shotoverTransformLatencySeconds95th: 95th percentile latency for running the transform.
    value (ic_node_shotover_transform_latency_seconds95th)
csp::shotoverTransformLatencySeconds99th: 99th percentile latency for running the transform.
    value (ic_node_shotover_transform_latency_seconds99th)
csp::shotoverTransformLatencySeconds999th: 99.9th percentile latency for running the transform.
    value (ic_node_shotover_transform_latency_seconds999th)
csp::shotoverTransformLatencySeconds100th: 100th percentile latency for running the transform.
    value (ic_node_shotover_transform_latency_seconds100th)
csp::shotoverTransformLatencySecondsCount: The number of latency measurements for running the transform.
    value (ic_node_shotover_transform_latency_seconds_count)
csp::shotoverTransformLatencySecondsSum: The sum of latency for running the transform.
    value (ic_node_shotover_transform_latency_seconds_sum)
csp::shotoverTransformPushedLatencySeconds0th: 0th percentile latency for running the transform on messages without a corresponding request (events).
    value (ic_node_shotover_transform_pushed_latency_seconds0th)
csp::shotoverTransformPushedLatencySeconds50th: 50th percentile latency for running the transform on messages without a corresponding request (events).
    value (ic_node_shotover_transform_pushed_latency_seconds50th)
csp::shotoverTransformPushedLatencySeconds90th: 90th percentile latency for running the transform on messages without a corresponding request (events).
    value (ic_node_shotover_transform_pushed_latency_seconds90th)
csp::shotoverTransformPushedLatencySeconds95th: 95th percentile latency for running the transform on messages without a corresponding request (events).
    value (ic_node_shotover_transform_pushed_latency_seconds95th)
csp::shotoverTransformPushedLatencySeconds99th: 99th percentile latency for running the transform on messages without a corresponding request (events).
    value (ic_node_shotover_transform_pushed_latency_seconds99th)
csp::shotoverTransformPushedLatencySeconds999th: 99.9th percentile latency for running the transform on messages without a corresponding request (events).
    value (ic_node_shotover_transform_pushed_latency_seconds999th)
csp::shotoverTransformPushedLatencySeconds100th: 100th percentile latency for running the transform on messages without a corresponding request (events).
    value (ic_node_shotover_transform_pushed_latency_seconds100th)
csp::shotoverTransformPushedLatencySecondsCount: The number of latency measurements for running the transform on messages without a corresponding request (events).
    value (ic_node_shotover_transform_pushed_latency_seconds_count)
csp::shotoverTransformPushedLatencySecondsSum: The sum of latency for running the transform on messages without a corresponding request (events).
    value (ic_node_shotover_transform_pushed_latency_seconds_sum)
csp::shotoverSourceToSinkLatencySeconds0th: 0th percentile latency for running the transform from client to cluster.
    value (ic_node_shotover_source_to_sink_latency_seconds0th)
csp::shotoverSourceToSinkLatencySeconds50th: 50th percentile latency for running the transform from client to cluster.
    value (ic_node_shotover_source_to_sink_latency_seconds50th)
csp::shotoverSourceToSinkLatencySeconds90th: 90th percentile latency for running the transform from client to cluster.
    value (ic_node_shotover_source_to_sink_latency_seconds90th)
csp::shotoverSourceToSinkLatencySeconds95th: 95th percentile latency for running the transform from client to cluster.
    value (ic_node_shotover_source_to_sink_latency_seconds95th)
csp::shotoverSourceToSinkLatencySeconds99th: 99th percentile latency for running the transform from client to cluster.
    value (ic_node_shotover_source_to_sink_latency_seconds99th)
csp::shotoverSourceToSinkLatencySeconds999th: 99.9th percentile latency for running the transform from client to cluster.
    value (ic_node_shotover_source_to_sink_latency_seconds999th)
csp::shotoverSourceToSinkLatencySeconds100th: 100th percentile latency for running the transform from client to cluster.
    value (ic_node_shotover_source_to_sink_latency_seconds100th)
csp::shotoverSourceToSinkLatencySecondsCount: The number of latency measurements for running the transform from client to cluster.
    value (ic_node_shotover_source_to_sink_latency_seconds_count)
csp::shotoverSourceToSinkLatencySecondsSum: The sum of latency for running the transform from client to cluster.
    value (ic_node_shotover_source_to_sink_latency_seconds_sum)
csp::shotoverFailedRequestsCount: The number of failed requests.
    value (ic_node_shotover_failed_requests_count)
csp::shotoverOutOfRackRequestsCount: The number of out of rack requests.
    value (ic_node_shotover_out_of_rack_requests_count)
csp::shotoverAvailableConnectionsCount: The number of available connections.
    value (ic_node_shotover_available_connections_count)
csp::shotoverChainFailuresCount: The number of chain failures.
    value (ic_node_shotover_chain_failures_count)
csp::shotoverChainTotalCount: The number of chains used.
    value (ic_node_shotover_chain_total_count)
csp::shotoverSinkToSourceLatencySeconds0th: 0th percentile latency for running the transform from cluster to client.
    value (ic_node_shotover_sink_to_source_latency_seconds0th)
csp::shotoverSinkToSourceLatencySeconds50th: 50th percentile latency for running the transform from cluster to client.
    value (ic_node_shotover_sink_to_source_latency_seconds50th)
csp::shotoverSinkToSourceLatencySeconds90th: 90th percentile latency for running the transform from cluster to client.
    value (ic_node_shotover_sink_to_source_latency_seconds90th)
csp::shotoverSinkToSourceLatencySeconds95th: 95th percentile latency for running the transform from cluster to client.
    value (ic_node_shotover_sink_to_source_latency_seconds95th)
csp::shotoverSinkToSourceLatencySeconds99th: 99th percentile latency for running the transform from cluster to client.
    value (ic_node_shotover_sink_to_source_latency_seconds99th)
csp::shotoverSinkToSourceLatencySeconds999th: 99.9th percentile latency for running the transform from cluster to client.
    value (ic_node_shotover_sink_to_source_latency_seconds999th)
csp::shotoverSinkToSourceLatencySeconds100th: 100th percentile latency for running the transform from cluster to client.
    value (ic_node_shotover_sink_to_source_latency_seconds100th)
csp::shotoverSinkToSourceLatencySecondsCount: The number of latency measurements for running the transform from cluster to client.
    value (ic_node_shotover_sink_to_source_latency_seconds_count)
csp::shotoverSinkToSourceLatencySecondsSum: The sum of latency for running the transform from cluster to client.
    value (ic_node_shotover_sink_to_source_latency_seconds_sum)
csp::shotoverChainMessagesPerBatchCount0th: 0th percentile number of messages per batch.
    value (ic_node_shotover_chain_messages_per_batch_count0th)
csp::shotoverChainMessagesPerBatchCount50th: 50th percentile number of messages per batch.
    value (ic_node_shotover_chain_messages_per_batch_count50th)
csp::shotoverChainMessagesPerBatchCount90th: 90th percentile number of messages per batch.
    value (ic_node_shotover_chain_messages_per_batch_count90th)
csp::shotoverChainMessagesPerBatchCount95th: 95th percentile number of messages per batch.
    value (ic_node_shotover_chain_messages_per_batch_count95th)
csp::shotoverChainMessagesPerBatchCount99th: 99th percentile number of messages per batch.
    value (ic_node_shotover_chain_messages_per_batch_count99th)
csp::shotoverChainMessagesPerBatchCount999th: 99.9th percentile number of messages per batch.
    value (ic_node_shotover_chain_messages_per_batch_count999th)
csp::shotoverChainMessagesPerBatchCount100th: 100th percentile number of messages per batch.
    value (ic_node_shotover_chain_messages_per_batch_count100th)
csp::shotoverChainMessagesPerBatchCountCount: The number of messages per batch.
    value (ic_node_shotover_chain_messages_per_batch_count_count)
csp::shotoverChainMessagesPerBatchCountSum: The sum of number of messages per batch.
    value (ic_node_shotover_chain_messages_per_batch_count_sum)
o::memused: Percentage of used memory.
    value (ic_node_memused)
o::docsCount: Number of non-deleted documents in the segment. This number is based on Lucene documents and may include documents from nested fields.
    value (ic_node_docs_count)
o::docsDeleted: Number of deleted documents in the segment. This number is based on Lucene documents. Elasticsearch reclaims the disk space of deleted Lucene documents when a segment is merged.
    value (ic_node_docs_deleted)
o::jvmheappercent: Percentage of memory currently in use by the heap.
    value (ic_node_jvmheappercent)
o::jvmthreadscount: Number of active threads in use by JVM.
    value (ic_node_jvmthreadscount)
o::indextotalpersec: Indices per second.
    value (ic_node_indextotalpersec)
o::querytotalpersec: Queries per second.
    value (ic_node_querytotalpersec)
o::indexlatency: The latency of new indexing operations measured in milliseconds.
    value (ic_node_indexlatency)
o::querylatency: The latency of new query operations measured in milliseconds.
    value (ic_node_querylatency)
o::slasearchlatency: Monitors our SLA search latency and alerts when it is above a threshold level. This is the synthetic search query against an Instaclustr canary index.
    value (ic_node_slasearchlatency)
o::slaindexlatency: Monitors our SLA indexing latency and alerts when it is above a threshold level. This is the synthetic indexing against an Instaclustr canary index.
    value (ic_node_slaindexlatency)
o::shardsCount: Number of shards used per node.
    value (ic_node_shards_count)
o::maxShards: Maximum number of shards per node.
    value (ic_node_max_shards)
op::ccr::leaderConnected: Indicates the status of the connection between follower cluster and leader cluster.
    value (ic_node_leader_connected)
op::ccr::followerCheckpoint: Indicates the checkpoint that the follower indices are at. This is a cumulative value across all replicating indices.
    value (ic_node_follower_checkpoint)
op::ccr::leaderCheckpoint: Indicates the checkpoint that the leader indices are at. This is a cumulative value across all replicating indices.
    value (ic_node_leader_checkpoint)
op::ccr::syncingIndicesCount: Indicates the number of syncing/replicating indices.
    value (ic_node_syncing_indices_count)
op::ccr::bootstrappingIndicesCount: Indicates the number of indices which are at the stage of setting up replication.
    value (ic_node_bootstrapping_indices_count)
op::ccr::pausedIndicesCount: Indicates the number of replicating indices which are paused.
    value (ic_node_paused_indices_count)
op::ccr::failedIndicesCount: Indicates the number of failed replicating indices.
    value (ic_node_failed_indices_count)
op::ccr::failedReadRequests: Indicates the number of read requests failed during replication.
    value (ic_node_failed_read_requests)
op::ccr::failedWriteRequests: Indicates the number of write requests failed during replication.
    value (ic_node_failed_write_requests)
op::ccr::throttledReadRequests: Indicates the number of read requests throttled during replication.
    value (ic_node_throttled_read_requests)
op::ccr::throttledWriteRequests: Indicates the number of write requests throttled during replication.
    value (ic_node_throttled_write_requests)
op::ccr::operationsWritten: Indicates the number of operations written during replication.
    value (ic_node_operations_written)
op::ccr::operationsRead: Indicates the number of operations read during replication.
    value (ic_node_operations_read)
op::ccr::autoFollowStartSuccess: Indicates the number of successful auto follow replication attempts.
    value (ic_node_auto_follow_start_success)
op::ccr::autoFollowStartFailed: Indicates the number of failed auto follow replication attempts.
    value (ic_node_auto_follow_start_failed)
op::ccr::autoFollowLeaderCallsFailed: Indicates the number of failed replication calls to leader.
    value (ic_node_auto_follow_leader_calls_failed)
e::memused: Percentage of used memory.
    value (ic_node_memused)
e::docsCount: Number of non-deleted documents in the segment. This number is based on Lucene documents and may include documents from nested fields.
    value (ic_node_docs_count)
e::docsDeleted: Number of deleted documents in the segment. This number is based on Lucene documents. Elasticsearch reclaims the disk space of deleted Lucene documents when a segment is merged.
    value (ic_node_docs_deleted)
e::jvmheappercent: Percentage of memory currently in use by the heap.
    value (ic_node_jvmheappercent)
e::jvmthreadscount: Number of active threads in use by JVM.
    value (ic_node_jvmthreadscount)
e::indextotalpersec: Indices per second.
    value (ic_node_indextotalpersec)
e::querytotalpersec: Queries per second.
    value (ic_node_querytotalpersec)
e::indexlatency: The latency of new indexing operations measured in milliseconds.
    value (ic_node_indexlatency)
e::querylatency: The latency of new query operations measured in milliseconds.
    value (ic_node_querylatency)
e::slasearchlatency: Monitors our SLA search latency and alerts when it is above a threshold level. This is the synthetic search query against an Instaclustr canary index.
    value (ic_node_slasearchlatency)
e::slaindexlatency: Monitors our SLA indexing latency and alerts when it is above a threshold level. This is the synthetic indexing against an Instaclustr canary index.
    value (ic_node_slaindexlatency)
k::activeControllerCount: The number of active controllers on the node. In effect it is 0 or 1. The active controller of a cluster is usually the first node to start up in the cluster.
    value (ic_node_active_controller_count)
k::offlinePartitions: The number of partitions without an active leader.
Any partitions that are offline will not be accessible since read and write operations are only performed on the leader of a partition.value ic_node_offline_partitions k::activeBrokerCount The number of registered and unfenced brokers.value ic_node_active_broker_count k::metadataErrorCount The number of times this controller node has encountered an error during metadata log processing.value ic_node_metadata_error_count k::lastCommittedRecordOffset The offset of the last record committed to this Controller. This is always advancing due to the NoOpRecord, and can be used to check cluster availability.value ic_node_last_committed_record_offset k::fencedBrokerCount The number of registered but fenced brokers.value ic_node_fenced_broker_count k::preferredReplicaImbalanceCount The count of topic partitions for which the leader is not the preferred leader.value ic_node_preferred_replica_imbalance_count k::brokerTopicMessagesIn The mean and one minute rate of incoming messages per second.one_minute_rate One minute rate of the measured metric. ic_node_broker_topic_messages_in mean_rate The average rate of the measured metric. ic_node_broker_topic_messages_in count ic_node_broker_topic_messages_in k::brokerTopicBytesIn The mean and one minute rate of incoming bytes to the cluster.one_minute_rate One minute rate of the measured metric. ic_node_broker_topic_bytes_in mean_rate The average rate of the measured metric. ic_node_broker_topic_bytes_in count ic_node_broker_topic_bytes_in k::brokerTopicBytesOut The mean and one minute rate of outgoing bytes from the cluster.one_minute_rate One minute rate of the measured metric. ic_node_broker_topic_bytes_out mean_rate The average rate of the measured metric. ic_node_broker_topic_bytes_out count ic_node_broker_topic_bytes_out k::leaderElectionRate The count, average, max, and one minute rate of leader elections per second.one_minute_rate One minute rate of the measured metric. 
ic_node_leader_election_rate max Maximum value of the metric. ic_node_leader_election_rate average Average value of the metric. ic_node_leader_election_rate count ic_node_leader_election_rate k::uncleanLeaderElections The number of failures to elect a suitable leader per second. In the case that no suitable leader can be chosen (ie. no available replicas are in sync), an out-of-sync replica will be elected as leader, resulting in data loss that is proportional to how out-of-sync the newly elected leader is.one_minute_rate One minute rate of the measured metric. ic_node_unclean_leader_elections mean_rate The average rate of the measured metric. ic_node_unclean_leader_elections count ic_node_unclean_leader_elections k::partitionLoadTimeAvg The average time of Consumer Group Coordinator to load the Commit Offset partition in 30 seconds interval. This is only available for Kafka 2.4.1+.ms ic_node_partition_load_time_avg_milliseconds k::partitionLoadTimeMax The maximum time of Consumer Group Coordinator to load the Commit Offset partition in 30 seconds interval. This is only available for Kafka 2.4.1+.ms ic_node_partition_load_time_max_milliseconds k::groupCompletedRebalanceCount The number of rebalancing operations triggered by a number of factors as the participants of the group change. The rebalancing leads to the reassignment of partitions across the consumers.value ic_node_group_completed_rebalance_count k::groupCompletedRebalanceRate The rate of rebalancing operations.value ic_node_group_completed_rebalance_rate k::replicaFetcherMaxLag The max message count lag between all fetchers/topics/partitions.value ic_node_replica_fetcher_max_lag k::replicaFetcherFailedPartitionsCount Increment count when partition truncation fails, storage exception is encountered, partition has older epoch than current leader or any other error encountered during fetch request. 
This is only available for Kafka 2.3.1+.value ic_node_replica_fetcher_failed_partitions_count k::replicaFetcherMinFetchRate The minimum number of messages fetched in one minute interval between all fetchers/topics/partitions.value ic_node_replica_fetcher_min_fetch_rate k::replicaFetcherDeadThreadCount The number of failed fetcher threads. This is only available for Kafka 2.4.1+.value ic_node_replica_fetcher_dead_thread_count k::partitionCount The number of partitions on a node. The number of partitions should be evenly distributed across all nodes in a cluster.value ic_node_partition_count k::isrShrinkRate The one minute rate, mean rate, and number of decreases in the number of In-Sync Replicas (ISR) per second. This metric is expected to change when adding or removing nodes from a cluster.one_minute_rate One minute rate of the measured metric. ic_node_isr_shrink_rate mean_rate The average rate of the measured metric. ic_node_isr_shrink_rate count ic_node_isr_shrink_rate k::isrExpandRate The one minute rate, mean rate, and number of increases in the number of In-Sync Replicas (ISR) per second. This metric is expected to change when adding or removing nodes from a cluster.one_minute_rate One minute rate of the measured metric. ic_node_isr_expand_rate mean_rate The average rate of the measured metric. ic_node_isr_expand_rate count ic_node_isr_expand_rate k::underMinIsrPartitions The number of partitions where the number of In-Sync Replicas (ISR) is less than the minimum number of in-sync replicas specified.value ic_node_under_min_isr_partitions k::underReplicatedPartitions The number of partitions that do not have enough replicas to meet the desired replication factor.value ic_node_under_replicated_partitions k::leaderCount The number of partitions that a node is a leader for. 
The number of partition leaders should be evenly distributed across all nodes in a cluster.value ic_node_leader_count k::kafkaBrokerState The current state of the broker represented as an Integer. Can be one of the following Integer values: value ic_node_kafka_broker_state k::produceRequestTime The count, average, 99th percentile distribution and max time taken to process requests from producers to send data. This is the sum of time spent waiting in request, time spent being processed by the leader, time spent waiting for follower response (if requests.required.acks = 1), and time taken to send the response.max ic_node_produce_request_time_milliseconds average ic_node_produce_request_time_milliseconds count ic_node_produce_request_time 99thPercentile 99th percentile distribution of time. ic_node_produce_request_time_milliseconds k::fetchConsumerRequestTime The count, average, 99th percentile distribution and max amount of time taken while processing, and the number of requests from consumers to get new data. This is the sum of time spent waiting in request, time spent being processed by the leader, time spent waiting for the leader to trigger sending the response (determined by fetch.min.bytes and fetch.wait.max.ms in the consumer configuration), and time taken to send the response.max ic_node_fetch_consumer_request_time_milliseconds average ic_node_fetch_consumer_request_time_milliseconds count ic_node_fetch_consumer_request_time 99thPercentile 99th percentile distribution of time. ic_node_fetch_consumer_request_time_milliseconds k::fetchFollowerRequestTime The count, average, and max amount of time taken while processing requests from Kafka brokers to get new data from partition leaders. 
This is the sum of time spent waiting in request, time spent being processed by the leader, and time taken to send the response.max ic_node_fetch_follower_request_time_milliseconds average ic_node_fetch_follower_request_time_milliseconds count ic_node_fetch_follower_request_time k::metadataRequestTime The 99th percentile distribution and max amount of time taken while processing requests from Kafka brokers to retrieve metadata. This is the sum of time spent waiting in request, time spent being processed by the leader, and time taken to send the response.max ic_node_metadata_request_time_milliseconds 99thPercentile 99th percentile distribution of time. ic_node_metadata_request_time_milliseconds k::produceRequestLocalTime The 99th percentile distribution and max amount of time taken by the leader to process requests from producers to send data.max ic_node_produce_request_local_time_milliseconds 99thPercentile 99th percentile distribution of time. ic_node_produce_request_local_time_milliseconds k::fetchConsumerRequestLocalTime The 99th percentile distribution and max amount of time spent being processed by the leader from consumer requests to get new data.max ic_node_fetch_consumer_request_local_time_milliseconds 99thPercentile 99th percentile distribution of time. ic_node_fetch_consumer_request_local_time_milliseconds k::metadataRequestLocalTime The 99th percentile distribution and max amount of time spent being processed by the leader while processing requests from Kafka brokers to retrieve metadata.max ic_node_metadata_request_local_time_milliseconds 99thPercentile 99th percentile distribution of time. ic_node_metadata_request_local_time_milliseconds k::produceRequestRemoteTime The 99th percentile distribution and max amount of time taken waiting for the follower to process requests from producers to send data.max ic_node_produce_request_remote_time_milliseconds 99thPercentile 99th percentile distribution of time. 
ic_node_produce_request_remote_time_milliseconds k::fetchConsumerRequestRemoteTime The 99th percentile distribution and max amount of time waiting for the follower from consumer requests to get new data.max ic_node_fetch_consumer_request_remote_time_milliseconds 99thPercentile 99th percentile distribution of time. ic_node_fetch_consumer_request_remote_time_milliseconds k::metadataRequestRemoteTime The 99th percentile distribution and max amount of time waiting for the follower while processing requests from Kafka brokers to retrieve metadata.max ic_node_metadata_request_remote_time_milliseconds 99thPercentile 99th percentile distribution of time. ic_node_metadata_request_remote_time_milliseconds k::produceRequestQueueTime The 99th percentile distribution and max amount of time the request waits in the request queue to process requests from producers to send data.max ic_node_produce_request_queue_time_milliseconds 99thPercentile 99th percentile distribution of time. ic_node_produce_request_queue_time_milliseconds k::fetchConsumerRequestQueueTime The 99th percentile distribution and max amount of time the request waits in the request queue from consumer requests to get new data.max ic_node_fetch_consumer_request_queue_time_milliseconds 99thPercentile 99th percentile distribution of time. ic_node_fetch_consumer_request_queue_time_milliseconds k::metadataRequestQueueTime The 99th percentile distribution and max amount of time the request waits in the request queue while processing requests from Kafka brokers to retrieve metadata.max ic_node_metadata_request_queue_time_milliseconds 99thPercentile 99th percentile distribution of time. ic_node_metadata_request_queue_time_milliseconds k::produceResponseQueueTime The 99th percentile distribution and max amount of time the request waits in the response queue to process requests from producers to send data.max ic_node_produce_response_queue_time_milliseconds 99thPercentile 99th percentile distribution of time. 
ic_node_produce_response_queue_time_milliseconds k::fetchConsumerResponseQueueTime The 99th percentile distribution and max amount of time the request waits in the response queue from consumer requests to get new data.max ic_node_fetch_consumer_response_queue_time_milliseconds 99thPercentile 99th percentile distribution of time. ic_node_fetch_consumer_response_queue_time_milliseconds k::metadataResponseQueueTime The 99th percentile distribution and max amount of time the request waits in the response queue while processing requests from Kafka brokers to retrieve metadata.max ic_node_metadata_response_queue_time_milliseconds 99thPercentile 99th percentile distribution of time. ic_node_metadata_response_queue_time_milliseconds k::producePurgatorySize The number of produce requests currently waiting in purgatory.value ic_node_produce_purgatory_size k::fetchPurgatorySize The number of fetch requests currently waiting in purgatory.value ic_node_fetch_purgatory_size k::networkProcessorAvgIdlePercent The average percentage of time the network processors are idle, expressed as a number between 0 and 1. Kafka’s network processor threads are responsible for reading and writing data to Kafka clients across the network.value ic_node_network_processor_avg_idle_percent k::requestHandlerAvgIdlePercent The average percentage of time Kafka’s request handler threads are idle, expressed as a number between 0 and 1. Kafka’s request handler threads are responsible for servicing client requests, including reading and writing messages to disk.one_minute_rate One minute rate of the measured metric. ic_node_request_handler_avg_idle_percent mean_rate The average rate of the measured metric. ic_node_request_handler_avg_idle_percent count ic_node_request_handler_avg_idle_percent k::produceMessageConversionsPerSec The one minute rate, mean rate, and number of produce requests per second that require message format conversion.one_minute_rate One minute rate of the measured metric. 
ic_node_produce_message_conversions_per_sec mean_rate The average rate of the measured metric. ic_node_produce_message_conversions_per_sec count ic_node_produce_message_conversions_per_sec k::fetchMessageConversionsPerSec The one minute rate, mean rate, and number of fetch requests per second that require message format conversion.one_minute_rate One minute rate of the measured metric. ic_node_fetch_message_conversions_per_sec mean_rate The average rate of the measured metric. ic_node_fetch_message_conversions_per_sec count ic_node_fetch_message_conversions_per_sec k::slaConsumerLatency The average and maximum time in milliseconds between a synthetic transaction message being sent by the producer and being received by the consumer.average Average value of the metric. ic_node_sla_consumer_latency max Maximum value of the metric. ic_node_sla_consumer_latency k::slaConsumerRecordsProcessed The number of synthetic transaction messages being successfully consumed and processed on each broker.count ic_node_sla_consumer_records_processed k::slaProducerLatencyMs The average and maximum time taken in milliseconds to send a synthetic transaction message to each broker that is successfully replicated to the required number of minimum in-sync replicas.average Average value of the metric. ic_node_sla_producer_latency_ms max Maximum value of the metric. 
ic_node_sla_producer_latency_ms k::slaProducerMessagesProcessed The number of synthetic transaction messages being successfully produced to each broker.count ic_node_sla_producer_messages_processed k::slaProducerErrors The number of errors encountered when producing synthetic transaction messages.count ic_node_sla_producer_errors k::youngGenLastGC Time taken for GC to run young generation during the latest event.value ic_node_young_gen_last_g_c k::oldGengcCollectionTime Total time taken for GC to run old generation.value ic_node_old_gengc_collection_time k::youngGengcCollectionTime Total time taken for GC to run young generation.value ic_node_young_gengc_collection_time k::logFlushRate The total count, one minute rate and mean rate of Kafka log flush.one_minute_rate One minute rate of the measured metric. ic_node_log_flush_rate mean_rate The average rate of the measured metric. ic_node_log_flush_rate count ic_node_log_flush_rate k::logFlushTime The average time and maximum time of Kafka log flush.max ic_node_log_flush_time_milliseconds average ic_node_log_flush_time_milliseconds k::produceRequestsPerSec The one minute rate, mean rate, and number of produce requests, since the beginning of program running. This only works for period below 3h.count ic_node_produce_requests_per_sec mean_rate ic_node_produce_requests_per_sec one_minute_rate ic_node_produce_requests_per_sec k::fetchConsumerRequestsPerSec The one minute rate, mean rate, and number of requests from consumer requests to get new data, since the beginning of program running. This only works for period below 3h.count ic_node_fetch_consumer_requests_per_sec mean_rate ic_node_fetch_consumer_requests_per_sec one_minute_rate ic_node_fetch_consumer_requests_per_sec k::fetchFollowerRequestsPerSec The one minute rate, mean rate, and number of requests from Kafka brokers to get new data from partition leaders, since the beginning of program running. 
This only works for period below 3h.count ic_node_fetch_follower_requests_per_sec mean_rate ic_node_fetch_follower_requests_per_sec one_minute_rate ic_node_fetch_follower_requests_per_sec k::controlPlaneNetworkProcessorAvgIdlePercent The idle percentage of the pinned control plane network thread.value ic_node_control_plane_network_processor_avg_idle_percent k::brokerFetcherLagConsumerLag The lag in the number of messages per follower replica aggregated at a broker level. Note that a broker does not report this metric if it is not following any partition, for example when all topics in the cluster are created with a replication factor of 1.count ic_node_broker_fetcher_lag_consumer_lag k::metadataApplyErrorCount The number of errors encountered by the BrokerMetadataPublisher while applying a new MetadataImage based on the latest MetadataDelta.value ic_node_metadata_apply_error_count k::metadataLoadErrorCount The number of errors encountered by the BrokerMetadataListener while loading the metadata log and generating a new MetadataDelta based on it.value ic_node_metadata_load_error_count k::commitLatencyAvg The average time in milliseconds to commit an entry in the raft log.ms ic_node_commit_latency_avg_milliseconds k::commitLatencyMax The maximum time in milliseconds to commit an entry in the raft log.ms ic_node_commit_latency_max_milliseconds k::appendRecordsRate The average number of records appended per sec by the leader of the raft quorum.one_minute_rate One minute rate of the measured metric. ic_node_append_records_rate mean_rate The average rate of the measured metric. 
ic_node_append_records_rate count ic_node_append_records_rate k::electionLatencyMax The maximum time in milliseconds spent on electing a new leader.ms ic_node_election_latency_max_milliseconds k::electionLatencyAvg The average time in milliseconds spent on electing a new leader.ms ic_node_election_latency_avg_milliseconds k::pollIdleRatioAvg The average fraction of time the client's poll() is idle as opposed to waiting for the user code to process records.value ic_node_poll_idle_ratio_avg k::currentState The current state of this member; possible values are leader, candidate, voted, follower, unattached.state ic_node_current_state k::currentStateKraft The current state of this member; possible values are leader, candidate, voted, follower, unattached.state ic_node_current_state_kraft k::highWatermark The high watermark maintained on this member; -1 if it is unknown.value ic_node_high_watermark k::currentLeader The current quorum leader's id; -1 indicates unknown.value ic_node_current_leader k::logEndOffset The current raft log end offset.value ic_node_log_end_offset k::fetchRecordsRate The average number of records fetched from the leader of the raft quorum.one_minute_rate One minute rate of the measured metric. ic_node_fetch_records_rate mean_rate The average rate of the measured metric. 
ic_node_fetch_records_rate count ic_node_fetch_records_rate k::currentEpoch The current quorum epoch.value ic_node_current_epoch k::globalPartitionCount The number of global partitions according to this Controller.value ic_node_global_partition_count k::globalTopicCount The number of global topics according to this Controller.value ic_node_global_topic_count k::lastAppliedRecordLagMs The difference between current time and the timestamp in milliseconds of the last record from the cluster metadata partition applied by this Controller.value ic_node_last_applied_record_lag_ms_milliseconds k::lastAppliedRecordOffset The offset of the last record from the cluster metadata partition applied by this Controller.value ic_node_last_applied_record_offset k::lastAppliedRecordTimestamp The timestamp in milliseconds of the last record from the cluster metadata partition applied by this Controller.value ic_node_last_applied_record_timestamp k::newActiveControllersCount Counts the number of times this node has seen a new controller elected. A transition to the "no leader" state is not counted here. If the same controller as before becomes active, that still counts. NOTE: This metric is for kraft onlyvalue ic_node_new_active_controllers_count k::timedOutBrokerHeartbeatCount The number of broker heartbeats that timed out on this controller since the process was started. Note that only active controllers handle heartbeats, so only they will see increases in this metric. NOTE: This metric is for kraft onlyvalue ic_node_timed_out_broker_heartbeat_count k::currentMetadataVersion Outputs the feature level of the current effective metadata version. NOTE: This metric is for kraft onlyvalue ic_node_current_metadata_version k::currentControllerId The CurrentControllerId metric shows the ID of the controller, as seen by the node in question. If the current node doesn't think there is an active controller, the value of this metric will be -1. 
NOTE: This metric is for kraft onlyvalue ic_node_current_controller_id k::remoteLogReaderTaskQueueSize Size of the queue holding remote storage read tasks value ic_node_remote_log_reader_task_queue_size k::remoteLogReaderAvgIdlePercent Average idle percent of thread pool for processing remote storage read tasks.value ic_node_remote_log_reader_avg_idle_percent k::remoteLogManagerTasksAvgIdlePercent Average idle percent of thread pool for copying data to remote storage. value ic_node_remote_log_manager_tasks_avg_idle_percent k::expiresPerSec The number of expired remote fetches per second. mean_rate The average rate of the measured metric. ic_node_expires_per_sec one_minute_rate One minute rate of the measured metric. ic_node_expires_per_sec k::activeControllerCountKraft The number of active controllers on the node. In effect it is 0 or 1. The active controller of a cluster is usually the first node to start up in the cluster.value ic_node_active_controller_count_kraft k::offlinePartitionsKraft The number of partitions without an active leader. Any partitions that are offline will not be accessible since read and write operations are only performed on the leader of a partition.value ic_node_offline_partitions_kraft k::activeBrokerCountKraft The number of registered and unfenced brokers.value ic_node_active_broker_count_kraft k::metadataErrorCountKraft The number of times this controller node has encountered an error during metadata log processing.value ic_node_metadata_error_count_kraft k::lastCommittedRecordOffsetKraft The offset of the last record committed to this Controller. 
This is always advancing due to the NoOpRecord, and can be used to check cluster availability.value ic_node_last_committed_record_offset_kraft k::fencedBrokerCountKraft The number of registered but fenced brokers.value ic_node_fenced_broker_count_kraft k::preferredReplicaImbalanceCountKraft The count of topic partitions for which the leader is not the preferred leader.value ic_node_preferred_replica_imbalance_count_kraft k::globalPartitionCountKraft The number of global partitions according to this Controller.value ic_node_global_partition_count_kraft k::globalTopicCountKraft The number of global topics according to this Controller.value ic_node_global_topic_count_kraft k::lastAppliedRecordLagMsKraft The difference between current time and the timestamp in milliseconds of the last record from the cluster metadata partition applied by this Controller.value ic_node_last_applied_record_lag_ms_kraft_milliseconds k::lastAppliedRecordOffsetKraft The offset of the last record from the cluster metadata partition applied by this Controller.value ic_node_last_applied_record_offset_kraft k::lastAppliedRecordTimestampKraft The timestamp in milliseconds of the last record from the cluster metadata partition applied by this Controller.value ic_node_last_applied_record_timestamp_kraft k::newActiveControllersCountKraft Counts the number of times this node has seen a new controller elected. A transition to the "no leader" state is not counted here. If the same controller as before becomes active, that still counts. NOTE: This metric is for kraft onlyvalue ic_node_new_active_controllers_count_kraft k::timedOutBrokerHeartbeatCountKraft The number of broker heartbeats that timed out on this controller since the process was started. Note that only active controllers handle heartbeats, so only they will see increases in this metric. 
NOTE: This metric is for kraft onlyvalue ic_node_timed_out_broker_heartbeat_count_kraft k::commitLatencyAvgKraft The average time in milliseconds to commit an entry in the raft log.ms ic_node_commit_latency_avg_kraft_milliseconds k::commitLatencyMaxKraft The maximum time in milliseconds to commit an entry in the raft log.ms ic_node_commit_latency_max_kraft_milliseconds k::appendRecordsRateKraft The average number of records appended per sec by the leader of the raft quorum.one_minute_rate One minute rate of the measured metric. ic_node_append_records_rate_kraft mean_rate The average rate of the measured metric. ic_node_append_records_rate_kraft count ic_node_append_records_rate_kraft k::electionLatencyMaxKraft The maximum time in milliseconds spent on electing a new leader.ms ic_node_election_latency_max_kraft_milliseconds k::electionLatencyAvgKraft The average time in milliseconds spent on electing a new leader.ms ic_node_election_latency_avg_kraft_milliseconds k::pollIdleRatioAvgKraft The average fraction of time the client's poll() is idle as opposed to waiting for the user code to process records.value ic_node_poll_idle_ratio_avg_kraft k::currentStateKraft The current state of this member; possible values are leader, candidate, voted, follower, unattached.state ic_node_current_state_kraft k::highWatermarkKraft The high watermark maintained on this member; -1 if it is unknown.value ic_node_high_watermark_kraft k::currentLeaderKraft The current quorum leader's id; -1 indicates unknown.value ic_node_current_leader_kraft k::logEndOffsetKraft The current raft log end offset.value ic_node_log_end_offset_kraft k::fetchRecordsRateKraft The average number of records fetched from the leader of the raft quorum.one_minute_rate One minute rate of the measured metric. ic_node_fetch_records_rate_kraft mean_rate The average rate of the measured metric. 
ic_node_fetch_records_rate_kraft count ic_node_fetch_records_rate_kraft k::currentEpochKraft The current quorum epoch.value ic_node_current_epoch_kraft k::logFlushRateKraft The total count, one minute rate and mean rate of Kafka log flush.one_minute_rate One minute rate of the measured metric. ic_node_log_flush_rate_kraft mean_rate The average rate of the measured metric. ic_node_log_flush_rate_kraft count ic_node_log_flush_rate_kraft k::logFlushTimeKraft The average time and maximum time of Kafka log flush.max ic_node_log_flush_time_kraft_milliseconds average ic_node_log_flush_time_kraft_milliseconds k::leaderElectionRateKraft The count, average, max, and one minute rate of leader elections per second.one_minute_rate One minute rate of the measured metric. ic_node_leader_election_rate_kraft max Maximum value of the metric. ic_node_leader_election_rate_kraft average Average value of the metric. ic_node_leader_election_rate_kraft count ic_node_leader_election_rate_kraft k::uncleanLeaderElectionsKraft The number of failures to elect a suitable leader per second. In the case that no suitable leader can be chosen (ie. no available replicas are in sync), an out-of-sync replica will be elected as leader, resulting in data loss that is proportional to how out-of-sync the newly elected leader is.one_minute_rate One minute rate of the measured metric. ic_node_unclean_leader_elections_kraft mean_rate The average rate of the measured metric. ic_node_unclean_leader_elections_kraft count ic_node_unclean_leader_elections_kraft k::youngGenLastGCKraft Time taken for GC to run young generation during the latest event.value ic_node_young_gen_last_g_c_kraft k::oldGengcCollectionTimeKraft Total time taken for GC to run old generation.value ic_node_old_gengc_collection_time_kraft k::youngGengcCollectionTimeKraft Total time taken for GC to run young generation.value ic_node_young_gengc_collection_time_kraft Per-topic metric names follow the format kt::{topic}::{metricName}. 
Optionally, a ‘sub-type’ may be specified to return a specific part of the metric - kt::{topic}::{metricName}:{subType}
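As a minimal sketch of the naming scheme above, the following Python helper assembles a per-topic metric name with an optional sub-type. The function name `topic_metric` and the topic `orders` are hypothetical, and the single colon before the sub-type is copied verbatim from the format shown above; this is an illustration, not part of the API.

```python
from typing import Optional


def topic_metric(topic: str, metric_name: str, sub_type: Optional[str] = None) -> str:
    """Build a per-topic metric name: kt::{topic}::{metricName}[:{subType}]."""
    name = f"kt::{topic}::{metric_name}"
    # Assumption: the sub-type separator is the single colon shown in the format text.
    if sub_type is not None:
        name += f":{sub_type}"
    return name


# "orders" is a hypothetical topic name used only for illustration.
print(topic_metric("orders", "diskUsage"))
print(topic_metric("orders", "bytesInPerTopic", "one_minute_rate"))
```

Several per-topic metrics state that one sub-type must be specified, so for those the third argument would always be supplied.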
kt::{topic}::messagesInPerTopic The rate of messages received by the topic. One sub-type must be specified.
  one_minute_rate One minute rate of the measured metric. ic_topic_messages_in_per_topic
  mean_rate The average rate of the measured metric. ic_topic_messages_in_per_topic
kt::{topic}::bytesInPerTopic The rate of incoming bytes to the topic per second. One sub-type must be specified.
  one_minute_rate One minute rate of the measured metric. ic_topic_bytes_in_per_topic
  mean_rate The average rate of the measured metric. ic_topic_bytes_in_per_topic
kt::{topic}::bytesOutPerTopic The rate of outgoing bytes from the topic. One sub-type must be specified.
  one_minute_rate One minute rate of the measured metric. ic_topic_bytes_out_per_topic
  mean_rate The average rate of the measured metric. ic_topic_bytes_out_per_topic
kt::{topic}::fetchMessageConversionsPerTopic The amount and rate of fetch request messages which required message format conversions for the topic. One sub-type must be specified.
  one_minute_rate One minute rate of the measured metric. ic_topic_fetch_message_conversions_per_topic
  mean_rate The average rate of the measured metric. ic_topic_fetch_message_conversions_per_topic
  count ic_topic_fetch_message_conversions_per_topic
kt::{topic}::produceMessageConversionsPerTopic The amount and rate of produce request messages which required message format conversions for the topic. One sub-type must be specified.
  one_minute_rate One minute rate of the measured metric. ic_topic_produce_message_conversions_per_topic
  mean_rate The average rate of the measured metric. ic_topic_produce_message_conversions_per_topic
  count ic_topic_produce_message_conversions_per_topic
kt::{topic}::failedFetchMessagePerTopic The amount and rate of failed fetch requests to the topic. One sub-type must be specified.
  one_minute_rate One minute rate of the measured metric. ic_topic_failed_fetch_message_per_topic
  mean_rate The average rate of the measured metric. ic_topic_failed_fetch_message_per_topic
  count ic_topic_failed_fetch_message_per_topic
kt::{topic}::failedProduceMessagePerTopic The amount and rate of failed produce requests to the topic. One sub-type must be specified.
  one_minute_rate One minute rate of the measured metric. ic_topic_failed_produce_message_per_topic
  mean_rate The average rate of the measured metric. ic_topic_failed_produce_message_per_topic
  count ic_topic_failed_produce_message_per_topic
kt::{topic}::diskUsage The total size of the files on disk associated with the topic, summed across all partitions.
  disk_usage_kilobytes The total size of the files on disk associated with the topic, summed across all partitions. ic_topic_disk_usage
kt::{topic}::remoteCopyLagBytes Bytes which are eligible for tiering, but are not in remote storage yet. value ic_topic_remote_copy_lag_bytes
kt::{topic}::remoteDeleteLagBytes Tiered bytes which are eligible for deletion, but have not been deleted yet. value ic_topic_remote_delete_lag_bytes
kt::{topic}::remoteLogSizeBytes The total size of a remote log in bytes. value ic_topic_remote_log_size_bytes
kt::{topic}::remoteFetchBytesPerSecPerTopic Rate of bytes read from remote storage per topic.
  mean_rate The average rate of the measured metric. ic_topic_remote_fetch_bytes_per_sec_per_topic
  one_minute_rate One minute rate of the measured metric. ic_topic_remote_fetch_bytes_per_sec_per_topic
kt::{topic}::remoteFetchRequestsPerSecPerTopic Rate of read requests from remote storage per topic.
  mean_rate The average rate of the measured metric. ic_topic_remote_fetch_requests_per_sec_per_topic
  one_minute_rate One minute rate of the measured metric. ic_topic_remote_fetch_requests_per_sec_per_topic
kt::{topic}::remoteFetchErrorsPerSecPerTopic Rate of read errors from remote storage per topic.
  mean_rate The average rate of the measured metric. ic_topic_remote_fetch_errors_per_sec_per_topic
  one_minute_rate One minute rate of the measured metric. ic_topic_remote_fetch_errors_per_sec_per_topic
kt::{topic}::remoteCopyBytesPerSecPerTopic Rate of bytes copied to remote storage per topic.
  mean_rate The average rate of the measured metric. ic_topic_remote_copy_bytes_per_sec_per_topic
  one_minute_rate One minute rate of the measured metric. ic_topic_remote_copy_bytes_per_sec_per_topic
kt::{topic}::remoteCopyRequestsPerSecPerTopic Rate of write requests to remote storage per topic.
  mean_rate The average rate of the measured metric. ic_topic_remote_copy_requests_per_sec_per_topic
  one_minute_rate One minute rate of the measured metric. ic_topic_remote_copy_requests_per_sec_per_topic
kt::{topic}::remoteCopyErrorsPerSecPerTopic Rate of write errors from remote storage per topic.
  mean_rate The average rate of the measured metric. ic_topic_remote_copy_errors_per_sec_per_topic
  one_minute_rate One minute rate of the measured metric. ic_topic_remote_copy_errors_per_sec_per_topic
Per-user metric names follow the format ku::{user}::{metricName}. Per-user metrics can take up to 50 minutes to be refreshed after a user is removed or becomes idle. Optionally, a ‘sub-type’ may be specified to return a specific part of the metric - ku::{user}::{metricName}:{subType}
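The per-user format can be sketched in the same way. The snippet below expands the two per-user quota metrics in this section into one name per sub-type (`byte_rate` and `throttle_time`). The function name `user_quota_metrics` and the user `alice` are hypothetical, and the single colon before the sub-type again follows the format text as written; treat this as an illustration rather than a definitive client.

```python
from itertools import product
from typing import List

# Metric names and sub-types taken from the per-user metrics in this section.
USER_METRICS = ("produceBandwidthQuotaPerUser", "fetchBandwidthQuotaPerUser")
USER_SUB_TYPES = ("byte_rate", "throttle_time")


def user_quota_metrics(user: str) -> List[str]:
    """Return every per-user quota metric name, one entry per sub-type."""
    return [
        f"ku::{user}::{metric}:{sub_type}"
        for metric, sub_type in product(USER_METRICS, USER_SUB_TYPES)
    ]


# "alice" is a hypothetical user name used only for illustration.
for name in user_quota_metrics("alice"):
    print(name)
```

Because per-user metrics can take up to 50 minutes to refresh, values returned for a removed or idle user may lag behind the user's actual state.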
ku::{user}::produceBandwidthQuotaPerUser Bandwidth quota metrics (produce) per user.
  byte_rate ic_user_produce_bandwidth_quota_per_user
  throttle_time ic_user_produce_bandwidth_quota_per_user
ku::{user}::fetchBandwidthQuotaPerUser Bandwidth quota metrics (fetch) per user.
  byte_rate ic_user_fetch_bandwidth_quota_per_user
  throttle_time ic_user_fetch_bandwidth_quota_per_user
kc::taskCount Number of tasks currently assigned to each worker node. value ic_node_task_count
kc::connectorCount Number of connectors currently assigned to each worker node. value ic_node_connector_count
kc::connectorStartupAttemptsTotal Number of times a connector has been instructed to start on each worker node. value ic_node_connector_startup_attempts_total
kc::connectorStartupFailurePercentage Percentage of connector start-up attempts that have failed to complete. percentage ic_node_connector_startup_failure_percentage
kc::connectorStartupFailureTotal Number of times a connector has been instructed to start and failed to do so. value ic_node_connector_startup_failure_total
kc::connectorStartupSuccessPercentage Percentage of connector start-up attempts that have successfully completed. percentage ic_node_connector_startup_success_percentage
kc::connectorStartupSuccessTotal Number of times a connector has been instructed to start and has succeeded in doing so. value ic_node_connector_startup_success_total
kc::taskStartupAttemptsTotal Number of times a task has been instructed to start on each worker node. value ic_node_task_startup_attempts_total
kc::taskStartupFailurePercentage Percentage of task start-up attempts that have failed to complete. percentage ic_node_task_startup_failure_percentage
kc::taskStartupFailureTotal Number of times a task has been instructed to start and failed to do so. value ic_node_task_startup_failure_total
kc::taskStartupSuccessPercentage Percentage of task start-up attempts that have successfully completed. percentage ic_node_task_startup_success_percentage
kc::taskStartupSuccessTotal Number of times a task has been instructed to start and has succeeded in doing so. value ic_node_task_startup_success_total
kc::leaderName Identity of the current leader worker node. Typically this is the IP address of the leader. state ic_node_leader_name
kc::isLeader Monitors the number of worker nodes which believe they are the leader for the Kafka Connect cluster. value ic_node_is_leader
kc::completedRebalancesTotal Number of rebalances that have completed since Kafka Connect has started (per node). value ic_node_completed_rebalances_total
kc::epoch Monotonically increasing number that indicates the current state of assigned tasks. Will increase by one for each completed rebalance. value ic_node_epoch
kc::timeSinceLastRebalanceMs Time since the last successful rebalance that each node participated in (per node, in milliseconds). ms ic_node_time_since_last_rebalance_ms_milliseconds
kc::rebalanceAvgTimeMs The average time each rebalance has taken to complete (per node, in milliseconds). ms ic_node_rebalance_avg_time_ms_milliseconds
kc::rebalanceMaxTimeMs The maximum time each rebalance has taken to complete (per node, in milliseconds). ms ic_node_rebalance_max_time_ms_milliseconds
kc::rebalancing Whether or not the worker is currently rebalancing (per node). value ic_node_rebalancing
kc::restApiAvailable Whether or not the Kafka Connect REST API is currently available. value ic_node_rest_api_available
kc::latencyRecordsProcessed The number of messages processed to produce the latencyMedianMs measure. Only available if attached to an Instaclustr managed Kafka cluster. value ic_node_latency_records_processed
kc::latencyMedianMs The time taken from a record being produced on the connected Kafka Cluster to it being read on the Kafka Connect cluster. Measured using synthetic messages. Only available if attached to an Instaclustr managed Kafka cluster. ms ic_node_latency_median_ms_milliseconds
kc::customConnectorLoadStatus The result of loading custom connectors from an external source. Can be one of FAILED, SUCCEEDED, UNDEFINED. The value is UNDEFINED when the cluster does not have any custom connector or due to an error while collecting the metrics. state ic_node_custom_connector_load_status
Task General, Task Error, Sink Task and Source Task metrics are listed below:
kct::<connector-name>::<task-id>::batchSizeAvg The average size of the batches processed by the connector. value ic_connector_task_batch_size_avg
kct::<connector-name>::<task-id>::offsetCommitAvgTimeMs The average time in milliseconds taken by this task to commit offsets. ms ic_connector_task_offset_commit_avg_time_ms_milliseconds
kct::<connector-name>::<task-id>::offsetCommitFailurePercentage The average percentage of this task’s offset commit attempts that failed. percentage ic_connector_task_offset_commit_failure_percentage
kct::<connector-name>::<task-id>::pauseRatio The fraction of time this task has spent in the pause state. value ic_connector_task_pause_ratio
kct::<connector-name>::<task-id>::status The status of the connector task. Can be one of ‘unassigned’, ‘running’, ‘paused’ or ‘failed’. state ic_connector_task_status
kct::<connector-name>::<task-id>::deadletterqueueProduceFailures The number of failed writes to the dead letter queue. value ic_connector_task_deadletterqueue_produce_failures
kct::<connector-name>::<task-id>::deadletterqueueProduceRequests The number of attempted writes to the dead letter queue. value ic_connector_task_deadletterqueue_produce_requests
kct::<connector-name>::<task-id>::lastErrorTimestamp The epoch timestamp when this task last encountered an error. value ic_connector_task_last_error_timestamp
kct::<connector-name>::<task-id>::totalErrorsLogged The number of errors that were logged. value ic_connector_task_total_errors_logged
kct::<connector-name>::<task-id>::totalRecordErrors The number of record processing errors in this task. value ic_connector_task_total_record_errors
kct::<connector-name>::<task-id>::totalRecordFailures The number of record processing failures in this task. value ic_connector_task_total_record_failures
kct::<connector-name>::<task-id>::totalRecordsSkipped The number of records skipped due to errors. value ic_connector_task_total_records_skipped
kct::<connector-name>::<task-id>::totalRetries The number of operations retried. value ic_connector_task_total_retries
kct::<connector-name>::<task-id>::offsetCommitCompletionRate The average per-second number of offset commit completions that were completed successfully. value ic_connector_task_offset_commit_completion_rate
kct::<connector-name>::<task-id>::offsetCommitCompletionTotal The total number of offset commit completions that were completed successfully. value ic_connector_task_offset_commit_completion_total
kct::<connector-name>::<task-id>::offsetCommitSeqNo The current sequence number for offset commits. value ic_connector_task_offset_commit_seq_no
kct::<connector-name>::<task-id>::offsetCommitSkipRate The average per-second number of offset commit completions that were received too late and skipped/ignored. value ic_connector_task_offset_commit_skip_rate
kct::<connector-name>::<task-id>::offsetCommitSkipTotal The total number of offset commit completions that were received too late and skipped/ignored. value ic_connector_task_offset_commit_skip_total
kct::<connector-name>::<task-id>::partitionCount The number of topic partitions assigned to this task belonging to the named sink connector in this worker. value ic_connector_task_partition_count
kct::<connector-name>::<task-id>::putBatchAvgTimeMs The average time taken by this task to put a batch of sink records. ms ic_connector_task_put_batch_avg_time_ms_milliseconds
kct::<connector-name>::<task-id>::sinkRecordActiveCount The number of records that have been read from Kafka but not yet completely committed/flushed/acknowledged by the sink task. value ic_connector_task_sink_record_active_count
kct::<connector-name>::<task-id>::sinkRecordActiveCountAvg The average number of records that have been read from Kafka but not yet completely committed/flushed/acknowledged by the sink task. value ic_connector_task_sink_record_active_count_avg
kct::<connector-name>::<task-id>::sinkRecordLagMax The maximum lag, in number of records, that the offset commits are behind the consumer for any topic partition. value ic_connector_task_sink_record_lag_max
kct::<connector-name>::<task-id>::sinkRecordReadRate The average per-second number of records read from Kafka for this task belonging to the named sink connector in this worker. This is before transformations are applied. value ic_connector_task_sink_record_read_rate
kct::<connector-name>::<task-id>::sinkRecordReadTotal The total number of records read from Kafka by this task belonging to the named sink connector in this worker, since the task was last restarted. value ic_connector_task_sink_record_read_total
kct::<connector-name>::<task-id>::sinkRecordSendRate The average per-second number of records output from the transformations and sent/put to this task belonging to the named sink connector in this worker. This is after transformations are applied and excludes any records filtered out by the transformations. value ic_connector_task_sink_record_send_rate
kct::<connector-name>::<task-id>::sinkRecordSendTotal The total number of records output from the transformations and sent/put to this task belonging to the named sink connector in this worker, since the task was last restarted. value ic_connector_task_sink_record_send_total
kct::<connector-name>::<task-id>::pollBatchAvgTimeMs The average time in milliseconds taken by this task to poll for a batch of source records. ms ic_connector_task_poll_batch_avg_time_ms_milliseconds
kct::<connector-name>::<task-id>::sourceRecordActiveCount The number of records that have been produced by this task but not yet completely written to Kafka. value ic_connector_task_source_record_active_count
kct::<connector-name>::<task-id>::sourceRecordActiveCountAvg The average number of records that have been produced by this task but not yet completely written to Kafka. value ic_connector_task_source_record_active_count_avg
kct::<connector-name>::<task-id>::sourceRecordPollRate The average per-second number of records produced/polled (before transformation) by this task belonging to the named source connector in this worker. value ic_connector_task_source_record_poll_rate
kct::<connector-name>::<task-id>::sourceRecordPollTotal The total number of records produced/polled (before transformation) by this task belonging to the named source connector in this worker. value ic_connector_task_source_record_poll_total
kct::<connector-name>::<task-id>::sourceRecordWriteRate The average per-second number of records output from the transformations and written to Kafka for this task belonging to the named source connector in this worker. This is after transformations are applied and excludes any records filtered out by the transformations. value ic_connector_task_source_record_write_rate
kct::<connector-name>::<task-id>::sourceRecordWriteTotal The number of records output from the transformations and written to Kafka for this task belonging to the named source connector in this worker, since the task was last restarted. value ic_connector_task_source_record_write_total
kcc::<connectorName>::connectorUnassignedTaskCount The number of unassigned tasks for the connector. This is only available for Kafka Connect 2.5.1+. value ic_connector_connector_unassigned_task_count
kcc::<connectorName>::connectorTotalTaskCount The total number of tasks assigned to the connector. This is only available for Kafka Connect 2.5.1+. value ic_connector_connector_total_task_count
kcc::<connectorName>::connectorRunningTaskCount The number of running tasks assigned to the connector. This is only available for Kafka Connect 2.5.1+. value ic_connector_connector_running_task_count
kcc::<connectorName>::connectorDestroyedTaskCount The number of destroyed tasks assigned to the connector. This is only available for Kafka Connect 2.5.1+. value ic_connector_connector_destroyed_task_count
kcc::<connectorName>::connectorFailedTaskCount The number of failed tasks assigned to the connector. This is only available for Kafka Connect 2.5.1+. value ic_connector_connector_failed_task_count
kcc::<connectorName>::connectorPausedTaskCount The number of paused tasks assigned to the connector. This is only available for Kafka Connect 2.5.1+. value ic_connector_connector_paused_task_count
kc::mm::source::<target>::<topic-name-in-target>::recordCount Number of records replicated by the mirroring source connector. count ic_mirror_source_connector_record_count
kc::mm::source::<target>::<topic-name-in-target>::byteCount Byte count replicated by the mirroring source connector. count ic_mirror_source_connector_byte_count
kc::mm::source::<target>::<topic-name-in-target>::recordRate Record replication rate of the mirroring source connector. value ic_mirror_source_connector_record_rate
kc::mm::source::<target>::<topic-name-in-target>::byteRate Byte replication rate of the mirroring source connector. value ic_mirror_source_connector_byte_rate
kc::mm::source::<target>::<topic-name-in-target>::recordAgeMs Age of each record at the time when consumed by the mirroring source connector.
  value ic_mirror_source_connector_record_age_ms_milliseconds
  min ic_mirror_source_connector_record_age_ms_milliseconds
  max ic_mirror_source_connector_record_age_ms_milliseconds
kc::mm::source::<target>::<topic-name-in-target>::replicationLatencyMs Timespan between each record’s timestamp and downstream acknowledgment.
  value ic_mirror_source_connector_replication_latency_ms_milliseconds
  min ic_mirror_source_connector_replication_latency_ms_milliseconds
  max ic_mirror_source_connector_replication_latency_ms_milliseconds
kc::mm::checkpoint::<source>::<target>::<group>::<topic-name-in-target>::checkpointLatencyMs Timespan between consumer group commit and downstream checkpoint acknowledgment.
  value ic_mirror_checkpoint_connector_checkpoint_latency_ms_milliseconds
  min ic_mirror_checkpoint_connector_checkpoint_latency_ms_milliseconds
  max ic_mirror_checkpoint_connector_checkpoint_latency_ms_milliseconds
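The Kafka Connect task, connector and mirroring metric names above are all built by substituting placeholders into a fixed `::`-separated pattern. A minimal Python sketch of that substitution follows. The helper names, the connector name `s3-sink`, the integer task id and the target and topic names are all hypothetical; the patterns themselves are the ones listed above.

```python
def task_metric(connector_name: str, task_id: int, metric_name: str) -> str:
    """kct::<connector-name>::<task-id>::<metricName>"""
    # Assumption: the task id is rendered as a plain integer.
    return f"kct::{connector_name}::{task_id}::{metric_name}"


def connector_metric(connector_name: str, metric_name: str) -> str:
    """kcc::<connectorName>::<metricName>"""
    return f"kcc::{connector_name}::{metric_name}"


def mirror_source_metric(target: str, topic_in_target: str, metric_name: str) -> str:
    """kc::mm::source::<target>::<topic-name-in-target>::<metricName>"""
    return f"kc::mm::source::{target}::{topic_in_target}::{metric_name}"


# All argument values below are hypothetical examples.
print(task_metric("s3-sink", 0, "sinkRecordLagMax"))
print(connector_metric("s3-sink", "connectorFailedTaskCount"))
print(mirror_source_metric("dc2", "dc1.orders", "recordCount"))
```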
r::masterSlotsCount The number of hash slots a master node has been assigned. The number of hash slots of all master nodes should add to 16384. value ic_node_master_slots_count
r::clusterUnassignedSlotsCount Number of slots which are NOT associated to some node (unbound). value ic_node_cluster_unassigned_slots_count
r::clusterSlotsNotOkCount Number of hash slots mapping to a node in FAIL or PFAIL state. value ic_node_cluster_slots_not_ok_count
r::slaWritesLatency The average and maximum time taken in milliseconds by a client to write to a random master node in the cluster.
  average Average value of the metric. ic_node_sla_writes_latency
  max Maximum value of the metric. ic_node_sla_writes_latency
r::slaWritesSuccessfulOps Number of successful write operations performed on the cluster. Every 20 seconds, 30 synthetic write transactions are performed on each node. count ic_node_sla_writes_successful_ops
r::slaWritesFailedOps Number of failed write operations performed on the cluster. count ic_node_sla_writes_failed_ops
r::slaReadsLatency The average and maximum time taken in milliseconds by a client to read from a random node in the cluster.
  average Average value of the metric. ic_node_sla_reads_latency
  max Maximum value of the metric. ic_node_sla_reads_latency
r::slaReadsSuccessfulOps Number of successful read operations performed on the cluster. Every 20 seconds, 30 synthetic read transactions are performed on each node. count ic_node_sla_reads_successful_ops
r::slaReadsFailedOps Number of failed read operations performed on the cluster. count ic_node_sla_reads_failed_ops
r::localWritesLatency The average and maximum time taken in milliseconds by a client to write to its local node.
  average Average value of the metric. ic_node_local_writes_latency
  max Maximum value of the metric. ic_node_local_writes_latency
r::localWritesSuccessfulOps Number of successful write operations performed on the local node. Every 20 seconds, 30 synthetic write transactions are performed on each node. count ic_node_local_writes_successful_ops
r::localWritesFailedOps Number of failed write operations performed on the local node. count ic_node_local_writes_failed_ops
r::localReadsLatency The average and maximum time taken in milliseconds by a client to read from its local node.
  average Average value of the metric. ic_node_local_reads_latency
  max Maximum value of the metric. ic_node_local_reads_latency
r::localReadsSuccessfulOps Number of successful read operations performed on the local node. Every 20 seconds, 30 synthetic read transactions are performed on each node. count ic_node_local_reads_successful_ops
r::localReadsFailedOps Number of failed read operations performed on the local node. count ic_node_local_reads_failed_ops
r::usedMemory Total memory in megabytes allocated by Redis using its allocator (either standard libc, jemalloc, or an alternative allocator such as tcmalloc). value ic_node_used_memory
r::usedMemoryRss Memory in megabytes that Redis allocated as seen by the operating system (a.k.a. resident set size). This is the number reported by tools such as top(1) and ps(1). value ic_node_used_memory_rss
r::usedMemoryDataset The size in bytes of the dataset. value ic_node_used_memory_dataset
r::usedMemoryLua Number of bytes used by the Lua engine. value ic_node_used_memory_lua
r::memoryFragmentationRatio Ratio between Used Memory Rss and Used Memory. value ic_node_memory_fragmentation_ratio
r::connectedClients Number of clients connected to the node. value ic_node_connected_clients
r::operationsPerSec Number of commands processed per second. value ic_node_operations_per_sec
r::roleIsMaster Whether the node is the master; 1.0 if it is and 0.0 otherwise. state ic_node_role_is_master
v::masterSlotsCount The number of hash slots a master node has been assigned. The number of hash slots of all master nodes should add to 16384. value ic_node_master_slots_count
v::clusterUnassignedSlotsCount Number of slots which are NOT associated to some node (unbound). value ic_node_cluster_unassigned_slots_count
v::clusterSlotsNotOkCount Number of hash slots mapping to a node in FAIL or PFAIL state. value ic_node_cluster_slots_not_ok_count
v::slaWritesLatency The average and maximum time taken in milliseconds by a client to write to a random master node in the cluster.
  average Average value of the metric. ic_node_sla_writes_latency
  max Maximum value of the metric. ic_node_sla_writes_latency
v::slaWritesSuccessfulOps Number of successful write operations performed on the cluster. Every 20 seconds, 30 synthetic write transactions are performed on each node. count ic_node_sla_writes_successful_ops
v::slaWritesFailedOps Number of failed write operations performed on the cluster. count ic_node_sla_writes_failed_ops
v::slaReadsLatency The average and maximum time taken in milliseconds by a client to read from a random node in the cluster.
  average Average value of the metric. ic_node_sla_reads_latency
  max Maximum value of the metric. ic_node_sla_reads_latency
v::slaReadsSuccessfulOps Number of successful read operations performed on the cluster. Every 20 seconds, 30 synthetic read transactions are performed on each node. count ic_node_sla_reads_successful_ops
v::slaReadsFailedOps Number of failed read operations performed on the cluster. count ic_node_sla_reads_failed_ops
v::localWritesLatency The average and maximum time taken in milliseconds by a client to write to its local node.
  average Average value of the metric. ic_node_local_writes_latency
  max Maximum value of the metric. ic_node_local_writes_latency
v::localWritesSuccessfulOps Number of successful write operations performed on the local node. Every 20 seconds, 30 synthetic write transactions are performed on each node. count ic_node_local_writes_successful_ops
v::localWritesFailedOps Number of failed write operations performed on the local node. count ic_node_local_writes_failed_ops
v::localReadsLatency The average and maximum time taken in milliseconds by a client to read from its local node.
  average Average value of the metric. ic_node_local_reads_latency
  max Maximum value of the metric. ic_node_local_reads_latency
v::localReadsSuccessfulOps Number of successful read operations performed on the local node. Every 20 seconds, 30 synthetic read transactions are performed on each node. count ic_node_local_reads_successful_ops
v::localReadsFailedOps Number of failed read operations performed on the local node. count ic_node_local_reads_failed_ops
v::usedMemory Total memory in megabytes allocated by Valkey using its allocator (either standard libc, jemalloc, or an alternative allocator such as tcmalloc). value ic_node_used_memory
v::usedMemoryRss Memory in megabytes that Valkey allocated as seen by the operating system (a.k.a. resident set size). This is the number reported by tools such as top(1) and ps(1). value ic_node_used_memory_rss
v::usedMemoryDataset The size in bytes of the dataset. value ic_node_used_memory_dataset
v::usedMemoryLua Number of bytes used by the Lua engine. value ic_node_used_memory_lua
v::memoryFragmentationRatio Ratio between Used Memory Rss and Used Memory. value ic_node_memory_fragmentation_ratio
v::connectedClients Number of clients connected to the node. value ic_node_connected_clients
v::operationsPerSec Number of commands processed per second. value ic_node_operations_per_sec
v::roleIsMaster Whether the node is the master; 1.0 if it is and 0.0 otherwise. state ic_node_role_is_master
z::electionTimeTaken Time taken to complete election. ms ic_node_election_time_taken_milliseconds
z::packetsReceived Number of packet operations received. value ic_node_packets_received
z::txnLogElapsedSyncTime The elapsed sync time of transaction log in milliseconds. ms ic_node_txn_log_elapsed_sync_time_milliseconds
z::packetsSent Number of packet operations sent. value ic_node_packets_sent
z::numAliveConnections Total number of active client connections in the server. value ic_node_num_alive_connections
z::maxRequestLatency Maximum time it takes for the server to respond to a request. ms ic_node_max_request_latency_milliseconds
z::minRequestLatency Minimum time it takes for the server to respond to a request. ms ic_node_min_request_latency_milliseconds
z::avgRequestLatency Average time it takes for the server to respond to a request. ms ic_node_avg_request_latency_milliseconds
z::outstandingRequests Number of pending requests in the server. value ic_node_outstanding_requests
z::openFileDescriptorCount Number of file descriptors in use. value ic_node_open_file_descriptor_count
z::lastZxidCounter Last ZooKeeper Transaction ID (ZXID) counter value. value ic_node_last_zxid_counter
pg::misc::numBackends Number of connections against each node. count ic_num_backends
pg::misc::locks Current count of locks in each node. count ic_locks
pg::misc::timelineId Timeline id of the node. value ic_timeline_id
pg::misc::isMaster Whether the node is the primary; 1.0 if it is and 0.0 otherwise. count ic_is_master
pg::misc::isRunning Whether PostgreSQL is running; 1.0 if it is and 0.0 otherwise. count ic_is_running
pg::misc::stateActive Number of active connections (state = 'active') in pg_stat_activity. count ic_state_active
pg::misc::stateIdle Number of idle connections (state = 'idle') in pg_stat_activity. count ic_state_idle
pg::misc::stateIdleInTransaction Number of connections idle in transaction (state = 'idle in transaction') in pg_stat_activity. count ic_state_idle_in_transaction
pg::misc::stateNull Number of connections with null state in pg_stat_activity. count ic_state_null
pg::misc::stateOthers Number of connections in 'other' states (state = 'idle in transaction (aborted)', 'fastpath function call' or 'disabled') in pg_stat_activity. count ic_state_others
pg::misc::waitEventTypeLwlock Number of connections waiting on LWLock (wait_event_type = 'LWLock') in pg_stat_activity. count ic_wait_event_type_lwlock
pg::misc::waitEventTypeIo Number of connections waiting on IO (wait_event_type = 'IO') in pg_stat_activity. count ic_wait_event_type_io
pg::misc::waitEventTypeLock Number of connections waiting on Lock (wait_event_type = 'Lock') in pg_stat_activity. count ic_wait_event_type_lock
pg::misc::waitEventTypeClient Number of connections waiting on Client (wait_event_type = 'Client') in pg_stat_activity. count ic_wait_event_type_client
pg::misc::waitEventTypeExtension Number of connections waiting on Extension (wait_event_type = 'Extension') in pg_stat_activity. count ic_wait_event_type_extension
pg::misc::waitEventTypeBufferPin Number of connections waiting on BufferPin (wait_event_type = 'BufferPin') in pg_stat_activity. count ic_wait_event_type_buffer_pin
pg::misc::waitEventTypeActivity Number of connections waiting on Activity (wait_event_type = 'Activity') in pg_stat_activity. count ic_wait_event_type_activity
pg::misc::waitEventTypeTimeout Number of connections waiting on Timeout (wait_event_type = 'Timeout') in pg_stat_activity. count ic_wait_event_type_timeout
pg::misc::waitEventTypeInjectionPoint Number of connections waiting on InjectionPoint events (wait_event_type = 'InjectionPoint') in pg_stat_activity. count ic_wait_event_type_injection_point
pg::misc::waitEventTypeIpc Number of connections waiting on IPC events (wait_event_type = 'IPC') in pg_stat_activity. count ic_wait_event_type_ipc
pg::misc::waitEventTypeNull Number of connections with null wait_event_type in pg_stat_activity. count ic_wait_event_type_null
pg::transactions::oldestTransactionId Oldest transaction ID in each node. count ic_oldest_transaction_id
pg::transactions::percentTowardsEmergencyVacuum Percentage towards an emergency vacuum being required in each node. count ic_percent_towards_emergency_vacuum
pg::transactions::percentTowardsWraparound Percentage towards transaction ID wraparound in each node. count ic_percent_towards_wraparound
pg::replication::lsnCurrent Current WAL LSN for database-cluster (this will be empty on replicas). count ic_lsn_current
pg::replication::lsnReceived Last WAL LSN received by this replica (this will be empty on the primary). count ic_lsn_received
pg::replication::isInRecovery Whether the node is a replica; 1.0 if it is and 0.0 otherwise. count ic_is_in_recovery
pg::replication::replicationStatus Whether the replica node's replication status is streaming; 1 if it is and 0 otherwise. value ic_replication_status
pg::replication::isStandbyLeader Whether the node is the standby leader; 1 if it is and 0 otherwise. count ic_is_standby_leader
pg::replication::slots::<node-id>::lsnSent Last WAL LSN sent on this connection (this will be empty on replicas). count ic_slot_lsn_sent
pg::replication::lag::<node-id>::replicationLagByte The replication lag in bytes for the replica nodes. value ic_lag_replication_lag_byte_bytes
pg::replication::lag::<node-id>::replicationLagMs The replication lag in ms for the replica nodes. ms ic_lag_replication_lag_ms_milliseconds
pg::replication::lag::<node-id>::replayLag The replay lag for the replica nodes.
  ms ic_lag_replay_lag_milliseconds
  byte ic_lag_replay_lag_bytes
pg::sla::avgWriteLatency Average write latency for synthetic write requests. ms ic_avg_write_latency_milliseconds
pg::sla::avgReadLatency Average read latency for synthetic read requests. ms ic_avg_read_latency_milliseconds
pg::sla::writeErrors Number of write errors for synthetic write requests. count ic_write_errors
pg::sla::readErrors Number of read errors for synthetic read requests. count ic_read_errors
If your database name contains : please escape it using
pg::db::<database-name>::rowsInsertedCountPerSecond | Number of rows inserted per second. | count_per_second | ic_database_rows_inserted_count_per_second
pg::db::<database-name>::rowsUpdatedCountPerSecond | Number of rows updated per second. | count_per_second | ic_database_rows_updated_count_per_second
pg::db::<database-name>::rowsDeletedCountPerSecond | Number of rows deleted per second. | count_per_second | ic_database_rows_deleted_count_per_second
pg::db::<database-name>::rowsReturnedCountPerSecond | Number of rows returned per second. | count_per_second | ic_database_rows_returned_count_per_second
pg::db::<database-name>::rowsFetchedCountPerSecond | Number of rows fetched per second. | count_per_second | ic_database_rows_fetched_count_per_second
pg::db::<database-name>::deadlocks | Number of deadlocks detected in this database. | count | ic_database_deadlocks
pg::db::<database-name>::bufferCacheHitCountPerSecond | Number of times disk blocks were found already in the buffer cache, so that a read was not necessary, per second. | count_per_second | ic_database_buffer_cache_hit_count_per_second
pg::db::<database-name>::diskBlocksReadCountPerSecond | Number of disk blocks read per second in this database. | count_per_second | ic_database_disk_blocks_read_count_per_second
pg::db::<database-name>::transactionsCommittedPerSecond | Number of transactions in this database that have been committed per second. | count_per_second | ic_database_transactions_committed_per_second
pg::db::<database-name>::transactionsRolledBackPerSecond | Number of transactions in this database that have been rolled back per second. | count_per_second | ic_database_transactions_rolled_back_per_second
pg::db::<database-name>::tempBytesPerSecond | Number of temporary bytes written per second. | value | ic_database_temp_bytes_per_second_bytes
pg::db::<database-name>::numBackends | Number of connections against the database. | count | ic_database_num_backends
If your database name or table name contains : please escape it using
pg::tbl::<database-name>::<schema-name>::<table-name>::rowsInsertedCountPerSecond | Number of rows inserted per second. | count_per_second | ic_database_schema_table_rows_inserted_count_per_second
pg::tbl::<database-name>::<schema-name>::<table-name>::rowsUpdatedCountPerSecond | Number of rows updated per second. | count_per_second | ic_database_schema_table_rows_updated_count_per_second
pg::tbl::<database-name>::<schema-name>::<table-name>::rowsDeletedCountPerSecond | Number of rows deleted per second. | count_per_second | ic_database_schema_table_rows_deleted_count_per_second
pg::tbl::<database-name>::<schema-name>::<table-name>::blocksHitCountPerSecond | Number of blocks hit per second. | count_per_second | ic_database_schema_table_blocks_hit_count_per_second
pg::tbl::<database-name>::<schema-name>::<table-name>::blocksReadCountPerSecond | Number of blocks read per second. | count_per_second | ic_database_schema_table_blocks_read_count_per_second
pg::tbl::<database-name>::<schema-name>::<table-name>::indexScansPerSecond | Number of index scans initiated on this table per second. | count_per_second | ic_database_schema_table_index_scans_per_second
pg::tbl::<database-name>::<schema-name>::<table-name>::sequentialScansPerSecond | Number of sequential scans initiated on this table per second. | count_per_second | ic_database_schema_table_sequential_scans_per_second
pg::tbl::<database-name>::<schema-name>::<table-name>::deadRows | Estimated number of dead rows. | count | ic_database_schema_table_dead_rows
pg::tbl::<database-name>::<schema-name>::<table-name>::bufferCacheIndexHitCountPerSecond | Number of buffer hits in all indexes on this table per second. | count_per_second | ic_database_schema_table_buffer_cache_index_hit_count_per_second
pg::tbl::<database-name>::<schema-name>::<table-name>::diskBlocksReadIndexCountPerSecond | Number of disk blocks read from all indexes on this table per second. | count_per_second | ic_database_schema_table_disk_blocks_read_index_count_per_second
pg::tbl::<database-name>::<schema-name>::<table-name>::tableSize | Computes the disk space used by the specified table, excluding indexes (but including its TOAST table if any, free space map, and visibility map). | value | ic_database_schema_table_table_size_bytes
pg::tbl::<database-name>::<schema-name>::<table-name>::indexSize | Computes the total disk space used by indexes attached to the specified table. | value | ic_database_schema_table_index_size_bytes
pgb::isAvailable | PgBouncer availability. | count | ic_pgbouncer_is_available
If your database name contains : please escape it using
pgb::stats::<database-name>::avgQueryCount | Average queries per second in last stat collecting period. | count | ic_pgbouncer_stats_avg_query_count
pgb::stats::<database-name>::avgQueryTime | Average query duration in microseconds. | value | ic_pgbouncer_stats_avg_query_time_microseconds
pgb::stats::<database-name>::avgRecv | Average size of client network traffic received in bytes per second. | value | ic_pgbouncer_stats_avg_recv_bytes
pgb::stats::<database-name>::avgSent | Average size of client network traffic sent in bytes per second. | value | ic_pgbouncer_stats_avg_sent_bytes
pgb::stats::<database-name>::avgWaitTime | Time spent by clients waiting for a server in microseconds (average per second). | value | ic_pgbouncer_stats_avg_wait_time_microseconds
pgb::stats::<database-name>::avgXactCount | Average transactions per second in last stat collecting period. | count | ic_pgbouncer_stats_avg_xact_count
pgb::stats::<database-name>::avgXactTime | Average transaction duration in microseconds. | value | ic_pgbouncer_stats_avg_xact_time_microseconds
If the database name or user name of connection pools contains : please escape it using
pgb::pools::<database-name>::<user-name>::clActive | Number of client connections that are linked to server connection and are able to process queries. | count | ic_pgbouncer_pools_cl_active
pgb::pools::<database-name>::<user-name>::clCancelReq | Number of client connections that have not forwarded query cancellations to the server yet. | count | ic_pgbouncer_pools_cl_cancel_req
pgb::pools::<database-name>::<user-name>::clWaiting | Number of client connections that are waiting on a server connection. | count | ic_pgbouncer_pools_cl_waiting
pgb::pools::<database-name>::<user-name>::maxWait | Current longest time (in seconds) that an unserved client connection is waiting in the pool. | value | ic_pgbouncer_pools_max_wait_seconds
pgb::pools::<database-name>::<user-name>::svActive | Number of server connections that are linked to a client connection. | count | ic_pgbouncer_pools_sv_active
pgb::pools::<database-name>::<user-name>::svIdle | Number of server connections that are idling and ready for a client query. | count | ic_pgbouncer_pools_sv_idle
pgb::pools::<database-name>::<user-name>::svLogin | Number of server connections that are currently in the process of logging in. | count | ic_pgbouncer_pools_sv_login
pgb::pools::<database-name>::<user-name>::svTested | Number of server connections that are currently running either server_reset_query or server_check_query. | count | ic_pgbouncer_pools_sv_tested
pgb::pools::<database-name>::<user-name>::svUsed | Number of server connections that are idling more than server_check_delay. | count | ic_pgbouncer_pools_sv_used
Summary metric names follow the format cads::{metricName}. Optionally, a ‘sub-type’ may be specified to return a specific part of the metric: cads::{metricName}::{subType}
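As a rough illustration of the summary naming format, the sketch below assembles a metric name from its parts. The helper name summary_metric_name is hypothetical, and using a type value such as average as the sub-type is an assumption made for the example, not something this section confirms.

```python
from typing import Optional

SEPARATOR = "::"  # separator used between the parts of a metric name


def summary_metric_name(metric_name: str, sub_type: Optional[str] = None) -> str:
    """Return cads::{metricName}, or cads::{metricName}::{subType} if a sub-type is given."""
    parts = ["cads", metric_name]
    if sub_type:
        parts.append(sub_type)
    return SEPARATOR.join(parts)


# Without a sub-type the whole metric is requested.
print(summary_metric_name("frontendV2MemoryHeapInUse"))
# With a sub-type (the value "average" is an assumed example).
print(summary_metric_name("slaV2WorkflowLatency", "average"))
```

Keeping the prefix and separator in one helper avoids hand-typed names drifting from the documented format.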
cads::frontendV2MemoryHeapInUse | The current heap memory usage of the Cadence Frontend service, in bytes. | value | ic_node_frontend_v2_memory_heap_in_use_bytes
cads::frontendV2MemoryAllocated | The current memory allocation to the Cadence Frontend service, in bytes. | value | ic_node_frontend_v2_memory_allocated_bytes
cads::matchingV2MemoryHeapInUse | The current heap memory usage of the Cadence Matching service, in bytes. | value | ic_node_matching_v2_memory_heap_in_use_bytes
cads::matchingV2MemoryAllocated | The current memory allocation to the Cadence Matching service, in bytes. | value | ic_node_matching_v2_memory_allocated_bytes
cads::historyV2MemoryHeapInUse | The current heap memory usage of the Cadence History service, in bytes. | value | ic_node_history_v2_memory_heap_in_use_bytes
cads::historyV2MemoryAllocated | The current memory allocation to the Cadence History service, in bytes. | value | ic_node_history_v2_memory_allocated_bytes
cads::workerV2MemoryHeapInUse | The current heap memory usage of the Cadence Worker service, in bytes. | value | ic_node_worker_v2_memory_heap_in_use_bytes
cads::workerV2MemoryAllocated | The current memory allocation to the Cadence Worker service, in bytes. | value | ic_node_worker_v2_memory_allocated_bytes
cads::slaV2WorkflowSuccess | Number of reported Cadence Canary workflow successes, per second. | count_per_second | ic_node_sla_v2_workflow_success
cads::slaV2WorkflowCancel | Number of reported Cadence Canary workflow cancellations, per second. | count_per_second | ic_node_sla_v2_workflow_cancel
cads::slaV2WorkflowFail | Number of reported Cadence Canary workflow failures, per second. | count_per_second | ic_node_sla_v2_workflow_fail
cads::slaV2WorkflowTimeout | Number of reported Cadence Canary workflow time-outs, per second. | count_per_second | ic_node_sla_v2_workflow_timeout
cads::slaV2WorkflowTerminate | Number of reported Cadence Canary workflow terminations, per second. | count_per_second | ic_node_sla_v2_workflow_terminate
cads::slaV2WorkflowLatency | The average end-to-end latency of the Cadence Canary workflow, in seconds. | average | ic_node_sla_v2_workflow_latency_seconds
cads::frontendV2MeanPersistenceRequestRate | Average number of persistence requests made by the Cadence Frontend service, per second. | count_per_second | ic_node_frontend_v2_mean_persistence_request_rate
cads::frontendV2MeanPersistenceErrorRate | Average number of internal errors from persistence requests made by the Cadence Frontend service, per second. | count_per_second | ic_node_frontend_v2_mean_persistence_error_rate
cads::frontendV2MeanPersistenceLatency | Average latency of persistence requests made by the Cadence Frontend service, in seconds. | average | ic_node_frontend_v2_mean_persistence_latency_seconds
cads::frontendV2MeanCadenceRequestRate | Average number of Cadence requests made to the Cadence Frontend service, per second. | count_per_second | ic_node_frontend_v2_mean_cadence_request_rate
cads::frontendV2MeanCadenceErrorRate | Average number of internal errors from Cadence requests made to the Cadence Frontend service, per second. | count_per_second | ic_node_frontend_v2_mean_cadence_error_rate
cads::frontendV2MeanCadenceLatency | Average latency of Cadence requests made to the Cadence Frontend service, in seconds. | average | ic_node_frontend_v2_mean_cadence_latency_seconds
cads::syncMatchV2Latency | Average synchronous match latency of the Cadence Matching service, in seconds. | average | ic_node_sync_match_v2_latency_seconds
cads::asyncMatchV2Latency | Average asynchronous match latency of the Cadence Matching service, in seconds. | average | ic_node_async_match_v2_latency_seconds
cads::matchingV2MeanPersistenceRequestRate | Average number of persistence requests made by the Cadence Matching service, per second. | count_per_second | ic_node_matching_v2_mean_persistence_request_rate
cads::matchingV2MeanPersistenceErrorRate | Average number of internal errors from persistence requests made by the Cadence Matching service, per second. | count_per_second | ic_node_matching_v2_mean_persistence_error_rate
cads::matchingV2MeanPersistenceLatency | Average latency of persistence requests made by the Cadence Matching service, in seconds. | average | ic_node_matching_v2_mean_persistence_latency_seconds
cads::matchingV2MeanCadenceRequestRate | Average number of Cadence requests made to the Cadence Matching service, per second. | count_per_second | ic_node_matching_v2_mean_cadence_request_rate
cads::matchingV2MeanCadenceErrorRate | Average number of internal errors from Cadence requests made to the Cadence Matching service, per second. | count_per_second | ic_node_matching_v2_mean_cadence_error_rate
cads::matchingV2MeanCadenceLatency | Average latency of Cadence requests made to the Cadence Matching service, in seconds. | average | ic_node_matching_v2_mean_cadence_latency_seconds
cads::historyV2MeanCadenceRequestRate | Average number of Cadence requests made to the Cadence History service, per second. | count_per_second | ic_node_history_v2_mean_cadence_request_rate
cads::historyV2MeanCadenceErrorRate | Average number of internal errors from Cadence requests made to the Cadence History service, per second. | count_per_second | ic_node_history_v2_mean_cadence_error_rate
cads::historyV2MeanCadenceLatency | Average latency of Cadence requests made to the Cadence History service, in seconds. | average | ic_node_history_v2_mean_cadence_latency_seconds
cads::historyV2MeanPersistenceRequestRate | Average number of persistence requests made by the Cadence History service, per second. | count_per_second | ic_node_history_v2_mean_persistence_request_rate
cads::historyV2MeanPersistenceErrorRate | Average number of internal errors from persistence requests made by the Cadence History service, per second. | count_per_second | ic_node_history_v2_mean_persistence_error_rate
cads::historyV2MeanPersistenceLatency | Average latency of persistence requests made by the Cadence History service, in seconds. | average | ic_node_history_v2_mean_persistence_latency_seconds
cads::historyV2MeanTaskRequestRate | Average number of task requests to the Cadence History service, per second. | count_per_second | ic_node_history_v2_mean_task_request_rate
cads::historyV2MeanTaskErrorRate | Average number of errors from task requests to the Cadence History service, per second. | count_per_second | ic_node_history_v2_mean_task_error_rate
cads::historyV2MeanTaskLatency | Average execution latency of tasks in the Cadence History service, in seconds. | average | ic_node_history_v2_mean_task_latency_seconds
cads::historyV2MeanTaskLatencyQueue | Average queue latency of tasks in the Cadence History service, in seconds. | average | ic_node_history_v2_mean_task_latency_queue_seconds
cads::historyV2MeanTaskLatencyProcessing | Average processing latency of tasks in the Cadence History service, in seconds. | average | ic_node_history_v2_mean_task_latency_processing_seconds
cads::historyV2MeanWorkflowSuccess | Average number of successful workflows, per second. | count_per_second | ic_node_history_v2_mean_workflow_success
cads::historyV2MeanWorkflowCancel | Average number of cancelled workflows, per second. | count_per_second | ic_node_history_v2_mean_workflow_cancel
cads::historyV2MeanWorkflowFailed | Average number of failed workflows, per second. | count_per_second | ic_node_history_v2_mean_workflow_failed
cads::historyV2MeanWorkflowTimeout | Average number of timed out workflows, per second. | count_per_second | ic_node_history_v2_mean_workflow_timeout
cads::historyV2MeanWorkflowTerminate | Average number of terminated workflows, per second. | count_per_second | ic_node_history_v2_mean_workflow_terminate
cads::historyV2MeanReplicationTasksApplied | Average number of successfully applied replication tasks in the Cadence History service. | count_per_second | ic_node_history_v2_mean_replication_tasks_applied
cads::historyV2MeanReplicationTasksAppliedLatency | Average latency from replication tasks being received to them being applied in the Cadence History service, in seconds. | average | ic_node_history_v2_mean_replication_tasks_applied_latency_seconds
cads::historyV2MeanReplicationTaskLatency | Average latency from replication tasks being created to them being applied in the Cadence History service, in seconds. | average | ic_node_history_v2_mean_replication_task_latency_seconds
cads::historyV2MeanReplicationTaskCleanupCount | Average number of replication tasks cleaned up after being acknowledged by the standby Cadence clusters in the Cadence History service. | count_per_second | ic_node_history_v2_mean_replication_task_cleanup_count
cads::historyV2MeanReplicationTaskCleanupFailed | Average number of replication tasks that failed to be cleaned up after being acknowledged by the standby Cadence clusters in the Cadence History service. | count_per_second | ic_node_history_v2_mean_replication_task_cleanup_failed
cads::historyV2ReplicationDlqSize | Size of the DLQ of replication tasks that could not be applied after retry in the Cadence History service. | value | ic_node_history_v2_replication_dlq_size
cads::historyV2MeanReplicationDlqEnqueueFailed | Average number of replication tasks that could not be applied after retry and failed to be put into the DLQ in the Cadence History service. | count_per_second | ic_node_history_v2_mean_replication_dlq_enqueue_failed
cads::workerV2MeanPersistenceRequestRate | Average number of persistence requests made by the Cadence Worker service, per second. | count_per_second | ic_node_worker_v2_mean_persistence_request_rate
cads::workerV2MeanPersistenceErrorRate | Average number of internal errors from persistence requests made by the Cadence Worker service, per second. | count_per_second | ic_node_worker_v2_mean_persistence_error_rate
cads::workerV2MeanPersistenceLatency | Average latency of persistence requests made by the Cadence Worker service, in seconds. | average | ic_node_worker_v2_mean_persistence_latency_seconds
Tag-level metric names follow the format cadt::{tag}::{metricName}. Optionally, a ‘sub-type’ may be specified to return a specific part of the metric: cadt::{tag}::{metricName}::{subType}
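To make the tag-level format concrete, here is a minimal sketch that splits a tag-level metric name back into its parts. The names TagMetric and parse_tag_metric and the tag myTag are hypothetical, and the sketch assumes that neither the tag nor the metric name contains the :: separator; using 95thPercentile as a sub-type is likewise an assumption.

```python
from typing import NamedTuple, Optional


class TagMetric(NamedTuple):
    tag: str
    metric_name: str
    sub_type: Optional[str]


def parse_tag_metric(name: str) -> TagMetric:
    """Split cadt::{tag}::{metricName}[::{subType}] into its components.

    Assumes the tag and metric name do not themselves contain '::'.
    """
    parts = name.split("::")
    if parts[0] != "cadt" or len(parts) not in (3, 4):
        raise ValueError(f"not a tag-level metric name: {name!r}")
    sub_type = parts[3] if len(parts) == 4 else None
    return TagMetric(tag=parts[1], metric_name=parts[2], sub_type=sub_type)


# "myTag" is a made-up tag used only for illustration.
print(parse_tag_metric("cadt::myTag::historyV2TaskLatency"))
print(parse_tag_metric("cadt::myTag::historyV2TaskLatency::95thPercentile"))
```

Rejecting names with an unexpected prefix or part count keeps a summary-level cads:: name from being silently misread as a tag-level one.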
cadt::{tag}::frontendV2PersistenceRequestRate | Number of persistence requests made by the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_persistence_request_rate
cadt::{tag}::frontendV2PersistenceErrorRate | Number of internal errors from persistence requests made by the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_persistence_error_rate
cadt::{tag}::frontendV2PersistenceLatency | Latency of persistence requests made by the Cadence Frontend service, per operation, in seconds. | 95thPercentile | ic_cadence_frontend_v2_persistence_latency_seconds | 50thPercentile | ic_cadence_frontend_v2_persistence_latency_seconds
cadt::{tag}::frontendV2CadenceRequestRate | Number of Cadence requests made to the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_cadence_request_rate
cadt::{tag}::frontendV2CadenceErrorRate | Number of internal errors from Cadence requests made to the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_cadence_error_rate
cadt::{tag}::frontendV2CadenceClientBadRequestErrorRate | Number of client-side errors (bad request) from Cadence requests made to the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_cadence_client_bad_request_error_rate
cadt::{tag}::frontendV2CadenceClientServiceBusyErrorRate | Number of client-side errors (service busy) from Cadence requests made to the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_cadence_client_service_busy_error_rate
cadt::{tag}::frontendV2CadenceClientCriticalErrorRate | Number of client-side errors (critical) from Cadence requests made to the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_cadence_client_critical_error_rate
cadt::{tag}::frontendV2CadenceClientQueryFailedErrorRate | Number of client-side errors (query failed) from Cadence requests made to the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_cadence_client_query_failed_error_rate
cadt::{tag}::frontendV2CadenceClientLimitExceededErrorRate | Number of client-side errors (limit exceeded) from Cadence requests made to the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_cadence_client_limit_exceeded_error_rate
cadt::{tag}::frontendV2CadenceClientContextTimeoutErrorRate | Number of client-side errors (context timeout) from Cadence requests made to the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_cadence_client_context_timeout_error_rate
cadt::{tag}::frontendV2CadenceClientRetryTaskErrorRate | Number of client-side errors (retry task) from Cadence requests made to the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_cadence_client_retry_task_error_rate
cadt::{tag}::frontendV2CadenceLatency | Latency of Cadence requests made to the Cadence Frontend service, per operation, in seconds. | 95thPercentile | ic_cadence_frontend_v2_cadence_latency_seconds | 50thPercentile | ic_cadence_frontend_v2_cadence_latency_seconds
cadt::{tag}::matchingV2CadenceRequestRate | Number of Cadence requests made to the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_cadence_request_rate
cadt::{tag}::matchingV2CadenceErrorRate | Number of internal errors from Cadence requests made to the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_cadence_error_rate
cadt::{tag}::matchingV2CadenceLatency | Latency of Cadence requests made to the Cadence Matching service, per operation, in seconds. | 95thPercentile | ic_cadence_matching_v2_cadence_latency_seconds | 50thPercentile | ic_cadence_matching_v2_cadence_latency_seconds
cadt::{tag}::matchingV2CadenceClientBadRequestErrorRate | Number of client-side errors (bad request) from Cadence requests made to the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_cadence_client_bad_request_error_rate
cadt::{tag}::matchingV2CadenceClientServiceBusyErrorRate | Number of client-side errors (service busy) from Cadence requests made to the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_cadence_client_service_busy_error_rate
cadt::{tag}::matchingV2CadenceClientCriticalErrorRate | Number of client-side errors (critical) from Cadence requests made to the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_cadence_client_critical_error_rate
cadt::{tag}::matchingV2CadenceClientQueryFailedErrorRate | Number of client-side errors (query failed) from Cadence requests made to the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_cadence_client_query_failed_error_rate
cadt::{tag}::matchingV2CadenceClientLimitExceededErrorRate | Number of client-side errors (limit exceeded) from Cadence requests made to the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_cadence_client_limit_exceeded_error_rate
cadt::{tag}::matchingV2CadenceClientContextTimeoutErrorRate | Number of client-side errors (context timeout) from Cadence requests made to the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_cadence_client_context_timeout_error_rate
cadt::{tag}::matchingV2CadenceClientRetryTaskErrorRate | Number of client-side errors (retry task) from Cadence requests made to the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_cadence_client_retry_task_error_rate
cadt::{tag}::matchingV2SyncMatchLatency | The synchronous match latency of the Cadence Matching service, per operation, in seconds. | 95thPercentile | ic_cadence_matching_v2_sync_match_latency_seconds | 50thPercentile | ic_cadence_matching_v2_sync_match_latency_seconds
cadt::{tag}::matchingV2AsyncMatchLatency | The asynchronous match latency of the Cadence Matching service, per operation, in seconds. | 95thPercentile | ic_cadence_matching_v2_async_match_latency_seconds | 50thPercentile | ic_cadence_matching_v2_async_match_latency_seconds
cadt::{tag}::matchingV2PersistenceRequestRate | Number of persistence requests made by the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_persistence_request_rate
cadt::{tag}::matchingV2PersistenceErrorRate | Number of internal errors from persistence requests made by the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_persistence_error_rate
cadt::{tag}::matchingV2PersistenceLatency | Latency of persistence requests made by the Cadence Matching service, per operation, in seconds. | 95thPercentile | ic_cadence_matching_v2_persistence_latency_seconds | 50thPercentile | ic_cadence_matching_v2_persistence_latency_seconds
cadt::{tag}::historyV2CadenceRequestRate | Number of Cadence requests made to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_cadence_request_rate
cadt::{tag}::historyV2CadenceErrorRate | Number of internal errors from Cadence requests made to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_cadence_error_rate
cadt::{tag}::historyV2CadenceLatency | Latency of Cadence requests made to the Cadence History service, per operation, in seconds. | 95thPercentile | ic_cadence_history_v2_cadence_latency_seconds | 50thPercentile | ic_cadence_history_v2_cadence_latency_seconds
cadt::{tag}::historyV2CadenceClientBadRequestErrorRate | Number of client-side errors (bad request) from Cadence requests made to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_cadence_client_bad_request_error_rate
cadt::{tag}::historyV2CadenceClientServiceBusyErrorRate | Number of client-side errors (service busy) from Cadence requests made to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_cadence_client_service_busy_error_rate
cadt::{tag}::historyV2CadenceClientCriticalErrorRate | Number of client-side errors (critical) from Cadence requests made to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_cadence_client_critical_error_rate
cadt::{tag}::historyV2CadenceClientQueryFailedErrorRate | Number of client-side errors (query failed) from Cadence requests made to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_cadence_client_query_failed_error_rate
cadt::{tag}::historyV2CadenceClientLimitExceededErrorRate | Number of client-side errors (limit exceeded) from Cadence requests made to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_cadence_client_limit_exceeded_error_rate
cadt::{tag}::historyV2CadenceClientContextTimeoutErrorRate | Number of client-side errors (context timeout) from Cadence requests made to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_cadence_client_context_timeout_error_rate
cadt::{tag}::historyV2CadenceClientRetryTaskErrorRate | Number of client-side errors (retry task) from Cadence requests made to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_cadence_client_retry_task_error_rate
cadt::{tag}::historyV2PersistenceRequestRate | Number of persistence requests made by the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_persistence_request_rate
cadt::{tag}::historyV2PersistenceErrorRate | Number of internal errors from persistence requests made by the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_persistence_error_rate
cadt::{tag}::historyV2PersistenceLatency | Latency of persistence requests made by the Cadence History service, per operation, in seconds. | 95thPercentile | ic_cadence_history_v2_persistence_latency_seconds | 50thPercentile | ic_cadence_history_v2_persistence_latency_seconds
cadt::{tag}::historyV2TaskRequestRate | Number of task requests to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_task_request_rate
cadt::{tag}::historyV2TaskErrorRate | Number of errors from task requests to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_task_error_rate
cadt::{tag}::historyV2TaskLatency | Execution latency of tasks in the Cadence History service, per operation, in seconds. | 95thPercentile | ic_cadence_history_v2_task_latency_seconds | 50thPercentile | ic_cadence_history_v2_task_latency_seconds
cadt::{tag}::historyV2TaskLatencyQueue | End-to-end latency of tasks in the Cadence History service, per operation, in seconds. | 95thPercentile | ic_cadence_history_v2_task_latency_queue_seconds | 50thPercentile | ic_cadence_history_v2_task_latency_queue_seconds
cadt::{tag}::historyV2TaskLatencyProcessing | Processing latency of tasks in the Cadence History service, per operation, in seconds. | 95thPercentile | ic_cadence_history_v2_task_latency_processing_seconds | 50thPercentile | ic_cadence_history_v2_task_latency_processing_seconds
cadt::{tag}::historyV2WorkflowSuccess | Number of successful workflows, per operation, per second. | count_per_second | ic_cadence_history_v2_workflow_success
cadt::{tag}::historyV2WorkflowCancel | Number of cancelled workflows, per operation, per second. | count_per_second | ic_cadence_history_v2_workflow_cancel
cadt::{tag}::historyV2WorkflowFailed | Number of failed workflows, per operation, per second. | count_per_second | ic_cadence_history_v2_workflow_failed
cadt::{tag}::historyV2WorkflowTimeout | Number of timed out workflows, per operation, per second. | count_per_second | ic_cadence_history_v2_workflow_timeout
cadt::{tag}::historyV2WorkflowTerminate | Number of terminated workflows, per operation, per second. | count_per_second | ic_cadence_history_v2_workflow_terminate
cadt::{tag}::historyV2WorkflowFailedCount | Count of failed workflows. | value | ic_cadence_history_v2_workflow_failed_count
cadt::{tag}::historyV2ReplicationTasksApplied | Average number of successfully applied replication tasks in the Cadence History service, per operation. | count_per_second | ic_cadence_history_v2_replication_tasks_applied
cadt::{tag}::historyV2ReplicationTasksAppliedPerDomain | Average number of successfully applied replication tasks in the Cadence History service, per domain. | count_per_second | ic_cadence_history_v2_replication_tasks_applied_per_domain
cadt::{tag}::historyV2ReplicationTasksAppliedLatency | Latency from replication tasks being received to them being applied in the Cadence History service, in seconds. | 95thPercentile | ic_cadence_history_v2_replication_tasks_applied_latency_seconds | 50thPercentile | ic_cadence_history_v2_replication_tasks_applied_latency_seconds
cadt::{tag}::historyV2ReplicationTaskLatency | Latency from replication tasks being created to them being applied in the Cadence History service, in seconds. | 95thPercentile | ic_cadence_history_v2_replication_task_latency_seconds | 50thPercentile | ic_cadence_history_v2_replication_task_latency_seconds
cadt::{tag}::historyV2ReplicationTaskCleanupCount | Average number of replication tasks cleaned up after being acknowledged by the standby Cadence clusters in the Cadence History service, per operation. | count_per_second | ic_cadence_history_v2_replication_task_cleanup_count
cadt::{tag}::historyV2ReplicationTaskCleanupFailed | Average number of replication tasks that failed to be cleaned up after being acknowledged by the standby Cadence clusters in the Cadence History service, per operation. | count_per_second | ic_cadence_history_v2_replication_task_cleanup_failed
cadt::{tag}::historyV2ReplicationDlqSize | Size of the DLQ of replication tasks that could not be applied after retry in the Cadence History service, per operation. | value | ic_cadence_history_v2_replication_dlq_size
cadt::{tag}::historyV2ReplicationDlqEnqueueFailed | Average number of replication tasks that could not be applied after retry and failed to be put into the DLQ in the Cadence History service, per operation. | count_per_second | ic_cadence_history_v2_replication_dlq_enqueue_failed
cadt::{tag}::workerV2PersistenceRequestRate | Number of persistence requests made by the Cadence Worker service, per operation, per second. | count_per_second | ic_cadence_worker_v2_persistence_request_rate
cadt::{tag}::workerV2PersistenceErrorRate | Number of internal errors from persistence requests made by the Cadence Worker service, per operation, per second. | count_per_second | ic_cadence_worker_v2_persistence_error_rate
cadt::{tag}::workerV2PersistenceLatency | Latency of persistence requests made by the Cadence Worker service, per operation, in seconds. | 95thPercentile | ic_cadence_worker_v2_persistence_latency_seconds | 50thPercentile | ic_cadence_worker_v2_persistence_latency_seconds
clk::slaAvgWriteLatency | Average write latency for 20 writes. | value | ic_node_sla_avg_write_latency
clk::slaAvgReadLatency | Average read latency for 20 reads. | value | ic_node_sla_avg_read_latency
clk::slaWriteErrors | Number of write request errors. | value | ic_node_sla_write_errors
clk::slaReadErrors | Number of read request errors. | value | ic_node_sla_read_errors
clk::slaKeeperErrors | Number of ClickHouse Keeper errors. | value | ic_node_sla_keeper_errors
clk::rwLockWaitingReaders | Number of threads waiting for read on a table RWLock. | value | ic_node_rw_lock_waiting_readers
clk::rwLockWaitingWriters | Number of threads waiting for write on a table RWLock. | value | ic_node_rw_lock_waiting_writers
clk::merge | Number of executing background merges. | value | ic_node_merge
clk::readonlyReplica | Number of Replicated tables that are currently in readonly state due to re-initialization after ZooKeeper session loss or due to startup without ZooKeeper configured. | value | ic_node_readonly_replica
clk::query | Number of executing queries. | value | ic_node_query
clk::delayedInserts | Number of INSERT queries that are throttled due to a high number of active data parts for a partition in a MergeTree table. | value | ic_node_delayed_inserts
clk::s3Requests | Number of S3 requests. | value | ic_node_s3_requests
clk::distributedFilesToInsert | Number of pending files to process for asynchronous insertion into Distributed tables. | value | ic_node_distributed_files_to_insert
clk::keeperOutstandingRequests | Number of outstanding ClickHouse Keeper requests. | value | ic_node_keeper_outstanding_requests
clk::insertQueriesPerSecond | Average number of insert queries per second over the last one minute. | value | ic_node_insert_queries_per_second
clk::httpConnection | Number of connections to HTTP server. | value | ic_node_http_connection
clk::totalRows | The total number of rows for all active parts. | value | ic_node_total_rows
clk::pendingAsyncInsert | Number of asynchronous inserts waiting to be flushed. | value | ic_node_pending_async_insert
clk::osOpenFiles | The total number of opened files on the host machine. This is a system-wide metric: it includes all the processes on the host machine, not just clickhouse-server. | value | ic_node_os_open_files
clk::mergesInQueue | The total number of merge operations that are waiting in queue. | value | ic_node_merges_in_queue
clk::maxInactiveParts | The maximum number of inactive parts. | value | ic_node_max_inactive_parts
clk::znodeCount | The number of znodes in ClickHouse Keeper process. | value | ic_node_znode_count
clk::totalPartsOfMergeTreeTables | Total amount of data parts in all tables of MergeTree family. Numbers larger than 10 000 will negatively affect the server startup time and may indicate an unreasonable choice of the partition key. | value | ic_node_total_parts_of_merge_tree_tables
clk::totalRowsOfMergeTreeTables | Total amount of rows (records) stored in all tables of MergeTree family. | value | ic_node_total_rows_of_merge_tree_tables
clk::maxPartCountForPartition | Maximum number of parts per partition across all partitions of all tables of MergeTree family. Values larger than 300 indicate misconfiguration, overload, or massive data loading. | value | ic_node_max_part_count_for_partition
clk::replicasMaxAbsoluteDelay | Maximum difference in seconds between the most fresh replicated part and the most fresh data part still to be replicated, across Replicated tables. A very high value indicates a replica with no data. | value | ic_node_replicas_max_absolute_delay
clk::remoteStorageUsage | Total amount of data stored in remote storage (such as AWS S3), in GiB. | value | ic_node_remote_storage_usage
clk::markCacheBytes | Total size of mark cache in bytes. | value | ic_node_mark_cache_bytes
clk::markCacheHits | Number of times an entry has been found in the mark cache, so we didn't have to load a mark file. | value | ic_node_mark_cache_hits
clk::markCacheMisses | Number of times an entry has not been found in the mark cache, so we had to load a mark file in memory, which is a costly operation, adding to query latency. | value | ic_node_mark_cache_misses
clk::queryCacheBytes | Total size of the query cache in bytes. | value | ic_node_query_cache_bytes
clk::queryCacheHits | Number of times a query result has been found in the query cache (and query computation was avoided). Only updated for SELECT queries with SETTING use_query_cache = 1. | value | ic_node_query_cache_hits
clk::queryCacheMisses | Number of times a query result has not been found in the query cache (and required query computation).
Only updated for SELECT queries with SETTING use_query_cache = 1.value ic_node_query_cache_misses clk::uncompressedCacheBytes Total size of uncompressed cache in bytes. Uncompressed cache does not usually improve the performance and should be mostly avoided.value ic_node_uncompressed_cache_bytes clk::uncompressedCacheHits Number of times an entry has been found in the uncompressed cache.value ic_node_uncompressed_cache_hits clk::uncompressedCacheMisses Number of times an entry has not been found in the uncompressed cache.value ic_node_uncompressed_cache_misses Successfully retrieved monitoring results of metrics set.
Bad Request
Not Authorized
Forbidden
Resource not found
Unsupported media type: returned when the payload is in an unsupported format.
Too many requests: returned when more than 35 requests per second are being received by your user.
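As a rough illustration of consuming this kind of response, the Python sketch below picks the most recent value of each series for every node. The function name `latest_values` and the trimmed `sample` are illustrative only and not part of the API; the field names are taken from the sample response for this endpoint.

```python
import json


def latest_values(nodes):
    """Map each node's private IP to {(metric, topic, type): latest value}."""
    result = {}
    for node in nodes:
        per_node = {}
        for series in node.get("payload", []):
            values = series.get("values", [])
            if not values:
                continue
            # Timestamps share one ISO-8601 UTC format, so string order
            # matches chronological order.
            newest = max(values, key=lambda v: v["time"])
            key = (series["metric"], series.get("topic"), series["type"])
            # Values arrive as strings in the sample, so convert explicitly.
            per_node[key] = float(newest["value"])
        result[node["privateIp"]] = per_node
    return result


# Trimmed copy of one node from the sample response (illustrative).
sample = json.loads("""
[{"id": "694294d9-ea82-49c2-9f71-aacac81f0325",
  "payload": [{"metric": "messagesInPerTopic", "topic": "instaclustr-sla",
               "type": "mean_rate", "unit": "1",
               "values": [{"time": "2017-01-04T04:19:28.000Z",
                           "value": "1.5051724911338817"}]}],
  "privateIp": "10.0.0.1"}]
""")

print(latest_values(sample))
```

Keying on the private IP is a design choice for the sketch; the node `id` would work equally well.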
Broker Level Per-Topic Metrics (Cluster)
[
  {
    "id": "694294d9-ea82-49c2-9f71-aacac81f0325",
    "payload": [
      {
        "metric": "messagesInPerTopic",
        "topic": "instaclustr-sla",
        "type": "mean_rate",
        "unit": "1",
        "values": [
          {
            "time": "2017-01-04T04:19:28.000Z",
            "value": "1.5051724911338817"
          }
        ]
      }
    ],
    "privateIp": "10.0.0.1",
    "publicIp": "123.123.123.123",
    "rack": {
      "dataCentre": {
        "displayName": "AWS_VPC_US_EAST_1",
        "name": "US_EAST_1",
        "provider": "AWS_VPC",
        "uuid": null
      },
      "name": "us-east-1a",
      "providerAccount": {
        "name": "INSTACLUSTR",
        "provider": "AWS_VPC"
      }
    }
  },
  {
    "id": "4d848f48-5e24-41d6-81f2-44c2f578895f",
    "payload": [
      {
        "metric": "messagesInPerTopic",
        "topic": "instaclustr-sla",
        "type": "mean_rate",
        "unit": "1",
        "values": [
          {
            "time": "2017-01-04T04:19:28.000Z",
            "value": "1.4515722583651829"
          }
        ]
      }
    ],
    "privateIp": "10.0.0.2",
    "publicIp": "123.123.123.124",
    "rack": {
      "dataCentre": {
        "displayName": "AWS_VPC_US_EAST_1",
        "name": "US_EAST_1",
        "provider": "AWS_VPC",
        "uuid": null
      },
      "name": "us-east-1b",
      "providerAccount": {
        "name": "INSTACLUSTR",
        "provider": "AWS_VPC"
      }
    }
  },
  {
    "id": "3bccad4b-087b-471d-8f24-0452edb86bf1",
    "payload": [
      {
        "metric": "messagesInPerTopic",
        "topic": "instaclustr-sla",
        "type": "mean_rate",
        "unit": "1",
        "values": [
          {
            "time": "2017-01-04T04:19:28.000Z",
            "value": "1.4708695545998745"
          }
        ]
      }
    ],
    "privateIp": "10.0.0.3",
    "publicIp": "123.123.123.125",
    "rack": {
      "dataCentre": {
        "displayName": "AWS_VPC_US_EAST_1",
        "name": "US_EAST_1",
        "provider": "AWS_VPC",
        "uuid": null
      },
      "name": "us-east-1c",
      "providerAccount": {
        "name": "INSTACLUSTR",
        "provider": "AWS_VPC"
      }
    }
  }
]
You can use this endpoint to list all the Cadence domains on the specified cluster.
Successfully retrieved the cluster's Cadence domains.
Bad Request
Not Authorized
Forbidden
Resource not found
Unsupported media type: returned when the payload is in an unsupported format.
Too many requests: returned when more than 35 requests per second are being received by your user.
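Since every endpoint here rejects traffic above 35 requests per second, a client may want to pace its own calls. The following is a minimal Python sketch of one way to do that by spacing requests evenly; the class name `RequestThrottle` is hypothetical and nothing in it comes from the API itself.

```python
import time


class RequestThrottle:
    """Space calls at least 1/max_per_second seconds apart."""

    def __init__(self, max_per_second=35, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock          # injectable so the class can be tested
        self.sleep = sleep
        self.next_allowed = None    # earliest time the next call may start

    def wait(self):
        """Block until the next request is allowed, then reserve its slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval


# Usage: create one throttle per API user and call wait() before each request.
throttle = RequestThrottle()
throttle.wait()  # first call returns immediately
```

Choosing a `max_per_second` somewhat below 35 leaves headroom for other clients that share the same user.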
[
  "cadence_canary",
  "sample_domain"
]
You can use this endpoint to list all the Cadence tags on the specified cluster.
Successfully retrieved the cluster's Cadence tags.
Bad Request
Not Authorized
Forbidden
Resource not found
Unsupported media type: returned when the payload is in an unsupported format.
Too many requests: returned when more than 35 requests per second are being received by your user.
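The tags returned here pair naturally with the `cadt::{tag}::...` metric names listed for the metrics endpoints. The Python sketch below expands a tags response into candidate metric names. It assumes that each tag string is substituted verbatim for `{tag}`, which this page does not state explicitly; `cadence_metric_names` is a hypothetical helper name.

```python
def cadence_metric_names(tags_by_metric):
    """Expand {metric: [tag, ...]} into 'cadt::{tag}::{metric}' strings.

    Assumption: the tag is inserted exactly as returned by the tags endpoint.
    """
    names = []
    for metric, tags in tags_by_metric.items():
        for tag in tags:
            names.append(f"cadt::{tag}::{metric}")
    return names


# Shape taken from the sample response for this endpoint.
sample_tags = {
    "historyV2TaskLatency": [
        "domain=cadence_canary;operation=TimerActiveTaskUserTimer",
        "domain=cadence_canary;operation=TransferActiveTaskCloseExecution",
    ],
    "matchingV2CadenceLatency": [
        "operation=PollForDecisionTask",
        "operation=AddDecisionTask",
        "operation=AddActivityTask",
    ],
}

for name in cadence_metric_names(sample_tags):
    print(name)
```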
{
  "historyV2TaskLatency": [
    "domain=cadence_canary;operation=TimerActiveTaskUserTimer",
    "domain=cadence_canary;operation=TransferActiveTaskCloseExecution"
  ],
  "matchingV2CadenceLatency": [
    "operation=PollForDecisionTask",
    "operation=AddDecisionTask",
    "operation=AddActivityTask"
  ]
}
By making a GET request to this endpoint with cluster ID, you can get a list of monitored tables, grouped by keyspace.
Successfully retrieved a list of monitored tables. Return type: Map<String, List<String>>
Bad Request
Not Authorized
Forbidden
Resource not found
Unsupported media type: returned when the payload is in an unsupported format.
Too many requests: returned when more than 35 requests per second are being received by your user.
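Because table-level metrics are named `cf::{keyspace}::{table}::...`, the keyspace-to-tables map returned by this endpoint can be turned into metric names directly. A small Python sketch follows, with `table_metric_names` as a hypothetical helper and `reads` as the example metric suffix.

```python
def table_metric_names(tables_by_keyspace, metric):
    """Build 'cf::{keyspace}::{table}::{metric}' for every monitored table."""
    return [
        f"cf::{keyspace}::{table}::{metric}"
        for keyspace, tables in tables_by_keyspace.items()
        for table in tables
    ]


# Shape taken from the sample response for this endpoint.
sample_tables = {
    "keyspace1": ["standard1", "counter1", "Counter3"],
    "keyspace2": ["table2", "table1"],
}

for name in table_metric_names(sample_tables, "reads"):
    print(name)
```

Table names are kept in their original case (for example `Counter3`), on the assumption that names are case-sensitive.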
{
  "keyspace1": [
    "standard1",
    "counter1",
    "Counter3"
  ],
  "keyspace2": [
    "table2",
    "table1"
  ]
}
By making a GET request to this endpoint with cluster ID, you can get a list of monitored indices.
Successfully retrieved a list of monitored indices.
Bad Request
Not Authorized
Forbidden
Resource not found
Unsupported media type: returned when the payload is in an unsupported format.
Too many requests: returned when more than 35 requests per second are being received by your user.
[
  "test_index_01",
  "test_index_02",
  "test_index_03"
]
Cluster Health Indicator API provides a summary of indicators on the long-term health of your cluster. A detailed description of cluster health indicators can be found in this support article: https://www.instaclustr.com/support/documentation/monitoring-information/cluster-health-check/
Successfully retrieved cluster health indicators.
Bad Request
Not Authorized
Forbidden
Resource not found
Unsupported media type: returned when the payload is in an unsupported format.
Too many requests: returned when more than 35 requests per second are being received by your user.
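A response from this endpoint groups nodes by state under each indicator type. The Python sketch below reduces that to a count of nodes per state. Only the `PASS` state appears in the sample response for this endpoint, so the code treats state names as opaque keys rather than assuming any other values; `state_counts` is a hypothetical helper name.

```python
def state_counts(indicators):
    """Return {indicator type: {state: number of nodes in that state}}."""
    return {
        indicator["type"]: {
            state: len(nodes)
            for state, nodes in indicator["stateDetails"].items()
        }
        for indicator in indicators
    }


# Shape taken from the sample response for this endpoint.
sample_health = [
    {
        "type": "DISK_USAGE",
        "stateDetails": {
            "PASS": [
                {"message": "", "privateIp": "10.224.145.126", "publicIp": "52.5.37.217"},
                {"message": "", "privateIp": "10.224.80.183", "publicIp": "34.232.115.13"},
                {"message": "", "privateIp": "10.224.9.122", "publicIp": "34.233.151.239"},
            ]
        },
    }
]

print(state_counts(sample_health))
```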
[
  {
    "type": "DISK_USAGE",
    "stateDetails": {
      "PASS": [
        {
          "message": "",
          "privateIp": "10.224.145.126",
          "publicIp": "52.5.37.217"
        },
        {
          "message": "",
          "privateIp": "10.224.80.183",
          "publicIp": "34.232.115.13"
        },
        {
          "message": "",
          "privateIp": "10.224.9.122",
          "publicIp": "34.233.151.239"
        }
      ]
    }
  }
]
All metrics are reported under a consumer group and the consumed topic aggregated at a client level. A client within a consumer group is a logical grouping defined by setting the client.id configuration on a consumer.
Available Metrics:
consumerLag: defined as the sum of consumer lag reported by all consumers with the same client id.
partitionCount: defined as the total number of partitions assigned to consumers with the same client id.
consumerCount: defined as the total number of consumers with the same client id.
Successfully retrieved consumer group client metrics.
Bad Request
Not Authorized
Forbidden
Resource not found
Unsupported media type: returned when the payload is in an unsupported format.
Too many requests: returned when more than 35 requests per second are being received by your user.
JSON response (no 'format' query parameter specified)
[
  {
    "clientID": "client-2",
    "consumerGroup": "group-20",
    "payload": [
      {
        "metric": "consumerLag",
        "type": "count",
        "unit": "messages",
        "values": [
          {
            "time": "2019-09-17T11:38:59.000Z",
            "value": "30.0"
          }
        ]
      },
      {
        "metric": "consumerCount",
        "type": "count",
        "unit": "consumers",
        "values": [
          {
            "time": "2019-09-17T11:38:59.000Z",
            "value": "1.0"
          }
        ]
      }
    ],
    "topic": "test1"
  }
]
All metrics are reported under a consumer group and the consumed topic aggregated at a group level.
consumerGroupLag: defined as the sum of consumer lag reported by all consumers within the consumer group.
clientCount: defined as the total number of unique clients within the consumer group.
Successfully retrieved consumer group metrics.
Bad Request
Not Authorized
Forbidden
Resource not found
Unsupported media type: returned when the payload is in an unsupported format.
Too many requests: returned when more than 35 requests per second are being received by your user.
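Going by the definitions of consumerLag (per client) and consumerGroupLag (per group), summing the per-client lag for a group and topic should give a figure comparable to the group-level lag. The Python sketch below shows that aggregation over a client-level response. The `client-1` entry and its value are invented for illustration, and `lag_by_group_and_topic` is a hypothetical helper name.

```python
from collections import defaultdict


def lag_by_group_and_topic(client_entries):
    """Sum the latest consumerLag of every client per (consumerGroup, topic)."""
    totals = defaultdict(float)
    for entry in client_entries:
        for series in entry["payload"]:
            if series["metric"] != "consumerLag" or not series["values"]:
                continue
            # Same-format ISO-8601 timestamps sort chronologically as strings.
            newest = max(series["values"], key=lambda v: v["time"])
            totals[(entry["consumerGroup"], entry["topic"])] += float(newest["value"])
    return dict(totals)


def _entry(client_id, lag):
    """Build one client-level entry in the documented response shape."""
    return {
        "clientID": client_id,
        "consumerGroup": "group-20",
        "topic": "test1",
        "payload": [{
            "metric": "consumerLag", "type": "count", "unit": "messages",
            "values": [{"time": "2019-09-17T11:38:59.000Z", "value": lag}],
        }],
    }


# client-2 mirrors the documented sample; client-1 is made up.
sample_clients = [_entry("client-1", "12.0"), _entry("client-2", "30.0")]

print(lag_by_group_and_topic(sample_clients))
```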
JSON response (no 'format' query parameter specified)
[
  {
    "consumerGroup": "group-20",
    "payload": [
      {
        "metric": "consumerGroupLag",
        "type": "count",
        "unit": "messages",
        "values": [
          {
            "time": "2019-09-17T11:52:45.000Z",
            "value": "30.0"
          }
        ]
      },
      {
        "metric": "clientCount",
        "type": "count",
        "unit": "clients",
        "values": [
          {
            "time": "2019-09-17T11:52:45.000Z",
            "value": "1.0"
          }
        ]
      }
    ],
    "topic": "test1"
  }
]
Retrieve the information regarding the consumed topics and the clients for a specific consumer group.
Successfully retrieved consumer group state.
Bad Request
Not Authorized
Forbidden
Resource not found
Unsupported media type: returned when the payload is in an unsupported format.
Too many requests: returned when more than 35 requests per second are being received by your user.
{
  "test-topic": [
    "client-1",
    "client-2"
  ]
}
Retrieve the information regarding consumer group state, consumed topics and clients for consumer groups.
Successfully retrieved consumer group state.
Bad Request
Not Authorized
Forbidden
Resource not found
Unsupported media type: returned when the payload is in an unsupported format.
Too many requests: returned when more than 35 requests per second are being received by your user.
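The response for this endpoint is paged through the `itemsPerPage`, `startIndex` and `totalResults` fields. The Python sketch below works out where the next page would start. It assumes that `startIndex` is the 1-based position of the first item on the current page, which this page does not confirm; `next_start_index` is a hypothetical helper name and the second page used in the example is invented.

```python
def next_start_index(page):
    """Return the start index of the next page, or None on the last page.

    Assumption: startIndex is the 1-based position of the page's first item.
    """
    candidate = page["startIndex"] + len(page["resources"])
    return candidate if candidate <= page["totalResults"] else None


# Mirrors the sample response for this endpoint: three groups, one page.
last_page = {
    "itemsPerPage": 20,
    "resources": [{"consumerGroup": f"KafkaConsumer-{i}"} for i in (1, 2, 3)],
    "startIndex": 1,
    "totalResults": 3,
}

# Invented example of a full first page out of 45 results.
full_page = {
    "itemsPerPage": 20,
    "resources": [{"consumerGroup": f"group-{i}"} for i in range(20)],
    "startIndex": 1,
    "totalResults": 45,
}

print(next_start_index(last_page), next_start_index(full_page))
```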
{
  "itemsPerPage": 20,
  "resources": [
    {
      "consumerGroup": "KafkaConsumer-1",
      "consumerGroupClientDetails": {
        "instaclustr-sla": [
          "consumer-1"
        ]
      },
      "consumerGroupState": "Stable"
    },
    {
      "consumerGroup": "KafkaConsumer-2",
      "consumerGroupClientDetails": {
        "instaclustr-sla": [
          "consumer-1"
        ]
      },
      "consumerGroupState": "Stable"
    },
    {
      "consumerGroup": "KafkaConsumer-3",
      "consumerGroupClientDetails": {
        "instaclustr-sla": [
          "consumer-1"
        ]
      },
      "consumerGroupState": "Stable"
    }
  ],
  "startIndex": 1,
  "totalResults": 3
}
List Kafka consumer groups for a cluster.
Successfully retrieved all consumer groups.
Bad Request
Not Authorized
Forbidden
Resource not found
Unsupported media type: returned when the payload is in an unsupported format.
Too many requests: returned when more than 35 requests per second are being received by your user.
[
  "KafkaConsumer-1",
  "KafkaConsumer-2",
  "KafkaConsumer-3",
  "group-10",
  "group-20"
]
To request the same metrics for all topics, do not define the topic in the path. If the number of metrics retrieved by the query exceeds 20, the endpoint will paginate through the topics using the pageNumber query parameter. Available Metrics:
topicMessageDistribution: Metrics derived by analysing the message distribution among partitions of a topic. Metrics are reported for non-internal topics only.
    outliers: Number of partitions identified as outliers using the statistical method of MADe (reference), with the high and low fences defined by median ± 2 * 1.4826 * MAD. The metric also returns a JSON array of outlier partitions and their message counts. Retrieval of this metric is limited to periods of 1h or below.
    standard_deviation: The population standard deviation of message distribution across partitions for the topic.
Successfully retrieved topic level metrics for all topics.
Bad Request
Not Authorized
Forbidden
Resource not found
Unsupported media type: returned when the payload is in an unsupported format.
Too many requests: returned when more than 35 requests per second are being received by your user.
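To make the outlier fences concrete, the Python sketch below applies median ± 2 * 1.4826 * MAD to a list of per-partition message counts, where MAD is the median of the absolute deviations from the median. It illustrates the stated formula only and is not the service's implementation. The counts for partitions other than 1 and 5 are invented, and `made_outliers` is a hypothetical helper name.

```python
from statistics import median


def made_outliers(counts, k=2.0, scale=1.4826):
    """Return partitions whose count lies outside median +/- k * scale * MAD.

    counts[i] is the message count of partition i.
    """
    med = median(counts)
    mad = median(abs(c - med) for c in counts)
    low = med - k * scale * mad
    high = med + k * scale * mad
    return [
        {"partition": partition, "count": count}
        for partition, count in enumerate(counts)
        if count < low or count > high
    ]


# Partition 1 holds 30 messages and partition 5 holds 0, as in the sample
# response; the remaining counts are made up for the example.
partition_counts = [10, 30, 11, 9, 10, 0, 12, 10]

# Median is 10 and MAD is 1, so the fences are roughly 7.03 and 12.97.
print(made_outliers(partition_counts))
```

With these counts only partitions 1 and 5 fall outside the fences, which matches the `details` array shape in the sample response.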
JSON response (no 'format' query parameter specified)
[
  {
    "payload": [
      {
        "metric": "topicMessageDistribution",
        "type": "standard_deviation",
        "unit": "1",
        "values": [
          {
            "time": "2020-07-02T06:28:58.000Z",
            "value": "5.23"
          }
        ]
      },
      {
        "metric": "topicMessageDistribution",
        "type": "outliers",
        "unit": "1",
        "values": [
          {
            "details": [
              {
                "count": 30,
                "partition": 1
              },
              {
                "count": 0,
                "partition": 5
              }
            ],
            "time": "2020-07-02T06:28:58.000Z",
            "value": "2"
          }
        ]
      }
    ],
    "topic": "instaclustr-sla"
  }
]
Retrieve topic metrics for a specific topic. Available Metrics:
topicMessageDistribution: Metrics derived by analysing the message distribution among partitions of a topic. Metrics are reported for non-internal topics only.
    outliers: Number of partitions identified as outliers using the statistical method of MADe (reference), with the high and low fences defined by median ± 2 * 1.4826 * MAD. The metric also returns a JSON array of outlier partitions and their message counts. Retrieval of this metric is limited to periods of 1h or below.
    standard_deviation: The population standard deviation of message distribution across partitions for the topic.
Successfully retrieved topic level metrics for a specific topic.
Bad Request
Not Authorized
Forbidden
Resource not found
Unsupported media type: returned when the payload is in an unsupported format.
Too many requests: returned when more than 35 requests per second are being received by your user.
JSON response (no 'format' query parameter specified)
[
  {
    "payload": [
      {
        "metric": "topicMessageDistribution",
        "type": "standard_deviation",
        "unit": "1",
        "values": [
          {
            "time": "2020-07-02T06:28:58.000Z",
            "value": "5.23"
          }
        ]
      },
      {
        "metric": "topicMessageDistribution",
        "type": "outliers",
        "unit": "1",
        "values": [
          {
            "details": [
              {
                "count": 30,
                "partition": 1
              },
              {
                "count": 0,
                "partition": 5
              }
            ],
            "time": "2020-07-02T06:28:58.000Z",
            "value": "2"
          }
        ]
      }
    ],
    "topic": "instaclustr-sla"
  }
]
By making a GET request to this endpoint, you can get a list of monitored indices.
Successfully retrieved a list of monitored indices.
Bad Request
Not Authorized
Forbidden
Resource not found
Unsupported media type: returned when the payload is in an unsupported format.
Too many requests: returned when more than 35 requests per second are being received by your user.
[
  "test_index_01",
  "test_index_02",
  "test_index_03"
]
Metrics information is provided either for an individual node or for all nodes in a cluster or cluster data centre. The number of results displayed depends on the startIndex and count parameters. For Kafka broker level topic metrics, this paged metrics endpoint also accepts the wildcard character * in place of unknown topics. The set of available metrics will expand as we build out this API.
The possible values for the metrics parameter are listed below:
n::cpuUtilization: Current CPU utilisation as a percentage of total available.
    percentage (ic_node_cpu_utilization)
n::osload: Current OS load.
    last_one_minute: Average metric value over 1 minute. (ic_node_osload)
    last_five_minutes: Average metric value over 5 minutes. (ic_node_osload)
    last_fifteen_minutes: Average metric value over 15 minutes. (ic_node_osload)
n::diskUtilization: Total disk space utilisation, by Cassandra, as a percentage of total available.
    percentage (ic_node_disk_utilization)
n::diskAvailable: Disk space available in bytes.
    value (ic_node_disk_available)
n::diskUsed: Disk space used in bytes.
    value (ic_node_disk_used)
n::cpuguestpercent: Time spent running a virtual CPU for guest operating systems under control of the kernel.
    percentage (ic_node_cpuguestpercent)
n::cpuguestnicepercent: Niced processes executing in user mode in virtual OS.
    percentage (ic_node_cpuguestnicepercent)
n::cpusystempercent: Percentage of processes executing in kernel mode.
    percentage (ic_node_cpusystempercent)
n::cpuidlepercent: Percentage of time when one or more kernel threads are executing with the run queue empty and/or no I/O operations are currently cycling.
    percentage (ic_node_cpuidlepercent)
n::cpuiowaitpercent: CPU time the I/O thread spent waiting for a socket ready for reads or writes as a percent.
    percentage (ic_node_cpuiowaitpercent)
n::cpuirqpercent: Number of hardware interrupts the kernel is servicing.
    percentage (ic_node_cpuirqpercent)
n::cpunicepercent: Percentage of processes executing in user mode which have a positive nice value.
    percentage (ic_node_cpunicepercent)
n::cpusoftirqpercent: Number of software interrupts the kernel is servicing.
    percentage (ic_node_cpusoftirqpercent)
n::cpustealpercent: Percentage of time the hypervisor allocated to other tasks external to the one run on the current virtual CPU.
    percentage (ic_node_cpustealpercent)
n::cpuuserpercent: Processes executing in user mode, including application processes.
    percentage (ic_node_cpuuserpercent)
n::memavailable: Estimate of how much memory is available to start new applications without swap, taking into account page cache and re-claimability of slab.
    value (ic_node_memavailable)
n::networkindelta: Delta count of bytes received.
    value (ic_node_networkindelta)
n::networkoutdelta: Delta count of bytes transmitted.
    value (ic_node_networkoutdelta)
n::networkin: Count of bytes received.
    value (ic_node_networkin)
n::networkout: Count of bytes transmitted.
    value (ic_node_networkout)
n::networkinerrorsdelta: Delta count of receive errors detected.
    value (ic_node_networkinerrorsdelta)
n::networkouterrorsdelta: Delta count of transmit errors detected.
    value (ic_node_networkouterrorsdelta)
n::networkindroppeddelta: Delta count of receive packets dropped.
    value (ic_node_networkindroppeddelta)
n::networkoutdroppeddelta: Delta count of transmit packets dropped.
    value (ic_node_networkoutdroppeddelta)
n::filedescriptorlimit: Maximum number of open files limit for the node OS.
    value (ic_node_filedescriptorlimit)
n::filedescriptoropencount: Current number of open files in the node OS.
    value (ic_node_filedescriptoropencount)
n::tcpestablished: Number of open TCP connections.
    value (ic_node_tcpestablished)
n::tcptimewait: Number of TCP sockets waiting for enough time to pass to be sure the remote TCP received the acknowledgment of its connection termination request.
    value (ic_node_tcptimewait)
n::tcplistening: Number of TCP sockets waiting for a connection request from any remote TCP and port.
    value (ic_node_tcplistening)
n::tcpall: Total number of TCP connections in all states.
    value (ic_node_tcpall)
n::tcpclosewait: Number of TCP sockets whose connection is in the process of being closed.
    value (ic_node_tcpclosewait)
Additional information on troubleshooting Cassandra metrics is available here.
n::compactions: Number of pending compactions.
    pendingtasks: Number of pending tasks. (ic_node_compactions)
n::reads: Reads per second by Cassandra. Returns single partition reads per second with count_per_second, and all reads (Single Partition + Multi Partition + CAS) per second with total_count_per_second.
    total_count_per_second (ic_node_reads)
    count_per_second (ic_node_reads)
n::writes: Writes per second by Cassandra. Returns writes per second with count_per_second and all writes (including CAS) per second with total_count_per_second.
    total_count_per_second (ic_node_writes)
    count_per_second (ic_node_writes)
n::rangeSlices: Range Slice reads by Cassandra.
    count_per_second (ic_node_range_slices)
n::casReads: Compare and Set reads by Cassandra.
    count_per_second (ic_node_cas_reads)
n::casWrites: Compare and Set writes by Cassandra.
    count_per_second (ic_node_cas_writes)
n::clientRequestReadV2: Offers the percentile distribution and average latency per client read request (i.e. the period from when a node receives a client request, gathers the records and responds to the client).
    99thPercentile: 99th percentile distribution of the metric (ic_node_client_request_read_v2_microseconds)
    latency_per_operation: Average latency per operation. (ic_node_client_request_read_v2)
    999thPercentile: 99.9th percentile distribution of the metric (ic_node_client_request_read_v2_microseconds)
    95thPercentile: 95th percentile distribution of the metric (ic_node_client_request_read_v2_microseconds)
n::clientRequestWrite: Offers the percentile distribution and average latency per client write request (i.e. the period from when a node receives a client request, gathers the records and responds to the client).
    latency_per_operation: Average latency per operation. (ic_node_client_request_write)
    99thPercentile: 99th percentile distribution of the metric (ic_node_client_request_write_microseconds)
    95thPercentile: 95th percentile distribution of the metric (ic_node_client_request_write_microseconds)
n::clientRequestRangeSlice: Offers the percentile distribution and average latency per client range slice read request (i.e. the period from when a node receives a client request, gathers the records and responds to the client).
    latency_per_operation: Average latency per operation. (ic_node_client_request_range_slice)
    99thPercentile: 99th percentile distribution of the metric (ic_node_client_request_range_slice_microseconds)
    95thPercentile: 95th percentile distribution of the metric (ic_node_client_request_range_slice_microseconds)
n::clientRequestCasRead: Offers the percentile distribution and average latency per client CAS read request (i.e. the period from when a node receives a client request, gathers the records and responds to the client).
    latency_per_operation: Average latency per operation. (ic_node_client_request_cas_read)
    99thPercentile: 99th percentile distribution of the metric (ic_node_client_request_cas_read_microseconds)
    95thPercentile: 95th percentile distribution of the metric (ic_node_client_request_cas_read_microseconds)
n::clientRequestCasWrite: Offers the percentile distribution and average latency per client CAS write request (i.e. the period from when a node receives a client request, gathers the records and responds to the client).
    latency_per_operation: Average latency per operation. (ic_node_client_request_cas_write)
    99thPercentile: 99th percentile distribution of the metric (ic_node_client_request_cas_write_microseconds)
    95thPercentile: 95th percentile distribution of the metric (ic_node_client_request_cas_write_microseconds)
n::pausedConnections: Monitors requests that have been paused (back-pressure applied) due to the node being overloaded, from clients that have started with THROW_ON_OVERLOAD left as default or set to False.
    value (ic_node_paused_connections)
n::requestDiscarded: Monitors requests discarded due to the node being overloaded from clients that have started with THROW_ON_OVERLOAD set to True.
    one_minute_rate: One minute rate of the measured metric. (ic_node_request_discarded)
    count (ic_node_request_discarded)
n::slalatency: Monitors our SLA latency and alerts when it is above a threshold level.
    sla_write: Synthetic write queries against an Instaclustr canary table. (ic_node_slalatency_microseconds)
    sla_read: Synthetic read queries against an Instaclustr canary table. (ic_node_slalatency_microseconds)
n::readstage: The Read Stage metric represents Cassandra conducting reads from the local disk or cache.
    active_tasks_max: Maximum number of active tasks. (ic_node_readstage)
    total_blocked_tasks_max: Maximum number of blocked tasks in total. (ic_node_readstage)
    pending_tasks_max: Maximum number of pending tasks. (ic_node_readstage)
n::mutationstage: The Mutation Stage metric is responsible for writes (excluding materialised view and counter writes).
    active_tasks_max: Maximum number of active tasks. (ic_node_mutationstage)
    total_blocked_tasks_max: Maximum number of blocked tasks in total. (ic_node_mutationstage)
    pending_tasks_max: Maximum number of pending tasks. (ic_node_mutationstage)
n::nativetransportrequest: The Native Transport Request metric represents client CQL requests. If the requests are blocked by other Cassandra operations, this metric will display abnormal values.
    currently_blocked_tasks_max: Maximum number of currently blocked tasks. (ic_node_nativetransportrequest)
    active_tasks_max: Maximum number of active tasks. (ic_node_nativetransportrequest)
    total_blocked_tasks_max: Maximum number of blocked tasks in total. (ic_node_nativetransportrequest)
    total_blocked_tasks_per_second_max: Maximum number of blocked tasks per second in total. (ic_node_nativetransportrequest)
    pending_tasks_max: Maximum number of pending tasks. (ic_node_nativetransportrequest)
    total_blocked_tasks_differential: Deprecated. (ic_node_nativetransportrequest)
n::rpcthread: The maximum number of concurrent requests from clients.
    currently_blocked_tasks_max: Maximum number of currently blocked tasks. (ic_node_rpcthread)
    total_blocked_tasks_max: Maximum number of blocked tasks in total. (ic_node_rpcthread)
    pending_tasks_max: Maximum number of pending tasks. (ic_node_rpcthread)
    active_tasks_max: Maximum number of active tasks. (ic_node_rpcthread)
n::countermutationstage: The Counter Mutation Stage metric is responsible for counter writes.
    active_tasks_max: Maximum number of active tasks. (ic_node_countermutationstage)
    total_blocked_tasks_max: Maximum number of blocked tasks in total. (ic_node_countermutationstage)
    pending_tasks_max: Maximum number of pending tasks. (ic_node_countermutationstage)
n::viewmutationstage: The View Mutation Stage metric is responsible for materialised view writes.
    active_tasks_max: Maximum number of active tasks. (ic_node_viewmutationstage)
    total_blocked_tasks_max: Maximum number of blocked tasks in total. (ic_node_viewmutationstage)
    pending_tasks_max: Maximum number of pending tasks. (ic_node_viewmutationstage)
n::droppedmessage: The Dropped Messages metric represents the total number of dropped messages from all stages in the SEDA.
    differential_total_count: Deprecated. (ic_node_droppedmessage)
    total_count (ic_node_droppedmessage)
    total_count_per_second_max: Maximum total count per second. (ic_node_droppedmessage)
n::hintsSucceeded: Number of hints successfully delivered.
    count (ic_node_hints_succeeded)
    count_per_second_max: Maximum count per second. (ic_node_hints_succeeded)
    differential_count: Deprecated. (ic_node_hints_succeeded)
n::hintsFailed: Number of hints that failed delivery.
    count (ic_node_hints_failed)
    count_per_second_max: Maximum count per second. (ic_node_hints_failed)
    differential_count: Deprecated. (ic_node_hints_failed)
n::hintsTimedOut: Number of hints that timed out during delivery.
    count (ic_node_hints_timed_out)
    count_per_second_max: Maximum count per second. (ic_node_hints_timed_out)
    differential_count: Deprecated. (ic_node_hints_timed_out)
n::hintsTotal: Number of hint messages written to the node since the Cassandra service started.
    differential_value: Deprecated. (ic_node_hints_total)
    value_per_second_max: Maximum value per second. (ic_node_hints_total)
    value (ic_node_hints_total)
n::load: Size, in bytes, of the on-disk data this node manages.
    value (ic_node_load_bytes)
n::offheapsizeallmemtables: The total amount of data stored in the memtables, including secondary indexes and pending flush memtables, that resides off-heap.
    value (ic_node_offheapsizeallmemtables_bytes)
n::offheapsizememtable: The total amount of data stored in the memtable that resides off-heap, including column related overhead and partitions overwritten.
    value (ic_node_offheapsizememtable_bytes)
n::offheapmemoryusedbloomfilter: The off-heap memory used by the bloom filter.
    value (ic_node_offheapmemoryusedbloomfilter_bytes)
n::offheapmemoryusedcompressionmetadata: The off-heap memory used by compression metadata.
    value (ic_node_offheapmemoryusedcompressionmetadata_bytes)
n::offheapmemoryusedindexsummary: The off-heap memory used by the index summary.
    value (ic_node_offheapmemoryusedindexsummary_bytes)
n::maxPartitionSize: MaxPartitionSize is the size of the largest compacted partition where partition size is measured by the number of cells (values) that are stored in the partition.
    value (ic_node_max_partition_size_bytes)
n::garbagecollectionparnewcollectioncount: The total number of garbage collections that have occurred.
    count (ic_node_garbagecollectionparnewcollectioncount)
n::garbagecollectionparnewcollectiontime: The approximate accumulated garbage collection elapsed time.
    value (ic_node_garbagecollectionparnewcollectiontime_milliseconds)
n::garbagecollectionparnewlastduration: The elapsed time of the last garbage collection.
    value (ic_node_garbagecollectionparnewlastduration_milliseconds)
n::garbagecollectiong1collectioncount: The total number of garbage collections that have occurred.
    count (ic_node_garbagecollectiong1collectioncount)
n::garbagecollectiong1collectiontime: The approximate accumulated garbage collection elapsed time.
    value (ic_node_garbagecollectiong1collectiontime_milliseconds)
n::garbagecollectiong1lastduration: The elapsed time of the last garbage collection.
    value (ic_node_garbagecollectiong1lastduration_milliseconds)
n::heapmemorycommitted: The amount of memory that is committed for the Java Virtual Machine to use.
    value (ic_node_heapmemorycommitted_bytes)
n::heapmemoryinit: The amount of memory that the Java Virtual Machine initially requests from the operating system for memory management.
    value (ic_node_heapmemoryinit_bytes)
n::heapmemorymax: The maximum amount of memory that can be used for memory management.
    value (ic_node_heapmemorymax_bytes)
n::heapmemoryused: The amount of used memory.
    value (ic_node_heapmemoryused_bytes)
n::schemaversioncount: Number of active schema versions.
    value (ic_node_schemaversioncount)
n::connectedNativeClients: The number of connected clients to the Cassandra node.
    value (ic_node_connected_native_clients)
n::readall: Reads per second at the ALL consistency level.
    count_per_second (ic_node_readall)
n::readany: Reads per second at the ANY consistency level.
    count_per_second (ic_node_readany)
n::readeachquorum: Reads per second at the Each-Quorum consistency level.
    count_per_second (ic_node_readeachquorum)
n::readlocalone: Reads per second at the Local-One consistency level.
    count_per_second (ic_node_readlocalone)
n::readlocalquorum: Reads per second at the Local-Quorum consistency level.
    count_per_second (ic_node_readlocalquorum)
n::readlocalserial Reads per second at the Local-Serial consistency levelcount_per_second ic_node_readlocalserial n::readone Reads per second at the One consistency levelcount_per_second ic_node_readone n::readquorum Reads per second at the Quorum consistency levelcount_per_second ic_node_readquorum n::readserial Reads per second at the Serial consistency levelcount_per_second ic_node_readserial n::readthree Reads per second at the Three consistency levelcount_per_second ic_node_readthree n::readtwo Reads per second at the Two consistency levelcount_per_second ic_node_readtwo n::droppedMessageRead Reads that were dropped by the node.count_per_second ic_node_dropped_message_read n::writeall Write per second at the All consistency levelcount_per_second ic_node_writeall n::writeany Write per second at the Two consistency levelcount_per_second ic_node_writeany n::writeeachquorum Write per second at the Each Quorum consistency levelcount_per_second ic_node_writeeachquorum n::writelocalone Write per second at the Local One consistency levelcount_per_second ic_node_writelocalone n::writelocalquorum Writes per second at the Local Quorum consistency levelcount_per_second ic_node_writelocalquorum n::writelocalserial Writes per second at the Local Serial consistency levelcount_per_second ic_node_writelocalserial n::writeone Writes per second at the One consistency levelcount_per_second ic_node_writeone n::writequorum Writes per second at the Quorum consistency levelcount_per_second ic_node_writequorum n::writeserial Writes per second at the Serial consistency levelcount_per_second ic_node_writeserial n::writethree Writes per second at the Three consistency levelcount_per_second ic_node_writethree n::writetwo Writes per second at the Two consistency levelcount_per_second ic_node_writetwo n::droppedMessageMutation Writes that were dropped by the nodecount_per_second ic_node_dropped_message_mutation cf::{keyspace}::{table}::reads General measurements of local read latency for 
the table, on the individual node.
    latency_per_operation Average latency per operation. ic_table_reads
    count_per_second ic_table_reads
cf::{keyspace}::{table}::writes General measurements of local write latency for the table, on the individual node.
    latency_per_operation Average latency per operation. ic_table_writes
    count_per_second ic_table_writes
cf::{keyspace}::{table}::writeLatencyDistribution Metrics for local write latency for the table, on the individual node.
    50thPercentile 50th percentile distribution of the metric. ic_table_write_latency_distribution_microseconds
    99thPercentile 99th percentile distribution of the metric. ic_table_write_latency_distribution_microseconds
    75thPercentile 75th percentile distribution of the metric. ic_table_write_latency_distribution_microseconds
    95thPercentile 95th percentile distribution of the metric. ic_table_write_latency_distribution_microseconds
cf::{keyspace}::{table}::diskUsed Live and total disk used by the table.
    totaldiskspaceused Disk used by both live cells and tombstones. ic_table_disk_used_bytes
    livediskspaceused Disk used by live cells. ic_table_disk_used_bytes
cf::{keyspace}::{table}::sstablesPerRead SSTables accessed per read of the table on the individual node.
    average Average value of the metric. ic_table_sstables_per_read
    max Maximum value of the metric. ic_table_sstables_per_read
cf::{keyspace}::{table}::liveCellsPerRead Live cells accessed per read of the table on the individual node.
    average Average value of the metric. ic_table_live_cells_per_read
    max Maximum value of the metric. ic_table_live_cells_per_read
cf::{keyspace}::{table}::tombstonesPerRead Tombstoned cells accessed per read of the table on the individual node.
    average Average value of the metric. ic_table_tombstones_per_read
    max Maximum value of the metric. ic_table_tombstones_per_read
cf::{keyspace}::{table}::partitionSize The size of partitions in the specified table in KB.
    average Average value of the metric.
    ic_table_partition_size
    max Maximum value of the metric. ic_table_partition_size
cf::{keyspace}::{table}::offHeapSizeAllMemtables The total amount of data stored in the memtables, including secondary indexes and pending flush memtables, that resides off-heap (in bytes).
    value ic_table_off_heap_size_all_memtables_bytes
cf::{keyspace}::{table}::offHeapSizeMemtable The total amount of data stored in the memtable that resides off-heap, including column related overhead and partitions overwritten (in bytes).
    value ic_table_off_heap_size_memtable_bytes
cf::{keyspace}::{table}::offHeapMemoryUsedBloomFilter The off-heap memory used by the bloom filter (in bytes).
    value ic_table_off_heap_memory_used_bloom_filter_bytes
cf::{keyspace}::{table}::offHeapMemoryUsedCompressionMetadata The off-heap memory used by compression metadata (in bytes).
    value ic_table_off_heap_memory_used_compression_metadata_bytes
cf::{keyspace}::{table}::offHeapMemoryUsedIndexSummary The off-heap memory used by the index summary (in bytes).
    value ic_table_off_heap_memory_used_index_summary_bytes
cf::{keyspace}::{table}::estimatedPartitionCount The estimated count of partitions for a table.
    count ic_table_estimated_partition_count
cf::{keyspace}::{table}::keyCacheHitRate The key cache hit rate for the specified table.
    percentage ic_table_key_cache_hit_rate
    value ic_table_key_cache_hit_rate
cf::{keyspace}::{table}::readLatencyV2 Measurement of local read latency for the table, on the individual node.
    999thPercentile 99.9th percentile distribution of the metric. ic_table_read_latency_v2_microseconds
    95thPercentile 95th percentile distribution of the metric. ic_table_read_latency_v2_microseconds
    75thPercentile 75th percentile distribution of the metric. ic_table_read_latency_v2_microseconds
    99thPercentile 99th percentile distribution of the metric. ic_table_read_latency_v2_microseconds
    count_per_second ic_table_read_latency_v2
    50thPercentile 50th percentile distribution of the metric.
    ic_table_read_latency_v2_microseconds
    latency_per_operation Average latency per operation. ic_table_read_latency_v2
cf::{keyspace}::{table}::sstablesPerReadDistribution SSTables accessed per read of the table on the individual node.
    99thPercentile 99th percentile distribution of the metric. ic_table_sstables_per_read_distribution
    95thPercentile 95th percentile distribution of the metric. ic_table_sstables_per_read_distribution
cf::{keyspace}::{table}::tombstonesPerReadDistribution Tombstoned cells accessed per read of the table on the individual node.
    99thPercentile 99th percentile distribution of the metric. ic_table_tombstones_per_read_distribution
    95thPercentile 95th percentile distribution of the metric. ic_table_tombstones_per_read_distribution
cf::{keyspace}::{table}::compressionRatio The ratio of compressed to uncompressed SSTable storage size; lower is better.
    value ic_table_compression_ratio
csp::shotoverTransformFailuresCount The number of transform failures.
    value ic_node_shotover_transform_failures_count
csp::shotoverTransformTotalCount The number of transforms used.
    value ic_node_shotover_transform_total_count
csp::shotoverTransformPushedTotalCount The number of transforms used to process messages without a corresponding request (events).
    value ic_node_shotover_transform_pushed_total_count
csp::shotoverTransformPushedFailuresCount The number of transform failures while processing messages without a corresponding request (events).
    value ic_node_shotover_transform_pushed_failures_count
csp::shotoverTransformLatencySeconds0th 0th percentile latency for running the transform.
    value ic_node_shotover_transform_latency_seconds0th
csp::shotoverTransformLatencySeconds50th 50th percentile latency for running the transform.
    value ic_node_shotover_transform_latency_seconds50th
csp::shotoverTransformLatencySeconds90th 90th percentile latency for running the transform.
    value ic_node_shotover_transform_latency_seconds90th
csp::shotoverTransformLatencySeconds95th 95th percentile latency for running the
transform.
    value ic_node_shotover_transform_latency_seconds95th
csp::shotoverTransformLatencySeconds99th 99th percentile latency for running the transform.
    value ic_node_shotover_transform_latency_seconds99th
csp::shotoverTransformLatencySeconds999th 99.9th percentile latency for running the transform.
    value ic_node_shotover_transform_latency_seconds999th
csp::shotoverTransformLatencySeconds100th 100th percentile latency for running the transform.
    value ic_node_shotover_transform_latency_seconds100th
csp::shotoverTransformLatencySecondsCount The number of latency observations for running the transform.
    value ic_node_shotover_transform_latency_seconds_count
csp::shotoverTransformLatencySecondsSum The sum of latencies for running the transform.
    value ic_node_shotover_transform_latency_seconds_sum
csp::shotoverTransformPushedLatencySeconds0th 0th percentile latency for running the transform on messages without a corresponding request (events).
    value ic_node_shotover_transform_pushed_latency_seconds0th
csp::shotoverTransformPushedLatencySeconds50th 50th percentile latency for running the transform on messages without a corresponding request (events).
    value ic_node_shotover_transform_pushed_latency_seconds50th
csp::shotoverTransformPushedLatencySeconds90th 90th percentile latency for running the transform on messages without a corresponding request (events).
    value ic_node_shotover_transform_pushed_latency_seconds90th
csp::shotoverTransformPushedLatencySeconds95th 95th percentile latency for running the transform on messages without a corresponding request (events).
    value ic_node_shotover_transform_pushed_latency_seconds95th
csp::shotoverTransformPushedLatencySeconds99th 99th percentile latency for running the transform on messages without a corresponding request (events).
    value ic_node_shotover_transform_pushed_latency_seconds99th
csp::shotoverTransformPushedLatencySeconds999th 99.9th percentile latency for running the transform on messages without a corresponding request (events).
    value ic_node_shotover_transform_pushed_latency_seconds999th
csp::shotoverTransformPushedLatencySeconds100th 100th percentile latency for running the transform on messages without a corresponding request (events).
    value ic_node_shotover_transform_pushed_latency_seconds100th
csp::shotoverTransformPushedLatencySecondsCount The number of latency observations for running the transform on messages without a corresponding request (events).
    value ic_node_shotover_transform_pushed_latency_seconds_count
csp::shotoverTransformPushedLatencySecondsSum The sum of latencies for running the transform on messages without a corresponding request (events).
    value ic_node_shotover_transform_pushed_latency_seconds_sum
csp::shotoverSourceToSinkLatencySeconds0th 0th percentile latency for running the transform from client to cluster.
    value ic_node_shotover_source_to_sink_latency_seconds0th
csp::shotoverSourceToSinkLatencySeconds50th 50th percentile latency for running the transform from client to cluster.
    value ic_node_shotover_source_to_sink_latency_seconds50th
csp::shotoverSourceToSinkLatencySeconds90th 90th percentile latency for running the transform from client to cluster.
    value ic_node_shotover_source_to_sink_latency_seconds90th
csp::shotoverSourceToSinkLatencySeconds95th 95th percentile latency for running the transform from client to cluster.
    value ic_node_shotover_source_to_sink_latency_seconds95th
csp::shotoverSourceToSinkLatencySeconds99th 99th percentile latency for running the transform from client to cluster.
    value ic_node_shotover_source_to_sink_latency_seconds99th
csp::shotoverSourceToSinkLatencySeconds999th 99.9th percentile latency for running the transform from client to cluster.
    value ic_node_shotover_source_to_sink_latency_seconds999th
csp::shotoverSourceToSinkLatencySeconds100th 100th percentile latency for running the transform from client to cluster.
    value ic_node_shotover_source_to_sink_latency_seconds100th
csp::shotoverSourceToSinkLatencySecondsCount The number of latency observations for running the transform from client to cluster.
    value ic_node_shotover_source_to_sink_latency_seconds_count
csp::shotoverSourceToSinkLatencySecondsSum The sum of latencies for running the transform from client to cluster.
    value ic_node_shotover_source_to_sink_latency_seconds_sum
csp::shotoverFailedRequestsCount The number of failed requests.
    value ic_node_shotover_failed_requests_count
csp::shotoverOutOfRackRequestsCount The number of out of rack requests.
    value ic_node_shotover_out_of_rack_requests_count
csp::shotoverAvailableConnectionsCount The number of available connections.
    value ic_node_shotover_available_connections_count
csp::shotoverChainFailuresCount The number of chain failures.
    value ic_node_shotover_chain_failures_count
csp::shotoverChainTotalCount The number of chains used.
    value ic_node_shotover_chain_total_count
csp::shotoverSinkToSourceLatencySeconds0th 0th percentile latency for running the transform from cluster to client.
    value ic_node_shotover_sink_to_source_latency_seconds0th
csp::shotoverSinkToSourceLatencySeconds50th 50th percentile latency for running the transform from cluster to client.
    value ic_node_shotover_sink_to_source_latency_seconds50th
csp::shotoverSinkToSourceLatencySeconds90th 90th percentile latency for running the transform from cluster to client.
    value ic_node_shotover_sink_to_source_latency_seconds90th
csp::shotoverSinkToSourceLatencySeconds95th 95th percentile latency for running the transform from cluster to client.
    value ic_node_shotover_sink_to_source_latency_seconds95th
csp::shotoverSinkToSourceLatencySeconds99th 99th percentile latency for running the transform from cluster to client.
    value ic_node_shotover_sink_to_source_latency_seconds99th
csp::shotoverSinkToSourceLatencySeconds999th 99.9th percentile latency for running the transform from cluster to client.
    value ic_node_shotover_sink_to_source_latency_seconds999th
csp::shotoverSinkToSourceLatencySeconds100th 100th percentile latency for running the transform from cluster to client.
    value ic_node_shotover_sink_to_source_latency_seconds100th
csp::shotoverSinkToSourceLatencySecondsCount The number of latency observations for running the transform from cluster to
client.
    value ic_node_shotover_sink_to_source_latency_seconds_count
csp::shotoverSinkToSourceLatencySecondsSum The sum of latencies for running the transform from cluster to client.
    value ic_node_shotover_sink_to_source_latency_seconds_sum
csp::shotoverChainMessagesPerBatchCount0th 0th percentile of the number of messages per batch.
    value ic_node_shotover_chain_messages_per_batch_count0th
csp::shotoverChainMessagesPerBatchCount50th 50th percentile of the number of messages per batch.
    value ic_node_shotover_chain_messages_per_batch_count50th
csp::shotoverChainMessagesPerBatchCount90th 90th percentile of the number of messages per batch.
    value ic_node_shotover_chain_messages_per_batch_count90th
csp::shotoverChainMessagesPerBatchCount95th 95th percentile of the number of messages per batch.
    value ic_node_shotover_chain_messages_per_batch_count95th
csp::shotoverChainMessagesPerBatchCount99th 99th percentile of the number of messages per batch.
    value ic_node_shotover_chain_messages_per_batch_count99th
csp::shotoverChainMessagesPerBatchCount999th 99.9th percentile of the number of messages per batch.
    value ic_node_shotover_chain_messages_per_batch_count999th
csp::shotoverChainMessagesPerBatchCount100th 100th percentile of the number of messages per batch.
    value ic_node_shotover_chain_messages_per_batch_count100th
csp::shotoverChainMessagesPerBatchCountCount The number of messages per batch.
    value ic_node_shotover_chain_messages_per_batch_count_count
csp::shotoverChainMessagesPerBatchCountSum The sum of the number of messages per batch.
    value ic_node_shotover_chain_messages_per_batch_count_sum
o::memused Percentage of used memory.
    value ic_node_memused
o::docsCount Number of non-deleted documents in the segment. This number is based on Lucene documents and may include documents from nested fields.
    value ic_node_docs_count
o::docsDeleted Number of deleted documents in the segment. This number is based on Lucene documents.
Elasticsearch reclaims the disk space of deleted Lucene documents when a segment is merged.
    value ic_node_docs_deleted
o::jvmheappercent Percentage of memory currently in use by the heap.
    value ic_node_jvmheappercent
o::jvmthreadscount Number of active threads in use by the JVM.
    value ic_node_jvmthreadscount
o::indextotalpersec Indices per second.
    value ic_node_indextotalpersec
o::querytotalpersec Queries per second.
    value ic_node_querytotalpersec
o::indexlatency The latency of new indexing operations measured in milliseconds.
    value ic_node_indexlatency
o::querylatency The latency of new query operations measured in milliseconds.
    value ic_node_querylatency
o::slasearchlatency Monitors our SLA search latency and alerts when it is above a threshold level. This is the synthetic search query against an Instaclustr canary index.
    value ic_node_slasearchlatency
o::slaindexlatency Monitors our SLA indexing latency and alerts when it is above a threshold level. This is the synthetic indexing against an Instaclustr canary index.
    value ic_node_slaindexlatency
o::shardsCount Number of shards used per node.
    value ic_node_shards_count
o::maxShards Maximum number of shards per node.
    value ic_node_max_shards
op::ccr::leaderConnected Indicates the status of the connection between the follower cluster and the leader cluster.
    value ic_node_leader_connected
op::ccr::followerCheckpoint Indicates the checkpoint that the follower indices are at. This is a cumulative value across all replicating indices.
    value ic_node_follower_checkpoint
op::ccr::leaderCheckpoint Indicates the checkpoint that the leader indices are at.
This is a cumulative value across all replicating indices.
    value ic_node_leader_checkpoint
op::ccr::syncingIndicesCount Indicates the number of syncing/replicating indices.
    value ic_node_syncing_indices_count
op::ccr::bootstrappingIndicesCount Indicates the number of indices which are at the stage of setting up replication.
    value ic_node_bootstrapping_indices_count
op::ccr::pausedIndicesCount Indicates the number of replicating indices which are paused.
    value ic_node_paused_indices_count
op::ccr::failedIndicesCount Indicates the number of failed replicating indices.
    value ic_node_failed_indices_count
op::ccr::failedReadRequests Indicates the number of read requests failed during replication.
    value ic_node_failed_read_requests
op::ccr::failedWriteRequests Indicates the number of write requests failed during replication.
    value ic_node_failed_write_requests
op::ccr::throttledReadRequests Indicates the number of read requests throttled during replication.
    value ic_node_throttled_read_requests
op::ccr::throttledWriteRequests Indicates the number of write requests throttled during replication.
    value ic_node_throttled_write_requests
op::ccr::operationsWritten Indicates the number of operations written during replication.
    value ic_node_operations_written
op::ccr::operationsRead Indicates the number of operations read during replication.
    value ic_node_operations_read
op::ccr::autoFollowStartSuccess Indicates the number of successful auto follow replication attempts.
    value ic_node_auto_follow_start_success
op::ccr::autoFollowStartFailed Indicates the number of failed auto follow replication attempts.
    value ic_node_auto_follow_start_failed
op::ccr::autoFollowLeaderCallsFailed Indicates the number of failed replication calls to leader.
    value ic_node_auto_follow_leader_calls_failed
e::memused Percentage of used memory.
    value ic_node_memused
e::docsCount Number of non-deleted documents in the segment.
This number is based on Lucene documents and may include documents from nested fields.
    value ic_node_docs_count
e::docsDeleted Number of deleted documents in the segment. This number is based on Lucene documents. Elasticsearch reclaims the disk space of deleted Lucene documents when a segment is merged.
    value ic_node_docs_deleted
e::jvmheappercent Percentage of memory currently in use by the heap.
    value ic_node_jvmheappercent
e::jvmthreadscount Number of active threads in use by the JVM.
    value ic_node_jvmthreadscount
e::indextotalpersec Indices per second.
    value ic_node_indextotalpersec
e::querytotalpersec Queries per second.
    value ic_node_querytotalpersec
e::indexlatency The latency of new indexing operations measured in milliseconds.
    value ic_node_indexlatency
e::querylatency The latency of new query operations measured in milliseconds.
    value ic_node_querylatency
e::slasearchlatency Monitors our SLA search latency and alerts when it is above a threshold level. This is the synthetic search query against an Instaclustr canary index.
    value ic_node_slasearchlatency
e::slaindexlatency Monitors our SLA indexing latency and alerts when it is above a threshold level. This is the synthetic indexing against an Instaclustr canary index.
    value ic_node_slaindexlatency
k::activeControllerCount The number of active controllers on the node. In effect, it is 0 or 1. The active controller of a cluster is usually the first node to start up in the cluster.
    value ic_node_active_controller_count
k::offlinePartitions The number of partitions without an active leader.
Any partitions that are offline will not be accessible since read and write operations are only performed on the leader of a partition.
    value ic_node_offline_partitions
k::activeBrokerCount The number of registered and unfenced brokers.
    value ic_node_active_broker_count
k::metadataErrorCount The number of times this controller node has encountered an error during metadata log processing.
    value ic_node_metadata_error_count
k::lastCommittedRecordOffset The offset of the last record committed to this Controller. This is always advancing due to the NoOpRecord, and can be used to check cluster availability.
    value ic_node_last_committed_record_offset
k::fencedBrokerCount The number of registered but fenced brokers.
    value ic_node_fenced_broker_count
k::preferredReplicaImbalanceCount The count of topic partitions for which the leader is not the preferred leader.
    value ic_node_preferred_replica_imbalance_count
k::brokerTopicMessagesIn The mean and one minute rate of incoming messages per second.
    one_minute_rate One minute rate of the measured metric. ic_node_broker_topic_messages_in
    mean_rate The average rate of the measured metric. ic_node_broker_topic_messages_in
    count ic_node_broker_topic_messages_in
k::brokerTopicBytesIn The mean and one minute rate of incoming bytes to the cluster.
    one_minute_rate One minute rate of the measured metric. ic_node_broker_topic_bytes_in
    mean_rate The average rate of the measured metric. ic_node_broker_topic_bytes_in
    count ic_node_broker_topic_bytes_in
k::brokerTopicBytesOut The mean and one minute rate of outgoing bytes from the cluster.
    one_minute_rate One minute rate of the measured metric. ic_node_broker_topic_bytes_out
    mean_rate The average rate of the measured metric. ic_node_broker_topic_bytes_out
    count ic_node_broker_topic_bytes_out
k::leaderElectionRate The count, average, max, and one minute rate of leader elections per second.
    one_minute_rate One minute rate of the measured metric.
    ic_node_leader_election_rate
    max Maximum value of the metric. ic_node_leader_election_rate
    average Average value of the metric. ic_node_leader_election_rate
    count ic_node_leader_election_rate
k::uncleanLeaderElections The number of failures to elect a suitable leader per second. If no suitable leader can be chosen (i.e., no available replicas are in sync), an out-of-sync replica will be elected as leader, resulting in data loss that is proportional to how out-of-sync the newly elected leader is.
    one_minute_rate One minute rate of the measured metric. ic_node_unclean_leader_elections
    mean_rate The average rate of the measured metric. ic_node_unclean_leader_elections
    count ic_node_unclean_leader_elections
k::partitionLoadTimeAvg The average time taken by the Consumer Group Coordinator to load the Commit Offset partition in a 30-second interval. This is only available for Kafka 2.4.1+.
    ms ic_node_partition_load_time_avg_milliseconds
k::partitionLoadTimeMax The maximum time taken by the Consumer Group Coordinator to load the Commit Offset partition in a 30-second interval. This is only available for Kafka 2.4.1+.
    ms ic_node_partition_load_time_max_milliseconds
k::groupCompletedRebalanceCount The number of rebalancing operations triggered by a number of factors as the participants of the group change. The rebalancing leads to the reassignment of partitions across the consumers.
    value ic_node_group_completed_rebalance_count
k::groupCompletedRebalanceRate The rate of rebalancing operations.
    value ic_node_group_completed_rebalance_rate
k::replicaFetcherMaxLag The maximum message count lag across all fetchers/topics/partitions.
    value ic_node_replica_fetcher_max_lag
k::replicaFetcherFailedPartitionsCount Incremented when partition truncation fails, a storage exception is encountered, the partition has an older epoch than the current leader, or any other error is encountered during a fetch request.
This is only available for Kafka 2.3.1+.
    value ic_node_replica_fetcher_failed_partitions_count
k::replicaFetcherMinFetchRate The minimum number of messages fetched in a one-minute interval across all fetchers/topics/partitions.
    value ic_node_replica_fetcher_min_fetch_rate
k::replicaFetcherDeadThreadCount The number of failed fetcher threads. This is only available for Kafka 2.4.1+.
    value ic_node_replica_fetcher_dead_thread_count
k::partitionCount The number of partitions on a node. The number of partitions should be evenly distributed across all nodes in a cluster.
    value ic_node_partition_count
k::isrShrinkRate The one minute rate, mean rate, and number of decreases in the number of In-Sync Replicas (ISR) per second. This metric is expected to change when adding or removing nodes from a cluster.
    one_minute_rate One minute rate of the measured metric. ic_node_isr_shrink_rate
    mean_rate The average rate of the measured metric. ic_node_isr_shrink_rate
    count ic_node_isr_shrink_rate
k::isrExpandRate The one minute rate, mean rate, and number of increases in the number of In-Sync Replicas (ISR) per second. This metric is expected to change when adding or removing nodes from a cluster.
    one_minute_rate One minute rate of the measured metric. ic_node_isr_expand_rate
    mean_rate The average rate of the measured metric. ic_node_isr_expand_rate
    count ic_node_isr_expand_rate
k::underMinIsrPartitions The number of partitions where the number of In-Sync Replicas (ISR) is less than the minimum number of in-sync replicas specified.
    value ic_node_under_min_isr_partitions
k::underReplicatedPartitions The number of partitions that do not have enough replicas to meet the desired replication factor.
    value ic_node_under_replicated_partitions
k::leaderCount The number of partitions that a node is a leader for.
The number of partition leaders should be evenly distributed across all nodes in a cluster.
    value ic_node_leader_count
k::kafkaBrokerState The current state of the broker represented as an Integer. Can be one of the following Integer values:
    value ic_node_kafka_broker_state
k::produceRequestTime The count, average, 99th percentile distribution and max time taken to process requests from producers to send data. This is the sum of time spent waiting in the request queue, time spent being processed by the leader, time spent waiting for follower response (if requests.required.acks = 1), and time taken to send the response.
    max ic_node_produce_request_time_milliseconds
    average ic_node_produce_request_time_milliseconds
    count ic_node_produce_request_time
    99thPercentile 99th percentile distribution of time. ic_node_produce_request_time_milliseconds
k::fetchConsumerRequestTime The count, average, 99th percentile distribution and max amount of time taken while processing, and the number of requests from consumers to get new data. This is the sum of time spent waiting in the request queue, time spent being processed by the leader, time spent waiting for the leader to trigger sending the response (determined by fetch.min.bytes and fetch.wait.max.ms in the consumer configuration), and time taken to send the response.
    max ic_node_fetch_consumer_request_time_milliseconds
    average ic_node_fetch_consumer_request_time_milliseconds
    count ic_node_fetch_consumer_request_time
    99thPercentile 99th percentile distribution of time. ic_node_fetch_consumer_request_time_milliseconds
k::fetchFollowerRequestTime The count, average, and max amount of time taken while processing requests from Kafka brokers to get new data from partition leaders.
This is the sum of time spent waiting in the request queue, time spent being processed by the leader, and time taken to send the response.
    max ic_node_fetch_follower_request_time_milliseconds
    average ic_node_fetch_follower_request_time_milliseconds
    count ic_node_fetch_follower_request_time
k::metadataRequestTime The 99th percentile distribution and max amount of time taken while processing requests from Kafka brokers to retrieve metadata. This is the sum of time spent waiting in the request queue, time spent being processed by the leader, and time taken to send the response.
    max ic_node_metadata_request_time_milliseconds
    99thPercentile 99th percentile distribution of time. ic_node_metadata_request_time_milliseconds
k::produceRequestLocalTime The 99th percentile distribution and max amount of time taken by the leader to process requests from producers to send data.
    max ic_node_produce_request_local_time_milliseconds
    99thPercentile 99th percentile distribution of time. ic_node_produce_request_local_time_milliseconds
k::fetchConsumerRequestLocalTime The 99th percentile distribution and max amount of time spent being processed by the leader from consumer requests to get new data.
    max ic_node_fetch_consumer_request_local_time_milliseconds
    99thPercentile 99th percentile distribution of time. ic_node_fetch_consumer_request_local_time_milliseconds
k::metadataRequestLocalTime The 99th percentile distribution and max amount of time spent being processed by the leader while processing requests from Kafka brokers to retrieve metadata.
    max ic_node_metadata_request_local_time_milliseconds
    99thPercentile 99th percentile distribution of time. ic_node_metadata_request_local_time_milliseconds
k::produceRequestRemoteTime The 99th percentile distribution and max amount of time taken waiting for the follower to process requests from producers to send data.
    max ic_node_produce_request_remote_time_milliseconds
    99thPercentile 99th percentile distribution of time.
    ic_node_produce_request_remote_time_milliseconds
k::fetchConsumerRequestRemoteTime The 99th percentile distribution and max amount of time waiting for the follower from consumer requests to get new data.
    max ic_node_fetch_consumer_request_remote_time_milliseconds
    99thPercentile 99th percentile distribution of time. ic_node_fetch_consumer_request_remote_time_milliseconds
k::metadataRequestRemoteTime The 99th percentile distribution and max amount of time waiting for the follower while processing requests from Kafka brokers to retrieve metadata.
    max ic_node_metadata_request_remote_time_milliseconds
    99thPercentile 99th percentile distribution of time. ic_node_metadata_request_remote_time_milliseconds
k::produceRequestQueueTime The 99th percentile distribution and max amount of time the request waits in the request queue to process requests from producers to send data.
    max ic_node_produce_request_queue_time_milliseconds
    99thPercentile 99th percentile distribution of time. ic_node_produce_request_queue_time_milliseconds
k::fetchConsumerRequestQueueTime The 99th percentile distribution and max amount of time the request waits in the request queue from consumer requests to get new data.
    max ic_node_fetch_consumer_request_queue_time_milliseconds
    99thPercentile 99th percentile distribution of time. ic_node_fetch_consumer_request_queue_time_milliseconds
k::metadataRequestQueueTime The 99th percentile distribution and max amount of time the request waits in the request queue while processing requests from Kafka brokers to retrieve metadata.
    max ic_node_metadata_request_queue_time_milliseconds
    99thPercentile 99th percentile distribution of time. ic_node_metadata_request_queue_time_milliseconds
k::produceResponseQueueTime The 99th percentile distribution and max amount of time the request waits in the response queue to process requests from producers to send data.
    max ic_node_produce_response_queue_time_milliseconds
    99thPercentile 99th percentile distribution of time.
    ic_node_produce_response_queue_time_milliseconds
k::fetchConsumerResponseQueueTime The 99th percentile distribution and max amount of time the request waits in the response queue from consumer requests to get new data.
    max ic_node_fetch_consumer_response_queue_time_milliseconds
    99thPercentile 99th percentile distribution of time. ic_node_fetch_consumer_response_queue_time_milliseconds
k::metadataResponseQueueTime The 99th percentile distribution and max amount of time the request waits in the response queue while processing requests from Kafka brokers to retrieve metadata.
    max ic_node_metadata_response_queue_time_milliseconds
    99thPercentile 99th percentile distribution of time. ic_node_metadata_response_queue_time_milliseconds
k::producePurgatorySize The number of produce requests currently waiting in purgatory.
    value ic_node_produce_purgatory_size
k::fetchPurgatorySize The number of fetch requests currently waiting in purgatory.
    value ic_node_fetch_purgatory_size
k::networkProcessorAvgIdlePercent The average percentage of time the network processors are idle, expressed as a number between 0 and 1. Kafka’s network processor threads are responsible for reading and writing data to Kafka clients across the network.
    value ic_node_network_processor_avg_idle_percent
k::requestHandlerAvgIdlePercent The average percentage of time Kafka’s request handler threads are idle, expressed as a number between 0 and 1. Kafka’s request handler threads are responsible for servicing client requests, including reading and writing messages to disk.
    one_minute_rate One minute rate of the measured metric. ic_node_request_handler_avg_idle_percent
    mean_rate The average rate of the measured metric. ic_node_request_handler_avg_idle_percent
    count ic_node_request_handler_avg_idle_percent
k::produceMessageConversionsPerSec The one minute rate, mean rate, and number of produce requests per second that require message format conversion.
    one_minute_rate One minute rate of the measured metric.
ic_node_produce_message_conversions_per_sec mean_rate The average rate of the measured metric. ic_node_produce_message_conversions_per_sec count ic_node_produce_message_conversions_per_sec k::fetchMessageConversionsPerSec The one minute rate, mean rate, and number of fetch requests per second that require message format conversion.one_minute_rate One minute rate of the measured metric. ic_node_fetch_message_conversions_per_sec mean_rate The average rate of the measured metric. ic_node_fetch_message_conversions_per_sec count ic_node_fetch_message_conversions_per_sec k::slaConsumerLatency The average and maximum time in milliseconds between a synthetic transaction message being sent by the producer and being received by the consumer.average Average value of the metric. ic_node_sla_consumer_latency max Maximum value of the metric. ic_node_sla_consumer_latency k::slaConsumerRecordsProcessed The number of synthetic transaction messages being successfully consumed and processed on each broker.count ic_node_sla_consumer_records_processed k::slaProducerLatencyMs The average and maximum time taken in milliseconds to send a synthetic transaction message to each broker that is successfully replicated to the required number of minimum in-sync replicas.average Average value of the metric. ic_node_sla_producer_latency_ms max Maximum value of the metric. 
ic_node_sla_producer_latency_ms k::slaProducerMessagesProcessed The number of synthetic transaction messages being successfully produced to each broker.count ic_node_sla_producer_messages_processed k::slaProducerErrors The number of errors encountered when producing synthetic transaction messages.count ic_node_sla_producer_errors k::youngGenLastGC Time taken for GC to run young generation during the latest event.value ic_node_young_gen_last_g_c k::oldGengcCollectionTime Total time taken for GC to run old generation.value ic_node_old_gengc_collection_time k::youngGengcCollectionTime Total time taken for GC to run young generation.value ic_node_young_gengc_collection_time k::logFlushRate The total count, one minute rate and mean rate of Kafka log flush.one_minute_rate One minute rate of the measured metric. ic_node_log_flush_rate mean_rate The average rate of the measured metric. ic_node_log_flush_rate count ic_node_log_flush_rate k::logFlushTime The average time and maximum time of Kafka log flush.max ic_node_log_flush_time_milliseconds average ic_node_log_flush_time_milliseconds k::produceRequestsPerSec The one minute rate, mean rate, and number of produce requests, since the beginning of program running. This only works for period below 3h.count ic_node_produce_requests_per_sec mean_rate ic_node_produce_requests_per_sec one_minute_rate ic_node_produce_requests_per_sec k::fetchConsumerRequestsPerSec The one minute rate, mean rate, and number of requests from consumer requests to get new data, since the beginning of program running. This only works for period below 3h.count ic_node_fetch_consumer_requests_per_sec mean_rate ic_node_fetch_consumer_requests_per_sec one_minute_rate ic_node_fetch_consumer_requests_per_sec k::fetchFollowerRequestsPerSec The one minute rate, mean rate, and number of requests from Kafka brokers to get new data from partition leaders, since the beginning of program running. 
This only works for period below 3h.count ic_node_fetch_follower_requests_per_sec mean_rate ic_node_fetch_follower_requests_per_sec one_minute_rate ic_node_fetch_follower_requests_per_sec k::controlPlaneNetworkProcessorAvgIdlePercent The idle percentage of the pinned control plane network thread.value ic_node_control_plane_network_processor_avg_idle_percent k::brokerFetcherLagConsumerLag The lag in the number of messages per follower replica, aggregated at the broker level. A broker does not report this metric if it is not following any partition, for example when all topics in the cluster are created with a replication factor of 1.count ic_node_broker_fetcher_lag_consumer_lag k::metadataApplyErrorCount The number of errors encountered by the BrokerMetadataPublisher while applying a new MetadataImage based on the latest MetadataDelta.value ic_node_metadata_apply_error_count k::metadataLoadErrorCount The number of errors encountered by the BrokerMetadataListener while loading the metadata log and generating a new MetadataDelta based on it.value ic_node_metadata_load_error_count k::commitLatencyAvg The average time in milliseconds to commit an entry in the raft log.ms ic_node_commit_latency_avg_milliseconds k::commitLatencyMax The maximum time in milliseconds to commit an entry in the raft log.ms ic_node_commit_latency_max_milliseconds k::appendRecordsRate The average number of records appended per sec by the leader of the raft quorum.one_minute_rate One minute rate of the measured metric. ic_node_append_records_rate mean_rate The average rate of the measured metric. 
ic_node_append_records_rate count ic_node_append_records_rate k::electionLatencyMax The maximum time in milliseconds spent on electing a new leader.ms ic_node_election_latency_max_milliseconds k::electionLatencyAvg The average time in milliseconds spent on electing a new leader.ms ic_node_election_latency_avg_milliseconds k::pollIdleRatioAvg The average fraction of time the client's poll() is idle as opposed to waiting for the user code to process records.value ic_node_poll_idle_ratio_avg k::currentState The current state of this member; possible values are leader, candidate, voted, follower, unattached.state ic_node_current_state k::currentStateKraft The current state of this member; possible values are leader, candidate, voted, follower, unattached.state ic_node_current_state_kraft k::highWatermark The high watermark maintained on this member; -1 if it is unknown.value ic_node_high_watermark k::currentLeader The current quorum leader's id; -1 indicates unknown.value ic_node_current_leader k::logEndOffset The current raft log end offset.value ic_node_log_end_offset k::fetchRecordsRate The average number of records fetched from the leader of the raft quorum.one_minute_rate One minute rate of the measured metric. ic_node_fetch_records_rate mean_rate The average rate of the measured metric. 
ic_node_fetch_records_rate count ic_node_fetch_records_rate k::currentEpoch The current quorum epoch.value ic_node_current_epoch k::globalPartitionCount The number of global partitions according to this Controller.value ic_node_global_partition_count k::globalTopicCount The number of global topics according to this Controller.value ic_node_global_topic_count k::lastAppliedRecordLagMs The difference between current time and the timestamp in milliseconds of the last record from the cluster metadata partition applied by this Controller.value ic_node_last_applied_record_lag_ms_milliseconds k::lastAppliedRecordOffset The offset of the last record from the cluster metadata partition applied by this Controller.value ic_node_last_applied_record_offset k::lastAppliedRecordTimestamp The timestamp in milliseconds of the last record from the cluster metadata partition applied by this Controller.value ic_node_last_applied_record_timestamp k::newActiveControllersCount Counts the number of times this node has seen a new controller elected. A transition to the "no leader" state is not counted here. If the same controller as before becomes active, that still counts. NOTE: This metric is for kraft onlyvalue ic_node_new_active_controllers_count k::timedOutBrokerHeartbeatCount The number of broker heartbeats that timed out on this controller since the process was started. Note that only active controllers handle heartbeats, so only they will see increases in this metric. NOTE: This metric is for kraft onlyvalue ic_node_timed_out_broker_heartbeat_count k::currentMetadataVersion Outputs the feature level of the current effective metadata version. NOTE: This metric is for kraft onlyvalue ic_node_current_metadata_version k::currentControllerId The CurrentControllerId metric shows the ID of the controller, as seen by the node in question. If the current node doesn't think there is an active controller, the value of this metric will be -1. 
NOTE: This metric is for kraft onlyvalue ic_node_current_controller_id k::remoteLogReaderTaskQueueSize Size of the queue holding remote storage read tasks value ic_node_remote_log_reader_task_queue_size k::remoteLogReaderAvgIdlePercent Average idle percent of thread pool for processing remote storage read tasks.value ic_node_remote_log_reader_avg_idle_percent k::remoteLogManagerTasksAvgIdlePercent Average idle percent of thread pool for copying data to remote storage. value ic_node_remote_log_manager_tasks_avg_idle_percent k::expiresPerSec The number of expired remote fetches per second. mean_rate The average rate of the measured metric. ic_node_expires_per_sec one_minute_rate One minute rate of the measured metric. ic_node_expires_per_sec k::activeControllerCountKraft The number of active controllers on the node. In effect it is 0 or 1. The active controller of a cluster is usually the first node to start up in the cluster.value ic_node_active_controller_count_kraft k::offlinePartitionsKraft The number of partitions without an active leader. Any partitions that are offline will not be accessible since read and write operations are only performed on the leader of a partition.value ic_node_offline_partitions_kraft k::activeBrokerCountKraft The number of registered and unfenced brokers.value ic_node_active_broker_count_kraft k::metadataErrorCountKraft The number of times this controller node has encountered an error during metadata log processing.value ic_node_metadata_error_count_kraft k::lastCommittedRecordOffsetKraft The offset of the last record committed to this Controller. 
This is always advancing due to the NoOpRecord, and can be used to check cluster availability.value ic_node_last_committed_record_offset_kraft k::fencedBrokerCountKraft The number of registered but fenced brokers.value ic_node_fenced_broker_count_kraft k::preferredReplicaImbalanceCountKraft The count of topic partitions for which the leader is not the preferred leader.value ic_node_preferred_replica_imbalance_count_kraft k::globalPartitionCountKraft The number of global partitions according to this Controller.value ic_node_global_partition_count_kraft k::globalTopicCountKraft The number of global topics according to this Controller.value ic_node_global_topic_count_kraft k::lastAppliedRecordLagMsKraft The difference between current time and the timestamp in milliseconds of the last record from the cluster metadata partition applied by this Controller.value ic_node_last_applied_record_lag_ms_kraft_milliseconds k::lastAppliedRecordOffsetKraft The offset of the last record from the cluster metadata partition applied by this Controller.value ic_node_last_applied_record_offset_kraft k::lastAppliedRecordTimestampKraft The timestamp in milliseconds of the last record from the cluster metadata partition applied by this Controller.value ic_node_last_applied_record_timestamp_kraft k::newActiveControllersCountKraft Counts the number of times this node has seen a new controller elected. A transition to the "no leader" state is not counted here. If the same controller as before becomes active, that still counts. NOTE: This metric is for kraft onlyvalue ic_node_new_active_controllers_count_kraft k::timedOutBrokerHeartbeatCountKraft The number of broker heartbeats that timed out on this controller since the process was started. Note that only active controllers handle heartbeats, so only they will see increases in this metric. 
NOTE: This metric is for kraft onlyvalue ic_node_timed_out_broker_heartbeat_count_kraft k::commitLatencyAvgKraft The average time in milliseconds to commit an entry in the raft log.ms ic_node_commit_latency_avg_kraft_milliseconds k::commitLatencyMaxKraft The maximum time in milliseconds to commit an entry in the raft log.ms ic_node_commit_latency_max_kraft_milliseconds k::appendRecordsRateKraft The average number of records appended per sec by the leader of the raft quorum.one_minute_rate One minute rate of the measured metric. ic_node_append_records_rate_kraft mean_rate The average rate of the measured metric. ic_node_append_records_rate_kraft count ic_node_append_records_rate_kraft k::electionLatencyMaxKraft The maximum time in milliseconds spent on electing a new leader.ms ic_node_election_latency_max_kraft_milliseconds k::electionLatencyAvgKraft The average time in milliseconds spent on electing a new leader.ms ic_node_election_latency_avg_kraft_milliseconds k::pollIdleRatioAvgKraft The average fraction of time the client's poll() is idle as opposed to waiting for the user code to process records.value ic_node_poll_idle_ratio_avg_kraft k::currentStateKraft The current state of this member; possible values are leader, candidate, voted, follower, unattached.state ic_node_current_state_kraft k::highWatermarkKraft The high watermark maintained on this member; -1 if it is unknown.value ic_node_high_watermark_kraft k::currentLeaderKraft The current quorum leader's id; -1 indicates unknown.value ic_node_current_leader_kraft k::logEndOffsetKraft The current raft log end offset.value ic_node_log_end_offset_kraft k::fetchRecordsRateKraft The average number of records fetched from the leader of the raft quorum.one_minute_rate One minute rate of the measured metric. ic_node_fetch_records_rate_kraft mean_rate The average rate of the measured metric. 
ic_node_fetch_records_rate_kraft count ic_node_fetch_records_rate_kraft k::currentEpochKraft The current quorum epoch.value ic_node_current_epoch_kraft k::logFlushRateKraft The total count, one minute rate and mean rate of Kafka log flush.one_minute_rate One minute rate of the measured metric. ic_node_log_flush_rate_kraft mean_rate The average rate of the measured metric. ic_node_log_flush_rate_kraft count ic_node_log_flush_rate_kraft k::logFlushTimeKraft The average time and maximum time of Kafka log flush.max ic_node_log_flush_time_kraft_milliseconds average ic_node_log_flush_time_kraft_milliseconds k::leaderElectionRateKraft The count, average, max, and one minute rate of leader elections per second.one_minute_rate One minute rate of the measured metric. ic_node_leader_election_rate_kraft max Maximum value of the metric. ic_node_leader_election_rate_kraft average Average value of the metric. ic_node_leader_election_rate_kraft count ic_node_leader_election_rate_kraft k::uncleanLeaderElectionsKraft The number of failures to elect a suitable leader per second. In the case that no suitable leader can be chosen (ie. no available replicas are in sync), an out-of-sync replica will be elected as leader, resulting in data loss that is proportional to how out-of-sync the newly elected leader is.one_minute_rate One minute rate of the measured metric. ic_node_unclean_leader_elections_kraft mean_rate The average rate of the measured metric. ic_node_unclean_leader_elections_kraft count ic_node_unclean_leader_elections_kraft k::youngGenLastGCKraft Time taken for GC to run young generation during the latest event.value ic_node_young_gen_last_g_c_kraft k::oldGengcCollectionTimeKraft Total time taken for GC to run old generation.value ic_node_old_gengc_collection_time_kraft k::youngGengcCollectionTimeKraft Total time taken for GC to run young generation.value ic_node_young_gengc_collection_time_kraft Per-topic metric names follow the format kt::{topic}::{metricName}. 
Optionally, a ‘sub-type’ may be specified to return a specific part of the metric - kt::{topic}::{metricName}:{subType}
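As a minimal sketch of the naming format above, the following Python helper assembles a per-topic metric name and appends the optional sub-type. The function name `topic_metric_name` and the topic `orders` are hypothetical and used only for illustration; the sub-type is joined with a single colon because that is how the format is written above.

```python
from typing import Optional


def topic_metric_name(topic: str, metric_name: str, sub_type: Optional[str] = None) -> str:
    """Build a per-topic metric name in the kt::{topic}::{metricName} format.

    Hypothetical helper: it only formats the string and does not check that
    the metric or sub-type actually exists.
    """
    if not topic or not metric_name:
        raise ValueError("topic and metric_name must be non-empty")
    name = f"kt::{topic}::{metric_name}"
    # The optional sub-type follows a single colon, as in the format above.
    if sub_type:
        name += f":{sub_type}"
    return name


# Example with a hypothetical topic called "orders".
print(topic_metric_name("orders", "bytesInPerTopic", "one_minute_rate"))
```

For metrics whose description says a sub-type must be specified, the third argument would always be passed; for single-valued metrics such as diskUsage it can be omitted.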
kt::{topic}::messagesInPerTopic The rate of messages received by the topic. One sub-type must be specified.one_minute_rate One minute rate of the measured metric. ic_topic_messages_in_per_topic mean_rate The average rate of the measured metric. ic_topic_messages_in_per_topic kt::{topic}::bytesInPerTopic The rate of incoming bytes to the topic per second. One sub-type must be specified.one_minute_rate One minute rate of the measured metric. ic_topic_bytes_in_per_topic mean_rate The average rate of the measured metric. ic_topic_bytes_in_per_topic kt::{topic}::bytesOutPerTopic The rate of outgoing bytes from the topic. One sub-type must be specified.one_minute_rate One minute rate of the measured metric. ic_topic_bytes_out_per_topic mean_rate The average rate of the measured metric. ic_topic_bytes_out_per_topic kt::{topic}::fetchMessageConversionsPerTopic The amount and rate of fetch request messages which required message format conversions for the topic. One sub-type must be specified.one_minute_rate One minute rate of the measured metric. ic_topic_fetch_message_conversions_per_topic mean_rate The average rate of the measured metric. ic_topic_fetch_message_conversions_per_topic count ic_topic_fetch_message_conversions_per_topic kt::{topic}::produceMessageConversionsPerTopic The amount and rate of produce request messages which required message format conversions for the topic. One sub-type must be specified.one_minute_rate One minute rate of the measured metric. ic_topic_produce_message_conversions_per_topic mean_rate The average rate of the measured metric. ic_topic_produce_message_conversions_per_topic count ic_topic_produce_message_conversions_per_topic kt::{topic}::failedFetchMessagePerTopic The amount and rate of failed fetch requests to the topic. One sub-type must be specified.one_minute_rate One minute rate of the measured metric. ic_topic_failed_fetch_message_per_topic mean_rate The average rate of the measured metric. 
ic_topic_failed_fetch_message_per_topic count ic_topic_failed_fetch_message_per_topic kt::{topic}::failedProduceMessagePerTopic The amount and rate of failed produce requests to the topic. One sub-type must be specified.one_minute_rate One minute rate of the measured metric. ic_topic_failed_produce_message_per_topic mean_rate The average rate of the measured metric. ic_topic_failed_produce_message_per_topic count ic_topic_failed_produce_message_per_topic kt::{topic}::diskUsage The total size of the files on disk associated with the topic, summed across all partitions.disk_usage_kilobytes The total size of the files on disk associated with the topic, summed across all partitions. ic_topic_disk_usage kt::{topic}::remoteCopyLagBytes Bytes which are eligible for tiering, but are not in remote storage yet.value ic_topic_remote_copy_lag_bytes kt::{topic}::remoteDeleteLagBytes Tiered bytes which are eligible for deletion, but have not been deleted yet.value ic_topic_remote_delete_lag_bytes kt::{topic}::remoteLogSizeBytes The total size of a remote log in bytes.value ic_topic_remote_log_size_bytes kt::{topic}::remoteFetchBytesPerSecPerTopic Rate of bytes read from remote storage per topic. mean_rate The average rate of the measured metric. ic_topic_remote_fetch_bytes_per_sec_per_topic one_minute_rate One minute rate of the measured metric. ic_topic_remote_fetch_bytes_per_sec_per_topic kt::{topic}::remoteFetchRequestsPerSecPerTopic Rate of read requests from remote storage per topic. mean_rate The average rate of the measured metric. ic_topic_remote_fetch_requests_per_sec_per_topic one_minute_rate One minute rate of the measured metric. ic_topic_remote_fetch_requests_per_sec_per_topic kt::{topic}::remoteFetchErrorsPerSecPerTopic Rate of read errors from remote storage per topic.mean_rate The average rate of the measured metric. ic_topic_remote_fetch_errors_per_sec_per_topic one_minute_rate One minute rate of the measured metric. 
ic_topic_remote_fetch_errors_per_sec_per_topic kt::{topic}::remoteCopyBytesPerSecPerTopic Rate of bytes copied to remote storage per topic. mean_rate The average rate of the measured metric. ic_topic_remote_copy_bytes_per_sec_per_topic one_minute_rate One minute rate of the measured metric. ic_topic_remote_copy_bytes_per_sec_per_topic kt::{topic}::remoteCopyRequestsPerSecPerTopic Rate of write requests to remote storage per topic. mean_rate The average rate of the measured metric. ic_topic_remote_copy_requests_per_sec_per_topic one_minute_rate One minute rate of the measured metric. ic_topic_remote_copy_requests_per_sec_per_topic kt::{topic}::remoteCopyErrorsPerSecPerTopic Rate of write errors from remote storage per topic.mean_rate The average rate of the measured metric. ic_topic_remote_copy_errors_per_sec_per_topic one_minute_rate One minute rate of the measured metric. ic_topic_remote_copy_errors_per_sec_per_topic Per-user metric names follow the format ku::{user}::{metricName}. Per-user metric can take up to 50 minutes to be refreshed in case of user removal or user becoming idle. Optionally, a ‘sub-type’ may be specified to return a specific part of the metric - ku::{user}::{metricName}:{subType}
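Going the other way, a client may need to split a per-user metric name back into its parts, for example when grouping results by user. The sketch below is one possible approach, not part of the API: `UserMetric` and `parse_user_metric` are hypothetical names, and it assumes that user names contain no colons.

```python
from typing import NamedTuple, Optional


class UserMetric(NamedTuple):
    user: str
    metric_name: str
    sub_type: Optional[str]


def parse_user_metric(name: str) -> UserMetric:
    """Split a ku::{user}::{metricName}[:{subType}] string into its parts.

    Hypothetical helper; assumes the user name itself contains no colons.
    """
    parts = name.split("::")
    if len(parts) != 3 or parts[0] != "ku":
        raise ValueError(f"not a per-user metric name: {name!r}")
    _, user, tail = parts
    # The optional sub-type follows a single colon after the metric name.
    metric_name, _, sub_type = tail.partition(":")
    if not user or not metric_name:
        raise ValueError(f"missing user or metric name in: {name!r}")
    return UserMetric(user, metric_name, sub_type or None)


# Example with a hypothetical user called "alice".
print(parse_user_metric("ku::alice::produceBandwidthQuotaPerUser:byte_rate"))
```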
ku::{user}::produceBandwidthQuotaPerUser Bandwidth quota metrics (produce) per user.byte_rate ic_user_produce_bandwidth_quota_per_user throttle_time ic_user_produce_bandwidth_quota_per_user ku::{user}::fetchBandwidthQuotaPerUser Bandwidth quota metrics (fetch) per user.byte_rate ic_user_fetch_bandwidth_quota_per_user throttle_time ic_user_fetch_bandwidth_quota_per_user kc::taskCount Number of tasks currently assigned to each worker node.value ic_node_task_count kc::connectorCount Number of connectors currently assigned to each worker node.value ic_node_connector_count kc::connectorStartupAttemptsTotal Number of times a connector has been instructed to start on each worker node.value ic_node_connector_startup_attempts_total kc::connectorStartupFailurePercentage Percentage of connector start-up attempts that have failed to complete.percentage ic_node_connector_startup_failure_percentage kc::connectorStartupFailureTotal Number of times a connector has been instructed to start and failed to do so.value ic_node_connector_startup_failure_total kc::connectorStartupSuccessPercentage Percentage of connector start-up attempts that have successfully completed.percentage ic_node_connector_startup_success_percentage kc::connectorStartupSuccessTotal Number of times a connector has been instructed to start and has succeeded in doing so.value ic_node_connector_startup_success_total kc::taskStartupAttemptsTotal Number of times a task has been instructed to start on each worker node.value ic_node_task_startup_attempts_total kc::taskStartupFailurePercentage Percentage of task start-up attempts that have failed to complete.percentage ic_node_task_startup_failure_percentage kc::taskStartupFailureTotal Number of times a task has been instructed to start and failed to do so.value ic_node_task_startup_failure_total kc::taskStartupSuccessPercentage Percentage of task start-up attempts that have successfully completed.percentage ic_node_task_startup_success_percentage 
kc::taskStartupSuccessTotal Number of times a task has been instructed to start and has succeeded in doing so.value ic_node_task_startup_success_total kc::leaderName Identity of the current leader worker node. Typically this is the IP address of the leader.state ic_node_leader_name kc::isLeader Monitors the number of worker nodes which believe they are the leader for the Kafka Connect cluster.value ic_node_is_leader kc::completedRebalancesTotal Number of rebalances that have completed since Kafka Connect has started (per node).value ic_node_completed_rebalances_total kc::epoch Monotonically increasing number that indicates the current state of assigned tasks. Will increase by one for each completed rebalance.value ic_node_epoch kc::timeSinceLastRebalanceMs Time since the last successful rebalance that each node participated in (per node, in milliseconds).ms ic_node_time_since_last_rebalance_ms_milliseconds kc::rebalanceAvgTimeMs The average time each rebalance has taken to complete (per node, in milliseconds).ms ic_node_rebalance_avg_time_ms_milliseconds kc::rebalanceMaxTimeMs The maximum time each rebalance has taken to complete (per node, in milliseconds).ms ic_node_rebalance_max_time_ms_milliseconds kc::rebalancing Whether or not the worker is currently rebalancing (per node).value ic_node_rebalancing kc::restApiAvailable Whether or not the Kafka Connect REST API is currently available.value ic_node_rest_api_available kc::latencyRecordsProcessed The number of messages processed to produce the latencyMedianMs measure. Only available if attached to an Instaclustr managed Kafka cluster.value ic_node_latency_records_processed kc::latencyMedianMs The time taken from a record being produced on the connected Kafka Cluster to it being read on the Kafka Connect cluster. Measured using synthetic messages. 
Only available if attached to an Instaclustr managed Kafka cluster.ms ic_node_latency_median_ms_milliseconds kc::customConnectorLoadStatus The result of loading custom connectors from external source. Can be one of FAILED, SUCCEEDED, UNDEFINED. The value is UNDEFINED when the cluster does not have any custom connector or due to an error while collecting the metrics.state ic_node_custom_connector_load_status Task General, Task Error, Sink Task and Source Task metrics are listed below:
kct::<connector-name>::<task-id>::batchSizeAvg The average size of the batches processed by the connector.value ic_connector_task_batch_size_avg kct::<connector-name>::<task-id>::offsetCommitAvgTimeMs The average time in milliseconds taken by this task to commit offsets.ms ic_connector_task_offset_commit_avg_time_ms_milliseconds kct::<connector-name>::<task-id>::offsetCommitFailurePercentage The average percentage of this task’s offset commit attempts that failed.percentage ic_connector_task_offset_commit_failure_percentage kct::<connector-name>::<task-id>::pauseRatio The fraction of time this task has spent in the pause state.value ic_connector_task_pause_ratio kct::<connector-name>::<task-id>::status The status of the connector task. Can be one of ‘unassigned’, ‘running’, ‘paused’ or ‘failed’.state ic_connector_task_status kct::<connector-name>::<task-id>::deadletterqueueProduceFailures The number of failed writes to the dead letter queue.value ic_connector_task_deadletterqueue_produce_failures kct::<connector-name>::<task-id>::deadletterqueueProduceRequests The number of attempted writes to the dead letter queue.value ic_connector_task_deadletterqueue_produce_requests kct::<connector-name>::<task-id>::lastErrorTimestamp The epoch timestamp when this task last encountered an error.value ic_connector_task_last_error_timestamp kct::<connector-name>::<task-id>::totalErrorsLogged The number of errors that were logged.value ic_connector_task_total_errors_logged kct::<connector-name>::<task-id>::totalRecordErrors The number of record processing errors in this task.value ic_connector_task_total_record_errors kct::<connector-name>::<task-id>::totalRecordFailures The number of record processing failures in this task.value ic_connector_task_total_record_failures kct::<connector-name>::<task-id>::totalRecordsSkipped The number of records skipped due to errors.value ic_connector_task_total_records_skipped kct::<connector-name>::<task-id>::totalRetries The number of operations 
retried.value ic_connector_task_total_retries kct::<connector-name>::<task-id>::offsetCommitCompletionRate The average per-second number of offset commit completions that were completed successfully.value ic_connector_task_offset_commit_completion_rate kct::<connector-name>::<task-id>::offsetCommitCompletionTotal The total number of offset commit completions that were completed successfully.value ic_connector_task_offset_commit_completion_total kct::<connector-name>::<task-id>::offsetCommitSeqNo The current sequence number for offset commits.value ic_connector_task_offset_commit_seq_no kct::<connector-name>::<task-id>::offsetCommitSkipRate The average per-second number of offset commit completions that were received too late and skipped/ignored.value ic_connector_task_offset_commit_skip_rate kct::<connector-name>::<task-id>::offsetCommitSkipTotal The total number of offset commit completions that were received too late and skipped/ignored.value ic_connector_task_offset_commit_skip_total kct::<connector-name>::<task-id>::partitionCount The number of topic partitions assigned to this task belonging to the named sink connector in this worker.value ic_connector_task_partition_count kct::<connector-name>::<task-id>::putBatchAvgTimeMs The average time taken by this task to put a batch of sink records.ms ic_connector_task_put_batch_avg_time_ms_milliseconds kct::<connector-name>::<task-id>::sinkRecordActiveCount The number of records that have been read from Kafka but not yet completely committed/flushed/acknowledged by the sink task.value ic_connector_task_sink_record_active_count kct::<connector-name>::<task-id>::sinkRecordActiveCountAvg The average number of records that have been read from Kafka but not yet completely committed/flushed/acknowledged by the sink task.value ic_connector_task_sink_record_active_count_avg kct::<connector-name>::<task-id>::sinkRecordLagMax The maximum lag in terms of number of records behind the consumer the offset commits are for any topic 
partitions.value ic_connector_task_sink_record_lag_max kct::<connector-name>::<task-id>::sinkRecordReadRate The average per-second number of records read from Kafka for this task belonging to the named sink connector in this worker. This is before transformations are applied.value ic_connector_task_sink_record_read_rate kct::<connector-name>::<task-id>::sinkRecordReadTotal The total number of records read from Kafka by this task belonging to the named sink connector in this worker, since the task was last restarted.value ic_connector_task_sink_record_read_total kct::<connector-name>::<task-id>::sinkRecordSendRate The average per-second number of records output from the transformations and sent/put to this task belonging to the named sink connector in this worker. This is after transformations are applied and excludes any records filtered out by the transformations.value ic_connector_task_sink_record_send_rate kct::<connector-name>::<task-id>::sinkRecordSendTotal The total number of records output from the transformations and sent/put to this task belonging to the named sink connector in this worker, since the task was last restarted.value ic_connector_task_sink_record_send_total kct::<connector-name>::<task-id>::pollBatchAvgTimeMs The average time in milliseconds taken by this task to poll for a batch of source records.ms ic_connector_task_poll_batch_avg_time_ms_milliseconds kct::<connector-name>::<task-id>::sourceRecordActiveCount The number of records that have been produced by this task but not yet completely written to Kafka.value ic_connector_task_source_record_active_count kct::<connector-name>::<task-id>::sourceRecordActiveCountAvg The average number of records that have been produced by this task but not yet completely written to Kafka.value ic_connector_task_source_record_active_count_avg kct::<connector-name>::<task-id>::sourceRecordPollRate The average per-second number of records produced/polled (before transformation) by this task belonging to the 
named source connector in this worker.value ic_connector_task_source_record_poll_rate kct::<connector-name>::<task-id>::sourceRecordPollTotal The total number of records produced/polled (before transformation) by this task belonging to the named source connector in this worker.value ic_connector_task_source_record_poll_total kct::<connector-name>::<task-id>::sourceRecordWriteRate The average per-second number of records output from the transformations and written to Kafka for this task belonging to the named source connector in this worker. This is after transformations are applied and excludes any records filtered out by the transformations.value ic_connector_task_source_record_write_rate kct::<connector-name>::<task-id>::sourceRecordWriteTotal The number of records output from the transformations and written to Kafka for this task belonging to the named source connector in this worker, since the task was last restarted.value ic_connector_task_source_record_write_total kcc::<connectorName>::connectorUnassignedTaskCount The number of unassigned tasks of the connector. This is only available for Kafka Connect 2.5.1+.value ic_connector_connector_unassigned_task_count kcc::<connectorName>::connectorTotalTaskCount The total number of tasks assigned to the connector. This is only available for Kafka Connect 2.5.1+.value ic_connector_connector_total_task_count kcc::<connectorName>::connectorRunningTaskCount The number of running tasks assigned to the connector. This is only available for Kafka Connect 2.5.1+.value ic_connector_connector_running_task_count kcc::<connectorName>::connectorDestroyedTaskCount The number of destroyed tasks assigned to the connector. This is only available for Kafka Connect 2.5.1+.value ic_connector_connector_destroyed_task_count kcc::<connectorName>::connectorFailedTaskCount The number of failed tasks assigned to the connector. 
This is only available for Kafka Connect 2.5.1+.value ic_connector_connector_failed_task_count kcc::<connectorName>::connectorPausedTaskCount The number of paused tasks assigned to the connector. This is only available for Kafka Connect 2.5.1+.value ic_connector_connector_paused_task_count kc::mm::source::<target>::<topic-name-in-target>::recordCount Number of records replicated by the mirroring source connector.count ic_mirror_source_connector_record_count kc::mm::source::<target>::<topic-name-in-target>::byteCount Byte count replicated by the mirroring source connector.count ic_mirror_source_connector_byte_count kc::mm::source::<target>::<topic-name-in-target>::recordRate Record replication rate of the mirroring source connector.value ic_mirror_source_connector_record_rate kc::mm::source::<target>::<topic-name-in-target>::byteRate Byte replication rate of the mirroring source connector.value ic_mirror_source_connector_byte_rate kc::mm::source::<target>::<topic-name-in-target>::recordAgeMs Age of each record at the time it is consumed by the mirroring source connector.value ic_mirror_source_connector_record_age_ms_milliseconds min ic_mirror_source_connector_record_age_ms_milliseconds max ic_mirror_source_connector_record_age_ms_milliseconds kc::mm::source::<target>::<topic-name-in-target>::replicationLatencyMs Timespan between each record’s timestamp and downstream acknowledgment.value ic_mirror_source_connector_replication_latency_ms_milliseconds min ic_mirror_source_connector_replication_latency_ms_milliseconds max ic_mirror_source_connector_replication_latency_ms_milliseconds kc::mm::checkpoint::<source>::<target>::<group>::<topic-name-in-target>::checkpointLatencyMs Timespan between consumer group commit and downstream checkpoint acknowledgment.value ic_mirror_checkpoint_connector_checkpoint_latency_ms_milliseconds min ic_mirror_checkpoint_connector_checkpoint_latency_ms_milliseconds max ic_mirror_checkpoint_connector_checkpoint_latency_ms_milliseconds 
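The Kafka Connect metric names above contain angle-bracket placeholders such as <connector-name> and <task-id> that must be replaced with real values before the name is used. The following is a minimal sketch of that substitution in Python. The helper name fill_metric_template and the connector name "my-sink" are hypothetical and not part of the API, and the sketch does not attempt to escape special characters in the supplied values.

```python
import re

# Matches one angle-bracket placeholder, e.g. <connector-name> or <task-id>.
PLACEHOLDER = re.compile(r"<([^<>]+)>")


def fill_metric_template(template, values):
    """Replace every <placeholder> in a metric name template.

    `values` maps placeholder names (without the angle brackets) to the
    value to substitute. A missing placeholder raises KeyError so that a
    half-filled metric name is never produced silently.
    """
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for placeholder <{key}>")
        return str(values[key])

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical connector name and task id, for illustration only.
task_metric = fill_metric_template(
    "kct::<connector-name>::<task-id>::sinkRecordReadRate",
    {"connector-name": "my-sink", "task-id": 0},
)
print(task_metric)  # kct::my-sink::0::sinkRecordReadRate
```

Raising on a missing placeholder is a deliberate choice in this sketch: a name that still contains <task-id> would otherwise be sent as-is and fail later in a less obvious way.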
r::masterSlotsCount The number of hash slots a master node has been assigned. The number of hash slots of all master nodes should add to 16384.value ic_node_master_slots_count r::clusterUnassignedSlotsCount Number of slots which are NOT associated to some node (unbound).value ic_node_cluster_unassigned_slots_count r::clusterSlotsNotOkCount Number of hash slots mapping to a node in FAIL or PFAIL state.value ic_node_cluster_slots_not_ok_count r::slaWritesLatency The average and maximum time taken in milliseconds by a client to write to a random master node in the cluster.average Average value of the metric. ic_node_sla_writes_latency max Maximum value of the metric. ic_node_sla_writes_latency r::slaWritesSuccessfulOps Number of successful write operations performed on the cluster. Every 20 seconds, 30 synthetic write transactions are performed on each node.count ic_node_sla_writes_successful_ops r::slaWritesFailedOps Number of failed write operations performed on the cluster.count ic_node_sla_writes_failed_ops r::slaReadsLatency The average and maximum time taken in milliseconds by a client to read from a random node in the cluster.average Average value of the metric. ic_node_sla_reads_latency max Maximum value of the metric. ic_node_sla_reads_latency r::slaReadsSuccessfulOps Number of successful read operations performed on the cluster. Every 20 seconds, 30 synthetic read transactions are performed on each node.count ic_node_sla_reads_successful_ops r::slaReadsFailedOps Number of failed read operations performed on the cluster.count ic_node_sla_reads_failed_ops r::localWritesLatency The average and maximum time taken in milliseconds by a client to write to its local node.average Average value of the metric. ic_node_local_writes_latency max Maximum value of the metric. ic_node_local_writes_latency r::localWritesSuccessfulOps Number of successful write operations performed on the local node. 
Every 20 seconds, 30 synthetic write transactions are performed on each node.count ic_node_local_writes_successful_ops r::localWritesFailedOps Number of failed write operations performed on the local node.count ic_node_local_writes_failed_ops r::localReadsLatency The average and maximum time taken in milliseconds by a client to read from its local node.average Average value of the metric. ic_node_local_reads_latency max Maximum value of the metric. ic_node_local_reads_latency r::localReadsSuccessfulOps Number of successful read operations performed on the local node. Every 20 seconds, 30 synthetic read transactions are performed on each node.count ic_node_local_reads_successful_ops r::localReadsFailedOps Number of failed read operations performed on the local node.count ic_node_local_reads_failed_ops r::usedMemory Total memory in megabytes allocated by Redis using its allocator (either standard libc, jemalloc, or an alternative allocator such as tcmalloc).value ic_node_used_memory r::usedMemoryRss Memory in megabytes that Redis allocated as seen by the operating system (a.k.a resident set size). This is the number reported by tools such as top(1) and ps(1).value ic_node_used_memory_rss r::usedMemoryDataset The size in bytes of the dataset.value ic_node_used_memory_dataset r::usedMemoryLua Number of bytes used by the Lua engine.value ic_node_used_memory_lua r::memoryFragmentationRatio Ratio between Used Memory Rss and Used Memory.value ic_node_memory_fragmentation_ratio r::connectedClients Number of clients connected to the node.value ic_node_connected_clients r::operationsPerSec Number of commands processed per second.value ic_node_operations_per_sec r::roleIsMaster Is the node the master, will be 1.0 if it is and 0.0 otherwisestate ic_node_role_is_master v::masterSlotsCount The number of hash slots a master node has been assigned. 
The number of hash slots of all master nodes should add to 16384.value ic_node_master_slots_count v::clusterUnassignedSlotsCount Number of slots which are NOT associated to some node (unbound).value ic_node_cluster_unassigned_slots_count v::clusterSlotsNotOkCount Number of hash slots mapping to a node in FAIL or PFAIL state.value ic_node_cluster_slots_not_ok_count v::slaWritesLatency The average and maximum time taken in milliseconds by a client to write to a random master node in the cluster.average Average value of the metric. ic_node_sla_writes_latency max Maximum value of the metric. ic_node_sla_writes_latency v::slaWritesSuccessfulOps Number of successful write operations performed on the cluster. Every 20 seconds, 30 synthetic write transactions are performed on each node.count ic_node_sla_writes_successful_ops v::slaWritesFailedOps Number of failed write operations performed on the cluster.count ic_node_sla_writes_failed_ops v::slaReadsLatency The average and maximum time taken in milliseconds by a client to read from a random node in the cluster.average Average value of the metric. ic_node_sla_reads_latency max Maximum value of the metric. ic_node_sla_reads_latency v::slaReadsSuccessfulOps Number of successful read operations performed on the cluster. Every 20 seconds, 30 synthetic read transactions are performed on each node.count ic_node_sla_reads_successful_ops v::slaReadsFailedOps Number of failed read operations performed on the cluster.count ic_node_sla_reads_failed_ops v::localWritesLatency The average and maximum time taken in milliseconds by a client to write to its local node.average Average value of the metric. ic_node_local_writes_latency max Maximum value of the metric. ic_node_local_writes_latency v::localWritesSuccessfulOps Number of successful write operations performed on the local node. 
Every 20 seconds, 30 synthetic write transactions are performed on each node.count ic_node_local_writes_successful_ops v::localWritesFailedOps Number of failed write operations performed on the local node.count ic_node_local_writes_failed_ops v::localReadsLatency The average and maximum time taken in milliseconds by a client to read from its local node.average Average value of the metric. ic_node_local_reads_latency max Maximum value of the metric. ic_node_local_reads_latency v::localReadsSuccessfulOps Number of successful read operations performed on the local node. Every 20 seconds, 30 synthetic read transactions are performed on each node.count ic_node_local_reads_successful_ops v::localReadsFailedOps Number of failed read operations performed on the local node.count ic_node_local_reads_failed_ops v::usedMemory Total memory in megabytes allocated by Valkey using its allocator (either standard libc, jemalloc, or an alternative allocator such as tcmalloc).value ic_node_used_memory v::usedMemoryRss Memory in megabytes that Valkey allocated as seen by the operating system (a.k.a resident set size). 
This is the number reported by tools such as top(1) and ps(1).value ic_node_used_memory_rss v::usedMemoryDataset The size in bytes of the dataset.value ic_node_used_memory_dataset v::usedMemoryLua Number of bytes used by the Lua engine.value ic_node_used_memory_lua v::memoryFragmentationRatio Ratio between Used Memory Rss and Used Memory.value ic_node_memory_fragmentation_ratio v::connectedClients Number of clients connected to the node.value ic_node_connected_clients v::operationsPerSec Number of commands processed per second.value ic_node_operations_per_sec v::roleIsMaster Is the node the master, will be 1.0 if it is and 0.0 otherwisestate ic_node_role_is_master z::electionTimeTaken Time taken to complete election.ms ic_node_election_time_taken_milliseconds z::packetsReceived Number of packet operations received.value ic_node_packets_received z::txnLogElapsedSyncTime The elapsed sync time of transaction log in milliseconds.ms ic_node_txn_log_elapsed_sync_time_milliseconds z::packetsSent Number of packet operations sent.value ic_node_packets_sent z::numAliveConnections Total number of active client connections in the server.value ic_node_num_alive_connections z::maxRequestLatency Maximum time it takes for the server to respond to a request.ms ic_node_max_request_latency_milliseconds z::minRequestLatency Minimum time it takes for the server to respond to a request.ms ic_node_min_request_latency_milliseconds z::avgRequestLatency Average time it takes for the server to respond to a request.ms ic_node_avg_request_latency_milliseconds z::outstandingRequests Number of pending requests in the server.value ic_node_outstanding_requests z::openFileDescriptorCount Number of file descriptors in use.value ic_node_open_file_descriptor_count z::lastZxidCounter Last Zookeeper Transaction ID (ZXID) counter value.value ic_node_last_zxid_counter pg::misc::numBackends Number of connections against each nodecount ic_num_backends pg::misc::locks Current count of locks in each nodecount 
ic_locks pg::misc::timelineId Timeline id of the nodevalue ic_timeline_id pg::misc::isMaster Is the node the primary, will be 1.0 if it is and 0.0 otherwisecount ic_is_master pg::misc::isRunning Is PostgreSQL running, will be 1.0 if it is and 0.0 otherwisecount ic_is_running pg::misc::stateActive Number of active connections (state = 'active') in pg_stat_activity.count ic_state_active pg::misc::stateIdle Number of idle connections (state = 'idle') in pg_stat_activity.count ic_state_idle pg::misc::stateIdleInTransaction Number of connections idle in transaction (state = 'idle in transaction') in pg_stat_activity.count ic_state_idle_in_transaction pg::misc::stateNull Number of connections with null state in pg_stat_activity.count ic_state_null pg::misc::stateOthers Number of connections in 'other' states (state = 'idle in transaction (aborted)', 'fastpath function call' or 'disabled') in pg_stat_activity.count ic_state_others pg::misc::waitEventTypeLwlock Number of connections waiting on LWLock (wait_event_type = 'LWLock') in pg_stat_activity.count ic_wait_event_type_lwlock pg::misc::waitEventTypeIo Number of connections waiting on IO (wait_event_type = 'IO') in pg_stat_activity.count ic_wait_event_type_io pg::misc::waitEventTypeLock Number of connections waiting on Lock (wait_event_type = 'Lock') in pg_stat_activity.count ic_wait_event_type_lock pg::misc::waitEventTypeClient Number of connections waiting on Client (wait_event_type = 'Client') in pg_stat_activity.count ic_wait_event_type_client pg::misc::waitEventTypeExtension Number of connections waiting on Extension (wait_event_type = 'Extension') in pg_stat_activity.count ic_wait_event_type_extension pg::misc::waitEventTypeBufferPin Number of connections waiting on BufferPin (wait_event_type = 'BufferPin') in pg_stat_activity.count ic_wait_event_type_buffer_pin pg::misc::waitEventTypeActivity Number of connections waiting on Activity (wait_event_type = 'Activity') in pg_stat_activity.count 
ic_wait_event_type_activity pg::misc::waitEventTypeTimeout Number of connections waiting on Timeout (wait_event_type = 'Timeout') in pg_stat_activity.count ic_wait_event_type_timeout pg::misc::waitEventTypeInjectionPoint Number of connections waiting on InjectionPoint events (wait_event_type = 'InjectionPoint') in pg_stat_activity.count ic_wait_event_type_injection_point pg::misc::waitEventTypeIpc Number of connections waiting on IPC events (wait_event_type = 'IPC') in pg_stat_activity.count ic_wait_event_type_ipc pg::misc::waitEventTypeNull Number of connections with null wait_event_type in pg_stat_activity.count ic_wait_event_type_null pg::transactions::oldestTransactionId Oldest transaction ID in each nodecount ic_oldest_transaction_id pg::transactions::percentTowardsEmergencyVacuum Percentage towards an emergency vacuum being required in each nodecount ic_percent_towards_emergency_vacuum pg::transactions::percentTowardsWraparound Percentage towards transaction ID wraparound in each nodecount ic_percent_towards_wraparound pg::replication::lsnCurrent Current WAL LSN for database-cluster (this will be empty on replicas)count ic_lsn_current pg::replication::lsnReceived Last WAL LSN received by this replica (this will be empty on the primary)count ic_lsn_received pg::replication::isInRecovery Is the node a replica, will be 1.0 if it is and 0.0 otherwisecount ic_is_in_recovery pg::replication::replicationStatus Is the replica node's replication status streaming, will be 1 if it is and 0 otherwisevalue ic_replication_status pg::replication::isStandbyLeader Is the node the standby leader, will be 1 if it is and 0 otherwisecount ic_is_standby_leader pg::replication::slots::<node-id>::lsnSent Last WAL LSN sent on this connection (this will be empty on replicas)count ic_slot_lsn_sent pg::replication::lag::<node-id>::replicationLagByte The replication lag in byte for the replica nodesvalue ic_lag_replication_lag_byte_bytes pg::replication::lag::<node-id>::replicationLagMs 
The replication lag in ms for the replica nodesms ic_lag_replication_lag_ms_milliseconds pg::replication::lag::<node-id>::replayLag The replay lag for the replica nodesms ic_lag_replay_lag_milliseconds byte ic_lag_replay_lag_bytes pg::sla::avgWriteLatency Average write latency for synthetic write requests.ms ic_avg_write_latency_milliseconds pg::sla::avgReadLatency Average read latency for synthetic read requests.ms ic_avg_read_latency_milliseconds pg::sla::writeErrors Number of write errors for synthetic write requests.count ic_write_errors pg::sla::readErrors Number of read errors for synthetic read requests.count ic_read_errors If your database name contains : please escape it using
pg::db::<database-name>::rowsInsertedCountPerSecond Number of rows inserted per secondcount_per_second ic_database_rows_inserted_count_per_second pg::db::<database-name>::rowsUpdatedCountPerSecond Number of rows updated per secondcount_per_second ic_database_rows_updated_count_per_second pg::db::<database-name>::rowsDeletedCountPerSecond Number of rows deleted per secondcount_per_second ic_database_rows_deleted_count_per_second pg::db::<database-name>::rowsReturnedCountPerSecond Number of rows returned per secondcount_per_second ic_database_rows_returned_count_per_second pg::db::<database-name>::rowsFetchedCountPerSecond Number of rows fetched per secondcount_per_second ic_database_rows_fetched_count_per_second pg::db::<database-name>::deadlocks Number of deadlocks detected in this databasecount ic_database_deadlocks pg::db::<database-name>::bufferCacheHitCountPerSecond Number of times disk blocks were found already in the buffer cache, so that a read was not necessary, per secondcount_per_second ic_database_buffer_cache_hit_count_per_second pg::db::<database-name>::diskBlocksReadCountPerSecond Number of disk blocks read per second in this databasecount_per_second ic_database_disk_blocks_read_count_per_second pg::db::<database-name>::transactionsCommittedPerSecond Number of transactions in this database that have been committed per secondcount_per_second ic_database_transactions_committed_per_second pg::db::<database-name>::transactionsRolledBackPerSecond Number of transactions in this database that have been rolled back per secondcount_per_second ic_database_transactions_rolled_back_per_second pg::db::<database-name>::tempBytesPerSecond Number of temporary bytes written per secondvalue ic_database_temp_bytes_per_second_bytes pg::db::<database-name>::numBackends Number of connections against the databasecount ic_database_num_backends If your database name or table name contains : please escape it using
pg::tbl::<database-name>::<schema-name>::<table-name>::rowsInsertedCountPerSecond Number of rows inserted per secondcount_per_second ic_database_schema_table_rows_inserted_count_per_second pg::tbl::<database-name>::<schema-name>::<table-name>::rowsUpdatedCountPerSecond Number of rows updated per secondcount_per_second ic_database_schema_table_rows_updated_count_per_second pg::tbl::<database-name>::<schema-name>::<table-name>::rowsDeletedCountPerSecond Number of rows deleted per secondcount_per_second ic_database_schema_table_rows_deleted_count_per_second pg::tbl::<database-name>::<schema-name>::<table-name>::blocksHitCountPerSecond Number of blocks hit per secondcount_per_second ic_database_schema_table_blocks_hit_count_per_second pg::tbl::<database-name>::<schema-name>::<table-name>::blocksReadCountPerSecond Number of blocks read per secondcount_per_second ic_database_schema_table_blocks_read_count_per_second pg::tbl::<database-name>::<schema-name>::<table-name>::indexScansPerSecond Number of index scans initiated on this table per secondcount_per_second ic_database_schema_table_index_scans_per_second pg::tbl::<database-name>::<schema-name>::<table-name>::sequentialScansPerSecond Number of sequential scans initiated on this table per secondcount_per_second ic_database_schema_table_sequential_scans_per_second pg::tbl::<database-name>::<schema-name>::<table-name>::deadRows Estimated number of dead rowscount ic_database_schema_table_dead_rows pg::tbl::<database-name>::<schema-name>::<table-name>::bufferCacheIndexHitCountPerSecond Number of buffer hits in all indexes on this table per secondcount_per_second ic_database_schema_table_buffer_cache_index_hit_count_per_second pg::tbl::<database-name>::<schema-name>::<table-name>::diskBlocksReadIndexCountPerSecond Number of disk blocks read from all indexes on this table per secondcount_per_second ic_database_schema_table_disk_blocks_read_index_count_per_second 
pg::tbl::<database-name>::<schema-name>::<table-name>::tableSize Computes the disk space used by the specified table, excluding indexes (but including its TOAST table if any, free space map, and visibility map)value ic_database_schema_table_table_size_bytes pg::tbl::<database-name>::<schema-name>::<table-name>::indexSize Computes the total disk space used by indexes attached to the specified table.value ic_database_schema_table_index_size_bytes pgb::isAvailable PgBouncer availabilitycount ic_pgbouncer_is_available If your database name contains : please escape it using
pgb::stats::<database-name>::avgQueryCount Average queries per second in last stat collecting periodcount ic_pgbouncer_stats_avg_query_count pgb::stats::<database-name>::avgQueryTime Average query duration in microsecondsvalue ic_pgbouncer_stats_avg_query_time_microseconds pgb::stats::<database-name>::avgRecv Average size of client network traffic received in bytes per secondvalue ic_pgbouncer_stats_avg_recv_bytes pgb::stats::<database-name>::avgSent Average size of client network traffic sent in bytes per secondvalue ic_pgbouncer_stats_avg_sent_bytes pgb::stats::<database-name>::avgWaitTime Time spent by clients waiting for a server in microseconds (average per second)value ic_pgbouncer_stats_avg_wait_time_microseconds pgb::stats::<database-name>::avgXactCount Average transactions per second in last stat collecting periodcount ic_pgbouncer_stats_avg_xact_count pgb::stats::<database-name>::avgXactTime Average transaction duration in microsecondsvalue ic_pgbouncer_stats_avg_xact_time_microseconds If the database name or user name of connection pools contains : please escape it using
pgb::pools::<database-name>::<user-name>::clActive Number of client connections that are linked to server connection and are able to process queriescount ic_pgbouncer_pools_cl_active pgb::pools::<database-name>::<user-name>::clCancelReq Number of client connections that have not forwarded query cancellations to the server yetcount ic_pgbouncer_pools_cl_cancel_req pgb::pools::<database-name>::<user-name>::clWaiting Number of client connections that are waiting on a server connectioncount ic_pgbouncer_pools_cl_waiting pgb::pools::<database-name>::<user-name>::maxWait Current longest time (in seconds) that an unserved client connection is waiting in the poolvalue ic_pgbouncer_pools_max_wait_seconds pgb::pools::<database-name>::<user-name>::svActive Number of server connections that are linked to a client connectioncount ic_pgbouncer_pools_sv_active pgb::pools::<database-name>::<user-name>::svIdle Number of server connections that are idling and ready for a client querycount ic_pgbouncer_pools_sv_idle pgb::pools::<database-name>::<user-name>::svLogin Number of server connections that are currently in the process of logging incount ic_pgbouncer_pools_sv_login pgb::pools::<database-name>::<user-name>::svTested Number of server connections that are currently running either server_reset_query or server_check_querycount ic_pgbouncer_pools_sv_tested pgb::pools::<database-name>::<user-name>::svUsed Number of server connections that are idling more than server_check_delaycount ic_pgbouncer_pools_sv_used Summary metric names follow the format cads::{metricName}. Optionally, a ‘sub-type’ may be specified to return a specific part of the metric - cads::{metricName}::{subType}
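As an illustration of the naming format just described, the sketch below joins the cads prefix, the metric name, and an optional sub-type with the :: separator. The function name summary_metric_name is hypothetical, and using "average" as the sub-type assumes that sub-types correspond to the type values listed with each metric.

```python
def summary_metric_name(metric_name, sub_type=None):
    """Build a Cadence summary metric name: cads::{metricName}[::{subType}]."""
    if not metric_name:
        raise ValueError("metric_name must not be empty")
    parts = ["cads", metric_name]
    # The sub-type is optional; append it only when the caller supplies one.
    if sub_type is not None:
        parts.append(sub_type)
    return "::".join(parts)


print(summary_metric_name("slaV2WorkflowSuccess"))
# cads::slaV2WorkflowSuccess
print(summary_metric_name("slaV2WorkflowLatency", "average"))
# cads::slaV2WorkflowLatency::average
```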
cads::frontendV2MemoryHeapInUse The current heap memory usage of the Cadence Frontend service, in bytes.value ic_node_frontend_v2_memory_heap_in_use_bytes cads::frontendV2MemoryAllocated The current memory allocation to the Cadence Frontend service, in bytes.value ic_node_frontend_v2_memory_allocated_bytes cads::matchingV2MemoryHeapInUse The current heap memory usage of the Cadence Matching service, in bytes.value ic_node_matching_v2_memory_heap_in_use_bytes cads::matchingV2MemoryAllocated The current memory allocation to the Cadence Matching service, in bytes.value ic_node_matching_v2_memory_allocated_bytes cads::historyV2MemoryHeapInUse The current heap memory usage of the Cadence History service, in bytes.value ic_node_history_v2_memory_heap_in_use_bytes cads::historyV2MemoryAllocated The current memory allocation to the Cadence History service, in bytes.value ic_node_history_v2_memory_allocated_bytes cads::workerV2MemoryHeapInUse The current heap memory usage of the Cadence Worker service, in bytes.value ic_node_worker_v2_memory_heap_in_use_bytes cads::workerV2MemoryAllocated The current memory allocation to the Cadence Worker service, in bytes.value ic_node_worker_v2_memory_allocated_bytes cads::slaV2WorkflowSuccess Number of reported Cadence Canary workflow successes, per second.count_per_second ic_node_sla_v2_workflow_success cads::slaV2WorkflowCancel Number of reported Cadence Canary workflow cancellations, per second.count_per_second ic_node_sla_v2_workflow_cancel cads::slaV2WorkflowFail Number of reported Cadence Canary workflow failures, per second.count_per_second ic_node_sla_v2_workflow_fail cads::slaV2WorkflowTimeout Number of reported Cadence Canary workflow time-outs, per second.count_per_second ic_node_sla_v2_workflow_timeout cads::slaV2WorkflowTerminate Number of reported Cadence Canary workflow terminations, per second.count_per_second ic_node_sla_v2_workflow_terminate cads::slaV2WorkflowLatency The average end-to-end latency of the Cadence 
Canary workflow, in seconds.average ic_node_sla_v2_workflow_latency_seconds cads::frontendV2MeanPersistenceRequestRate Average Number of persistence requests made by the Cadence Frontend service, per second.count_per_second ic_node_frontend_v2_mean_persistence_request_rate cads::frontendV2MeanPersistenceErrorRate Average Number of internal errors from persistence requests made by the Cadence Frontend service, per second.count_per_second ic_node_frontend_v2_mean_persistence_error_rate cads::frontendV2MeanPersistenceLatency Average Latency of persistence requests made by the Cadence Frontend service, in seconds.average ic_node_frontend_v2_mean_persistence_latency_seconds cads::frontendV2MeanCadenceRequestRate Average Number of Cadence requests made to the Cadence Frontend service, per second.count_per_second ic_node_frontend_v2_mean_cadence_request_rate cads::frontendV2MeanCadenceErrorRate Average Number of internal errors from Cadence requests made to the Cadence Frontend service, per second.count_per_second ic_node_frontend_v2_mean_cadence_error_rate cads::frontendV2MeanCadenceLatency Average Latency of Cadence requests made to the Cadence Frontend service, in seconds.average ic_node_frontend_v2_mean_cadence_latency_seconds cads::syncMatchV2Latency Average synchronous match latency of the Cadence Matching service, in seconds.average ic_node_sync_match_v2_latency_seconds cads::asyncMatchV2Latency Average asynchronous match latency of the Cadence Matching service, in seconds.average ic_node_async_match_v2_latency_seconds cads::matchingV2MeanPersistenceRequestRate Average Number of persistence requests made by the Cadence Matching service, per second.count_per_second ic_node_matching_v2_mean_persistence_request_rate cads::matchingV2MeanPersistenceErrorRate Average Number of internal errors from persistence requests made by the Cadence Matching service, per second.count_per_second ic_node_matching_v2_mean_persistence_error_rate cads::matchingV2MeanPersistenceLatency 
Average Latency of persistence requests made by the Cadence Matching service, in seconds.average ic_node_matching_v2_mean_persistence_latency_seconds cads::matchingV2MeanCadenceRequestRate Average Number of Cadence requests made to the Cadence Matching service, per second.count_per_second ic_node_matching_v2_mean_cadence_request_rate cads::matchingV2MeanCadenceErrorRate Average Number of internal errors from Cadence requests made to the Cadence Matching service, per second.count_per_second ic_node_matching_v2_mean_cadence_error_rate cads::matchingV2MeanCadenceLatency Average Latency of Cadence requests made to the Cadence Matching service, in seconds.average ic_node_matching_v2_mean_cadence_latency_seconds cads::historyV2MeanCadenceRequestRate Average Number of Cadence requests made to the Cadence History service, per second.count_per_second ic_node_history_v2_mean_cadence_request_rate cads::historyV2MeanCadenceErrorRate Average Number of internal errors from Cadence requests made to the Cadence History service, per second.count_per_second ic_node_history_v2_mean_cadence_error_rate cads::historyV2MeanCadenceLatency Average Latency of Cadence requests made to the Cadence History service, in seconds.average ic_node_history_v2_mean_cadence_latency_seconds cads::historyV2MeanPersistenceRequestRate Average Number of persistence requests made by the Cadence History service, per second.count_per_second ic_node_history_v2_mean_persistence_request_rate cads::historyV2MeanPersistenceErrorRate Average Number of internal errors from persistence requests made by the Cadence History service, per second.count_per_second ic_node_history_v2_mean_persistence_error_rate cads::historyV2MeanPersistenceLatency Average Latency of persistence requests made by the Cadence History service, in seconds.average ic_node_history_v2_mean_persistence_latency_seconds cads::historyV2MeanTaskRequestRate Average Number of task requests to the Cadence History service, per second.count_per_second 
ic_node_history_v2_mean_task_request_rate cads::historyV2MeanTaskErrorRate Average Number of errors from task requests to the Cadence History service, per second.count_per_second ic_node_history_v2_mean_task_error_rate cads::historyV2MeanTaskLatency Average Execution latency of tasks in the Cadence History service, in seconds.average ic_node_history_v2_mean_task_latency_seconds cads::historyV2MeanTaskLatencyQueue Average Queue latency of tasks in the Cadence History service, in seconds.average ic_node_history_v2_mean_task_latency_queue_seconds cads::historyV2MeanTaskLatencyProcessing Average Processing latency of tasks in the Cadence History service, in seconds.average ic_node_history_v2_mean_task_latency_processing_seconds cads::historyV2MeanWorkflowSuccess Average Number of successful workflows, per second.count_per_second ic_node_history_v2_mean_workflow_success cads::historyV2MeanWorkflowCancel Average Number of cancelled workflows, per second.count_per_second ic_node_history_v2_mean_workflow_cancel cads::historyV2MeanWorkflowFailed Average Number of failed workflows, per second.count_per_second ic_node_history_v2_mean_workflow_failed cads::historyV2MeanWorkflowTimeout Average Number of timed out workflows, per second.count_per_second ic_node_history_v2_mean_workflow_timeout cads::historyV2MeanWorkflowTerminate Average Number of terminated workflows, per second.count_per_second ic_node_history_v2_mean_workflow_terminate cads::historyV2MeanReplicationTasksApplied Average Number of successfully applied replication tasks in the Cadence History service.count_per_second ic_node_history_v2_mean_replication_tasks_applied cads::historyV2MeanReplicationTasksAppliedLatency Average latency from replication tasks being received to them being applied in the Cadence History service, in seconds.average ic_node_history_v2_mean_replication_tasks_applied_latency_seconds cads::historyV2MeanReplicationTaskLatency Average latency from replication tasks being created to them being 
applied in the Cadence History service, in seconds.average ic_node_history_v2_mean_replication_task_latency_seconds cads::historyV2MeanReplicationTaskCleanupCount Average Number of cleaned up replication tasks after being acknowledged by the standby Cadence clusters in the Cadence History service.count_per_second ic_node_history_v2_mean_replication_task_cleanup_count cads::historyV2MeanReplicationTaskCleanupFailed Average Number of replication tasks failed to be cleaned up after being acknowledged by the standby Cadence clusters in the Cadence History service.count_per_second ic_node_history_v2_mean_replication_task_cleanup_failed cads::historyV2ReplicationDlqSize Size of the DLQ of replication tasks that could not be applied after retry in the Cadence History service.value ic_node_history_v2_replication_dlq_size cads::historyV2MeanReplicationDlqEnqueueFailed Average Number of replication tasks that could not be applied after retry and are failed to be put into DLQ in the Cadence History service.count_per_second ic_node_history_v2_mean_replication_dlq_enqueue_failed cads::workerV2MeanPersistenceRequestRate Average Number of persistence requests made by the Cadence Worker service, per second.count_per_second ic_node_worker_v2_mean_persistence_request_rate cads::workerV2MeanPersistenceErrorRate Average Number of internal errors from persistence requests made by the Cadence Worker service, per second.count_per_second ic_node_worker_v2_mean_persistence_error_rate cads::workerV2MeanPersistenceLatency Average Latency of persistence requests made by the Cadence Worker service, in seconds.average ic_node_worker_v2_mean_persistence_latency_seconds Tag-level metric names follow the format cadt::{tag}::{metricName}. Optionally, a ‘sub-type’ may be specified to return a specific part of the metric - cadt::{tag}::{metricName}::{subType}
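Going the other way, a tag-level name can be split back into its parts, which is handy when grouping returned metrics by tag. This is a minimal sketch only: the type TagMetric, the function parse_tag_metric, and the tag value "exampleTag" are hypothetical, and the sketch assumes that neither the tag nor the metric name itself contains the :: separator.

```python
from typing import NamedTuple, Optional


class TagMetric(NamedTuple):
    tag: str
    metric_name: str
    sub_type: Optional[str]


def parse_tag_metric(name):
    """Split cadt::{tag}::{metricName}[::{subType}] into its components."""
    parts = name.split("::")
    # Expect the cadt prefix followed by two or three non-empty components.
    if parts[0] != "cadt" or len(parts) not in (3, 4) or not all(parts):
        raise ValueError(f"not a tag-level metric name: {name!r}")
    sub_type = parts[3] if len(parts) == 4 else None
    return TagMetric(tag=parts[1], metric_name=parts[2], sub_type=sub_type)


parsed = parse_tag_metric("cadt::exampleTag::frontendV2PersistenceLatency::95thPercentile")
print(parsed.tag)          # exampleTag
print(parsed.metric_name)  # frontendV2PersistenceLatency
print(parsed.sub_type)     # 95thPercentile
```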
cadt::{tag}::frontendV2PersistenceRequestRate | Number of persistence requests made by the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_persistence_request_rate
cadt::{tag}::frontendV2PersistenceErrorRate | Number of internal errors from persistence requests made by the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_persistence_error_rate
cadt::{tag}::frontendV2PersistenceLatency | Latency of persistence requests made by the Cadence Frontend service, per operation, in seconds. | 95thPercentile, 50thPercentile | ic_cadence_frontend_v2_persistence_latency_seconds
cadt::{tag}::frontendV2CadenceRequestRate | Number of Cadence requests made to the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_cadence_request_rate
cadt::{tag}::frontendV2CadenceErrorRate | Number of internal errors from Cadence requests made to the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_cadence_error_rate
cadt::{tag}::frontendV2CadenceClientBadRequestErrorRate | Number of client-side errors (bad request) from Cadence requests made to the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_cadence_client_bad_request_error_rate
cadt::{tag}::frontendV2CadenceClientServiceBusyErrorRate | Number of client-side errors (service busy) from Cadence requests made to the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_cadence_client_service_busy_error_rate
cadt::{tag}::frontendV2CadenceClientCriticalErrorRate | Number of client-side errors (critical) from Cadence requests made to the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_cadence_client_critical_error_rate
cadt::{tag}::frontendV2CadenceClientQueryFailedErrorRate | Number of client-side errors (query failed) from Cadence requests made to the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_cadence_client_query_failed_error_rate
cadt::{tag}::frontendV2CadenceClientLimitExceededErrorRate | Number of client-side errors (limit exceeded) from Cadence requests made to the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_cadence_client_limit_exceeded_error_rate
cadt::{tag}::frontendV2CadenceClientContextTimeoutErrorRate | Number of client-side errors (context timeout) from Cadence requests made to the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_cadence_client_context_timeout_error_rate
cadt::{tag}::frontendV2CadenceClientRetryTaskErrorRate | Number of client-side errors (retry task) from Cadence requests made to the Cadence Frontend service, per operation, per second. | count_per_second | ic_cadence_frontend_v2_cadence_client_retry_task_error_rate
cadt::{tag}::frontendV2CadenceLatency | Latency of Cadence requests made to the Cadence Frontend service, per operation, in seconds. | 95thPercentile, 50thPercentile | ic_cadence_frontend_v2_cadence_latency_seconds
cadt::{tag}::matchingV2CadenceRequestRate | Number of Cadence requests made to the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_cadence_request_rate
cadt::{tag}::matchingV2CadenceErrorRate | Number of internal errors from Cadence requests made to the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_cadence_error_rate
cadt::{tag}::matchingV2CadenceLatency | Latency of Cadence requests made to the Cadence Matching service, per operation, in seconds. | 95thPercentile, 50thPercentile | ic_cadence_matching_v2_cadence_latency_seconds
cadt::{tag}::matchingV2CadenceClientBadRequestErrorRate | Number of client-side errors (bad request) from Cadence requests made to the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_cadence_client_bad_request_error_rate
cadt::{tag}::matchingV2CadenceClientServiceBusyErrorRate | Number of client-side errors (service busy) from Cadence requests made to the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_cadence_client_service_busy_error_rate
cadt::{tag}::matchingV2CadenceClientCriticalErrorRate | Number of client-side errors (critical) from Cadence requests made to the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_cadence_client_critical_error_rate
cadt::{tag}::matchingV2CadenceClientQueryFailedErrorRate | Number of client-side errors (query failed) from Cadence requests made to the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_cadence_client_query_failed_error_rate
cadt::{tag}::matchingV2CadenceClientLimitExceededErrorRate | Number of client-side errors (limit exceeded) from Cadence requests made to the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_cadence_client_limit_exceeded_error_rate
cadt::{tag}::matchingV2CadenceClientContextTimeoutErrorRate | Number of client-side errors (context timeout) from Cadence requests made to the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_cadence_client_context_timeout_error_rate
cadt::{tag}::matchingV2CadenceClientRetryTaskErrorRate | Number of client-side errors (retry task) from Cadence requests made to the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_cadence_client_retry_task_error_rate
cadt::{tag}::matchingV2SyncMatchLatency | The synchronous match latency of the Cadence Matching service, per operation, in seconds. | 95thPercentile, 50thPercentile | ic_cadence_matching_v2_sync_match_latency_seconds
cadt::{tag}::matchingV2AsyncMatchLatency | The asynchronous match latency of the Cadence Matching service, per operation, in seconds. | 95thPercentile, 50thPercentile | ic_cadence_matching_v2_async_match_latency_seconds
cadt::{tag}::matchingV2PersistenceRequestRate | Number of persistence requests made by the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_persistence_request_rate
cadt::{tag}::matchingV2PersistenceErrorRate | Number of internal errors from persistence requests made by the Cadence Matching service, per operation, per second. | count_per_second | ic_cadence_matching_v2_persistence_error_rate
cadt::{tag}::matchingV2PersistenceLatency | Latency of persistence requests made by the Cadence Matching service, per operation, in seconds. | 95thPercentile, 50thPercentile | ic_cadence_matching_v2_persistence_latency_seconds
cadt::{tag}::historyV2CadenceRequestRate | Number of Cadence requests made to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_cadence_request_rate
cadt::{tag}::historyV2CadenceErrorRate | Number of internal errors from Cadence requests made to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_cadence_error_rate
cadt::{tag}::historyV2CadenceLatency | Latency of Cadence requests made to the Cadence History service, per operation, in seconds. | 95thPercentile, 50thPercentile | ic_cadence_history_v2_cadence_latency_seconds
cadt::{tag}::historyV2CadenceClientBadRequestErrorRate | Number of client-side errors (bad request) from Cadence requests made to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_cadence_client_bad_request_error_rate
cadt::{tag}::historyV2CadenceClientServiceBusyErrorRate | Number of client-side errors (service busy) from Cadence requests made to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_cadence_client_service_busy_error_rate
cadt::{tag}::historyV2CadenceClientCriticalErrorRate | Number of client-side errors (critical) from Cadence requests made to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_cadence_client_critical_error_rate
cadt::{tag}::historyV2CadenceClientQueryFailedErrorRate | Number of client-side errors (query failed) from Cadence requests made to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_cadence_client_query_failed_error_rate
cadt::{tag}::historyV2CadenceClientLimitExceededErrorRate | Number of client-side errors (limit exceeded) from Cadence requests made to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_cadence_client_limit_exceeded_error_rate
cadt::{tag}::historyV2CadenceClientContextTimeoutErrorRate | Number of client-side errors (context timeout) from Cadence requests made to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_cadence_client_context_timeout_error_rate
cadt::{tag}::historyV2CadenceClientRetryTaskErrorRate | Number of client-side errors (retry task) from Cadence requests made to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_cadence_client_retry_task_error_rate
cadt::{tag}::historyV2PersistenceRequestRate | Number of persistence requests made by the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_persistence_request_rate
cadt::{tag}::historyV2PersistenceErrorRate | Number of internal errors from persistence requests made by the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_persistence_error_rate
cadt::{tag}::historyV2PersistenceLatency | Latency of persistence requests made by the Cadence History service, per operation, in seconds. | 95thPercentile, 50thPercentile | ic_cadence_history_v2_persistence_latency_seconds
cadt::{tag}::historyV2TaskRequestRate | Number of task requests to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_task_request_rate
cadt::{tag}::historyV2TaskErrorRate | Number of errors from task requests to the Cadence History service, per operation, per second. | count_per_second | ic_cadence_history_v2_task_error_rate
cadt::{tag}::historyV2TaskLatency | Execution latency of tasks in the Cadence History service, per operation, in seconds. | 95thPercentile, 50thPercentile | ic_cadence_history_v2_task_latency_seconds
cadt::{tag}::historyV2TaskLatencyQueue | End-to-end latency of tasks in the Cadence History service, per operation, in seconds. | 95thPercentile, 50thPercentile | ic_cadence_history_v2_task_latency_queue_seconds
cadt::{tag}::historyV2TaskLatencyProcessing | Processing latency of tasks in the Cadence History service, per operation, in seconds. | 95thPercentile, 50thPercentile | ic_cadence_history_v2_task_latency_processing_seconds
cadt::{tag}::historyV2WorkflowSuccess | Number of successful workflows, per operation, per second. | count_per_second | ic_cadence_history_v2_workflow_success
cadt::{tag}::historyV2WorkflowCancel | Number of cancelled workflows, per operation, per second. | count_per_second | ic_cadence_history_v2_workflow_cancel
cadt::{tag}::historyV2WorkflowFailed | Number of failed workflows, per operation, per second. | count_per_second | ic_cadence_history_v2_workflow_failed
cadt::{tag}::historyV2WorkflowTimeout | Number of timed out workflows, per operation, per second. | count_per_second | ic_cadence_history_v2_workflow_timeout
cadt::{tag}::historyV2WorkflowTerminate | Number of terminated workflows, per operation, per second. | count_per_second | ic_cadence_history_v2_workflow_terminate
cadt::{tag}::historyV2WorkflowFailedCount | Count of failed workflows. | value | ic_cadence_history_v2_workflow_failed_count
cadt::{tag}::historyV2ReplicationTasksApplied | Average number of successfully applied replication tasks in the Cadence History service, per operation. | count_per_second | ic_cadence_history_v2_replication_tasks_applied
cadt::{tag}::historyV2ReplicationTasksAppliedPerDomain | Average number of successfully applied replication tasks in the Cadence History service, per domain. | count_per_second | ic_cadence_history_v2_replication_tasks_applied_per_domain
cadt::{tag}::historyV2ReplicationTasksAppliedLatency | Latency from replication tasks being received to them being applied in the Cadence History service, in seconds. | 95thPercentile, 50thPercentile | ic_cadence_history_v2_replication_tasks_applied_latency_seconds
cadt::{tag}::historyV2ReplicationTaskLatency | Latency from replication tasks being created to them being applied in the Cadence History service, in seconds. | 95thPercentile, 50thPercentile | ic_cadence_history_v2_replication_task_latency_seconds
cadt::{tag}::historyV2ReplicationTaskCleanupCount | Average number of cleaned up replication tasks after being acknowledged by the standby Cadence clusters in the Cadence History service, per operation. | count_per_second | ic_cadence_history_v2_replication_task_cleanup_count
cadt::{tag}::historyV2ReplicationTaskCleanupFailed | Average number of replication tasks that failed to be cleaned up after being acknowledged by the standby Cadence clusters in the Cadence History service, per operation. | count_per_second | ic_cadence_history_v2_replication_task_cleanup_failed
cadt::{tag}::historyV2ReplicationDlqSize | Size of the DLQ of replication tasks that could not be applied after retry in the Cadence History service, per operation. | value | ic_cadence_history_v2_replication_dlq_size
cadt::{tag}::historyV2ReplicationDlqEnqueueFailed | Average number of replication tasks that could not be applied after retry and failed to be put into the DLQ in the Cadence History service, per operation. | count_per_second | ic_cadence_history_v2_replication_dlq_enqueue_failed
cadt::{tag}::workerV2PersistenceRequestRate | Number of persistence requests made by the Cadence Worker service, per operation, per second. | count_per_second | ic_cadence_worker_v2_persistence_request_rate
cadt::{tag}::workerV2PersistenceErrorRate | Number of internal errors from persistence requests made by the Cadence Worker service, per operation, per second. | count_per_second | ic_cadence_worker_v2_persistence_error_rate
cadt::{tag}::workerV2PersistenceLatency | Latency of persistence requests made by the Cadence Worker service, per operation, in seconds. | 95thPercentile, 50thPercentile | ic_cadence_worker_v2_persistence_latency_seconds
clk::slaAvgWriteLatency | Average write latency for 20 writes. | value | ic_node_sla_avg_write_latency
clk::slaAvgReadLatency | Average read latency for 20 reads. | value | ic_node_sla_avg_read_latency
clk::slaWriteErrors | Number of write request errors. | value | ic_node_sla_write_errors
clk::slaReadErrors | Number of read request errors. | value | ic_node_sla_read_errors
clk::slaKeeperErrors | Number of ClickHouse Keeper errors. | value | ic_node_sla_keeper_errors
clk::rwLockWaitingReaders | Number of threads waiting for read on a table RWLock. | value | ic_node_rw_lock_waiting_readers
clk::rwLockWaitingWriters | Number of threads waiting for write on a table RWLock. | value | ic_node_rw_lock_waiting_writers
clk::merge | Number of executing background merges. | value | ic_node_merge
clk::readonlyReplica | Number of Replicated tables that are currently in readonly state due to re-initialization after ZooKeeper session loss or due to startup without ZooKeeper configured. | value | ic_node_readonly_replica
clk::query | Number of executing queries. | value | ic_node_query
clk::delayedInserts | Number of INSERT queries that are throttled due to a high number of active data parts for a partition in a MergeTree table. | value | ic_node_delayed_inserts
clk::s3Requests | Number of S3 requests. | value | ic_node_s3_requests
clk::distributedFilesToInsert | Number of pending files to process for asynchronous insertion into Distributed tables. | value | ic_node_distributed_files_to_insert
clk::keeperOutstandingRequests | Number of outstanding ClickHouse Keeper requests. | value | ic_node_keeper_outstanding_requests
clk::insertQueriesPerSecond | Average number of insert queries per second over the last minute. | value | ic_node_insert_queries_per_second
clk::httpConnection | Number of connections to the HTTP server. | value | ic_node_http_connection
clk::totalRows | The total number of rows for all active parts. | value | ic_node_total_rows
clk::pendingAsyncInsert | Number of asynchronous inserts waiting to be flushed. | value | ic_node_pending_async_insert
clk::osOpenFiles | The total number of opened files on the host machine. This is a system-wide metric: it includes all the processes on the host machine, not just clickhouse-server. | value | ic_node_os_open_files
clk::mergesInQueue | The total number of merge operations that are waiting in queue. | value | ic_node_merges_in_queue
clk::maxInactiveParts | The maximum number of inactive parts. | value | ic_node_max_inactive_parts
clk::znodeCount | The number of znodes in the ClickHouse Keeper process. | value | ic_node_znode_count
clk::totalPartsOfMergeTreeTables | Total amount of data parts in all tables of MergeTree family. Numbers larger than 10 000 will negatively affect the server startup time, and may indicate an unreasonable choice of the partition key. | value | ic_node_total_parts_of_merge_tree_tables
clk::totalRowsOfMergeTreeTables | Total amount of rows (records) stored in all tables of MergeTree family. | value | ic_node_total_rows_of_merge_tree_tables
clk::maxPartCountForPartition | Maximum number of parts per partition across all partitions of all tables of MergeTree family. Values larger than 300 indicate misconfiguration, overload, or massive data loading. | value | ic_node_max_part_count_for_partition
clk::replicasMaxAbsoluteDelay | Maximum difference in seconds between the most fresh replicated part and the most fresh data part still to be replicated, across Replicated tables. A very high value indicates a replica with no data. | value | ic_node_replicas_max_absolute_delay
clk::remoteStorageUsage | Total amount of data stored in remote storage (such as AWS S3), in GiB. | value | ic_node_remote_storage_usage
clk::markCacheBytes | Total size of mark cache in bytes. | value | ic_node_mark_cache_bytes
clk::markCacheHits | Number of times an entry has been found in the mark cache, so we didn't have to load a mark file. | value | ic_node_mark_cache_hits
clk::markCacheMisses | Number of times an entry has not been found in the mark cache, so we had to load a mark file in memory, which is a costly operation, adding to query latency. | value | ic_node_mark_cache_misses
clk::queryCacheBytes | Total size of the query cache in bytes. | value | ic_node_query_cache_bytes
clk::queryCacheHits | Number of times a query result has been found in the query cache (and query computation was avoided). Only updated for SELECT queries with SETTING use_query_cache = 1. | value | ic_node_query_cache_hits
clk::queryCacheMisses | Number of times a query result has not been found in the query cache (and required query computation). Only updated for SELECT queries with SETTING use_query_cache = 1. | value | ic_node_query_cache_misses
clk::uncompressedCacheBytes | Total size of uncompressed cache in bytes. Uncompressed cache does not usually improve the performance and should be mostly avoided. | value | ic_node_uncompressed_cache_bytes
clk::uncompressedCacheHits | Number of times an entry has been found in the uncompressed cache. | value | ic_node_uncompressed_cache_hits
clk::uncompressedCacheMisses | Number of times an entry has not been found in the uncompressed cache. | value | ic_node_uncompressed_cache_misses
Successfully retrieved monitoring results of metrics set.
Bad Request
Not Authorized
Forbidden
Resource not found
Unsupported media type: returned when the payload is in an unsupported format.
Too many requests: returned when your user sends more than 35 requests per second.
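The limit of 35 requests per second noted above can be respected on the client side by spacing out calls. The following is a minimal sketch, assuming a single-threaded client; the class name `Throttle` is hypothetical and not part of the API, and the `clock` and `sleep` parameters exist only so the timing can be replaced in tests.

```python
import time


class Throttle:
    """Space out calls so that at most `max_per_second` are made per second."""

    def __init__(self, max_per_second: float, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> None:
        """Block until the next request may be sent."""
        now = self._clock()
        if now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval


# Stay a little under the documented limit of 35 requests per second.
throttle = Throttle(max_per_second=30)
for _ in range(3):
    throttle.wait()
    # send one API request here
```

Because the limit applies per user, a "Too many requests" response may still occur if several clients share the same credentials, so callers would likely still want to handle that response.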
Broker Level Per-Topic Metrics (Cluster) - Paged with Wildcard
{
  "itemsPerPage": 5,
  "resources": [
    {
      "id": "694294d9-ea82-49c2-9f71-aacac81f0325",
      "payload": [
        {
          "metric": "messagesInPerTopic",
          "topic": "instaclustr-sla",
          "type": "mean_rate",
          "unit": "1",
          "values": [
            {
              "time": "2017-01-04T04:19:28.000Z",
              "value": "1.5051724911338817"
            }
          ]
        }
      ],
      "privateIp": "10.0.0.1",
      "publicIp": "123.123.123.123",
      "rack": {
        "dataCentre": {
          "displayName": "AWS_VPC_US_EAST_1",
          "name": "US_EAST_1",
          "provider": "AWS_VPC",
          "uuid": null
        },
        "name": "us-east-1a",
        "providerAccount": {
          "name": "INSTACLUSTR",
          "provider": "AWS_VPC"
        }
      }
    },
    {
      "id": "4d848f48-5e24-41d6-81f2-44c2f578895f",
      "payload": [
        {
          "metric": "messagesInPerTopic",
          "topic": "instaclustr-sla",
          "type": "mean_rate",
          "unit": "1",
          "values": [
            {
              "time": "2017-01-04T04:19:28.000Z",
              "value": "1.4515722583651829"
            }
          ]
        }
      ],
      "privateIp": "10.0.0.2",
      "publicIp": "123.123.123.124",
      "rack": {
        "dataCentre": {
          "displayName": "AWS_VPC_US_EAST_1",
          "name": "US_EAST_1",
          "provider": "AWS_VPC",
          "uuid": null
        },
        "name": "us-east-1b",
        "providerAccount": {
          "name": "INSTACLUSTR",
          "provider": "AWS_VPC"
        }
      }
    },
    {
      "id": "3bccad4b-087b-471d-8f24-0452edb86bf1",
      "payload": [
        {
          "metric": "messagesInPerTopic",
          "topic": "instaclustr-sla",
          "type": "mean_rate",
          "unit": "1",
          "values": [
            {
              "time": "2017-01-04T04:19:28.000Z",
              "value": "1.4708695545998745"
            }
          ]
        }
      ],
      "privateIp": "10.0.0.3",
      "publicIp": "123.123.123.125",
      "rack": {
        "dataCentre": {
          "displayName": "AWS_VPC_US_EAST_1",
          "name": "US_EAST_1",
          "provider": "AWS_VPC",
          "uuid": null
        },
        "name": "us-east-1c",
        "providerAccount": {
          "name": "INSTACLUSTR",
          "provider": "AWS_VPC"
        }
      }
    },
    {
      "id": "694294d9-ea82-49c2-9f71-aacac81f0325",
      "payload": [
        {
          "metric": "messagesInPerTopic",
          "topic": "test-topic",
          "type": "mean_rate",
          "unit": "1",
          "values": [
            {
              "time": "2017-01-04T04:19:28.000Z",
              "value": "1.0517249113388175"
            }
          ]
        }
      ],
      "privateIp": "10.0.0.1",
      "publicIp": "123.123.123.123",
      "rack": {
        "dataCentre": {
          "displayName": "AWS_VPC_US_EAST_1",
          "name": "US_EAST_1",
          "provider": "AWS_VPC",
          "uuid": null
        },
        "name": "us-east-1a",
        "providerAccount": {
          "name": "INSTACLUSTR",
          "provider": "AWS_VPC"
        }
      }
    },
    {
      "id": "4d848f48-5e24-41d6-81f2-44c2f578895f",
      "payload": [
        {
          "metric": "messagesInPerTopic",
          "topic": "test-topic",
          "type": "mean_rate",
          "unit": "1",
          "values": [
            {
              "time": "2017-01-04T04:19:28.000Z",
              "value": "1.0515722583651829"
            }
          ]
        }
      ],
      "privateIp": "10.0.0.2",
      "publicIp": "123.123.123.124",
      "rack": {
        "dataCentre": {
          "displayName": "AWS_VPC_US_EAST_1",
          "name": "US_EAST_1",
          "provider": "AWS_VPC",
          "uuid": null
        },
        "name": "us-east-1b",
        "providerAccount": {
          "name": "INSTACLUSTR",
          "provider": "AWS_VPC"
        }
      }
    }
  ],
  "startIndex": 1,
  "totalResults": 9
}
You can use this endpoint to retrieve the PgBouncer connection pool schemas. A connection pool in PgBouncer is represented by the database being connected to and the user used to connect.
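Since a pool is identified by a database and a user, a response from this endpoint can be flattened into one record per pool. The sketch below assumes the response shape of this endpoint's sample (`cdcs`, `nodes`, `pools`, `users`); the function name `iter_pools` is hypothetical.

```python
from typing import Iterator, Tuple

# Sample data with the same shape as this endpoint's sample response.
response = {
    "cdcs": [
        {
            "cdcId": "cdc-1",
            "nodes": [
                {
                    "nodeId": "node-1",
                    "pools": [
                        {"database": "db-1", "users": ["user-1", "user-2"]},
                        {"database": "db-2", "users": ["user-1"]},
                    ],
                }
            ],
        }
    ]
}


def iter_pools(body: dict) -> Iterator[Tuple[str, str, str, str]]:
    """Yield (cdcId, nodeId, database, user) for every connection pool."""
    for cdc in body.get("cdcs", []):
        for node in cdc.get("nodes", []):
            for pool in node.get("pools", []):
                for user in pool.get("users", []):
                    yield cdc["cdcId"], node["nodeId"], pool["database"], user


for pool in iter_pools(response):
    print(pool)
```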
Successfully retrieved PgBouncer connection pool schemas.
Bad Request
Not Authorized
Forbidden
Resource not found
Unsupported media type: returned when the payload is in an unsupported format.
Too many requests: returned when your user sends more than 35 requests per second.
{
  "cdcs": [
    {
      "cdcId": "cdc-1",
      "nodes": [
        {
          "nodeId": "node-1",
          "pools": [
            {
              "database": "db-1",
              "users": [
                "user-1",
                "user-2"
              ]
            },
            {
              "database": "db-2",
              "users": [
                "user-1"
              ]
            }
          ]
        }
      ]
    }
  ]
}
You can use this endpoint to retrieve the PostgreSQL schema definition.
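As an illustration of working with the returned definition, the sketch below lists fully qualified table names. It assumes the nesting seen in this endpoint's sample response, read here as database, then schema, then a list of tables; that reading and the function name `qualified_tables` are assumptions rather than documented behaviour.

```python
from typing import Dict, List

# Sample data shaped like this endpoint's sample response.
schema: Dict[str, Dict[str, List[str]]] = {"db-1": {"schema-1": ["table-1"]}}


def qualified_tables(definition: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Return 'database.schema.table' for every table in the definition."""
    return [
        f"{db}.{schema_name}.{table}"
        for db, schemas in definition.items()
        for schema_name, tables in schemas.items()
        for table in tables
    ]


print(qualified_tables(schema))
```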
Successfully retrieved PostgreSQL schema.
Bad Request
Not Authorized
Forbidden
Resource not found
Unsupported media type: returned when the payload is in an unsupported format.
Too many requests: returned when your user sends more than 35 requests per second.
{
  "db-1": {
    "schema-1": [
      "table-1"
    ]
  }
}
You can use this endpoint to list all the Kafka topics.
Successfully retrieved a list of all the Kafka topics.
Bad Request
Not Authorized
Forbidden
Resource not found
Unsupported media type: returned when the payload is in an unsupported format.
Too many requests: returned when your user sends more than 35 requests per second.
[
  "instaclustr-sla",
  "topic-1"
]