(MySQL) Performance Monitoring with Prometheus [UPDATE]


In my last post I was looking for a way to do performance monitoring and I stumbled upon Prometheus. Prometheus is much more than a tool for monitoring a single service on a single node. Anyway, let's get the idea of gathering metrics, using MySQL as an example.

This is how a simple Prometheus configuration could look:

global: 
  scrape_interval: 1m
  scrape_timeout: 10s
  evaluation_interval: 1m

scrape_configs:
  - job_name: mysql
    scheme: http
    target_groups:
    - targets: 
        - '10.17.148.31:9104'
      labels:
        zone: mysql

Every minute Prometheus accesses 10.17.148.31:9104/metrics (/metrics is a Prometheus convention) and labels the result with zone=mysql. When querying the data you can use these labels.
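
For example, the zone label lets you select just the MySQL targets (the metric name is one of those shown in the curl output below):

mysql_global_status_threads_connected{zone="mysql"}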

This is a simple configuration. The fun of Prometheus starts when you have a lot of targets/jobs.
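
A second job could look like this (the node exporter on port 9100 is just an assumption for illustration):

scrape_configs:
  - job_name: mysql
    target_groups:
    - targets:
        - '10.17.148.31:9104'
      labels:
        zone: mysql
  - job_name: node
    target_groups:
    - targets:
        - '10.17.148.31:9100'
      labels:
        zone: node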

Let’s have a look at our specific endpoint:

> curl 10.17.148.31:9104/metrics
...
mysql_global_status_threads_cached 26
mysql_global_status_threads_connected 99
mysql_global_status_threads_created 125
mysql_global_status_threads_running 2
...

As a MySQL administrator you know what this is all about. The data is provided by an exporter, in our case running in a container :)

> docker run -d -p 9104:9104 --link=mysql:backend \
    -e DATA_SOURCE_NAME='prometheus:secret@(backend:3306)/' \
    prom/mysqld-exporter

This is old school Docker. Obviously MySQL is also running in a container (named mysql) and we are using the deprecated --link option :)
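
For the record, a rough modern equivalent without --link would use a user-defined network (the network name monitoring and the passwords are made up):

> docker network create monitoring
> docker run -d --name mysql --network monitoring \
    -e MYSQL_ROOT_PASSWORD=secret mysql
> docker run -d -p 9104:9104 --network monitoring \
    -e DATA_SOURCE_NAME='prometheus:secret@(mysql:3306)/' \
    prom/mysqld-exporter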

The mysqld-exporter has a lot of options:

$ docker run --rm prom/mysqld-exporter --help
Usage of /bin/mysqld_exporter:
  -collect.auto_increment.columns
      Collect auto_increment columns and max values from information_schema
  -collect.binlog_size
      Collect the current size of all registered binlog files
  -collect.global_status
      Collect from SHOW GLOBAL STATUS (default true)
  -collect.global_variables
      Collect from SHOW GLOBAL VARIABLES (default true)
  -collect.info_schema.processlist
      Collect current thread state counts from the information_schema.processlist
  -collect.info_schema.tables
      Collect metrics from information_schema.tables (default true)
  -collect.info_schema.tables.databases string
      The list of databases to collect table stats for, or '*' for all (default "*")
  -collect.info_schema.tablestats
      If running with userstat=1, set to true to collect table statistics
  -collect.info_schema.userstats
      If running with userstat=1, set to true to collect user statistics
  -collect.perf_schema.eventsstatements
      Collect metrics from performance_schema.events_statements_summary_by_digest
  -collect.perf_schema.eventsstatements.digest_text_limit int
      Maximum length of the normalized statement text (default 120)
  -collect.perf_schema.eventsstatements.limit int
      Limit the number of events statements digests by response time (default 250)
  -collect.perf_schema.eventsstatements.timelimit int
      Limit how old the 'last_seen' events statements can be, in seconds (default 86400)
  -collect.perf_schema.eventswaits
      Collect metrics from performance_schema.events_waits_summary_global_by_event_name
  -collect.perf_schema.file_events
      Collect metrics from performance_schema.file_summary_by_event_name
  -collect.perf_schema.indexiowaits
      Collect metrics from performance_schema.table_io_waits_summary_by_index_usage
  -collect.perf_schema.tableiowaits
      Collect metrics from performance_schema.table_io_waits_summary_by_table
  -collect.perf_schema.tablelocks
      Collect metrics from performance_schema.table_lock_waits_summary_by_table
  -collect.slave_status
      Collect from SHOW SLAVE STATUS (default true)
  -config.my-cnf string
      Path to .my.cnf file to read MySQL credentials from. (default "/home/golang/.my.cnf")
  -log.level value
      Only log messages with the given severity or above. Valid levels: [debug, info, warn, error, fatal, panic]. (default info)
  -log_slow_filter
      Add a log_slow_filter to avoid exessive MySQL slow logging.  NOTE: Not supported by Oracle MySQL.
  -web.listen-address string
      Address to listen on for web interface and telemetry. (default ":9104")
  -web.telemetry-path string
      Path under which to expose metrics. (default "/metrics")
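
Most collectors can simply be switched on via these flags. For example, to additionally collect processlist and statement digest metrics (reusing the container setup and DSN from above), you could start the exporter like this:

> docker run -d -p 9104:9104 --link=mysql:backend \
    -e DATA_SOURCE_NAME='prometheus:secret@(backend:3306)/' \
    prom/mysqld-exporter \
    -collect.info_schema.processlist \
    -collect.perf_schema.eventsstatements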

Prometheus ships with an expression browser, giving you the opportunity to access and graph the data. It also provides its own query language :) Even without graphing, the following two queries should be self-explanatory. The first returns the raw counter, the second divides the increase over the last two minutes by 120 seconds to get a per-second rate:

mysql_global_status_created_tmp_disk_tables

increase(mysql_global_status_created_tmp_disk_tables[2m])/120

[UPDATE]

Brian Brazil mentioned to use another [function](https://prometheus.io/docs/querying/functions/) instead, thx \o/ rate() calculates the per-second average directly, so the division is no longer needed:

rate(mysql_global_status_created_tmp_disk_tables[2m])

I recommend using Grafana as a dashboard. You just need to add the Prometheus server as a data source and reuse the queries you used in the expression browser. There is also PromDash, but afaik Grafana is the way to go.

Prometheus rocks. Having a central point to do the performance analysis of the whole datacenter is awesome. There are a lot of exporters you can use. Even writing your own exporter is quite easy, as the sketch below shows.
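
Here is a minimal sketch of a custom exporter in Go using the official client library prometheus/client_golang. The metric name room_temperature_celsius and port 9200 are made up for illustration:

package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// A made-up gauge; a real exporter would poll whatever system it monitors.
var tempGauge = prometheus.NewGauge(prometheus.GaugeOpts{
	Name: "room_temperature_celsius",
	Help: "Current room temperature in Celsius.",
})

func main() {
	prometheus.MustRegister(tempGauge)
	tempGauge.Set(21.5) // replace with a real measurement

	// Expose the metrics under the conventional /metrics path.
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":9200", nil))
}

Point the Prometheus config above at port 9200 and the gauge shows up like any other metric.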

Have fun

Erkan

[Update]

There is a nice presentation I recommend checking out to see the nice graphs Grafana builds :)

[Update2]

Percona is always great at adopting new stuff. Today they announced their Percona Monitoring and Management. Of course it also uses some exporters, Prometheus and Grafana. I'm quite sure it could kill other solutions on the market \o/
