Questions tagged [prometheus]

The Prometheus monitoring system, including the server, Alertmanager, Pushgateway, exporters, client libraries, and other components.

206 votes
3 answers
109k views

Do I understand Prometheus's rate vs increase functions correctly?

I have read the Prometheus documentation carefully, but it's still a bit unclear to me, so I am here to get confirmation about my understanding. (Please note that for the sake of the simplest examples ...
beatrice • 4,067
194 votes
11 answers
328k views

Get Total requests in a period of time

I need to show, in Grafana, a panel with the number of requests in the period of time selected in the upper right corner. For this I need to solve 2 issues; I will ask the Prometheus question ...
Facundo Chambo
77 votes
4 answers
51k views

Usecases: InfluxDB vs. Prometheus [closed]

According to the Prometheus webpage, one main difference between Prometheus and InfluxDB is the use case: while Prometheus stores time series only, InfluxDB is better geared towards storing individual ...
SpaceMonkey
77 votes
2 answers
51k views

How to persist data in Prometheus running in a Docker container?

I'm developing something that needs Prometheus to persist its data between restarts. Having followed the instructions $ docker volume create a-new-volume $ docker run \ --publish 9090:9090 \ ...
Matt • 9,247
63 votes
4 answers
130k views

Prometheus query to count unique label values

I want to count the number of unique label values, kind of like select count (distinct a) from hello_info. For example, if my metric 'hello_info' has labels a and b, I want to count the number of unique a's. ...
emperorspride188
63 votes
6 answers
100k views

Prometheus - add target specific label in static_configs

I have a job definition as follows: - job_name: 'test-name' static_configs: - targets: [ '192.168.1.1:9100', '192.168.1.1:9101', '192.168.1.1:9102' ] labels: group: '...
Krzysztof Rosiński
63 votes
9 answers
182k views

How do I write a Prometheus query that returns the value of a label?

I'm making a Grafana dashboard and want a panel that reports the latest version of our app. The version is reported as a label in the app_version_updated (say) metric like so: app_version_updated{...
kmoe • 2,003
60 votes
5 answers
198k views

How to calculate containers' cpu usage in kubernetes with prometheus as monitoring?

I want to calculate the CPU usage of all pods in a Kubernetes cluster. I found two metrics in Prometheus that may be useful: container_cpu_usage_seconds_total: Cumulative cpu time consumed per cpu in ...
Haoyuan Ge • 3,579
58 votes
3 answers
180k views

How can I group labels in a Prometheus query?

If I have a metric with the following labels: my_metric{group="group a"} 100 my_metric{group="group b"} 100 my_metric{group="group c"} 100 my_metric{group="misc group a"} 1 my_metric{group="misc ...
checketts • 14.6k
56 votes
3 answers
112k views

How can I 'join' two metrics in a Prometheus query?

I am using the consul exporter to ingest the health and status of my services into Prometheus. I'd like to fire alerts when the status of services and nodes in Consul is critical and then use tags ...
Rob Best • 561
52 votes
5 answers
100k views

Increasing Prometheus storage retention

I have a Prometheus server installed on my AWS instance, but the data is being removed automatically after 15 days. I need to keep data for months or a year. Is there anything I need to change in my ...
Rohit Bharati
49 votes
4 answers
160k views

Prometheus - Convert cpu_user_seconds to CPU Usage %?

I'm monitoring docker containers via Prometheus.io. My problem is that I'm just getting cpu_user_seconds_total or cpu_system_seconds_total. How to convert this ever-increasing value to a CPU ...
M156 • 1,094
48 votes
14 answers
152k views

Context Deadline Exceeded - prometheus

I have a Prometheus configuration with many jobs where I am scraping metrics over HTTP. But I have one job where I need to scrape the metrics over HTTPS. When I access: https://ip-address:port/metrics I ...
xmlParser • 1,943
47 votes
1 answer
72k views

How do I write an "or" logical operator on Prometheus or Grafana

I need to write a query that uses any of the different jobs I define. {job="traefik" OR job="cadvisor" OR job="prometheus"} Is it possible to write logical binary operators?
Asier Gomez • 6,272
47 votes
2 answers
21k views

What does the "instant" checkbox in Grafana graphs based on prometheus do?

I have no clue what the "instant" option means in Grafana when creating a graph with Prometheus. Any ideas?
eventhorizon • 3,197
41 votes
8 answers
85k views

Getting error "Get http://localhost:9443/metrics: dial tcp 127.0.0.1:9443: connect: connection refused"

I'm trying to configure Prometheus and Grafana with my Hyperledger Fabric v1.4 network to analyze the peer and chaincode metrics. I've mapped the peer container's port 9443 to my host machine's port 9443 ...
Kartik Chauhan
40 votes
2 answers
47k views

Prometheus endpoint of all available metrics

I was curious concerning the workings of Prometheus. Using the Prometheus interface I am able to see a drop-down list which I assume contains all available metrics. However, I am not able to access ...
Tony.H • 671
39 votes
4 answers
80k views

Different Prometheus scrape URL for every target

Every instance of my application has a different URL. How can I configure prometheus.yml so that it takes the path of a target along with the host name? scrape_configs: - job_name: 'example-random'...
poojabh • 435
38 votes
2 answers
59k views

Monitor custom kubernetes pod metrics using Prometheus

I am using Prometheus to monitor my Kubernetes cluster. I have set up Prometheus in a separate namespace. I have multiple namespaces and multiple pods are running. Each pod container exposes a custom ...
Dinesh Ahuja
38 votes
3 answers
46k views

How to use the selected period of time in a query?

I'm using Grafana with Prometheus and I'd like to build a query that depends on the period of time selected in the upper right corner of the screen. Is there any variable (or something like ...
Facundo Chambo
37 votes
11 answers
84k views

Relabel instance to hostname in Prometheus

I have Prometheus scraping metrics from node exporters on several machines with a config like this: scrape_configs: - job_name: node_exporter static_configs: - targets: - 1.2.3.4:...
Norrius • 7,698
37 votes
8 answers
45k views

Is there a way to monitor kube cron jobs using prometheus

Is there a way to monitor a kube cronjob? I have a kube cronjob which runs every 10 minutes on my cluster. Is there a way to collect metrics every time my cronjob fails due to some error or notify when my ...
user3587892
37 votes
4 answers
36k views

Monitoring log files using some metrics exporter + Prometheus + Grafana

I need to monitor very different log files for errors, success status, etc. I also need to grab the corresponding metrics using Prometheus, show them in Grafana, and set up some alerting on them. Prometheus + Grafana ...
JosMac • 2,232
36 votes
4 answers
55k views

How can I alert for container restarted?

I'd like to monitor the containers using Prometheus and cAdvisor so that when a container restarts, I get an alert. I wonder if anyone has a sample Prometheus alert for this.
qingsong • 745
35 votes
3 answers
64k views

What's the difference between Prometheus and Zabbix? [closed]

What are the differences between Prometheus and Zabbix?
The One • 2,291
35 votes
3 answers
45k views

Prometheus: grouping metrics by metric names

Is there a way to group all metrics of an app by metric names? A portion from a query listing all metrics for an app (i.e. {app="bar"}) : ch_qos_logback_core_Appender_all_total{affiliation="foo",app="...
naimdjon • 3,402
35 votes
4 answers
28k views

Why are there both counters and gauges in Prometheus if gauges can act as counters?

When deciding between Counter and Gauge, the Prometheus documentation states that, to pick between counter and gauge, there is a simple rule of thumb: if the value can go down, it is a gauge. Counters ...
Jose Armesto • 13.2k
35 votes
2 answers
15k views

Why does increase() return a value of 1.33 in prometheus?

We graph a timeseries with sum(increase(foo_requests_total[1m])) to show the number of foo requests per minute. Requests come in quite sporadically - just a couple of requests per day. The value that ...
James • 12k
34 votes
7 answers
117k views

Most recent value or last seen value

Prometheus is built around returning a time series representation of metrics. In many cases, however, I only care about what the state of a metric is right now, and I'm having a hard time figuring out ...
Cory Klein • 53.6k
34 votes
3 answers
36k views

How to display all metrics that don't have a specific label

I want to select all metrics that don't have the label "container". Is there any way to do that with a Prometheus query?
cristi • 2,119
34 votes
4 answers
30k views

Prometheus (in Docker container) Cannot Scrape Target on Host

Prometheus is running inside a Docker container (version 18.09.2, build 6247962, docker-compose.xml below) and the scrape target is on localhost:8000, which is created by a Python 3 script. Error ...
Nyxynyx • 62.5k
34 votes
4 answers
49k views

How can I visualize a histogram with Promdash or Grafana?

I'm attracted to Prometheus by the histogram (and summary) time series, but I've been unable to display a histogram in either Promdash or Grafana. What I expect is to be able to show: a ...
TvE • 1,076
34 votes
2 answers
26k views

How dangerous are high-cardinality labels in Prometheus?

I'm considering exporting some metrics to Prometheus, and I'm getting nervous about what I'm planning to do. My system consists of a workflow engine, and I'd like to track some metrics for each step ...
Mark • 11.5k
33 votes
3 answers
45k views

Prometheus vs ElasticSearch. Which is better for container and server monitoring? [closed]

ElasticSearch is a document store and more of a search engine. I think ElasticSearch is not a good choice for monitoring high-dimensional data, as it consumes a lot of resources. On the other hand ...
Aditya C S
33 votes
3 answers
91k views

Get total and free disk space using Prometheus

I'm trying to get the total and free disk space on my Kubernetes VM so I can display the percentage of space used on it. I tried various metrics that included "filesystem" in the name, but none of these displayed the correct total ...
Uliysess • 639
33 votes
3 answers
35k views

increase() in Prometheus sometimes doubles values: how to avoid?

I've found that for some graphs I get doubled values from Prometheus where there should be just ones. The query I use: increase(signups_count[4m]) The scrape interval is set to the recommended maximum of 2 ...
sanmai • 30.1k
32 votes
2 answers
28k views

How to add https url on target prometheus

I want to add my HTTPS target URL to Prometheus, but an error like this appears: "https://myDomain.dev" is not a valid hostname. My domain is accessible and runs behind an Nginx proxy pass with ...
Inadrawiba
31 votes
1 answer
34k views

Prometheus - exclude 0 values from query result

I'm displaying a Prometheus query in a Grafana table. This is the query (counter metric): sum(increase(check_fail{app="monitor"}[20m])) by (reason) The result is a table of failure reasons and their counts. ...
nirsky • 3,065
31 votes
5 answers
90k views

How to rename label within a metric in Prometheus

I have a query: node_systemd_unit_state{instance="server-01",job="node-exporters",name="kubelet.service",state="active"} 1 I want the label name to be renamed (or replaced) to unit_name ONLY within ...
Konstantin Vustin
29 votes
3 answers
100k views

prometheus doesn't match regex query

I'm trying to write a prometheus query in grafana that will select visits_total{route!~"/api/docs/*"} What I'm trying to say is that it should select all the instances where the route doesn't match /...
ninesalt • 4,224
29 votes
6 answers
24k views

How to scrape all metrics from a federate endpoint?

We have a hierarchical Prometheus setup with some servers scraping others. We'd like to have some servers scrape all metrics from others. Currently we try to use match[]="{__name__=~".*"}" as a metric ...
tex • 2,121
29 votes
2 answers
19k views

Prometheus - Aggregate and relabel by regex

I currently have the following PromQL query, which allows me to query the memory used by each of my K8s pods: sum(container_memory_working_set_bytes{image!="",name=~"^k8s_.*"}) by (pod_name) The pod's ...
Mornor • 3,672
28 votes
2 answers
45k views

multiple values from grafana variable in prometheus query

We have a situation where we need to select multiple values (instances/servers) from a Grafana variable field, and the multiple values need to be passed to the Prometheus query using some regex, so that I ...
Anil Kumar
28 votes
3 answers
51k views

Generating range vectors from return values in Prometheus queries

I have a metric varnish_main_client_req of type counter and I want to set up an alert that triggers if the rate of requests drops/rises by a certain amount in a given time (e.g. "Amount of ...
Paul Voss • 735
28 votes
5 answers
29k views

How to gracefully avoid divide by zero in Prometheus

There are times when you need to divide one metric by another metric. For example, I'd like to calculate a mean latency like that: rate({__name__="hystrix_command_latency_total_seconds_sum"}[60s]) / ...
Yoory N. • 5,191
27 votes
9 answers
32k views

Prometheus instant vector vs range vector

There's something I still don't understand about instant vectors and range vectors. Instant vector - a set of time series containing a single sample for each time series, all sharing the same timestamp. ...
small • 303
27 votes
3 answers
19k views

Prometheus/PromQL subtract two gauge metrics

I have this gauge metric "metric_awesome" from two different instances. What I want to do is subtract instance one from instance two, like so: metric_awesome{instance="one"} - metric_awesome{instance="...
alknows • 2,022
27 votes
3 answers
37k views

How to execute multiple queries in one call in Prometheus

I'm running Prometheus inside a Kubernetes cluster. I need to send queries to Prometheus every minute to gather information on many metrics from many containers. There are too many queries, so I ...
roie • 765
26 votes
3 answers
88k views

Filter prometheus results by metric value, not by label value

Because Prometheus topk returns more results than expected, and because https://github.com/prometheus/prometheus/issues/586 requires client-side processing that has not yet been made available via ...
Steve Dwire
26 votes
5 answers
42k views

Understanding histogram_quantile based on rate in Prometheus

According to the Prometheus documentation, in order to get the 95th percentile using a histogram metric I can use the following query: histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) ...
evgeniy44 • 3,012
