Grafana Cloud Outage History
50 incidents reported. Data sourced from the official Grafana Cloud status page.
Total incidents: 50 · Major/Critical: 22 · Minor: 12 · Resolved: 50
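The counts above are simple tallies over the incident list that follows. As a minimal sketch of how such a summary might be reproduced from your own copy of this history, the Python below assumes the incidents have already been parsed into a list of dicts with "severity" and "status" fields; those field names and the parsing step are assumptions made for illustration, not part of any Grafana Cloud API.

```python
from collections import Counter

# Hypothetical parsed form of the incident list below; the field names are
# assumptions for this example, not a Grafana Cloud status-page schema.
incidents = [
    {"title": "IRM Pages Not Accessible", "severity": "critical", "status": "resolved"},
    {"title": "Some Dashboards in Prod-Us-Central-3 unable to load", "severity": "major", "status": "resolved"},
    {"title": "Grafana OnCall and IRM Loading Issues", "severity": "minor", "status": "resolved"},
    # ... remaining incidents ...
]

# Tally severities once, then derive the summary figures shown above.
severity_counts = Counter(i["severity"] for i in incidents)

print("Total incidents:", len(incidents))
print("Major/Critical:", severity_counts["major"] + severity_counts["critical"])
print("Minor:", severity_counts["minor"])
print("Resolved:", sum(1 for i in incidents if i["status"] == "resolved"))
```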
February 2026
IRM Pages Not Accessible
critical · Feb 3, 04:52 PM → Feb 3, 05:56 PM · resolved
Feb 3, 05:56 PM
resolved — This incident has been resolved.
Feb 3, 05:38 PM
monitoring — A fix was implemented, and we are seeing recovery throughout the rollout. We will continue to monitor results.
Feb 3, 05:29 PM
identified — The issue has been identified and we are implementing a fix.
+1 more update
January 2026
Some Dashboards in Prod-Us-Central-3 unable to load
major · Jan 28, 05:27 PM → Jan 28, 08:25 PM · resolved
Jan 28, 08:25 PM
resolved — This incident has been resolved.
Jan 28, 06:24 PM
monitoring — A fix has been implemented, and we are monitoring the results.
Jan 28, 05:27 PM
investigating — We are currently investigating an issue impacting dashboards for users in the prod-us-central-3 region. This is preventing impacted dashboards from loading as expected.
This is also impacting a very ...
Grafana OnCall and IRM Loading Issues
minor · Jan 27, 08:37 PM → Jan 28, 12:22 AM · resolved
Jan 28, 12:22 AM
resolved — We continue to observe a continued period of recovery. At this time, we are considering this issue resolved. No further updates.
Jan 27, 10:56 PM
monitoring — As of 22:55 UTC, we have observed marked improvement with the incident impacting IRM and OnCall. We are still investigating and will continue to monitor and provide updates.
Jan 27, 08:37 PM
investigating — We are currently investigating an issue impacting some customers when accessing Grafana Oncall and IRM. Impacted customers may experience long load times, or even time-outs when attempting to access t...
Grafana Cloud instances unavailable
major · Jan 27, 10:17 AM → Jan 27, 11:14 AM · resolved
Jan 27, 11:14 AM
resolved — This incident has been resolved.
Jan 27, 10:33 AM
monitoring — A fix has been implemented and we are monitoring the results.
Jan 27, 10:17 AM
investigating — Some users are experiencing their Grafana Cloud instances as unavailable.
Increased write error rate for logs in prod-us-west-0
none · Jan 27, 07:49 AM → Jan 27, 07:49 AM · resolved
Jan 27, 07:49 AM
resolved — We were experiencing increased write error rate for logs in prod-us-west-0 from 6:55 to 7:15 UTC. We have since observed continued stability and are marking this as resolved.
Upgrade from Free → Pro failing for users
major · Jan 26, 08:53 PM → Jan 27, 12:13 AM · resolved
Jan 27, 12:13 AM
resolved — Engineering has released a fix and as of 00:13 UTC, customers should no longer experience issues upgrading from Free to Pro subscriptions. At this time, we are considering this issue resolved. No furt...
Jan 26, 09:52 PM
identified — Engineering has identified the issue and is currently exploring remediation options. At this time, users will continue to experience the inability to upgrade from Free to Pro subscriptions.
We will c...
Jan 26, 08:53 PM
investigating — As of 20:05 UTC, our engineering team became aware of an issue related to subscription plan upgrades. Users experiencing this issue will not be able to upgrade from a Free plan to a Pro subscription.
...
Investigating Issues with Email Delivery
none · Jan 23, 03:37 PM → Jan 23, 06:44 PM · resolved
Jan 23, 06:44 PM
resolved — This incident has been resolved.
Jan 23, 04:55 PM
monitoring — We are noticing significant improvement, and things are stabilizing as expected. Our engineering teams will continue to monitor progress.
Jan 23, 03:37 PM
investigating — We are currently investigating an issue impacting email delivery for some services, including Alert Notifications.
Synthetic monitoring secrets - proxy URL changes
none · Jan 21, 09:16 PM → Jan 22, 10:29 PM · resolved
Jan 22, 10:29 PM
resolved — The incident is resolved. We are in contact with customers affected by this change.
Jan 21, 09:16 PM
identified — During the secrets migration in https://status.grafana.com/incidents/47d1q4sphrmj, secrets proxy URLs for some customers updated in the following regions: prod-us-central-0, prod-us-east-0, and pro...
Hosted Traces elevated write latency in prod-us-central-0 region.
minor · Jan 21, 01:24 PM → Jan 21, 03:21 PM · resolved
Jan 21, 03:21 PM
resolved — We consider this incident as resolved since the latency hasn't been elevated since the fix was applied. The issue was caused by a latency spike in a downstream dependency, causing an increased backpre...
Jan 21, 01:35 PM
monitoring — The issue was identified and a fix was applied. After applying the fix, latency went down to a regular and expected value. We're currently monitoring the component's health before resolving the incide...
Jan 21, 01:24 PM
investigating — We're currently investigating an issue with elevated write latency in Hosted Traces prod-us-central-0 region. It's experiencing sustained high write latency since 7:20 AM UTC. Only a small subset of t...
Incident: Metrics Querying Unavailable in EU (Resolved)
none · Jan 19, 02:30 PM → Jan 19, 02:30 PM · resolved
Jan 19, 03:30 PM
resolved — Impact:
Between 14:30 and 14:38 UTC, some customers in prod-eu-west-2 may have experienced issues querying metrics. During this time, read requests to the metrics backend were unavailable, resulting i...
Degraded Writes in AWS us-east-2
minor · Jan 17, 11:28 AM → Jan 19, 01:21 AM · resolved
Jan 19, 01:21 AM
resolved — This incident has been resolved.
Jan 18, 09:24 AM
monitoring — The issue hasn't been seen for a reasonable amount of time and hasn't occurred when it was expected to occur. We're still closely monitoring systems behaviour and will update this incident accordingly...
Jan 18, 02:20 AM
investigating — We are continuing to investigate this issue.
It is impacting all components using the write path in cortex-prod-13 and mimir-prod-56.
We do not yet have a root cause but have found that this issue s...
+6 more updates
Degraded Writes in AWS us-east-2
none · Jan 17, 07:51 AM → Jan 17, 09:04 AM · resolved
Jan 17, 09:04 AM
resolved — This incident has been resolved.
Jan 17, 07:51 AM
investigating — We are currently investigating an issue causing degraded write performance across multiple products in the AWS us-east-2 region. Our engineering team is actively working to determine the full scope an...
Partial Mimir Write Outage
major · Jan 16, 12:28 AM → Jan 16, 04:00 AM · resolved
Jan 16, 04:00 AM
resolved — This incident has been resolved.
Both read and write 5xx's and increased latency were experienced in the two periods:
23:56:15 to 00:32:45 UTC
00:55:30 to 01:36:15 UTC
Jan 16, 02:27 AM
monitoring — Customers should no longer experience issues.
We will continue to monitor and provide updates.
Jan 16, 01:35 AM
investigating — We are continuing to investigate this issue.
+3 more updates
Connectivity issues for Azure PrivateLink endpoints.
major · Jan 14, 02:30 PM → Jan 14, 08:17 PM · resolved
Jan 14, 08:17 PM
resolved — The scope of this incident was smaller than originally anticipated.
As of 16:27 UTC our engineering team merged a fix for those affected and we are considering this as resolved.
Jan 14, 02:30 PM
investigating — We're experiencing an issue with connectivity loss for Azure PrivateLink endpoints in all available Azure regions. The issue affects users trying to ingest Alloy data or use PDC over Azure PrivateLink...
PDC Agent Connectivity Issues in prod-eu-west-3
major · Jan 12, 03:44 PM → Jan 12, 06:21 PM · resolved
Jan 12, 06:21 PM
resolved — We continue to observe a continued period of recovery. At this time, we are considering this issue resolved. No further updates.
Jan 12, 05:01 PM
monitoring — Engineering has released a fix and as of 17:01 UTC, customers should no longer experience connectivity issues. We will continue to monitor for recurrence and provide updates accordingly.
Jan 12, 04:50 PM
identified — Engineering has identified the issue and will be deploying a fix shortly. At this time, users will continue to experience disruptions for queries routed via PDC.
We will continue to provide updates a...
+1 more update
Tempo write degradation in prod-eu-west-3 - tempo-prod-08
minor · Jan 12, 09:03 AM → Jan 12, 03:26 PM · resolved
Jan 12, 03:26 PM
resolved — Engineering has released a fix and we continue to observe a period of recovery. As of 15:12 UTC we are considering this resolved.
Jan 12, 11:41 AM
investigating — There was a full degradation of write service between 9:13 UTC - 9:35 UTC. The cell is operational but there is still degradation in the write path. Our Engineering team is actively working on this.
Jan 12, 09:09 AM
investigating — We are continuing to investigate this issue.
+1 more update
Write Degradation in Grafana Cloud Logs (prod-us-east-3)
none · Jan 9, 08:30 PM → Jan 9, 08:30 PM · resolved
Jan 9, 11:08 PM
resolved — Between 20:23 UTC and 20:53 UTC, Grafana Cloud Logs in prod-us-east-3 experienced a write degradation, which may have resulted in delayed or failed log ingestion for some customers.
The issue has bee...
Partial Write Outage in prod-us-central-0
none · Jan 7, 05:41 PM → Jan 7, 05:41 PM · resolved
Jan 7, 05:41 PM
resolved — There was a ~15 minute partial write outage for some customers in prod-us-central-0. The time frame for this outage was 15:43-15:57 UTC.
High Latency and Errors in Prod-Us-Central-7
major · Jan 6, 05:41 PM → Jan 6, 08:26 PM · resolved
Jan 6, 08:26 PM
resolved — This incident has been resolved.
Jan 6, 05:50 PM
monitoring — We are seeing some recovery in affected products. We are continuing to monitor the progress.
Jan 6, 05:41 PM
investigating — We are currently investigating an issue causing degraded Mimir and Tempo read performance in the prod-us-central-7 region.
Cloudflare Error 1016
none · Jan 6, 03:09 PM → Jan 6, 03:09 PM · resolved
Jan 6, 03:09 PM
resolved — From 20:32 to 20:37 UTC, a DNS record misconfiguration resulted in temporary Cloudflare 1016 DNS errors on many Grafana Cloud stacks.
The misconfiguration was mitigated within 5 minutes, and we are w...
K6 test-runs cannot be started and the overall navigation experience is degraded
minor · Jan 2, 10:44 AM → Jan 2, 01:38 PM · resolved
Jan 2, 01:38 PM
resolved — This incident has been resolved.
Jan 2, 11:53 AM
monitoring — We are continuing to monitor for any further issues.
Jan 2, 11:51 AM
monitoring — A fix has been implemented and we are monitoring the results.
+1 more update
December 2025
PDC Queries in Prod-Us-East-3 Are Failing
minor · Dec 23, 08:31 PM → Dec 23, 09:59 PM · resolved
Dec 23, 09:59 PM
resolved — This incident has been resolved.
Dec 23, 09:09 PM
monitoring — A fix has been implemented and we are monitoring the results.
Dec 23, 08:34 PM
identified — We have identified the issue, and are working on a fix now.
+1 more update
New K6 Tests Intermittently Failing
major · Dec 19, 04:10 PM → Dec 19, 04:50 PM · resolved
Dec 19, 04:50 PM
resolved — This incident has been resolved.
Dec 19, 04:50 PM
investigating — We are continuing to investigate this issue.
Dec 19, 04:10 PM
investigating — We are currently investigating an issue that is causing intermittent failures when trying to start new k6 tests from the k6 app in the cloud. Tests start normally from the CLI.
Logs write path degradation on GCP Belgium - prod-eu-west-0
none · Dec 17, 01:02 PM → Dec 17, 01:02 PM · resolved
Dec 17, 01:02 PM
resolved — Today, the 17th, from 10:00 UTC to 10:38 UTC we experienced logs write path degradation. Customers may have experienced 5xx errors during ingestion on that cluster. Service is fully restored.
IRM access issues on instances in GCP Belgium (prod-eu-west-0)
none · Dec 16, 12:51 PM → Dec 16, 12:51 PM · resolved
Dec 16, 12:51 PM
resolved — Due to an incident today in the IRM (OnCall) product, access to this application was degraded from 10:39 UTC until 11:16 UTC.
Customers may have experienced the app being inoperable or inaccessible. The servic...
k6 Cloud service is down
none · Dec 16, 10:37 AM → Dec 16, 10:42 AM · resolved
Dec 16, 10:42 AM
resolved — This incident has been resolved.
Dec 16, 10:37 AM
investigating — The main k6 Cloud service is down due to database issues, and it is not possible to access test runs or start new ones.
Some TraceQL Queries Failing
major · Dec 12, 02:17 PM → Dec 12, 02:38 PM · resolved
Dec 12, 02:38 PM
resolved — This incident has been resolved.
Dec 12, 02:17 PM
identified — TraceQL queries with "= nil" in Explore Traces and part of Drilldown Traces are failing with 400 Bad Request errors. The issue has been identified, and a fix is currently being rolled out.
Mimir Partial Write Outage
major · Dec 11, 09:43 PM → Dec 11, 11:17 PM · resolved
Dec 11, 11:17 PM
resolved — We continue to observe a continued period of recovery. At this time, we are considering this issue resolved. No further updates.
Dec 11, 11:02 PM
monitoring — Synthetic Monitoring has now also recovered. Customers should no longer experience alert rules failing to evaluate.
We continue to monitor for recurrence and will provide updates accordingly.
Dec 11, 10:33 PM
monitoring — Engineering has released a fix and as of 22:25 UTC, customers should no longer experience ingestion issues. We will continue to monitor for recurrence and provide updates accordingly.
+2 more updates
Elevated Metric Push Failures and Latency
major · Dec 10, 07:29 PM → Dec 10, 11:05 PM · resolved
Dec 10, 11:05 PM
resolved — We continue to observe a continued period of recovery. At this time, we are considering this issue resolved. No further updates.
Dec 10, 10:22 PM
monitoring — We are observing a trend in improvement after implementing a fix. We will continue to monitor and update accordingly.
During our investigation, we also became aware that some alerts associated with ...
Dec 10, 09:38 PM
identified — Our engineering team has identified a potential root cause, and a fix is being implemented.
+4 more updates
Elevated Log Push Failures and Latency on prod-eu-west-0 cluster
major · Dec 10, 07:30 PM → Dec 10, 07:30 PM · resolved
Dec 11, 02:35 PM
resolved — Users experienced failed log pushes as well as increased latency when sending logs to the Loki service hosted on the prod-eu-west-0 cluster between 18:30 UTC and ~23:00 UTC.
Our engineering team has engaged o...
Metrics read issue affecting cortex-prod-13 on prod-us-east-0
critical · Dec 10, 08:23 AM → Dec 10, 09:06 AM · resolved
Dec 10, 09:06 AM
resolved — The incident has been resolved.
Dec 10, 08:30 AM
monitoring — Read path has been restored at 08:23 UTC and queries are fully functioning again. The read path outage lasted from 08:04 to 08:23 UTC.
Dec 10, 08:23 AM
investigating — At 08:04 UTC we detected read path outage (queries) on cortex-prod-13. We are currently investigating this issue.
The ingestion path (writes) is not affected.
Logs query degradation on AWS Germany (prod-eu-west-2)
major · Dec 9, 12:59 PM → Dec 9, 01:10 PM · resolved
Dec 9, 01:10 PM
resolved — The issue has been resolved.
Dec 9, 01:06 PM
monitoring — The query service is operational again, logs reads should be available on the cluster.
Our engineers are monitoring the health status of the service to ensure full recovery.
Dec 9, 12:59 PM
identified — Since around 12:30 UTC today, the 9th, we have been experiencing problems on the Loki read path of cluster eu-west-2.
This translates into difficulty querying logs for customers on this cluster, and can a...
Hosted Grafana is currently being impacted as a result of the Cloudflare outage
none · Dec 5, 09:25 AM → Dec 5, 09:44 AM · resolved
Dec 5, 09:44 AM
resolved — This incident has been resolved.
Dec 5, 09:25 AM
investigating — We are currently experiencing disruptions to Hosted Grafana services due to a widespread Cloudflare outage impacting connectivity across multiple regions. Our team is actively monitoring the situation...
Loki prod-ap-northeast-0-loki-prod-030 writes degradation
minor · Dec 1, 08:00 AM → Dec 1, 08:00 AM · resolved
Dec 1, 09:30 AM
resolved — The Loki prod-ap-northeast-0-loki-prod-030 cell had write degradation between 8:11 and 8:58 AM UTC. The engineering team mitigated the situation and the cell is stable now.
November 2025
Alerts failing with Prometheus
minor · Nov 27, 04:49 PM → Nov 27, 06:27 PM · resolved
Nov 27, 06:27 PM
resolved — This incident has been resolved.
Nov 27, 06:02 PM
monitoring — A fix has been implemented and we are monitoring the results.
Nov 27, 04:49 PM
investigating — We are currently investigating an issue around degraded services in prod-us-central-0. The expected behavioral impact is that the queries may take longer than usual to respond.
Synthetic Monitoring is down in prod-us-central-7
critical · Nov 24, 11:19 AM → Nov 24, 11:49 AM · resolved
Nov 24, 11:49 AM
resolved — This incident has been resolved.
Nov 24, 11:19 AM
investigating — Users cannot interact with the SM API for any DB-related action, such as:
CRUD checks
CRUD probes
Longer Than Expected Load Times on Grafana Cloud
major · Nov 21, 03:47 PM → Nov 22, 05:06 PM · resolved
Nov 22, 05:06 PM
resolved — We continue to observe a continued period of recovery. At this time, we are considering this issue resolved. No further updates.
Nov 21, 05:04 PM
monitoring — A fix has been implemented, and we are seeing latency down across clusters. We are continuing to monitor progress.
Nov 21, 03:47 PM
investigating — We are currently investigating reports of long load times on Grafana Cloud. We will update as more information becomes available.
Some Loki Writes in Prod-Gb-South-0 Failed
major · Nov 21, 02:32 PM → Nov 21, 03:38 PM · resolved
Nov 21, 03:38 PM
resolved — This incident has been resolved.
Nov 21, 02:32 PM
monitoring — From approximately 14:10-14:25 UTC, writes to Loki failed for a subset of customers in the gb-south-0 region. Most of these errors have already recovered, and our team continues to monitor the recover...
Slow user queries exceed threshold
minor · Nov 21, 12:09 PM → Nov 21, 02:52 PM · resolved
Nov 21, 02:52 PM
resolved — This incident has been resolved.
Nov 21, 01:08 PM
monitoring — A fix has been implemented and we are monitoring the results.
Nov 21, 12:10 PM
investigating — We are experiencing some intermittent query failures.
+1 more update
Elevated Read & Write Latency for Some Cells in Prod-Us-East-0
major · Nov 20, 05:12 PM → Nov 20, 06:47 PM · resolved
Nov 20, 06:47 PM
resolved — This incident has been resolved.
Nov 20, 05:47 PM
monitoring — Things have recovered, and we are monitoring to ensure stability.
Nov 20, 05:23 PM
identified — The previous post mentioned that this was occurring in some cells in prod-us-central-0. This is incorrect and is occurring in some cells in prod-us-east-0.
+1 more update
Intermittent issues when starting k6 cloud test runs
none · Nov 19, 09:00 PM → Nov 19, 09:00 PM · resolved
Nov 20, 04:35 PM
resolved — We are experiencing issues starting cloud test runs. This is primarily affecting browser test runs and test runs using static IPs.
Missing Billing Metrics for Loki
none · Nov 19, 06:20 AM → Nov 19, 06:20 AM · resolved
Nov 19, 06:20 AM
resolved — An incident has impacted Loki billing metrics from 05:30 to 06:10 UTC across all clusters. This is currently resolved, however, users may notice some billing metrics missing from the billing dashboard...
Hosted Grafana is currently being impacted as a result of the Cloudflare outage
none · Nov 18, 11:59 AM → Nov 18, 05:34 PM · resolved
Nov 18, 05:34 PM
resolved — This incident has been resolved.
Nov 18, 11:59 AM
investigating — We are currently experiencing disruption to Hosted Grafana services due to a widespread Cloudflare outage impacting connectivity across multiple regions. Our team is actively monitoring the situation ...
Cortex - read/write path disruption
major · Nov 17, 02:06 PM → Nov 18, 07:26 AM · resolved
Nov 18, 07:26 AM
resolved — This incident has been resolved.
Nov 17, 10:49 PM
identified — We’re continuing to work on this issue, and are actively investigating the remaining details. Our team is making progress, and we’ll share another update as soon as we have more information to provide...
Nov 17, 02:49 PM
identified — We are continuing to work on a fix for this issue.
+1 more update
Elevated Mimir Read/Write Errors
major · Nov 17, 10:22 PM → Nov 18, 12:10 AM · resolved
Nov 18, 12:10 AM
resolved — We continue to observe a continued period of recovery. As of 00:08 UTC, we are considering this issue resolved. No further updates.
Nov 17, 11:27 PM
monitoring — A fix has been implemented and we are monitoring the results.
Nov 17, 10:55 PM
investigating — Our teams have been alerted that Synthetic Monitoring will also be affected by this outage.
Users may see gaps in their Synthetic Monitoring metrics as well as missed alerts as a result of this.
W...
+4 more updates
Metrics Write Outage in Multiple Cells
major · Nov 17, 08:02 PM → Nov 17, 11:08 PM · resolved
Nov 17, 11:08 PM
resolved — As of 23:07 UTC we are considering this incident as resolved.
Mitigation efforts have restored normal write performance, and error rates have returned to expected levels.
We have confirmed stabili...
Nov 17, 09:22 PM
monitoring — We are seeing improvement on the metrics side, with write performance recovering.
We continue to investigate the remaining impact to Synthetic Monitoring and are working to determine the underlying ...
Nov 17, 08:50 PM
identified — Our teams have been alerted that Synthetic Monitoring will also be affected by this outage.
Users may see gaps in their Synthetic Monitoring metrics as well as missed alerts as a result of this.
W...
+3 more updates
Hyderabad Probe Issues
minor · Nov 17, 07:01 PM → Nov 17, 07:01 PM · resolved
Nov 17, 07:01 PM
resolved — We experienced degraded service with the Hyderabad probe today around 13:20 UTC, which was resolved as of 17:30 UTC.
PDC-Prod-eu-west-2 cluster degraded performance
none · Nov 17, 11:56 AM → Nov 17, 12:23 PM · resolved
Nov 17, 12:23 PM
resolved — This incident has been resolved.
Nov 17, 12:22 PM
monitoring — Engineering has released a fix and as of 12:15 UTC, customers should no longer experience performance degradation on the PDC service. We will continue to monitor for recurrence and provide updates acc...
Nov 17, 11:56 AM
investigating — We are currently facing performance degradation on the PDC service hosted on the prod-eu-west-2 cluster. Our engineering team is working on fixing the issue; we apologize for any inconvenience.
Loki Prod 012 read-path-unstable
minor · Nov 17, 04:12 AM → Nov 17, 04:55 AM · resolved
Nov 17, 04:55 AM
resolved — Resolved since 03:02 UTC.
Nov 17, 04:13 AM
monitoring — We began seeing instability in Alerting and Recording rules for this cell at 2:30 AM UTC. They started recovering at around 3 AM UTC, but we're still watching.
Nov 17, 04:12 AM
monitoring — A fix has been implemented and we are monitoring the results.
Degraded Browser Check Performance
minor · Nov 12, 04:50 PM → Nov 12, 04:50 PM · resolved
Nov 12, 04:50 PM
resolved — Spanning from November 10th, 18:00 UTC to November 11th, 22:00 UTC, Synthetic Monitoring experienced degraded browser check performance due to a faulty release that has been rolled back.
This impacte...