Grafana Cloud Outage History
50 incidents reported. Data sourced from the official Grafana Cloud status page.
Total incidents: 50 · Major/Critical: 22 · Minor: 12 · Resolved: 50
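The counts above are simple tallies over the incident list that follows. As a minimal sketch of how such a summary might be reproduced from your own copy of this history, the Python below assumes the incidents have already been parsed into a list of dicts with "severity" and "status" fields; those field names and the parsing step are assumptions made for illustration, not part of any Grafana Cloud API.

```python
from collections import Counter

# Hypothetical parsed form of the incident list below; the field names are
# assumptions for this example, not a Grafana Cloud status-page schema.
incidents = [
    {"title": "IRM Pages Not Accessible", "severity": "critical", "status": "resolved"},
    {"title": "Some Dashboards in Prod-Us-Central-3 unable to load", "severity": "major", "status": "resolved"},
    {"title": "Grafana OnCall and IRM Loading Issues", "severity": "minor", "status": "resolved"},
    # ... remaining incidents ...
]

# Tally severities once, then derive the summary figures shown above.
severity_counts = Counter(i["severity"] for i in incidents)

print("Total incidents:", len(incidents))
print("Major/Critical:", severity_counts["major"] + severity_counts["critical"])
print("Minor:", severity_counts["minor"])
print("Resolved:", sum(1 for i in incidents if i["status"] == "resolved"))
```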
February 2026
IRM Pages Not Accessible
critical · Feb 3, 04:52 PM → Feb 3, 05:56 PM · resolved
Feb 3, 05:56 PM
resolved — This incident has been resolved.
Feb 3, 05:38 PM
monitoring — A fix was implemented, and we are seeing recovery throughout the rollout. We will continue to monitor results.
Feb 3, 05:29 PM
identified — The issue has been identified and we are implementing a fix.
+1 more update
January 2026
Some Dashboards in Prod-Us-Central-3 unable to load
major · Jan 28, 05:27 PM → Jan 28, 08:25 PM · resolved
Jan 28, 08:25 PM
resolved — This incident has been resolved.
Jan 28, 06:24 PM
monitoring — A fix has been implemented, and we are monitoring the results.
Jan 28, 05:27 PM
investigating — We are currently investigating an issue impacting dashboards for users in the prod-us-central-3 region. This is preventing impacted dashboards from loading as expected.
This is also impacting a very ...
Grafana OnCall and IRM Loading Issues
minor · Jan 27, 08:37 PM → Jan 28, 12:22 AM · resolved
Jan 28, 12:22 AM
resolved — We continue to observe a continued period of recovery. At this time, we are considering this issue resolved. No further updates.
Jan 27, 10:56 PM
monitoring — As of 22:55 UTC, we have observed marked improvement with the incident impacting IRM and OnCall. We are still investigating and will continue to monitor and provide updates.
Jan 27, 08:37 PM
investigating — We are currently investigating an issue impacting some customers when accessing Grafana Oncall and IRM. Impacted customers may experience long load times, or even time-outs when attempting to access t...
Grafana Cloud instances unavailable
major · Jan 27, 10:17 AM → Jan 27, 11:14 AM · resolved
Jan 27, 11:14 AM
resolved — This incident has been resolved.
Jan 27, 10:33 AM
monitoring — A fix has been implemented and we are monitoring the results.
Jan 27, 10:17 AM
investigating — Some users are experiencing their Grafana Cloud instances as unavailable.
Increased write error rate for logs in prod-us-west-0
none · Jan 27, 07:49 AM → Jan 27, 07:49 AM · resolved
Jan 27, 07:49 AM
resolved — We were experiencing increased write error rate for logs in prod-us-west-0 from 6:55 to 7:15 UTC. We have since observed continued stability and are marking this as resolved.
Upgrade from Free → Pro failing for users
major · Jan 26, 08:53 PM → Jan 27, 12:13 AM · resolved
Jan 27, 12:13 AM
resolved — Engineering has released a fix and as of 00:13 UTC, customers should no longer experience issues upgrading from Free to Pro subscriptions. At this time, we are considering this issue resolved. No furt...
Jan 26, 09:52 PM
identified — Engineering has identified the issue and is currently exploring remediation options. At this time, users will continue to experience the inability to upgrade from Free to Pro subscriptions.
We will c...
Jan 26, 08:53 PM
investigating — As of 20:05 UTC, our engineering team became aware of an issue related to subscription plan upgrades. Users experiencing this issue will not be able to upgrade from a Free plan to a Pro subscription.
...
Investigating Issues with Email Delivery
none · Jan 23, 03:37 PM → Jan 23, 06:44 PM · resolved
Jan 23, 06:44 PM
resolved — This incident has been resolved.
Jan 23, 04:55 PM
monitoring — We are noticing significant improvement, and things are stabilizing as expected. Our engineering teams will continue to monitor progress.
Jan 23, 03:37 PM
investigating — We are currently investigating an issue impacting email delivery for some services, including Alert Notifications.
Synthetic monitoring secrets - proxy URL changes
none · Jan 21, 09:16 PM → Jan 22, 10:29 PM · resolved
Jan 22, 10:29 PM
resolved — The incident is resolved. We are in contact with customers affected by this change.
Jan 21, 09:16 PM
identified — During the secrets migration in https://status.grafana.com/incidents/47d1q4sphrmj, secrets proxy URLs for some customers updated in the following regions: prod-us-central-0, prod-us-east-0, and pro...
Hosted Traces elevated write latency in prod-us-central-0 region.
minor · Jan 21, 01:24 PM → Jan 21, 03:21 PM · resolved
Jan 21, 03:21 PM
resolved — We consider this incident as resolved since the latency hasn't been elevated since the fix was applied. The issue was caused by a latency spike in a downstream dependency, causing an increased backpre...
Jan 21, 01:35 PM
monitoring — The issue was identified and a fix was applied. After applying the fix, latency went down to a regular and expected value. We're currently monitoring the component's health before resolving the incide...
Jan 21, 01:24 PM
investigating — We're currently investigating an issue with elevated write latency in Hosted Traces prod-us-central-0 region. It's experiencing sustained high write latency since 7:20 AM UTC. Only a small subset of t...
Incident: Metrics Querying Unavailable in EU (Resolved)
none · Jan 19, 02:30 PM → Jan 19, 02:30 PM · resolved
Jan 19, 03:30 PM
resolved — Impact:
Between 14:30 and 14:38 UTC, some customers in prod-eu-west-2 may have experienced issues querying metrics. During this time, read requests to the metrics backend were unavailable, resulting i...
Degraded Writes in AWS us-east-2
minor · Jan 17, 11:28 AM → Jan 19, 01:21 AM · resolved
Jan 19, 01:21 AM
resolved — This incident has been resolved.
Jan 18, 09:24 AM
monitoring — The issue hasn't been seen for a reasonable amount of time and hasn't occurred when it was expected to occur. We're still closely monitoring systems behaviour and will update this incident accordingly...
Jan 18, 02:20 AM
investigating — We are continuing to investigate this issue.
It is impacting all components using the write path in cortex-prod-13 and mimir-prod-56.
We do not yet have a root cause but have found that this issue s...
+6 more updates
Degraded Writes in AWS us-east-2
none · Jan 17, 07:51 AM → Jan 17, 09:04 AM · resolved
Jan 17, 09:04 AM
resolved — This incident has been resolved.
Jan 17, 07:51 AM
investigating — We are currently investigating an issue causing degraded write performance across multiple products in the AWS us-east-2 region. Our engineering team is actively working to determine the full scope an...
Partial Mimir Write Outage
major · Jan 16, 12:28 AM → Jan 16, 04:00 AM · resolved
Jan 16, 04:00 AM
resolved — This incident has been resolved.
Both read and write 5xx's and increased latency were experienced in the two periods:
23:56:15 to 00:32:45 UTC
00:55:30 to 01:36:15 UTC
Jan 16, 02:27 AM
monitoring — Customers should no longer experience issues.
We will continue to monitor and provide updates.
Jan 16, 01:35 AM
investigating — We are continuing to investigate this issue.
+3 more updates
Connectivity issues for Azure PrivateLink endpoints.
major · Jan 14, 02:30 PM → Jan 14, 08:17 PM · resolved
Jan 14, 08:17 PM
resolved — The scope of this incident was smaller than originally anticipated.
As of 16:27 UTC our engineering team merged a fix for those affected and we are considering this as resolved.
Jan 14, 02:30 PM
investigating — We're experiencing an issue with connectivity loss for Azure PrivateLink endpoints in all available Azure regions. The issue affects users trying to ingest Alloy data or use PDC over Azure PrivateLink...
PDC Agent Connectivity Issues in prod-eu-west-3
major · Jan 12, 03:44 PM → Jan 12, 06:21 PM · resolved
Jan 12, 06:21 PM
resolved — We continue to observe a continued period of recovery. At this time, we are considering this issue resolved. No further updates.
Jan 12, 05:01 PM
monitoring — Engineering has released a fix and as of 17:01 UTC, customers should no longer experience connectivity issues. We will continue to monitor for recurrence and provide updates accordingly.
Jan 12, 04:50 PM
identified — Engineering has identified the issue and will be deploying a fix shortly. At this time, users will continue to experience disruptions for queries routed via PDC.
We will continue to provide updates a...
+1 more update
Tempo write degradation in prod-eu-west-3 - tempo-prod-08
minor · Jan 12, 09:03 AM → Jan 12, 03:26 PM · resolved
Jan 12, 03:26 PM
resolved — Engineering has released a fix and we continue to observe a period of recovery. As of 15:12 UTC we are considering this resolved.
Jan 12, 11:41 AM
investigating — There was a full degradation of write service between 9:13 UTC - 9:35 UTC. The cell is operational but there is still degradation in the write path. Our Engineering team is actively working on this.
Jan 12, 09:09 AM
investigating — We are continuing to investigate this issue.
+1 more update
Write Degradation in Grafana Cloud Logs (prod-us-east-3)
none · Jan 9, 08:30 PM → Jan 9, 08:30 PM · resolved
Jan 9, 11:08 PM
resolved — Between 20:23 UTC and 20:53 UTC, Grafana Cloud Logs in prod-us-east-3 experienced a write degradation, which may have resulted in delayed or failed log ingestion for some customers.
The issue has bee...
Partial Write Outage in prod-us-central-0
none · Jan 7, 05:41 PM → Jan 7, 05:41 PM · resolved
Jan 7, 05:41 PM
resolved — There was a ~15 minute partial write outage for some customers in prod-us-central-0. The time frame for this outage was 15:43-15:57 UTC.
High Latency and Errors in Prod-Us-Central-7
major · Jan 6, 05:41 PM → Jan 6, 08:26 PM · resolved
Jan 6, 08:26 PM
resolved — This incident has been resolved.
Jan 6, 05:50 PM
monitoring — We are seeing some recovery in affected products. We are continuing to monitor the progress.
Jan 6, 05:41 PM
investigating — We are currently investigating an issue causing degraded Mimir and Tempo read performance in the prod-us-central-7 region.
Cloudflare Error 1016
none · Jan 6, 03:09 PM → Jan 6, 03:09 PM · resolved
Jan 6, 03:09 PM
resolved — From 20:32 to 20:37 UTC, a DNS record misconfiguration resulted in temporary Cloudflare 1016 DNS errors on many Grafana Cloud stacks.
The misconfiguration was mitigated within 5 minutes, and we are w...
K6 test-runs cannot be started and the overall navigation experience is degraded
minor · Jan 2, 10:44 AM → Jan 2, 01:38 PM · resolved
Jan 2, 01:38 PM
resolved — This incident has been resolved.
Jan 2, 11:53 AM
monitoring — We are continuing to monitor for any further issues.
Jan 2, 11:51 AM
monitoring — A fix has been implemented and we are monitoring the results.
+1 more update
December 2025
PDC Queries in Prod-Us-East-3 Are Failing
minor · Dec 23, 08:31 PM → Dec 23, 09:59 PM · resolved
Dec 23, 09:59 PM
resolved — This incident has been resolved.
Dec 23, 09:09 PM
monitoring — A fix has been implemented and we are monitoring the results.
Dec 23, 08:34 PM
identified — We have identified the issue, and are working on a fix now.
+1 more update
New K6 Tests Intermittently Failing
major · Dec 19, 04:10 PM → Dec 19, 04:50 PM · resolved
Dec 19, 04:50 PM
resolved — This incident has been resolved.
Dec 19, 04:50 PM
investigating — We are continuing to investigate this issue.
Dec 19, 04:10 PM
investigating — We are currently investigating an issue that is causing intermittent failures when trying to start new k6 tests from the k6 app in the cloud. Tests start normally from the CLI.
Logs write path degradation on GCP Belgium - prod-eu-west-0
none · Dec 17, 01:02 PM → Dec 17, 01:02 PM · resolved
Dec 17, 01:02 PM
resolved — Today, the 17th, from 10:00 UTC to 10:38 UTC we experienced logs write path degradation. Customers may have experienced 5xx errors during ingestion on that cluster. Service is fully restored.
IRM access issues on instances in GCP Belgium (prod-eu-west-0)
none · Dec 16, 12:51 PM → Dec 16, 12:51 PM · resolved
Dec 16, 12:51 PM
resolved — Due to an incident today in the IRM (OnCall) product, access to this application was degraded from 10:39 UTC until 11:16 UTC.
Customers may have experienced the app being inoperable or inaccessible. The servic...
k6 Cloud service is down
none · Dec 16, 10:37 AM → Dec 16, 10:42 AM · resolved
Dec 16, 10:42 AM
resolved — This incident has been resolved.
Dec 16, 10:37 AM
investigating — The main k6 Cloud service is down due to database issues, and it is not possible to access test runs or start new ones.
Some TraceQL Queries Failing
major · Dec 12, 02:17 PM → Dec 12, 02:38 PM · resolved
Dec 12, 02:38 PM
resolved — This incident has been resolved.
Dec 12, 02:17 PM
identified — TraceQL queries with "= nil" in Explore Traces and part of Drilldown Traces are failing with 400 Bad Request errors. The issue has been identified, and a fix is currently being rolled out.
Mimir Partial Write Outage
major · Dec 11, 09:43 PM → Dec 11, 11:17 PM · resolved
Dec 11, 11:17 PM
resolved — We continue to observe a continued period of recovery. At this time, we are considering this issue resolved. No further updates.
Dec 11, 11:02 PM
monitoring — Synthetic Monitoring has now also recovered. Customers should no longer experience alert rules failing to evaluate.
We continue to monitor for recurrence and will provide updates accordingly.
Dec 11, 10:33 PM
monitoring — Engineering has released a fix and as of 22:25 UTC, customers should no longer experience ingestion issues. We will continue to monitor for recurrence and provide updates accordingly.
+2 more updates
Elevated Metric Push Failures and Latency
major · Dec 10, 07:29 PM → Dec 10, 11:05 PM · resolved
Dec 10, 11:05 PM
resolved — We continue to observe a continued period of recovery. At this time, we are considering this issue resolved. No further updates.
Dec 10, 10:22 PM
monitoring — We are observing a trend in improvement after implementing a fix. We will continue to monitor and update accordingly.
During our investigation, we also became aware that some alerts associated with ...
Dec 10, 09:38 PM
identified — Our engineering team has identified a potential root cause, and a fix is being implemented.
+4 more updates
Elevated Log Push Failures and Latency on prod-eu-west-0 cluster
major · Dec 10, 07:30 PM → Dec 10, 07:30 PM · resolved
Dec 11, 02:35 PM
resolved — Users experienced failed log pushes as well as increased latency when sending logs to the Loki service hosted on the prod-eu-west-0 cluster between 18:30 UTC and ~23:00 UTC.
Our engineering team has engaged o...
Metrics read issue affecting cortex-prod-13 on prod-us-east-0
critical · Dec 10, 08:23 AM → Dec 10, 09:06 AM · resolved
Dec 10, 09:06 AM
resolved — The incident has been resolved.
Dec 10, 08:30 AM
monitoring — Read path has been restored at 08:23 UTC and queries are fully functioning again. The read path outage lasted from 08:04 to 08:23 UTC.
Dec 10, 08:23 AM
investigating — At 08:04 UTC we detected read path outage (queries) on cortex-prod-13. We are currently investigating this issue.
The ingestion path (writes) is not affected.
Logs query degradation on AWS Germany (prod-eu-west-2)
major · Dec 9, 12:59 PM → Dec 9, 01:10 PM · resolved
Dec 9, 01:10 PM
resolved — The issue has been resolved.
Dec 9, 01:06 PM
monitoring — The query service is operational again, logs reads should be available on the cluster.
Our engineers are monitoring the health status of the service to ensure full recovery.
Dec 9, 12:59 PM
identified — Since around 12:30 UTC today, the 9th, we have been experiencing problems on the Loki read path of cluster eu-west-2.
This translates into difficulty querying logs for customers on this cluster, and can a...
Hosted Grafana is currently being impacted as a result of the Cloudflare outage
none · Dec 5, 09:25 AM → Dec 5, 09:44 AM · resolved
Dec 5, 09:44 AM
resolved — This incident has been resolved.
Dec 5, 09:25 AM
investigating — We are currently experiencing disruptions to Hosted Grafana services due to a widespread Cloudflare outage impacting connectivity across multiple regions. Our team is actively monitoring the situation...
Loki prod-ap-northeast-0-loki-prod-030 writes degradation
minor · Dec 1, 08:00 AM → Dec 1, 08:00 AM · resolved
Dec 1, 09:30 AM
resolved — The Loki prod-ap-northeast-0-loki-prod-030 cell had write degradation between 8:11 and 8:58 AM UTC. The engineering team mitigated the situation and the cell is stable now.
November 2025
Alerts failing with Prometheus
minor · Nov 27, 04:49 PM → Nov 27, 06:27 PM · resolved
Nov 27, 06:27 PM
resolved — This incident has been resolved.
Nov 27, 06:02 PM
monitoring — A fix has been implemented and we are monitoring the results.
Nov 27, 04:49 PM
investigating — We are currently investigating an issue around degraded services in prod-us-central-0. The expected behavioral impact is that the queries may take longer than usual to respond.
Synthetic Monitoring is down in prod-us-central-7
critical · Nov 24, 11:19 AM → Nov 24, 11:49 AM · resolved
Nov 24, 11:49 AM
resolved — This incident has been resolved.
Nov 24, 11:19 AM
investigating — Users cannot interact with the SM API for any DB-related action, such as:
CRUD checks
CRUD probes
Longer Than Expected Load Times on Grafana Cloud
major · Nov 21, 03:47 PM → Nov 22, 05:06 PM · resolved
Nov 22, 05:06 PM
resolved — We continue to observe a continued period of recovery. At this time, we are considering this issue resolved. No further updates.
Nov 21, 05:04 PM
monitoring — A fix has been implemented, and we are seeing latency down across clusters. We are continuing to monitor progress.
Nov 21, 03:47 PM
investigating — We are currently investigating reports of long load times on Grafana Cloud. We will update as more information becomes available.
Some Loki Writes in Prod-Gb-South-0 Failed
major · Nov 21, 02:32 PM → Nov 21, 03:38 PM · resolved
Nov 21, 03:38 PM
resolved — This incident has been resolved.
Nov 21, 02:32 PM
monitoring — From approximately 14:10-14:25 UTC, writes to Loki failed for a subset of customers in the gb-south-0 region. Most of these errors have already recovered, and our team continues to monitor the recover...
Slow user queries exceed threshold
minor · Nov 21, 12:09 PM → Nov 21, 02:52 PM · resolved
Nov 21, 02:52 PM
resolved — This incident has been resolved.
Nov 21, 01:08 PM
monitoring — A fix has been implemented and we are monitoring the results.
Nov 21, 12:10 PM
investigating — We are experiencing some intermittent query failures.
+1 more update
Elevated Read & Write Latency for Some Cells in Prod-Us-East-0
major · Nov 20, 05:12 PM → Nov 20, 06:47 PM · resolved
Nov 20, 06:47 PM
resolved — This incident has been resolved.
Nov 20, 05:47 PM
monitoring — Things have recovered, and we are monitoring to ensure stability.
Nov 20, 05:23 PM
identified — The previous post mentioned that this was occurring in some cells in prod-us-central-0. This is incorrect and is occurring in some cells in prod-us-east-0.
+1 more update
Intermittent issues when starting k6 cloud test runs
none · Nov 19, 09:00 PM → Nov 19, 09:00 PM · resolved
Nov 20, 04:35 PM
resolved — We are experiencing issues starting cloud test runs. This is primarily affecting browser test runs and test runs using static IPs.
Missing Billing Metrics for Loki
none · Nov 19, 06:20 AM → Nov 19, 06:20 AM · resolved
Nov 19, 06:20 AM
resolved — An incident has impacted Loki billing metrics from 05:30 to 06:10 UTC across all clusters. This is currently resolved, however, users may notice some billing metrics missing from the billing dashboard...
Hosted Grafana is currently being impacted as a result of the Cloudflare outage
none · Nov 18, 11:59 AM → Nov 18, 05:34 PM · resolved
Nov 18, 05:34 PM
resolved — This incident has been resolved.
Nov 18, 11:59 AM
investigating — We are currently experiencing disruption to Hosted Grafana services due to a widespread Cloudflare outage impacting connectivity across multiple regions. Our team is actively monitoring the situation ...
Cortex - read/write path disruption
major · Nov 17, 02:06 PM → Nov 18, 07:26 AM · resolved
Nov 18, 07:26 AM
resolved — This incident has been resolved.
Nov 17, 10:49 PM
identified — We’re continuing to work on this issue, and are actively investigating the remaining details. Our team is making progress, and we’ll share another update as soon as we have more information to provide...
Nov 17, 02:49 PM
identified — We are continuing to work on a fix for this issue.
+1 more update
Elevated Mimir Read/Write Errors
major · Nov 17, 10:22 PM → Nov 18, 12:10 AM · resolved
Nov 18, 12:10 AM
resolved — We continue to observe a continued period of recovery. As of 00:08 UTC, we are considering this issue resolved. No further updates.
Nov 17, 11:27 PM
monitoring — A fix has been implemented and we are monitoring the results.
Nov 17, 10:55 PM
investigating — Our teams have been alerted that Synthetic Monitoring will also be affected by this outage.
Users may see gaps in their Synthetic Monitoring metrics as well as missed alerts as a result of this.
W...
+4 more updates
Metrics Write Outage in Multiple Cells
major · Nov 17, 08:02 PM → Nov 17, 11:08 PM · resolved
Nov 17, 11:08 PM
resolved — As of 23:07 UTC we are considering this incident as resolved.
Mitigation efforts have restored normal write performance, and error rates have returned to expected levels.
We have confirmed stabili...
Nov 17, 09:22 PM
monitoring — We are seeing improvement on the metrics side, with write performance recovering.
We continue to investigate the remaining impact to Synthetic Monitoring and are working to determine the underlying ...
Nov 17, 08:50 PM
identified — Our teams have been alerted that Synthetic Monitoring will also be affected by this outage.
Users may see gaps in their Synthetic Monitoring metrics as well as missed alerts as a result of this.
W...
+3 more updates
Hyderabad Probe Issues
minor · Nov 17, 07:01 PM → Nov 17, 07:01 PM · resolved
Nov 17, 07:01 PM
resolved — We experienced degraded service with the Hyderabad probe today around 13:20 UTC, which was resolved as of 17:30 UTC.
PDC-Prod-eu-west-2 cluster degraded performance
none · Nov 17, 11:56 AM → Nov 17, 12:23 PM · resolved
Nov 17, 12:23 PM
resolved — This incident has been resolved.
Nov 17, 12:22 PM
monitoring — Engineering has released a fix and as of 12:15 UTC, customers should no longer experience performance degradation on the PDC service. We will continue to monitor for recurrence and provide updates acc...
Nov 17, 11:56 AM
investigating — We are currently facing performance degradation on the PDC service hosted on the prod-eu-west-2 cluster. Our engineering team is working on fixing the issue; we apologize for any inconvenience.
Loki Prod 012 read-path-unstable
minor · Nov 17, 04:12 AM → Nov 17, 04:55 AM · resolved
Nov 17, 04:55 AM
resolved — Resolved since 03:02 UTC.
Nov 17, 04:13 AM
monitoring — We began seeing instability in Alerting and Recording rules for this cell at 2:30 AM UTC. They started recovering at around 3 AM UTC, but we're still watching.
Nov 17, 04:12 AM
monitoring — A fix has been implemented and we are monitoring the results.
Degraded Browser Check Performance
minor · Nov 12, 04:50 PM → Nov 12, 04:50 PM · resolved
Nov 12, 04:50 PM
resolved — Spanning from November 10th, 18:00 UTC to November 11th, 22:00 UTC, Synthetic Monitoring experienced degraded browser check performance due to a faulty release that has been rolled back.
This impacte...