Splunk Observability Cloud US2 Realm
All Systems Operational
Splunk Observability Cloud Web Interface Operational
90 days ago
99.95 % uptime
Today
Datapoint Ingest Operational
90 days ago
100.0 % uptime
Today
Splunk Observability Cloud API Operational
90 days ago
100.0 % uptime
Today
Alerting Operational
90 days ago
100.0 % uptime
Today
Splunk APM Operational
90 days ago
99.94 % uptime
Today
Splunk APM Ingest Operational
90 days ago
99.89 % uptime
Today
Splunk APM Interface Operational
90 days ago
100.0 % uptime
Today
Splunk APM Monitoring MetricSets Operational
90 days ago
99.95 % uptime
Today
Splunk APM Troubleshooting MetricSets Operational
90 days ago
99.92 % uptime
Today
Splunk APM Trace Data Operational
90 days ago
99.92 % uptime
Today
Splunk APM Tag Spotlight Operational
90 days ago
99.92 % uptime
Today
Splunk APM Business Workflows Operational
90 days ago
99.95 % uptime
Today
Splunk APM API Operational
90 days ago
100.0 % uptime
Today
3rd Party Services Operational
Google Cloud Platform Google Kubernetes Engine Operational
Google Cloud Platform Google Compute Engine Operational
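To put the 90-day uptime percentages above in perspective, the following is a minimal sketch that converts an uptime percentage into an approximate downtime in minutes. It assumes every non-uptime minute counts as full downtime over a 90-day window; the status page may weight partial outages differently, so treat the results as rough estimates. The function name `downtime_minutes` and the dictionary below are illustrative and not part of any Splunk API.

```python
WINDOW_DAYS = 90          # the window shown on the status page
MINUTES_PER_DAY = 24 * 60


def downtime_minutes(uptime_percent: float, window_days: int = WINDOW_DAYS) -> float:
    """Approximate downtime (in minutes) implied by an uptime percentage.

    Assumption: all time not counted as uptime is treated as full downtime.
    """
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = window_days * MINUTES_PER_DAY
    return total_minutes * (100.0 - uptime_percent) / 100.0


# A few of the figures listed above (component name -> uptime percentage).
uptime_by_component = {
    "Splunk Observability Cloud Web Interface": 99.95,
    "Splunk APM": 99.94,
    "Splunk APM Ingest": 99.89,
    "Datapoint Ingest": 100.0,
}

for name, pct in uptime_by_component.items():
    print(f"{name}: {pct}% uptime is roughly {downtime_minutes(pct):.0f} min of downtime")
```

Under these assumptions, 99.95 % uptime over 90 days corresponds to a little over an hour of downtime, and 99.89 % to a little under two and a half hours.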
Past Incidents
Jan 22, 2022

No incidents reported today.

Jan 21, 2022

No incidents reported.

Jan 20, 2022

No incidents reported.

Jan 19, 2022

No incidents reported.

Jan 18, 2022

No incidents reported.

Jan 17, 2022

No incidents reported.

Jan 16, 2022

No incidents reported.

Jan 15, 2022

No incidents reported.

Jan 14, 2022

No incidents reported.

Jan 13, 2022

No incidents reported.

Jan 12, 2022

No incidents reported.

Jan 11, 2022

No incidents reported.

Jan 10, 2022

No incidents reported.

Jan 9, 2022

No incidents reported.

Jan 8, 2022
Resolved - As of 7pm PST, the last systems have been brought back to nominal operations and this incident has been resolved. Splunk APM data will be unavailable or appear incomplete for the time range between 4pm and 7pm PST.
Jan 8, 19:22 PST
Monitoring - Our cloud provider has confirmed that the recovery we've been seeing is real, and their networking issues have been resolved. We are continuing to work on bringing all systems back to a stable state. At this point, most of the Splunk Observability Cloud offerings are operating nominally.

Splunk APM's Monitoring MetricSets have recovered to real-time processing, and we're actively working on bringing the rest of the APM trace data processing pipelines back online for Troubleshooting MetricSets and raw trace data and trace search. As a part of this effort, and to accelerate the recovery of those pipelines and bring the most important real-time visibility back to our customers, we will be skipping over the data ingested during the incident; a precise time range of the lost data will be provided when the incident is resolved.
Jan 8, 19:03 PST
Update - We are still waiting on further updates from our cloud provider in this region, but we are starting to see early signs of stability and recovery. The Splunk Observability Cloud Web Interface is operational and responding normally, datapoint ingest is operational, and metric timeseries based charts, dashboards and detectors are functional.

Splunk APM Monitoring MetricSets have started their recovery and should be back to real-time within the next 30 minutes. We are working on bringing the Troubleshooting MetricSets and raw trace processing pipelines back online.
Jan 8, 18:45 PST
Identified - We have identified that the outage is related to network issues at the cloud provider (Status - https://status.cloud.google.com/incidents/NMcnk6aE8xMHHwRGmyry). We are working with the cloud provider to resolve it and ensure the availability of our services.

The current impact on Splunk Observability Cloud includes:
- The web interface and login have degraded performance, and logging in may require multiple attempts.
- Charts may fail to load, or may load slowly or intermittently.
- Detectors are not alerting in real time.
- Processing of Splunk APM trace spans is delayed, leading to Troubleshooting MetricSets, Monitoring MetricSets, and raw traces not being available or not representing the most current data.
- Small amounts of trace data were lost at the onset of the incident and we may drop a small amount of data to bring the system back to real time once the cloud provider network issue is resolved. Data ingest for both datapoints and traces is otherwise not affected at this time.
Jan 8, 17:00 PST
Update - We have identified that the outage is related to network issues at the cloud provider (Status - https://status.cloud.google.com/incidents/NMcnk6aE8xMHHwRGmyry). We are working with the cloud provider to resolve it and ensure the availability of our services.

The current impact on Splunk Observability Cloud includes:
- The web interface and login have degraded performance, and logging in may require multiple attempts.
- Charts may fail to load, or may load slowly or intermittently.
- Detectors are not alerting in real time.
- Processing of Splunk APM trace spans is delayed, leading to Troubleshooting MetricSets, Monitoring MetricSets, and raw traces not being available or not representing the most current data.
- Small amounts of trace data were lost at the onset of the incident and we may drop a small amount of data to bring the system back to real time once the cloud provider network issue is resolved. Data ingest for both datapoints and traces is otherwise not affected at this time.
Jan 8, 17:00 PST
Investigating - We are investigating an issue impacting several systems of the Splunk Observability Cloud. The web interface, including dashboards and charts, may be slow to load. Processing of Splunk APM trace spans is delayed, leading to Troubleshooting MetricSets, Monitoring MetricSets and raw traces not being available or not representing the most current data. Small amounts of trace data were lost at the onset of the incident, but data ingest for both datapoints and traces is otherwise not affected at this time.
Jan 8, 15:36 PST
Resolved - This incident has been resolved. At the beginning of the incident, 0.3% of Splunk APM trace spans were dropped between 7:26am and 7:28am PST.
Jan 8, 08:26 PST
Update - The processing of Monitoring MetricSets has recovered and is back to real-time processing. We are continuing to work on stabilizing the rest of the services.
Jan 8, 08:01 PST
Update - We are continuing to work on a fix for this issue.
Jan 8, 07:52 PST
Identified - A degradation in the performance of the Splunk APM trace processing pipeline is causing Troubleshooting MetricSets to be delayed by more than fifteen minutes. As a result, the APM Troubleshooting experience, service maps and Tag Spotlight do not have access to the most recent data from approximately 5% of the traffic.

The processing of metrics for Business Workflows, which also depends on this pipeline, is equally delayed. Trace data ingest is not impacted at this time; service-level and endpoint-level Monitoring MetricSets and the detectors built from them are also not impacted.
Jan 8, 07:51 PST