Microsoft Defender portal outage disrupts threat hunting alerts

Microsoft is working to mitigate an ongoing incident that has been blocking access to some Defender XDR portal capabilities, including threat hunting alerts.

According to an admin center service alert (DZ1191468) seen by GeekFeed, this outage may affect customers attempting to access or use features in the Defender portal.

According to Microsoft, a “spike in traffic caused high Central Processing Unit (CPU) utilization on components that facilitate Microsoft Defender portal functionalities.”

When it acknowledged the outage this morning, Microsoft also tagged it as an incident, a designation commonly used for critical service issues that typically involve noticeable user impact.

Microsoft has since applied mitigation measures to address the impact and increased processing throughput, with telemetry showing that availability has recovered for some impacted customers, according to an 8 AM UTC update.

Microsoft is now analyzing HTTP Archive (HAR) traces provided by impacted customers. It said that, besides blocked access, the impacted portal functionality currently includes, but is not limited to, missing advanced threat hunting alerts and devices not appearing.

“We’ve received confirmation from additional organizations that the issue is resolved for them, and monitoring telemetry continues to show CPU utilization remains within acceptable thresholds,” it added roughly two hours later.

“We’re working with a small number of organizations who reported that the issue still persists and coordinating with them to collect additional client-side diagnostics and HTTP Archive format (HAR) traces to assist our investigation.”
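For readers unfamiliar with HAR traces: a HAR file is a JSON export of a browser's network activity, with one entry per request. The sketch below is illustrative only and is not Microsoft's diagnostic process. It assumes the standard HAR 1.2 layout (`log.entries`, each with `request.url`, `response.status`, and `time` in milliseconds). The `summarize_har` helper, the 5-second threshold, and the sample URLs are all hypothetical.

```python
import json

def summarize_har(har: dict, slow_ms: float = 5000.0) -> list[dict]:
    """Return requests that failed (status 0 or >= 400) or took longer than slow_ms.

    slow_ms is an arbitrary illustrative threshold, not a Microsoft value.
    """
    flagged = []
    for entry in har.get("log", {}).get("entries", []):
        status = entry.get("response", {}).get("status", 0)
        elapsed = entry.get("time", 0.0)  # total elapsed time in milliseconds
        if status == 0 or status >= 400 or elapsed > slow_ms:
            flagged.append({
                "url": entry.get("request", {}).get("url", ""),
                "status": status,
                "time_ms": elapsed,
            })
    return flagged

# Hypothetical sample data; a real trace would be loaded with
# json.load(open("trace.har", encoding="utf-8")).
sample_har = {
    "log": {
        "entries": [
            {"request": {"url": "https://portal.example.invalid/home"},
             "response": {"status": 200}, "time": 180.0},
            {"request": {"url": "https://portal.example.invalid/api/alerts"},
             "response": {"status": 503}, "time": 950.0},
            {"request": {"url": "https://portal.example.invalid/api/devices"},
             "response": {"status": 200}, "time": 12000.0},
        ]
    }
}

flagged = summarize_har(sample_har)
for item in flagged:
    print(json.dumps(item))
```

A summary like this shows at a glance which portal requests returned errors or stalled. Note that HAR files can contain cookies and tokens, so they are typically sanitized before being shared.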

Update December 03, 04:04 EST: Microsoft says the incident has been mitigated for all affected customers.

“We’ve received confirmation from the additional organizations that the issue is resolved. Monitoring telemetry continued to show the service has remained stable for an extended period of time,” it said.

“We’ll provide a preliminary Post-Incident Report within two business days and a final Post-Incident Report within five business days.”
