Latest Update (May 23, 2025)
The Azure Analysis Services (AAS) issue has been fully resolved. Microsoft’s rollback and patch have successfully restored stable performance across all affected environments.
Following testing and validation, EBM has scaled server capacity back to normal levels. All systems are running as expected.
We’ll continue working closely with Microsoft to help prevent similar issues in the future and ensure faster resolution should they arise again. Thanks for your patience throughout this incident.
Update (as of May 22, 2025)
Microsoft has confirmed that the Azure Analysis Services (AAS) issue has been fully patched and rolled back to a stable state across all environments. This resolves the memory-related job failures and performance degradation observed since May 20.
At EBM, we will scale our AAS servers back to normal capacity tonight, May 22, around 10:00 PM CST, once performance validation is complete.
We'll continue to monitor jobs closely, but based on Microsoft's resolution and current performance, everything appears stable.
This issue was global and impacted AAS infrastructure across all Microsoft clouds, not just EBM-hosted environments. We're staying in close contact with Microsoft to explore options for avoiding similar disruptions in the future.
Update (as of May 21, 2025)
Microsoft has informed us that their product group has implemented a fix for the Azure Analysis Services issue. The rollout is expected to begin this afternoon (May 21), pending final build readiness, and should be completed across all cloud environments by end of day May 22.
At EBM, we will keep our AAS servers scaled up during this rollout to prevent further memory-related job failures. Once the fix is confirmed in production, we’ll begin testing performance in a controlled way before gradually reducing memory thresholds across all environments.
Overview
Microsoft Azure Analysis Services (AAS) is currently experiencing a widespread issue affecting several EBM-hosted environments and Catalyst client instances. This is not isolated to EBM and is impacting AAS customers across multiple regions globally.
What’s Happening
Since the morning of May 20, we've observed excessive memory usage during standard Catalyst processing tasks, including:
- Cube rebuilds
- Nightly integration jobs
- Automation workflows
These memory spikes are causing jobs to fail intermittently, and failed jobs must be restarted manually.
Root Cause (Confirmed by Microsoft)
Microsoft has identified a code bug introduced during a recent AAS service update as the root cause. This bug leads to out-of-memory exceptions during refresh and job execution processes.
Current Microsoft Status
Microsoft is actively rolling back to a previous stable build to resolve the issue. Preparation for the rollback is expected to take approximately 4 more hours, with the rollback beginning immediately after. The next update is expected from Microsoft within 7 hours, or sooner if needed.
Impact Statement from Microsoft:
"You have been identified as a customer using Azure Analysis Services and may experience refresh delays or failures due to out-of-memory exceptions."
What We’re Doing
To reduce the impact, EBM has:
- Temporarily increased AAS memory thresholds
- Actively monitored Catalyst job performance
- Reprocessed failed jobs to maintain system reliability
What You Might Notice
- Job failures during cube rebuilds or integrations
- Delayed or inconsistent reporting data
- Higher-than-normal memory consumption warnings
No Action Required
There is no action needed from your team. We're handling reprocessing and closely monitoring affected environments.
We’ll Keep You Updated
We’ll continue posting updates here as we receive new information from Microsoft or see changes in performance behavior. We appreciate your patience while this issue is resolved.