Debugging Azure Worker Role Startup Issues
I’ve deployed a number of systems to Azure, and one of the most painful experiences for me continues to be debugging role startup failures. I’ve often made configuration mistakes (e.g., wrong Azure storage account credentials) that prevent my worker roles from successfully returning from their “OnStart()” method, and these only surface when I deploy the system to Azure. Unfortunately, the current version of the Azure portal doesn’t offer much direct help here beyond telling you that a role has failed to start. In the screenshot below, I’ve captured my worker role in the “Recycling” state, which means it failed to start the first time and Azure is trying to restart it.
So how do we debug this? Poking around the web a bit yielded a number of recommendations. I could:
- Enable Remote Desktop for the worker role and log in to the box while it is starting up to see if I can debug the issue. This seems kind of tricky for my scenario, as I’d have to catch it before it exits the “OnStart()” method.
- Wrap my code in the “OnStart()” method with a try/catch and log the exception to SQL Azure or Azure Table storage, or write it to a queue. That wouldn’t be too bad, but it does require me to update my code.
- Leverage the built-in Visual Studio IntelliTrace feature to record the execution history of my worker role and view this data from the comfort of my local box.
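For reference, the second option might look roughly like the sketch below. It assumes the project references the Azure SDK's Microsoft.WindowsAzure.ServiceRuntime assembly. The “InitializeStorage()” helper is a hypothetical stand-in for whatever setup your role really does, and “Trace.TraceError” is only a placeholder for whichever sink (SQL Azure, Table storage, a queue) you would actually log to.

```csharp
using System;
using System.Diagnostics;
using Microsoft.WindowsAzure.ServiceRuntime;

public class WorkerRole : RoleEntryPoint
{
    public override bool OnStart()
    {
        try
        {
            // Hypothetical setup step; replace with the role's real initialization.
            InitializeStorage();
            return base.OnStart();
        }
        catch (Exception ex)
        {
            // Placeholder logging: swap in a write to SQL Azure, Table storage
            // or a queue. Trace output is only visible remotely if diagnostics
            // are configured to collect it.
            Trace.TraceError("OnStart failed: {0}", ex);

            // Rethrow so the role still fails to start and Azure recycles it.
            throw;
        }
    }

    // Hypothetical helper, not part of the Azure SDK.
    private void InitializeStorage()
    {
        // e.g. read the storage connection string and create any containers.
    }
}
```

Rethrowing keeps the role's behavior unchanged (it still ends up recycling); the only difference is that the exception is recorded somewhere before that happens.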
I’m lazy, so the third option sounds like a winner to me.
To configure IntelliTrace, right-click your Cloud Project in Solution Explorer and choose “Publish”. Then, on the Advanced Settings tab, check “Enable IntelliTrace” and, if you like, further configure how much data you want to collect. When you’re done, click “Publish” and away you go.
As Visual Studio goes through its publishing process, an activity log shows you the steps being executed. I’ve captured this below, and though you can’t see the local time on my computer, I’ve been at that final step with the worker role being “busy” for a good while.
Time to look at that IntelliTrace data!
Using the Server Explorer window in Visual Studio, I can now view the IntelliTrace logs by expanding the Windows Azure Compute section and locating my worker role.
Opening the logs triggers a download of the data to your local machine, and once the download completes, Visual Studio pops open the IntelliTrace Summary in a new window. Now I can see the list of exceptions that were generated and where they occurred.
The screenshot below shows the dummy exception I intentionally added to cause my worker role to fail to start.
At this point, I can take things one step further by clicking “Start Debugging” to see the exact state of my worker role when the exception was triggered, fix my bug and redeploy.
Hope this helps,