This post is a contribution from Paul Chan, an engineer with the SharePoint Developer Support team.
This blog article lists popular tools that can be used when troubleshooting performance issues. Note that the tools are listed in no particular order.
NOTE: If you feel you need help or are pressed for time, feel free to contact Microsoft and create a support case to work with us, rather than letting the issue drag on until the last minute.
Task Manager: This is the first tool that can be used to identify high memory and high CPU scenarios, e.g. to check whether the w3wp.exe process is taking a lot of memory or CPU cycles. This tool is always available in the OS. However, it only gives a high-level view with no specific details.
Performance Monitor: This is another tool for identifying which process is using a lot of memory or CPU. It also provides a logging option so that you can track usage over a period of time. However, it still won’t give you a clear indication of where the issue comes from.
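As a rough illustration of what a logged period can tell you, the sketch below scans a counter log that has been exported to CSV and reports when a counter peaked. It assumes the first column holds the timestamp and each remaining column holds one counter; the machine name, counter paths, and sample values are made up for the example, and `peak_sample` is a hypothetical helper, not part of Performance Monitor.

```python
import csv
import io

# Made-up sample resembling a counter log exported to CSV.
# Assumption: column 0 is the timestamp, every other column is one counter.
SAMPLE = r'''"(PDH-CSV 4.0)","\\WEB01\Process(w3wp)\% Processor Time","\\WEB01\Process(w3wp)\Private Bytes"
"02/14/2013 10:21:00.000","12.5","410000000"
"02/14/2013 10:21:15.000"," ","415000000"
"02/14/2013 10:21:30.000","97.0","900000000"
'''


def peak_sample(csv_text, counter_substring):
    """Return (timestamp, value) of the highest sample for the first counter
    column whose header contains counter_substring, or None if no sample parses."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    column = None
    for index, name in enumerate(header):
        if index > 0 and counter_substring in name:
            column = index
            break
    if column is None:
        raise ValueError("no counter column matches " + repr(counter_substring))

    best = None
    for row in reader:
        try:
            value = float(row[column])
        except (ValueError, IndexError):
            continue  # skip blank or missing samples
        if best is None or value > best[1]:
            best = (row[0], value)
    return best


if __name__ == "__main__":
    print(peak_sample(SAMPLE, "% Processor Time"))
    print(peak_sample(SAMPLE, "Private Bytes"))
```

Knowing the time of the peak narrows down when to look, but as noted above it still does not say which code caused it.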
Custom Logging: This method is mentioned for both the high CPU and the slow issues (in Part 2 and Part 3). To be honest, it is more useful for slow issues than for high CPU issues. This method could be the one that gives you the clearest indication of where the slowness comes from.
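The idea behind custom logging can be sketched in a few lines: wrap each suspect section in a timer and record how long it took, so the slowest section stands out. The example below is a language-neutral illustration written in Python, not SharePoint code; `timed_scope`, `slowest`, and the section names are hypothetical, and the `time.sleep` calls stand in for real work.

```python
import time
from contextlib import contextmanager

# Hypothetical in-memory sink; real code would write to a log file instead.
timings = []


@contextmanager
def timed_scope(name, sink=timings):
    """Record how long the wrapped block takes, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        sink.append((name, elapsed_ms))


def slowest(sink=timings):
    """Return the (name, elapsed_ms) entry with the largest duration."""
    return max(sink, key=lambda entry: entry[1])


if __name__ == "__main__":
    with timed_scope("load list items"):
        time.sleep(0.05)  # stand-in for a slow data call
    with timed_scope("render"):
        time.sleep(0.01)  # stand-in for cheaper work
    for name, ms in timings:
        print(f"{name}: {ms:.1f} ms")
    print("slowest:", slowest()[0])
```

Elapsed time is what gets measured here, which is one reason this approach suits slow issues better than high CPU issues: a section can be slow while waiting on something else and using very little CPU.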
Code Review: Although code review is mentioned for all 3 types of performance issues, it is not really troubleshooting but trouble-scanning; i.e. it is not shooting at any specific target but more like shooting in the dark, especially when it comes to high memory usage. It is, though, the main tool that fits all 3 types of performance issues.
NP .NET Profiler: I mentioned it in Part 1 and Part 3. This is a nice tool but it does have its limitations. I think it can also be used in small-scale code tests. One of the good things about this tool (when it applies) is that no code needs to be modified, because it attaches itself to the process to do the profiling. The main drawback is that it needs to terminate the process when attaching and detaching itself.
Microsoft SharePoint Development Support: What do we do in our support team? Do we have any secret scrolls or magic spells? Maybe someone on the team does, but not me. Basically, we troubleshoot performance issues using tools similar to the ones I mentioned, as the conditions fit. Of course, we might also use some tools that are only available internally at Microsoft. Personally, I also rely on memory hang dumps to troubleshoot performance issues. However, that topic is too big to cover in a small blog article like this. Memory dump analysis is not internal to Microsoft; you can look it up on the internet and you will find people discussing it.
DebugDiag: This is a tool associated with the memory hang dumps mentioned above. It provides an Analysis feature that performs the memory dump analysis via a script. So it sits somewhere between a memory hang dump analysis done by a human being and the tools that only give a high-level overview of the issue. However, everyone has an uncomfortable feeling about data being analyzed automatically by a script. On the other hand, understanding its analysis results can be challenging as well. Still, it is better than nothing, and don’t get me wrong: it is in fact a good tool. I personally use this tool as a reference in addition to my own memory hang dump analysis. Explaining how to use this tool would take a long time, not to mention how to read its results, so I will not include that information in this blog article. Look up the term DebugDiag on the internet and you will find a lot of references about it.
ULS Log: This tool is overlooked from time to time, because people usually open the logs only when an error pops up along with a correlation Id. Performance issues usually don’t fit that pattern because they are not really errors or exceptions from the code, but a symptom caused by the environment (exposed by the code). However, the ULS logs sometimes provide useful information, such as query time taken, that can help with slow issues. If you are using SPMonitoredScope to do the logging in the code, the log also provides the timestamp and what activities are happening around that time. Looking into the ULS log without a correlation Id is painful and needs patience, but sometimes this is what needs to be done.
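When there is no correlation Id to search for, one way to make the reading less painful is to narrow the log down to a time window around the moment the slowness was observed. The sketch below assumes the log lines are tab-separated with the timestamp in the first column, followed by Process, TID, Area, Category, EventID, Level, Message, and Correlation; the column order, the timestamp format, and every sample line are assumptions made for illustration, so check them against your own log files. `parse_line` and `entries_around` are hypothetical helpers.

```python
from datetime import datetime, timedelta

TIMESTAMP_FORMAT = "%m/%d/%Y %H:%M:%S.%f"  # assumed timestamp layout

# Made-up sample lines; a real log will differ in content and possibly layout.
SAMPLE_LINES = [
    "Timestamp\tProcess\tTID\tArea\tCategory\tEventID\tLevel\tMessage\tCorrelation",
    "02/14/2013 10:21:01.10\tw3wp.exe\t0x0B10\tArea A\tMonitoring\txxxx\tMedium\tEntering scope\t",
    "02/14/2013 10:21:05.45\tw3wp.exe\t0x0B10\tArea A\tDatabase\txxxx\tHigh\tSlow query detected\t",
    "02/14/2013 10:21:07.45*\tw3wp.exe\t0x0B10\tArea A\tDatabase\txxxx\tHigh\t...continued text\t",
    "02/14/2013 10:21:30.00\tw3wp.exe\t0x0B10\tArea A\tMonitoring\txxxx\tMedium\tLeaving scope\t",
]


def parse_line(line):
    """Split one tab-separated line into a dict, or return None if it does not parse."""
    fields = [field.strip() for field in line.rstrip("\n").split("\t")]
    if len(fields) < 8:
        return None
    try:
        # A trailing '*' is assumed to mark a continued entry; drop it before parsing.
        stamp = datetime.strptime(fields[0].rstrip("*"), TIMESTAMP_FORMAT)
    except ValueError:
        return None  # header row or malformed timestamp
    return {
        "time": stamp,
        "process": fields[1],
        "category": fields[4],
        "level": fields[6],
        "message": fields[7],
    }


def entries_around(lines, center, seconds=5):
    """Return parsed entries whose timestamp is within +/- seconds of center."""
    window = timedelta(seconds=seconds)
    parsed = (parse_line(line) for line in lines)
    return [e for e in parsed if e is not None and abs(e["time"] - center) <= window]


if __name__ == "__main__":
    observed = datetime(2013, 2, 14, 10, 21, 7)  # when the slowness was noticed
    for entry in entries_around(SAMPLE_LINES, observed):
        print(entry["time"], entry["category"], entry["message"])
```

Filtering by time only reduces the amount of text to read; deciding which of the remaining entries matter still takes the patience mentioned above.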
NOTE: For any performance issues that you don’t feel comfortable handling, please feel free to create a support case and work with Microsoft support. We do not want the resolution of our customers’ issues to be delayed.