For many years, ELMAH was the go-to logging utility for ASP.NET. It caught exceptions that came up through the IIS response pipeline and logged them along with contextual information. It also put a subpage on your site that you could visit to view logged exceptions. It was a great tool for catching, logging, and viewing unhandled exceptions for monolithic ASP.NET applications. But now that we’ve moved to distributed application architectures, we need something more.
As we look for a logging tool that can serve our diverse needs as software developers tasked with maintaining systems, we need to look no further than Retrace. Retrace is more than just a table of errors. It provides metrics, monitoring, tracing, and log aggregation. You can feed it ELMAH logs, log4net, and NLog sources, to name a few.
You don’t have to bury ELMAH completely if you don’t want to. But if you’re looking to move on to something like ASP.NET Core, or if you’re a polyglot, Retrace gets you a whole lot further!
ELMAH gives you a list of errors that you can scan visually. You’ve got to take it all in before you can really see what you need. Retrace, on the other hand, scans your errors and gives you graphs and metrics. With those visuals, you can home in on what’s important.
Since a picture is worth a thousand words, let’s look at what you get from Retrace:
This is the errors dashboard. It presents several breakdowns of the errors in the selected app. You can see the graph of errors for each interval in the last four hours. You can see how many errors were logged per minute and per hour. And you can see the count of each type of error where they’re listed at the bottom. They’re grouped together nicely so you aren’t scrolling through a long list.
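To make the idea of grouping concrete, here is a minimal Python sketch of the kind of aggregation such a dashboard performs: counting errors per type and per minute. The records and function names are hypothetical and only illustrate the concept; this is not Retrace's implementation or data.

```python
from collections import Counter
from datetime import datetime

# Hypothetical error records as (timestamp, exception type) pairs.
errors = [
    (datetime(2018, 11, 30, 10, 0, 5), "NullReferenceException"),
    (datetime(2018, 11, 30, 10, 0, 40), "SqlException"),
    (datetime(2018, 11, 30, 10, 1, 10), "NullReferenceException"),
    (datetime(2018, 11, 30, 10, 1, 15), "NullReferenceException"),
]


def count_by_type(records):
    """Group errors by exception type so repeats collapse into one row."""
    return Counter(exc_type for _, exc_type in records)


def count_per_minute(records):
    """Bucket errors by the minute in which they were logged."""
    return Counter(ts.replace(second=0, microsecond=0) for ts, _ in records)


if __name__ == "__main__":
    for exc_type, count in count_by_type(errors).most_common():
        print(f"{exc_type}: {count}")
    for minute, count in sorted(count_per_minute(errors).items()):
        print(f"{minute:%H:%M}: {count}")
```

A flat list like the one ELMAH shows would have four rows here; the grouped view has two, with the noisiest error type on top.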
Since you can send data to Retrace using so many logging clients, you have greater flexibility in the tools you use in your .NET apps. ELMAH isn’t ready for .NET Core, but other logging utilities are. For example, with log4net you can configure it as Matt Watson suggests, or you can use this NuGet package to handle the configuration for .NET Core. NLog is ready to go. You can follow the instructions on their GitHub wiki. And, of course, you can always send log entries to the REST endpoint.
Besides having the flexibility to use various logging utilities in .NET Standard and .NET Core, you can log to Retrace from various platforms. You can easily log from Node.js, Java, PHP, and Ruby. Theoretically, you can log from anything that has an HTTP client. ELMAH was only doing ASP.NET logging, but with Retrace, you get a whole lot more.
ELMAH gives you some information about the server along with the stack trace and request information. What you don’t get is the surrounding context or the details about what happened “inside” the request. In Retrace, you get much more contextual information.
From the errors dashboard, you can look across applications to see what else was happening in your system around a specific application exception. Perhaps there was more than one application logging an error at the same time. This could indicate a network issue, a shared database issue, or something happening on a common dependency such as a shared API. From the dashboard, you would clearly see evidence of this type of issue.
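The cross-application check described above can be sketched in a few lines of Python. The application names, timestamps, and the one-minute window are all assumptions made for illustration; the point is only to show how errors that cluster in time across apps hint at a shared cause.

```python
from datetime import datetime, timedelta

# Hypothetical error log entries as (application name, timestamp) pairs.
errors = [
    ("web", datetime(2018, 11, 30, 10, 0, 0)),
    ("api", datetime(2018, 11, 30, 10, 0, 20)),
    ("billing", datetime(2018, 11, 30, 10, 0, 45)),
    ("reports", datetime(2018, 11, 30, 10, 30, 0)),
]


def apps_with_errors_near(records, app, when, window=timedelta(minutes=1)):
    """Return the other apps that logged an error within `window` of `when`."""
    return sorted({
        other
        for other, ts in records
        if other != app and abs(ts - when) <= window
    })


if __name__ == "__main__":
    nearby = apps_with_errors_near(errors, "web", datetime(2018, 11, 30, 10, 0, 0))
    # Several apps failing together suggests a shared dependency is at fault.
    print(nearby)
```

If the result lists more than one or two applications, a network, database, or shared API problem is a more likely suspect than a bug in any single app.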
Besides seeing multiple applications in one screen to provide context, you can also get more context around the specific call that had an issue. One way Retrace gives you more context is by tracing automatically.
With Retrace, you can enable lightweight tracing to get much better detail. ELMAH doesn’t include an integrated APM solution along with the error logs, but Retrace does. Now, when you have an error and you’ve got tracing, you can see what else was happening when the error occurred.
The tracing is automatic for the most common resources used in applications today. Some examples of tracing are calls to SQL Server, Redis caches, and DynamoDB. You benefit more from error logs when you combine them with tracing.
It’s best to see the difference for yourself, so here’s an example. The following image was taken from an error document, which contains all the information sent from the logging client. Here’s the part about the web request itself:
We know from the web request that this error came from the index route. But we don’t get the full picture.
Here’s the same error in the trace log:
As you can see, the server responded to the user as if everything were OK. However, there was an error during the request. With this information, you might even find problems you didn’t know you had! This is where performance monitoring really comes into play.
It’s one thing to see errors and stack traces, but those only give us one small piece of the picture that the users see. Errors only tell us when something is wrong and an exception was thrown. But we miss a lot when we’re just looking at those kinds of issues. There are other questions to consider, such as what happens when a page stalls. What about when a specific component takes forever to load?
From the user experience perspective, these are issues too. They may not result in a yellow screen of death or an error response, but they’re still important failures from the user’s point of view.
When you’re measuring application performance, you’re usually looking at things like HTTP response time. That’s the first line of defense. And yes, you can get this information directly from the server logs. But what happens when you do find a slow request? What steps do you take to dig further into understanding the underlying issues?
There are various questions you might ask at this point: Was it a specific SQL call? Was it page generation? Was it a concurrency issue? How about server resource utilization at the time? You could go out to various resources to answer some of these questions, but that would take a long time. You won’t answer all these questions unless you’re prepared. We’ve already seen that tracing gives us more information around the request. Now let’s see how it helps discover issues with resource timing.
Where response time breaks down
Here’s a sample of a web request that took more than 200 seconds:
You wouldn’t expect a user to wait around that long before they refresh the page. With the trace view, you can see exactly how much each operation contributed to the poor response time. In this case, you can see that the MySQL query was the main culprit. And from this point, you can probably attribute the response time to a deadlock.
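As a rough illustration of what the trace view computes, the following Python sketch ranks the operations in a request by their share of the total response time. The span names and durations are hypothetical, and the sketch assumes the operations ran one after another without overlapping.

```python
# Hypothetical trace spans as (operation, duration in milliseconds) pairs.
# Assumes the operations ran sequentially, so durations add up to the total.
spans = [
    ("Page rendering", 1600),
    ("MySQL query", 198000),
    ("Redis cache lookup", 400),
]


def breakdown(trace):
    """Return (operation, ms, percent of total) rows, slowest first."""
    total = sum(ms for _, ms in trace)
    rows = [(name, ms, round(100 * ms / total, 1)) for name, ms in trace]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, ms, percent in breakdown(spans):
        print(f"{name:<20} {ms / 1000:>8.1f} s  {percent:>5.1f}%")
```

Sorting by duration puts the main culprit on the first row, which is the same question the trace view answers at a glance.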
The user’s view
One of the best features in Retrace, especially for product managers, is the comprehensive dashboard. Retrace isn’t just a tool for techs to see errors. Retrace is about getting the most out of your application. It gives you real insights about your application that you can use for planning.
The Retrace dashboard view displays all this information in one place. It’s not just a list of errors. It gives you instant visuals on server health, application performance, and, of course, errors. It even gives you a letter grade on your application performance along with tips to improve your score!
This is the dashboard view with a graph of user satisfaction based on performance:
In this view, you can see how your web request performance is doing from the user’s perspective. There’s a breakdown of performance for each URL at the bottom of the dashboard so you can home in on specifics.
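One common way to turn raw response times into a user satisfaction score is the Apdex formula, sketched below in Python. Whether the graph above is computed exactly this way is an assumption; the threshold and sample response times are hypothetical.

```python
def apdex(response_times, threshold):
    """Compute an Apdex score between 0 and 1.

    Requests at or under `threshold` seconds count as satisfied, requests up
    to four times the threshold count as tolerating (half weight), and
    anything slower counts as frustrated (zero weight).
    """
    if not response_times:
        raise ValueError("at least one response time is required")
    satisfied = sum(1 for t in response_times if t <= threshold)
    tolerating = sum(1 for t in response_times if threshold < t <= 4 * threshold)
    return (satisfied + tolerating / 2) / len(response_times)


if __name__ == "__main__":
    # Hypothetical response times in seconds, with a 0.5 s target.
    print(apdex([0.2, 0.4, 1.0, 3.0], threshold=0.5))
```

A score like this rewards fast responses rather than merely error-free ones, which is why a page can return HTTP 200 every time and still drag the number down.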
Here’s the same data viewed from the technical perspective:
By looking at the data in different ways, you can see if and when there are performance issues. In this case, there may have been something released on November 14 that caused some problems. But then on the 17th, the problems got much worse. The number of requests (the blue line) slowed down, probably due to the delayed responses.
A eulogy for ELMAH
I’m not the best person to write a eulogy for ELMAH. It’s not that I don’t want to honor all it’s done for us over the years. It’s saved my bacon a number of times! It was a simple, effective solution to logging errors in single ASP.NET web apps. But the time has come to either extend ELMAH by sending the logs to Retrace or lay it to rest.
If you’re still on the fence about Retrace, you can give it a try with zero commitment by signing up for a free 14-day trial. You can explore the sandboxed demo environment that you’ve seen in this post or start uploading your ELMAH logs and see what it can do for your own environment. I recommend planning ahead just a bit before starting your trial so you can set everything up and get the most out of that two-week period. Once you do, you’ll be ready to let ELMAH go for good.