Debugging performance issues in production can be painful and, in some cases, impossible without the right tools. Java profilers have been around for a long time, but the profilers most developers think of are only one type.
Let’s dive into the 3 different kinds of Java profilers:
- Standard JVM profilers that track every detail of the JVM (CPU, threads, memory, garbage collection, etc.).
- Lightweight profilers that instrument your application code and report at a higher level of abstraction, such as individual transactions.
- Application Performance Management (APM) tools used for monitoring applications live in production environments.
Standard JVM Profilers
A standard Java profiler certainly provides the most data, but not necessarily the most useful information. This depends on the type of debugging task. These profilers will track all method calls and memory usage. This allows a developer to dive into the call structure at whatever angle they choose.
- Great for tracking down memory leaks: standard profilers detail all memory usage by the JVM and which classes and objects are responsible. The ability to manually run garbage collection and then review memory consumption can easily shine a spotlight on classes and processes that are holding on to memory in error.
- Good for tracking CPU usage: a Java profiler usually provides a CPU sampling feature that tracks and aggregates CPU time by class and method to help zero in on hot spots.
- Requires a direct connection to the monitored JVM; this ends up limiting usage to development environments in most cases. (Note: some profilers can work off thread and memory dumps in a limited fashion.)
- They slow down your application; a good deal of processing power is required for the high level of detail provided.
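The "run garbage collection, then review memory consumption" workflow can be approximated in plain code with the JDK's `java.lang.management` API. The sketch below is only an illustration, not how any particular profiler works: the `HeapCheck` class, the `RETAINED` list, and the `leak` helper are hypothetical names, and the JVM is free to ignore a GC request.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.ArrayList;
import java.util.List;

class HeapCheck {
    // Hypothetical "leak": a static collection that is never cleared.
    static final List<byte[]> RETAINED = new ArrayList<>();

    /** Requests a GC (only a hint to the JVM) and returns the used heap in bytes. */
    static long usedHeapAfterGc() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        memory.gc(); // effectively the same as System.gc()
        return memory.getHeapMemoryUsage().getUsed();
    }

    /** Keeps the given number of 1 MB arrays reachable so the GC cannot free them. */
    static void leak(int megabytes) {
        for (int i = 0; i < megabytes; i++) {
            RETAINED.add(new byte[1024 * 1024]);
        }
    }

    public static void main(String[] args) {
        long before = usedHeapAfterGc();
        leak(20);
        long after = usedHeapAfterGc();
        System.out.println("Used heap before: " + (before >> 20) + " MB");
        System.out.println("Used heap after:  " + (after >> 20) + " MB");
    }
}
```

A standard profiler goes further by attributing the retained memory to specific classes and objects, which this sketch does not attempt.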
Lightweight Java Transaction Profilers
Lightweight profilers, such as XRebel and Stackify Prefix, take a different approach to tracking your application by injecting themselves right into the code.
- Aspect Profilers use aspect-oriented programming (AOP) to inject code into the start and end of specified methods. The injected code can start a timer and then report the elapsed time when the method finishes. These profilers are simple to set up but you need to know what to profile. For an example, see Spring AOP Method Profiling.
- Java Agent profilers use the Java Instrumentation API to inject code into your application. This method has greater access to your application since the code is being rewritten at the bytecode level. This allows for any code running in your application to be instrumented – be it code you wrote or 3rd party libraries your application depends on. Check out Introduction to Java Agents to see how this all works.
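The "start a timer, then report the elapsed time when the method finishes" idea behind aspect profilers can be sketched without an AOP framework. The example below swaps in a JDK dynamic proxy (`java.lang.reflect.Proxy`) in place of Spring AOP or AspectJ, so it only works for calls made through an interface. `OrderService`, `TimingProxy`, and `ELAPSED_NANOS` are hypothetical names chosen for illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical service interface, used only for illustration.
interface OrderService {
    String findOrder(int id);
}

class TimingProxy implements InvocationHandler {
    // Total elapsed nanoseconds, keyed by method name.
    static final Map<String, LongAdder> ELAPSED_NANOS = new ConcurrentHashMap<>();

    private final Object target;

    private TimingProxy(Object target) {
        this.target = target;
    }

    /** Wraps target so that every call through iface is timed. */
    static <T> T wrap(T target, Class<T> iface) {
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, new TimingProxy(target));
        return iface.cast(proxy);
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        long start = System.nanoTime(); // "before" advice: start the timer
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // rethrow the exception thrown by the real method
        } finally {
            // "after" advice: record elapsed time even if the method threw
            ELAPSED_NANOS.computeIfAbsent(method.getName(), k -> new LongAdder())
                    .add(System.nanoTime() - start);
        }
    }

    public static void main(String[] args) {
        OrderService real = id -> "order-" + id;
        OrderService timed = wrap(real, OrderService.class);
        System.out.println(timed.findOrder(42));
        ELAPSED_NANOS.forEach((name, nanos) ->
                System.out.println(name + " took " + nanos.sum() + " ns in total"));
    }
}
```

As with an aspect profiler, you have to decide up front which objects to wrap; anything that is not wrapped goes unmeasured.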
Aspect profilers are pretty easy to set up, but they are limited in what they can monitor and require you to spell out everything you want tracked. Java Agents have a big advantage in their tracking depth but are much more complicated to write.
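To give a feel for the Java Agent approach, here is a minimal sketch of an agent entry point built on the `java.lang.instrument` API. It registers a `ClassFileTransformer` that only records which classes are loaded from a watched package and leaves their bytecode untouched; a real profiler would rewrite the bytecode at that point, usually with a bytecode library. The names `SketchAgent`, `LoggingTransformer`, and the `com/example/` prefix are assumptions made for illustration.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class SketchAgent {
    // Hypothetical package to watch; JVM internal class names use '/' separators.
    static final String WATCHED_PREFIX = "com/example/";

    // Internal names of the watched classes seen so far.
    static final Set<String> SEEN = ConcurrentHashMap.newKeySet();

    static class LoggingTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            if (className != null && className.startsWith(WATCHED_PREFIX)) {
                SEEN.add(className);
                // A real profiler would return rewritten bytecode here,
                // for example with timing calls added around each method.
            }
            return null; // null tells the JVM to keep the class unchanged
        }
    }

    // Invoked by the JVM before the application's main() when the agent is attached.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoggingTransformer());
    }
}
```

To run as an agent, the class has to be packaged in a JAR whose manifest names it in a `Premain-Class` entry, and the JVM started with `-javaagent:` pointing at that JAR.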
Stackify Prefix is a developer-oriented Java profiler that uses the Java Agent method behind the scenes. The cool thing is that Prefix already knows which classes and 3rd party libraries developers most often want instrumented, so you don’t have to detail them all out. Plus, it takes all the stats from the instrumentation and displays them in a simple and understandable manner. As an example, when running an application using Hibernate, Prefix will not only detail the elapsed time for queries but also display parameter values for the generated SQL. When your app calls a SOAP/REST API, Prefix provides the request and response content.
Low-Overhead JVM Profiling in Production (APM)
All the profilers so far have been great for development, but tracking how your system performs in production is critical. Production is always a different landscape – development and staging setups typically don’t have the same datasets and load.
Java APM tools typically use the Java Agent profiler method but with different instrumentation rules that allow them to run without noticeably affecting performance in production. The trick with these profilers is to provide the right information in a smart way so that they don’t take up CPU cycles.
Stackify Retrace is an APM tool that uses the same tech as Stackify Prefix with a few adjustments to run smoothly in staging and production environments. This is done by aggregating timing statistics and sampling traces. This gives you method-level visibility into your application’s code that is running in production. So when you have a slow web request, that will translate into a trace showing up in Retrace. From there you can dive in and see which methods are the culprits.
Retrace Screenshot: Web Request Aggregation over 4 hours
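The general idea of "aggregating timing statistics and sampling traces" can be sketched in a few lines. This is not Retrace's implementation, only an illustration under stated assumptions: `RequestStats`, `Aggregate`, and the 1-in-100 `SAMPLE_EVERY` policy are hypothetical, and a trace is reduced to a plain string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class RequestStats {
    // Hypothetical policy: keep a full trace for 1 in every 100 requests per endpoint.
    static final int SAMPLE_EVERY = 100;

    static final class Aggregate {
        long count;
        long totalMillis;
        long maxMillis;
        final List<String> sampledTraces = new ArrayList<>();

        double averageMillis() {
            return count == 0 ? 0.0 : (double) totalMillis / count;
        }
    }

    private final Map<String, Aggregate> byEndpoint = new ConcurrentHashMap<>();

    /** Folds one request into the running totals and keeps its trace only if sampled. */
    void record(String endpoint, long elapsedMillis, String trace) {
        // compute() runs atomically per key, so concurrent updates are not lost.
        byEndpoint.compute(endpoint, (key, agg) -> {
            if (agg == null) {
                agg = new Aggregate();
            }
            agg.count++;
            agg.totalMillis += elapsedMillis;
            agg.maxMillis = Math.max(agg.maxMillis, elapsedMillis);
            if (agg.count % SAMPLE_EVERY == 1) {
                agg.sampledTraces.add(trace); // 1st, 101st, 201st, ... request
            }
            return agg;
        });
    }

    // Reads are unsynchronized here; a real tool would hand out a consistent snapshot.
    Aggregate get(String endpoint) {
        return byEndpoint.get(endpoint);
    }

    public static void main(String[] args) {
        RequestStats stats = new RequestStats();
        for (int i = 1; i <= 250; i++) {
            stats.record("GET /orders", i, "trace #" + i);
        }
        Aggregate a = stats.get("GET /orders");
        System.out.println("count=" + a.count + " avg=" + a.averageMillis()
                + " max=" + a.maxMillis + " traces kept=" + a.sampledTraces.size());
    }
}
```

Recording 250 requests this way stores only three traces (the 1st, 101st, and 201st) while the count, average, and maximum still reflect every request. That trade-off is what keeps the overhead low.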
Why Are Some Java Profilers So Expensive?
XRebel is a cool tool, but it costs $365 a year. Stackify Prefix is free and provides much of the same functionality.
The biggest problem with APM solutions is definitely their pricing. They have traditionally been so expensive that only the largest enterprises could afford them. It doesn’t make a lot of sense to spend $100 a month on a server at Azure or AWS and then spend another $200 a month for a product like New Relic.
Monitoring tools shouldn’t cost more than the servers!