Tips for Application Troubleshooting

By: Iryen
  |  May 21, 2021

Application troubleshooting is easier when you have a process in place. Knowing the core features of the application and how it functions is the baseline. From there, expand your coverage to operating requirements such as Quality of Service (QoS).

Does the application need real-time performance or does it need to move a lot of data? Are there sub-applications running on the endpoints? All of these questions are valid, especially if you’re dealing with a slow application. They also help you determine whether competition for bandwidth, CPU, memory, or disk I/O is affecting your application.

Oftentimes, application troubleshooting practices cover two known areas: the client-side and the server-side. In this article, let’s go beyond these known areas and cover multi-function interactions and other application troubleshooting tips. Read on.

Client-Side Processing

A slow application generates complaints, so start by assessing how the client side is working. First, determine what is happening on the client endpoints. Check the resources and timing of client activities and how they impact business application performance.

To further elaborate, let’s divide the client-side assessment into three categories:

  • Client-side Functionalities
  • Hardware-related issues
  • Resource Competition

Client-side Functionalities

Developers often start with the client-side functionality to diagnose and understand the source of a problem. A web app with a long loading time may be running complex programs on the client. Consider how complex these functions are and how much they contribute to overall performance.

For example, some applications push part of their processing load to the client side. The client then has to run complex algorithms or process large amounts of code or data. To validate this case, use client-side diagnostic tools to determine whether segments of the application consume excessive memory or CPU.
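As a rough illustration of that kind of measurement, here is a minimal sketch using Python's standard library. It assumes the segment under test can be called as a Python function; `profile_segment` and `build_squares` are hypothetical names, and for browser code you would use the browser's own developer tools instead.

```python
import time
import tracemalloc


def profile_segment(func, *args, **kwargs):
    """Run func and report its CPU time and peak Python memory allocation."""
    tracemalloc.start()
    cpu_start = time.process_time()
    try:
        result = func(*args, **kwargs)
    finally:
        # Capture the measurements even if the segment raises.
        cpu_seconds = time.process_time() - cpu_start
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, cpu_seconds, peak_bytes


def build_squares(n):
    # Hypothetical workload standing in for a heavy application segment.
    return [i * i for i in range(n)]


if __name__ == "__main__":
    result, cpu_seconds, peak_bytes = profile_segment(build_squares, 200_000)
    print(f"CPU time: {cpu_seconds:.4f} s, peak memory: {peak_bytes / 1024:.0f} KiB")
```

Comparing these two numbers across segments is usually enough to tell which one deserves a closer look.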

Hardware-related issues

No matter how robust your application is, problems still occur when it runs on an underpowered client system. Hardware-related issues may result in an unresponsive application due to CPU scarcity, limited memory, or a slow local disk. Furthermore, a nearly full disk can behave like a very slow disk as it searches for empty storage blocks.
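A quick way to rule out the nearly-full-disk case is to check disk usage directly. The sketch below uses `shutil.disk_usage` from the Python standard library. The 90% threshold is an assumption for illustration, not a recommended value.

```python
import shutil


def disk_is_nearly_full(path=".", threshold=0.90):
    """Return (is_full, used_fraction) for the filesystem containing path."""
    usage = shutil.disk_usage(path)
    used_fraction = usage.used / usage.total
    return used_fraction >= threshold, used_fraction


if __name__ == "__main__":
    is_full, used_fraction = disk_is_nearly_full()
    print(f"disk used: {used_fraction:.0%}, nearly full: {is_full}")
```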

Resource Competition

A common reason for a slow application is resource competition. For instance, a user is streaming video while another application downloads a software patch or update, and a system backup or virus scan runs at the same time. These processes consume client-side resources such as network bandwidth, CPU, memory, and disk I/O bandwidth. Hence, part of troubleshooting is to review how your application has historically behaved in these situations.

Server-Side Architecture

“What is the best server-side architecture for your application?” The answer depends on how much the design affects performance, and it should always take the target market into account. For example, picture an application server that has to query a repository on the other side of the world. Each round trip adds latency, and across many queries that delay has a large impact on the application’s responsiveness and resilience.

“Chatty” Applications

Competitive markets demand applications with a high degree of interactivity between components. For instance, a web app with millions of visitors exchanges a tremendous amount of data between client and server, or between an app server and a database server. Every exchange is subject to the latency between the components.

The effect is even larger for an application that performs hundreds of data exchanges internally before sending a response to the client. Such applications are classified as chatty applications, and the accumulated delay is a huge blow to the user experience.

Troubleshooting a chatty web application can start with an application assessment, performed from either an application or a network perspective. Sometimes chattiness is intentional, because it avoids exchanging a huge amount of data during the initial request. Thus, developers need to analyze the use case and figure out whether the client needs all the data at once or not.
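To see why the number of exchanges matters, a back-of-the-envelope estimate helps. The sketch below assumes the exchanges happen one after another and that each costs one network round trip. The 40 ms round-trip time and the request counts are made-up numbers for illustration.

```python
def total_latency_ms(round_trips, rtt_ms, server_time_ms=0.0):
    """Estimate wall-clock time spent on sequential request/response exchanges."""
    return round_trips * (rtt_ms + server_time_ms)


# Assumed numbers for illustration only.
RTT_MS = 40.0  # network round-trip time between the two components

chatty = total_latency_ms(round_trips=200, rtt_ms=RTT_MS)
batched = total_latency_ms(round_trips=4, rtt_ms=RTT_MS)

print(f"chatty:  {chatty:.0f} ms")   # 200 small exchanges
print(f"batched: {batched:.0f} ms")  # the same work in 4 larger exchanges
```

Under these assumptions the chatty version spends 8 seconds waiting on the network and the batched version 160 ms, before any payload size is considered.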

Database-related Problems

One well-known database-related problem was the crash of Canada’s immigration website during the US election in 2016. This type of problem boils down to application scalability: the website received more traffic than it could handle. Commonly, database locking implemented by application architects is the primary reason why applications fail to scale. When a system crashes due to overload, vendors often respond by improving the server architecture.

Another faulty design is when an application returns a significant volume of data for the client to filter. This is often done with user interface JavaScript libraries in web-based applications. A better solution is to re-architect the application to return smaller amounts of pre-filtered data, which reduces both the CPU required to process the data and the network bandwidth required to send it.
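The difference is easy to demonstrate with a small sketch. The data set, field names, and helper functions below are hypothetical. The point is only to compare the size of a response that returns everything with one that is filtered and trimmed on the server first.

```python
import json

# Hypothetical order records; in a real system these would come from a database.
ORDERS = [
    {"id": i, "status": "open" if i % 10 == 0 else "closed", "total": i * 1.5}
    for i in range(1000)
]


def payload_unfiltered():
    """Server returns every record and leaves filtering to the client."""
    return json.dumps(ORDERS)


def payload_prefiltered(status):
    """Server filters first and returns only the fields the client needs."""
    rows = [
        {"id": order["id"], "total": order["total"]}
        for order in ORDERS
        if order["status"] == status
    ]
    return json.dumps(rows)


full = payload_unfiltered()
small = payload_prefiltered("open")
print(f"unfiltered: {len(full)} bytes, pre-filtered: {len(small)} bytes")
```

In a real application the filtering would normally happen in the database query itself, so the unneeded rows never leave the database server.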

Geolocation Issues 

Why does it matter that servers are far apart? Whether you’re deploying on-prem or in the cloud, geolocation matters. For example, in an on-prem deployment there is always a delay between the main and backup data centers. There are also cases where part of an application is hosted in one data center while another part is hosted in another. Whether the two are in the same location or not, there is still latency between them.

One of the best designs for dealing with these issues inside a data center is the leaf-spine topology, which keeps the latency between servers in the same location consistent. For cloud deployments, the best approach is to monitor server performance based on its location relative to your target market. For instance, if your e-commerce application covers the Asia-Pacific region, consider choosing server locations that provide lower latency to users there.
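Once you have latency measurements per location, choosing between candidates can be as simple as comparing medians. The following is a minimal sketch. The region names and sample values are invented, and the median is just one reasonable summary that is less sensitive to occasional spikes than the mean.

```python
from statistics import median


def best_region(samples_ms):
    """Return the region with the lowest median latency, plus all medians.

    samples_ms maps a region name to a list of measured latencies in ms.
    """
    medians = {
        region: median(values) for region, values in samples_ms.items() if values
    }
    if not medians:
        raise ValueError("no latency samples provided")
    return min(medians, key=medians.get), medians


# Hypothetical measurements taken from users in the target market.
samples = {
    "region-a": [182.0, 190.5, 176.3],
    "region-b": [48.2, 51.0, 45.9],
    "region-c": [95.4, 230.1, 97.8],
}
region, medians = best_region(samples)
print(region, medians[region])
```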

Multi-Function Interactions

Apps are becoming increasingly sophisticated. Modern app designs require constant interaction among multiple functions. If not handled properly, this can cause poor application performance.

Domain Name System (DNS) problems

DNS problems are often caused by misconfigured clients that send DNS requests to a former, now decommissioned, DNS server. When this occurs, the client waits for a long DNS request timeout, which makes application startup slower. Once the client switches to another DNS server, the request resolves quickly. Users therefore experience an application that is slow to start but runs properly after some time.

From a user point of view, DNS problems are often resolved by simply deactivating firewalls, restarting the router, or even changing the web browser. 
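One way to confirm a slow-DNS suspicion is to time a name lookup on the affected client. The sketch below uses `socket.getaddrinfo` from the Python standard library. The one-second threshold and the `resolver` parameter (which allows substituting a fake resolver in tests) are assumptions of this example.

```python
import socket
import time


def time_lookup(hostname, resolver=socket.getaddrinfo):
    """Return (seconds, error) for one name resolution attempt."""
    start = time.perf_counter()
    try:
        resolver(hostname, None)
        error = None
    except socket.gaierror as exc:
        error = exc
    return time.perf_counter() - start, error


def is_slow_lookup(seconds, threshold=1.0):
    """Flag lookups slower than threshold seconds (the threshold is an assumption)."""
    return seconds >= threshold


if __name__ == "__main__":
    elapsed, error = time_lookup("localhost")
    print(f"lookup took {elapsed * 1000:.1f} ms, error={error}, slow={is_slow_lookup(elapsed)}")
```

A first lookup that takes several seconds followed by fast repeat lookups is consistent with the timeout-then-fallback behavior described above.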

Incorrect codec selection

Users often encounter this error when an application does not support a video codec. However, the cause is not always the app’s design or features. Oftentimes, clients use a high-bandwidth voice codec over a low-bandwidth connection. For instance, a congested 1 Mbps Wi-Fi link will eventually produce this kind of problem, and users suffer intermittent connections. When demand exceeds the bandwidth capacity, periods of high packet loss and high jitter occur.

For voice and video streaming applications, it is best to configure the systems to log sessions that have poor quality characteristics along with the type of codec in use.
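To make the codec trade-off concrete, here is a rough per-stream bandwidth estimate. It assumes IPv4, UDP, and RTP headers (20 + 8 + 12 bytes), a 20 ms packetization interval, and no link-layer overhead. G.711 and G.729 are used as examples of a higher-bitrate and a lower-bitrate voice codec.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP headers


def stream_kbps(codec_kbps, packet_ms=20):
    """Estimate one-way IP bandwidth of a voice stream, excluding link-layer overhead."""
    packets_per_second = 1000 / packet_ms
    # kbps multiplied by ms gives bits per packet; divide by 8 for bytes.
    payload_bytes = codec_kbps * packet_ms / 8
    return (payload_bytes + IP_UDP_RTP_HEADER_BYTES) * 8 * packets_per_second / 1000


g711 = stream_kbps(64)  # G.711 encodes at 64 kbps
g729 = stream_kbps(8)   # G.729 encodes at 8 kbps
print(f"G.711: {g711:.0f} kbps, G.729: {g729:.0f} kbps")
```

Under these assumptions a G.711 stream needs about 80 kbps and a G.729 stream about 24 kbps, which shows how quickly several high-bitrate streams can fill a congested link.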

Bandwidth Issues

Bandwidth issues are difficult to solve. Users often assume that applications are built with sufficient resources and forget that bandwidth is finite. Most often, entertainment traffic competes heavily with business traffic for the same bandwidth.

The top application troubleshooting tip for this problem is to apply QoS, so that critical traffic is prioritized and buffering adjusts accordingly. It also goes back to the basics of coding. For instance, in HTML5 you can place script tags at the bottom of the page. This creates the impression that all the features are loaded while the page is still fetching some JavaScript.
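One common building block of QoS is marking traffic with a DSCP value so that network devices can prioritize it. The sketch below shows how an application might request such a marking on a UDP socket. It is an illustration rather than a complete QoS setup: support for `IP_TOS` varies by operating system, and the marking only has an effect if the network is configured to honor it.

```python
import socket

DSCP_EF = 46  # Expedited Forwarding, commonly used for voice traffic


def dscp_to_tos(dscp):
    """Convert a 6-bit DSCP value into the 8-bit IP TOS/Traffic Class byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp << 2


def open_marked_udp_socket(dscp=DSCP_EF):
    """Create a UDP socket whose outgoing IPv4 packets request the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


if __name__ == "__main__":
    print(f"TOS byte for EF: {dscp_to_tos(DSCP_EF):#04x}")
```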

APM-based Application Troubleshooting Tips

Troubleshooting web application performance issues manually is a tedious task. Here are some specific application troubleshooting tips when using Stackify Retrace.

Error Logs Information

Errors and log information help developers understand the root cause of a problem. Logs are vital in application performance monitoring. Developers can monitor the volume of logs at any particular time and spot outages quickly.

Here are some of the error log features in Stackify Retrace:

  • The top five errors encountered by your application, along with each error’s description and frequency of occurrence.
  • The most recent errors and their corresponding information. This gives developers an edge on where to start the troubleshooting process.
  • An error chart that provides insights on the volume of errors at a specified time and date.
  • The error rate, which compares the volume of errors against an ideal state at a specified time.
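The first and last of these metrics are straightforward to compute from raw log records. The sketch below is a generic illustration, not Retrace's implementation: the record format and sample data are hypothetical, and the error rate is defined here simply as errors divided by total records.

```python
from collections import Counter

# Hypothetical log records: (level, message).
RECORDS = [
    ("INFO", "request served"),
    ("ERROR", "database timeout"),
    ("INFO", "request served"),
    ("ERROR", "null reference"),
    ("ERROR", "database timeout"),
    ("INFO", "request served"),
    ("WARN", "slow response"),
    ("INFO", "request served"),
]


def top_errors(records, n=5):
    """Return the n most frequent error messages with their counts."""
    errors = Counter(message for level, message in records if level == "ERROR")
    return errors.most_common(n)


def error_rate(records):
    """Return the fraction of records that are errors."""
    if not records:
        return 0.0
    return sum(1 for level, _ in records if level == "ERROR") / len(records)


print(top_errors(RECORDS))
print(f"{error_rate(RECORDS):.1%}")
```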

Troubleshoot Service Actions Using Logging Sources

One of the most common ways of troubleshooting service actions is real-time tailing. Performance monitoring and management solutions provide dashboards that let developers view logging statements as they occur. This makes it easy to monitor different applications across multiple servers simultaneously. It also helps organizations with a hybrid cloud setup (on-prem and cloud).
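The core idea behind tailing is to remember how far into the log you have read and fetch only what was appended since. Here is a minimal single-file sketch of that idea. It is not how any particular APM product implements tailing, and the file name is hypothetical.

```python
import os
import tempfile


def read_new_lines(path, offset=0):
    """Return (lines, new_offset) for bytes appended to path since offset."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read()
    return data.decode("utf-8", errors="replace").splitlines(), offset + len(data)


# Demonstration with a temporary log file.
with tempfile.TemporaryDirectory() as folder:
    log_path = os.path.join(folder, "app.log")
    with open(log_path, "w", encoding="utf-8") as log:
        log.write("service started\n")
    first_batch, position = read_new_lines(log_path)

    with open(log_path, "a", encoding="utf-8") as log:
        log.write("request failed\nretrying\n")
    second_batch, position = read_new_lines(log_path, position)

print(first_batch)
print(second_batch)
```

A real tailer would call `read_new_lines` in a polling loop and would also need to handle log rotation and lines that are only partially written at read time.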

Trace Executed Queries Back to Your Applications

Problems in the database often have a wide-reaching impact downstream. Thus, developers should learn how to effectively monitor query results.

One of the capabilities of Retrace is SQL monitors. Developers set up the database information and credentials, configure a database connection, and create a reusable query.

To fast-track this process, Retrace reports how long the SQL statement takes to execute, and an alert can be set up to notify you. The record count matters as well: an alert can be initiated every time the record count is one (1) or greater. For a query that looks for missing user or customer email addresses, this means alerts are sent every time such a record exists in the database. Finally, the result check ensures that an alert is initiated for unexpected query results.
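The logic of such a monitor can be sketched in a few lines. The example below is a generic illustration and does not use the Retrace API: it runs against an in-memory SQLite database, and the schema, query, thresholds, and the `run_monitor` helper are all hypothetical.

```python
import sqlite3
import time


def run_monitor(connection, query, max_seconds=1.0, max_rows=0):
    """Run a monitoring query and return a list of alert messages.

    An alert is raised when the query takes longer than max_seconds
    or returns more than max_rows rows.
    """
    start = time.perf_counter()
    rows = connection.execute(query).fetchall()
    elapsed = time.perf_counter() - start

    alerts = []
    if elapsed > max_seconds:
        alerts.append(f"slow query: {elapsed:.3f} s")
    if len(rows) > max_rows:
        alerts.append(f"record count: {len(rows)} row(s) returned")
    return alerts


# Hypothetical schema and data, using an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO customers (email) VALUES (?)",
    [("a@example.com",), (None,), ("",)],
)

MISSING_EMAIL = "SELECT id FROM customers WHERE email IS NULL OR email = ''"
alerts = run_monitor(conn, MISSING_EMAIL)
print(alerts)
```

With `max_rows=0`, the monitor alerts as soon as the query returns one or more rows, which matches the "one or greater" record count rule described above.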

Inspect the HTTP requests 

Retrace helps developers inspect HTTP requests. It retrieves the details of an HTTP request, such as POST data, query strings, server variables, and others. From there, you can view errors in context with the app’s other logging messages and see what happened before the error occurred.
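To give a sense of what those request details look like in code, here is a small sketch that pulls them out of a WSGI-style request environment. It is a generic illustration, not Retrace's mechanism. The `describe_request` helper and the sample request are hypothetical.

```python
from urllib.parse import parse_qs


def describe_request(environ, body=b""):
    """Collect the parts of a WSGI request that are useful when debugging."""
    headers = {
        key[5:].replace("_", "-").title(): value
        for key, value in environ.items()
        if key.startswith("HTTP_")
    }
    return {
        "method": environ.get("REQUEST_METHOD", ""),
        "path": environ.get("PATH_INFO", ""),
        "query": parse_qs(environ.get("QUERY_STRING", "")),
        "headers": headers,
        "post_data": parse_qs(body.decode("utf-8")) if body else {},
    }


# Hypothetical request used for demonstration.
sample_environ = {
    "REQUEST_METHOD": "POST",
    "PATH_INFO": "/orders",
    "QUERY_STRING": "page=2&sort=date",
    "HTTP_USER_AGENT": "demo-client",
}
details = describe_request(sample_environ, body=b"item=42&qty=1")
print(details)
```

Logging a summary like this next to each error makes it easier to reproduce the failing request later. Take care not to log sensitive fields such as passwords or tokens.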

Solve Common App Development Issues

When developing applications, whether web or mobile, check the common issues that you may encounter and their possible solutions. The most common app issue is user experience. It is a given that a poor user experience will drive your users away.

Poor user experience is a broad topic, but the best solution is a proactive approach that starts with research. User Experience (UX) design lets you learn what your users expect from the app before development begins. Look into your competitors, too, and narrow down your competition.

However, if push comes to shove and your application continues to exhibit poor performance, consider a large-scale testing approach. After all, there might be feature glitches that developers consider minor but that become serious in the hands of an untrained user.

Advanced Application Troubleshooting

Advanced app troubleshooting is sometimes beyond the capability of traditional processes. As organizations use the cloud to increase agility, flexibility, and efficiency, troubleshooting also becomes a complex task. Hence, APMs play a vital role in advancing troubleshooting techniques.

For example, consider a hybrid cloud deployment where offline synchronization has errors, meaning not all the necessary data is being copied between your on-prem and cloud deployments. There could be many reasons for this, such as poorly defined filtering rules. APMs can alleviate these issues and shorten resolution time.

APMs help ensure that the missing data is present in the intended destination. Missing data can be catastrophic, and APMs help with real-time alerts and the capability to pinpoint an incorrectly implemented synchronization process.
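At its simplest, checking for missing data after a synchronization run means comparing record identifiers on both sides. The sketch below assumes each record has a unique ID that can be listed from both the source and the destination. The ID values are invented.

```python
def missing_keys(source_ids, destination_ids):
    """Return IDs present in the source but absent from the destination."""
    return sorted(set(source_ids) - set(destination_ids))


# Hypothetical record IDs from each side of a hybrid deployment.
on_prem_ids = [101, 102, 103, 104]
cloud_ids = [101, 103]

missing = missing_keys(on_prem_ids, cloud_ids)
print(missing)
```

Any IDs reported here point at records that the filtering rules or the synchronization process skipped, which narrows down where to look next.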

Perform Application Troubleshooting with Stackify Retrace

Application troubleshooting may require a multifunctional team to understand an application’s performance. Typically, troubleshooting web application performance issues happens during the maintenance phase. Once development is done and the warranty expires, an organization often hires a full-stack application developer to do the maintenance. Hence, an APM tool is very helpful, especially when dealing with a huge application.

Overall, web application troubleshooting is not easy. In the worst cases, an application may need an overhaul, and re-architecting an application is very challenging. Hence, if you have a lot of sub-applications, especially in an enterprise-wide environment, start with the most important app that requires the biggest support effort.

Application performance management (APM) tools, such as Stackify Retrace, are helpful. APMs help developers identify whether a problem is on the client side, the network, or the server side. Retrace collects the right logs and helps identify back-end application and database servers that are creating bottlenecks.

Also, you can achieve the equivalent functionality with a free code profiler like Prefix. This tool can be installed while you work and can help you with languages such as Python, Java, Ruby, and others. 

Start your Stackify FREE 14-day trial today!
