Application Performance Management (APM) solutions are a must-have for Agile development teams, and when implemented correctly, they can save substantial amounts of time, create a better end user experience, and improve overall development operations. (Naturally, we’re big on APM – it’s what we do.)
The key to success, though, is implementing systems and solutions that are aligned with larger business goals and knowing how to leverage your tools to your advantage. So, we rounded up some advice from developers and IT leaders to offer some insight on this question:
“What’s the biggest mistake IT management teams make when it comes to implementing application performance monitoring processes (and how can they fix it)?”
Meet Our Panel of Developers and IT Leaders:
Read on to find out how you can better leverage APM to your team’s strategic advantage.
Mark supports, administers, and helps improve Morpheus IT infrastructure. Prior to Morpheus, Mark served as a Cloud Architect for a data consultancy, Support and Solutions Engineer at VMware, and a Procurement Automation Administrator at Lockheed Martin.
“After about 15 years in IT, here’s a couple of the most common mistakes that I see all the time…”
- They address the error, not the root cause. It’s the job of IT managers to know what’s going on and to try catching problems before they get out of control. As a result, we sometimes become too reliant on third-party technology. Tools like Datadog and New Relic are great, but they are not going to tell you exactly what the problem is. Yeah, they can help point out bottlenecks, but unless you resolve those bottlenecks by finding the root cause, you are going to continue having problems. Don’t expect a tool to solve your problems. Find the root cause, create new checks and error messages, and you’ll be able to deter potential problems in the future.
- They don’t consider the business impact. Sometimes we get a new tool because we think it will solve all our problems. While that may be the case sometimes, we usually have to do some groundwork before we can achieve true success. Here’s an example: We are tasked with monitoring the most important systems in our organization, so we put performance monitoring on everything. Good? No. I’ve seen so many teams use every single license they have, just for the sake of using them all. They think they need to put performance monitoring on everything, so they hook up the performance monitor to all their servers. What this inevitably does is open more servers and systems to bugs, and suddenly, Dev is underperforming and it’s because of a “good thing.” Focus on what’s most important and make it solid. Then build out from there. (This ties in with the idea that performance management is sometimes about catching what we need to care about, not just the problems at hand. Monitoring the performance of high-use, mission-critical systems is vital, and an issue there should not be treated with the same level of concern as issues elsewhere. I’ve seen groups say, “Well, we only had 4 errors last month, so everything is fine.” Well, those 4 errors were all in production, but they had 30 errors and alerts in staging and QA that they’re afraid to mention. All errors are not created equal. We need more errors in staging and QA, and we need to recognize those as good things because they are preventing errors in Prod.)
- Start earlier. Build performance expectations and testing into the earliest parts of new app development. This is hard to do on legacy applications but is a great practice for Agile shops. By setting desired expectations on some early metrics, as you build, those performance metrics are visible targets to the team. You can start dealing with the impact earlier and forecast performance for production settings. Again, this can help cross-functional teams avoid problems down the road and point out issues and bugs sooner.
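One way to make an early performance expectation a visible target is to encode it as a budget that a test can check. The sketch below is only an illustration under stated assumptions: `build_report` is a hypothetical stand-in for real application code, and the 200 ms budget is an invented number, not a recommendation.

```python
import time
from statistics import median


def measure_ms(func, runs=5):
    """Return the median wall-clock time of func() in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        samples.append((time.perf_counter() - start) * 1000.0)
    return median(samples)


def within_budget(elapsed_ms, budget_ms):
    """True when a measured time meets the agreed performance target."""
    return elapsed_ms <= budget_ms


def build_report(order_count=1000):
    # Hypothetical stand-in for the application code being measured.
    return sum(i * 2 for i in range(order_count))


# Hypothetical target the team agreed on early in development.
REPORT_BUDGET_MS = 200.0

if __name__ == "__main__":
    elapsed = measure_ms(build_report)
    status = "OK" if within_budget(elapsed, REPORT_BUDGET_MS) else "OVER BUDGET"
    print(f"build_report: {elapsed:.2f} ms (budget {REPORT_BUDGET_MS} ms) {status}")
```

Using the median of a few runs rather than a single timing keeps one noisy run from failing the check.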
Tapas Banerjee is the CEO of Web Age Solutions.
“The single biggest mistake IT Management Teams make implementing Application Performance Monitoring is…”
Not having an enterprise monitoring strategy. This can stem from the mistaken notion that APM is server monitoring, or from bringing in solutions that someone on the team used before without considering the overall application set in play.
Fixing the gap means creation and implementation of a monitoring strategy. The monitoring strategy should cover the collection of monitoring data from all of the parts in the organization’s business solutions, so that you can proactively identify and resolve failures. Your monitoring plan should address, at a minimum, the following questions:
- What business goals does your monitoring support?
- What are the categories and specific resources you will monitor?
- How will you measure the success of your monitoring?
- How often will you review your monitoring plan?
- What is the periodicity of the monitoring of applications and resources?
- Who will perform the monitoring tasks?
- Who should be notified when something goes wrong?
- What monitoring tools do you use?
- What is your organizational DevOps monitoring maturity?
- What monitoring tools will you use?
Dan is the Development Team Lead for Objective in Salt Lake City, Utah. He holds a master’s degree in Linguistics from the University of Hawai’i and received bachelor’s degrees from Brigham Young University in Korean and Linguistics.
“The biggest mistake that IT management teams make when it comes to implementing application performance monitoring processes is that…”
They don’t implement them. This can be fixed by using services such as New Relic or Skylight for web application performance, and Crashlytics or similar for iPhone apps. There are similar services and/or libraries for pretty much every type of application.
David Lynch is a Marketing Specialist for ITXcorp.
“When IT management looks at application performance monitoring…”
They tend to think in terms of the systems that they manage and not the experience of the end user. This leads to a narrow focus on the underlying architecture (solving technical problems) first. Put another way, the focus of the performance monitoring is on the underlying systems, and this monitoring is used as a proxy for end user experience, such as “Is the disk performing well?” or, “Is the CPU under load?” Moving the target by asking, “Is the end user able to use the system to accomplish or advance their goals?” allows the system to be examined from a different context and helps to focus investment and iterative performance improvement tasks on those portions of the system which are most important.
Mihai Corbuleac is a Cloud Consultant at Unigma Monitoring Solution.
“The biggest mistake IT management teams make when it comes to implementing application performance monitoring processes is…”
The fact that people don’t usually give the performance monitoring app time to gather enough data to make forecasts and generate accurate suggestions for improving performance. Our tool monitors cloud-based apps, improves cloud costs, and also generates app performance suggestions, and I know that people are eager to optimize as soon as possible. We always share immediate suggestions, but the best optimization comes after a while.
Michael is the Founder & Chief Analytics Officer for Digital Acumen. He worked with Fortune 100 companies in these roles: media mix and statistical modeling, test & learn, web analytics, product, and program management.
“The biggest challenge to successful performance monitoring isn’t the vendor, the reports, or the actionable intelligence…”
It is executing on the intelligence.
We have worked with a major B2B service provider who uses a major performance monitoring tool but was not making substantial changes from the information collected. Their challenge was getting buy-in that the changes were critical to the user experience and thus had significant difficulty in getting the fixes prioritized.
The solution was to present evidence on the importance of site speed and uptime, and its effect on bounce rates and thus conversion. The data shows that a one-second delay in site load can drop conversions by 7%. Their survey results also showed page load as a top-five complaint. With this data, the team was able to prioritize performance issues on the backlog.
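To show how a figure like that turns into a number a backlog owner can weigh, here is a small worked example. The 7% drop is the figure quoted above. The baseline of 1,000 conversions is a hypothetical value chosen only for illustration.

```python
def conversions_after_delay(baseline_conversions, drop_fraction=0.07):
    """Conversions remaining after a one-second delay, using the quoted 7% drop."""
    return baseline_conversions * (1.0 - drop_fraction)


baseline = 1000  # hypothetical monthly conversions
remaining = conversions_after_delay(baseline)
lost = baseline - remaining
print(f"About {lost:.0f} of {baseline} conversions lost to a one-second delay")
```

Multiplying the lost conversions by an average order value would give the revenue side of the same argument.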
Brady Keller is a Digital Marketing Strategist at Atlantic.Net, a trusted hosting solution for businesses seeking the best enterprise-class data centers.
“One of the biggest challenges to implementing application performance monitoring is…”
That with the rise of virtual and Cloud environments, it can be hard to monitor the performance of a process that may not always be running on the same server/node but could be hopping across a distributed cluster. Once IT teams are aware of the need for different metrics than what they had used prior, they can begin to split up those metrics into several categories, like virtual machine workload, per-server application performance, virtualization platform performance, and user-side experience and response time. It becomes less about one overall metric and more about segmenting and then weighing what metrics are the most important from a wide swath of metrics.
Daniel Lakier is VP-ADC globally for Radware. Daniel has been in the greater technology industry for over 20 years. During that time he has worked in multiple verticals including the energy, manufacturing, and healthcare sectors. Prior to Radware, Daniel was president and CTO of a leading technology integrator.
“Two of the most common mistakes we see are intrinsically linked…”
First, the lack of monitoring and testing is often due to time pressure from production management and fast moving project timelines. However, the old adage, “more haste, less speed,” still rings true today. Do it right the first time and you save significant time on the backend because it’s much easier to optimize an application before it goes into production.
Moreover, too often we see people using two different systems or tools for performance monitoring and baselining: one in the test and development phase and a completely different tool in production. Switching tools throughout the process makes using the performance metrics for baselining purposes less than ideal and can cause a host of unforeseen challenges when trying to compare actual application performance and stability to expected performance and stability.
To minimize these challenges and avoid these mistakes, be sure to use the right tool for the right job and function. In many cases, an SLA manager, similar to those found in some ADCs, can give you a quick guide to whether the problem is internal, external, network, or application-based. These SLA managers can also be set to provide alerts on performance deviations and are an effective first tool for any application performance strategy. Network monitoring solutions can also be used to help you get more granular on the connectivity layer by reviewing performance indicators like network latency, packet error rates, retransmits, and packet loss.
Lastly, a full APM can drill down into the application itself to do root-cause analysis for coding optimization to enhance code performance and stability.
If we have a clear strategy and build a good practice, then we can always stay ahead of the curve. By providing the application with the appropriate resources to handle the required task, we can provide a predictable and repeatable customer experience.
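The idea of alerting on performance deviations from a baseline can be sketched in a few lines. This is a simplified illustration, not how any particular SLA manager works: the three-standard-deviation threshold and the sample response times are assumptions made up for the example.

```python
from statistics import mean, stdev


def deviates_from_baseline(baseline_ms, current_ms, max_sigma=3.0):
    """Flag a reading more than max_sigma standard deviations above the baseline mean."""
    if len(baseline_ms) < 2:
        raise ValueError("need at least two baseline samples")
    mu = mean(baseline_ms)
    sigma = stdev(baseline_ms)
    return current_ms > mu + max_sigma * sigma


# Hypothetical response times (ms) captured with the same tool during testing.
baseline = [120.0, 118.0, 125.0, 122.0, 119.0, 121.0]

for reading in (123.0, 180.0):
    if deviates_from_baseline(baseline, reading):
        print(f"ALERT: {reading} ms is outside the baseline")
    else:
        print(f"ok: {reading} ms")
```

A comparison like this only means something when the baseline and the production readings come from the same tool, which is the point made above about not switching tools between test and production.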
Swapnil Bhagwat is the Senior Manager – Design & Digital Media, implementing web, design and marketing strategies for the group companies. He is an MBA graduate with work experience in the US, UK and Europe. Swapnil has worked for more than a decade across a range of businesses for the global markets.
“Some of the crucial mistakes the IT Management team makes while monitoring the application performance are…”
- Not identifying the exact scenario before hiring external contractors
- Going ahead with the non-essential investments related to the application
- Over-employment of staff
- Appointing inexperienced or incompetent leaders
- Having more than adequate numbers of managerial positions in a team
Eric Christopher is co-founder and CEO of Zylo, the leading SaaS optimization platform that transforms how companies manage and optimize the vast and accelerating number of cloud-based applications organizations rely on.
“With SaaS app purchases being made across the organization without IT’s involvement, CIOs have another issue on their hands…”
Lack of visibility. This “operating in the dark” can cause them to make decisions without having all of the important information.
Enter a cloud intelligence system of record. What could this type of platform actually do to help the CIO better manage SaaS and cloud applications? Let’s examine a few components:
Executive Dashboard: Having all provider-specific data in one single platform, versus many siloed platforms, CIOs can see cloud metrics alongside spend and application trending detail to make data-driven decisions.
Renewals: With proactive visibility and alerting, as well as application level data ownership, CIOs are in a much better position to negotiate and get the best contract terms available.
Supplier Relationships: Effectively managing supplier relationships is now possible when the contact information, quotes, contracts and notes about the relationship are stored with the current contract spend and application utilization information. The days of asking a provider to share utilization detail to negotiate a deal with that same provider are over.
Benny Friedman is the Director, Israeli Development Center at Ruckus Wireless.
NOTE: The following information is excerpted from 10 totally avoidable performance testing mistakes via TechBeacon.
“Some people schedule performance testing at the end of the life cycle, assuming they can’t test before the complete application is available…”
That is so wrong. If your continuous integration (CI) system already has some automated tests in it, you can start automated performance testing. Running performance tests should be part of your functional testing cycle.
When testing each build, you should also test for performance, and reuse the functional tests for performance testing if possible. You might want to have a few test configurations, just as with functional user interface (UI) tests. Have a short one to run for every build, a more thorough one to run nightly, and a longer, more comprehensive one to run at the end of each sprint. If your application relies on services that are expensive to use during testing, or if they’re not available at the time you need to test your application, use a service emulation tool to simulate them. This lets you test the performance of your core software as if it was consuming the actual services.
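As a rough sketch of the two ideas above, the example below emulates an expensive service and runs the same test at different depths per tier. Everything named here is hypothetical: `FakePaymentService`, `checkout`, and the iteration counts in `TIERS` are invented for illustration, and a real service emulation tool would do far more than sleep for a fixed delay.

```python
import time


class FakePaymentService:
    """Hypothetical emulation of an expensive external service.

    Returns a canned response after a fixed, configurable delay.
    """

    def __init__(self, latency_s=0.0):
        self.latency_s = latency_s
        self.calls = 0

    def charge(self, amount_cents):
        self.calls += 1
        time.sleep(self.latency_s)
        return {"status": "approved", "amount_cents": amount_cents}


def checkout(cart_cents, payment_service):
    # Hypothetical application code under test.
    total = sum(cart_cents)
    return payment_service.charge(total)


# Hypothetical tiers: a short run per build, longer runs nightly and per sprint.
TIERS = {"build": 10, "nightly": 1000, "sprint": 10000}


def run_perf_test(tier, payment_service):
    """Run checkout repeatedly and return the average seconds per call."""
    iterations = TIERS[tier]
    start = time.perf_counter()
    for _ in range(iterations):
        checkout([1999, 501], payment_service)
    return (time.perf_counter() - start) / iterations


if __name__ == "__main__":
    service = FakePaymentService(latency_s=0.001)
    avg = run_perf_test("build", service)
    print(f"build tier: {avg * 1000:.2f} ms per checkout")
```

Because the fake service is passed in as an argument, the same `checkout` code can later be pointed at the real service without changes.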
Beeye aligns people and projects through a collaborative planning and managing tool so you better reach your goals. With Beeye, managers know which projects are understaffed, which are running behind schedule and which are most profitable. Managers and employees can manage their time, workload, and analyse their performance. Beeye is a SaaS solution that gives organizations the capacity and profitability planning information they need, when they need it. All with a lightweight, low cost, easy to use tool.
NOTE: The following information is excerpted from 14 Mistakes That Ruin Performance Management Every Time via Beeye.
Probably the most common misunderstanding about performance management is that it is the same thing as a performance review. In fact, the performance appraisal is only a part of the whole process outlined above.
Worse, performance management is often confused with the mostly outdated annual performance review. If it is to be taken seriously, performance should be monitored on an ongoing basis so that problems are fixed when they arise, and opportunities exploited as soon as possible. It is a continuous process, not an event.
Thinking this way is one of the things that get organizations into trouble, because the yearly appraisal process, no matter how well designed, is not enough to ensure that employees perform at their best.
It is also not a purely administrative burden: performance management is about making people and organizations more efficient in a measurable way, not about filling forms and having meetings to collect data that will never be used.
Basic misunderstandings about performance management explain both why it is reviled, and why it is inefficient when companies decide to go through with it even though they are missing critical pieces of the puzzle.
Charles Araujo is a Principal Analyst for Intellyx. Intellyx is the first and only industry analysis, advisory, and training firm focused on agile digital transformation. Intellyx works with enterprise digital professionals to cut through technology buzzwords and connect the dots between the customer and the technology – to provide the vision, the business case, and the architecture for agile digital transformation initiatives.
NOTE: The following information is excerpted from Slow is Smooth and Smooth is Fast: Application Performance Management and the New Development Mantra via Intellyx.
“There is a well-known axiom in the development world that is synonymous to my father’s SWAT team mantra…”
“The best time to find bugs is when you’re creating them.”
Of course, development teams know this, or at least they pay lip service to it. Quality Assurance (QA) teams and their embedded testing procedures are almost universally a part of the software development lifecycle. But there are two corollary facts that are just as prevalent, if less discussed, in an ever-faster-moving development world: coders want to code (not test), and traditional testing won’t uncover the most common performance-related issues.
The problem is that traditional testing approaches primarily test code at a functional level — their aim: to identify code that just doesn’t work. But in today’s world, that’s a rarity. We’re long past the point in which consumers (internal or external) were tolerant of rampant and blatant bugs in the code. Today, the vast majority of issues that make the difference between perceived success or failure of a deployment come down to one thing: performance.
Unfortunately, developers discover most transactional performance issues only after they’re in production rather than at the point of development.
Stackify created Prefix to help close this gap. Prefix runs in your development environment and is a lightweight tool that shows real-time logs, errors, and queries, along with other real-time, performance-related information on the developers’ workstations. It helps them understand how long transactions take and can answer the key question, “what did my code just do,” while they can still do something about it.
Floyd Smith is the Director of Content Marketing at NGINX.
NOTE: The following information is excerpted from 10 Tips for 10x Application Performance via NGINX.
“If your web application runs on a single machine, the solution to performance problems might seem obvious: just get a faster machine, with more processor, more RAM, a fast disk array, and so on…”
Then the new machine can run your WordPress server, Node.js application, Java application, etc., faster than before. (If your application accesses a database server, the solution might still seem simple: get two faster machines, and a faster connection between them.)
Trouble is, machine speed might not be the problem. Web applications often run slowly because the computer is switching among different kinds of tasks: interacting with users on thousands of connections, accessing files from disk, and running application code, among others. The application server may be thrashing – running out of memory, swapping chunks of memory out to disk, and making many requests wait on a single task such as disk I/O.
Instead of upgrading your hardware, you can take an entirely different approach: adding a reverse proxy server to offload some of these tasks. A reverse proxy server sits in front of the machine running the application and handles Internet traffic. Only the reverse proxy server is connected directly to the Internet; communication with the application servers is over a fast internal network.
Using a reverse proxy server frees the application server from having to wait for users to interact with the web app and lets it concentrate on building pages for the reverse proxy server to send across the Internet. The application server, which no longer has to wait for client responses, can run at speeds close to those achieved in optimized benchmarks.
Adding a reverse proxy server also adds flexibility to your web server setup. For instance, if a server of a given type is overloaded, another server of the same type can easily be added; if a server is down, it can easily be replaced.
Because of the flexibility it provides, a reverse proxy server is also a prerequisite for many other performance-boosting capabilities, such as:
- Load balancing (see Tip 2) – A load balancer runs on a reverse proxy server to share traffic evenly across a number of application servers. With a load balancer in place, you can add application servers without changing your application at all.
- Caching static files (see Tip 3) – Files that are requested directly, such as image files or code files, can be stored on the reverse proxy server and sent directly to the client, which serves assets more quickly and offloads the application server, allowing the application to run faster.
- Securing your site – The reverse proxy server can be configured for high security and monitored for fast recognition and response to attacks, keeping the application servers protected.
NGINX software is specifically designed for use as a reverse proxy server, with the additional capabilities described above. NGINX uses an event-driven processing approach which is more efficient than traditional servers. NGINX Plus adds more advanced reverse proxy features, such as application health checks, specialized request routing, advanced caching, and support.
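To make the load-balancing idea concrete, here is a minimal round-robin sketch. It is a conceptual illustration only and does not reflect how NGINX is implemented or configured. The class name and the backend addresses are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backend addresses in rotation."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._rotation = cycle(self._backends)

    def add_backend(self, backend):
        # Adding capacity changes only the balancer, not the application.
        # For simplicity, the rotation restarts from the first backend.
        self._backends.append(backend)
        self._rotation = cycle(self._backends)

    def next_backend(self):
        """Return the backend that should receive the next request."""
        return next(self._rotation)


balancer = RoundRobinBalancer(["app1:8000", "app2:8000"])
for _ in range(4):
    print("route request to", balancer.next_backend())
```

The application servers never appear in client-facing configuration, which is why a new one can be added by calling `add_backend` alone.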
Frank J. Ohlhorst
Frank J. Ohlhorst is an award winning technology journalist, professional speaker and IT business consultant with over 25 years of experience in the technology arena. Frank has written for several leading technology publications, including ComputerWorld, TechTarget, PCWorld, ExtremeTech and Toms Hardware. Frank has also contributed to business publications, including Entrepreneur and BNET, and also has contributed to multiple technology books and has written several white papers, case studies, reviewers’ guides and channel guides for leading technology vendors.
NOTE: The following information is excerpted from Application Control: How to Detect Performance Bottlenecks via Tom’s IT Pro.
“Ultimately, the goal with Application Performance Management or Monitoring (APM) is to leverage proactive management, succeeding in preventing problems and helping IT to plan for future needs…”
To accomplish that, an APM product should assist in managing certain elements – which can be broken down into:
- Fault Monitoring: Primarily used to detect major errors related to one or more components. Faults can consist of errors such as the loss of network connectivity, a database server going offline, or the application suffering an out-of-memory condition. Faults are important events to detect in the lifetime of an application because they negatively affect the user experience.
- Performance: Performance monitoring is specifically aimed at detecting less than desirable application performance, such as degraded servlet, database or other back-end resource response times. Generally, performance issues arise in an application as the user load increases. Performance problems are important events to detect in the lifetime of an application since they, like Fault events, negatively affect the user experience.
- Configuration: Configuration monitoring is a safeguard designed to ensure that configuration variables affecting the application and the back-end resources remain at some predetermined configuration settings. Configurations that are incorrect can negatively affect the application performance. Large environments with several machines, or environments where administration is manually performed, are candidates for mistakes and inconsistent configurations. Understanding the configuration of the applications and resources is critical for maintaining stability.
- Security: Security monitoring detects intrusion attempts by unauthorized system users.
- Accounting: In some cases, departments or users may be charged maintenance, usage and administration fees. Accounting monitoring measures usage so that, for example, organizations that have a centralized IT division with profit/loss responsibilities can appropriately bill their customers based on usage.
Each of the above capabilities can be integrated into daily or weekly management reports for the application. If multiple application monitoring tools are used, the individual subsystems should be capable of either providing or exporting the collected data in different file formats that can then be fed into a reporting tool. Some of the more powerful application monitoring tools can not only monitor a variety of individual subsystems, but can also provide some reporting or graphing capabilities.
LadyCoders.com is run by a bunch of women that want to break down the stereotypes that women can’t code, aren’t good with computers, and somehow aren’t equal to men in this area. Nothing could be further from the truth! They offer a lot of tips that should really help both men and women who aspire to build a web application or site in one of the more popular coding languages.
NOTE: The following information is excerpted from Tips for Maximizing and Monitoring Application Performance via LadyCoders.com.
“When it comes to the performance of a web application, the amount of available bandwidth plays a direct role in how quickly the application operates…”
While you may have carefully planned for the amount of bandwidth an application will require, what happens when the application receives an unexpected amount of traffic? What if your site goes viral and thousands wish to access an application within a short amount of time? While there are many instances that are outside of your control, such as unexpected spikes in application traffic, you can help support continual high-performance by reducing the amount of unnecessary high resolution files, which include images and videos. When designing your web application, do so with the intention of having high bandwidth demands, which will automatically result in leaner, more functional applications.
Because of the dynamic environment most applications thrive in, it’s essential that you establish an application performance monitoring solution capable of continuously monitoring and addressing issues in all levels of an application. Remember, there’s no such thing as the perfect application code. Errors and issues will arise; however, it’s how you tend and repair these issues that truly determines the application’s overall performance. Remain vigilant with regard to monitoring the performance of an application and establish set guidelines when it comes to addressing and correcting any and all errors.
George Lawton is a journalist based near San Francisco, Calif. Over the last 15 years, he has written over 2,000 stories for publications about computers, communications, knowledge management, business, health and other areas which interest him.
NOTE: The following information is excerpted from Web application performance tips from the wolves on Wall Street via TechTarget’s TheServerSide.com.
“Many web application performance issues come down to the I/O in, compute, archive, encode, and then I/O out cycle…”
Normally developers use coarse-grained parallelism where the application logic will go through some activity and then respond back. This has interesting implications. In many cases, the application model has to deal with concurrency. It’s easy to create applications where multiple threads hit the same backend.
“Another issue is that the application will often deal with contention on data stores,” said Martin Thompson, founder of Real Logic. A queue helps manage this. Modern CPUs have caches that drive performance. The level 1 cache has a latency of about 1 nanosecond (ns). But if the application misses this window it rises to about 100 ns. If the data has to be pulled from another server, this can rise to over 1000 ns. One of the properties of caches is that they work on a principle of least recently used data. In many cases it’s possible to craft the application such that it doesn’t fundamentally leverage the best use of the hardware architecture.
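A quick calculation with the latency figures quoted above shows why cache behavior matters so much. This is back-of-the-envelope arithmetic under simplifying assumptions: every access costs exactly 1 ns, 100 ns, or 1000 ns, and the hit rates are hypothetical. The text says remote fetches can take over 1000 ns, so that value is a lower bound.

```python
# Approximate latency figures from the text, in nanoseconds.
L1_HIT_NS = 1.0
LOCAL_MISS_NS = 100.0
REMOTE_FETCH_NS = 1000.0  # the text says "over 1000 ns", so this is a floor


def average_access_ns(l1_hit_rate, remote_rate=0.0):
    """Expected latency per access for a given mix of hits and misses.

    l1_hit_rate: fraction of accesses served from the level 1 cache.
    remote_rate: fraction of accesses that go to another server.
    The remainder is treated as local cache misses.
    """
    local_miss_rate = 1.0 - l1_hit_rate - remote_rate
    if l1_hit_rate < 0 or remote_rate < 0 or local_miss_rate < -1e-9:
        raise ValueError("rates must be non-negative and sum to at most 1")
    local_miss_rate = max(local_miss_rate, 0.0)  # absorb float rounding
    return (
        l1_hit_rate * L1_HIT_NS
        + local_miss_rate * LOCAL_MISS_NS
        + remote_rate * REMOTE_FETCH_NS
    )


for hit_rate in (0.99, 0.95, 0.80):
    print(f"L1 hit rate {hit_rate:.0%}: ~{average_access_ns(hit_rate):.2f} ns per access")
```

Under these assumptions, dropping from a 99% to an 80% hit rate moves the average from about 2 ns to about 21 ns per access, roughly a tenfold slowdown.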
Leveraging a pool of threads can make it easy to optimize cache performance, and subsequently improve web application performance as well. But a better approach is to assign a thread per stage where each stage is working with the part that keeps it simple. This is how manufacturing works. The model becomes single threaded and there is reduced concurrency. This approach is better at dealing with data stores in batches. Queues are used everywhere in modern applications. Thompson recommends making them explicit and then measuring cycle time and service time.
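The thread-per-stage model with explicit queues can be sketched as follows. This is a toy illustration, not Thompson's implementation. The two stages (`parse` and `encode`) are hypothetical, and the definitions used here are assumptions: service time is the time a stage spends working on one item, and cycle time is the time from an item entering the first queue to leaving the last.

```python
import queue
import threading
import time

STOP = object()  # sentinel that tells a stage to shut down


def stage(name, work, inbox, outbox, stats):
    """Run one pipeline stage on its own thread, recording service time per item."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)  # pass the shutdown signal downstream
            return
        started = time.perf_counter()
        item["value"] = work(item["value"])
        stats[name].append(time.perf_counter() - started)
        outbox.put(item)


def run_pipeline(values):
    """Push values through two single-threaded stages joined by explicit queues."""
    q_in, q_mid, q_out = queue.Queue(), queue.Queue(), queue.Queue()
    stats = {"parse": [], "encode": []}  # each list is written by one thread only
    threads = [
        threading.Thread(target=stage, args=("parse", int, q_in, q_mid, stats)),
        threading.Thread(
            target=stage,
            args=("encode", lambda n: f"{n:04d}", q_mid, q_out, stats),
        ),
    ]
    for t in threads:
        t.start()
    for v in values:
        q_in.put({"value": v, "entered": time.perf_counter()})
    q_in.put(STOP)

    results, cycle_times = [], []
    while True:
        item = q_out.get()
        if item is STOP:
            break
        cycle_times.append(time.perf_counter() - item["entered"])
        results.append(item["value"])
    for t in threads:
        t.join()
    return results, stats, cycle_times


if __name__ == "__main__":
    results, stats, cycle_times = run_pipeline(["7", "42", "1234"])
    print(results)
    for name, samples in stats.items():
        print(f"{name}: mean service time {sum(samples) / len(samples) * 1e6:.1f} us")
    print(f"mean cycle time {sum(cycle_times) / len(cycle_times) * 1e6:.1f} us")
```

Because each stage owns its data and runs on a single thread, no locks are needed inside the stages. The gap between cycle time and the sum of service times is time spent waiting in queues.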
Kieran Taylor, previously the Director of Product Marketing of Compuware’s APM Business Unit, is currently the Senior Director of Product and Solutions Marketing for CA Technologies.
NOTE: The following information is excerpted from The New World of Application Performance Management via Data Center Journal.
“With so much going on beyond your own data center, the reality of modern web applications is that even if your tools inside the firewall indicate that everything is running okay, that’s no guarantee your end users are happy…”
You can no longer just manage the elements and application components inside your firewall because this gives you only partial coverage and leaves you with significant blind spots. Many aspects of the end-user experience will not be inferable from data collection points within the data center. The point at which the end user accesses a composite application is the only place where true application performance can be understood.
Today, it becomes more critical to assess the end-user experience as part of an overall performance management strategy canvassing all performance-impacting elements—from the end user’s browser all the way back to the multiple tiers of the data center and everything in between. This is the key to identifying and fixing any weak links. Some businesses believe they can’t manage the performance of the entire application delivery chain because many of these components are outside the firewall and beyond their direct control. But if you can manage application performance from the end user’s point of view and include the entire web application delivery chain, then you really are in a stronger position of control.
James Mancini is the Founder and Chief Technologist for Netreo.
NOTE: The following information is excerpted from Counting On The Cloud: 4 Tracking Tips For Cloud-Hosted Applications via Netreo.
“Knowing the real-time status of cloud-based systems may give you time to prepare for the effects of an impending outage…”
You may be able to take corrective action, or at least communicate to affected users so they’re aware of the problem and can act accordingly.
The ability to see historical information at a glance, and produce reports to document it, is also important. With this data in hand, you can hold your service providers accountable. If they’re not delivering on the service level requirements they’ve committed to, you need to show them what’s happening.
If you’ve done the hard work of migrating bare metal services to the cloud, you’ve probably seen an increase in uptime, and that’s great. But the cloud’s dramatically increasing role in IT system infrastructure will likely create more complexity and more service issues.
Prepare yourself now to handle emerging cloud service issues by monitoring cloud-hosted applications thoroughly.
Boris Dzhingarov graduated from the University of National and World Economy with a major in marketing. He writes for several sites online such as Semrush, Tweakyourbiz, Tech Surprise, and MonetaryLibrary. Boris is the founder of
NOTE: The following information is excerpted from 4 Tips to Improve Your .NET Application Performance via TG Daily.
“When optimizing your .NET application performance, Profilers are a critical component in your troubleshooting arsenal, especially when dealing with poor CPU performance and memory resource issues…”
Traditional profilers track things like memory usage, method call frequency, and time spent per line of code. Lightweight profilers provide you with a high-level understanding of how your code is performing. And APM tools monitor your production servers.
- Traditional .NET Profilers
While these profilers aren’t used very often, they come in handy when you’re dealing with problems stemming from poor CPU performance and memory resource issues.
Because traditional .NET Profilers consume a hefty amount of resources, you want to avoid running them on the same computer as the database you’re profiling.
- Lightweight .NET Profilers
These profilers are designed for frequent use and to track your application’s performance at a high level so you can see important data like page load times, successful database calls, and why pages are taking so long to load.
Since lightweight profilers don’t use much of your computer’s resources, you can let these run indefinitely.
- APM tools
APM tools that run on your server need to be lightweight, so they don’t slow down your applications. Thankfully, they can collect details quickly to help you diagnose the problem faster.
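To give a feel for what a traditional profiler reports, such as how often each method is called, here is a small sketch. It uses Python's built-in `cProfile` purely as a stand-in, since the tools discussed above are .NET-specific and their APIs are not shown here. `slow_sum` and `square` are hypothetical functions invented for the example.

```python
import cProfile
import pstats


def square(x):
    return x * x


def slow_sum(n):
    # Hypothetical hot path: calls square() once per loop iteration.
    total = 0
    for i in range(n):
        total += square(i)
    return total


def profile_call_counts(func, *args):
    """Run func under cProfile and return {function name: total call count}."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, name) to
    # (primitive calls, total calls, own time, cumulative time, callers).
    return {key[2]: value[1] for key, value in stats.stats.items()}


if __name__ == "__main__":
    counts = profile_call_counts(slow_sum, 1000)
    print("square calls:", counts["square"])
    print("slow_sum calls:", counts["slow_sum"])
```

This kind of instrumentation adds overhead to every call, which is the same reason the text advises against running traditional profilers alongside production workloads.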