APM Tools
Imagine you are an expert on a highly customizable platform that has been adapted to the customer’s needs
May. 31, 2016 09:30 PM
Just last week a senior Hybris consultant shared the story of a customer engagement on which he was working. This customer had problems, serious problems. We're talking about response times far beyond the most liberal acceptable standard. They were unable to solve the issue in their eCommerce platform - specifically Hybris. Although the eCommerce project was delivered by a system integrator / implementation partner, the vendor still gets involved when things go really wrong. After all, the vendor knows best, right?
So when he started working with this customer his first question was:
Do you have an APM tool in place?
Why? Imagine you are an expert on a highly customizable platform that has been adapted to the customer's needs. Within a very short time you are expected to get a complete overview of a mostly unknown environment in order to solve a pressing issue. So you need information, accurate information, the best information available. Just facts, no rumors or hearsay. It's like when your child gets hurt at the playground. You take them to the hospital, and one of the first things a doctor does is take an X-ray to get a clear image of the injury - perhaps a broken bone.
"Yes, we do have an APM solution," the customer replied. "Good," the expert consultant said. "Let's take a look at this problem in your staging environment." Customer: "It's only monitoring our production environment... and we already tried using it to solve the problem." Expert consultant (confidently): "Oh, okay, then let's work on production data to investigate the problem." He then asked for access.
Looking at production data provides the benefit of using "the real data and the real problem" for investigation, not a problem replicated by a "close to real" test. Don't get me wrong, I'm not saying that performance analysis and diagnosis have to happen in production, but often it's the quicker way to resolve, well, production problems. Preventing these problems from ever hitting production by first using APM best practices is a whole other topic. More on that later.
Soon after, he logged into the customer's "dynamic" monitoring solution, which featured nice dashboards and alerts blinking on violated average response times. He saw an overview of the environment and even identified one specific business-relevant transaction that was extremely slow. What he saw also confirmed the issue the customer was complaining about. The problem was obvious, but the solution wasn't. He needed details. He found database statements that were executed often, but each of them ran fast enough and seemed fine. So, what was making the transaction take so long? A deep investigation of the individual transaction executions would be needed.
Can you export this live data...?
"...so I can take it to our lab for deeper investigation?" the consultant asked. The day had been long and he wanted to analyze the data offline on the commute home. A 45-minute ride should be enough to find the root cause, and he would be in time for family dinner.
"Export production performance data for offline analysis... how would that be possible?" a young and genuinely surprised system administrator asked. "I don't think that's even possible" - and it wasn't possible with their monitoring tool. So the Hybris expert stayed a bit longer and missed his train home, but eventually gave up investigating the problem for the day. Fortunately, he was home in time for dinner, and his wife wasn't angry. Peace at home, but none for the customer, who had to live another day with the persistent problem.
The next day he went back to the customer, eager to solve the issue. It hadn't gone away overnight, and the analysis was still hindered by missing facts - facts that the customer's APM solution couldn't identify and report.