Managed Service Providers (MSPs) are increasingly monitoring and troubleshooting applications and services for their end customers instead of just watching the health of individual routers and servers. Because applications rarely live on a single server anymore, isolating the root cause of a slow application or IT service is far more complex than spotting a flashing red icon for a failed router.
NetFlow and packet data have always been useful for troubleshooting once a fault or performance management product has raised an initial alarm. Analyzing the network data is the best way to determine the cause of problems between the different components of a distributed service. NetFlow and similar packet tools are widely used in the enterprise, mainly by the network department.
However, the expertise required to run these tools remains a key hurdle to the adoption of packet-based solutions, even in the enterprise. Switching from an Event Manager to a NetFlow console in the NOC is not for the faint of heart. An integrated NetFlow engine with a uniform user interface removes this major hurdle to NetFlow adoption.
Using NetFlow data in MSP environments brings additional complexity because of multi-tenant requirements. The NetFlow solution itself should support distributed collectors behind firewalls, with a centralized configuration console for all of them. The reporting and analysis engine should be able to correlate each customer with the correct NetFlow collector. The Event Manager must allow drill-down from an end customer's event to the correct NetFlow data for that customer.
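The customer-to-collector correlation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Collector`, `Event`, and `CollectorRegistry` names, their fields, and the shape of the drill-down query are all hypothetical and do not correspond to any particular product's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Collector:
    """A flow collector deployed behind one customer's firewall (hypothetical model)."""
    collector_id: str
    customer_id: str
    address: str


@dataclass(frozen=True)
class Event:
    """An alarm raised by the Event Manager for one customer's device (hypothetical model)."""
    customer_id: str
    device_ip: str
    start_ts: int  # epoch seconds
    end_ts: int    # epoch seconds


class CollectorRegistry:
    """Central registry that maps each customer to its own collectors."""

    def __init__(self):
        self._by_customer = {}

    def register(self, collector):
        self._by_customer.setdefault(collector.customer_id, []).append(collector)

    def collectors_for(self, customer_id):
        return list(self._by_customer.get(customer_id, []))


def drill_down_queries(registry, event):
    """Build one flow query per collector owned by the event's customer.

    Only collectors registered for event.customer_id are consulted, so one
    tenant's event can never be resolved against another tenant's flow data.
    """
    collectors = registry.collectors_for(event.customer_id)
    if not collectors:
        raise LookupError(f"no collector registered for customer {event.customer_id!r}")
    return [
        {
            "collector_id": c.collector_id,
            "collector_address": c.address,
            "device_ip": event.device_ip,
            "start_ts": event.start_ts,
            "end_ts": event.end_ts,
        }
        for c in collectors
    ]
```

Keying the lookup on the customer identifier carried by the event, rather than on the device IP, matters in a multi-tenant setup because different customers often reuse the same private address ranges.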
Presenting a unified dashboard of performance metrics and packet-level metrics increases the usability of both data types. Contextual analytics and presentation of data from multiple sources are invaluable to IT Operations in troubleshooting poor application performance and user satisfaction. As a simple example, either type of metric could indicate that a database is responding slowly to queries. The application performance metrics would show that the buffers are starved because the number of transactions is abnormally high. Integrating the NetFlow or packet data would allow immediate drill-down to isolate which client IP address is the source of the high number of queries.
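As a rough sketch of that drill-down, the snippet below ranks client IP addresses by how many flows they opened to the database server. It assumes flow records have already been exported and parsed into dictionaries with `src_ip`, `dst_ip`, and `dst_port` keys; those field names and the `top_clients` helper are illustrative, and flow count is used here only as a proxy for query volume.

```python
from collections import Counter


def top_clients(flows, server_ip, server_port, n=3):
    """Return the n client IPs with the most flows to server_ip:server_port.

    `flows` is an iterable of dicts with 'src_ip', 'dst_ip' and 'dst_port'
    keys (an assumed, simplified record layout).
    """
    counts = Counter()
    for flow in flows:
        if flow["dst_ip"] == server_ip and flow["dst_port"] == server_port:
            counts[flow["src_ip"]] += 1
    return counts.most_common(n)
```

In practice the same grouping could be weighted by packets or bytes per flow instead of a simple flow count, depending on which fields the collector retains.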
Because new distributed, virtualized datacenters are so complex, new-generation MSP monitoring solutions need to scale and to analyze very large datasets in real time to give IT meaningful results. Demand for such analytical products is growing as more enterprises migrate to hybrid cloud environments, where downtime or degraded performance is no longer an option.
Vikas Aggarwal is CEO of Zyrion Inc., a leading provider of Cloud and Network Monitoring software for large enterprises and Managed Service Providers. You can read more about Zyrion's MSP monitoring solution here.