The future of cloud application observability has arrived.

OpsCruise, a company from The Fabric, emerges from stealth to provide a comprehensive Kubernetes-centric monitoring solution and a fresh approach to ensuring the performance of cloud applications.

Today, we are excited to announce that OpsCruise is emerging from stealth to make its public presence known.

For more than a year, we’ve been busy behind the scenes with our early customers and partners, building a product to make the lives of DevOps and ITOps engineers easier.

We founded OpsCruise because we saw first-hand the new challenges in observability for ensuring the health and performance of modern, dynamic cloud applications being orchestrated by Kubernetes in production. The old method of deciding what telemetry to collect, continually setting and tuning thresholds, sifting through thousands of symptomatic alerts, viewing hundreds of metrics graphs and running cross-functional war rooms to isolate problems won’t scale in these dynamic environments. Legacy monitoring tools whose architectures and workflows pre-date this cloud-native era aren’t up to the task.

OpsCruise’s mission is to automate the performance assurance of cloud applications using a model-driven platform that leverages the CNCF open source monitoring stack. Five main principles guide our work:

  • Automate the Monitoring Operations Lifecycle. In legacy monitoring systems and processes, many steps (e.g., discovery, choosing which metrics to monitor, setting thresholds, kicking off fault isolation, escalating incidents and taking corrective actions) are manual. OpsCruise automates the majority of these steps in a comprehensive fashion for Kubernetes-orchestrated applications.
  • Causation over Correlation. Sneezing may be correlated with your cold, but might be caused by your allergies. Troubleshooting real-time performance issues requires more than correlating metrics or logs in the same time and space. OpsCruise’s patent-pending AI/ML engine, Cruise Control, leverages knowledge of the application structure through topology walks that can pinpoint problem sources far down the dependency chain.
  • Democratize Telemetry Data. In the cloud-native era, you shouldn’t have to pay a vendor simply to aggregate and visualize your applications’ instrumentation (e.g., logs, metrics and traces). Further, that same instrumentation can serve many use cases beyond monitoring, including security, chargeback and business operations. OpsCruise is architected atop the CNCF open source monitoring tools, including K8s, Prometheus, Fluentd and Jaeger, as well as open standards such as eBPF. In an April 2020 report:

Gartner estimates by 2025, 50% of new cloud-native application monitoring will use open-source instrumentation instead of vendor-specific agents for improved interoperability, up from 5% in 2019.

  • Don’t Be Intrusive. An observability solution should do just that – observe. Unfortunately, most legacy solutions are heavyweight – they require proprietary agents on every host that consume significant resources themselves, they require development teams to instrument their code, or they require back-end infrastructure equal to that of the application they are actually monitoring. OpsCruise operates in the control plane with no impact on applications.
  • Get on the Path to Autonomous Ops. At the end of the day, Ops teams want to avoid spending time in war rooms and being reactive. Only an observability system that can predict problems as they emerge, and provide insight into understanding, isolating and correcting them in a closed-loop manner, can enable that. OpsCruise’s model-based predictive approach is designed to get you on a path to autonomous operations.

And while it’s still early days, our approach has already had a significant impact on the operations of our customers – allowing them to increase their application velocity without a commensurate increase in their SRE staff and enabling them to allocate and size cloud resources more efficiently. What’s more, we have reduced the number of times that a war room is needed, because we’ve predicted performance issues before customers were impacted and have been able to isolate the faults. Comments from some of our early customers:

“OpsCruise’s unique ML-driven application visibility and automated root cause analysis is ideal for issue detection and resolution needed in any K8s environment and existing open monitoring framework.”
– Alak Deb, Chief Cloud Architect, A10

“At Bitovi, we work with organizations to reduce the burdens of developers and SREs through automation. OpsCruise’s real-time anomaly detection and alerting provides a foundation for auto remediation which will ultimately eliminate the war room.”
– Mick McGrath, DevOps Director, Bitovi

Further, since OpsCruise has been designed around, and has contributed to, the CNCF open source monitoring stack, we’ve presented at CNCF events, including a keynote at last November’s KubeCon in San Diego. We also received recognition from some of the projects’ creators.

“OpsCruise is a promising and novel approach for integrating information from both Prometheus and Kubernetes to help visualize the environment and automate production troubleshooting.”
– Julius Volz, Creator, prometheus.io

We would especially like to thank our early customers for taking a bet on us, working with us in the trenches and helping us build something that you’ll truly use and recommend to your friends.

And finally, we’re grateful to our co-creation partner and seed investor, The Fabric, for their hands-on guidance, to our advisory board for their industry connections, and to our employees, who have worked tirelessly to get us to this point.

If you want to learn more about our story, read the Monitoring Manifesto or visit Opscruise.com.

Looking forward,

Aloke, Scott & Shridhar