What do recent data breaches tell us about the state of, and need for, application security?

Picture this. A hacker uses a VPN to break into a cloud server (a virtual machine hosted with a tier-1 public cloud provider) of a large financial enterprise through a misconfigured firewall, then executes a small set of commands (an injection attack) that yields credentials. She then uses those credentials to send authenticated requests to the cloud’s storage environment and extracts hundreds of millions of credit card applications and account records (a data exfiltration attack) containing all kinds of sensitive data, such as PII (personally identifiable information) and PI (personal information).

Sound all too familiar and real? That’s because it is a real breach. Interestingly, it is a breach of an enterprise that is well known for taking data security very seriously.

What’s most alarming about the story is that the hacker went undetected, despite all the monitoring and detection tools in place. She had to brag about her adventures on GitHub/Slack for people to take notice, and only after a tip-off did folks realize that the breach had occurred and take remedial action.

Now, the purpose of highlighting this application breach is not to malign any particular enterprise, but to learn from it. All too often we make the mistake of assuming that our environment is safe and not vulnerable to hacks and breaches, when it is actually quite exposed and hacker-friendly. In fact, over the past few years we have seen numerous application breaches at credit reporting agencies, local search services, hotel chains, gaming companies, postal services, and others. Interestingly, many of them were highly unsophisticated. These breaches hurt not only the businesses, through regulatory fines ($700 million in one case) and the cost of damage control, but also the end customers, who have to deal with the aftermath of their sensitive data falling into the wrong hands.

Current State of Application (and API) Security

So why has application security become more important now than ever before, and why have we seen a spate of application data breaches in the recent past?

When you look at enterprise data, the crown jewel that everyone is trying to protect, there are various touch points, or ‘data access doors’ if you will, that lead to it. These range from employees who directly access data using removable media, to end-point and IoT devices, to cloud and SaaS workloads, to legacy web apps, and now to modern, distributed applications. Everywhere, data is being accessed, handled, and most importantly transferred across heterogeneous and distributed environments. There are multiple security solutions to protect the other ‘data access doors’. But when it comes to modern applications, and specifically to application data-in-motion (APIs being a small subset of that), there aren’t comprehensive security solutions that can monitor the environment and protect against bad actors. (As a side note, of late we are seeing the emergence of a few perimeter-based API security solutions.) And with applications evolving rapidly from monolithic to distributed, the number of APIs, or application data-in-motion interactions, has increased exponentially, making the security problem both worse and more urgent.


Given all this, it should come as no surprise that security experts believe that “API is the next big cyber-attack vector” or, put more generically, that application data-in-motion is the next big cyber-attack vector.

Learnings, Best Practices and Recommendations

Let’s face reality. No environment is completely safe and foolproof. So the least we can do is learn from others’ mistakes and better prepare ourselves to protect our environment against these kinds of breaches. Here are, in my opinion, the top 5 learnings from these recent data breaches, along with best practices and recommendations to keep in mind.

1. Manual configuration is prone to human error. Even the best of the best make mistakes when they have to configure devices manually. A case in point is the example breach above, where the hacker leveraged a misconfiguration to penetrate the environment.

Best Practice / Recommendation:

It’s always recommended to do away with manual configuration wherever possible. Oftentimes, solutions require policies to be configured manually, which creates huge overhead and risk, especially when the number of components is unmanageably large, as is the case for cloud-native workloads and environments. So it is advisable to look for solutions in which the system recommends the configurations and policies to put in place, rather than requiring admins to configure them manually.
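As a minimal sketch of what an automated configuration check could look like, the Python snippet below flags firewall rules that expose a sensitive port to every address. The rule format, the rule names, and the `SENSITIVE_PORTS` set are all assumptions made for illustration; real firewalls and cloud providers use their own schemas, and this is not a description of how the breached environment was configured.

```python
import ipaddress

# Assumed set of ports that should never be reachable from the whole internet.
SENSITIVE_PORTS = {22, 3389, 5432}


def find_open_rules(rules):
    """Return names of ingress allow-rules that expose a sensitive port to any address."""
    flagged = []
    for rule in rules:
        if rule["direction"] != "ingress" or rule["action"] != "allow":
            continue
        network = ipaddress.ip_network(rule["source"])
        # A prefix length of 0 means the rule matches every address (0.0.0.0/0 or ::/0).
        if network.prefixlen == 0 and rule["port"] in SENSITIVE_PORTS:
            flagged.append(rule["name"])
    return flagged


# Hypothetical rule set in a made-up format.
rules = [
    {"name": "ssh-from-office", "direction": "ingress", "action": "allow",
     "source": "203.0.113.0/24", "port": 22},
    {"name": "ssh-from-anywhere", "direction": "ingress", "action": "allow",
     "source": "0.0.0.0/0", "port": 22},
    {"name": "https-public", "direction": "ingress", "action": "allow",
     "source": "0.0.0.0/0", "port": 443},
]

print(find_open_rules(rules))
```

A check like this can run on every configuration change, so that a risky rule is caught by a machine instead of depending on a person noticing it.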

2. Perimeter breaches are inevitable. Whether it is through a misconfigured firewall, an issue in the API server, or a vulnerability in the infrastructure (e.g. Kubernetes CVE-2018-1002105), perimeter (aka north-south) breaches are bound to happen. When they do, how well you have protected your internal environment determines whether your data is breached or not.

Best Practice / Recommendation:

Investing in security solutions that focus on east-west and insider attacks, in addition to north-south, is therefore a must. Ensure that your security solutions offer distributed security and policies, and that each workload granularly secures itself in a zero-trust manner. Oftentimes, though, ‘zero-trust’ is confused with just encryption (mutual TLS). What we need to remember is that ‘encrypted’ does not mean ‘secured’. Although encryption raises the bar, hackers will use the encrypted path as the transport to breach applications and data. So it is strongly recommended to invest in security solutions that are not only distributed but also operate deep within the data layer (as opposed to just the network or URL layer).
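To illustrate what a data-layer check might mean in practice, here is a small Python sketch that inspects a decrypted payload for values that look like payment card numbers, using the standard Luhn checksum to reduce false positives. The function names and the sample payload are hypothetical, and a real product would inspect many more data types than this one.

```python
import re

# Runs of 13 to 19 digits are candidate card numbers.
CANDIDATE = re.compile(r"\b\d{13,19}\b")


def luhn_valid(number):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        digit = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(payload):
    """Return digit runs in the payload that pass the Luhn check."""
    return [m for m in CANDIDATE.findall(payload) if luhn_valid(m)]


# Hypothetical API response body; 4111111111111111 is a well-known test number.
payload = '{"order": "1234567890123", "card": "4111111111111111"}'
print(find_card_numbers(payload))
```

The point of the sketch is that this kind of inspection works on the content of the request or response, which mutual TLS alone never looks at.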

3. Modern distributed applications (and APIs) offer a path of least resistance for hackers. As a result, not only are application and API breaches on the rise, but many of the attacks are also quite simple and unsophisticated.

Best Practice / Recommendation:

Whether it is your public APIs to partners, your distributed east-west APIs, or even your egress APIs to third-party vendors, it is strongly recommended to have a comprehensive API security (or, more generically, data-in-motion security) strategy in place.

4. Post-authentication hacks using stolen credentials are quite common. “A scan of billions of files from 13% of all GitHub public repositories over a period of six months has revealed that over 100,000 repos have leaked API tokens and cryptographic keys, with thousands of new repositories leaking new secrets on a daily basis.” Clearly, focus on identity management alone is not enough in a world where such errors are made by novices and experts alike, thereby providing hackers a vehicle to piggyback on authorized sessions and perform data breaches. In the above example breach, the hacker used stolen credentials to get all the sensitive information from the cloud storage.

Best Practice / Recommendation:

While there is a lot of focus on identity and access management, several application attacks, such as parameter tampering, occur post-authentication using stolen credentials. It is, therefore, strongly recommended to invest in security solutions that address post-authentication and authorization breaches, especially those that can detect user account takeover using stolen credentials.
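On the leaked-secrets side of this learning, a rough Python sketch of a pre-commit style scan is shown below. The two patterns (the widely documented `AKIA` prefix format of AWS access key IDs and a PEM private key header) are illustrative only; dedicated scanners use far larger rule sets, and the function name and sample text are assumptions.

```python
import re

# Illustrative patterns only; real scanners carry many more rules.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def scan_text(text):
    """Return (line_number, pattern_name) for every line that matches a pattern."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, name))
    return findings


# Hypothetical file contents; the key below is AWS's documented example value.
sample = "\n".join([
    "region = us-east-1",
    "access_key = AKIAIOSFODNN7EXAMPLE",
    "-----BEGIN RSA PRIVATE KEY-----",
])

print(scan_text(sample))
```

Scanning before code is pushed reduces the chance of credentials leaking in the first place, but it does not replace the post-authentication detection recommended above.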

5. Real-time visibility and detection are key. According to IBM, it takes about 197 days on average (i.e. 6+ months) to identify a breach. This will only get worse with modern applications, which are distributed over clouds and environments. If you look at the example breach above, despite all the monitoring tools in place, it’s possible that the hacker would have gone undetected after stealing hundreds of millions of sensitive records, had she not bragged about it online and on social media four months after she actually perpetrated the breach.

Best Practice / Recommendation:

While there should be an emphasis on protection, it is extremely important to first detect potential threats and breaches in real time. “You can’t protect what you can’t see”, goes the age-old saying in security. Therefore, it is strongly recommended to invest in visibility, discovery and detection tools that first discover your distributed environment (especially the interactions between assets that are in use) and then detect data leaks, attacks, and breaches on those asset interactions in real time.
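As a deliberately simple sketch of detection, the Python snippet below compares how much data each credential moved today against that credential’s own history and raises an alert on a large spike or on a credential with no history at all. The data shapes, the credential names, and the tenfold threshold are assumptions for illustration; production detection would use richer signals than byte counts.

```python
from statistics import median


def flag_unusual_transfers(history, current, factor=10):
    """Flag credentials whose transfer volume today far exceeds their usual volume.

    history: dict mapping credential -> list of past daily byte counts
    current: dict mapping credential -> bytes transferred today
    """
    alerts = []
    for cred, bytes_today in current.items():
        past = history.get(cred, [])
        if not past:
            # A credential never seen before is worth a look on its own.
            alerts.append((cred, "no baseline"))
            continue
        baseline = median(past)
        if bytes_today > factor * baseline:
            alerts.append((cred, "volume spike"))
    return alerts


# Hypothetical usage data.
history = {
    "svc-reports": [120, 100, 140, 110],
    "svc-backup": [5000, 5200, 4900],
}
current = {"svc-reports": 90000, "svc-backup": 5100, "svc-new": 10}

print(flag_unusual_transfers(history, current))
```

Even a crude baseline like this would notice a credential that suddenly pulls far more data than it ever has, which is the pattern in the example breach.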

Cloud and application technologies have evolved from monolithic to multi-tiered to microservices to serverless functions. At the same time, workloads have become smaller and more ephemeral, while the number of workloads and the number of interactions between them have grown exponentially. As a result, data as we know it has increasingly started residing in between workloads (i.e. in motion) rather than inside them (i.e. at rest or in use). Meanwhile, attacks have moved deeper into the data layer. Distributed, deep-data-layer, data-in-motion security is, therefore, the need of the hour for these applications.