Why are Amazon S3 breaches on the rise… And what can we do about it?

Applications are increasingly being architected to run on public cloud environments, with multiple application components (and micro-components) interacting with each other through APIs. These applications also leverage a growing number of public cloud services that span a wide range, including storage, computing, networking, database, analytics, identity, deployment, management, mobile, developer tools, and more. In fact, Amazon alone offers over 200 such services for application developers.

One of the most common services used by application developers today is object storage, which is used to store and retrieve data at any time, from anywhere on the Internet, either through a web interface or, more commonly, programmatically through APIs provided by the public cloud. On Amazon Web Services (AWS) this is called Simple Storage Service, or S3; Microsoft Azure has its own object storage service called Azure Blob Storage, and Google Cloud offers Cloud Storage. Although I talk primarily about AWS S3 for the rest of this article, the discussion applies equally to other clouds and shared services.

As with any cloud service, and especially those that store data, there is a huge amount of security and compliance risk, because these services contain sensitive data (such as PII, personally identifiable information, or SPI, sensitive personal information) that can be programmatically accessed by other applications using APIs. Remember that nothing in AWS S3 is a hidden resource: every bucket and object is addressable by URL. S3 buckets must therefore be continuously monitored and managed to avoid data breaches and to ensure that enterprise security, governance, and compliance standards are being met. However, as we have seen recently, breaches and attacks on these data stores continue, and they are clearly on the rise.

In 2019, there were 7,098 data breaches and 15.1 billion records exposed, making it the “worst year” in terms of both the number of breaches and the amount of data breached. This trend is only going to get worse in 2020 as more applications and workloads become distributed and data moves across clouds and environments.

So why are AWS S3 breaches on the rise?

Apart from the fact that AWS S3 buckets are not hidden, there are multiple reasons data gets exposed and exfiltrated, including misconfigurations and human errors. For example, there are numerous scenarios where buckets holding sensitive data are unintentionally exposed to the public, either because an operator adds sensitive data to the wrong bucket or to a public one, or because bucket and object permissions are set incorrectly. In other cases, misconfigured S3 buckets have become easy targets for external attacks, in which hackers view and modify JavaScript files to spread skimmer code across websites.
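
To make the permissions case concrete, here is a minimal sketch of how a bucket policy document could be checked for statements that grant access to everyone. It assumes the policy has already been fetched and parsed into a Python dictionary; the function names, the example policy, the bucket name, and the role ARN are all hypothetical. It also treats any statement that carries a Condition as "not public", which is a simplification, since a condition can still be very permissive.

```python
def _is_wildcard_principal(principal):
    """Return True if the Principal element matches every caller."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        for value in principal.values():
            values = value if isinstance(value, list) else [value]
            if "*" in values:
                return True
    return False


def public_statements(policy):
    """Return Allow statements open to any principal with no Condition."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a policy may hold one bare statement
        statements = [statements]
    return [
        s for s in statements
        if s.get("Effect") == "Allow"
        and _is_wildcard_principal(s.get("Principal"))
        and "Condition" not in s  # simplification: conditions are not evaluated
    ]


# Hypothetical policy, for illustration only.
EXAMPLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "PublicRead", "Effect": "Allow", "Principal": "*",
         "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Sid": "TeamWrite", "Effect": "Allow",
         "Principal": {"AWS": "arn:aws:iam::111122223333:role/example-role"},
         "Action": "s3:PutObject",
         "Resource": "arn:aws:s3:::example-bucket/*"},
    ],
}

for statement in public_statements(EXAMPLE_POLICY):
    print("Statement open to any principal:", statement.get("Sid"))
```

A check like this only looks at the bucket policy; object ACLs and account-level settings would need their own checks.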

But beyond the usual human errors, there are numerous ways malicious actors can break into buckets. These breaches are hard to detect, and hence to defend against, simply because the perpetrators almost always mimic real user behavior by presenting valid (albeit stolen or misused) credentials. For example, a malicious user with a stolen credential (or perhaps a disgruntled employee misusing their own credential) breaks into and compromises one of the workloads.

Once attackers compromise one workload, they spread laterally (east-west), compromising additional workloads until they reach one that has access to sensitive data buckets. They can then easily exfiltrate data from those buckets. Alternatively, they could spread laterally until they reach a workload that has IAM (identity and access management) privileges, then use those privileges to create public buckets, move data into them, and access it from outside.
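
The lateral-movement chain described above can be reasoned about as a graph problem. The following is only an illustrative sketch: it assumes you already have a map of which workloads can reach which other workloads, plus a set of workloads that hold access to sensitive buckets. The workload names and the function name are hypothetical.

```python
from collections import deque


def exposure_paths(reachable, sensitive_access, start):
    """Breadth-first search from a compromised workload.

    Returns the shortest hop-by-hop path to every workload that has
    access to sensitive buckets and is reachable from `start`.
    """
    paths = {start: [start]}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in reachable.get(current, []):
            if neighbor not in paths:  # first visit is the shortest path
                paths[neighbor] = paths[current] + [neighbor]
                queue.append(neighbor)
    return {w: p for w, p in paths.items() if w in sensitive_access}


# Hypothetical east-west connectivity between workloads.
REACHABLE = {
    "web": ["api"],
    "api": ["billing", "reports"],
    "reports": [],
    "billing": [],
}
SENSITIVE_ACCESS = {"billing"}  # workloads that can read sensitive buckets

for workload, path in exposure_paths(REACHABLE, SENSITIVE_ACCESS, "web").items():
    print(workload, "is exposed via", " -> ".join(path))
```

In this toy example, compromising the web workload is enough to reach the bucket data in two hops, even though the web workload itself has no bucket access.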

All these breaches are very hard to detect, and hence to protect against, except at runtime. Most vendor solutions on the market today focus either on identifying misconfigurations or on checking whether access privileges and roles are too broad. Such solutions can correct some human errors, but it is impossible to prevent all potential breaches just by looking at configurations. This is especially true because it is often unclear what the operator intended when creating a specific configuration or policy, and how that configuration or policy could be misused. In other words, these preventive solutions focus on configuration, policy, or management events, and they do nothing to protect against (i.e., detect and remediate) security data events at runtime.
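
As a rough illustration of what looking at data events at runtime could mean, the sketch below scans access records for object reads made by identities that are not expected to read a given bucket. It assumes the records are dictionaries shaped like CloudTrail S3 data events (eventName, userIdentity.arn, requestParameters.bucketName); the allowlist, bucket name, and ARNs are hypothetical. Note that an allowlist like this would not catch an attacker who is using the stolen credentials of an expected reader.

```python
def unexpected_reads(records, expected_readers):
    """Return (bucket, arn) pairs for GetObject calls by unexpected identities.

    `expected_readers` maps a bucket name to the set of ARNs allowed to
    read it; buckets missing from the map are not checked.
    """
    findings = []
    for record in records:
        if record.get("eventName") != "GetObject":
            continue
        # These fields can be null in a record, hence the `or {}` fallbacks.
        bucket = (record.get("requestParameters") or {}).get("bucketName")
        arn = (record.get("userIdentity") or {}).get("arn")
        if bucket in expected_readers and arn not in expected_readers[bucket]:
            findings.append((bucket, arn))
    return findings


# Hypothetical allowlist and records, for illustration only.
EXPECTED_READERS = {
    "example-sensitive-bucket": {"arn:aws:iam::111122223333:role/billing"},
}
RECORDS = [
    {"eventName": "GetObject",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:role/billing"},
     "requestParameters": {"bucketName": "example-sensitive-bucket"}},
    {"eventName": "GetObject",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:role/web"},
     "requestParameters": {"bucketName": "example-sensitive-bucket"}},
    {"eventName": "ListBuckets", "userIdentity": None,
     "requestParameters": None},
]

for bucket, arn in unexpected_reads(RECORDS, EXPECTED_READERS):
    print("Unexpected read of", bucket, "by", arn)
```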

Cloud security professionals therefore lack visibility into malicious activities such as improper S3 bucket access and the other data breaches mentioned earlier. In fact, a recent study by IBM shows that the average time to detect a breach is 206 days, which means it takes on average more than six months just to detect that an environment has been breached, let alone defend against the breach. Currently, proper detection of sophisticated attacks is lacking, and enterprises need to invest more in monitoring their environments for breaches.

The Concept of ‘Shared Responsibility’

While cloud providers themselves offer a number of security controls and tools, they also have the notion of a ‘shared responsibility’ when it comes to security. This means that they are responsible for the so-called ‘security of the cloud’, that is, the underlying infrastructure, which they do a tremendous job at, but they leave it to application owners to use the controls and tools and to secure their own environments, aka ‘security in the cloud’. More often than not, application owners leave gaps in their security policies and therefore leave their environments exposed to potential access by malicious users. In fact, 73 percent of IT security professionals report that, despite the number of products being used, they lack adequate controls to monitor, filter, and analyze “east-west” traffic across their workloads.

Data breaches are expensive. A single data breach can cost an organization millions of dollars and cause compliance issues. According to an IBM study, the average cost of a data breach is $3.9 million, which includes notification costs, investigation expenses, damage control and repairs, regulatory fines, and lawsuits. It is therefore imperative for security professionals to understand why the solutions and tools available today are not sufficient to address these threats. Why are the security tools, recommendations, and best practices offered by the public cloud service providers also not sufficient to ward off these threats? What can we do to fill this big security gap? These are all valid and pertinent questions we aim to answer in this webinar.