Picture this. One day, you’re working at your desk in the office as usual (remember offices? :-O ). Your boss, Monica, walks up excitedly and breathlessly says, “Guess what? We’re gonna move our flagship app to… THE CLOUD!” (Cue ominous thunderclaps. Or celebratory violins. Or both.)
You quickly grab a cup of coffee (a big one), and join your boss and the rest of the team in the meeting room, where you are treated to a long litany of benefits supposedly achieved by moving directly to the cloud: Automatic scaling. Easier management. Continuous uptime. Lower costs. Someone else responsible for infrastructure and security. “And the best thing,” she wraps up, “it’s so easy to just deploy our code to cloud platforms, and the whole thing is transparent!”
Do you want to guess what happened next?
Well, I’m sure some of you don’t have to guess, you can remember. Vividly.
Anyway, since this blog is about cloud security, we will skip over the weeks of frustration, planning, retrofitting, and finally arriving at a solution that is technically IN the cloud, but barely achieves the listed benefits… But let’s talk about The Incident. You know the one I mean, the one that’s only mentioned in hushed tones, often with nervous giggles. The one that is brought up every time anyone attempts a new migration plan to modern platforms, and why no one is allowed to mention the newspaper clippings stuck to Monica’s wall…
Many organizations have a similar story, and each one has different specifics for The Incident. While there are many possible flaws, with varying impacts, the common thread for all of them is not adapting the system architecture to the completely different environment. Cloud-based applications have a different threat model from “classic”, on-premises systems, with different trust boundaries, unexpected entry points, and implicit dependencies.
Your systems now share the same hardware as any number of nameless companies, and while there are no real stories of substantial breaches of tenant isolation, you don’t really know who is next door, or what they are doing. For that matter, you must have taken a real leap of faith to be able to trust every employee of the cloud provider! And even if we could rely on them, we’re still throwing all of our corporate jewels up on to the public internet – even if we tried to encrypt them, the keys are all right there for all to see, right?
While there definitely are plenty of benefits to be achieved by moving to a cloud platform, it is not at all, as your fictional boss Monica claimed, transparent.
There are several aspects you really should consider when designing a cloud-native application, to account for the different architecture. At the very least, the infrastructure stack and operational model are completely transformed. Hopefully, you’ll be taking advantage of the particular services available from your provider. But even if you don’t, your application’s risk profile is significantly altered – and it’s never going back.
So, what are the specific considerations for a starter level of security for a basic application heading to the cloud?
The first thing worth discussing is the mutual responsibility you now share with the cloud provider. Just because you no longer need to update your operating system and install patches doesn’t mean it’s not being done – it is now your platform provider’s concern, as is any other underlying software. Likewise, the provider is also responsible for infrastructure security, such as networking and DNS. They will usually also be responsible for availability, backups, migrations, and physical security, to name just a few. However, you may still have some platform configuration of your own in those areas as well.
In particular, pay special attention to access controls on the cloud services themselves. I’m sure you’ve heard plenty of horror stories of organizations that were surprised by sensitive data being stolen from open S3 buckets, so I’d rather not add to your anxiety – but do make sure your buckets are locked down! And not just your buckets, you should ensure every cloud service and cloud resource in your account is similarly restricted, according to the “Least Privilege” principle. Typically, there are quite a few active services in use, even if your application isn’t using them directly.
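As a sketch of what “locking down your buckets” looks like in practice, AWS lets you block public access at the bucket (or account) level, overriding any permissive ACLs or policies added later. The configuration below (the bucket name is supplied separately) is the shape you would pass to `aws s3api put-public-access-block`:

```json
{
  "BlockPublicAcls": true,
  "IgnorePublicAcls": true,
  "BlockPublicPolicy": true,
  "RestrictPublicBuckets": true
}
```

With all four flags enabled, a misconfigured object ACL or an overly broad bucket policy can no longer expose the data publicly by accident.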
This is especially relevant for management console access. Platforms such as AWS do give you very fine-grained control over who has access to each part of the management interface, including network access control – for example, limiting the configuration interface to specific IP addresses only, and requiring two-factor authentication (2FA). But you need to go through and configure what each user can do – and review that, periodically. You’ll want to leverage your platform’s IAM (Identity and Access Management) service as much as possible; there will be a lot of functionality and flexibility there, with hundreds (or thousands) of possible permissions, roles, and various policies governing identity lifetimes and what each identity can do.
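To make that concrete, here is a minimal sketch of an AWS IAM policy that denies access from outside a given network range and denies anything done without MFA. The IP range is a placeholder for your office network; note that blanket IP denies can also break calls that AWS services make on your behalf, so test carefully before rolling out:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyOutsideOfficeNetwork",
      "Effect": "Deny",
      "Action": "*",
      "Resource": "*",
      "Condition": {
        "NotIpAddress": { "aws:SourceIp": "203.0.113.0/24" }
      }
    },
    {
      "Sid": "DenyWithoutMFA",
      "Effect": "Deny",
      "Action": "*",
      "Resource": "*",
      "Condition": {
        "BoolIfExists": { "aws:MultiFactorAuthPresent": "false" }
      }
    }
  ]
}
```

Explicit `Deny` statements win over any `Allow`, which is why guardrails like these are usually written as denies rather than narrow allows.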
It is a bit trickier to granularly control which component can use which resource with which service, but it is nonetheless very important to do so as much as possible.
This will help mitigate most potential vulnerabilities that might be discovered in your code in the future, by constraining the “blast zone” of a vulnerability – once again, by following the principle of least privilege: not just which resources a component should access, but also what it can do with them. You will also need to ensure your application and components are properly configured for strong authentication.
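For instance, a component that only ever reads order records should get a role scoped to exactly that. The sketch below is an AWS IAM policy for a hypothetical execution role; the table name, region, and account ID are made-up placeholders:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "OrdersTableReadOnly",
      "Effect": "Allow",
      "Action": ["dynamodb:GetItem", "dynamodb:Query"],
      "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/Orders"
    }
  ]
}
```

If an attacker compromises this component, they can read one table – they can’t delete it, write to it, or touch anything else in the account.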
And this brings us to data security… Of course, all the network traffic between your components, services, and users should be secured and encrypted, and there’s no reason not to use TLS all around. You’ll need to set up PKI certificates to be able to enforce HTTPS, though you might be able to get that done with just the flip of an administrative switch.
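Enforcing encryption in transit can often be done in policy, too. As one sketch, this S3 bucket policy (bucket name is a placeholder) denies any request that arrives over plain HTTP, using the `aws:SecureTransport` condition key:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyInsecureTransport",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::my-example-bucket",
        "arn:aws:s3:::my-example-bucket/*"
      ],
      "Condition": { "Bool": { "aws:SecureTransport": "false" } }
    }
  ]
}
```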
But you’ll probably also have a bunch of sensitive data that you’ll want to keep protected even in storage. It’s easy enough to use standard encryption, but how can you protect the encryption keys while running on shared infrastructure? Luckily, most platforms provide a dedicated service just for this – for example, Azure’s Key Vault, or KMS (Key Management Service) on AWS. These hosted services provide, unsurprisingly, key management APIs, and provision access to them as needed. This allows you to offload responsibility for protecting encryption keys and other secrets onto the cloud provider – and they do a really good job of it. You can even use this to protect configuration secrets, such as database credentials and API keys, using services such as AWS’ Secrets Manager.
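In application code, the usual pattern is to fetch secrets at runtime instead of baking them into config files, and to cache them briefly so a rotated secret is picked up without hammering the API. Here’s a minimal, hedged sketch of that pattern; in production the `fetch` callable would wrap the real Secrets Manager call (roughly `boto3.client("secretsmanager").get_secret_value(SecretId=name)["SecretString"]`), but a hypothetical in-memory backend stands in here so the sketch runs anywhere:

```python
import json
import time

# Hypothetical stand-in for AWS Secrets Manager, so this sketch is runnable
# without cloud credentials. In production, replace with a boto3 call.
_FAKE_BACKEND = {"prod/db": json.dumps({"user": "app", "password": "s3cret"})}

_cache = {}
TTL = 300  # seconds; re-fetch after the TTL so rotated secrets get picked up


def get_secret(name, fetch=_FAKE_BACKEND.get, now=time.monotonic):
    """Return the parsed secret, caching it for TTL seconds."""
    entry = _cache.get(name)
    if entry and now() - entry[0] < TTL:
        return entry[1]  # still fresh: serve from cache
    value = json.loads(fetch(name))  # fetch and parse the secret JSON
    _cache[name] = (now(), value)
    return value


creds = get_secret("prod/db")
print(creds["user"])  # → app
```

The key point: the secret lives only in the provider’s vault and in short-lived process memory – never in your repository or deployment artifacts.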
Now, at this point you’re probably realizing how many different services you can leverage to support your application – and how many of those you really have no choice about. Not only do you need to manage the configuration for each of these, you’ll probably also want to track and monitor the actions of each one. Since your application should also be generating some form of user activity tracking, wouldn’t it be great if there were some kind of audit trail cloud service? Yes it would! And there is! Make sure to set up and configure audit logging, and protect those logs as well. Pay attention to the particulars, as some services log the cloud service activity, while others are designed for user activity logging. Either way, you’ll want some kind of monitoring, with alerts configured on specific events – for example, if your services are running far above typical costs, you may be undergoing a FinDoS (financial denial of service) attack, which could wind up costing you much more than the whole system is even worth.
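As one hedged example of such an alert, AWS can alarm on the account’s estimated-charges billing metric (you must first enable billing alerts in the account settings, and the metric is published in us-east-1). Something along these lines could be fed to `aws cloudwatch put-metric-alarm --cli-input-json`; the threshold, account ID, and SNS topic are placeholders:

```json
{
  "AlarmName": "monthly-spend-guardrail",
  "Namespace": "AWS/Billing",
  "MetricName": "EstimatedCharges",
  "Dimensions": [{ "Name": "Currency", "Value": "USD" }],
  "Statistic": "Maximum",
  "Period": 21600,
  "EvaluationPeriods": 1,
  "Threshold": 500.0,
  "ComparisonOperator": "GreaterThanThreshold",
  "AlarmActions": ["arn:aws:sns:us-east-1:123456789012:billing-alerts"]
}
```

A cheap alarm like this turns a FinDoS surprise at the end of the month into a page within hours.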
Of course, none of the security requirements from the “old” architecture went away either! You will still need secure coding, authentication and authorization controls, data validation, protection against injections, a process to secure and update all application dependencies, and everything else that we still haven’t even got a full handle on. Implementing a WAF (Web Application Firewall) over cloud services can be especially tricky… But one area in particular deserves special mention. If you are deploying code to FaaS (Function as a Service), such as AWS Lambda or Azure Functions (or even GCP Functions, we’re not judging), there is a tendency to treat it as back-end code. Call into it, it does its job, end of story. Except that in reality, it is as much an entry point as your user-facing APIs, and must be treated as such. That includes authentication and authorization, of course, but also input validation inside the function.
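To illustrate, here is a minimal sketch of an AWS Lambda-style handler that does both checks inside the function, rather than trusting whatever sits in front of it. The event shape loosely follows what API Gateway passes to a Lambda proxy integration; the username rule is just an illustrative placeholder:

```python
import json
import re

# Illustrative allow-list rule: 3-32 chars of lowercase letters, digits, _ or -
USERNAME_RE = re.compile(r"^[a-z0-9_-]{3,32}$")


def handler(event, context=None):
    # Authorization: refuse anything the upstream authorizer didn't vouch for.
    claims = (event.get("requestContext") or {}).get("authorizer") or {}
    if "principalId" not in claims:
        return {"statusCode": 401, "body": json.dumps({"error": "unauthenticated"})}

    # Input validation: never trust the payload just because the call
    # came "from inside" – the function is an entry point in its own right.
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": json.dumps({"error": "malformed JSON"})}

    username = body.get("username", "")
    if not USERNAME_RE.fullmatch(username):
        return {"statusCode": 400, "body": json.dumps({"error": "invalid username"})}

    return {"statusCode": 200, "body": json.dumps({"hello": username})}
```

The function fails closed: anything unauthenticated or malformed is rejected before any business logic runs.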
Lastly, you’ll still want to include all of this in your regular SDLC (Software Development Life Cycle) process, including running scans on all your code and components. Just make sure you get official authorization from the service provider before scanning their services or running penetration tests on live systems! You wouldn’t want to end up as a headline, just because you tried doing your job.
So, to sum up: moving your application to a cloud platform will likely provide substantial benefits, but you’ll be much happier if you spend some time designing your system to take advantage of the new architecture, and not just suffer through it. Almost all of the traditional risks still exist, and you’ll still need to build your application securely. However, there are also some new risks to consider, and the platform provides plenty of increasingly powerful services to give you greater control over your security. Automate everything, definitely.
Action items:
- Configure all open services, and shut down unneeded services
- Restrict access to the management console, including IP restrictions
- Set up IAM policies, including account lifetime, roles, and 2FA requirements
- Set granular access for each component on each resource it needs
- Configure all services for HTTPS only
- Store encryption keys in the platform’s key store (e.g. KMS or Key Vault)
- Protect configuration secrets using the platform’s protected configuration service (such as Secrets Manager)
- Log service operations and user activity, and set up monitoring alerts accordingly
- Enforce authentication and access control on FaaS (including AWS Lambda), and implement data validation in each function
- Review and scan all services, applications, and serverless functions – only with authorization
- Automate and monitor all the above