Have an automated, auditable secure CI/CD environment where security controls are transparent to developers and users.
To this end, I propose the following:
- Ensure development, QA, and production servers are configured identically but with different passwords.
- Ensure that secret information (API keys, passwords, AWS credentials, private data, PII, etc.) is adequately protected at-rest and in-motion.
- Run scans and log audits periodically to detect misconfigurations, security vulnerabilities, or missing patches.
- Define an escalation process and owners of this process if a potential security event is suspected.
- Develop and communicate processes that accomplish the above bullet points. For example, a requirement might include a document that defines how AWS VPCs should be configured and how access to each VPC should be controlled.
For example, AWS recommends (regarding security) that organizations:
- Restrict access to instances from limited IP ranges using Security Groups
- Disable root API access keys and secret keys
- Password protect the .pem file on user machines
- Delete keys from the authorized keys file on your instances when someone leaves the organization or no longer requires access
- Rotate credentials (DB, Access Keys)
- Regularly run least privilege checks using IAM user Access Advisor and IAM user Last Used Access Keys
- Use bastion hosts to enforce control and visibility
This is a subset of common-sense rules that developers and users on any cloud substrate should be required to follow. If these rules are not defined and communicated, developers and users will often, with no invidious motive, unwittingly create security vulnerabilities in the CI/CD environment.
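As an illustration of how one of these rules (restricting instance access to limited IP ranges) might be checked automatically, here is a minimal sketch. It is not AWS tooling: the input dictionaries only mimic the shape of an EC2 `DescribeSecurityGroups` response (`IpPermissions`, `IpRanges`, `CidrIp`), and the function name `find_open_ingress`, the group IDs, and the sample data are all hypothetical.

```python
from __future__ import annotations

from typing import Any

OPEN_CIDRS = {"0.0.0.0/0", "::/0"}  # "anywhere" for IPv4 and IPv6


def find_open_ingress(security_groups: list[dict[str, Any]]) -> list[tuple[str, Any, str]]:
    """Return (group_id, from_port, cidr) for every ingress rule open to the world."""
    findings = []
    for group in security_groups:
        for perm in group.get("IpPermissions", []):
            cidrs = [r.get("CidrIp") for r in perm.get("IpRanges", [])]
            cidrs += [r.get("CidrIpv6") for r in perm.get("Ipv6Ranges", [])]
            for cidr in cidrs:
                if cidr in OPEN_CIDRS:
                    findings.append((group["GroupId"], perm.get("FromPort"), cidr))
    return findings


# Hypothetical sample data, shaped like a DescribeSecurityGroups response.
sample_groups = [
    {"GroupId": "sg-bastion", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "203.0.113.0/24"}]}]},
    {"GroupId": "sg-legacy", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]},
]

if __name__ == "__main__":
    for group_id, port, cidr in find_open_ingress(sample_groups):
        print(f"{group_id}: port {port} is open to {cidr}")
```

In practice the input would come from the cloud provider's API rather than a literal, and a check like this would run on a schedule so that findings feed the escalation process described earlier.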
One reasonable definition of DevSecOps security hardening is: “The process where we identify the default or existing configuration of a system, the desired secure system end state, and then leverage automation to apply changes that will change the configuration to the desired end state.”
To my mind, this can best be accomplished with Ansible playbooks. An Ansible playbook is a codified security document that describes the desired end state of a system rather than the specific steps needed to reach that state. Systems change, so it is better to define the end state than to rewrite commands each time the system changes. Other advantages of playbooks include:
- Playbooks are written in YAML, which is text-based and relatively simple to understand.
- Playbooks are text files, so Git can be used for version control.
- Playbooks are security infrastructure as code.
- Playbooks are largely made up of roles, reusable units that group related hardening tasks so they can be applied consistently across systems.
- Numerous security playbooks are available as open source.
Instead of trying to guess what can go wrong at each step of the workflow (which seems to me an impossible task), we should define the desired end state and then use automation (Ansible, Chef, SNow, whatever) to: (1) identify when a delta occurs between the desired end state and the final workflow product; (2) identify the point in time at which the error occurred and the error type; and (3) either correct the error or roll back.
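The three steps above can be sketched in a few lines. This is only an illustration of the idea, not how Ansible or Chef work internally. The setting names, `find_deltas`, and `remediate` are hypothetical, and the timestamp records when the drift was detected (knowing when it was introduced would require logs).

```python
from __future__ import annotations

from datetime import datetime, timezone

# Hypothetical desired end state for an SSH daemon.
DESIRED_STATE = {
    "PermitRootLogin": "no",
    "PasswordAuthentication": "no",
    "Protocol": "2",
}


def find_deltas(desired: dict[str, str], actual: dict[str, str]) -> list[dict]:
    """Steps 1 and 2: list each setting that differs, with detection time and error type."""
    detected_at = datetime.now(timezone.utc).isoformat()
    deltas = []
    for key, want in desired.items():
        have = actual.get(key)
        if have != want:
            deltas.append({
                "setting": key,
                "expected": want,
                "found": have,
                "error_type": "missing" if have is None else "mismatch",
                "detected_at": detected_at,
            })
    return deltas


def remediate(actual: dict[str, str], deltas: list[dict]) -> dict[str, str]:
    """Step 3: return a corrected copy of the state; the input is left untouched."""
    corrected = dict(actual)
    for delta in deltas:
        corrected[delta["setting"]] = delta["expected"]
    return corrected


if __name__ == "__main__":
    observed = {"PermitRootLogin": "yes", "PasswordAuthentication": "no"}
    drift = find_deltas(DESIRED_STATE, observed)
    for d in drift:
        print(f"{d['setting']}: expected {d['expected']!r}, "
              f"found {d['found']!r} ({d['error_type']})")
    fixed = remediate(observed, drift)
    print("converged:", find_deltas(DESIRED_STATE, fixed) == [])
```

Because `remediate` returns a copy, the original observed state remains available, which is one simple way to keep a rollback path open.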
A well-known working model for this follows: (1) Auto-scaling of EC2 instances can be enabled when the existing instances may not be able to handle the workload asked of them. This could be due to CPU, storage, RAM, I/O, or many other issues; (2) Thresholds can be set in CloudWatch, for example CPU utilization >= 50% for 5 min; (3) When a threshold is crossed, CloudWatch kicks off events via SQS, SNS or Lambda, which (4) automatically scale out the instances and notify the owner.
So, our desired end state in this case is < 50% CPU utilization. When the actual state does not equal our desired end state, automation kicks in and fixes the problem. Other thresholds and outcomes can be similarly defined.
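The threshold logic in this example can be sketched as follows. This is a local simulation under stated assumptions, not CloudWatch or Auto Scaling code: it assumes one CPU sample per minute, scales out by one instance at a time, and uses the hypothetical names `breaches_threshold` and `desired_capacity` and an arbitrary cap of 10 instances.

```python
from __future__ import annotations

CPU_THRESHOLD = 50.0   # percent, from the example above
PERIODS_REQUIRED = 5   # five consecutive one-minute samples (assumed sampling rate)


def breaches_threshold(samples: list[float],
                       threshold: float = CPU_THRESHOLD,
                       periods: int = PERIODS_REQUIRED) -> bool:
    """True if the most recent `periods` samples are all >= threshold."""
    if len(samples) < periods:
        return False
    return all(s >= threshold for s in samples[-periods:])


def desired_capacity(current: int, samples: list[float], max_instances: int = 10) -> int:
    """Scale out by one instance when the threshold is breached, up to a cap."""
    if breaches_threshold(samples):
        return min(current + 1, max_instances)
    return current


if __name__ == "__main__":
    cpu_per_minute = [35.0, 48.0, 52.0, 61.0, 70.0, 66.0, 58.0]
    print(desired_capacity(2, cpu_per_minute))
```

Requiring every sample in the window to breach the threshold avoids scaling on a single short spike, which matches the "for 5 min" condition in the example.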
Below are some selected tools that will enable us to accomplish the goals set above. Most of these have both community and Enterprise editions. I would recommend that the HashiCorp Vault and Chef InSpec tools be evaluated as Enterprise editions because of their increased functionality and the necessary support. Qualys and Tenable are commercial products. The others are open source.
More important than the tools: We *must* have management buy-in of the processes that are required to ensure identical security in Dev, Test, QA, & Prod environments. Without that buy-in & without codified processes (living as markdown in GitHub is fine), all of the tools in existence will not enable a secure environment.
- HashiCorp Vault: Protect secrets (see above)
- Git-crypt: An alternative to HashiCorp Vault
- Git-secrets: Scan for hard-coded secrets on commit
- Cfn_nag: CloudFormation linter
- Chef InSpec: Specifies and audits compliance, security and policy requirements
- nmap: A general purpose port scanner
- Tenable/Qualys: Vulnerability scanners
- Metasploit: Exploitation framework
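To make the idea of scanning for hard-coded secrets on commit concrete, here is a minimal sketch. It is not the implementation of any tool listed above. The two patterns are assumptions chosen for illustration (real scanners ship much larger rule sets), `scan_text` is a hypothetical name, and the key in the sample is the placeholder value AWS uses in its own documentation.

```python
from __future__ import annotations

import re

# Assumed patterns for illustration only; real scanners use far more rules.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for each line that matches a pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((number, rule))
    return hits


if __name__ == "__main__":
    staged = 'region = "us-east-1"\naws_key = "AKIAIOSFODNN7EXAMPLE"\n'
    for line_no, rule in scan_text(staged):
        print(f"line {line_no}: possible {rule}")
```

A check of this kind would typically be wired into a pre-commit hook so that a non-empty result blocks the commit.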
**AWS** These are native AWS services that can be used to enable secure and auditable environments:
- CloudWatch Logs & logging agent: Monitor, store, and access many types of logs
- CloudTrail logs: Records all API calls
- IAM (users, groups, roles, policies): Manage “least-privilege” access
- VPCs: Network segregation
- Security Groups: Inbound firewall
- Network ACLs (NACLs): Inbound/outbound ACLs
- VPNs: Encrypted access to environment
- Trusted Advisor: Monitor service limits
- IAM user Access Advisor: Monitor user, role access
- Selective encryption of data-at-rest
- API Gateways: Control access with IAM roles
- SNS: Automated messaging to SMS