Hardening Kubernetes: Lessons Learnt from a Customer Cloud Platform Security Assessment

Kubernetes Introduction

Kubernetes has become the preferred way for many organisations to deploy containerised applications because of its flexibility, scalability, and efficiency, but it still brings many security considerations. A recent cyber-security audit of a customer's Cloud Platform, which runs Kubernetes within an AWS ecosystem, provided critical insights into the platform's security posture.

Kubernetes allows applications to be deployed and managed within platform-agnostic containers. A container is similar to a lightweight virtual machine that holds everything your solution needs to run: the application itself, along with the necessary files, libraries, and settings. It packages your software so that it works the same way wherever it is delivered. But even where the orchestration and application code are optimised, there are security considerations which, if overlooked, can lead to Kubernetes environments being compromised.

There are several deployment models that organisations can choose, for example:

  • Single-node Cluster: Mainly used for development and testing, this setup runs all Kubernetes components on a single machine.
  • Multi-node Cluster: In a production environment, Kubernetes is typically deployed across multiple nodes or machines, distributing workloads for better scalability and reliability.
  • On-Premises Deployment: Some organisations choose to deploy Kubernetes on their own hardware within their data centres for greater control and security.
  • Cloud-based Deployment: Due to the ease of adoption and initial deployment many will opt for cloud platforms like AWS, Azure, or Google Cloud, leveraging their managed Kubernetes services (EKS, AKS, GKE) to simplify cluster management.
  • Hybrid Deployment: This approach combines on-premises and cloud deployments.
  • Serverless Deployment: Kubernetes can be used for serverless computing, where the infrastructure is automatically managed, and users only focus on their application code.

ProCheckUp's Approach

ProCheckUp take a detailed and methodical approach aimed at assessing Kubernetes from multiple attack vectors, including:

•    Internal and external testing of endpoints/applications.
•    Cloud configuration review if the environment is hosted using cloud services. 
•    Review of container services and registry configuration controls if they are in use.
•    Evaluation of the Kubernetes server and cluster environment, along with the worker nodes hosting the cluster. 
•    Examination of any pods (a pod is the fundamental unit of deployment in any Kubernetes environment) running on the worker nodes and serving the containerised application(s) within a specific namespace (a way to partition resources within a cluster), along with a review of logging controls and misconfigurations.
•    Review and testing of access controls for the environment.
•    Testing of the networking setup to ensure effective traffic control both within and outside the namespaces and the cluster along with segmentation testing where applicable.
•    Checks for issues that package management tools such as Helm may introduce (a Helm chart pulls together pre-configured settings for Kubernetes resources across multiple YAML files).
•    Review of third-party software integrations. 
•    Assumed breach testing – here ProCheckUp take an attacker's stance, assuming a pod has been breached, to see what an attacker could achieve. The most likely breach point is an externally exposed pod; however, well-placed attackers on adjacent or local networks could find ways to obtain access to other pods.
•    Build review and breakout testing from within pods.
•    Reviewing Secrets management – ensuring these are encrypted at rest and in transit.

The ultimate objective for an attacker targeting a Kubernetes environment is to obtain control of the master node (the control plane). Achieving this level of access allows them to read and modify data across the entire cluster. This is a significant security concern, as it potentially allows them to expand their attack by targeting connected environments, deploy their own malicious workloads, and cause service disruptions.

Lessons Learnt 

The Kubernetes configuration exhibited robust separation between customer environments, underscoring a strong fundamental design. However, identified issues such as outdated images and the absence of image scanning in AWS point towards substantial patch management and vulnerability detection gaps.

ProCheckUp’s security experts found that the cluster administrators' roles were overly expansive, increasing the attack surface from potential account compromises. An absence of cluster logging and the use of default namespaces for resources further complicated the security landscape, providing attackers with potential avenues for gaining leverage within the system.

The detailed findings emphasised the necessity for Kubernetes configurations to evolve beyond their current state, addressing key areas of hardening and optimisation. Kubernetes' inherent complexity demands meticulous attention to role-based access controls, service account token security, and the implementation of best practices in Helm chart configurations.

The assessment revealed that while Kubernetes clusters benefited from a well-configured environment with good customer isolation, several areas required immediate attention. The Kubernetes configurations were not strictly enforcing pod security policies, as evidenced by the use of the default namespace for certain resources and the lack of enforcement of 'restricted' access over 'privileged'. This lapse could potentially allow attackers or malicious users to exploit vulnerabilities within a pod, leading to the compromise of the entire cluster.
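
As an illustration of enforcing 'restricted' over 'privileged', the sketch below builds a Namespace manifest carrying Pod Security Admission labels. It assumes the cluster uses the built-in Pod Security Admission controller (the older PodSecurityPolicy API was removed in Kubernetes 1.25); the namespace name "customer-a" is hypothetical and not taken from the assessment.

```python
import json


def restricted_namespace(name: str) -> dict:
    """Build a Namespace manifest that enforces the 'restricted' Pod Security Standard."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            "labels": {
                # Reject pods that violate the 'restricted' profile.
                "pod-security.kubernetes.io/enforce": "restricted",
                # Also record and surface violations for review.
                "pod-security.kubernetes.io/audit": "restricted",
                "pod-security.kubernetes.io/warn": "restricted",
            },
        },
    }


if __name__ == "__main__":
    # "customer-a" is a hypothetical namespace name used for illustration.
    print(json.dumps(restricted_namespace("customer-a"), indent=2))
```

The JSON output can be saved to a file and applied with kubectl, which accepts JSON as well as YAML manifests.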

Furthermore, the absence of image scanning within the AWS environment indicated a glaring oversight. Without image scanning, outdated images with known vulnerabilities could be deployed, increasing the risk of security breaches. The lack of audit logging within clusters and a substantial number of cluster administrators with potentially expansive roles highlighted further risks, painting a picture of an environment ripe for exploitation through privilege escalation or unauthorised access.

The evaluation also pointed out the absence of network policies that restrict pod-to-pod communication, which could be detrimental in the event of a system component breach, making it easier for attackers to move laterally within the environment and exfiltrate data.

Some Finding Highlights

•    Insecure Role Based Access Controls 
A large number of cluster admins existed, which widened the attack surface and failed to adhere to the separation of duties principle.

It is also a little-known and poorly documented fact that AWS does not provide a direct and straightforward way to determine the individual AWS account that initially configured an Amazon EKS (Elastic Kubernetes Service) cluster. The information about the user or AWS account that created the cluster is not readily available through standard AWS Identity and Access Management (IAM) or EKS APIs. This account has full control over the cluster, so it should be identified and closely managed after cluster creation; if it is breached, the risk may not be fully apparent.

•    Patching and software utilities
The containerised operating system that applications ran on was based on a Docker Debian 10-slim image, which was used on many of the pods. Scans of the images identified many vulnerabilities, spanning critical, high, medium, and low risks, attributable to a lack of patching of the software components. This shows that the choice of images and vulnerability scanning are important to consider and review on an ongoing basis.

1197 vulnerabilities found
──────────────────────────

Image: <snip>.ecr.eu-west-1.amazonaws.com/<snip>

* 7 Critical
* 184 High
* 193 Medium
* 813 Other

Components with most vulnerabilities
────────────────────────────────────

* linux-libc-dev (4.19.304-1) - 27 High, 61 Medium, 1 Low, 67 Negligible, 5 Unknown
* binutils (2.31.1-16) - 89 Negligible
* binutils-common (2.31.1-16) - 89 Negligible
* binutils-x86-64-linux-gnu (2.31.1-16) - 89 Negligible
* libbinutils (2.31.1-16) - 89 Negligible
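
Severity counts like those in the scan summary above could feed a simple deployment gate as part of ongoing review. The following is a minimal sketch only: the `ScanSummary` type, the `passes_gate` function, and the zero-tolerance thresholds are hypothetical and were not part of the assessment; only the four counts come from the scan output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanSummary:
    """Severity counts as reported in an image scan summary."""
    critical: int
    high: int
    medium: int
    other: int

    @property
    def total(self) -> int:
        return self.critical + self.high + self.medium + self.other


def passes_gate(summary: ScanSummary, max_critical: int = 0, max_high: int = 0) -> bool:
    """Return True only if the image stays within the (hypothetical) thresholds."""
    return summary.critical <= max_critical and summary.high <= max_high


# Counts taken from the scan summary of the assessed image.
assessed_image = ScanSummary(critical=7, high=184, medium=193, other=813)

if __name__ == "__main__":
    print(assessed_image.total)         # 1197
    print(passes_gate(assessed_image))  # False
```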

Common utilities were also left on pods, which could enable attackers to conduct further malicious activities. When coupled with weak egress protection and network controls, these give attackers leverage for lateral movement, compilation of their own code, data exfiltration, or obtaining remote shells.

•    Information disclosure 
Secrets handling needs to be carefully considered: secrets should be treated appropriately and encrypted at rest and in transit. Platform-wide secrets were mapped to a common file accessible within multiple pods, which disclosed credentials and other details beyond those needed by each pod. Secrets need to be scoped and managed so that sensitive information is not exposed unnecessarily.

Environment variables were also disclosing details such as Kubernetes IP addresses and database passwords. These are valuable to attackers looking to mount further attacks.
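
One way to give each pod only the details it needs is to mount a single key from a pod-specific Secret as a read-only file, rather than sharing a platform-wide file or passing passwords through environment variables. The sketch below is illustrative only: the pod, image, Secret, and key names are hypothetical, and it assumes the Secret already exists in the same namespace.

```python
import json


def pod_with_scoped_secret(name: str, namespace: str, image: str,
                           secret_name: str, key: str) -> dict:
    """Build a Pod manifest that mounts one key of one Secret as a read-only file."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "containers": [{
                "name": name,
                "image": image,
                "volumeMounts": [{
                    "name": "app-secret",
                    "mountPath": "/etc/app-secret",
                    "readOnly": True,
                }],
            }],
            "volumes": [{
                "name": "app-secret",
                "secret": {
                    "secretName": secret_name,
                    # Project only the key this pod needs, not the whole Secret.
                    "items": [{"key": key, "path": key}],
                    # Owner read-only; JSON has no octal, so 0o400 serialises as 256.
                    "defaultMode": 0o400,
                },
            }],
        },
    }


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    manifest = pod_with_scoped_secret(
        "orders", "customer-a", "registry.example.com/orders:1.0",
        "orders-db", "password")
    print(json.dumps(manifest, indent=2))
```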

•    Use of IMDS
Use of the Instance Metadata Service (IMDS) on EC2 instances within the AWS environment allows access tokens to be obtained via the special interface. This is a commonly overlooked feature which effectively bypasses the normal namespace restrictions and allows communication with the worker nodes, resulting in information disclosure. Depending on the configuration, it can potentially lead to further lateral movement into the AWS environment.
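
A common way to stop pods reaching IMDS is an egress NetworkPolicy that allows general outbound traffic but excludes the metadata address. The sketch below assumes the cluster's network plugin enforces NetworkPolicy objects and covers only the IPv4 IMDS address (169.254.169.254); the namespace name is hypothetical.

```python
import json

# IPv4 link-local address of the EC2 Instance Metadata Service.
IMDS_CIDR = "169.254.169.254/32"


def block_imds_policy(namespace: str) -> dict:
    """Build a NetworkPolicy allowing all IPv4 egress except the IMDS address."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "deny-imds-egress", "namespace": namespace},
        "spec": {
            # An empty selector applies the policy to every pod in the namespace.
            "podSelector": {},
            "policyTypes": ["Egress"],
            "egress": [{
                "to": [{
                    "ipBlock": {"cidr": "0.0.0.0/0", "except": [IMDS_CIDR]},
                }],
            }],
        },
    }


if __name__ == "__main__":
    # "customer-a" is a hypothetical namespace name.
    print(json.dumps(block_imds_policy("customer-a"), indent=2))
```

In practice this rule would usually be combined with tighter egress rules, since it otherwise leaves all other IPv4 destinations reachable.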

Vulnerability Remediation Strategies:

The assessment highlighted several vulnerabilities within the Kubernetes setup that require immediate action. These include the use of outdated third-party software and the lack of image scanning within the AWS environment. Addressing these issues starts with instituting regular update schedules and utilising tools for scanning and patching vulnerabilities. Ensuring that software components are up to date is critical in protecting against known security threats.

Moreover, it is imperative to restrict the automounting of service account tokens within pods. By default, Kubernetes automounts these tokens, which could be leveraged by attackers to escalate privileges or move laterally within the cluster. The recommendation is to mount these tokens only when necessary and to explicitly disable automount for pods that do not require it.
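
The sketch below shows the two places where automounting can be explicitly disabled: on the ServiceAccount and on the individual pod spec. The resource names and image are hypothetical; the `automountServiceAccountToken` field is part of the core Kubernetes API.

```python
import json


def service_account_without_token(name: str, namespace: str) -> dict:
    """ServiceAccount whose token is not automounted into pods that use it."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {"name": name, "namespace": namespace},
        "automountServiceAccountToken": False,
    }


def pod_without_token(name: str, namespace: str, image: str,
                      service_account: str) -> dict:
    """Pod that opts out of token automounting at the pod level."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "serviceAccountName": service_account,
            # The pod-level setting takes precedence over the ServiceAccount's.
            "automountServiceAccountToken": False,
            "containers": [{"name": name, "image": image}],
        },
    }


if __name__ == "__main__":
    # Hypothetical names used purely for illustration.
    sa = service_account_without_token("web", "customer-a")
    pod = pod_without_token("web", "customer-a",
                            "registry.example.com/web:1.0", "web")
    print(json.dumps([sa, pod], indent=2))
```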

Kubernetes Hardening Techniques:

Cluster administrators must implement robust role-based access controls (RBAC) that limit permissions to the minimum required for operation, adhering to the principle of least privilege. This minimises the potential attack surface and reduces the risk of unauthorised access.
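
As a sketch of least privilege in practice, the example below builds a namespaced Role that can only read pods, and a RoleBinding granting it to one user, instead of handing out cluster-admin. The role name, binding name, namespace, and user are hypothetical.

```python
import json

RBAC_GROUP = "rbac.authorization.k8s.io"


def pod_reader_role(namespace: str) -> dict:
    """Role limited to read-only access to pods in a single namespace."""
    return {
        "apiVersion": f"{RBAC_GROUP}/v1",
        "kind": "Role",
        "metadata": {"name": "pod-reader", "namespace": namespace},
        "rules": [{
            "apiGroups": [""],  # "" is the core API group
            "resources": ["pods"],
            "verbs": ["get", "list", "watch"],
        }],
    }


def bind_pod_reader(namespace: str, user: str) -> dict:
    """RoleBinding granting the pod-reader Role to one named user."""
    return {
        "apiVersion": f"{RBAC_GROUP}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": "read-pods", "namespace": namespace},
        "subjects": [{"kind": "User", "name": user, "apiGroup": RBAC_GROUP}],
        "roleRef": {"kind": "Role", "name": "pod-reader", "apiGroup": RBAC_GROUP},
    }


if __name__ == "__main__":
    # "customer-a" and "jane" are hypothetical placeholders.
    print(json.dumps([pod_reader_role("customer-a"),
                      bind_pod_reader("customer-a", "jane")], indent=2))
```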

Additionally, deploying resources in namespaces other than the default and applying network policies to restrict pod-to-pod communication are essential steps in hardening Kubernetes clusters against attacks.
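
A typical starting point for restricting pod-to-pod communication is a default-deny NetworkPolicy in each non-default namespace, with specific allow rules layered on top. The sketch below assumes a network plugin that enforces NetworkPolicy objects; the namespace name is hypothetical.

```python
import json


def default_deny_policy(namespace: str) -> dict:
    """NetworkPolicy that denies all ingress and egress for every pod in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # Empty selector: applies to all pods in the namespace.
            "podSelector": {},
            # Listing both types with no rules denies traffic in both directions.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


if __name__ == "__main__":
    # "customer-a" is a hypothetical, non-default namespace.
    print(json.dumps(default_deny_policy("customer-a"), indent=2))
```

Because this also blocks DNS lookups, additional policies that allow the required traffic (for example, egress to the cluster DNS service) would normally accompany it.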

As we delved further into the security assessment of the customer's Cloud Platform, it became clear that continuous monitoring and automated scanning are not optional but essential components of a robust cybersecurity strategy, especially when managing Kubernetes clusters.

Continuous Monitoring and Automated Scanning:

The absence of auditing capabilities and the lack of a metrics server within the cluster, as noted in the assessment, present a risk that security incidents go undetected. Implementing auditing according to CIS benchmarks and enabling a metrics server can give an organization the tools to actively monitor the health and security of its Kubernetes clusters.

Automated image scanning, which was notably absent within the AWS environment of the customer's Cloud Platform, is another critical control that should be put in place. Regular automated scanning, or at least semi-regular manual scanning, is essential for identifying risks within containers and the underlying images. Without these scans, vulnerabilities could go unnoticed and unaddressed, leaving the system open to exploitation.
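
On EKS, both controls can be switched on from the AWS CLI. The sketch below only assembles the commands so they can be reviewed before being run; the cluster and repository names are hypothetical, and region and profile options are omitted.

```python
import json
import shlex
from typing import List


def enable_audit_logging_command(cluster: str) -> List[str]:
    """AWS CLI arguments that enable EKS control plane audit logging."""
    logging_config = {"clusterLogging": [{"types": ["audit"], "enabled": True}]}
    return ["aws", "eks", "update-cluster-config",
            "--name", cluster,
            "--logging", json.dumps(logging_config)]


def enable_scan_on_push_command(repository: str) -> List[str]:
    """AWS CLI arguments that enable scan on push for one ECR repository."""
    return ["aws", "ecr", "put-image-scanning-configuration",
            "--repository-name", repository,
            "--image-scanning-configuration", "scanOnPush=true"]


if __name__ == "__main__":
    # "platform-cluster" and "customer/app" are hypothetical names.
    print(shlex.join(enable_audit_logging_command("platform-cluster")))
    print(shlex.join(enable_scan_on_push_command("customer/app")))
```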

Remediation and Proactive Measures:

To mitigate vulnerabilities and ensure a proactive approach to cluster security, the following steps are recommended:
1.    Enable auditing for the cluster as per CIS benchmark guidelines.
2.    Implement a metrics server within the cluster to provide detailed analysis of pod statuses and resource utilization.
3.    Utilise AWS ECR for enhanced image scanning capabilities and enable scan on push configurations to ensure that images are scanned for vulnerabilities as they are uploaded.
4.    Consider open-source tools like Kubescape or Trivy for additional scanning and monitoring capabilities within the Kubernetes cluster.
5.    Ensure that all communications within the cluster use secure options such as HTTPS to prevent data interception and ensure data integrity and confidentiality during transmission. Avoid the use of weak TLS versions and ciphers and authentication mechanisms that transmit credentials in clear text.
6.    Deny access to AWS EC2 metadata from Pods: Implement network rules to restrict access to AWS EC2 metadata from the pods, preventing potential exploitation of the metadata service by malicious actors.
7.    Restrict network traffic within Customer Namespace: Implementing more stringent network policies within the customer namespace can prevent lateral movement and reduce the blast radius of any potential breach.
8.    Role-Based Access and Secret Management: Improving secret management and access controls will likely involve both technical changes and updates to operational processes to ensure that these controls are maintained over time.
9.    Remove Unnecessary CLI Tools from Pods: Limit the tools available within pods to only those that are necessary for the operation of the pod, reducing the potential for misuse.
10.    Disable service account token automount wherever possible to reduce the risk of token theft and subsequent misuse.
11.    Harden Helm chart configurations, if they are in use, to follow security best practices and reduce the risk of misconfigurations that could lead to security incidents.
12.    Limit user access to cluster secrets to minimise the risk of disclosure of sensitive information.
13.    Avoid using the default namespace for operations, as it is often more permissive and can be a security risk.
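
Several of the steps above can be checked automatically against pod manifests before deployment. The following is a minimal, hypothetical sketch (the `checklist_findings` function is not an existing tool) that flags three of the issues raised in the assessment; purpose-built scanners such as Kubescape or Trivy cover far more.

```python
from typing import List


def checklist_findings(pod: dict) -> List[str]:
    """Return findings for a pod manifest against a few of the checklist items."""
    findings = []
    metadata = pod.get("metadata", {})
    spec = pod.get("spec", {})

    # Step 13: a pod with no namespace set lands in 'default'.
    if metadata.get("namespace", "default") == "default":
        findings.append("uses the default namespace")

    # Step 10: only the pod-level setting is checked here; automount can
    # also be disabled on the ServiceAccount, which this sketch cannot see.
    if spec.get("automountServiceAccountToken", True):
        findings.append("service account token is automounted")

    # Privileged containers undermine 'restricted' pod security.
    for container in spec.get("containers", []):
        if container.get("securityContext", {}).get("privileged", False):
            findings.append(f"container {container.get('name')} runs privileged")

    return findings


if __name__ == "__main__":
    # Hypothetical manifest for illustration.
    risky = {
        "metadata": {"name": "web"},
        "spec": {"containers": [{"name": "web", "image": "registry.example.com/web:1.0",
                                 "securityContext": {"privileged": True}}]},
    }
    for finding in checklist_findings(risky):
        print(finding)
```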

Summary:

The journey toward securing a Kubernetes environment within an AWS ecosystem, as highlighted by the security assessment of ProCheckUp's customer's Cloud Platform, is both challenging and ongoing. The remediation strategies and long-term cybersecurity approaches discussed are essential steps towards mitigating current vulnerabilities and preparing for future threats. By prioritizing these initiatives, organizations can significantly enhance their security posture, protect their assets, and maintain the trust of their customers.

In conclusion, securing a Kubernetes environment is a complex but achievable task that requires a comprehensive and adaptive approach. By implementing the recommended practices and maintaining a commitment to continuous improvement and collaboration, organizations can navigate the cybersecurity landscape with confidence.

Contact ProCheckUp today to assess and harden your Kubernetes environments.

Tools that can be used to improve your Kubernetes security:

•    Cloud CLI tools e.g. awscli, eksctl
•    Kubernetes API CLI tool – kubectl
•    Trivy - https://github.com/aquasecurity/trivy
•    Kubescape - https://github.com/kubescape/kubescape
•    Kube-hunter - https://github.com/aquasecurity/kube-hunter
•    Kube-bench - https://github.com/aquasecurity/kube-bench
•    Network and vulnerability scanning tools e.g. nmap, Nessus
•    LinPEAS/WinPEAS - https://github.com/carlospolop/PEASS-ng