
Security for Red Hat OpenShift on IBM Cloud

We can use built-in security features in Red Hat OpenShift on IBM Cloud for risk analysis and security protection. These features help you to protect the cluster infrastructure and network communication, isolate your compute resources, and ensure security compliance across your infrastructure components and container deployments.


Overview of security threats for the cluster

To protect the cluster from being compromised, we must understand potential security threats for the cluster and what we can do to reduce the exposure to vulnerabilities.

Cloud security and the protection of our systems, infrastructure, and data against attacks has become increasingly important as companies continue to move their workloads into the public cloud. A cluster consists of several components that each can put the environment at risk for malicious attacks. To protect the cluster against these security threats, we must make sure that we apply the latest Red Hat OpenShift on IBM Cloud, OpenShift, and Kubernetes security features and updates in all cluster components.

These components include:



OpenShift API server and etcd

The OpenShift API server and etcd data store are the most sensitive components that run in the OpenShift master. If an unauthorized user or system gets access to the OpenShift API server, the user or system can change settings, manipulate, or take control of the cluster, which puts the cluster at risk for malicious attacks.

To protect the OpenShift API server and etcd data store, we must secure and limit the access to the OpenShift API server for both human users and Kubernetes service accounts.

How is access to my OpenShift API server granted?

By default, Kubernetes requires every request to go through several stages before access to the API server is granted:

  1. Authentication: Validates the identity of a registered user or service account.
  2. Authorization: Limits the permissions of authenticated users and service accounts to ensure that they can access and operate only the cluster components that we want them to.
  3. Admission control: Validates or mutates requests before they are processed by the OpenShift API server. Many Kubernetes features require admission controllers in order to properly function.
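For example, to see how the authorization stage applies to our own identity, we can ask the OpenShift API server whether a specific action is permitted. This is a minimal sketch; the namespace and resources are placeholders.

    oc auth can-i create deployments --namespace default
    oc auth can-i delete nodes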

What does Red Hat OpenShift on IBM Cloud do to secure my OpenShift API server and etcd data store?
The following security features address authentication, authorization, admission control, and secure connectivity between the OpenShift master and worker nodes by default.


Security feature Description
Fully managed and dedicated OpenShift master: Every cluster in Red Hat OpenShift on IBM Cloud is controlled by a dedicated OpenShift master that is managed by IBM in an IBM-owned IBM Cloud account. The OpenShift master is set up with the following dedicated components that are not shared with other IBM customers.
  • etcd data store: Stores all Kubernetes resources of a cluster, such as Services, Deployments, and Pods. Kubernetes ConfigMaps and Secrets are app data that is stored as key value pairs so that they can be used by an app that runs in a pod. Data in etcd is stored on the local disk of the OpenShift master and is backed up to IBM Cloud Object Storage. Data is encrypted during transit to IBM Cloud Object Storage and at rest. We can choose to enable encryption for the etcd data on the local disk of our OpenShift master by enabling IBM Key Protect encryption for the cluster. When etcd data is sent to a pod, data is encrypted via TLS to ensure data protection and integrity.
  • openshift-api: Serves as the main entry point for all cluster management requests from the worker node to the OpenShift master. The API server validates and processes requests that change the state of cluster resources, such as pods or services, and stores this state in the etcd data store.
  • openshift-controller: Watches for newly created pods and decides where to deploy them based on capacity, performance needs, policy constraints, anti-affinity specifications, and workload requirements. If no worker node can be found that matches the requirements, the pod is not deployed in the cluster. The controller also watches the state of cluster resources, such as replica sets. When the state of a resource changes, for example if a pod in a replica set goes down, the controller manager initiates correcting actions to achieve the required state.
  • cloud-controller-manager: The cloud controller manager manages cloud provider-specific components such as the IBM Cloud load balancer.
  • OpenVPN: Red Hat OpenShift on IBM Cloud-specific component to provide secured network connectivity for all OpenShift master to worker node communication. The OpenVPN server works with the OpenVPN client to securely connect the master to the worker node. This connection supports apiserver proxy requests to the pods and services, and oc exec, attach, and logs requests to the kubelet. The connection from the worker nodes to the master is automatically secured with TLS certificates.
Continuous monitoring by IBM Site Reliability Engineers (SREs): The OpenShift master, including all the master components, compute, networking, and storage resources, is continuously monitored by IBM Site Reliability Engineers (SREs). The SREs apply the latest security standards, detect and remediate malicious activities, and work to ensure reliability and availability of Red Hat OpenShift on IBM Cloud.
CIS Kubernetes master benchmark: To configure Red Hat OpenShift on IBM Cloud, IBM engineers follow relevant cybersecurity practices from the Kubernetes master benchmark that is published by the Center for Internet Security (CIS). The cluster master and all worker nodes are deployed with images that meet the benchmark.
Secure communication via TLS: To use Red Hat OpenShift on IBM Cloud, we must authenticate with the service by using our credentials. When we are authenticated, Red Hat OpenShift on IBM Cloud generates TLS certificates that encrypt the communication to and from the OpenShift API server and etcd data store to ensure a secure end-to-end communication between the worker nodes and the OpenShift master. These certificates are never shared across clusters or across OpenShift master components.

Need to revoke existing certificates and create new certificates for the cluster? Check out Rotating CA certificates in the cluster.

OpenVPN connectivity to worker nodes: Although Kubernetes secures the communication between the master and worker nodes by using the https protocol, no authentication is provided on the worker node by default. To secure this communication, Red Hat OpenShift on IBM Cloud automatically sets up an OpenVPN connection between the OpenShift master and the worker node when the cluster is created.
Fine-grained access control: As the account administrator, we can grant other users access to Red Hat OpenShift on IBM Cloud by using IBM Cloud Identity and Access Management (IAM). IBM Cloud IAM provides secure authentication with the IBM Cloud platform, Red Hat OpenShift on IBM Cloud, and all the resources in your account. Setting up proper user roles and permissions is key to limiting who can access your resources and to limiting the damage that a user can do when legitimate permissions are misused.

We can select from the following pre-defined user roles that determine the set of actions that the user can perform:
  • Platform roles: Determine the cluster and worker node management-related actions that a user can perform in Red Hat OpenShift on IBM Cloud. Platform roles also assign users the basic-users and self-provisioners RBAC roles. With these RBAC roles, we can create an OpenShift project in the cluster, in which we can deploy apps and other Kubernetes resources. As the creator of the project, we are automatically assigned the admin RBAC role for the project so that we can fully control what we want to deploy and run in the project. However, these RBAC roles do not grant access to other OpenShift projects. To view and access other OpenShift projects, we must be assigned the appropriate service access role in IAM.
  • Service access roles: Determine the Kubernetes RBAC role that is assigned to the user and the actions that a user can run against the OpenShift API server. While the basic-users and self-provisioners RBAC roles that are assigned with a platform role let us create and manage our own OpenShift projects, we cannot view, access, or work with other OpenShift projects until we are assigned a service access role. For more information about the corresponding RBAC roles that are assigned to a user and associated permissions, see IBM Cloud IAM service roles.
  • Classic infrastructure: Enables access to your classic IBM Cloud infrastructure resources. Example actions that are permitted by classic infrastructure roles are viewing the details of cluster worker node machines or editing networking and storage resources.
  • VPC infrastructure: Enables access to VPC infrastructure resources. Example actions that are permitted by VPC infrastructure roles are creating a VPC, adding subnets, changing floating IP addresses, and creating VPC Block Storage instances.

For more information about access control in a cluster, see Assigning cluster access.
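For example, a minimal sketch of granting another account user the Viewer platform role for the service through IAM. The email address, role, and service name are illustrative assumptions; check the IAM documentation for the flags and roles that apply to your account.

    ibmcloud iam user-policy-create user@example.com --roles Viewer --service-name containers-kubernetes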
Admission controllers: Admission controllers are implemented for specific features in Kubernetes and Red Hat OpenShift on IBM Cloud. With admission controllers, we can set up policies in the cluster that determine whether a particular action in the cluster is allowed. In the policy, we can specify conditions under which a user cannot perform an action, even if this action is part of the general permissions that we assigned the user by using RBAC roles. Therefore, admission controllers can provide an extra layer of security for the cluster before an API request is processed by the OpenShift API server.

When you create a cluster, Red Hat OpenShift on IBM Cloud automatically installs the following Kubernetes admission controllers in the given order in the OpenShift master, which cannot be changed by the user:
  • NamespaceLifecycle
  • LimitRanger
  • ServiceAccount
  • DefaultStorageClass
  • ResourceQuota
  • StorageObjectInUseProtection
  • PersistentVolumeClaimResize
  • Priority
  • BuildByStrategy
  • OriginPodNodeEnvironment
  • PodNodeSelector
  • ExternalIPRanger
  • NodeRestriction
  • SecurityContextConstraint
  • SCCExecRestrictions
  • PersistentVolumeLabel
  • OwnerReferencesPermissionEnforcement
  • PodTolerationRestriction
  • openshift.io/JenkinsBootstrapper
  • openshift.io/BuildConfigSecretInjector
  • openshift.io/ImageLimitRange
  • openshift.io/RestrictedEndpointsAdmission
  • openshift.io/ImagePolicy
  • openshift.io/IngressAdmission
  • openshift.io/ClusterResourceQuota
  • MutatingAdmissionWebhook
  • ValidatingAdmissionWebhook

We can install our own admission controllers in the cluster or choose from the optional admission controllers that Red Hat OpenShift on IBM Cloud provides:
  • Container image security enforcer: Use this admission controller to enforce Vulnerability Advisor policies in the cluster to block deployments from vulnerable images.

If we manually installed admission controllers and no longer want to use them, make sure to remove them entirely. If admission controllers are not removed entirely, they might block all actions that we want to perform on the cluster.

What else can I do to secure my OpenShift API server?

We can decide how we want our master and worker nodes to communicate and how cluster users can access the OpenShift API server by enabling the private service endpoint only, the public service endpoint only, or both the public and private service endpoints. Note that the options for service endpoints vary based on the cluster's OpenShift version and infrastructure provider. For more information about service endpoints, see worker-to-master and user-to-master communication in classic clusters and VPC clusters.
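For example, we can check which service endpoints are currently enabled for a cluster by filtering the cluster details; the exact field names in the output can vary by version.

    ibmcloud oc cluster get --cluster <cluster_name_or_ID> | grep -i "service endpoint"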


Rotating CA certificates in the cluster

Revoke existing certificate authority (CA) certificates in the cluster and issue new CA certificates.

By default, certificate authority (CA) certificates secure access to various components of the cluster, such as the master API server. As we use the cluster, we might want to revoke the certificates that are issued by the existing CA. For example, the administrators of our team might use a certificate signing request (CSR) to manually generate certificates that are signed by the cluster's CA for worker nodes in the cluster. If these administrators leave your organization, we can ensure that they no longer have admin access to the cluster by creating a new CA and certificates for the cluster, and removing the old CA and certificates.

To rotate the CA certificates for the cluster:

  1. Create a CA for the cluster. Certificates that are signed by this new CA are issued for the cluster master components, and the API server is refreshed.

    ibmcloud oc cluster ca create -c <cluster_name_or_ID>
    
  2. Ensure that the cluster's master health is normal, the API server refresh is complete, and any master updates are complete. It might take several minutes for the master API server to refresh.

    ibmcloud oc cluster get --cluster <cluster_name_or_ID>
    
  3. Check the status of the CA creation. In the output, note the timestamp in the Action Completed field.

    ibmcloud oc cluster ca status -c <cluster_name_or_ID>
    

    Example output:

    Status:             CA certificate creation complete. Ensure that your worker nodes are reloaded before you start a CA certificate rotation.
    Action Started:     2020-08-30T16:17:56+0000
    Action Completed:   2020-08-30T16:21:13+0000
    
  4. Download the updated Kubernetes configuration data and certificates in the cluster's kubeconfig file.

    ibmcloud oc cluster config -c <cluster_name_or_ID> --admin --network
    
  5. Update any tooling that relies on the previous certificates. For example, if you use the certificate from the cluster's kubeconfig file in your own service such as Travis or Jenkins, or if you use calicoctl to manage Calico network policies, update your services and automation to use the new certificates.

  6. Verify that the timestamps on your new certificates are later than the timestamp that you found in step 3. To check the date on your certificates, we can use a tool such as KeyCDN, or the openssl sketch that follows these steps.

  7. Reload your classic worker nodes or replace your VPC worker nodes to pick up the certificates that are signed by the new CA.

  8. Rotate the old certificates with the new certificates. The old CA certificates in the cluster are removed.

    ibmcloud oc cluster ca rotate -c <cluster_name_or_ID>
    
  9. Check the status of the CA certificate rotation.

    ibmcloud oc cluster ca status -c <cluster_name_or_ID>
    

    Example output:

    Status:             CA certificate rotation complete.
    Action Started:     2020-08-30T16:37:56+0000
    Action Completed:   2020-08-30T16:41:13+0000
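As a spot check for step 6, we can inspect the validity dates of the certificate that the API server now presents. This is a minimal sketch that assumes the host and port from the Master URL in the `ibmcloud oc cluster get` output; openssl is not part of the IBM Cloud CLI.

    # Replace <master_host> and <port> with the values from the cluster's Master URL.
    echo | openssl s_client -connect <master_host>:<port> 2>/dev/null | openssl x509 -noout -dates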
    



Worker node

Worker nodes carry the deployments and services that make up the app. When we host workloads in the public cloud, we want to ensure that the app is protected from being accessed, changed, or monitored by an unauthorized user or software.

Who owns the worker node and am I responsible to secure it?
The ownership of a worker node depends on the type of cluster that you create and the infrastructure provider that you choose.

  • Standard classic clusters: Worker nodes are provisioned into our IBM Cloud account. The worker nodes are dedicated to us, and we are responsible for requesting timely updates to the worker nodes to ensure that the worker node OS and IBM Cloud Kubernetes Service components apply the latest security updates and patches.
  • Standard VPC clusters: Worker nodes are provisioned into an IBM Cloud account that is owned by IBM so that IBM can monitor for malicious activities and apply security updates. We cannot access our worker nodes by using the VPC dashboard. However, we can manage our worker nodes by using the IBM Cloud Kubernetes Service console, CLI, or API. The virtual machines that make up our worker nodes are dedicated to us, and we are responsible for requesting timely updates so that the worker node OS and IBM Cloud Kubernetes Service components apply the latest security updates and patches.

For more information, see Your responsibilities by using Red Hat OpenShift on IBM Cloud.

Use the ibmcloud oc worker update command regularly (such as monthly) to deploy updates and security patches to the operating system and to update the OpenShift version that your worker nodes run. When updates are available, we are notified when we view information about the master and worker nodes in the IBM Cloud console or CLI, such as with the ibmcloud oc clusters ls or ibmcloud oc workers ls --cluster <cluster_name> commands. Worker node updates are provided by IBM as a full worker node image that includes the latest security patches. To apply the updates, the worker node must be reimaged and reloaded with the new image. Keys for the root user are automatically rotated when the worker node is reloaded.
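For example, a minimal sketch of updating two worker nodes; the cluster and worker IDs are placeholders that come from the worker listing command, and whether multiple workers can be passed in one call might depend on your CLI version.

    ibmcloud oc worker update --cluster <cluster_name_or_ID> --worker <worker1_ID> --worker <worker2_ID>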

What does my worker node setup look like?
The following components are set up for every worker node to protect the worker node from malicious attacks.

These components do not include the ones that ensure secure end-to-end communication to and from the worker node. For more information, see network security.

Security feature Description
CIS-compliant RHEL image: Every worker node is set up with a Red Hat Enterprise Linux (RHEL) operating system that implements the benchmarks that are published by the Center for Internet Security (CIS). The operating system cannot be changed by the user or the owner of the machine. To review the current RHEL version, run oc get nodes -o wide. IBM works with internal and external security advisory teams to address potential security compliance vulnerabilities. Security updates and patches for the operating system are made available through Red Hat OpenShift on IBM Cloud and must be installed by the user to keep the worker node secure.

Red Hat OpenShift on IBM Cloud uses a Red Hat Enterprise Linux kernel for worker nodes. We can run containers based on any Linux distribution in Red Hat OpenShift on IBM Cloud. Check with your container image vendor to verify that your container images can be run on a Red Hat Enterprise Linux kernel.

Continuous monitoring by Site Reliability Engineers (SREs): The image that is installed on the worker nodes is continuously monitored by IBM Site Reliability Engineers (SREs) to detect vulnerabilities and security compliance issues. To address vulnerabilities, SREs create security patches and fix packs for the worker nodes. Make sure to apply these patches when they are available to ensure a secure environment for the worker nodes and the apps that you run on top of them.
CIS Kubernetes worker node benchmark: To configure Red Hat OpenShift on IBM Cloud, IBM engineers follow relevant cybersecurity practices from the Kubernetes worker node benchmark that is published by the Center for Internet Security (CIS).
Compute isolation: Worker nodes are dedicated to a cluster and do not host workloads of other clusters. When we create a classic cluster, we can choose to provision the worker nodes as physical machines (bare metal) or as virtual machines that run on shared or dedicated physical hardware. Worker nodes in a standard VPC compute cluster can be provisioned as virtual machines on shared infrastructure only.
Option to deploy bare metal on classic: If we create a standard classic cluster, we can choose to provision the worker nodes on bare metal physical servers (instead of virtual server instances). With bare metal machines, we have additional control over the compute host, such as the memory or CPU. This setup eliminates the virtual machine hypervisor that allocates physical resources to virtual machines that run on the host. Instead, all of a bare metal machine's resources are dedicated exclusively to the worker, so you don't need to worry about "noisy neighbors" sharing resources or slowing down performance. Bare metal servers are dedicated to you, with all their resources available for cluster usage.

Bare metal machines are not supported in VPC Gen 1 compute clusters.

Encrypted disks: By default, every worker node is provisioned with two local SSD, AES 256-bit encrypted data partitions. The first partition contains the kernel image that is used to boot the worker node and is not encrypted. The second partition holds the container file system and is unlocked by using LUKS encryption keys. Each worker node in a cluster has its own unique LUKS encryption key, managed by Red Hat OpenShift on IBM Cloud. When you create a cluster or add a worker node to an existing cluster, the keys are pulled securely and then discarded after the encrypted disk is unlocked.

Encryption can impact disk I/O performance. For workloads that require high-performance disk I/O, test a cluster with encryption both enabled and disabled to help you decide whether to turn off encryption.

SELinux: Every worker node is set up with security and access policies that are enforced by Security-Enhanced Linux (SELinux) profiles that are loaded into the worker node during bootstrapping. SELinux profiles cannot be changed by the user or owner of the machine.
SSH disabled: By default, SSH access is disabled on the worker node to protect the cluster from malicious attacks. When SSH access is disabled, access to the cluster is forced via the OpenShift API server. The OpenShift API server requires every request to be checked against the policies that are set in the authentication, authorization, and admission control module before the request is executed in the cluster.

If we have a standard cluster and we want to install more features on the worker node, we can choose between the add-ons that are provided by Red Hat OpenShift on IBM Cloud and Kubernetes daemon sets for everything that we want to run on every worker node. For any one-time action that we must execute, use Kubernetes jobs.
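For example, a one-time maintenance task can run as a Kubernetes Job instead of over SSH. This is a minimal sketch; the job name, image, and command are placeholders.

    oc create job one-time-task --image=registry.access.redhat.com/ubi8/ubi -- /bin/sh -c "echo 'one-time action'"
    oc logs job/one-time-task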



Network

The classic approach to protecting a company's network is to set up a firewall and block any unwanted network traffic to our apps. While this approach is still valid, research shows that many malicious attacks come from insiders or authorized users who misuse their assigned permissions.


Network segmentation and privacy

To protect your network and limit the range of damage that a user can do when access to a network is granted, we must make sure that the workloads are as isolated as possible and that we limit the number of apps and worker nodes that are publicly exposed.

What network traffic is allowed for my cluster by default?
All containers are protected by predefined Calico network policy settings that are configured on every worker node during cluster creation. By default, all outbound network traffic is allowed for all worker nodes. Inbound network traffic is blocked, except for the connections that are required for the cluster to function, such as incoming traffic to NodePort, LoadBalancer, and Ingress services.

What is network segmentation and how can I set it up for a cluster?
Network segmentation describes the approach to divide a network into multiple subnetworks. We can group apps and related data to be accessed by a specific group in your organization. Apps that run in one subnetwork cannot see or access apps in another subnetwork. Network segmentation also limits the access that is provided to an insider or third-party software and can limit the range of malicious activities.

Red Hat OpenShift on IBM Cloud provides IBM Cloud VLANs that ensure quality network performance and network isolation for worker nodes. A VLAN configures a group of worker nodes and pods as if they were attached to the same physical wire. VLANs are dedicated to your IBM Cloud account and not shared across IBM customers. In classic clusters, if we have multiple VLANs for the cluster, multiple subnets on the same VLAN, or a multizone classic cluster, we must enable a Virtual Router Function (VRF) for the IBM Cloud infrastructure account so your worker nodes can communicate with each other on the private network. To enable VRF, contact your IBM Cloud infrastructure account representative. To check whether a VRF is already enabled, use the ibmcloud account show command. If we cannot or do not want to enable VRF, enable VLAN spanning. To perform this action, we need the Network > Manage Network VLAN Spanning infrastructure permission, or we can request the account owner to enable it. To check whether VLAN spanning is already enabled, use the ibmcloud oc vlan spanning get --region <region> command.

When you enable VRF or VLAN spanning for the account, network segmentation is removed for the clusters.

Review the following table to see your options for how to achieve network segmentation when you enable VRF or VLAN spanning for the account.

Security feature Description
Set up custom network policies with Calico: We can use the built-in Calico interface to set up custom Calico network policies for the worker nodes. For example, we can allow or block network traffic on specific network interfaces, for specific pods, or for services. To set up custom network policies, we must install the calicoctl CLI.
Support for IBM Cloud network firewalls: Red Hat OpenShift on IBM Cloud is compatible with all IBM Cloud firewall offerings. For example, we can set up a firewall with custom network policies to provide dedicated network security for the standard cluster and to detect and remediate network intrusion. We might choose to set up a Virtual Router Appliance to act as the firewall and block unwanted traffic. When we set up a firewall, we must also open up the required ports and IP addresses for each region so that the master and the worker nodes can communicate.
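For the custom Calico policies described above, a minimal sketch of a policy that denies inbound TCP traffic from one subnet on the public network interface of all worker nodes. The selector label, order, and subnet are illustrative assumptions; check the Calico network policy documentation for the host endpoint labels that apply to your cluster.

    calicoctl apply -f - <<EOF
    apiVersion: projectcalico.org/v3
    kind: GlobalNetworkPolicy
    metadata:
      name: deny-untrusted-subnet
    spec:
      selector: ibm.role == 'worker_public'
      order: 1800
      types:
        - Ingress
      ingress:
        - action: Deny
          protocol: TCP
          source:
            nets:
              - 203.0.113.0/24
    EOF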

What else can I do to reduce the surface for external attacks?
The more apps or worker nodes that we expose publicly, the more steps we must take to prevent external malicious attacks. Review the following table to find options for how to keep apps and worker nodes private.

Security feature Description
Limit the number of exposed apps: By default, our apps and services that run within the cluster are not reachable over the public internet. We can choose whether we want to expose our apps to the public, or whether we want our apps and services to be reachable on the private network only. When we keep our apps and services private, we can leverage the built-in security features to ensure secure communication between worker nodes and pods. To expose services and apps to the public internet, we can use OpenShift routes, or leverage the NLB and Ingress ALB support to securely make your services publicly available. Ensure that only necessary services are exposed, and revisit the list of exposed apps regularly to ensure that they are still valid.
Limit public internet connectivity with edge nodes: Every worker node is configured to accept app pods and associated load balancer or ingress pods. We can label worker nodes as edge nodes to force load balancer pods to be deployed to these worker nodes only. In addition, we can taint our worker nodes so that app pods cannot schedule onto the edge nodes. With edge nodes, we can isolate the networking workload on fewer worker nodes in the cluster and keep the other worker nodes in the cluster private.
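For example, a minimal sketch of labeling and tainting one worker node as an edge node. The node name is a placeholder from `oc get nodes`, and the `dedicated=edge` key follows the convention in the edge node documentation; verify the exact label and taint in that documentation before applying.

    oc label node <worker_node_private_IP> dedicated=edge
    oc adm taint nodes <worker_node_private_IP> dedicated=edge:NoSchedule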

What if I want to connect my cluster to an on-prem data center?
To connect our worker nodes and apps to an on-prem data center, we can configure a VPN IPSec endpoint with a strongSwan service, a Virtual Router Appliance, or a FortiGate Security Appliance.


Network segmentation and privacy for VPC clusters

To protect your network and limit the range of damage that a user can do when access to a network is granted, we must make sure that the workloads are as isolated as possible and that we limit the number of apps and worker nodes that are publicly exposed.

What network traffic is allowed for my cluster by default?
By default, worker nodes are connected to VPC subnets on the private network only and do not have a public network interface. All public ingress to and egress from your worker nodes is blocked.

To run default OpenShift components such as the web console or OperatorHub, we must attach a public gateway to the VPC subnets that the worker nodes are deployed to. All egress is permitted for worker nodes on a subnet with an attached public gateway, but all ingress is still blocked.

If we deploy apps in the cluster that must receive traffic requests from the internet, we can create a VPC load balancer to expose our apps. To allow ingress network traffic to our apps, we must configure the VPC load balancer for the ingress network traffic that we want to receive.

What is network segmentation and how can I set it up for a cluster?
Network segmentation describes the approach to divide a network into multiple subnetworks. We can group apps and related data to be accessed by a specific group in your organization. Apps that run in one subnetwork cannot see or access apps in another subnetwork. Network segmentation also limits the access that is provided to an insider or third-party software and can limit the range of malicious activities.

Red Hat OpenShift on IBM Cloud provides IBM Cloud VPC subnets that ensure quality network performance and network isolation for worker nodes. A VPC subnet consists of a specified private IP address range (CIDR block) and configures a group of worker nodes and pods as if they were attached to the same physical wire. VPC subnets are dedicated to your IBM Cloud account and not shared across IBM customers.

VPC subnets provide a channel for connectivity among the worker nodes within the cluster. Any system that is connected to any of the private subnets in the same VPC can communicate with workers. For example, all subnets in one VPC can communicate through private layer 3 routing with a built-in VPC router. If the clusters do not need to communicate, we can achieve the best network segmentation by creating the clusters in separate VPCs. If we have multiple clusters that must communicate with each other, we can create the clusters in the same VPC. Although subnets within one VPC can be shared by multiple clusters in that VPC, we can achieve better network segmentation by using different subnets for clusters within one VPC.

To achieve further private network segmentation between VPC subnets for the account, we can set up custom network policies with VPC access control lists (ACLs). When you create a VPC, a default ACL is created in the format allow-all-network-acl-<VPC_ID> for the VPC. Any subnet that you create in the VPC is attached to this ACL by default. The ACL includes an inbound rule and an outbound rule that allow all traffic between your worker nodes on a subnet and any system on the subnets in the same VPC. To specify which private network traffic is permitted to the worker nodes on your VPC subnets, we can create a custom ACL for each subnet in the VPC. For example, we can create a set of ACL rules to block most inbound and outbound private network traffic of a cluster, while allowing communication that is necessary for the cluster to function.
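For example, before we replace the default rules with custom ones, we can review the default ACL and its rules by using the VPC infrastructure CLI plug-in; the ACL name shown here follows the default naming format described above.

    ibmcloud is network-acls
    ibmcloud is network-acl allow-all-network-acl-<VPC_ID>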

What else can I do to reduce the surface for external attacks?
The more apps or worker nodes that we expose publicly, the more steps we must take to prevent external malicious attacks. Review the following table to find options for how to keep apps and worker nodes private.

Security feature Description
Limit the number of exposed apps: By default, our apps and services that run within the cluster are not reachable over the public internet. We can choose whether we want to expose our apps to the public, or whether we want our apps and services to be reachable on the private network only. When we keep our apps and services private, we can leverage the built-in security features to ensure secure communication between worker nodes and pods. To expose services and apps to the public internet, we can leverage the VPC load balancer and Ingress ALB support to securely make your services publicly available. Ensure that only necessary services are exposed, and revisit the list of exposed apps regularly to ensure that they are still valid.
Limit public network egress to one subnet with a public gateway: If pods on the worker nodes need to connect to a public external endpoint, we can attach a public gateway to the subnet that those worker nodes are on.

What if I want to connect my cluster to other networks, like other VPCs, an on-prem data center, or IBM Cloud classic resources?
Depending on the network that we want to connect our worker nodes to, we can choose a VPN solution.


Expose apps with routes

To allow incoming network traffic from the internet, we can expose our apps by using routes.

Every OpenShift cluster is automatically set up with an OpenShift router that is assigned a unique domain name and secured with a TLS certificate. When you expose the app by using a route, the app is assigned a URL from the OpenShift router.

How can I create secured routes and control TLS termination?
When you create a route for the app, we can decide to create a secured (HTTPS) or unsecured (HTTP) route. For secured routes, we can decide where we want to implement the TLS termination, such as at the router or at the pod. For more information, see Exposing apps with routes.
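For example, a minimal sketch of creating a route with edge TLS termination at the router, and a passthrough route that leaves TLS termination to the pod; the route names, service name, and hostname are placeholders.

    oc create route edge my-frontend-edge --service=my-frontend --hostname=www.example.com
    oc create route passthrough my-frontend-tls --service=my-frontend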


Expose apps with LoadBalancer and Ingress services

We can use network load balancer (NLB) and Ingress application load balancer (ALB) networking services to connect our apps to the public internet or to external private networks. Review the following optional settings for NLBs and ALBs that we can use to meet back-end app security requirements or encrypt traffic as it moves through the cluster.

Can I use security groups to manage my cluster's network traffic?
Classic clusters: IBM Cloud security groups are applied to the network interface of a single virtual server to filter traffic at the hypervisor level. To manage traffic for each worker node, we can use security groups. When we create a security group, we must allow the VRRP protocol, which Red Hat OpenShift on IBM Cloud uses to manage NLB IP addresses. To uniformly manage traffic for the cluster across all of our worker nodes, use Calico and Kubernetes policies.

VPC clusters: VPC security groups are applied to the network interface of a single virtual server to filter traffic at the hypervisor level. We can add inbound and outbound rules to the default security group for the cluster to manage inbound and outbound traffic to a VPC cluster. The default rules of the security group for the cluster differ with the cluster's version.

  • VPC Gen 2 clusters that run OpenShift version 4.5 or later:
    • The default security group for the VPC is applied to your worker nodes. This security group allows incoming ICMP packets (pings) and incoming traffic from other worker nodes in the cluster.
    • Additionally, a unique security group that is named in the format kube-<cluster_ID> is automatically created and applied to the worker nodes for that cluster. This security group allows incoming traffic requests to the 30000 - 32767 port range on the worker nodes, and ensures that all inbound and outbound traffic to the pod subnet is permitted so that worker nodes can communicate with each other across subnets. Do not modify or delete this security group.
  • VPC Gen 2 clusters that run OpenShift version 4.4 or earlier: The default security group for the VPC is applied to your worker nodes. This security group denies all incoming traffic requests to your worker nodes.

Because the worker nodes of our VPC cluster exist in a service account and are not listed in the VPC infrastructure dashboard, we cannot create a security group and apply it to our worker node instances. We can only modify existing security groups that are created for us.
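For example, a minimal sketch of adding an inbound rule to the VPC's default security group to allow incoming traffic to the NodePort range; the security group ID is a placeholder, and the flag names are assumptions based on the VPC infrastructure CLI plug-in, so verify them with `ibmcloud is security-group-rule-add --help`.

    ibmcloud is security-group-rule-add <default_security_group_ID> inbound tcp --port-min 30000 --port-max 32767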

How can I do TLS termination with LoadBalancer and Ingress services?
The Ingress service offers TLS termination at two points in the traffic flow:

  • Decrypt packets upon arrival: By default, the Ingress ALB load balances HTTP network traffic to the apps in the cluster. To also load balance incoming HTTPS connections, we can configure the ALB to decrypt the network traffic and forward the decrypted request to the apps that are exposed in the cluster. If we use the IBM-provided Ingress subdomain, we can use the IBM-provided TLS certificate. If we use a custom domain, we can use our own TLS certificate to manage TLS termination.
  • Re-encrypt packets before they are forwarded to upstream apps: The ALB decrypts HTTPS requests before forwarding traffic to our apps. If we have apps that require HTTPS and need traffic to be encrypted before it is forwarded to those upstream apps, we can use the ssl-services annotation. If our upstream apps can handle TLS, we can optionally provide a certificate that is contained in a one-way or mutual-authentication TLS secret.



Persistent storage

Review supported options for encrypting and protecting your data on persistent storage in IBM Cloud.

By default, all IBM Cloud storage solutions automatically encrypt your data at rest with an IBM-managed encryption key at no additional cost. For more information, see the following links.

Depending on the type of storage that you choose, we can set up additional encryption with IBM Key Protect to protect your data in transit and at rest with your own encryption key.

We can also use an IBM Cloud database service, such as IBM Cloudant NoSQL DB, to persist data in a managed database outside the cluster. Data that is stored with a cloud database service can be accessed across clusters, zones, and regions. For security-related information, see the database service-specific IBM Cloud documentation.



Monitoring and logging

The key to detecting malicious attacks in the cluster is proper monitoring and logging of metrics and of all the events that happen in the cluster. Monitoring and logging can also help you understand the cluster capacity and the availability of resources for the app so that we can plan accordingly to protect our apps from downtime.

Does IBM monitor my cluster?
Every cluster master is continuously monitored by IBM to control and remediate process-level denial-of-service (DoS) attacks. Red Hat OpenShift on IBM Cloud automatically scans every node where the master is deployed for vulnerabilities that are found in Kubernetes, OpenShift, and OS-specific security fixes. If vulnerabilities are found, Red Hat OpenShift on IBM Cloud automatically applies fixes and resolves vulnerabilities on behalf of the user to ensure master node protection.

What information is logged?
By default, Red Hat OpenShift on IBM Cloud automatically collects logs for the following cluster components:

  • Containers: Logs that are written to STDOUT or STDERR.
  • Apps: Logs that are written to a specific path inside the app.
  • Workers: Logs from the Red Hat Enterprise Linux operating system that are sent to /var/log/syslog and /var/log/auth.log.
  • OpenShift API server: Every cluster-related action that is sent to the OpenShift API server is logged for auditing reasons, including the time, the user, and the affected resource. For more information, see Kubernetes audit logs. We can access these logs by using IBM Cloud Activity Tracker with LogDNA. For more information, see the getting started tutorial.
  • Routers: Logs inbound network traffic on routes.
  • Kubernetes system components: Logs from the kubelet, the kube-proxy, and other components that run in the kube-system namespace.

To access the logs of the cluster components, set up IBM Log Analysis with LogDNA. IBM Log Analysis with LogDNA provides access to all your logs and we can aggregate logs and build your own customized views across multiple clusters.
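For example, before log forwarding is set up, we can view container logs directly from the CLI; the project and pod names are placeholders.

    oc logs -n <project> <pod_name> --tail=50
    oc logs -n <project> <pod_name> --previous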

How can I monitor the health and performance of my cluster?
We can verify the health, capacity, and performance of our apps, services, and worker nodes by monitoring the cluster components and compute resources from the Red Hat OpenShift on IBM Cloud console or CLI, such as the CPU and memory usage. To view more in-depth metrics for the cluster, we can use the built-in monitoring capabilities that are based on open source technologies, such as Prometheus and Grafana. Prometheus is automatically installed when you create the cluster and we can use the tool to access real-time cluster and app metrics. Prometheus metrics are not stored persistently. To access historic metrics and to compare metrics across multiple clusters, use IBM Cloud Monitoring with Sysdig instead.
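For example, a quick check of worker node and pod resource usage from the CLI, assuming cluster metrics are available; the project name is a placeholder.

    oc adm top nodes
    oc adm top pods -n <project>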

To set up a host-based intrusion detection system (HIDS) and security event log monitoring (SELM), install third-party tools that are designed to monitor the cluster and containerized apps to detect intrusion or misuse, such as Twistlock or the Sysdig Falco project. Sysdig Falco is a separate tool and is not included if you choose to install the IBM-provided Sysdig add-on in the cluster.

How can I audit events that happen in my cluster?
We can set up IBM Cloud Activity Tracker in your Red Hat OpenShift on IBM Cloud cluster. For more information, view the Activity Tracker documentation.

What are my options to enable trust in my cluster?
By default, Red Hat OpenShift on IBM Cloud provides many features for the cluster components so that we can deploy our containerized apps in a security-rich environment. Extend your level of trust in the cluster to better ensure that what happens within the cluster is what you intended to happen. We can implement trust in the cluster in the following ways.

  1. Content Trust for the images: Ensure the integrity of our images by enabling content trust in your IBM Cloud Container Registry. With trusted content, we can control who can sign images as trusted. After trusted signers push an image to your registry, users can pull the signed content so that they can verify the source of the image. For more information, see Signing images for trusted content.

  2. Container Image Security Enforcement: Create an admission controller with custom policies so that we can verify container images before we deploy them. With Container Image Security Enforcement, you control where the images are deployed from and ensure that they meet Vulnerability Advisor policies or content trust requirements. If a deployment does not meet the policies that you set, security enforcement prevents modifications to the cluster. For more information, see Enforcing container image security.

  3. Image Vulnerability Scanner: By default, Vulnerability Advisor scans images that are stored in IBM Cloud Container Registry to find potential security vulnerabilities. For more information, see Manage image security with Vulnerability Advisor.

  4. Network insights with Security Advisor (beta): With IBM Cloud Security Advisor, we can centralize security insights from IBM Cloud services such as Vulnerability Advisor and Certificate Manager. When you enable Security Advisor in the cluster, we can view reports about suspicious incoming and outgoing network traffic. For more information, see Network Analytics. To install, see Set up monitoring of suspicious clients and server IP addresses for a Kubernetes cluster.

  5. IBM Cloud Certificate Manager: To expose the app by using a custom domain with TLS, we can store your TLS certificate in Certificate Manager. Expired or about-to-expire certificates can also be reported in your Security Advisor dashboard. For more information, see Getting started with Certificate Manager.



Image and registry

Every deployment is based on an image that holds the instructions for how to spin up the container that runs the app. These instructions include the operating system inside the container and extra software that we want to install. To protect the app, we must protect the image and establish checks to ensure the image's integrity.

Should I use a public or a private registry to store my images?
Public registries, such as Docker Hub, can be used to get started with Docker images and Kubernetes to create your first containerized app in a cluster. But when it comes to enterprise applications, avoid registries that you don't know or don't trust to protect the cluster from malicious images. Keep the images in a private registry, like the one provided in IBM Cloud Container Registry or the internal registry that is automatically set up in the OpenShift cluster, and make sure to control access to the registry and the image content that can be pushed.
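For example, a minimal sketch of creating a registry namespace in IBM Cloud Container Registry and pushing an image to it; the namespace, image name, tag, and region host are placeholders.

    ibmcloud cr namespace-add <my_namespace>
    docker tag <my_image>:<tag> us.icr.io/<my_namespace>/<my_image>:<tag>
    docker push us.icr.io/<my_namespace>/<my_image>:<tag>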

Why is it important to check images against vulnerabilities?
Research shows that most malicious attacks leverage known software vulnerabilities and weak system configurations. When we deploy a container from an image, the container spins up with the OS and extra binaries that you described in the image. Just like you protect your virtual or physical machine, we must eliminate known vulnerabilities in the OS and binaries that you use inside the container to protect the app from being accessed by unauthorized users.

How can I ensure secure container images?
To protect our apps, consider addressing the following areas:

  1. Automate the build process and limit permissions:
    Automate the process to build your container image from your source code to eliminate source code variations and defects. By integrating the build process into your CI/CD pipeline, we can ensure that your image is scanned and built only if the image passes the security checks that you specified. To prevent developers from applying hot fixes to sensitive images, limit the number of people in your organization who have access to the build process.

  2. Scan images before they deploy into production:
    Make sure to scan every image before we deploy a container from it. For example, if you use IBM Cloud Container Registry, all images are automatically scanned for vulnerabilities when you push the image to your namespace. If vulnerabilities are found, consider eliminating the vulnerabilities or blocking deployment for those images. Find a person or team in your organization who is responsible for monitoring and removing vulnerabilities. Depending on your organizational structure, this person might be part of a security, operations, or deployment team. Use admission controllers, such as Container Image Security Enforcement, to block deployments from images that did not pass vulnerability checks, and enable content trust so that images must be approved by a trusted signer before they can be pushed to the container registry.

  3. Regularly scan running containers:
    Even if we deployed a container from an image that passes the vulnerability check, the operating system or binaries that run in the container might become vulnerable over time. To protect the app, we must ensure that running containers are regularly scanned so that we can detect and remediate vulnerabilities. Depending on the app, to add extra security, we can establish a process that takes down vulnerable containers after they are detected.
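For example, a sketch of reviewing the Vulnerability Advisor report for a pushed image before we deploy it; the image path is a placeholder.

    ibmcloud cr va us.icr.io/<my_namespace>/<my_image>:<tag>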

Can I use the built-in container registry in my OpenShift cluster to address build automation, image and container scanning?
We can use the built-in container registry to automate the container image build process from your source code in an external source repository to your internal registry. However, images are not automatically scanned for vulnerabilities when they are pushed to the internal registry. To set up image scanning, set up a registry namespace and push your images to the managed IBM Cloud Container Registry instead.

How can IBM Cloud Container Registry help me to protect my images and deployment process?

Security for images and deployments

Security feature Description
Secured Docker private image repository in IBM Cloud Container Registry: Set up your own Docker image repository in a multi-tenant, highly available, and scalable private image registry that is hosted and managed by IBM. By using the registry, we can build, securely store, and share Docker images across cluster users. Learn more about securing your personal information when you work with container images.
Push images with trusted content only: Ensure the integrity of our images by enabling content trust in your image repository. With trusted content, we can control who can sign images as trusted and push images to a specific registry namespace. After trusted signers push an image to a registry namespace, users can pull the signed content so that they can verify the publisher and the integrity of the image.
Automatic vulnerability scans: When you use IBM Cloud Container Registry, we can leverage the built-in security scanning that is provided by Vulnerability Advisor. Every image that is pushed to your registry namespace is automatically scanned for vulnerabilities against a database of known CentOS, Debian, Red Hat, and Ubuntu issues. If vulnerabilities are found, Vulnerability Advisor provides instructions for how to resolve them to ensure image integrity and security.
Block deployments from vulnerable images or untrusted users: Create an admission controller with custom policies so that we can verify container images before we deploy them. With Container Image Security Enforcement, you control where the images are deployed from and ensure that they meet Vulnerability Advisor policies or content trust requirements. If a deployment does not meet the policies that you set, the admission controller blocks the deployment in the cluster.

What options do I have to scan running containers for vulnerabilities?
We can install third-party solutions in the cluster, such as Twistlock or StackRox, to scan running containers and block malicious activities when they are detected.



Container isolation and security

When we run multiple apps in the cluster, we want to make sure that the workloads run isolated from each other and that we restrict the permissions of our pods within the cluster to avoid noisy neighbors or denial-of-service attacks.

What is an OpenShift project and why should I use it?
OpenShift projects are a way to virtually partition a cluster and provide isolation for the deployments and the groups of users that want to move their workload onto the cluster. With projects, we can organize resources across worker nodes and also across zones in multizone clusters.

Every cluster is set up with a set of default OpenShift projects that include the deployments and services that are required for Red Hat OpenShift on IBM Cloud to run properly and manage the cluster. For more information, see the service architecture. Cluster administrators automatically have access to these projects and can set up additional projects in the cluster. In addition, cluster users who are granted access to the cluster can create their own project and, as the creator of the project, can manage the project with administrator permissions. However, cluster users do not have access to other projects by default, unless they are granted access by a cluster administrator.

For every project that we have in the cluster, make sure to set up proper RBAC policies to limit access to the project, to control what gets deployed, and to set proper resource quotas and limit ranges.
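For example, a minimal sketch of granting one user the edit role in a project and setting a resource quota for that project; the user name, project name, and quota values are placeholders.

    oc adm policy add-role-to-user edit <user_name> -n <project>
    oc create quota <project>-quota --hard=cpu=4,memory=8Gi,pods=20 -n <project>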

Should I set up a single-tenant or a multi-tenant cluster?
In a single-tenant cluster, you create one cluster for every group of people that must run workloads in a cluster. Usually, this team is responsible to manage the cluster and to properly configure and secure it. Multi-tenant clusters use multiple projects to isolate tenants and their workloads.

Deciding between single-tenant and multi-tenant clusters depends on the number of teams that must run workloads in a cluster, their service requirements, the size of the service, and the level of isolation that we want to achieve for the workloads.

A single-tenant cluster might be our option if we have many teams with complex services that each must have control over the lifecycle of the cluster. This control includes the freedom to decide when a cluster is updated or what resources can be deployed to the cluster. We can also configure a single-tenant cluster to allow privileged pods without putting other tenants at risk of being compromised. Keep in mind that managing a cluster requires in-depth Kubernetes, OpenShift, and infrastructure knowledge to ensure cluster capacity and security for the deployments.

Multi-tenant clusters use OpenShift projects to isolate tenants and are usually managed by a separate team that does not belong to one of the tenants. A multi-tenant cluster might be our option if we have multiple teams that must run small workloads in a cluster, and where creating a single-tenant cluster that is highly available across multiple zones does not bring the cost benefits that we want. While multi-tenant clusters usually require fewer people to manage and administer the cluster, they might not provide the level of isolation that we need and can add more complexity in the following areas:

  • Access: When you set up multiple projects, we must configure proper RBAC policies for each project to ensure resource isolation. RBAC policies are complex and require in-depth Kubernetes knowledge.
  • Privileged pods: If one tenant in a multi-tenant cluster requires privileged pods, those pods can access other projects in the cluster or damage the shared compute host. Controlling privileged pods is a complex task that requires effort and deep technical expertise. Use security context constraints (SCCs) to control what resources your tenants can deploy in the cluster.
  • Network policies: Because your worker nodes are connected to the same private network, we must make sure that we have strict firewall policies in place to prevent pods from accessing pods in other namespaces.
  • Compute resource limitation: To ensure that every team has the necessary resources to deploy services and run apps in the cluster, we must set up resource quotas for every namespace. Resource quotas determine the deployment constraints for a project, such as the number of Kubernetes resources that we can deploy, and the amount of CPU and memory that can be consumed by those resources. After you set a quota, users must include resource requests and limits in their deployments.
  • Shared cluster resources: If you run multiple tenants in one cluster, some cluster resources, such as the OpenShift router, Ingress application load balancer (ALB) or available portable IP addresses are shared across tenants. Smaller services might have a hard time using shared resources if they must compete against large services in the cluster.
  • Updates: We can run one OpenShift API version at a time only. All apps that run in a cluster must comply with the current OpenShift API version independent of the team that owns the app. When we want to update a cluster, we must ensure that all teams are ready to switch to a new OpenShift API version and that apps are updated accordingly. This also means that individual teams have less control over the OpenShift API version they want to run.
  • Changes in cluster setup: To change the cluster setup or reschedule workloads onto new worker nodes, we must roll out this change across tenants. This roll out requires more reconciliation and testing than in a single-tenant cluster.
  • Communication process: When you manage multiple tenants, consider setting up a communication process so that tenants know where to go when an issue with the cluster exists, or when they need more resources for their services. This communication process also includes informing your tenants about all changes in the cluster setup or planned updates.

Although single-tenant and multi-tenant clusters come with roughly the same costs, single-tenant clusters provide a higher level of isolation than the projects in a multi-tenant cluster. For better workload isolation, use single-tenant clusters.

How can I control pod permissions?
To control pod permissions within or across projects, Red Hat OpenShift on IBM Cloud uses security context constraints (SCCs). By default, every cluster is set up with OpenShift SCCs and a set of IBM-provided SCCs that we can assign to service accounts, pods, deployments, or projects to limit the permissions within the cluster. If we do not explicitly assign an SCC, the pods use the restricted SCC. OpenShift SCCs are stricter than the default pod security policies in community Kubernetes clusters. We might need to modify an app that runs in a community Kubernetes cluster so that the app can run in OpenShift. For more information, see Configuring security context constraints.
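For example, a sketch of reviewing the SCCs in the cluster and allowing one service account to use the anyuid SCC; the project and service account names are placeholders.

    oc get scc
    oc adm policy add-scc-to-user anyuid -z <service_account> -n <project>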

What else can I do to protect my container?

Security feature Description
Limit the number of privileged containers: Containers run as a separate Linux process on the compute host that is isolated from other processes. Although users have root access inside the container, the permissions of this user are limited outside the container to protect other Linux processes, the host file system, and host devices. Some apps require access to the host file system or advanced permissions to run properly. We can run containers in privileged mode to allow the container the same access as the processes running on the compute host.

Keep in mind that privileged containers can cause huge damage to the cluster and the underlying compute host if they become compromised. Try to limit the number of containers that run in privileged mode and consider changing the configuration for the app so that the app can run without advanced permissions. To block privileged containers from running in the cluster, consider setting up custom security context constraints.

Apply OS security settings to pods: We can customize the default security context constraints to control the user ID and group ID that can run inside the container, or the user ID and group ID that owns the volume mount path. Setting a specific user ID helps facilitate a least privilege model. If the security context does not specify a user, Kubernetes automatically uses the user that is specified in the container image. For more information, see Configuring security context constraints.
Set CPU and memory limits for containers: Every container requires a specific amount of CPU and memory to properly start and to continue to run. We can define limit ranges for the containers or pods to limit the amount of CPU and memory that they can consume. If no limits for CPU and memory are set and the container is busy, the container uses all the resources that are available. This high consumption of resources might affect other containers on the worker node that do not have enough resources to properly start or run, and puts your worker node at risk for denial-of-service attacks.
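For example, a minimal sketch of setting CPU and memory requests and limits on an existing deployment; the deployment name, project, and values are placeholders.

    oc set resources deployment/<my_deployment> --requests=cpu=100m,memory=128Mi --limits=cpu=250m,memory=256Mi -n <project>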



Storing personal information

You are responsible for ensuring the security of any personal information in Kubernetes resources and container images. Personal information includes your name, address, phone number, email address, or other information that might identify, contact, or locate you, your customers, or anyone else.

    Use a Kubernetes secret to store personal information
    Store personal information only in Kubernetes resources that are designed to hold personal information. For example, do not use your name in the name of an OpenShift project, deployment, service, or config map. For proper protection and encryption, store personal information in secrets instead.

    Use a Kubernetes imagePullSecret to store image registry credentials
    Do not store personal information in container images or registry namespaces. For proper protection and encryption, store registry credentials in Kubernetes imagePullSecrets and other personal information in secrets instead. Remember that if personal information is stored in a previous layer of an image, deleting an image might not be sufficient to delete this personal information.
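For example, a minimal sketch of storing a credential in a generic secret and storing registry credentials in an image pull secret; the secret names, key, registry host, and API key are placeholders.

    oc create secret generic my-app-credentials --from-literal=api_token=<token> -n <project>
    oc create secret docker-registry my-pull-secret --docker-server=us.icr.io --docker-username=iamapikey --docker-password=<IBM_Cloud_API_key> -n <project>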

On SGX-enabled bare metal worker nodes, we can encrypt your data in use by using IBM Cloud Data Shield. Similar to the way encryption works for data at rest and data in motion, Fortanix Runtime Encryption that is integrated with IBM Cloud Data Shield protects keys, data, and apps from external and internal threats. The threats might include malicious insiders, cloud providers, OS-level hacks, or network intruders.


Kubernetes security bulletins

If vulnerabilities are found in Kubernetes, Kubernetes releases CVEs in security bulletins to inform users and to describe the actions that users must take to remediate the vulnerability. Kubernetes security bulletins that affect Red Hat OpenShift on IBM Cloud users or the IBM Cloud platform are published in the IBM Cloud security bulletin.

Some CVEs require the latest patch update for an OpenShift version that we can install as part of the regular cluster update process in Red Hat OpenShift on IBM Cloud. Make sure to apply security patches in a timely manner to protect the cluster from malicious attacks. For more information about what is included in a security patch, refer to the version changelog.