Installing on Kubernetes / OpenShift

Preparing the Kubernetes namespace

Prerequisite: If you are using OpenShift, our operator will leverage the OpenShift catalog to install Strimzi for you automatically. If you are not running on OpenShift, manual installation of Strimzi Kafka is required; the Strimzi installation steps can be found here. Only the Strimzi cluster operator needs to be deployed; our operator will then request the resources it needs from Strimzi.
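
If you are installing Strimzi manually, its quick-start installs the cluster operator with a single manifest. The command below is a sketch based on that quick-start; check the Strimzi documentation for the current method and substitute your namespace.

$ kubectl create -f 'https://strimzi.io/install/latest?namespace=your-project' -n your-project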

The operator directory referenced below can be found here.

  1. Search for the Custom Resource Definition (CRD) and create it if it doesn’t already exist.

    $ oc get crds | grep manageiqs.manageiq.org
    $ oc create -f config/crd/bases/manageiq.org_manageiqs.yaml
    
  2. Create your namespace

    # OpenShift:
    $ oc new-project your-project
    # --- OR ---
    # Kubernetes:
    $ kubectl create namespace your-project
    $ kubectl config set-context --current --namespace=your-project
    
  3. Set up RBAC.

    $ oc create -f config/rbac/role.yaml
    $ oc create -f config/rbac/role_binding.yaml
    $ oc create -f config/rbac/service_account.yaml
    
  4. Deploy the operator in your namespace.

    $ oc create -f config/manager/manager.yaml
    

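Before proceeding, you can verify that the operator is running; the pod should reach the Running state within a minute or so (the exact pod name will differ):

$ oc get pods | grep operator
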
New Installation

Create the Custom Resource

Make any necessary changes (e.g. applicationDomain) that apply to your environment, then create the CR (Custom Resource). The operator will take action to deploy the application based on the information in the CR. It may take several minutes for the database to be initialized and the workers to enter a ready state.
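
For example, a minimal CR might look like the following sketch; the metadata name is arbitrary, and the sample file used below contains the full set of available options.

apiVersion: manageiq.org/v1alpha1
kind: ManageIQ
metadata:
  name: manageiq
spec:
  applicationDomain: your_application.apps.example.com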

$ oc create -f config/samples/_v1alpha1_manageiq.yaml

Logging In

If using internal authentication, you can now log in to the MIQ web interface using the default username and password (admin/smartvm). If using external authentication, you can now log in as a user who is a member of the initialAdminGroupName specified in the CR.
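
The hostname to point your browser at can be found from the route (OpenShift) or ingress (Kubernetes) created by the operator; see the Troubleshooting section for example output.

$ oc get routes
# --- OR ---
$ kubectl get ingress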

Migrating from Appliances

Notes

  • Multi-server / multi-zone: The current podified architecture limits us to running a single server and a single zone per Kubernetes namespace. Therefore, when migrating from a multi-appliance and/or multi-zone appliance architecture, you will need to choose a single server whose identity the podified deployment will assume. This server should have the UI and web service roles enabled before taking the database backup to ensure that those workers will start when deployed in the podified environment. All desired roles and settings will need to be configured on this server.

  • Multi-region: Multi-region deployments are slightly more complicated in a podified environment since PostgreSQL isn’t as easily exposed outside the project / cluster. If all of the region databases are running outside of the cluster, or all of the remote region databases are running outside of the cluster and the global database is in the cluster, everything is configured in the same way as on appliances. If the global region database is migrated from an appliance to a pod, the replication subscriptions will need to be recreated. If any of the remote region databases are running in the cluster, the postgresql service for those databases will need to be exposed using a node port. To publish the postgresql service on a node port, patch the service using $ kubectl patch service/postgresql --patch '{"spec": {"type": "NodePort"}}'. You will then see the node port (31871 in the example below) listed alongside the internal service port (5432). This node port can be used along with the IP address of any node in the cluster to access the postgresql service.

$ oc get service/postgresql
NAME         TYPE       CLUSTER-IP   EXTERNAL-IP   PORT(S)          AGE
postgresql   NodePort   192.0.2.1    <none>        5432:31871/TCP   2m

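For example, assuming a (hypothetical) node IP address of 198.51.100.10 and the node port shown above, the remote region database could be reached with a standard client such as psql:

$ psql -h 198.51.100.10 -p 31871 -U <your database username> vmdb_production
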
Collect data from the appliance

  1. Take a backup of the database

    $ pg_dump -Fc -d vmdb_production > /root/pg_dump
    
  2. Export the encryption key and Base64 encode it for the Kubernetes Secret.

    $ vmdb && rails r "puts Base64.encode64(ManageIQ::Password.key.to_s)"
    
  3. Get the region number

    $ vmdb && rails r "puts MiqRegion.my_region.region"
    
  4. Get the GUID of the server that you want to run as.

    $ vmdb && cat GUID
    

Restore the backup into the kubernetes environment

  1. Create a YAML file defining the Custom Resource (CR). Minimally you’ll need the following:

    apiVersion: manageiq.org/v1alpha1
    kind: ManageIQ
    metadata:
      name: <friendly name for your CR instance>
    spec:
      applicationDomain: <your application domain name>
      databaseRegion: <region number from the appliance above>
      serverGuid: <GUID value from appliance above>
    
  2. Create the CR in your namespace. Once created, the operator will create several additional resources and start deploying the app.

    $ oc create -f <file name from above>
    
  3. Edit the app secret, replacing the “encryption-key” value with the value exported from the appliance above.

    $ oc edit secret app-secrets
    
  4. Temporarily hijack the orchestrator by patching its deployment with a command override:

    $ kubectl patch deployment orchestrator -p '{"spec":{"template":{"spec":{"containers":[{"name":"orchestrator","command":["sleep", "1d"]}]}}}}'
    
  5. Shell into the orchestrator container:

    $ oc rsh deploy/orchestrator
    
    $ cd /var/www/miq/vmdb
    $ source ./container_env
    $ DISABLE_DATABASE_ENVIRONMENT_CHECK=1 rake db:drop db:create
    
  6. oc rsh into the database pod and restore the database backup (one way to copy the backup file into the pod is shown in the example after this list):

    $ cd /var/lib/pgsql
    # --- download your backup here ---
    $ pg_restore -v -d vmdb_production <your_backup_file>
    $ rm -f <your_backup_file>
    $ exit
    
  7. Back in the orchestrator pod from step 5:

    $ rake db:migrate
    $ exit
    
  8. Delete the orchestrator deployment to remove the command override that we added above:

    $ kubectl delete deployment orchestrator
    

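As referenced in step 6, one way to copy the database backup into the database pod is oc cp, run from a machine that has both the backup file and access to the cluster (the pod name shown is illustrative; use the name reported by oc get pods):

$ oc cp /root/pg_dump postgresql-5f954fdbd5-tnlmf:/var/lib/pgsql/pg_dump
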
Done! The orchestrator will start deploying the rest of the pods required to run the application.

Customizing the Installation

Configuring the application domain name

It is necessary to set the applicationDomain in the CR if you are running in a production cluster. For a CodeReady Containers cluster, it should be something like miqproject.apps-crc.testing.
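
For example, in the CR:

applicationDomain: miqproject.apps-crc.testing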

External postgresql

Running with an external PostgreSQL server is an option; if you want the default internal PostgreSQL, you can skip this step. To use an external database, manually create the following secret, substituting your values, before you create the CR. If you want to secure the connection, also include the optional sslmode=verify-full and rootcertificate parameters when you create the secret.

$ oc create secret generic postgresql-secrets \
  --from-literal=dbname=vmdb_production \
  --from-literal=hostname=<your hostname> \
  --from-literal=port=5432 \
  --from-literal=password=<your password> \
  --from-literal=username=<your username> \
  --from-literal=sslmode=verify-full \
  --from-file=rootcertificate=path/to/optional/certificate.pem

Note: If desired, the secret name can be customized by setting databaseSecret in the CR.
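
For example:

databaseSecret: <name of your secret>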

Using TLS to encrypt connections between pods inside the cluster:

Create a secret containing all of the certificates

The certificates should all be signed by a CA, and that CA certificate should be uploaded as root_crt so that it can be used to verify connection validity. If the secret is named internal-certificates-secret, no changes are needed in the CR; if you choose a different name, set it in the internalCertificatesSecret field of the CR.

    oc create secret generic internal-certificates-secret \
      --from-file=root_crt=./certs/root.crt \
      --from-file=root_key=./certs/root.key \
      --from-file=httpd_crt=./certs/httpd.crt \
      --from-file=httpd_key=./certs/httpd.key \
      --from-file=kafka_crt=./certs/kafka.crt \
      --from-file=kafka_key=./certs/kafka.key \
      --from-file=memcached_crt=./certs/memcached.crt \
      --from-file=memcached_key=./certs/memcached.key \
      --from-file=postgresql_crt=./certs/postgresql.crt \
      --from-file=postgresql_key=./certs/postgresql.key \
      --from-file=api_crt=./certs/api.crt \
      --from-file=api_key=./certs/api.key \
      --from-file=remote_console_crt=./certs/remote-console.crt \
      --from-file=remote_console_key=./certs/remote-console.key \
      --from-file=ui_crt=./certs/ui.crt \
      --from-file=ui_key=./certs/ui.key

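If you chose a secret name other than internal-certificates-secret, specify it in the CR:

internalCertificatesSecret: <name of your secret>
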
Generating certificates:

We need a CA and a certificate for each service. Each certificate needs to be valid for the internal Kubernetes service name (e.g. httpd, postgresql, etc.), and the services backing the route (ui and api) also need their certificates to include a SAN with the application domain name. For example, the certificate for the UI needs to be valid for the hostname ui and also your_application.apps.example.com. If you want a script that will generate these certificates for you, see: https://github.com/ManageIQ/manageiq-pods/blob/master/tools/cert_generator.rb
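
As an illustration, the following openssl commands are a minimal sketch (not the referenced script) that generates the CA and a certificate for the ui service; the file names match the secret keys used above, and the SAN hostnames are examples to replace with your own:

$ mkdir -p certs
# Create the CA used to sign all of the service certificates
$ openssl req -new -x509 -nodes -newkey rsa:2048 -days 365 \
    -subj "/CN=internal-ca" -keyout certs/root.key -out certs/root.crt
# Create a key and certificate signing request for the ui service
$ openssl req -new -nodes -newkey rsa:2048 \
    -subj "/CN=ui" -keyout certs/ui.key -out certs/ui.csr
# Sign it with the CA, adding the SANs the ui service needs
$ openssl x509 -req -days 365 -in certs/ui.csr \
    -CA certs/root.crt -CAkey certs/root.key -CAcreateserial \
    -extfile <(echo "subjectAltName=DNS:ui,DNS:your_application.apps.example.com") \
    -out certs/ui.crt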

Configuring external messaging

It is possible to run with an external Kafka messaging server. To do this, create the required secret with the correct parameters using the following example and provide that secret name as kafkaSecret in the CR.

$ kubectl create secret generic kafka-secrets --from-literal=hostname=<your fqdn> --from-literal=username=<your username> --from-literal=password=<your password>

If you decide to use a secret name other than kafka-secrets, you need to specify that in the CR.

kafkaSecret: <name of your secret>

Creating a custom TLS secret

It is possible to run with a custom TLS certificate. To do this, create the required secret with the correct parameters using the following example and provide that secret name as tlsSecret in the CR.

$ oc create secret tls tls-secret --cert=tls.crt --key=tls.key

If you decide to use a secret name other than tls-secret, you need to specify that in the CR.

tlsSecret: <name of your tls secret>

Configuring an image pull secret

If authentication is required in order to pull the images, a secret containing the credentials needs to be created. The following is an example of creating the secret.

$ kubectl create secret docker-registry image-pull-secret --docker-username=<your registry username> --docker-password=<your registry password> --docker-server=<your registry server>

If you decide to use a secret name other than image-pull-secret, you need to specify that in the CR.

imagePullSecret: <name of your pull secret>

Configuring OpenID-Connect Authentication

To run with OpenID-Connect authentication, customize the following example as required to fit your environment. This example was tested with Keycloak version 11.0.

Create a secret containing the OpenID-Connect client ID and client secret. The values for CLIENT_ID and CLIENT_SECRET come from your authentication provider’s client definition.

$ oc create secret generic <name of your kubernetes secret> --from-literal=CLIENT_ID=<your auth provider client ID> --from-literal=CLIENT_SECRET=<your auth provider client secret>

Modify the CR with the following values:

httpdAuthenticationType: openid-connect
oidcProviderURL: https://<your keycloak FQDN>/auth/realms/<your realm>/.well-known/openid-configuration
oidcClientSecret: <name of your kubernetes secret>

Configuring OpenID-Connect with a CA Certificate

To configure OpenID-Connect with a CA certificate follow these steps:

Acquire the CA certificate (precisely how to obtain it is beyond the scope of this document) and store it in a secret using the following example.

$ oc create secret generic <name of your kubernetes OIDC CA cert> --from-file=<path to your OIDC CA cert file>

Modify the CR to identify the secret just created.

oidcCaCertSecret: <name of your kubernetes OIDC CA cert>

Using your own images

If you built your own custom application images and want to deploy them, you can specify the image names in the CR using the following example.

applicationDomain: miqproject.apps-crc.testing
orchestratorImage: docker.io/<your_username_or_organization>/<your_prefix>-orchestrator:latest
baseWorkerImage: docker.io/<your_username_or_organization>/<your_prefix>-base-worker:latest
uiWorkerImage: docker.io/<your_username_or_organization>/<your_prefix>-ui-worker:latest
webserverWorkerImage: docker.io/<your_username_or_organization>/<your_prefix>-webserver-worker:latest

If you built your own operator image, you’ll need to edit the operator deployment to specify that.
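
For example, you could change the image field in config/manager/manager.yaml before creating the deployment (only the relevant line is shown; the image name follows the same pattern as above):

image: docker.io/<your_username_or_organization>/manageiq-operator:latest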

Installing VMware VDDK on containers

Execution of SmartState Analysis on virtual machines within a VMware environment requires the Virtual Disk Development Kit (VDDK) to be installed.

To install the VDDK in containers, we take an existing image and add another layer on top of it. Once built, this image will need to be pushed to a registry and the CR updated to reference it.

  1. Download the required VDDK version (VMware-vix-disklib-[version].x86_64.tar.gz) from the VMware website.

    Note:

    • If you do not already have a login ID for VMware, then you will need to create one. At the time of this writing, the file can be found by navigating to Downloads > vSphere. Select the version from the drop-down list, then click the Drivers & Tools tab. Expand Automation Tools and SDKs, and click Go to Downloads next to the VMware vSphere Virtual Disk Development Kit version. Alternatively, find the file by searching for it on the VMware site.

    • See VMware documentation for information about their policy concerning backward and forward compatibility for VDDK.

  2. Set up a temporary directory where we will build the image

    $ mkdir -p /tmp/vddk_container/container-assets
    
  3. Copy the tar file into the container-assets directory. For example:

    $ cp ~/Downloads/VMware-vix-disklib-*.tar.gz /tmp/vddk_container/container-assets
    
  4. Add the following Dockerfile as /tmp/vddk_container/Dockerfile and substitute the second FROM based on the image and version that you’re running.

    FROM registry.access.redhat.com/ubi8/ubi as vddk
    
    COPY container-assets/ /vddk/
    
    RUN cd /vddk && \
        filename=$(find . -name VMware-*) && \
        if [[ -f $filename ]] ; then tar -zxvf VMware-* ; fi && \
        mkdir -p vmware-vix-disklib-distrib && \
        touch /vddk/vmware-vix-disklib-distrib/.keep
    
    ################################################################################
    
    ### IMPORTANT: Modify the following image and tag as necessary to reflect the version that you're running
    FROM your_registry.example.com/namespace/manageiq-base-worker:latest
    
    COPY --from=vddk /vddk/vmware-vix-disklib-distrib/ /usr/lib/vmware-vix-disklib/
    
    RUN /tmp/install-vmware-vddk
    
  5. Build the image and tag it appropriately for your registry, then push it to the registry.

    $ cd /tmp/vddk_container/
    $ docker build . -t your_registry.example.com/namespace/manageiq-base-worker:latest_vddk
    $ docker push your_registry.example.com/namespace/manageiq-base-worker:latest_vddk
    
  6. Update the CR to use the base worker image you just pushed to the registry.

    baseWorkerImage: your_registry.example.com/namespace/manageiq-base-worker:latest_vddk
    

The operator will now update the orchestrator and worker deployments to reflect this change.

Uninstalling

  1. Remove the CR that was created during installation

    WARNING: This includes Persistent Volume Claims and secrets, so all data stored by the application will be removed. The following command removes everything except for the operator:

    $ oc delete -f config/samples/_v1alpha1_manageiq.yaml
    
  2. Remove the Operator deployment

    If you deployed the operator using Step 4 of the namespace preparation, run the following to remove it:

    $ oc delete -f config/manager/manager.yaml
    
  3. Remove the RBAC that was added in Step 3

    $ oc delete -f config/rbac/role.yaml
    $ oc delete -f config/rbac/role_binding.yaml
    $ oc delete -f config/rbac/service_account.yaml
    
  4. Remove the Namespace created in Step 2 if it is no longer needed (optional)

    $ oc delete project your-project
    
  5. Remove the CRD (optional)

    WARNING: Ensure that there are no other application instances in any other namespace in the cluster since this is a cluster-wide change.

    $ oc delete -f config/crd/bases/manageiq.org_manageiqs.yaml
    

Troubleshooting

Check readiness of the pods

Note: Please allow a few minutes for the application to start. Initial installs and upgrades take extra time due to database initialization. HTTPS traffic will not be served until the pods are running and ready. The READY column denotes the number of ready containers out of the total for each pod. The operator deploys the following pods:

$ oc get pods
NAME                                     READY     STATUS    RESTARTS   AGE
httpd-754985464b-4dzzx                   1/1       Running   0          37s
orchestrator-5997776478-vx4v9            1/1       Running   0          37s
memcached-696479b955-67fs6               1/1       Running   0          37s
postgresql-5f954fdbd5-tnlmf              1/1       Running   0          37s

The orchestrator waits for memcached and postgresql to be ready, migrates and seeds the database, and then begins to start worker pods. After a few minutes you will see that the initial set of worker pods has been deployed and the user interface should be accessible.

$ oc get pods
NAME                                     READY     STATUS    RESTARTS   AGE
event-handler-747574c54c-xpcvf           1/1       Running   0          32m
generic-55cc84f79d-gwf5v                 1/1       Running   0          32m
generic-55cc84f79d-w4vzs                 1/1       Running   0          32m
httpd-754985464b-4dzzx                   1/1       Running   0          37m
orchestrator-5997776478-vx4v9            1/1       Running   0          37m
memcached-696479b955-67fs6               1/1       Running   0          37m
postgresql-5f954fdbd5-tnlmf              1/1       Running   0          37m
priority-7b6666cdcd-5hkkm                1/1       Running   0          32m
priority-7b6666cdcd-rcf7l                1/1       Running   0          32m
remote-console-6958c4cc7b-5kmmj          1/1       Running   0          32m
reporting-85c8488848-p5fb6               1/1       Running   0          32m
reporting-85c8488848-z7kjp               1/1       Running   0          32m
schedule-6fd7bc5688-ptsxp                1/1       Running   0          32m
ui-5b8c86f6f9-jhd9w                      1/1       Running   0          32m
web-service-858f55f55d-5tmcr             1/1       Running   0          32m

Under normal circumstances, the entire first-time deployment process should take around 10 minutes; any issues can be seen by examining the deployment events and pod logs. The time can vary significantly based on the performance of the environment, but subsequent boots should be much faster. Version upgrades can also be time consuming, depending on the size of the database and the number and difficulty of the migrations that need to be applied. Progress can be monitored in the orchestrator logs.
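
For example, the orchestrator logs can be followed with either of the following, depending on your CLI:

$ oc logs -f deploy/orchestrator
# --- OR ---
$ kubectl logs -f deployment/orchestrator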

Obtain host information from ingress/route

An ingress should have been deployed. If internal SSL is enabled and you are running in OpenShift, a route is created.

$ oc get ingress
NAME      HOSTS                              ADDRESS   PORTS     AGE
httpd     my-app.apps.example.com                     80, 443   56s
$ oc get routes
NAME          HOST/PORT                          PATH      SERVICES   PORT      TERMINATION     WILDCARD
httpd-qlvmj   my-app.apps.example.com           /         httpd      80        edge/Redirect   None

Examine the output and point your web browser to the reported URL/HOST.

Get a shell on the MIQ pod

On the application pods, source the container_env before attempting any rails or rake commands.

$ oc rsh orchestrator-78786d6b44-gcqdq
sh-4.4$ cd /var/www/miq/vmdb/
sh-4.4$ source ./container_env

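From there, rails and rake commands work as usual, for example:

sh-4.4$ rails console
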
Allow docker.io/manageiq images in kubernetes

Depending on your cluster’s configuration, Kubernetes may not allow deployment of images from docker.io/manageiq. If so, deploying the operator may raise an error:

Error from server: error when creating "deploy/operator.yaml": admission webhook "trust.hooks.securityenforcement.admission.xxx" denied the request:
Deny "docker.io/manageiq/manageiq-operator:latest", no matching repositories in ClusterImagePolicy and no ImagePolicies in the "YYY" namespace

To allow images from docker.io/manageiq, edit the clusterimagepolicies and add docker.io/manageiq/* to the list of allowed repositories:

kubectl edit clusterimagepolicies $(kubectl get clusterimagepolicies --no-headers | awk '{print $1}')

For example:

...
spec:
  repositories:
  ...
  - name: docker.io/icpdashdb/*
  - name: docker.io/istio/proxyv2:*
  - name: docker.io/library/busybox:*
  - name: docker.io/manageiq/*
  ...

After saving this change, docker.io/manageiq image deployment should now be allowed.