Run a single MigratoryData instance
To start a single MigratoryData instance and allow clients to connect to it on port 8800, install Docker and run the following commands:
$ docker pull migratorydata/server:latest
$ docker run \
-d --name my_migratorydata -p 8800:8800 \
migratorydata/server:latest
On ARM-based machines (such as Apple Silicon), add the --platform flag to run the x86-64 image under emulation:
$ docker run --platform linux/amd64 \
-d --name my_migratorydata -p 8800:8800 \
migratorydata/server:latest
You should now be able to connect to http://yourhostname:8800
and run the demo app provided with the MigratoryData server, where yourhostname
is the DNS name or IP address of the machine running this MigratoryData instance, resolvable and reachable from your browser.
You can see the logs of the container using the following command:
$ docker logs my_migratorydata
To stop and remove the container use:
$ docker stop my_migratorydata
$ docker rm my_migratorydata
Custom Configuration
It is possible to customize every aspect of the MigratoryData server running in a Docker container as described below.
MIGRATORYDATA_EXTRA_OPTS
Any parameter of the MigratoryData server can be defined, or can override a parameter defined in the default configuration file, by setting it as an extra option in the MIGRATORYDATA_EXTRA_OPTS
environment variable, using the syntax:
-Dparameter=value
Note that no single or double quotes are used around either the parameter or its value. In addition, the space character is NOT allowed in the value of a parameter.
The complete list of parameters is available in the Configuration Guide.
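The constraints above can be checked before starting a container. The following is a minimal sketch using a hypothetical helper (not part of the image) that assembles a value for MIGRATORYDATA_EXTRA_OPTS and rejects parameter values containing spaces; the Memory and LogLevel parameters are taken from the examples later in this document:

```shell
# Hypothetical helper: assemble MIGRATORYDATA_EXTRA_OPTS from key=value pairs,
# rejecting values that contain spaces (not allowed by the option syntax).
build_extra_opts() {
  local opts="" kv
  for kv in "$@"; do
    case "$kv" in
      *" "*) echo "error: space not allowed in '$kv'" >&2; return 1 ;;
    esac
    opts="$opts -D$kv"
  done
  echo "${opts# }"   # trim the leading space
}

EXTRA_OPTS=$(build_extra_opts Memory=256MB LogLevel=INFO)
echo "$EXTRA_OPTS"   # prints: -DMemory=256MB -DLogLevel=INFO
```

The resulting string can then be passed to the container, for example with docker run -e MIGRATORYDATA_EXTRA_OPTS="$EXTRA_OPTS" ... as shown in the examples below.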
Java Options
Three environment variables, MIGRATORYDATA_JAVA_GC_LOG_OPTS
, MIGRATORYDATA_JAVA_GC_OPTS
, and MIGRATORYDATA_JAVA_EXTRA_OPTS
, can be used to customize, respectively, the garbage collection logging options, the garbage collectors to be used, and various other Java options.
All of these Java options come with default values which are sufficient for most use cases. Therefore, in most instances, it is not necessary to define these Java-related environment variables.
Adding a License Key
To use a license key with this image, override the parameter LicenseKey
of the default configuration file using an extra option as follows:
$ docker run -d -e MIGRATORYDATA_EXTRA_OPTS='-DLicenseKey=yourlicensekey' \
--name my_migratorydata -p 8800:8800 migratorydata/server:latest
where yourlicensekey
is the license key obtained from MigratoryData for evaluation, testing, or production use.
Enabling JMX Monitoring
To enable JMX monitoring for MigratoryData, define the JMX-related parameters as usual and publish the JMX port to the host as follows:
$ docker run -d \
-e MIGRATORYDATA_EXTRA_OPTS='-DLicenseKey=yourlicensekey -DMonitor=JMX -DMonitorUsername=admin -DMonitorPassword=pass \
-DMonitorJMX.Listen=*:3000 -DMonitorJMX.Authentication=true -DMonitorJMX.Encryption=false' \
--name my_migratorydata -p 8800:8800 -p 3000:3000 migratorydata/server:latest
You should now be able to connect with any JMX client to yourhostname:3000
using the credentials defined above (admin/pass
). Please note that in order to access the JMX monitoring with Java's jconsole
JMX client, you will need to provide two extra Java options using MIGRATORYDATA_JAVA_EXTRA_OPTS
as follows:
$ docker run -d \
-e MIGRATORYDATA_EXTRA_OPTS='-DLicenseKey=yourlicensekey -DMonitor=JMX -DMonitorUsername=admin -DMonitorPassword=pass \
-DMonitorJMX.Listen=*:3000 -DMonitorJMX.Authentication=true -DMonitorJMX.Encryption=false' \
-e MIGRATORYDATA_JAVA_EXTRA_OPTS='-Djava.net.preferIPv4Stack=true -Djava.rmi.server.hostname=yourhostname' \
--name my_migratorydata -p 8800:8800 -p 3000:3000 migratorydata/server:latest
Logging
Besides the logs written to the standard output, which are accessible with docker logs my_migratorydata
, this image also writes its logs to a folder, which defaults to /migratorydata/logs
. You can change the log folder location as follows:
$ docker run -d -e MIGRATORYDATA_EXTRA_OPTS='-DLicenseKey=yourlicensekey \
-DLogFolder=/myvolume/migratorydata/logs' -p 8800:8800 migratorydata/server:latest
You can use any parameter related to logging, including verbosity, rotation, compression, etc. For example, to record the access logs, use:
$ docker run -d -e MIGRATORYDATA_EXTRA_OPTS='-DLicenseKey=yourlicensekey \
-DLogFolder=/myvolume/migratorydata/logs \
-DAccessLog=true' -p 8800:8800 migratorydata/server:latest
Extensions
To deploy one or more extensions for the MigratoryData server, mount a volume containing the extensions into the MigratoryData standard extensions folder, which is /migratorydata/extensions
.
For example, supposing you developed an entitlement extension using the MigratoryData Entitlement Extension API and deployed extension.jar
to the (persistent) folder /myvolume/migratorydata/extensions
, then, in order to load this entitlement extension, run:
$ docker run -d -e MIGRATORYDATA_EXTRA_OPTS='-DLicenseKey=yourlicensekey -DEntitlement=Custom' \
-v /myvolume/migratorydata/extensions:/migratorydata/extensions \
--name mymigratorydata -p 8800:8800 migratorydata/server:latest
Alternatively, you can load your entitlement extension by creating a new image derived from migratorydata/server
as follows:
FROM migratorydata/server:latest
COPY extension.jar /migratorydata/extensions/extension.jar
Then, build with docker build -t custom_migratorydata .
and run:
$ docker run --name my_custom_migratorydata -d -p 8800:8800 custom_migratorydata
MigratoryData clustering on Kubernetes
Here is an example configuration which can be used to deploy a cluster of three MigratoryData servers on Kubernetes.
#
# Service used by the MigratoryData cluster to communicate with the clients
#
apiVersion: v1
kind: Service
metadata:
  name: migratorydata-cs
  # uncomment the next two lines to deploy the cluster using Application Gateway
  #annotations:
  #  service.beta.kubernetes.io/azure-load-balancer-internal: "true"
  labels:
    app: migratorydata
spec:
  type: LoadBalancer
  ports:
    - name: client-port
      port: 80
      protocol: TCP
      targetPort: 8800
  selector:
    app: migratorydata
---
#
# Headless service used for inter-cluster communication
#
apiVersion: v1
kind: Service
metadata:
  name: migratorydata-hs
  labels:
    app: migratorydata
spec:
  clusterIP: None
  ports:
    - name: inter-cluster1
      port: 8801
      protocol: TCP
      targetPort: 8801
    - name: inter-cluster2
      port: 8802
      protocol: TCP
      targetPort: 8802
    - name: inter-cluster3
      port: 8803
      protocol: TCP
      targetPort: 8803
    - name: inter-cluster4
      port: 8804
      protocol: TCP
      targetPort: 8804
  selector:
    app: migratorydata
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: migratorydata-pdb
spec:
  minAvailable: 3 # The value must be equal or higher than the number of seed members 🅐
  selector:
    matchLabels:
      app: migratorydata
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: migratorydata
spec:
  selector:
    matchLabels:
      app: migratorydata
  serviceName: migratorydata-hs
  replicas: 3 # The desired number of cluster members 🅑
  updateStrategy:
    type: RollingUpdate
  podManagementPolicy: OrderedReady
  template:
    metadata:
      labels:
        app: migratorydata
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: "app"
                      operator: In
                      values:
                        - migratorydata
                topologyKey: "kubernetes.io/hostname"
      containers:
        - name: migratorydata-cluster
          imagePullPolicy: Always
          image: migratorydata/server:latest
          env:
            - name: MIGRATORYDATA_JAVA_EXTRA_OPTS
              value: "-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap"
            - name: MIGRATORYDATA_EXTRA_OPTS
              value: "-DMemory=128MB \
                -DClusterDeliveryMode=Guaranteed \
                -DLogLevel=INFO \
                -DX.ConnectionOffload=true \
                -DClusterSeedMemberCount=3" # Define the number of seed members 🅒
          command:
            - bash
            - "-c"
            - |
              set -x
              HOST=`hostname -s`
              DOMAIN=`hostname -d`
              CLUSTER_PORT=8801
              MAX_REPLICAS=5 # Define the maximum number of cluster members 🅓
              if [[ $HOST =~ (.*)-([0-9]+)$ ]]; then
                NAME=${BASH_REMATCH[1]}
              fi
              CLUSTER_MEMBER_LISTEN=$HOST.$DOMAIN:$CLUSTER_PORT
              echo $CLUSTER_MEMBER_LISTEN
              MIGRATORYDATA_EXTRA_OPTS="$MIGRATORYDATA_EXTRA_OPTS -DClusterMemberListen=$CLUSTER_MEMBER_LISTEN"
              CLUSTER_MEMBERS=""
              for (( i=1; i < $MAX_REPLICAS; i++ ))
              do
                CLUSTER_MEMBERS="$CLUSTER_MEMBERS$NAME-$((i-1)).$DOMAIN:$CLUSTER_PORT,"
              done
              CLUSTER_MEMBERS="$CLUSTER_MEMBERS$NAME-$((MAX_REPLICAS-1)).$DOMAIN:$CLUSTER_PORT"
              echo $CLUSTER_MEMBERS
              MIGRATORYDATA_EXTRA_OPTS="$MIGRATORYDATA_EXTRA_OPTS -DClusterMembers=$CLUSTER_MEMBERS"
              echo $MIGRATORYDATA_EXTRA_OPTS
              export MIGRATORYDATA_EXTRA_OPTS
              ./start-migratorydata.sh
          resources:
            requests:
              memory: "256Mi"
              cpu: "0.5"
          ports:
            - name: client-port
              containerPort: 8800
            - name: inter-cluster1
              containerPort: 8801
            - name: inter-cluster2
              containerPort: 8802
            - name: inter-cluster3
              containerPort: 8803
            - name: inter-cluster4
              containerPort: 8804
          readinessProbe:
            tcpSocket:
              port: 8800
            initialDelaySeconds: 10
            periodSeconds: 5
          livenessProbe:
            tcpSocket:
              port: 8800
            initialDelaySeconds: 10
            periodSeconds: 5
The manifest above contains a Service, a headless Service, a PodDisruptionBudget, and a StatefulSet. The service handles the clients of the MigratoryData cluster. The headless service is used for inter-cluster communication; it provides the DNS records corresponding to the instances of the MigratoryData cluster.
In order to deploy the MigratoryData cluster on Kubernetes, copy the example configuration above into a file named, say, migratorydata-cluster.yaml
and run the following command:
$ kubectl apply -f migratorydata-cluster.yaml
Run the following command to check that the three pods of the example configuration are up and running:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
migratorydata-0 1/1 Running 0 2m52s
migratorydata-1 1/1 Running 0 2m40s
migratorydata-2 1/1 Running 0 2m25s
and you can check the logs of each cluster member by running a command as follows:
$ kubectl logs migratorydata-0
Also, run the following command to check that the two services of the example configuration are up and running:
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 37m
migratorydata-cs LoadBalancer 10.0.58.189 YourExternalIP 80:31596/TCP 4m8s
migratorydata-hs ClusterIP None <none> 8801/TCP,8802/TCP,8803/TCP,8804/TCP 4m7s
You should now be able to connect to http://YourExternalIP
and run the demo app provided with each MigratoryData server of the cluster, where YourExternalIP is the external IP address assigned by Kubernetes to the client service migratorydata-cs
, which can be obtained as shown above using the command kubectl get svc
.
Finally, let’s make a few remarks about the configuration above. We used the default port 8800
for listening for clients; this port is mapped to port 80
by the migratorydata-cs
service. Port 8801
is used by each cluster member for its ClusterMemberListen
parameter, and therefore, as in traditional deployments, port 8801
together with the next three consecutive ports, i.e. 8802
, 8803
, and 8804
, is reserved for inter-cluster communication.
Also, note that we’ve used the MIGRATORYDATA_EXTRA_OPTS
environment variable to customize each cluster member, as explained in more detail in the section “Custom Configuration” above.
The MIGRATORYDATA_JAVA_EXTRA_OPTS
environment variable is also used to provide Java options which cause the JVM to take into account the hardware resources assigned to each pod.
MigratoryData Scaling on Kubernetes
Before reading this section, it is recommended to read about the Elasticity feature of MigratoryData.
Note that in the example YAML above, the shell variable MAX_REPLICAS
🅓 defines a cluster with a maximum of 5
members. However, only the first 3
members of the cluster are created, as specified by the replicas
field 🅑, which is sufficient to satisfy the minimum criterion specified with the minAvailable
field 🅐. Also note that the number of seed members has been configured to 3
using the ClusterSeedMemberCount parameter 🅒, so the requirement that the value of the minAvailable
field 🅐 be equal to or higher than the number of seed members is satisfied.
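The member-list computation performed by the startup script 🅓 can be reproduced outside Kubernetes. The following is a minimal sketch; the pod hostname migratorydata-0 and the headless-service domain (with the namespace default) are assumed values for illustration:

```shell
# Reproduce, locally, the ClusterMembers list that the StatefulSet startup
# script derives from the pod hostname (assumed values below).
HOST=migratorydata-0                               # pod hostname (assumed)
DOMAIN=migratorydata-hs.default.svc.cluster.local  # headless-service domain (namespace assumed)
CLUSTER_PORT=8801
MAX_REPLICAS=5

# Strip the ordinal suffix to recover the StatefulSet name.
if [[ $HOST =~ (.*)-([0-9]+)$ ]]; then
  NAME=${BASH_REMATCH[1]}
fi

CLUSTER_MEMBERS=""
for (( i=1; i < MAX_REPLICAS; i++ )); do
  CLUSTER_MEMBERS="$CLUSTER_MEMBERS$NAME-$((i-1)).$DOMAIN:$CLUSTER_PORT,"
done
CLUSTER_MEMBERS="$CLUSTER_MEMBERS$NAME-$((MAX_REPLICAS-1)).$DOMAIN:$CLUSTER_PORT"
echo "$CLUSTER_MEMBERS"
```

This prints a comma-separated list of the five possible member addresses, migratorydata-0 through migratorydata-4, each on port 8801, which the script passes to the server via the ClusterMembers parameter.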
In this example, two additional cluster members could be added, then removed either manually or using autoscaling according to the load of the system.
Manual Scaling
In the example above, you can scale the cluster out from three members up to five members by modifying the value of the replicas
field 🅑. For example, if the load of your system increases substantially, and supposing your nodes have enough resources available, you can add two new members to the cluster by modifying the replicas
field as follows:
$ kubectl scale statefulsets migratorydata --replicas=5
Then, if the load of your system decreases, you might remove one member from the cluster by modifying the replicas
field as follows:
$ kubectl scale statefulsets migratorydata --replicas=4
Note that you cannot assign to the replicas
field a value higher than the maximum number of members defined by the shell variable MAX_REPLICAS
🅓, or lower than the minimum number of cluster members defined by the minAvailable
field 🅐.
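These bounds can be expressed as a small pre-scale guard. The following is a hypothetical sketch (check_replicas is not part of the deployment) that rejects a replica count outside the range set by the PodDisruptionBudget and the startup script:

```shell
# Hypothetical guard: validate a desired replica count against the bounds
# discussed above before calling `kubectl scale`.
MIN_AVAILABLE=3   # from the PodDisruptionBudget's minAvailable field
MAX_REPLICAS=5    # from the startup script's MAX_REPLICAS variable

check_replicas() {
  local n=$1
  if (( n < MIN_AVAILABLE )) || (( n > MAX_REPLICAS )); then
    echo "invalid replica count: $n (allowed: $MIN_AVAILABLE..$MAX_REPLICAS)"
    return 1
  fi
  echo "ok: $n"
}

check_replicas 4   # prints: ok: 4
```

It could then gate the scaling command, e.g. check_replicas 4 && kubectl scale statefulsets migratorydata --replicas=4.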
Autoscaling
Manual scaling is practical if the load of your system changes gradually. Otherwise, you can use the autoscaling feature of Kubernetes.
Kubernetes can monitor the load of your system, typically expressed in CPU usage, and scale your MigratoryData cluster up and down by automatically modifying the replicas
field.
In the example above, to add new members (up to a maximum of 5
cluster members) when the CPU usage of the existing members rises above 50%, and to remove members when the CPU usage of the existing members drops below 50%, use the following command:
$ kubectl autoscale statefulset migratorydata --cpu-percent=50 --min=3 --max=5
Alternatively, you can use a YAML manifest as follows:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: migratorydata-autoscale # you can use any name here
spec:
  maxReplicas: 5
  minReplicas: 3
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: migratorydata
  targetCPUUtilizationPercentage: 50
Save it to a file named, for example, migratorydata_autoscale.yaml
, then apply it as follows:
kubectl apply -f migratorydata_autoscale.yaml
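On clusters where the autoscaling/v2 API is available, an equivalent manifest can be written with the v2 schema. This is a sketch, not verified against any particular cluster version:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: migratorydata-autoscale
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: migratorydata
  minReplicas: 3
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
```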
Now, you can display information about the autoscaler object above using the following command:
kubectl get hpa
and display CPU usage of cluster members with:
kubectl top pods
While testing cluster autoscaling, be aware that the Kubernetes autoscaler gets the CPU usage from the cluster members periodically, so autoscaling might not appear immediate. However, this is normal Kubernetes behavior.
Node Autoscaling
Cloud infrastructure like Azure Kubernetes Service (AKS) can be configured to automatically spin up additional nodes if the resources of the existing nodes are insufficient to create new cluster members.
For example, to create an AKS cluster of minimum 3
nodes and maximum 5
nodes with node autoscaling enabled, use something like the following:
# install the AKS client
$ az aks install-cli
# log in to AKS
$ az login
# create a resource group
$ az group create --name myMigratoryDataGroup --location eastus
# create the cluster and enable the cluster autoscaling
$ az aks create \
--resource-group myMigratoryDataGroup \
--name myMigratoryDataCluster \
--node-count 3 \
--vm-set-type VirtualMachineScaleSets \
--enable-addons monitoring \
--generate-ssh-keys \
--load-balancer-sku standard \
--enable-cluster-autoscaler \
--min-count 3 \
--max-count 5
# connect to the AKS cluster
$ az aks get-credentials --resource-group myMigratoryDataGroup --name myMigratoryDataCluster
# check the AKS cluster nodes
$ kubectl get nodes
Finally, when you don’t need the AKS cluster of nodes, delete it:
$ az group delete --name myMigratoryDataGroup --yes --no-wait
For more details, please refer to the AKS documentation at:
https://docs.microsoft.com/en-us/azure/aks/cluster-autoscaler
While testing node autoscaling, be aware that adding or removing nodes from AKS might take some time. For example, evicting an unused node back to AKS might take up to several minutes. Here are more details about the timing of AKS cluster node autoscaling:
https://docs.microsoft.com/en-us/azure/aks/cluster-autoscaler#using-the-autoscaler-profile
Node Failure Testing
MigratoryData clustering tolerates a number of cluster members being down or failing, as detailed in the Elasticity section of the Architecture Guide.
In order to test an AKS node failure, use:
kubectl drain <node-name> --force --delete-local-data --ignore-daemonsets
Then, to make the AKS node schedulable again, use:
kubectl uncordon <node-name>
License
View the license information for the MigratoryData software contained in this Docker image.