This tutorial shows how to deploy a MigratoryData cluster with Kafka support, in conjunction with Azure Event Hubs, on Azure Kubernetes Service (AKS).

Prerequisites

Before deploying MigratoryData on AKS, ensure that you have a Microsoft Azure account and have installed the following tools: the Azure CLI (az) and the Kubernetes command-line tool (kubectl).

Shell variables

Let’s use the following shell variables for this tutorial:

export RESOURCE_GROUP=rg-migratorydata
export AKS_CLUSTER=aks-migratorydata
export EVENTHUBS_NAMESPACE=evhns-migratorydata
export EVENTHUBS_TOPIC=server

Create an AKS cluster

Log in to Azure:

az login

Create a new resource group:

az group create --name $RESOURCE_GROUP --location eastus

Create an AKS cluster with at least three and at most five nodes, enabling cluster autoscaling:

az aks create \
  --resource-group $RESOURCE_GROUP \
  --name $AKS_CLUSTER \
  --node-count 3 \
  --vm-set-type VirtualMachineScaleSets \
  --generate-ssh-keys \
  --load-balancer-sku standard \
  --enable-cluster-autoscaler \
  --min-count 3 \
  --max-count 5

Connect to the AKS cluster:

az aks get-credentials \
--resource-group $RESOURCE_GROUP \
--name $AKS_CLUSTER

Check if the nodes of the AKS cluster are up:

kubectl get nodes
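
The output should list the three nodes of the cluster in the Ready state, similar to the following (node names and versions will differ):

NAME                                STATUS   ROLES   AGE     VERSION
aks-nodepool1-12345678-vmss000000   Ready    agent   2m14s   v1.27.7
aks-nodepool1-12345678-vmss000001   Ready    agent   2m9s    v1.27.7
aks-nodepool1-12345678-vmss000002   Ready    agent   2m11s   v1.27.7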

Create namespace

Create a namespace migratory for all the resources created for this environment by copying the following to a file migratory-namespace.yaml:

apiVersion: v1
kind: Namespace
metadata:
  name: migratory

Then, execute the command:

kubectl apply -f migratory-namespace.yaml
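
To confirm that the namespace was created, run:

kubectl get namespace migratory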

Azure Event Hubs

Create an Azure Event Hubs topic

First, create an Event Hubs namespace:

az eventhubs namespace create --name $EVENTHUBS_NAMESPACE \
--resource-group $RESOURCE_GROUP -l eastus

Create a Kafka topic on Azure Event Hubs as follows:

az eventhubs eventhub create --name $EVENTHUBS_TOPIC \
--resource-group $RESOURCE_GROUP \
--namespace-name $EVENTHUBS_NAMESPACE
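
Optionally, verify that the topic was created:

az eventhubs eventhub show --name $EVENTHUBS_TOPIC \
--resource-group $RESOURCE_GROUP \
--namespace-name $EVENTHUBS_NAMESPACE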

Authenticate to Azure Event Hubs with SASL using JAAS

List the Event Hubs authorization rules and note the value of the name attribute in the JSON response of the following command:

az eventhubs namespace authorization-rule list \
--resource-group $RESOURCE_GROUP \
--namespace-name $EVENTHUBS_NAMESPACE

Supposing the policy obtained above is RootManageSharedAccessKey, get the value of the attribute primaryConnectionString from the JSON response of the following command:

az eventhubs namespace authorization-rule keys list \
--resource-group $RESOURCE_GROUP \
--namespace-name $EVENTHUBS_NAMESPACE \
--name RootManageSharedAccessKey

The value of the attribute primaryConnectionString from the response of the last command should look as follows:

Endpoint=sb://evhns-migratorydata.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=xxxxxxxxxxxxxxxxx

Therefore, the JAAS config to authenticate to Azure Event Hubs with SASL should look as follows:

KafkaClient {
        org.apache.kafka.common.security.plain.PlainLoginModule required
        username="$ConnectionString"
        password="Endpoint=sb://evhns-migratorydata.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=xxxxxxxxxxxxxxxxx";
};

Copy the JAAS config to a file jaas.config. We will need this configuration later to connect a Kafka consumer and producer to Azure Event Hubs with SASL.
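
For example, assuming you have the Kafka command-line tools installed locally, you can sanity-check the JAAS credentials by connecting a console consumer directly to Azure Event Hubs. The file client.properties below is hypothetical and created only for this test; replace the SharedAccessKey placeholder with your actual key:

# quote the heredoc delimiter so the shell does not expand $ConnectionString
cat > client.properties <<'EOF'
bootstrap.servers=evhns-migratorydata.servicebus.windows.net:9093
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="Endpoint=sb://evhns-migratorydata.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=xxxxxxxxxxxxxxxxx";
EOF

kafka-console-consumer.sh \
--bootstrap-server evhns-migratorydata.servicebus.windows.net:9093 \
--topic server \
--consumer.config client.properties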

Create a secret from the JAAS config file

Because the JAAS config file obtained in the previous step must be included in the pod configuration, create a secret from jaas.config, which will be mounted as a volume in Kubernetes:

kubectl create secret generic migratory-secret \
--from-file=jaas.config -n migratory
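
You can verify that the secret was created with:

kubectl describe secret migratory-secret -n migratory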

Deploy

We will use the following Kubernetes manifest to build a cluster of three MigratoryData servers. The following command substitutes the shell variables $EVENTHUBS_NAMESPACE and $EVENTHUBS_TOPIC and creates a file named migratorydata-cluster.yaml:

cat > migratorydata-cluster.yaml <<EOL
apiVersion: v1
kind: Service
metadata:
  namespace: migratory
  name: migratorydata-cs
  labels:
    app: migratorydata
spec:
  type: LoadBalancer
  ports:
    - name: client-port
      port: 80
      protocol: TCP
      targetPort: 8800
  selector:
    app: migratorydata
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: migratory
  name: migratorydata
spec:
  selector:
    matchLabels:
      app: migratorydata
  replicas: 3
  template:
    metadata:
      labels:
        app: migratorydata
    spec:
      containers:
      - name: migratorydata-cluster
        imagePullPolicy: Always
        image: migratorydata/server:latest
        volumeMounts:
        - name: migratory-secret
          mountPath: "/migratorydata/secrets/jaas.config"
          subPath: jaas.config
          readOnly: true
        env:
          - name: MIGRATORYDATA_EXTRA_OPTS
            value: "-DMemory=128MB \
              -DLogLevel=INFO \
              -DX.ConnectionOffload=true \
              -DClusterEngine=kafka"
          - name: MIGRATORYDATA_KAFKA_EXTRA_OPTS
            value: "-Dbootstrap.servers=$EVENTHUBS_NAMESPACE.servicebus.windows.net:9093 \
              -Dtopics=$EVENTHUBS_TOPIC \
              -Dsecurity.protocol=SASL_SSL \
              -Dsasl.mechanism=PLAIN \
              -Djava.security.auth.login.config=/migratorydata/secrets/jaas.config"
          - name: MIGRATORYDATA_JAVA_EXTRA_OPTS
            value: "-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap"
        resources:
          requests:
            memory: "256Mi"
            cpu: "0.5"
        ports:
          - name: client-port
            containerPort: 8800
        readinessProbe:
          tcpSocket:
            port: 8800
          initialDelaySeconds: 10
          periodSeconds: 5
        livenessProbe:
          tcpSocket:
            port: 8800
          initialDelaySeconds: 10
          periodSeconds: 5
      volumes:
      - name: migratory-secret
        secret:
          secretName: migratory-secret
EOL

This manifest contains a Service and a Deployment. The Service handles the clients of the MigratoryData cluster over port 80.

In this manifest, we've used the MIGRATORYDATA_EXTRA_OPTS environment variable, which can define specific parameters or adjust the default value of any parameter listed in the Configuration Guide. Here, it modifies the default values of parameters such as Memory, and sets the parameter ClusterEngine to kafka to enable the Kafka native add-on.
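
For example, assuming DEBUG is among the log levels accepted by the LogLevel parameter, you could temporarily raise the logging verbosity of each cluster member for troubleshooting by adjusting the same environment variable (a sketch; the other options are kept as in the manifest above):

env:
  - name: MIGRATORYDATA_EXTRA_OPTS
    value: "-DMemory=128MB \
      -DLogLevel=DEBUG \
      -DX.ConnectionOffload=true \
      -DClusterEngine=kafka"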

To customize MigratoryData's native add-on for Kafka, the environment variable MIGRATORYDATA_KAFKA_EXTRA_OPTS offers the flexibility to define specific parameters or adjust the default value of any parameter of the Kafka native add-on. In the manifest above, we've used this environment variable to modify the default values of the parameters bootstrap.servers and topics, among others, to connect to Azure Event Hubs.

Because the command above has already created the file migratorydata-cluster.yaml with the variables $EVENTHUBS_NAMESPACE and $EVENTHUBS_TOPIC substituted, you can deploy the MigratoryData cluster by simply running:

kubectl apply -f migratorydata-cluster.yaml

Namespace switch

Because the deployment concerns the namespace migratory, switch to this namespace as follows:

kubectl config set-context --current --namespace=migratory

To return to the default namespace, run:

kubectl config set-context --current --namespace=default

Verify the deployment

Check the running pods to ensure the migratorydata pods are running:

kubectl get pods 

The output of this command should include something similar to the following:

NAME                             READY   STATUS    RESTARTS   AGE
migratorydata-57848575bd-4tnbz   1/1     Running   0          4m32s
migratorydata-57848575bd-gjmld   1/1     Running   0          4m32s
migratorydata-57848575bd-tcbtf   1/1     Running   0          4m32s

You can check the logs of each cluster member by running a command as follows:

kubectl logs migratorydata-57848575bd-4tnbz

Test installation

Now, you can check that the service of the manifest above is up and running:

kubectl get svc

You should see an output similar to the following:

NAME               TYPE           CLUSTER-IP   EXTERNAL-IP      PORT(S)        AGE
migratorydata-cs   LoadBalancer   10.0.39.44   YourExternalIP   80:32210/TCP   17s

You should now be able to connect to the address assigned by AKS to the load balancer service, shown under the EXTERNAL-IP column. In this case, the external IP address is YourExternalIP and the port is 80. Open the corresponding URL http://YourExternalIP in your browser. You should see a welcome page featuring, under the Debug Console menu, a demo application for publishing real-time messages to and consuming them from the MigratoryData cluster.
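
Alternatively, you can check from the command line that the service answers over port 80, replacing YourExternalIP with the address reported by kubectl get svc:

curl -i http://YourExternalIP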

Scaling

The stateless nature of the MigratoryData cluster when deployed in conjunction with Azure Event Hubs, where each cluster member is independent of the others, greatly simplifies horizontal scaling on AKS.

Manual scaling up

For example, if the load of your system increases substantially, and supposing your nodes have enough resources available, you can add two new members to the cluster by modifying the replicas field as follows:

kubectl scale deployment migratorydata --replicas=5 

Manual scaling down

If the load of your system decreases significantly, then you might remove three members from the cluster by modifying the replicas field as follows:

kubectl scale deployment migratorydata --replicas=2

Autoscaling

Manual scaling is practical if the load of your system changes gradually. Otherwise, you can use the autoscaling feature of Kubernetes.

Kubernetes can monitor the load of your system, typically expressed in CPU usage, and scale your MigratoryData cluster up and down by automatically modifying the replicas field.

In the example above, to add new members up to a maximum of 5 when the CPU usage of the existing members rises above 50%, or to remove members down to a minimum of 3 when their CPU usage falls below 50%, use the following command:

kubectl autoscale deployment migratorydata \
--cpu-percent=50 --min=3 --max=5

Alternatively, you can use a YAML manifest as follows:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  namespace: migratory
  name: migratorydata-autoscale # you can use any name here
spec:
  maxReplicas: 5
  minReplicas: 3
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: migratorydata 
  targetCPUUtilizationPercentage: 50

Save it to a file named, for example, migratorydata-autoscale.yaml, then apply it as follows:

kubectl apply -f migratorydata-autoscale.yaml

Now, you can display information about the autoscaler object above using the following command:

kubectl get hpa

and display CPU usage of cluster members with:

kubectl top pods

While testing cluster autoscaling, it is important to understand that the Kubernetes autoscaler periodically retrieves CPU usage information from the cluster members. As a result, the autoscaling process may not appear instantaneous, but this delay aligns with the normal behavior of Kubernetes.
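
For example, you can watch the autoscaler react in real time while the load varies by running:

kubectl get hpa -w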

Node failure testing

MigratoryData clustering tolerates a number of cluster members being down or failing, as detailed in the Clustering section.
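
First, list the nodes of the AKS cluster to choose one to drain:

kubectl get nodes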

In order to test an AKS node failure, use:

kubectl drain <node-name> --force --delete-emptydir-data \
--ignore-daemonsets

Then, to make the node schedulable again, use:

kubectl uncordon <node-name>

Uninstall

Delete the Kubernetes resources created for this deployment with:

kubectl delete -f migratory-namespace.yaml

Go back to default namespace:

kubectl config set-context --current --namespace=default

Finally, when you no longer need the AKS cluster, delete its resource group, which also removes the Event Hubs namespace created above:

az group delete --name $RESOURCE_GROUP --yes --no-wait

Build realtime apps

First, please read the documentation of the Kafka native add-on to understand the automatic mapping between MigratoryData subjects and Kafka topics.

Use MigratoryData's client APIs to create real-time applications that communicate with your MigratoryData cluster via Azure Event Hubs.

Also, employ the APIs or tools of Azure Event Hubs to generate real-time messages, which are subsequently delivered to MigratoryData’s clients. Similarly, consume real-time messages from Azure Event Hubs that originate from MigratoryData’s clients.
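
For a quick end-to-end test without writing code, and assuming the Kafka command-line tools together with the client.properties file sketched in the JAAS section above, you can publish a keyed message to the Event Hubs topic server; MigratoryData will deliver it to its clients subscribed to the corresponding subject. The key status below is purely illustrative; see the Kafka native add-on documentation for the exact subject-to-topic mapping:

echo "status:hello from Event Hubs" | kafka-console-producer.sh \
--bootstrap-server evhns-migratorydata.servicebus.windows.net:9093 \
--topic server \
--producer.config client.properties \
--property parse.key=true \
--property key.separator=: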