This tutorial shows how to deploy a MigratoryData cluster with Kafka support, in conjunction with Azure Event Hubs, on Azure Kubernetes Service (AKS).
Prerequisites
Before deploying MigratoryData on AKS, ensure that you have a Microsoft Azure account and have installed the following tools: the Azure CLI (az) and the Kubernetes command-line tool (kubectl).
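You can verify that both tools are installed with:
az version
kubectl version --client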
Shell variables
Let’s use the following shell variables for this tutorial:
export RESOURCE_GROUP=rg-migratorydata
export AKS_CLUSTER=aks-migratorydata
export EVENTHUBS_NAMESPACE=evhns-migratorydata
export EVENTHUBS_TOPIC=server
Create an AKS cluster
Log in to Azure:
az login
Create a new resource group:
az group create --name $RESOURCE_GROUP --location eastus
Create an AKS cluster with at least three and at most five nodes, enabling the cluster autoscaler:
az aks create \
--resource-group $RESOURCE_GROUP \
--name $AKS_CLUSTER \
--node-count 3 \
--vm-set-type VirtualMachineScaleSets \
--generate-ssh-keys \
--load-balancer-sku standard \
--enable-cluster-autoscaler \
--min-count 3 \
--max-count 5
Connect to the AKS cluster:
az aks get-credentials \
--resource-group $RESOURCE_GROUP \
--name $AKS_CLUSTER
Check that the nodes of the AKS cluster are up:
kubectl get nodes
Create namespace
Create a namespace migratory for all the resources of this deployment by copying the following to a file migratory-namespace.yaml:
apiVersion: v1
kind: Namespace
metadata:
  name: migratory
Then, execute the command:
kubectl apply -f migratory-namespace.yaml
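You can confirm that the namespace was created with:
kubectl get namespace migratory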
Azure Event Hubs
Create an Azure Event Hubs topic
First, create an Event Hubs namespace:
az eventhubs namespace create --name $EVENTHUBS_NAMESPACE \
--resource-group $RESOURCE_GROUP -l eastus
Create a Kafka topic (i.e., an event hub) on Azure Event Hubs as follows:
az eventhubs eventhub create --name $EVENTHUBS_TOPIC \
--resource-group $RESOURCE_GROUP \
--namespace-name $EVENTHUBS_NAMESPACE
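You can verify that the topic was created with:
az eventhubs eventhub show --name $EVENTHUBS_TOPIC \
  --resource-group $RESOURCE_GROUP \
  --namespace-name $EVENTHUBS_NAMESPACE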
Authenticate to Azure Event Hubs with SASL using JAAS
List the Event Hubs authorization rules (policies) and get the value of the name attribute from the JSON response of the following command:
az eventhubs namespace authorization-rule list \
--resource-group $RESOURCE_GROUP \
--namespace-name $EVENTHUBS_NAMESPACE
Supposing the policy obtained above is RootManageSharedAccessKey, get the value of the primaryConnectionString attribute from the JSON response of the following command:
az eventhubs namespace authorization-rule keys list \
--resource-group $RESOURCE_GROUP \
--namespace-name $EVENTHUBS_NAMESPACE \
--name RootManageSharedAccessKey
The value of the attribute primaryConnectionString from the response of the last command should look as follows:
Endpoint=sb://evhns-migratorydata.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=xxxxxxxxxxxxxxxxx
Therefore, the JAAS config to authenticate to Azure Event Hubs with SASL should look as follows:
KafkaClient {
org.apache.kafka.common.security.plain.PlainLoginModule required
username="$ConnectionString"
password="Endpoint=sb://evhns-migratorydata.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=xxxxxxxxxxxxxxxxx";
};
Copy the JAAS config to a file jaas.config. We will need this configuration later to connect a Kafka consumer and producer to Azure Event Hubs with SASL.
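For example, assuming the Apache Kafka command-line tools are installed on your machine, you can reuse the same SASL settings in a Kafka client properties file, named here client.properties (the connection string is the placeholder from above):
# Kafka client settings for Azure Event Hubs (SASL over TLS)
bootstrap.servers=evhns-migratorydata.servicebus.windows.net:9093
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="Endpoint=sb://evhns-migratorydata.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=xxxxxxxxxxxxxxxxx";
You could then consume messages from the server topic with:
kafka-console-consumer.sh --bootstrap-server evhns-migratorydata.servicebus.windows.net:9093 \
  --topic server --consumer.config client.properties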
Create a secret from the JAAS config file
Because the JAAS config file obtained in the previous step must be included in the pod configuration, we should create a secret from jaas.config, which will be mounted as a volume in Kubernetes:
kubectl create secret generic migratory-secret \
--from-file=jaas.config -n migratory
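You can verify that the secret was created with:
kubectl get secret migratory-secret -n migratory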
Deploy
We will use the following Kubernetes manifest to build a cluster of three MigratoryData servers. The following command substitutes the shell variables $EVENTHUBS_NAMESPACE and $EVENTHUBS_TOPIC, and creates a file named migratorydata-cluster.yaml:
cat > migratorydata-cluster.yaml <<EOL
apiVersion: v1
kind: Service
metadata:
  namespace: migratory
  name: migratorydata-cs
  labels:
    app: migratorydata
spec:
  type: LoadBalancer
  ports:
    - name: client-port
      port: 80
      protocol: TCP
      targetPort: 8800
  selector:
    app: migratorydata
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: migratory
  name: migratorydata
spec:
  selector:
    matchLabels:
      app: migratorydata
  replicas: 3
  template:
    metadata:
      labels:
        app: migratorydata
    spec:
      containers:
        - name: migratorydata-cluster
          imagePullPolicy: Always
          image: migratorydata/server:latest
          volumeMounts:
            - name: migratory-secret
              mountPath: "/migratorydata/secrets/jaas.config"
              subPath: jaas.config
              readOnly: true
          env:
            - name: MIGRATORYDATA_EXTRA_OPTS
              value: "-DMemory=128MB \
                -DLogLevel=INFO \
                -DX.ConnectionOffload=true \
                -DClusterEngine=kafka"
            - name: MIGRATORYDATA_KAFKA_EXTRA_OPTS
              value: "-Dbootstrap.servers=$EVENTHUBS_NAMESPACE.servicebus.windows.net:9093 \
                -Dtopics=$EVENTHUBS_TOPIC \
                -Dsecurity.protocol=SASL_SSL \
                -Dsasl.mechanism=PLAIN \
                -Djava.security.auth.login.config=/migratorydata/secrets/jaas.config"
            - name: MIGRATORYDATA_JAVA_GC_LOG_OPTS
              value: "-XX:+PrintCommandLineFlags -XX:+PrintGC -XX:+PrintGCDetails -XX:+DisableExplicitGC -Dsun.rmi.dgc.client.gcInterval=0x7ffffffffffffff0 -Dsun.rmi.dgc.server.gcInterval=0x7ffffffffffffff0 -verbose:gc"
          resources:
            requests:
              memory: "256Mi"
              cpu: "0.5"
          ports:
            - name: client-port
              containerPort: 8800
          readinessProbe:
            tcpSocket:
              port: 8800
            initialDelaySeconds: 20
            failureThreshold: 5
            periodSeconds: 5
          livenessProbe:
            tcpSocket:
              port: 8800
            initialDelaySeconds: 10
            failureThreshold: 5
            periodSeconds: 5
      volumes:
        - name: migratory-secret
          secret:
            secretName: migratory-secret
EOL
This manifest contains a Service and a Deployment. The Service is used to handle the clients of the MigratoryData cluster over port 80.
In this manifest, we’ve used the MIGRATORYDATA_EXTRA_OPTS environment variable, which can define specific parameters or adjust the default value of any parameter listed in the Configuration Guide. Here, we’ve used it to modify the default value of parameters such as Memory, and to set the parameter ClusterEngine to kafka in order to enable the Kafka native add-on.
Similarly, to customize MigratoryData’s native add-on for Kafka, the environment variable MIGRATORYDATA_KAFKA_EXTRA_OPTS offers the flexibility to define specific parameters or adjust the default value of any parameter of the Kafka native add-on. In the manifest above, we’ve used this environment variable to modify the default values of the parameters bootstrap.servers and topics, among others, to connect to Azure Event Hubs.
Because the command above already created the file migratorydata-cluster.yaml with the variables $EVENTHUBS_NAMESPACE and $EVENTHUBS_TOPIC substituted, deploy the MigratoryData cluster by running:
kubectl apply -f migratorydata-cluster.yaml
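You can wait for the rollout to complete with:
kubectl rollout status deployment/migratorydata -n migratory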
Namespace switch
Because the deployment concerns the namespace migratory, switch to this namespace as follows:
kubectl config set-context --current --namespace=migratory
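You can confirm the active namespace with:
kubectl config view --minify | grep namespace: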
To return to the default namespace, run:
kubectl config set-context --current --namespace=default
Verify the deployment
Check the running pods to ensure the migratorydata pods are running:
kubectl get pods
The output of this command should include something similar to the following:
NAME READY STATUS RESTARTS AGE
migratorydata-57848575bd-4tnbz 1/1 Running 0 4m32s
migratorydata-57848575bd-gjmld 1/1 Running 0 4m32s
migratorydata-57848575bd-tcbtf 1/1 Running 0 4m32s
You can check the logs of each cluster member by running a command as follows:
kubectl logs migratorydata-57848575bd-4tnbz
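To view the logs of all cluster members at once, you can use the label selector defined in the manifest:
kubectl logs -l app=migratorydata --tail=100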
Test installation
Now, you can check that the service of the manifest above is up and running:
kubectl get svc
You should see an output similar to the following:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
migratorydata-cs LoadBalancer 10.0.39.44 YourExternalIP 80:32210/TCP 17s
You should now be able to connect to the address assigned by AKS to the load balancer service, listed under the EXTERNAL-IP column. In this case, the external IP address is YourExternalIP and the port is 80. Open the corresponding URL http://YourExternalIP in your browser. You should see a welcome page that features, under the Debug Console menu, a demo application for publishing to and consuming real-time messages from the MigratoryData cluster.
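You can also verify from the command line that the welcome page is served, replacing YourExternalIP with the actual external IP:
curl -I http://YourExternalIP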
Scaling
The stateless nature of the MigratoryData cluster when deployed in conjunction with Azure Event Hubs, where each cluster member is independent from the others, highly simplifies the horizontal scaling on AKS.
Manual scaling up
For example, if the load of your system increases substantially, and supposing your nodes have enough resources available, you can add two new members to the cluster by modifying the replicas field as follows:
kubectl scale deployment migratorydata --replicas=5
Manual scaling down
If the load of your system decreases significantly, then you might remove three members from the cluster by modifying the replicas field as follows:
kubectl scale deployment migratorydata --replicas=2
Autoscaling
Manual scaling is practical if the load of your system changes gradually. Otherwise, you can use the autoscaling feature of Kubernetes.
Kubernetes can monitor the load of your system, typically expressed in CPU usage, and scale your MigratoryData cluster up and down by automatically modifying the replicas field.
In the example above, to add new members up to a maximum of 5 cluster members when the CPU usage of the existing members rises above 50%, or to remove members down to a minimum of 3 when the CPU usage of the existing members falls below 50%, use the following command:
kubectl autoscale deployment migratorydata \
--cpu-percent=50 --min=3 --max=5
Alternatively, you can use a YAML manifest as follows:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  namespace: migratory
  name: migratorydata-autoscale # you can use any name here
spec:
  maxReplicas: 5
  minReplicas: 3
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: migratorydata
  targetCPUUtilizationPercentage: 50
Save it to a file named, for example, migratorydata-autoscale.yaml, then apply it as follows:
kubectl apply -f migratorydata-autoscale.yaml
Now, you can display information about the autoscaler object above using the following command:
kubectl get hpa
and display CPU usage of cluster members with:
kubectl top pods
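To observe the autoscaler adjusting the number of replicas while the load varies, you can watch it with:
kubectl get hpa --watch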
While testing cluster autoscaling, it is important to understand that the Kubernetes autoscaler periodically retrieves CPU usage information from the cluster members. As a result, the autoscaling process may not appear instantaneous, but this delay aligns with the normal behavior of Kubernetes.
Node Failure Testing
MigratoryData clustering tolerates a number of cluster members being down or failing, as detailed in the Clustering section.
In order to test an AKS node failure, use:
kubectl drain <node-name> --force --delete-local-data \
--ignore-daemonsets
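After draining a node, you can verify that the affected pods have been rescheduled on the remaining nodes with:
kubectl get pods -o wide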
Then, to return the AKS node to service, use:
kubectl uncordon <node-name>
Uninstall
Delete the Kubernetes resources created for this deployment with:
kubectl delete -f migratory-namespace.yaml
Go back to the default namespace:
kubectl config set-context --current --namespace=default
Finally, when you no longer need the AKS cluster, delete it together with its resource group:
az group delete --name $RESOURCE_GROUP --yes --no-wait
Build realtime apps
First, please read the documentation of the Kafka native add-on to understand the automatic mapping between MigratoryData subjects and Kafka topics.
Utilize MigratoryData’s client APIs to create real-time applications that communicate with your MigratoryData cluster via Azure Event Hubs.
Also, employ the APIs or tools of Azure Event Hubs to generate real-time messages, which are subsequently delivered to MigratoryData’s clients. Similarly, consume real-time messages from Azure Event Hubs that originate from MigratoryData’s clients.