This guide describes the installation of MigratoryData KE using Docker and Kubernetes.

Supported tags and respective Dockerfile links

Tag         Meaning
latest      Image for the latest version of MigratoryData KE
<version>   Image for the specified version of MigratoryData KE (e.g. 6.0.1)

What is MigratoryData KE?

MigratoryData - Kafka Edition (MigratoryData KE) is an ultra-scalable WebSocket publish/subscribe broker that integrates natively with Apache Kafka. Thanks to its huge vertical scalability (1000x more scalable than the C10K problem) and linear horizontal scalability both for publishers and subscribers, MigratoryData KE can extend Kafka messaging to any number of Web and Mobile users or IoT devices - and do so cost-effectively.

Please visit the product page and the Kafka solution page for more information.

How to use this image

MigratoryData KE with Docker

To start a single instance of the latest version of MigratoryData KE and allow clients to connect to that instance on port 8800, install Docker and run the following command:

$ docker run -d --name my_migratorydata_ke \
-p 8800:8800 migratorydata/migratorydata-ke:latest

You should now be able to connect to http://[my_host]:8800 and run the demo app bundled with MigratoryData KE, where [my_host] is the address of the machine where you installed MigratoryData KE.
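As a quick sanity check from the command line (a minimal sketch, assuming the demo page is served at the root path), you can verify that the instance responds over HTTP:

$ curl -I http://[my_host]:8800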

You can see the logs of the container using the following command:

$ docker logs my_migratorydata_ke

To stop and remove the container use:

$ docker stop my_migratorydata_ke
$ docker rm my_migratorydata_ke

Customization

It is possible to customize every aspect of MigratoryData KE running in a Docker container using certain environment variables such as:

  • MIGRATORYDATA_EXTRA_OPTS
  • MIGRATORYDATA_KAFKA_EXTRA_OPTS
  • MIGRATORYDATA_JAVA_EXTRA_OPTS

which are explained in the Configuration Guide.

Below are some examples of how to customize MigratoryData KE using these environment variables.

Adding a License Key

To use a license key with this image, override the parameter LicenseKey of the default configuration file with an extra option using the MIGRATORYDATA_EXTRA_OPTS environment variable as follows:

$ docker run -d --name my_migratorydata_ke \
-e MIGRATORYDATA_EXTRA_OPTS='-DLicenseKey=demolicensekey' \
-p 8800:8800 migratorydata/migratorydata-ke:latest

where demolicensekey should be replaced with the license key obtained from MigratoryData for evaluation, test, or production usage.

Integrate with Apache Kafka

To integrate with Apache Kafka deployed on the same Docker network, say my_migratorydata_network, override the default values of the parameters bootstrap.servers and topics with extra options using the MIGRATORYDATA_KAFKA_EXTRA_OPTS environment variable as follows:

# create a docker network
$ docker network create --driver bridge my_migratorydata_network

# install Zookeeper required by Apache Kafka 
# using the digitalwonderland/zookeeper Docker image
$ docker run -d --name my_zk \
--network my_migratorydata_network \
-p 2181:2181 -p 2888:2888 -p 3888:3888 digitalwonderland/zookeeper

# install Apache Kafka 
# using the wurstmeister/kafka Docker image
docker run -d --name my_kafka \
--network my_migratorydata_network \
-e KAFKA_ZOOKEEPER_CONNECT=my_zk:2181 \
-e KAFKA_ADVERTISED_HOST_NAME=my_kafka \
-e KAFKA_ADVERTISED_PORT=9092 \
-p 9092:9092 wurstmeister/kafka

# install MigratoryData KE 
docker run -d --name my_migratorydata_ke \
--network my_migratorydata_network \
-e MIGRATORYDATA_EXTRA_OPTS='-DLicenseKey=demolicensekey' \
-e MIGRATORYDATA_KAFKA_EXTRA_OPTS='-Dbootstrap.servers=my_kafka:9092 -Dtopics=vehicles' \
-p 8800:8800 migratorydata/migratorydata-ke:latest

You can check the status of the Zookeeper, Kafka, and MigratoryData KE containers deployed above, stop and remove them, and remove the Docker network created above, using:

# check the status of the services
docker ps

# stop the services
docker stop my_zk my_kafka my_migratorydata_ke

# remove the services
docker rm my_zk my_kafka my_migratorydata_ke

# remove the network
docker network rm my_migratorydata_network

Enabling JMX Monitoring

To enable JMX monitoring for MigratoryData KE, define the JMX-related parameters as usual and publish the JMX port 3000 to the host as follows:

$ docker run -d --name my_migratorydata_ke \
-e MIGRATORYDATA_EXTRA_OPTS='-DLicenseKey=demolicensekey \
-DMonitor=JMX -DMonitorJMX.Listen=*:3000 \
-DMonitorUsername=admin -DMonitorPassword=pass \
-DMonitorJMX.Authentication=true -DMonitorJMX.Encryption=false' \
-p 3000:3000 -p 8800:8800 migratorydata/migratorydata-ke:latest

You should now be able to connect with any JMX client to [my_host]:3000 using the credentials defined above (admin/pass).

Note that in order to access the JMX monitoring with Java’s jconsole JMX client, you will need to provide two Java extra options using the MIGRATORYDATA_JAVA_EXTRA_OPTS environment variable as follows:

$ docker run -d --name my_migratorydata_ke \
-e MIGRATORYDATA_EXTRA_OPTS='-DLicenseKey=demolicensekey \
-DMonitor=JMX -DMonitorJMX.Listen=*:3000 \
-DMonitorUsername=admin -DMonitorPassword=pass \
-DMonitorJMX.Authentication=true -DMonitorJMX.Encryption=false' \
-e MIGRATORYDATA_JAVA_EXTRA_OPTS='-Djava.net.preferIPv4Stack=true \
-Djava.rmi.server.hostname=[my_host]' \
-p 3000:3000 -p 8800:8800 migratorydata/migratorydata-ke:latest

Extensions

To deploy one or more MigratoryData KE extensions, mount a volume containing the extensions into the standard extensions folder, which is /migratorydata/extensions.

For example, suppose you have developed an entitlement extension authorization.jar using the Plugin SDK for Authorization and deployed it to the host folder /myvolume/migratorydata/extensions; to load it into the Docker container, run:

$ docker run -d --name my_migratorydata_ke \
-e MIGRATORYDATA_EXTRA_OPTS='-DEntitlement=Custom' \
-v /myvolume/migratorydata/extensions:/migratorydata/extensions \
-p 8800:8800 migratorydata/migratorydata-ke:latest
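To verify that the volume was mounted as expected, you can list the extensions folder inside the running container (a quick check using the container name from the example above):

$ docker exec my_migratorydata_ke ls /migratorydata/extensions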

MigratoryData KE with Kubernetes

In this section we explain how to deploy a MigratoryData KE cluster on Azure Kubernetes Service (AKS) that connects to a Kafka service provided by Azure Event Hubs. Of course, you can adapt the instructions below to use a Kubernetes service and a Kafka service from another provider, or even ones deployed locally.

Variables

Let’s use the following variables for this tutorial:

export RESOURCE_GROUP=rg-migratorydata-ke
export AKS_CLUSTER=aks-migratorydata-ke
export EVENTHUBS_NAMESPACE=evhns-migratorydata-ke
export EVENTHUBS_TOPIC=vehicles

Create an AKS cluster

Let’s create a Kubernetes cluster of at least 3 nodes and at most 5 nodes.

# log in to Azure
az login

# create the resource group
az group create --name $RESOURCE_GROUP --location eastus

# create the cluster and enable the cluster autoscaling
az aks create \
  --resource-group $RESOURCE_GROUP \
  --name $AKS_CLUSTER \
  --node-count 3 \
  --vm-set-type VirtualMachineScaleSets \
  --enable-addons monitoring \
  --generate-ssh-keys \
  --load-balancer-sku standard \
  --enable-cluster-autoscaler \
  --min-count 3 \
  --max-count 5

Connect to the AKS cluster and check if the AKS nodes are ready

# connect to the AKS cluster
az aks get-credentials \
--resource-group $RESOURCE_GROUP \
--name $AKS_CLUSTER

# check if the nodes of the cluster are up
kubectl get nodes

Create an Event Hubs topic

To create a Kafka topic, run the following commands:

# create an Event Hubs namespace
az eventhubs namespace create --name $EVENTHUBS_NAMESPACE \
--resource-group $RESOURCE_GROUP -l eastus

# create the Kafka topic
az eventhubs eventhub create --name $EVENTHUBS_TOPIC \
--resource-group $RESOURCE_GROUP \
--namespace-name $EVENTHUBS_NAMESPACE
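Optionally, you can confirm that the topic was created by listing the event hubs of the namespace (a quick check using the standard Azure CLI query options):

# list the topics (event hubs) of the namespace
az eventhubs eventhub list \
--resource-group $RESOURCE_GROUP \
--namespace-name $EVENTHUBS_NAMESPACE \
--query "[].name" -o table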

Authenticate to Event Hubs with SASL using JAAS

# fetch the Event Hubs rule/policy and pick the value of the "name" 
# attribute from the JSON response of the following command
az eventhubs namespace authorization-rule list \
--resource-group $RESOURCE_GROUP \
--namespace-name $EVENTHUBS_NAMESPACE

# suppose the policy picked above is RootManageSharedAccessKey, 
# then pick the value of the attribute "primaryConnectionString" 
# from the JSON response of the following command
az eventhubs namespace authorization-rule keys list \
--resource-group $RESOURCE_GROUP \
--namespace-name $EVENTHUBS_NAMESPACE \
--name RootManageSharedAccessKey

The value of the attribute primaryConnectionString picked from the response of the last command should look as follows:

'Endpoint=sb://evhns-migratorydata-ke.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=xxxxxxxxxxxxxx/yyyyyyyyyyyyyyyyyyyyyyyyyyyy='

Therefore, the JAAS config to authenticate to Event Hubs with SASL should look as follows:

KafkaClient {
        org.apache.kafka.common.security.plain.PlainLoginModule required
        username="$ConnectionString"
        password="Endpoint=sb://evhns-migratorydata-ke.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=xxxxxxxxxxxxxx/yyyyyyyyyyyyyyyyyyyyyyyyyyyy=";
};

Copy the JAAS config to a file, say jaas.config. We will need this configuration later to connect a Kafka consumer and producer to Event Hubs with SASL.
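For reference, a standalone Kafka client could use the same credentials as follows; this is a minimal sketch, assuming a hypothetical client.properties file and the standard kafka-console-consumer.sh tool shipped with Apache Kafka (these are plain Kafka client settings, not MigratoryData KE parameters):

# client.properties (hypothetical file name)
bootstrap.servers=evhns-migratorydata-ke.servicebus.windows.net:9093
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="Endpoint=sb://evhns-migratorydata-ke.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=xxxxxxxxxxxxxx/yyyyyyyyyyyyyyyyyyyyyyyyyyyy=";

# consume from the topic to verify connectivity
bin/kafka-console-consumer.sh \
--bootstrap-server evhns-migratorydata-ke.servicebus.windows.net:9093 \
--topic vehicles --consumer.config client.properties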

Encode the JAAS config as base64

Because the JAAS config obtained in the previous step includes spaces, but the extra options used to customize MigratoryData KE (see the Configuration Guide) cannot contain spaces, we should convert the JAAS config to base64 as follows (note that the semicolon after the password is mandatory):

export BASE64_JAAS_CONFIG=`echo -n 'org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="Endpoint=sb://evhns-migratorydata-ke.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=xxxxxxxxxxxxxx/yyyyyyyyyyyyyyyyyyyyyyyyyyyy=";' | base64`
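You can verify that the value decodes back to the expected JAAS line with:

echo "$BASE64_JAAS_CONFIG" | base64 --decode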

Deploy MigratoryData KE cluster

Copy the following Kubernetes manifest to a file, say cluster.yaml, and edit it by replacing the variables EVENTHUBS_NAMESPACE, EVENTHUBS_TOPIC, and BASE64_JAAS_CONFIG with the values defined or obtained above:

apiVersion: v1
kind: Service
metadata:
  name: migratorydata-ke-cs
  labels:
    app: migratorydata-ke
spec:
  type: LoadBalancer
  ports:
    - name: client-port
      port: 80
      protocol: TCP
      targetPort: 8800
  selector:
    app: migratorydata-ke
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: migratorydata-ke
spec:
  selector:
    matchLabels:
      app: migratorydata-ke
  replicas: 3
  template:
    metadata:
      labels:
        app: migratorydata-ke
    spec:
      containers:
        - name: migratorydata-ke-cluster
          imagePullPolicy: Always
          image: migratorydata/migratorydata-ke:latest
          env:
            - name: MIGRATORYDATA_JAVA_EXTRA_OPTS
              value: "-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap"
            - name: MIGRATORYDATA_EXTRA_OPTS
              value: "-DMemory=128MB \
                -DLogLevel=INFO \
                -DX.ConnectionOffload=true"
            - name: MIGRATORYDATA_KAFKA_EXTRA_OPTS
              value: "-Dbootstrap.servers=$EVENTHUBS_NAMESPACE.servicebus.windows.net:9093 \
                -Dtopics=$EVENTHUBS_TOPIC \
                -Dsecurity.protocol=SASL_SSL \
                -Dsasl.mechanism=PLAIN \
                -Dsasl.jaas.config=$BASE64_JAAS_CONFIG"
          resources:
            requests:
              memory: "256Mi"
              cpu: "0.5"
          ports:
            - name: client-port
              containerPort: 8800
          readinessProbe:
            tcpSocket:
              port: 8800
            initialDelaySeconds: 10
            periodSeconds: 5
          livenessProbe:
            tcpSocket:
              port: 8800
            initialDelaySeconds: 10
            periodSeconds: 5

The manifest above contains a Service and a Deployment. The Service exposes the cluster to clients on port 80.

To deploy the MigratoryData KE cluster on AKS, run the following command:

$ kubectl apply -f cluster.yaml

Check if the pods are up and running

By running the following command, you can check that the three pods of the example configuration are up and running:

kubectl get pods
NAME                                READY   STATUS    RESTARTS   AGE
migratorydata-ke-57848575bd-4tnbz   1/1     Running   0          4m32s
migratorydata-ke-57848575bd-gjmld   1/1     Running   0          4m32s
migratorydata-ke-57848575bd-tcbtf   1/1     Running   0          4m32s

You can check the logs of each cluster member with a command such as:

kubectl logs migratorydata-ke-57848575bd-4tnbz

Also, by running the following command, you can check that the service is up and running:

kubectl get svc
NAME                  TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE
kubernetes            ClusterIP      10.0.0.1      <none>        443/TCP        37m
migratorydata-ke-cs   LoadBalancer   10.0.90.187   YourExternalIP    80:30546/TCP   5m27s

You should now be able to connect to http://YourExternalIP and run the demo app bundled with MigratoryData KE.
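If you prefer to retrieve the external IP from the command line (a small convenience, assuming the LoadBalancer address has already been assigned), you can use:

export EXTERNAL_IP=$(kubectl get svc migratorydata-ke-cs \
-o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo $EXTERNAL_IP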

Scaling MigratoryData KE on AKS

Because MigratoryData KE clustering is stateless and all cluster members are independent, horizontal scaling on AKS is greatly simplified.

Manual Scaling

For example, if the load of your system increases substantially and your nodes have enough resources available, you can add two new members to the cluster by modifying the replicas field as follows:

kubectl scale deployment migratorydata-ke --replicas=5 

Then, if the load of your system decreases significantly, you can remove three members from the cluster by modifying the replicas field as follows:

kubectl scale deployment migratorydata-ke --replicas=2

Autoscaling

Manual scaling is practical if the load of your system changes gradually. Otherwise, you can use the autoscaling feature of Kubernetes.

Kubernetes can monitor the load of your system, typically expressed as CPU usage, and scale your MigratoryData cluster up and down by automatically modifying the replicas field.

In the example above, to add new members (up to a maximum of 5) when the CPU usage of the existing members rises above 50%, or to remove members (down to a minimum of 2) when it falls below 50%, use the following command:

$ kubectl autoscale deployment migratorydata-ke --cpu-percent=50 --min=2 --max=5

Now, you can display information about the autoscaler object above using the following command:

kubectl get hpa

and display CPU usage of cluster members with:

kubectl top pods

While testing cluster autoscaling, be aware that the Kubernetes autoscaler reads the CPU usage of the cluster members periodically, so autoscaling might not appear immediate. This is normal Kubernetes behavior.
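To observe the autoscaler's decisions while testing, for example the current versus target CPU and the scaling events, you can also run:

kubectl describe hpa migratorydata-ke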

Node Autoscaling

Cloud infrastructure like Azure Kubernetes Service (AKS) can be configured to automatically spin up additional nodes if the resources of the existing nodes are insufficient to create new cluster members. In the example above, we have created the AKS cluster with a minimum of 3 nodes and a maximum of 5 nodes and enabled node autoscaling with the command that we recall here:

# create the cluster and enable the cluster autoscaling
az aks create \
  --resource-group $RESOURCE_GROUP \
  --name $AKS_CLUSTER \
  --node-count 3 \
  --vm-set-type VirtualMachineScaleSets \
  --enable-addons monitoring \
  --generate-ssh-keys \
  --load-balancer-sku standard \
  --enable-cluster-autoscaler \
  --min-count 3 \
  --max-count 5

While testing node autoscaling, be aware that adding nodes to or removing nodes from AKS might take some time. For example, returning an unused node to AKS might take up to several minutes. More details about the timing of AKS cluster node autoscaling are available here:

https://docs.microsoft.com/en-us/azure/aks/cluster-autoscaler#using-the-autoscaler-profile

Node Failure Testing

MigratoryData KE clustering tolerates node failures provided that at least one node remains up and running. To simulate an AKS node failure, use:

kubectl drain <node-name> --force --delete-local-data --ignore-daemonsets

Then, to return the node to service, use:

kubectl uncordon <node-name>

Maintenance

If something goes wrong with the commands above, you can remove the MigratoryData KE cluster with the command below and try again:

kubectl delete -f cluster.yaml

Finally, when you no longer need the AKS cluster and the Event Hubs topic used for testing, remove the allocated resources with:

az group delete --name $RESOURCE_GROUP --yes --no-wait

License

View the license information for the MigratoryData KE software contained in this Docker image.