This tutorial shows how to deploy a MigratoryData cluster with Kafka support, together with Apache Kafka, on Kubernetes.
Prerequisites
Before deploying MigratoryData, ensure that you have installed Minikube, a tool for quickly setting up local Kubernetes clusters.
Start Minikube as follows:
minikube start
Check the Kubernetes dashboard as follows:
minikube dashboard
Create namespace
Create a namespace migratory for all the resources created for this environment by copying the following to a file migratory-namespace.yaml:
apiVersion: v1
kind: Namespace
metadata:
  name: migratory
Then, execute the command:
kubectl apply -f migratory-namespace.yaml
Deploy Kafka
We will use the following Kubernetes manifest to build a cluster of one Kafka server:
apiVersion: v1
kind: Service
metadata:
  name: kafka-service
  namespace: migratory
spec:
  ports:
    - port: 9092
      name: kafka-port
  selector:
    app: kafka
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
  namespace: migratory
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
        - name: kafka
          image: bitnami/kafka:latest
          ports:
            - containerPort: 9092
          env:
            - name: KAFKA_ENABLE_KRAFT
              value: "yes"
            - name: KAFKA_CFG_NODE_ID
              value: "0"
            - name: KAFKA_CFG_PROCESS_ROLES
              value: "broker,controller"
            - name: KAFKA_CFG_CONTROLLER_LISTENER_NAMES
              value: "CONTROLLER"
            - name: KAFKA_CFG_LISTENERS
              value: "PLAINTEXT://:9092,CONTROLLER://:9093"
            - name: KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP
              value: "CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,EXTERNAL:PLAINTEXT"
            - name: KAFKA_CFG_ADVERTISED_LISTENERS
              value: "PLAINTEXT://kafka-service.migratory.svc.cluster.local:9092"
            - name: KAFKA_CFG_CONTROLLER_QUORUM_VOTERS
              value: "0@127.0.0.1:9093"
            - name: ALLOW_PLAINTEXT_LISTENER
              value: "yes"
          volumeMounts:
            - name: data
              mountPath: /bitnami/kafka
      volumes:
        - name: data
          emptyDir: {}
To deploy the Kafka cluster, copy this manifest to a file kafka.yaml, and run:
kubectl apply -f kafka.yaml
Several environment variables, such as KAFKA_CFG_*, are used to customize Kafka for the purposes of this tutorial. For further details on these variables, you can consult the Bitnami package documentation.
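To make the naming convention concrete, here is a minimal sketch of how a KAFKA_CFG_* variable name relates to a Kafka property key, assuming the convention described in the Bitnami documentation (prefix removed, letters lowercased, underscores turned into dots). The function name kafka_cfg_to_property is hypothetical and is not part of the Bitnami image.

```shell
# Hypothetical helper: derive the Kafka property key that a KAFKA_CFG_*
# environment variable name is assumed to correspond to.
kafka_cfg_to_property() {
  # Strip the KAFKA_CFG_ prefix, lowercase the rest, replace "_" with ".".
  printf '%s\n' "${1#KAFKA_CFG_}" | tr '[:upper:]' '[:lower:]' | tr '_' '.'
}

kafka_cfg_to_property KAFKA_CFG_ADVERTISED_LISTENERS   # prints advertised.listeners
kafka_cfg_to_property KAFKA_CFG_NODE_ID                # prints node.id
```

Variables without the KAFKA_CFG_ prefix, such as ALLOW_PLAINTEXT_LISTENER, are settings of the container image itself rather than Kafka properties.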
Deploy a MigratoryData cluster
We will use the following Kubernetes manifest to build a cluster of one MigratoryData server. In the following sections, we will explore how to scale the cluster up and down, and how to enable the autoscaling feature of Kubernetes.
---
#
# Service used by the MigratoryData cluster to communicate with the clients
#
apiVersion: v1
kind: Service
metadata:
  namespace: migratory
  name: migratorydata-cs
  labels:
    app: migratorydata
spec:
  type: LoadBalancer
  ports:
    - name: client-port
      port: 8888
      protocol: TCP
      targetPort: 8800
  selector:
    app: migratorydata
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: migratorydata
  namespace: migratory
spec:
  selector:
    matchLabels:
      app: migratorydata
  replicas: 1 # The desired number of cluster members 🅑
  template:
    metadata:
      labels:
        app: migratorydata
    spec:
      containers:
        - name: migratorydata-cluster
          imagePullPolicy: Always
          image: migratorydata/server:latest
          env:
            - name: MIGRATORYDATA_EXTRA_OPTS
              value: "-DMemory=128MB \
                -DClusterEngine=kafka \
                -DLogLevel=INFO \
                -DX.ConnectionOffload=true"
            - name: MIGRATORYDATA_KAFKA_EXTRA_OPTS
              value: "-Dbootstrap.servers=kafka-service.migratory.svc.cluster.local:9092 -Dtopics=server"
            - name: MIGRATORYDATA_JAVA_GC_LOG_OPTS
              value: "-XX:+PrintCommandLineFlags -XX:+PrintGC -XX:+PrintGCDetails -XX:+DisableExplicitGC -Dsun.rmi.dgc.client.gcInterval=0x7ffffffffffffff0 -Dsun.rmi.dgc.server.gcInterval=0x7ffffffffffffff0 -verbose:gc"
          resources:
            requests:
              memory: "256Mi"
              cpu: "0.5"
          ports:
            - name: client-port
              containerPort: 8800
            - name: prometheus-port
              containerPort: 9988
          readinessProbe:
            tcpSocket:
              port: 8800
            initialDelaySeconds: 20
            failureThreshold: 5
            periodSeconds: 5
          livenessProbe:
            tcpSocket:
              port: 8800
            initialDelaySeconds: 10
            failureThreshold: 5
            periodSeconds: 5
This manifest contains a Service and a Deployment. The Service handles the clients of the MigratoryData cluster.
In this manifest, we’ve used the MIGRATORYDATA_EXTRA_OPTS environment variable, which can define specific parameters or adjust the default value of any parameter listed in the Configuration Guide. Here, it modifies the default values of certain parameters such as Memory, and sets the parameter ClusterEngine to enable the Kafka native add-on.
To customize MigratoryData’s native add-on for Kafka, the environment variable MIGRATORYDATA_KAFKA_EXTRA_OPTS can define specific parameters or adjust the default value of any parameter of the Kafka native add-on. In the manifest above, we’ve used this environment variable to set the parameters bootstrap.servers and topics, so that the add-on connects to the Kafka cluster deployed earlier, which listens on port 9092, and consumes the Kafka topic server.
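Both environment variables carry a space-separated list of -Dname=value options. As a minimal sketch of that format, the hypothetical helper build_d_opts below assembles such a string from name=value pairs; it only illustrates the syntax seen in the manifest and does not validate parameter names.

```shell
# Hypothetical helper: join "name=value" arguments into the
# "-Dname=value -Dname=value" form used in the manifest.
# The body runs in a subshell so its variables do not leak.
build_d_opts() (
  opts=""
  for kv in "$@"; do
    # Prefix each pair with -D, separating entries with a single space.
    opts="${opts}${opts:+ }-D${kv}"
  done
  printf '%s\n' "$opts"
)

build_d_opts "bootstrap.servers=kafka-service.migratory.svc.cluster.local:9092" "topics=server"
# prints -Dbootstrap.servers=kafka-service.migratory.svc.cluster.local:9092 -Dtopics=server
```

Once the Deployment exists, the resulting string could be applied without editing the manifest, for example with kubectl set env deployment/migratorydata --namespace migratory MIGRATORYDATA_KAFKA_EXTRA_OPTS="$(build_d_opts ...)", which triggers a rolling restart of the pods.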
To deploy the MigratoryData cluster, copy this manifest to a file migratorydata-cluster.yaml, and run the command:
kubectl apply -f migratorydata-cluster.yaml
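kubectl apply returns as soon as the objects are accepted, not when the pods are ready. A minimal sketch for waiting on the rollout, assuming the resource names and namespace from the manifests above; the function name wait_for_rollout and the 120-second default timeout are illustrative choices.

```shell
# Hypothetical helper: block until the given workload has finished rolling
# out in the migratory namespace, or fail once the timeout expires.
#   $1 = resource (default: deployment/migratorydata)
#   $2 = timeout  (default: 120s, an arbitrary value)
wait_for_rollout() {
  kubectl rollout status "${1:-deployment/migratorydata}" \
    --namespace migratory \
    --timeout="${2:-120s}"
}

# Usage:
#   wait_for_rollout                      # the MigratoryData Deployment
#   wait_for_rollout statefulset/kafka    # the Kafka StatefulSet
```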
Namespace switch
Because the deployment concerns the namespace migratory, switch to this namespace as follows:
kubectl config set-context --current --namespace=migratory
To return to the default namespace, run:
kubectl config set-context --current --namespace=default
Verify installation
Check the running pods to ensure the migratorydata and kafka pods are running:
kubectl get pods
The output of this command should include something similar to the following:
NAME                             READY   STATUS    RESTARTS   AGE
kafka-0                          1/1     Running   0          89s
migratorydata-6447f9c7cb-c9s8g   1/1     Running   0          66s
You can check the logs of a pod by running a command such as the following, using the pod name from your own output:
kubectl logs migratorydata-6447f9c7cb-c9s8g
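Because the pod name of a Deployment ends in a generated suffix, it can be more convenient to select pods by label. A minimal sketch, assuming the app labels from the manifests above; the function name logs_by_app and the default of 50 lines are illustrative.

```shell
# Hypothetical helper: print the most recent log lines of every pod that
# carries the given app label in the migratory namespace.
#   $1 = value of the app label (default: migratorydata)
#   $2 = number of lines per pod (default: 50, an arbitrary value)
logs_by_app() {
  kubectl logs --namespace migratory \
    --selector "app=${1:-migratorydata}" \
    --tail="${2:-50}"
}

# Usage:
#   logs_by_app            # MigratoryData pods
#   logs_by_app kafka 20   # last 20 lines of the Kafka pod
```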
Test installation
To expose the load balancer Service of the manifest above, use the minikube tunnel command. Run it in a separate terminal, because it must keep running while you access the Service:
minikube tunnel
Now, you can check that the Service of the Kubernetes manifest above is up and running:
kubectl get svc
You should see an output similar to the following:
NAME               TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
kafka-service      ClusterIP      10.43.244.197   <none>        9092/TCP         2m8s
migratorydata-cs   LoadBalancer   10.43.237.196   127.0.0.1     8888:31735/TCP   105s
You should now be able to connect to the address assigned by Kubernetes to the load balancer service, shown in the EXTERNAL-IP column. In this case, the external IP address is 127.0.0.1 and the port is 8888. Open the corresponding URL http://127.0.0.1:8888 in your browser. You should see a welcome page that features a demo application, under the Debug Console menu, for publishing real-time messages to and consuming them from the MigratoryData cluster.
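The same check can be made from a terminal. A minimal sketch using curl, assuming the address and port shown in the example output above; the function name http_status and the 5-second default timeout are illustrative.

```shell
# Hypothetical helper: print only the HTTP status code returned by a URL.
#   $1 = URL (default: the address from the example output above)
#   $2 = maximum time in seconds (default: 5, an arbitrary value)
http_status() {
  curl --silent --output /dev/null \
    --write-out '%{http_code}\n' \
    --max-time "${2:-5}" \
    "${1:-http://127.0.0.1:8888}"
}

# Usage:
#   http_status
#   http_status http://127.0.0.1:8888 10
```

A status in the 2xx range suggests the welcome page is being served. If curl prints 000, the connection itself failed, which usually means minikube tunnel is not running or the pod is not ready yet.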
Scaling
Manual scaling up
In the example above, we deployed a cluster with a single MigratoryData server. You can deploy more MigratoryData servers in the cluster by modifying the value of the replicas field 🅑. For example, to scale up the cluster to three members, run:
kubectl scale deployment migratorydata --replicas=3
Manual scaling down
If the load of your system decreases, you can remove one member from the cluster by modifying the replicas field as follows:
kubectl scale deployment migratorydata --replicas=2
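After scaling in either direction, you may want to confirm how many members are actually ready. A minimal sketch that reads the count from the Deployment status, assuming the names used in this tutorial; the function name ready_replicas is hypothetical.

```shell
# Hypothetical helper: print the number of ready pods reported by a
# Deployment in the migratory namespace (empty output means none are ready).
#   $1 = Deployment name (default: migratorydata)
ready_replicas() {
  kubectl get deployment "${1:-migratorydata}" \
    --namespace migratory \
    --output jsonpath='{.status.readyReplicas}'
}

# Usage:
#   ready_replicas
```

New members need some time to pass the readiness probe, so the value can lag behind the requested replicas for a short while.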
Autoscaling
Manual scaling is practical if the load of your system changes gradually. Otherwise, you can use the autoscaling feature of Kubernetes.
Kubernetes can monitor the load of your system, typically expressed in CPU usage, and scale your MigratoryData cluster up and down by automatically modifying the replicas field.
For example, the following command adds members, up to a maximum of five, when the CPU usage of the existing members rises above 50%, and removes members, while keeping at least three active, when the CPU usage falls below 50%:
kubectl autoscale deployment migratorydata \
--cpu-percent=50 --min=3 --max=5
Alternatively, you can use a YAML manifest as follows:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  namespace: migratory
  name: migratorydata-autoscale # you can use any name here
spec:
  maxReplicas: 5
  minReplicas: 3
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: migratorydata
  targetCPUUtilizationPercentage: 50
Save it to a file migratorydata-autoscale.yaml, and run:
kubectl apply -f migratorydata-autoscale.yaml
Now, you can display information about the autoscaler object above using the following command:
kubectl get hpa
While testing cluster autoscaling, it is important to understand that the Kubernetes autoscaler periodically retrieves CPU usage information from the cluster members. As a result, the autoscaling process may not appear instantaneous, but this delay aligns with the normal behavior of Kubernetes.
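To get a feel for the numbers, the sketch below applies the basic scaling rule given in the Kubernetes documentation, desired = ceil(current × currentCPU / targetCPU), and clamps the result to the configured minimum and maximum. It is a simplification: the real autoscaler also applies a tolerance band and stabilization windows, which are ignored here. The function name desired_replicas is hypothetical.

```shell
# Hypothetical helper: estimate the replica count the autoscaler aims for.
#   $1 = current replicas    $2 = current average CPU utilization (%)
#   $3 = target CPU (%)      $4 = min replicas    $5 = max replicas
# The body runs in a subshell so its variables do not leak.
desired_replicas() (
  current=$1 cpu=$2 target=$3 min=$4 max=$5
  # Integer ceiling of current * cpu / target.
  desired=$(( (current * cpu + target - 1) / target ))
  # Clamp to the configured bounds.
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
)

desired_replicas 3 80 50 3 5   # ceil(4.8) = 5, prints 5
desired_replicas 3 20 50 3 5   # ceil(1.2) = 2, clamped to the minimum, prints 3
```

On Minikube, the autoscaler only receives CPU figures if a metrics source is available, for example after enabling the metrics-server addon with minikube addons enable metrics-server.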
Uninstall
Delete the namespace, and with it all the Kubernetes resources created in it for this deployment:
kubectl delete -f migratory-namespace.yaml
Switch back to the default namespace:
kubectl config set-context --current --namespace=default
Build realtime apps
First, please read the documentation of the Kafka native add-on to understand the automatic mapping between MigratoryData subjects and Kafka topics.
Utilize MigratoryData’s client APIs to create real-time applications that communicate with your MigratoryData cluster via your Kafka cluster.
Also, employ Kafka’s APIs or tools to generate real-time messages destined for Kafka, which are subsequently delivered to MigratoryData’s clients. Similarly, consume real-time messages from Kafka that originate from MigratoryData’s clients.
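As a quick way to try the Kafka side, the sketch below sends one keyed record to a topic with the console producer shipped with Kafka. It assumes that the pod is named kafka-0 as in the output shown earlier and that kafka-console-producer.sh is on the PATH of the Bitnami image; the function name publish_to_kafka and the sample key and value are hypothetical. Which MigratoryData subject a given topic and key correspond to is described in the documentation of the Kafka native add-on.

```shell
# Hypothetical helper: publish a single "key:value" record to a Kafka topic
# by running the console producer inside the kafka-0 pod.
#   $1 = topic    $2 = record key    $3 = record value
publish_to_kafka() {
  printf '%s:%s\n' "$2" "$3" |
    kubectl exec --stdin --namespace migratory kafka-0 -- \
      kafka-console-producer.sh \
        --bootstrap-server kafka-service.migratory.svc.cluster.local:9092 \
        --topic "$1" \
        --property parse.key=true \
        --property key.separator=:
}

# Usage (topic "server" is the one configured in the manifest above):
#   publish_to_kafka server greeting "hello"
```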