This tutorial shows how to deploy a MigratoryData cluster using Elastic Kubernetes Service (EKS).

Prerequisites

Before deploying MigratoryData on EKS, ensure that you have an AWS account and have installed the following tools, all of which are used in this tutorial:

  • aws — the AWS command line interface
  • eksctl — the official CLI for Amazon EKS
  • kubectl — the Kubernetes command line tool
  • helm — the Kubernetes package manager
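
You can check that these tools are installed by printing their versions:

aws --version
eksctl version
kubectl version --client
helm version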

Shell variables

To avoid hardcoding the name of the EKS cluster, let’s define an environment variable as follows:

export EKS_CLUSTER=eks-migratorydata

Create an EKS cluster

Log in to AWS with the following command and follow the on-screen instructions to configure your AWS credentials:

aws configure

Create an EKS cluster with at least three and at most five nodes:

  • Create the cluster configuration file. For the NLB load balancer to work, the awsLoadBalancerController parameter must be changed from false to true:
eksctl create cluster --name=$EKS_CLUSTER \
--nodes-min=3 \
--nodes-max=5 \
--region=us-east-1 \
--zones=us-east-1a,us-east-1b \
--ssh-access=true \
--dry-run | sed 's/awsLoadBalancerController: false/awsLoadBalancerController: true/g' > cluster-config.yaml
  • Create the cluster:
eksctl create cluster -f cluster-config.yaml
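
You can confirm that the sed substitution above enabled the controller by inspecting the generated configuration file (ideally before creating the cluster):

grep awsLoadBalancerController cluster-config.yaml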

Check if the EKS nodes are ready with the following command:

kubectl get nodes

Install a load balancer

Install the AWS Load Balancer Controller as follows:

helm repo add "eks" "https://aws.github.io/eks-charts"
helm repo update
helm upgrade -i aws-load-balancer-controller \
eks/aws-load-balancer-controller \
--namespace kube-system \
--set clusterName=$EKS_CLUSTER
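
You can verify that the controller has been deployed and is available with:

kubectl get deployment -n kube-system aws-load-balancer-controller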

Create namespace

Create a namespace migratory for all the resources created for this environment by copying the following to a file migratory-namespace.yaml:

apiVersion: v1
kind: Namespace
metadata:
  name: migratory

Then, execute the command:

kubectl apply -f migratory-namespace.yaml
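
You can confirm that the namespace has been created with:

kubectl get namespace migratory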

Create NLB service

Create an NLB service to balance the traffic from clients across the members of the MigratoryData cluster using the following YAML:

#
# Service used by the MigratoryData cluster to communicate with the clients
#
apiVersion: v1
kind: Service
metadata:
  namespace: migratory
  name: migratorydata-cs
  labels:
    app: migratorydata
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: external
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
spec:
  type: LoadBalancer
  ports:
    - name: client-port
      port: 80
      protocol: TCP
      targetPort: 8800
  selector:
    app: migratorydata

Copy this YAML to a file, say nlb-service.yaml, and run:

kubectl apply -f nlb-service.yaml

Deploy MigratoryData

We will use the following Kubernetes manifest to build a cluster of three MigratoryData servers:

#
# Headless service used for inter-cluster communication
#
apiVersion: v1
kind: Service
metadata:
  name: migratorydata-hs
  namespace: migratory
  labels:
    app: migratorydata
spec:
  clusterIP: None
  ports:
    - name: inter-cluster1
      port: 8801
      protocol: TCP
      targetPort: 8801
    - name: inter-cluster2
      port: 8802
      protocol: TCP
      targetPort: 8802
    - name: inter-cluster3
      port: 8803
      protocol: TCP
      targetPort: 8803
    - name: inter-cluster4
      port: 8804
      protocol: TCP
      targetPort: 8804
  publishNotReadyAddresses: true
  selector:
    app: migratorydata
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  namespace: migratory
  name: migratorydata-pdb
spec:
  minAvailable: 3 # The value must be equal to or higher than the number of seed members 🅐
  selector:
    matchLabels:
      app: migratorydata
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: migratorydata
  namespace: migratory
  labels:
    app: migratorydata
spec:
  selector:
    matchLabels:
      app: migratorydata
  serviceName: migratorydata-hs
  replicas: 3 # The desired number of cluster members 🅑
  updateStrategy:
    type: RollingUpdate
  podManagementPolicy: OrderedReady
  template:
    metadata:
      labels:
        app: migratorydata
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: "app"
                      operator: In
                      values:
                        - migratorydata
                topologyKey: "kubernetes.io/hostname"
      containers:
        - name: migratorydata
          imagePullPolicy: Always
          image: migratorydata/server:latest
          env:
            - name: MIGRATORYDATA_EXTRA_OPTS
              value: "-DMemory=128MB \
                -DClusterDeliveryMode=Guaranteed \
                -DLogLevel=INFO \
                -DX.ConnectionOffload=true \
                -DClusterSeedMemberCount=3" # Define the number of seed members 🅒
            - name: MIGRATORYDATA_JAVA_GC_LOG_OPTS
              value: "-XX:+PrintCommandLineFlags -XX:+PrintGC -XX:+PrintGCDetails -XX:+DisableExplicitGC -Dsun.rmi.dgc.client.gcInterval=0x7ffffffffffffff0 -Dsun.rmi.dgc.server.gcInterval=0x7ffffffffffffff0 -verbose:gc"
          command:
            - bash
            - "-c"
            - |
              set -x

              HOST=`hostname -s`
              DOMAIN=`hostname -d`

              CLUSTER_PORT=8801
              MAX_REPLICAS=5 # Define the maximum number of cluster members 🅓

              if [[ $HOST =~ (.*)-([0-9]+)$ ]]; then
                  NAME=${BASH_REMATCH[1]}
              fi

              CLUSTER_MEMBER_LISTEN=$HOST.$DOMAIN:$CLUSTER_PORT
              echo $CLUSTER_MEMBER_LISTEN
              MIGRATORYDATA_EXTRA_OPTS="$MIGRATORYDATA_EXTRA_OPTS -DClusterMemberListen=$CLUSTER_MEMBER_LISTEN"

              CLUSTER_MEMBERS=""
              for (( i=1; i < $MAX_REPLICAS; i++ ))
              do
                  CLUSTER_MEMBERS="$CLUSTER_MEMBERS$NAME-$((i-1)).$DOMAIN:$CLUSTER_PORT,"
              done
              CLUSTER_MEMBERS="$CLUSTER_MEMBERS$NAME-$((MAX_REPLICAS-1)).$DOMAIN:$CLUSTER_PORT"
              echo $CLUSTER_MEMBERS
              MIGRATORYDATA_EXTRA_OPTS="$MIGRATORYDATA_EXTRA_OPTS -DClusterMembers=$CLUSTER_MEMBERS"

              echo $MIGRATORYDATA_EXTRA_OPTS
              export MIGRATORYDATA_EXTRA_OPTS

              ./start-migratorydata.sh              
          resources:
            requests:
              memory: "256Mi"
              cpu: "0.5"
          ports:
            - name: client-port
              containerPort: 8800
            - name: inter-cluster1
              containerPort: 8801
            - name: inter-cluster2
              containerPort: 8802
            - name: inter-cluster3
              containerPort: 8803
            - name: inter-cluster4
              containerPort: 8804
          readinessProbe:
            tcpSocket:
              port: 8800
            initialDelaySeconds: 60
            failureThreshold: 5
            periodSeconds: 5
          livenessProbe:
            tcpSocket:
              port: 8800
            initialDelaySeconds: 10
            failureThreshold: 5
            periodSeconds: 5

This manifest contains a Headless Service, a PodDisruptionBudget, and a StatefulSet. The Headless Service is used for inter-cluster communication, providing the DNS records corresponding to the members of the MigratoryData cluster.

In this manifest, we’ve used the MIGRATORYDATA_EXTRA_OPTS environment variable, which can be used to define specific parameters or adjust the default value of any parameter listed in the Configuration Guide. Here, it modifies the default values of parameters such as Memory and ClusterDeliveryMode, sets the ClusterMemberListen parameter to use port 8801 for inter-cluster communication, and defines the ClusterMembers parameter to establish the ordered list of cluster members.
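
For illustration, assuming the default Kubernetes cluster domain cluster.local, the startup script of the first pod, migratorydata-0, would typically compute values along the lines of:

ClusterMemberListen=migratorydata-0.migratorydata-hs.migratory.svc.cluster.local:8801
ClusterMembers=migratorydata-0.migratorydata-hs.migratory.svc.cluster.local:8801,migratorydata-1.migratorydata-hs.migratory.svc.cluster.local:8801,...,migratorydata-4.migratorydata-hs.migratory.svc.cluster.local:8801

where the list of cluster members continues, in order, up to migratorydata-4, as built by the MAX_REPLICAS loop of the startup script.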

For client connections, we’ve kept the default value of the Listen parameter, which is 8800. Furthermore, the NLB service created above maps this port to port 80. Consequently, clients will connect to the MigratoryData cluster on port 80.

To deploy the MigratoryData cluster, copy this manifest to a file migratorydata-cluster.yaml, and run the command:

kubectl apply -f migratorydata-cluster.yaml

Namespace switch

Because the resources above were created in the namespace migratory, switch to this namespace so that subsequent kubectl commands apply to it:

kubectl config set-context --current --namespace=migratory

To return to the default namespace, run:

kubectl config set-context --current --namespace=default

Verify installation

List the pods to ensure that the migratorydata pods are running:

kubectl get pods

The output of this command should include something similar to the following:

NAME              READY   STATUS    RESTARTS   AGE
migratorydata-0   1/1     Running   0          2m52s
migratorydata-1   1/1     Running   0          2m40s
migratorydata-2   1/1     Running   0          2m25s

You can check the logs of each cluster member by running a command as follows:

kubectl logs migratorydata-0
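
You can also check that the headless service publishes the DNS records used for inter-cluster communication, for example from a temporary pod (the busybox image used here is just an example):

kubectl run -it --rm dns-test --image=busybox:1.36 --restart=Never -- nslookup migratorydata-hs.migratory.svc.cluster.local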

Test installation

Now, you can check that the services created above are up and running:

kubectl get svc

You should see an output similar to the following:

NAME               TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)                               AGE
migratorydata-cs   LoadBalancer   10.100.81.52   NLB-DNS       80:30922/TCP                          95s
migratorydata-hs   ClusterIP      None           <none>        8801/TCP,8802/TCP,8803/TCP,8804/TCP   57s

You should now be able to connect to http://NLB-DNS:80 and run the demo app provided with each MigratoryData server of the cluster (where NLB-DNS is the external address assigned by the NLB service to your client service).
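
As a quick connectivity check from the command line, you can also send an HTTP request to the load balancer (replace NLB-DNS with the external address from the output above); a response indicates that the NLB is routing traffic to the cluster:

curl -i http://NLB-DNS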

Scaling

It’s recommended to read the Clustering section before moving forward here.

In the manifest above, it’s worth noting that we’ve set the maximum number of cluster members to 5 using MAX_REPLICAS 🅓. However, only the initial 3 members are created, as indicated by the replicas field 🅑, satisfying the minimum set by the minAvailable field 🅐. Additionally, the ClusterSeedMemberCount parameter 🅒 has been configured to 3, so the requirement that the minAvailable field 🅐 be equal to or higher than the number of seed members is fulfilled.

Therefore, according to this Kubernetes manifest, up to two additional cluster members can be added, either manually or through autoscaling, depending on the load of the system.

Manual scaling up

In the example above, you can scale the cluster up from three members to a maximum of five by modifying the value of the replicas field 🅑. For example, if the load of your system increases substantially, and supposing your nodes have enough resources available, you can add two new members to the cluster as follows:

kubectl scale statefulsets migratorydata --replicas=5

Note that the value assigned to the replicas field can be neither higher than the maximum number of members defined by the shell variable MAX_REPLICAS 🅓, nor smaller than the minimum number of cluster members defined by the minAvailable field 🅐.

Manual scaling down

If the load of your system decreases, you can remove one member from the cluster by modifying the replicas field as follows:

kubectl scale statefulsets migratorydata --replicas=4

The same constraint applies: the value assigned to the replicas field can be neither higher than the maximum defined by MAX_REPLICAS 🅓, nor smaller than the minimum defined by the minAvailable field 🅐.

Autoscaling

Manual scaling is practical if the load of your system changes gradually. Otherwise, you can use the autoscaling feature of Kubernetes.

Kubernetes can monitor the load of your system, typically expressed in CPU usage, and scale your MigratoryData cluster up and down by automatically modifying the replicas field.

In the example above, to automatically add new members (up to a maximum of 5 cluster members) when the CPU usage of the existing members rises above 50%, or remove existing members (down to a minimum of 3) when their CPU usage drops below 50%, use the following command:

kubectl autoscale statefulset migratorydata \
--cpu-percent=50 --min=3 --max=5

Alternatively, you can use a YAML manifest as follows:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  namespace: migratory
  name: migratorydata-autoscale # you can use any name here
spec:
  maxReplicas: 5
  minReplicas: 3
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: migratorydata 
  targetCPUUtilizationPercentage: 50

Save it to a file migratorydata-autoscale.yaml, and run:

kubectl apply -f migratorydata-autoscale.yaml

Now, you can display information about the autoscaler object above using the following command:

kubectl get hpa

and display CPU usage of cluster members with:

kubectl top pods

While testing cluster autoscaling, it is important to understand that the Kubernetes autoscaler periodically retrieves CPU usage information from the cluster members. As a result, the autoscaling process may not appear instantaneous, but this delay aligns with the normal behavior of Kubernetes.
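
To inspect the scaling decisions and events recorded by the autoscaler (for example, why a scale-up or scale-down was or was not triggered), you can run:

kubectl describe hpa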

Node Scaling with eksctl

Get the node group name with the following command:

eksctl get nodegroup --cluster=$EKS_CLUSTER --region=us-east-1

You should see an output similar to the following:

CLUSTER                 NODEGROUP       STATUS  CREATED                 MIN SIZE        MAX SIZE        DESIRED CAPACITY        INSTANCE TYPE   IMAGE ID        
eks-migratorydata       ng-78d1f82e     ACTIVE  2024-01-25T07:55:46Z    3               5               5                       m5.large        AL2_x86_64      

To scale the number of nodes in the EKS cluster, use the following command, replacing <NODE_GROUP> with the NODEGROUP value from the output above:

eksctl scale nodegroup --cluster=$EKS_CLUSTER --nodes=5 --name=<NODE_GROUP> --region=us-east-1

Verify that the number of nodes has increased with the following command:

kubectl get nodes

You should see an output similar to the following:

NAME                             STATUS   ROLES    AGE     VERSION
ip-192-168-0-196.ec2.internal    Ready    <none>   2m26s   v1.27.9-eks-5e0fdde
ip-192-168-20-197.ec2.internal   Ready    <none>   2m26s   v1.27.9-eks-5e0fdde
ip-192-168-46-194.ec2.internal   Ready    <none>   54m     v1.27.9-eks-5e0fdde
ip-192-168-49-230.ec2.internal   Ready    <none>   54m     v1.27.9-eks-5e0fdde
ip-192-168-8-103.ec2.internal    Ready    <none>   54m     v1.27.9-eks-5e0fdde

Node Failure Testing

MigratoryData clustering tolerates a number of cluster members being down or failing, as detailed in the Clustering section.

To simulate an EKS node failure, drain the node:

kubectl drain <node-name> --force --delete-local-data --ignore-daemonsets

Then, to bring the node back into service (i.e. mark it as schedulable again), use:

kubectl uncordon <node-name>
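
While draining and uncordoning nodes, you can watch how the MigratoryData pods are rescheduled onto the remaining nodes with:

kubectl get pods -o wide -w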

Uninstall

To uninstall, or if something went wrong with the commands above, you can remove the allocated resources (and try again) as detailed below.

Delete workspace

Delete the Kubernetes resources created for this deployment with:

kubectl delete -f migratory-namespace.yaml

Go back to the default namespace:

kubectl config set-context --current --namespace=default

Delete EKS cluster

eksctl delete cluster --name=$EKS_CLUSTER --region=us-east-1

Build realtime apps

Use any of MigratoryData’s client APIs to develop real-time applications that communicate with this MigratoryData cluster.