Run a single MigratoryData instance
To start a single MigratoryData instance and allow clients to connect to it on port 8800, install Docker and run the following commands:
$ docker pull migratorydata/server:latest
$ docker run \
-d --name my_migratorydata -p 8800:8800 \
migratorydata/server:latest
On ARM-based machines (such as Apple Silicon), add the --platform flag to run the x86-64 image under emulation:
$ docker run --platform linux/amd64 \
-d --name my_migratorydata -p 8800:8800 \
migratorydata/server:latest
You should now be able to connect to http://yourhostname:8800
and run the demo app provided with the MigratoryData server, where yourhostname
is the DNS name or IP address of the machine running this MigratoryData instance, resolvable and reachable from your browser.
You can see the logs of the container using the following command:
$ docker logs my_migratorydata
To stop and remove the container use:
$ docker stop my_migratorydata
$ docker rm my_migratorydata
Custom Configuration
It is possible to customize every aspect of the MigratoryData server running in a Docker container as described below.
MIGRATORYDATA_EXTRA_OPTS
Any parameter of the MigratoryData server can be defined, or can override a parameter defined in the default configuration file, by setting it as an extra option in the MIGRATORYDATA_EXTRA_OPTS
environment variable, using the syntax:
-Dparameter=value
Note that no single or double quotes are used around either the parameter or its value. In addition, the space character is NOT allowed in the value of a parameter.
The complete list of parameters is available in the Configuration Guide.
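The constraints above can be checked before starting a container. The following is a minimal sketch using a hypothetical helper (not part of the image) that assembles a value for MIGRATORYDATA_EXTRA_OPTS and rejects parameter values containing spaces; the Memory and LogLevel parameters are taken from the examples later in this document:

```shell
# Hypothetical helper: assemble MIGRATORYDATA_EXTRA_OPTS from key=value pairs,
# rejecting values that contain spaces (not allowed by the option syntax).
build_extra_opts() {
  local opts="" kv
  for kv in "$@"; do
    case "$kv" in
      *" "*) echo "error: space not allowed in '$kv'" >&2; return 1 ;;
    esac
    opts="$opts -D$kv"
  done
  echo "${opts# }"   # trim the leading space
}

EXTRA_OPTS=$(build_extra_opts Memory=256MB LogLevel=INFO)
echo "$EXTRA_OPTS"   # prints: -DMemory=256MB -DLogLevel=INFO
```

The resulting string can then be passed to the container, for example with docker run -e MIGRATORYDATA_EXTRA_OPTS="$EXTRA_OPTS" ... as shown in the examples below.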
Java Options
Three environment variables, MIGRATORYDATA_JAVA_GC_LOG_OPTS
, MIGRATORYDATA_JAVA_GC_OPTS
, and MIGRATORYDATA_JAVA_EXTRA_OPTS
, can be used to customize, respectively, the garbage collection logging options, the garbage collectors to be used, and various other Java options.
All of these Java options come with default values which are sufficient for most use cases. Therefore, in most instances, it is not necessary to define these Java-related environment variables.
Adding a License Key
To use a license key with this image, override the parameter LicenseKey
of the default configuration file using an extra option as follows:
$ docker run -d -e MIGRATORYDATA_EXTRA_OPTS='-DLicenseKey=yourlicensekey' \
--name my_migratorydata -p 8800:8800 migratorydata/server:latest
where yourlicensekey
is the license key obtained from MigratoryData for evaluation, testing, or production use.
Enabling JMX Monitoring
To enable JMX monitoring for MigratoryData, define the JMX-related parameters as usual and publish the JMX port to the host as follows:
$ docker run -d \
-e MIGRATORYDATA_EXTRA_OPTS='-DLicenseKey=yourlicensekey -DMonitor=JMX -DMonitorUsername=admin -DMonitorPassword=pass \
-DMonitorJMX.Listen=*:3000 -DMonitorJMX.Authentication=true -DMonitorJMX.Encryption=false' \
--name my_migratorydata -p 8800:8800 -p 3000:3000 migratorydata/server:latest
You should now be able to connect with any JMX client to yourhostname:3000
using the credentials defined above (admin/pass
). Please note that in order to access the JMX monitoring with Java's jconsole
JMX client, you will need to provide two extra Java options using MIGRATORYDATA_JAVA_EXTRA_OPTS
as follows:
$ docker run -d \
-e MIGRATORYDATA_EXTRA_OPTS='-DLicenseKey=yourlicensekey -DMonitor=JMX -DMonitorUsername=admin -DMonitorPassword=pass \
-DMonitorJMX.Listen=*:3000 -DMonitorJMX.Authentication=true -DMonitorJMX.Encryption=false' \
-e MIGRATORYDATA_JAVA_EXTRA_OPTS='-Djava.net.preferIPv4Stack=true -Djava.rmi.server.hostname=yourhostname' \
--name my_migratorydata -p 8800:8800 -p 3000:3000 migratorydata/server:latest
Logging
Besides the logs written to the standard output, which are accessible with docker logs my_migratorydata
, this image also writes its logs to a folder, which defaults to /migratorydata/logs
. You can change the log folder location as follows:
$ docker run -d -e MIGRATORYDATA_EXTRA_OPTS='-DLicenseKey=yourlicensekey \
-DLogFolder=/myvolume/migratorydata/logs' -p 8800:8800 migratorydata/server:latest
You can use any parameter related to logging, including verbosity, rotation, compression, etc. For example, to record the access logs, use:
$ docker run -d -e MIGRATORYDATA_EXTRA_OPTS='-DLicenseKey=yourlicensekey \
-DLogFolder=/myvolume/migratorydata/logs \
-DAccessLog=true' -p 8800:8800 migratorydata/server:latest
Extensions
To deploy one or more extensions for the MigratoryData server, mount a volume containing the extensions into the MigratoryData standard extensions folder, which is /migratorydata/extensions
.
For example, supposing you developed an entitlement extension using the MigratoryData Entitlement Extension API and deployed extension.jar
to the (persistent) folder /myvolume/migratorydata/extensions
, then, in order to load this entitlement extension, run:
$ docker run -d -e MIGRATORYDATA_EXTRA_OPTS='-DLicenseKey=yourlicensekey -DEntitlement=Custom' \
-v /myvolume/migratorydata/extensions:/migratorydata/extensions \
--name mymigratorydata -p 8800:8800 migratorydata/server:latest
Alternatively, you can load your entitlement extension by creating a new image derived from migratorydata/server
as follows:
FROM migratorydata/server:latest
COPY extension.jar /migratorydata/extensions/extension.jar
Then, build with docker build -t custom_migratorydata .
and run:
$ docker run --name my_custom_migratorydata -d -p 8800:8800 custom_migratorydata
MigratoryData clustering on Kubernetes
Here is an example configuration which can be used to deploy a cluster of three MigratoryData servers on Kubernetes.
#
# Service used by the MigratoryData cluster to communicate with the clients
#
apiVersion: v1
kind: Service
metadata:
  name: migratorydata-cs
  # uncomment the next two lines to deploy the cluster using Application Gateway
  #annotations:
  #  service.beta.kubernetes.io/azure-load-balancer-internal: "true"
  labels:
    app: migratorydata
spec:
  type: LoadBalancer
  ports:
    - name: client-port
      port: 80
      protocol: TCP
      targetPort: 8800
  selector:
    app: migratorydata
---
#
# Headless service used for inter-cluster communication
#
apiVersion: v1
kind: Service
metadata:
  name: migratorydata-hs
  labels:
    app: migratorydata
spec:
  clusterIP: None
  ports:
    - name: inter-cluster1
      port: 8801
      protocol: TCP
      targetPort: 8801
    - name: inter-cluster2
      port: 8802
      protocol: TCP
      targetPort: 8802
    - name: inter-cluster3
      port: 8803
      protocol: TCP
      targetPort: 8803
    - name: inter-cluster4
      port: 8804
      protocol: TCP
      targetPort: 8804
  selector:
    app: migratorydata
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: migratorydata-pdb
spec:
  minAvailable: 3 # The value must be equal or higher than the number of seed members 🅐
  selector:
    matchLabels:
      app: migratorydata
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: migratorydata
spec:
  selector:
    matchLabels:
      app: migratorydata
  serviceName: migratorydata-hs
  replicas: 3 # The desired number of cluster members 🅑
  updateStrategy:
    type: RollingUpdate
  podManagementPolicy: OrderedReady
  template:
    metadata:
      labels:
        app: migratorydata
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: "app"
                      operator: In
                      values:
                        - migratorydata
                topologyKey: "kubernetes.io/hostname"
      containers:
        - name: migratorydata-cluster
          imagePullPolicy: Always
          image: migratorydata/server:latest
          env:
            - name: MIGRATORYDATA_JAVA_EXTRA_OPTS
              value: "-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap"
            - name: MIGRATORYDATA_EXTRA_OPTS
              value: "-DMemory=128MB \
                -DClusterDeliveryMode=Guaranteed \
                -DLogLevel=INFO \
                -DX.ConnectionOffload=true \
                -DClusterSeedMemberCount=3" # Define the number of seed members 🅒
          command:
            - bash
            - "-c"
            - |
              set -x
              HOST=`hostname -s`
              DOMAIN=`hostname -d`
              CLUSTER_PORT=8801
              MAX_REPLICAS=5 # Define the maximum number of cluster members 🅓
              if [[ $HOST =~ (.*)-([0-9]+)$ ]]; then
                NAME=${BASH_REMATCH[1]}
              fi
              CLUSTER_MEMBER_LISTEN=$HOST.$DOMAIN:$CLUSTER_PORT
              echo $CLUSTER_MEMBER_LISTEN
              MIGRATORYDATA_EXTRA_OPTS="$MIGRATORYDATA_EXTRA_OPTS -DClusterMemberListen=$CLUSTER_MEMBER_LISTEN"
              CLUSTER_MEMBERS=""
              for (( i=1; i < $MAX_REPLICAS; i++ ))
              do
                CLUSTER_MEMBERS="$CLUSTER_MEMBERS$NAME-$((i-1)).$DOMAIN:$CLUSTER_PORT,"
              done
              CLUSTER_MEMBERS="$CLUSTER_MEMBERS$NAME-$((MAX_REPLICAS-1)).$DOMAIN:$CLUSTER_PORT"
              echo $CLUSTER_MEMBERS
              MIGRATORYDATA_EXTRA_OPTS="$MIGRATORYDATA_EXTRA_OPTS -DClusterMembers=$CLUSTER_MEMBERS"
              echo $MIGRATORYDATA_EXTRA_OPTS
              export MIGRATORYDATA_EXTRA_OPTS
              ./start-migratorydata.sh
          resources:
            requests:
              memory: "256Mi"
              cpu: "0.5"
          ports:
            - name: client-port
              containerPort: 8800
            - name: inter-cluster1
              containerPort: 8801
            - name: inter-cluster2
              containerPort: 8802
            - name: inter-cluster3
              containerPort: 8803
            - name: inter-cluster4
              containerPort: 8804
          readinessProbe:
            tcpSocket:
              port: 8800
            initialDelaySeconds: 10
            periodSeconds: 5
          livenessProbe:
            tcpSocket:
              port: 8800
            initialDelaySeconds: 10
            periodSeconds: 5
The manifest above contains a Service, a headless Service, a PodDisruptionBudget, and a StatefulSet. The service handles the clients of the MigratoryData cluster. The headless service is used for inter-cluster communication; it provides the DNS records corresponding to the instances of the MigratoryData cluster.
In order to deploy the MigratoryData cluster on Kubernetes, copy the example configuration above into a file named, say, migratorydata-cluster.yaml
and run the following command:
$ kubectl apply -f migratorydata-cluster.yaml
Run the following command to check that the three pods of the example configuration are up and running:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
migratorydata-0 1/1 Running 0 2m52s
migratorydata-1 1/1 Running 0 2m40s
migratorydata-2 1/1 Running 0 2m25s
and you can check the logs of each cluster member by running a command as follows:
$ kubectl logs migratorydata-0
Also, run the following command to check that the two services of the example configuration are up and running:
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 37m
migratorydata-cs LoadBalancer 10.0.58.189 YourExternalIP 80:31596/TCP 4m8s
migratorydata-hs ClusterIP None <none> 8801/TCP,8802/TCP,8803/TCP,8804/TCP 4m7s
You should now be able to connect to http://YourExternalIP
and run the demo app provided with each MigratoryData server of the cluster, where YourExternalIP is the external IP address assigned by Kubernetes to the client service migratorydata-cs
, which can be obtained as shown above using the command kubectl get svc
.
Finally, let’s make a few remarks about the configuration above. We used the default port 8800
for listening for clients; this port is mapped to port 80
by the migratorydata-cs
service. Port 8801
is used by each cluster member for its ClusterMemberListen
parameter, and therefore, as in traditional deployments, port 8801
together with the next three consecutive ports, i.e. 8802
, 8803
, and 8804
, is reserved for inter-cluster communication.
Also, note that we’ve used the MIGRATORYDATA_EXTRA_OPTS
environment variable to customize each cluster member, as explained in more detail in the section “Custom Configuration” above.
The MIGRATORYDATA_JAVA_EXTRA_OPTS
environment variable is also used to provide Java options which cause the JVM to take into account the hardware resources assigned to each pod.
MigratoryData Scaling on Kubernetes
Before reading this section, it is recommended to read about the Elasticity feature of MigratoryData.
Note that in the example YAML above, the shell variable MAX_REPLICAS
🅓 defines a cluster with a maximum of 5
members. However, only the first 3
members of the cluster are created, as specified by the replicas
field 🅑, which is sufficient to satisfy the minimum criterion specified with the minAvailable
field 🅐. Also note that the number of seed members has been configured to 3
using the ClusterSeedMemberCount parameter 🅒, so the requirement that the value of the minAvailable
field 🅐 be equal to or higher than the number of seed members is satisfied.
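The member-list computation performed by the startup script 🅓 can be reproduced outside Kubernetes. The following is a minimal sketch; the pod hostname migratorydata-0 and the headless-service domain (with the namespace default) are assumed values for illustration:

```shell
# Reproduce, locally, the ClusterMembers list that the StatefulSet startup
# script derives from the pod hostname (assumed values below).
HOST=migratorydata-0                               # pod hostname (assumed)
DOMAIN=migratorydata-hs.default.svc.cluster.local  # headless-service domain (namespace assumed)
CLUSTER_PORT=8801
MAX_REPLICAS=5

# Strip the ordinal suffix to recover the StatefulSet name.
if [[ $HOST =~ (.*)-([0-9]+)$ ]]; then
  NAME=${BASH_REMATCH[1]}
fi

CLUSTER_MEMBERS=""
for (( i=1; i < MAX_REPLICAS; i++ )); do
  CLUSTER_MEMBERS="$CLUSTER_MEMBERS$NAME-$((i-1)).$DOMAIN:$CLUSTER_PORT,"
done
CLUSTER_MEMBERS="$CLUSTER_MEMBERS$NAME-$((MAX_REPLICAS-1)).$DOMAIN:$CLUSTER_PORT"
echo "$CLUSTER_MEMBERS"
```

This prints a comma-separated list of the five possible member addresses, migratorydata-0 through migratorydata-4, each on port 8801, which the script passes to the server via the ClusterMembers parameter.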
In this example, two additional cluster members could be added, then removed either manually or using autoscaling according to the load of the system.
Manual Scaling
In the example above, you can scale the cluster out from three members up to five members by modifying the value of the replicas
field 🅑. For example, if the load of your system increases substantially, and supposing your nodes have enough resources available, you can add two new members to the cluster by modifying the replicas
field as follows:
$ kubectl scale statefulsets migratorydata --replicas=5
Then, if the load of your system decreases, you might remove one member from the cluster by modifying the replicas
field as follows:
$ kubectl scale statefulsets migratorydata --replicas=4
Note that you cannot assign to the replicas
field a value higher than the maximum number of members defined by the shell variable MAX_REPLICAS
🅓, or lower than the minimum number of cluster members defined by the minAvailable
field 🅐.
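These bounds can be expressed as a small pre-scale guard. The following is a hypothetical sketch (check_replicas is not part of the deployment) that rejects a replica count outside the range set by the PodDisruptionBudget and the startup script:

```shell
# Hypothetical guard: validate a desired replica count against the bounds
# discussed above before calling `kubectl scale`.
MIN_AVAILABLE=3   # from the PodDisruptionBudget's minAvailable field
MAX_REPLICAS=5    # from the startup script's MAX_REPLICAS variable

check_replicas() {
  local n=$1
  if (( n < MIN_AVAILABLE )) || (( n > MAX_REPLICAS )); then
    echo "invalid replica count: $n (allowed: $MIN_AVAILABLE..$MAX_REPLICAS)"
    return 1
  fi
  echo "ok: $n"
}

check_replicas 4   # prints: ok: 4
```

It could then gate the scaling command, e.g. check_replicas 4 && kubectl scale statefulsets migratorydata --replicas=4.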
Autoscaling
Manual scaling is practical if the load of your system changes gradually. Otherwise, you can use the autoscaling feature of Kubernetes.
Kubernetes can monitor the load of your system, typically expressed in CPU usage, and scale your MigratoryData cluster up and down by automatically modifying the replicas
field.
In the example above, to add new members (up to a maximum of 5
cluster members) when the CPU usage of the existing members rises above 50%, and to remove members when the CPU usage of the existing members drops below 50%, use the following command:
$ kubectl autoscale statefulset migratorydata --cpu-percent=50 --min=3 --max=5
Alternatively, you can use a YAML manifest as follows:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: migratorydata-autoscale # you can use any name here
spec:
  maxReplicas: 5
  minReplicas: 3
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: migratorydata
  targetCPUUtilizationPercentage: 50
Save it to a file named, for example, migratorydata_autoscale.yaml
, then apply it as follows:
kubectl apply -f migratorydata_autoscale.yaml
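On clusters where the autoscaling/v2 API is available, an equivalent manifest can be written with the v2 schema. This is a sketch, not verified against any particular cluster version:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: migratorydata-autoscale
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: migratorydata
  minReplicas: 3
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
```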
Now, you can display information about the autoscaler object above using the following command:
kubectl get hpa
and display CPU usage of cluster members with:
kubectl top pods
While testing cluster autoscaling, be aware that the Kubernetes autoscaler gets the CPU usage from the cluster members periodically, so autoscaling might not appear immediate. However, this is normal Kubernetes behavior.
Node Autoscaling
Cloud infrastructure like Azure Kubernetes Service (AKS) can be configured to automatically spin up additional nodes if the resources of the existing nodes are insufficient to create new cluster members.
For example, to create an AKS cluster of minimum 3
nodes and maximum 5
nodes with node autoscaling enabled, use something like the following:
# install the AKS client
$ az aks install-cli
# log in to AKS
$ az login
# create a resource group
$ az group create --name myMigratoryDataGroup --location eastus
# create the cluster and enable the cluster autoscaling
$ az aks create \
--resource-group myMigratoryDataGroup \
--name myMigratoryDataCluster \
--node-count 3 \
--vm-set-type VirtualMachineScaleSets \
--enable-addons monitoring \
--generate-ssh-keys \
--load-balancer-sku standard \
--enable-cluster-autoscaler \
--min-count 3 \
--max-count 5
# connect to the AKS cluster
$ az aks get-credentials --resource-group myMigratoryDataGroup --name myMigratoryDataCluster
# check the AKS cluster nodes
$ kubectl get nodes
Finally, when you don’t need the AKS cluster of nodes, delete it:
$ az group delete --name myMigratoryDataGroup --yes --no-wait
For more details, please refer to the AKS documentation at:
https://docs.microsoft.com/en-us/azure/aks/cluster-autoscaler
While testing node autoscaling, be aware that adding or removing nodes from AKS might take some time. For example, evicting an unused node back to AKS might take up to several minutes. Here are more details about the timing of AKS cluster node autoscaling:
https://docs.microsoft.com/en-us/azure/aks/cluster-autoscaler#using-the-autoscaler-profile
Node Failure Testing
MigratoryData clustering tolerates a number of cluster members being down or failing, as detailed in the Elasticity section of the Architecture Guide.
In order to test an AKS node failure, use:
kubectl drain <node-name> --force --delete-local-data --ignore-daemonsets
Then, to make the AKS node schedulable again, use:
kubectl uncordon <node-name>
License
View the license information for the MigratoryData software contained in this Docker image.