High Availability and Scaling

Our services don’t keep any local state on the machines they run on. Services that don’t have a database configured are completely stateless (EAP, FES, or WKD if using Pull-Sync), and the remaining services keep their state in the PostgreSQL database (EKM, WKD if using Push-Sync).

The approach to high availability and scaling is the same as for any stateless internal REST app running on the JVM. Follow these steps to scale:

Make sure the JVM is fully utilizing the hardware you give it, and/or increase the machine size.
Add more machines with one instance per machine. We recommend three machines for redundancy.
Add an HTTP load balancer, a reverse proxy such as HAProxy or NGINX, or a web application firewall in front of the machines. Use any standard distribution approach such as round-robin, random, hash of client IP, etc. Node stickiness isn’t required on the LB/RP. Depending on the load balancer or health checks being used, you may need to set two hosts in the api.accept.hosts configuration property: one host for client-initiated requests and another for health-check requests, which are often IP-based. Please refer to the configuration guide for General properties to learn more about how to set these properties.
Configure the LB/RP to check node health using the /health endpoint and to send requests only to healthy nodes.
All instances are equal. There are no primary or secondary instances.
If the service uses PostgreSQL, see the Scaling PostgreSQL guide.
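
To illustrate the load-balancing and health-check steps above, here is a minimal Python sketch of round-robin selection that skips unhealthy nodes. It is not how HAProxy, NGINX, or any particular load balancer is implemented. The class name RoundRobinBalancer, the function is_healthy, and the node names are hypothetical. Treating any 2xx response from /health as "healthy" is also an assumption, so check the actual response of the endpoint in your deployment.

```python
import http.client
import urllib.request
from typing import Callable, Optional, Sequence


def is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Probe {base_url}/health; assumes any 2xx response means healthy."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # Connection failures, timeouts, and HTTP error statuses all
        # count as "unhealthy" in this sketch.
        return False


class RoundRobinBalancer:
    """Rotates over equal nodes, skipping any that fail the health check.

    No stickiness and no primary/secondary roles: every node is a
    candidate for every request.
    """

    def __init__(self, nodes: Sequence[str],
                 check: Callable[[str], bool] = is_healthy) -> None:
        if not nodes:
            raise ValueError("at least one node is required")
        self._nodes = list(nodes)
        self._check = check
        self._next = 0  # index of the node to try first on the next pick

    def pick(self) -> Optional[str]:
        """Return the next healthy node in rotation, or None if all are down."""
        count = len(self._nodes)
        for offset in range(count):
            index = (self._next + offset) % count
            node = self._nodes[index]
            if self._check(node):
                self._next = (index + 1) % count
                return node
        return None


if __name__ == "__main__":
    # Stubbed health states instead of real HTTP probes (hypothetical nodes).
    states = {"node-1": True, "node-2": False, "node-3": True}
    balancer = RoundRobinBalancer(list(states), check=lambda n: states[n])
    for _ in range(4):
        print(balancer.pick())
```

This sketch probes health at selection time to stay short. Production load balancers typically run health checks in the background on an interval and route based on the last known result. With three machines, as recommended above, one node can fail its health check and traffic still alternates between the remaining two.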