Follow the steps below to deploy Chef Automate High Availability (HA) on AWS (Amazon Web Services) cloud.
Install Chef Automate HA on AWS
Create a Virtual Private Cloud (VPC) in AWS before starting. See the AWS reference for VPC and CIDR creation.
If you want to use the default VPC, create the public and private subnets yourself if they are not already available.
The VPC needs 3 private and 3 public subnets (1 subnet of each kind per Availability Zone). Currently, only a dedicated subnet for each AZ is supported.
We recommend creating a new VPC. The bastion host should be in the same VPC.
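The one-subnet-per-AZ requirement can be checked before provisioning. The sketch below is only an illustration and is not part of the Chef Automate tooling: `count_azs` is a hypothetical helper that counts the distinct Availability Zones in a list of `subnet-id zone` pairs, and the subnet IDs and zone names are made up. In practice the input could come from `aws ec2 describe-subnets`, as shown in the comment.

```shell
#!/bin/sh
# Hypothetical helper: reads "subnet-id availability-zone" pairs on stdin
# and prints the number of distinct Availability Zones.
count_azs() {
  awk 'NF >= 2 { seen[$2] = 1 } END { n = 0; for (az in seen) n++; print n }'
}

# Made-up example input. Real input could come from:
#   aws ec2 describe-subnets --filters "Name=vpc-id,Values=<VPC-ID>" \
#     --query 'Subnets[].[SubnetId,AvailabilityZone]' --output text
private_subnets='subnet-aaa ap-southeast-2a
subnet-bbb ap-southeast-2b
subnet-ccc ap-southeast-2c'

az_count=$(printf '%s\n' "$private_subnets" | count_azs)
if [ "$az_count" -ge 3 ]; then
  echo "OK: subnets cover $az_count Availability Zones"
else
  echo "Need subnets in 3 Availability Zones, found $az_count" >&2
fi
```

Run the same check once for the private subnets and once for the public subnets.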
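Get AWS credentials (aws_access_key_id and aws_secret_access_key) with sufficient privileges. The sample setup later in this document uses Admin access and S3 bucket access.
Set these in ~/.aws/credentials on the bastion host: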
sudo su -
mkdir -p ~/.aws
echo "[default]" >> ~/.aws/credentials
echo "aws_access_key_id=<ACCESS_KEY_ID>" >> ~/.aws/credentials
echo "aws_secret_access_key=<SECRET_KEY>" >> ~/.aws/credentials
echo "region=<AWS-REGION>" >> ~/.aws/credentials
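Because the commands above append with `>>`, running them twice leaves duplicate entries in the file. As a minimal sketch of an alternative, the hypothetical helper `write_aws_credentials` below overwrites the file and restricts its permissions. The demonstration writes placeholder values to a scratch directory rather than to ~/.aws.

```shell
#!/bin/sh
# Hypothetical helper: write a [default] profile to the given credentials file.
# Arguments: <file> <access-key-id> <secret-key> <region>
write_aws_credentials() {
  cred_file=$1
  mkdir -p "$(dirname "$cred_file")"
  # Overwrite rather than append, so reruns do not duplicate entries.
  cat > "$cred_file" <<EOF
[default]
aws_access_key_id=$2
aws_secret_access_key=$3
region=$4
EOF
  chmod 600 "$cred_file"   # readable and writable by the owner only
}

# Demonstration against a scratch directory with placeholder values.
# On the bastion host the target would be ~/.aws/credentials.
demo_dir=$(mktemp -d)
write_aws_credentials "$demo_dir/credentials" "<ACCESS_KEY_ID>" "<SECRET_KEY>" "<AWS-REGION>"
cat "$demo_dir/credentials"
```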
Have a DNS certificate ready in ACM for 2 DNS entries. Example: chefautomate.example.com and chefinfraserver.example.com. See the AWS reference for creating a new DNS certificate in ACM.
Have an SSH key pair ready in AWS, so that new VMs are created using that pair. See the AWS reference for SSH key pair creation.
We do not support a passphrase for private key authentication.
The preferred key type is ed25519.
Make sure the sysctl utility is available on all Linux nodes.
- Do not modify the workspace path. It must always be "/hab/a2_deploy_workspace".
- We currently don't support AD-managed users on nodes. Only local Linux users are supported.
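Two of the prerequisites above, the sysctl utility and a private key without a passphrase, can be checked with a short script. This is a sketch under assumptions: `have_tool` and `key_has_no_passphrase` are hypothetical helpers, `KEY_FILE` is a placeholder path, and the passphrase check relies on `ssh-keygen -y`, which fails when the key cannot be read with the empty passphrase it is given.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named command is on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: succeed if the private key can be read with an empty
# passphrase. ssh-keygen -y prints the public key of a private key file.
key_has_no_passphrase() {
  ssh-keygen -y -P "" -f "$1" >/dev/null 2>&1
}

# Run on every node:
if have_tool sysctl; then
  echo "sysctl: found"
else
  echo "sysctl: missing" >&2
fi

# Run on the bastion host. KEY_FILE is a placeholder path.
KEY_FILE=${KEY_FILE:-$HOME/.ssh/my-key.pem}
if have_tool ssh-keygen && [ -f "$KEY_FILE" ]; then
  if key_has_no_passphrase "$KEY_FILE"; then
    echo "key: no passphrase"
  else
    echo "key: passphrase-protected or unreadable" >&2
  fi
fi
```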
Run the following steps on Bastion Host Machine:
- Make sure that the bastion machine is in the same VPC as the one mentioned in config.toml. Otherwise, VPC peering is required.
- Use subnet IDs instead of a CIDR block in config.toml to avoid subnet conflicts. With a CIDR block, provisioning fails if consecutive CIDR blocks are not available.
- If you choose s3 for backup_config, provide the bucket name in the s3_bucketName field. If the bucket exists, it is used directly for the backup configuration. If it doesn't exist, the deployment process creates it.
- If you choose efs, the deployment creates the EFS and mounts it on all frontend and backend nodes.
- If you choose " " (empty), you have to configure the backup manually after the deployment completes. We recommend setting backup_config to efs at the time of deployment.
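The three backup_config choices above can be checked against config.toml before provisioning. The following is a minimal sketch, not part of the Chef Automate CLI: `toml_value` and `check_backup_config` are hypothetical helpers, and the parsing assumes simple one-line `key = "value"` entries like those in the sample config. The demonstration uses a scratch file.

```shell
#!/bin/sh
# Hypothetical helper: print the quoted value of an uncommented `key = "value"` line.
# $1 = key, $2 = file
toml_value() {
  sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*\"\([^\"]*\)\".*/\1/p" "$2" | head -n 1
}

# Hypothetical helper: report what the backup settings in the file will do.
check_backup_config() {
  mode=$(toml_value backup_config "$1")
  bucket=$(toml_value s3_bucketName "$1")
  case "$mode" in
    s3)
      [ -n "$bucket" ] || { echo "backup_config is s3 but s3_bucketName is empty" >&2; return 1; }
      echo "s3 backup using bucket: $bucket" ;;
    efs)
      echo "efs backup: EFS will be created and mounted" ;;
    "")
      echo "no backup_config: backup must be configured manually after deployment" ;;
    *)
      echo "unexpected backup_config: $mode" >&2; return 1 ;;
  esac
}

# Demonstration with a scratch file. On the bastion host, pass config.toml instead.
demo=$(mktemp)
printf 'backup_config = "s3"\ns3_bucketName = "My-Bucket-Name"\n' > "$demo"
check_backup_config "$demo"
```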
Run the commands below to download the latest Automate CLI and airgapped bundle:
#Run commands as sudo.
sudo -- sh -c "
#Download Chef Automate CLI.
curl https://packages.chef.io/files/current/latest/chef-automate-cli/chef-automate_linux_amd64.zip \
| gunzip - > chef-automate && chmod +x chef-automate \
&& cp -f chef-automate /usr/bin/chef-automate

#Download latest Airgapped Bundle.
#To download a specific version, for example 4.2.59, replace latest.aib with 4.2.59.aib
curl https://packages.chef.io/airgap_bundle/current/automate/latest.aib -o automate.aib

#Generate the initial config for an AWS deployment
chef-automate init-config-ha aws
"
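The download commands note that a specific bundle version is fetched by replacing latest.aib with the version number. A small sketch of that substitution follows. `aib_url` is a hypothetical helper, and the `-fSL` flags in the commented curl call are an addition here so that HTTP errors fail loudly.

```shell
#!/bin/sh
# Hypothetical helper: print the airgap bundle URL for a version.
# With no argument it prints the URL of the latest bundle.
aib_url() {
  version=${1:-latest}
  echo "https://packages.chef.io/airgap_bundle/current/automate/${version}.aib"
}

aib_url          # URL of the latest bundle
aib_url 4.2.59   # URL of the example version mentioned above

# To download (requires network access):
#   curl -fSL "$(aib_url 4.2.59)" -o automate.aib
```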
Note: Chef Automate bundles are available for 365 days from the release of a version. However, the milestone release bundles are available for download forever.
Update the config with the relevant data. A sample config is given later in this document.
- Set ssh_user, which has access to all the machines. Example: ec2-user (the value used in the sample config).
- Set ssh_port if your AMI runs SSH on a custom port. The default is 22.
- Set ssh_key_file to the path of the key file downloaded from the AWS SSH key pair that will be used to create all the VMs. This key gives access to all the VMs.
- sudo_password is only used to switch to the sudo user. If you have configured a password for the sudo user, provide it here.
- We support only private key authentication.
- Set s3_bucketName to a unique value.
- Set admin_password, which is used to access the Chef Automate UI.
- Don't set fqdn for this AWS deployment.
- Set instance_count for Chef Automate, Chef Infra Server, Postgresql, and OpenSearch.
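- Set AWS Config Details:
- Set profile (the sample config uses "default").
- Set region (the sample config uses "ap-southeast-2").
- Set aws_vpc_id to the VPC created as a prerequisite. Example: "vpc12318h".
- Set private_custom_subnets and public_custom_subnets to the subnet IDs of that VPC. Example for public_custom_subnets: ["subnet-p556d512", "subnet-p556d513", "subnet-p556d514"].
- Set ssh_key_pair_name to the SSH key pair created as a prerequisite. The value is only the name of the AWS SSH key pair, without the .pem extension. The key content should be the same as the content of the file referenced by ssh_key_file.
- Set setup_managed_services to false, because these deployment steps are for a Non-Managed Services AWS Deployment. The sample config also uses false.
- Set ami_id. This value depends on your AWS region and the operating system image you want to use.
- Use the Hardware Requirement Calculator sheet to find out which instance types you need for your load.
- Set the instance type for Chef Automate in automate_server_instance_type.
- Set the instance type for Chef Infra Server in chef_server_instance_type.
- Set the instance type for OpenSearch in opensearch_server_instance_type.
- Set the instance type for Postgresql in postgresql_server_instance_type.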
- Set automate_lb_certificate_arn to the ARN of the certificate created in AWS ACM for the Chef Automate DNS entry (chefautomate.example.com in the example above).
- Set chef_server_lb_certificate_arn to the ARN of the certificate created in AWS ACM for the Chef Infra Server DNS entry (chefinfraserver.example.com in the example above).
- Set automate_ebs_volume_size based on your load needs.
- Set chef_ebs_volume_size based on your load needs.
- Set opensearch_ebs_volume_size based on your load needs.
- Set postgresql_ebs_volume_size based on your load needs.
- Set postgresql_ebs_volume_type. The default value is "gp3". Change this based on your needs.
Note: See the documentation on adding certificates for services during deployment to learn more.
After updating the config, continue with provisioning the infrastructure:
#Run commands as sudo.
sudo -- sh -c "
#Print data in the config
cat config.toml

#Run provision command to deploy automate.aib with the settings in config.toml
chef-automate provision-infra config.toml --airgap-bundle automate.aib
"
Once provisioning succeeds, if you have added a custom DNS name to your configuration file (fqdn), map the load balancer FQDN from the output of the previous command to your DNS name at your DNS provider. Then continue with the deployment:
sudo -- sh -c "
#Run deploy command to deploy automate.aib with the settings in config.toml
chef-automate deploy config.toml --airgap-bundle automate.aib

#After the deployment completes successfully, check the status of Chef Automate HA services
chef-automate status

#Check Chef Automate HA deployment information
chef-automate info
"
After the deployment completes successfully, run chef-automate info to get the automate_url and view the Automate UI. To change the FQDN from the load balancer URL to another FQDN, use the templates below.
Create a file a2.fqdn.toml:
[global]
  [global.v1]
    fqdn = "AUTOMATE-DNS-URL-WITHOUT-HTTP"
Run the command to apply the config from bastion
chef-automate config patch a2.fqdn.toml --automate
Create a file cs.fqdn.toml:
[global]
  [global.v1]
    fqdn = "AUTOMATE-DNS-URL-WITHOUT-HTTPS"
  [global.v1.external.automate]
    node = "https://AUTOMATE-DNS-URL"
Run the command to apply the config from the bastion
chef-automate config patch cs.fqdn.toml --chef_server
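Both patch files take the Automate DNS name without the URL scheme, which is easy to get wrong when copying a URL. The sketch below generates the two files from one input. `strip_scheme` and `write_fqdn_patches` are hypothetical helpers, and the demonstration writes to a scratch directory using the example domain from this document.

```shell
#!/bin/sh
# Hypothetical helper: drop a leading http:// or https:// and any trailing slashes.
strip_scheme() {
  echo "$1" | sed -e 's#^https\{0,1\}://##' -e 's#/*$##'
}

# Hypothetical helper: write a2.fqdn.toml and cs.fqdn.toml into a directory.
# $1 = Automate URL or hostname, $2 = output directory
write_fqdn_patches() {
  host=$(strip_scheme "$1")
  cat > "$2/a2.fqdn.toml" <<EOF
[global]
  [global.v1]
    fqdn = "$host"
EOF
  cat > "$2/cs.fqdn.toml" <<EOF
[global]
  [global.v1]
    fqdn = "$host"
  [global.v1.external.automate]
    node = "https://$host"
EOF
}

# Demonstration in a scratch directory.
out_dir=$(mktemp -d)
write_fqdn_patches "https://chefautomate.example.com/" "$out_dir"
cat "$out_dir/a2.fqdn.toml" "$out_dir/cs.fqdn.toml"
```

The generated files are then applied with the chef-automate config patch commands shown above.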
Create DNS entries for chefautomate.example.com and chefinfraserver.example.com pointing to the respective load balancers, as shown in the output of chef-automate info.
Check if Chef Automate UI is accessible by going to (Domain used for Chef Automate) https://chefautomate.example.com.
- Assuming 8+1 nodes (1 bastion, 1 for automate UI, 1 for Chef-server, 3 for Postgresql, 3 for Opensearch)
The user only needs to create and set up the bastion node, with an IAM role that grants Admin access and S3 bucket access attached to it.
The following config creates an S3 bucket for backup.
[architecture.aws]
ssh_user = "ec2-user"
ssh_port = "22"
ssh_key_file = "~/.ssh/my-key.pem"
# sudo_password = ""
backup_config = "s3"
s3_bucketName = "My-Bucket-Name"
secrets_key_file = "/hab/a2_deploy_workspace/secrets.key"
secrets_store_file = "/hab/a2_deploy_workspace/secrets.json"
architecture = "aws"
workspace_path = "/hab/a2_deploy_workspace"
backup_mount = "/mnt/automate_backups"

[automate.config]
admin_password = "MY-AUTOMATE-UI-PASSWORD"
fqdn = ""
instance_count = "1"
config_file = "configs/automate.toml"
enable_custom_certs = false
# root_ca = ""
# private_key = ""
# public_key = ""

[chef_server.config]
instance_count = "1"
enable_custom_certs = false
# Add Chef Server load balancer root-ca and keys
# private_key = ""
# public_key = ""

[opensearch.config]
instance_count = "3"
enable_custom_certs = false
# root_ca = ""
# admin_key = ""
# admin_cert = ""
# private_key = ""
# public_key = ""

[postgresql.config]
instance_count = "3"
enable_custom_certs = false
# Add Postgresql load balancer root-ca and keys
# root_ca = ""
# private_key = ""
# public_key = ""

[aws.config]
profile = "default"
region = "ap-southeast-2"
aws_vpc_id = "vpc12318h"
aws_cidr_block_addr = ""
private_custom_subnets = ["subnet-e556d512", "subnet-e556d513", "subnet-e556d514"]
public_custom_subnets = ["subnet-p556d512", "subnet-p556d513", "subnet-p556d514"]
ssh_key_pair_name = "my-key"
setup_managed_services = false
managed_opensearch_domain_name = ""
managed_opensearch_domain_url = ""
managed_opensearch_username = ""
managed_opensearch_user_password = ""
managed_opensearch_certificate = ""
aws_os_snapshot_role_arn = ""
os_snapshot_user_access_key_id = ""
os_snapshot_user_access_key_secret = ""
managed_rds_instance_url = ""
managed_rds_superuser_username = ""
managed_rds_superuser_password = ""
managed_rds_dbuser_username = ""
managed_rds_dbuser_password = ""
managed_rds_certificate = ""
ami_id = "ami-08d4ac5b634553e16"
delete_on_termination = true
automate_server_instance_type = "t3.medium"
chef_server_instance_type = "t3.medium"
opensearch_server_instance_type = "m5.large"
postgresql_server_instance_type = "m5.large"
automate_lb_certificate_arn = "arn:aws:acm:ap-southeast-2:112758395563:certificate/9b04-6513-4ac5-9332-2ce4e"
chef_server_lb_certificate_arn = "arn:aws:acm:ap-southeast-2:112758395563:certificate/9b04-6513-4ac5-9332-2ce4e"
chef_ebs_volume_iops = "100"
chef_ebs_volume_size = "50"
chef_ebs_volume_type = "gp3"
opensearch_ebs_volume_iops = "100"
opensearch_ebs_volume_size = "50"
opensearch_ebs_volume_type = "gp3"
postgresql_ebs_volume_iops = "100"
postgresql_ebs_volume_size = "50"
postgresql_ebs_volume_type = "gp3"
automate_ebs_volume_iops = "100"
automate_ebs_volume_size = "50"
automate_ebs_volume_type = "gp3"
lb_access_logs = "false"
X-Contact = ""
X-Dept = ""
X-Project = ""
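Minimum changes required in the sample config
- Set ssh_user, which has access to all the machines. E.g. ec2-user (as in the sample config).
- Set ssh_key_file to the path of a key that has access to all the machines or VMs. E.g. ~/.ssh/my-key.pem (as in the sample config).
- Set ami_id for the region where the infrastructure is being created. E.g. ami-08d4ac5b634553e16 (as in the sample config).
- Set the certificate ARN for both Automate and Chef Server in automate_lb_certificate_arn and chef_server_lb_certificate_arn.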
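A quick check that none of these minimum fields was left empty can save a failed provisioning run. The sketch below is an assumption-laden illustration: `missing_fields` is a hypothetical helper, and it only recognizes simple one-line `key = "value"` entries. It checks that a value is present, not that it is valid.

```shell
#!/bin/sh
# Hypothetical helper: print each minimum-change field that is missing or empty.
# $1 = path to the config file
missing_fields() {
  for key in ssh_user ssh_key_file ami_id automate_lb_certificate_arn chef_server_lb_certificate_arn; do
    # A field counts as set when an uncommented line assigns a non-empty quoted value.
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*\"[^\"]+\"" "$1"; then
      echo "$key"
    fi
  done
}

# Demonstration with a scratch file. On the bastion host, pass config.toml instead.
demo=$(mktemp)
printf 'ssh_user = "ec2-user"\nssh_key_file = "~/.ssh/my-key.pem"\nami_id = ""\n' > "$demo"
missing=$(missing_fields "$demo")
if [ -n "$missing" ]; then
  echo "Fill in before provisioning:"
  echo "$missing"
fi
```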
Add more nodes In AWS Deployment post deployment
Run the command from your bastion host. It takes arguments that determine which types of nodes to add to your HA setup, each with the count of nodes to add. For example:
To add 2 nodes to Automate, run:
chef-automate node add --automate-count 2
To add 3 nodes to Chef Server, run:
chef-automate node add --chef-server-count 3
To add 1 node to OpenSearch, run:
chef-automate node add --opensearch-count 1
To add 2 nodes to PostgreSQL, run:
chef-automate node add --postgresql-count 2
You can combine arguments to add nodes across several services in one command.
To add 1 node to Automate and 2 nodes to PostgreSQL, run:
chef-automate node add --automate-count 1 --postgresql-count 2
To add 1 node to Automate, 2 nodes to Chef Server, and 2 nodes to PostgreSQL, run:
chef-automate node add --automate-count 1 --chef-server-count 2 --postgresql-count 2
Once the command executes, it adds the supplied number of nodes to your Automate setup. The changes might take a while.
- If you have patched some external config to any of the existing services then make sure you apply the same on the new nodes as well. For example, if you have patched any external configurations like SAML or LDAP, or any other done manually post-deployment in automate nodes, make sure to patch those configurations on the new automate nodes. The same must be followed for services like Chef-Server, Postgresql, and OpenSearch.
- The new node will be configured with the certificates which were already configured in your HA setup.
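When node counts come from a script or a capacity plan, the node add command line can be assembled from the four counts. The sketch below only prints the command and never runs it. `node_add_cmd` is a hypothetical helper, and the flags are the ones shown in the examples above.

```shell
#!/bin/sh
# Hypothetical helper: print the node add command for the given counts.
# Arguments: <automate> <chef-server> <opensearch> <postgresql>; 0 skips a service.
node_add_cmd() {
  cmd="chef-automate node add"
  if [ "${1:-0}" -gt 0 ]; then cmd="$cmd --automate-count $1"; fi
  if [ "${2:-0}" -gt 0 ]; then cmd="$cmd --chef-server-count $2"; fi
  if [ "${3:-0}" -gt 0 ]; then cmd="$cmd --opensearch-count $3"; fi
  if [ "${4:-0}" -gt 0 ]; then cmd="$cmd --postgresql-count $4"; fi
  if [ "$cmd" = "chef-automate node add" ]; then
    echo "no nodes requested" >&2
    return 1
  fi
  echo "$cmd"
}

# prints: chef-automate node add --automate-count 1 --postgresql-count 2
node_add_cmd 1 0 0 2
```

Printing first gives a chance to review the command before running it on the bastion host.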
Delete single node In AWS Deployment post deployment
We do not recommend removing any node from the backend cluster. Replacing the node is recommended instead. See the node replacement reference for details.
Removing Postgresql or OpenSearch nodes is at your own risk and may result in data loss. Consult your database administrator before trying to delete Postgresql or OpenSearch nodes.
Below process can be done for
Run the command from your bastion host. It takes arguments that determine which type of node to remove from your HA setup, each with the IP address of the node to remove. For example:
To remove an Automate node, run:
chef-automate node remove --automate-ip "<automate-ip-address>"
To remove a Chef Server node, run:
chef-automate node remove --chef-server-ip "<chef-server-ip-address>"
To remove an OpenSearch node, run:
chef-automate node remove --opensearch-ip "<opensearch-ip-address>"
To remove a PostgreSQL node, run:
chef-automate node remove --postgresql-ip "<postgresql-ip-address>"
Once the command executes, it removes the node from your HA setup.
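Since removing the wrong node can be costly, it may help to validate the address before building the command. The sketch below assumes IPv4 addresses, which the source does not state. `is_ipv4` and `node_remove_cmd` are hypothetical helpers, the IP is a made-up example, and the command is only printed, not run.

```shell
#!/bin/sh
# Hypothetical helper: succeed if $1 looks like a dotted-quad IPv4 address.
is_ipv4() {
  echo "$1" | awk -F. '
    NF != 4 { exit 1 }
    { for (i = 1; i <= 4; i++) if ($i !~ /^[0-9]+$/ || $i > 255) exit 1 }'
}

# Hypothetical helper: print the node remove command for a service and IP.
# $1 = automate | chef-server | opensearch | postgresql, $2 = node IP
node_remove_cmd() {
  case "$1" in
    automate|chef-server|opensearch|postgresql) ;;
    *) echo "unknown service: $1" >&2; return 1 ;;
  esac
  is_ipv4 "$2" || { echo "not an IPv4 address: $2" >&2; return 1; }
  echo "chef-automate node remove --$1-ip \"$2\""
}

# prints: chef-automate node remove --automate-ip "10.0.1.15"
node_remove_cmd automate 10.0.1.15
```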
Uninstall chef automate HA
- Running the cleanup command removes all AWS resources created by the provision-infra command.
- The --force flag also removes storage (Object Storage/NFS) if it was created by the provision-infra command.
To uninstall Chef Automate HA instances after an unsuccessful deployment, run one of the commands below on your bastion host.
chef-automate cleanup --aws-deployment --force
or
chef-automate cleanup --aws-deployment