AWS Deployment using EFS

Note

Chef Automate 4.10.1 released on 6th September 2023 includes improvements to the deployment and installation experience of Automate HA. Please read the blog to learn more about key improvements. Refer to the pre-requisites page (On-Premises, AWS) and plan your usage with your customer success manager or account manager.

Note

If you set backup_config = "efs" in config.toml, backup is already configured during deployment and you can skip the steps below. If you left backup_config blank, you must configure backup manually using the following steps.

Overview

A shared file system is always required to create OpenSearch snapshots. To register the snapshot repository in OpenSearch, mount the same shared file system at the same location on all master and data nodes, then register that location in the path.repo setting on all master and data nodes.

Setting up the backup configuration

To create an EFS file system, refer to the sample steps on the Create your Amazon EFS file system page.
Create the directory /mnt/automate_backups/ on all frontend and backend nodes, then mount the EFS file system on every VM manually. For mounting instructions, refer to the Amazon EFS documentation.

Configuration in OpenSearch Node

Mount the EFS file system on all OpenSearch nodes. The examples below assume it is mounted at /mnt/automate_backups/.

Create an opensearch sub-directory and set its ownership as shown below (on all OpenSearch nodes).

sudo mkdir -p /mnt/automate_backups/opensearch
sudo chown hab:hab /mnt/automate_backups/opensearch/
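
Before moving on, it can help to confirm on each node that the directory exists with the expected ownership. The following is a minimal sketch, not part of Chef Automate: dir_owned_by is a hypothetical helper name, and it assumes GNU stat (the -c option), which is available on most Linux distributions.

```shell
#!/usr/bin/env bash
# Sketch: succeed only when a directory exists and is owned by user:group.
# dir_owned_by is a hypothetical helper; it relies on GNU stat's -c option.
dir_owned_by() {
    local dir="$1" expected="$2" actual
    [ -d "$dir" ] || return 1
    actual=$(stat -c '%U:%G' "$dir") || return 1
    [ "$actual" = "$expected" ]
}

# Example, run on an OpenSearch node:
#   dir_owned_by /mnt/automate_backups/opensearch hab:hab && echo "ownership OK"
```

The function only reports the current state; it does not change ownership or verify that the directory is actually on the EFS mount.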

Configuration for OpenSearch Node from Provision host

Configure the OpenSearch path.repo attribute.

Create a TOML file (os_config.toml) with the following content:
```
[path]
repo = "/mnt/automate_backups/opensearch"
```
From the bastion host, patch os_config.toml to the OpenSearch cluster:
```
chef-automate config patch --opensearch os_config.toml
```
This command restarts the OpenSearch cluster.
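
If the mount path differs between environments, the TOML fragment can be generated rather than typed by hand. This is a minimal sketch under that assumption; write_os_config is a hypothetical helper name and is not provided by chef-automate.

```shell
#!/usr/bin/env bash
# Sketch: write the [path] repo fragment for a given snapshot directory.
# write_os_config is a hypothetical helper, not a chef-automate command.
write_os_config() {
    local repo_dir="$1" out_file="$2"
    cat > "$out_file" <<EOF
[path]
repo = "${repo_dir}"
EOF
}

# Example, run on the bastion host:
#   write_os_config /mnt/automate_backups/opensearch os_config.toml
#   chef-automate config patch --opensearch os_config.toml
```

The generated file has the same two lines as the template above, so the patch command is unchanged.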

Healthcheck commands

Get the OpenSearch Cluster status
```
chef-automate status --os
```

Run the following commands on an OpenSearch node:

Check whether the OpenSearch service is up:

hab svc status

Check whether all the indices are green:

curl -k -X GET "https://localhost:9200/_cat/indices/*?v=true&s=index&pretty" -u admin:admin

Watch for a message about OpenSearch going from RED to GREEN:

journalctl -u hab-sup -f | grep 'automate-ha-opensearch'
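
With many indices, scanning the _cat/indices output by eye is error-prone. The sketch below filters that output down to the indices that are not green. non_green_indices is a hypothetical helper name, and it assumes the default column layout printed with v=true, where the first column is health and the third is the index name.

```shell
#!/usr/bin/env bash
# Sketch: read `_cat/indices?v=true` output on stdin and print the name of
# every index whose health column is not "green".
# Assumes the default column order: health status index uuid ...
non_green_indices() {
    awk 'NR > 1 && $1 != "green" { print $3 }'
}

# Example, run on an OpenSearch node:
#   curl -sk "https://localhost:9200/_cat/indices/*?v=true&s=index" -u admin:admin | non_green_indices
```

Empty output means every listed index reported green.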

Configuration for Automate node from Bastion host

Mount the EFS file system on all frontend nodes manually. The examples below assume it is mounted at /mnt/automate_backups.
Create an automate.toml file on the bastion host using the following command:
```
touch automate.toml
```

Add the following configuration to automate.toml on the bastion host:

[global.v1.external.opensearch.backup]
enable = true
location = "fs"

[global.v1.external.opensearch.backup.fs]
# The `path.repo` setting you've configured on your OpenSearch nodes must be a parent directory of the setting you configure here:
path = "/mnt/automate_backups/opensearch"

[global.v1.backups.filesystem]
path = "/mnt/automate_backups/backups"

Patch the configuration using the following command:

./chef-automate config patch --frontend automate.toml
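
The comment in automate.toml requires the OpenSearch path.repo directory to be the same as, or a parent of, the backup fs path. A quick way to check two values before patching is sketched below. is_under is a hypothetical helper name, and the comparison is purely textual: it does not resolve symlinks or relative paths.

```shell
#!/usr/bin/env bash
# Sketch: succeed when $2 is the same directory as $1 or lies beneath it.
# Purely lexical; symlinks and ".." segments are not resolved.
is_under() {
    local parent="${1%/}" child="${2%/}"
    case "$child/" in
        "$parent"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   is_under /mnt/automate_backups/opensearch /mnt/automate_backups/opensearch \
#       && echo "fs path is inside path.repo"
```

Appending a slash before matching keeps sibling directories such as /mnt/automate_backups/opensearch2 from being treated as children of /mnt/automate_backups/opensearch.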

Backup and Restore commands

Backup

Run the backup command from Bastion as shown below to create a backup:
```
chef-automate backup create
```

Restoring the EFS Backed-up Data

To restore backed-up data of Chef Automate High Availability (HA) using Elastic File System (EFS), follow the steps given below:

Check the status of all Chef Automate and Chef Infra Server frontend nodes by executing the chef-automate status command.
Execute the restore command from the bastion host:

chef-automate backup restore <BACKUP-ID> -b /mnt/automate_backups/backups --airgap-bundle </path/to/bundle>

Note

If you are restoring a backup taken on an older version, you must provide --airgap-bundle </path/to/current/bundle>.
Large Compliance Report is not supported in Automate HA.
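
Because the --airgap-bundle argument is only needed in some cases, it can be convenient to assemble the restore command and review it before running it. The sketch below only prints the command; build_restore_cmd is a hypothetical helper name, the backup ID in the example is made up, and treating the bundle as optional is an assumption drawn from the note above.

```shell
#!/usr/bin/env bash
# Sketch: print the restore command for review instead of running it.
# build_restore_cmd is a hypothetical helper; the backup directory is the
# path used throughout this page.
build_restore_cmd() {
    local backup_id="$1" bundle="${2:-}"
    local cmd="chef-automate backup restore ${backup_id} -b /mnt/automate_backups/backups"
    if [ -n "$bundle" ]; then
        cmd="${cmd} --airgap-bundle ${bundle}"
    fi
    printf '%s\n' "$cmd"
}

# Example (20230906120000 is a made-up backup ID):
#   build_restore_cmd 20230906120000 /path/to/bundle
```

Printing first makes it easy to confirm the backup ID and bundle path on the bastion host before starting a restore.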

Troubleshooting

Try these steps if Chef Automate returns an error while restoring data.

Check the Chef Automate status.
```
chef-automate status
```
Check the status of your Habitat service on the Automate node.
```
hab svc status
```
If the deployment services are not healthy, reload them.
```
hab svc load chef/deployment-service
```

Now check the status of the Automate node and then try running the restore command from the bastion host.

How to change the base_path or path

The steps for the file system backup are as follows:

At deployment time, the default value of backup_mount is /mnt/automate_backups.
If you modify backup_mount in config.toml before deployment, the deployment process applies the configuration with the updated value.
If you change the backup_mount value after deployment, you must patch the configuration manually on all the frontend and backend nodes. For example, suppose you change backup_mount to /bkp/backps.

Update the frontend nodes with the template below, using the command chef-automate config patch fe.toml --fe:

   [global.v1.backups]
      [global.v1.backups.filesystem]
         path = "/bkp/backps"
   [global.v1.external.opensearch.backup]
      [global.v1.external.opensearch.backup.fs]
         path = "/bkp/backps"

Update the OpenSearch nodes with the template below, using the command chef-automate config patch os.toml --os:

[path]
   repo = "/bkp/backps"

Run the following curl request on one of the Automate frontend nodes:

curl localhost:10144/_snapshot?pretty

If the response is empty ({}), no further action is required.
If the response contains JSON output, it should have the correct value for backup_mount. Check the location value in the response; it should start with /bkp/backps. In the sample response below, location still points to the old mount path:

{
    "chef-automate-es6-event-feed-service" : {
      "type" : "fs",
      "settings" : {
        "location" : "/mnt/automate_backups/opensearch/automate-elasticsearch-data/chef-automate-es6-event-feed-service"
      }
    },
    "chef-automate-es6-compliance-service" : {
      "type" : "fs",
      "settings" : {
        "location" : "/mnt/automate_backups/opensearch/automate-elasticsearch-data/chef-automate-es6-compliance-service"
      }
    },
    "chef-automate-es6-ingest-service" : {
      "type" : "fs",
      "settings" : {
        "location" : "/mnt/automate_backups/opensearch/automate-elasticsearch-data/chef-automate-es6-ingest-service"
      }
    },
    "chef-automate-es6-automate-cs-oc-erchef" : {
      "type" : "fs",
      "settings" : {
        "location" : "/mnt/automate_backups/opensearch/automate-elasticsearch-data/chef-automate-es6-automate-cs-oc-erchef"
      }
    }
}
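
The prefix check on each location value can be automated without jq. The following is a minimal sketch: mismatched_locations is a hypothetical helper name, and it assumes the response is pretty-printed (the ?pretty query parameter), so that each "location" entry sits on its own line.

```shell
#!/usr/bin/env bash
# Sketch: read the _snapshot response on stdin and print every "location"
# value that does NOT lie under the expected backup_mount prefix.
# Assumes pretty-printed JSON with one "location" entry per line.
mismatched_locations() {
    local expected="${1%/}"
    sed -n 's/.*"location"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' |
    while IFS= read -r loc; do
        case "$loc/" in
            "$expected"/*) ;;                   # under the expected mount
            *) printf '%s\n' "$loc" ;;          # report the mismatch
        esac
    done
}

# Example, run on an Automate frontend node:
#   curl -s "localhost:10144/_snapshot?pretty" | mismatched_locations /bkp/backps
```

Empty output means that either the response was empty or every location already starts with the expected path.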

If the prefix of the location does not match backup_mount, delete the existing snapshot repositories. Use the script below to delete them from one of the Automate frontend nodes.

   # List the registered snapshot repositories, then delete each one
   snapshot=$(curl -s -XGET "http://localhost:10144/_snapshot?pretty" | jq -r 'keys[]')
   for name in $snapshot; do
       curl -s -XDELETE "http://localhost:10144/_snapshot/$name?pretty"
   done

The above script requires jq to be installed. You can install it from the airgap bundle; run the following command on one of the Automate frontend nodes to locate the jq package.

ls -ltrh /hab/cache/artifacts/ | grep jq

-rw-r--r--. 1 ec2-user ec2-user  730K Dec  8 08:53 core-jq-static-1.6-20220312062012-x86_64-linux.hart
-rw-r--r--. 1 ec2-user ec2-user  730K Dec  8 08:55 core-jq-static-1.6-20190703002933-x86_64-linux.hart

If multiple jq versions are present, install the latest one. Use the command below to install the jq package on the Automate frontend node:

hab pkg install /hab/cache/artifacts/core-jq-static-1.6-20190703002933-x86_64-linux.hart -bf

Steps for object storage as a backup option

At deployment time, backup_config is set to object_storage.
To use object_storage, the following template is used at deployment time:

   [object_storage.config]
    google_service_account_file = ""
    location = ""
    bucket_name = ""
    access_key = ""
    secret_key = ""
    endpoint = ""
    region = ""

If you configured this before deployment, no further action is required.
If you want to change the bucket or base_path, use the template below for the frontend nodes:

[global.v1]
  [global.v1.external.opensearch.backup.s3]
      bucket = "<BUCKET_NAME>"
      base_path = "opensearch"
   [global.v1.backups.s3.bucket]
      name = "<BUCKET_NAME>"
      base_path = "automate"

You can choose any value for base_path. The base_path patch is only required for the frontend nodes.
Apply the above template with the command chef-automate config patch frontend.toml --fe
After patching the configuration, validate it with the following curl request:
```
curl localhost:10144/_snapshot?pretty
```
If the response is empty ({}), no further action is required.

If the response contains JSON output, it should have the correct value for base_path:

{
    "chef-automate-es6-event-feed-service" : {
      "type" : "s3",
      "settings" : {
        "bucket" : "MY-BUCKET",
        "base_path" : "opensearch/automate-elasticsearch-data/chef-automate-es6-event-feed-service",
        "readonly" : "false",
        "compress" : "false"
      }
    },
    "chef-automate-es6-compliance-service" : {
      "type" : "s3",
      "settings" : {
        "bucket" : "MY-BUCKET",
        "base_path" : "opensearch/automate-elasticsearch-data/chef-automate-es6-compliance-service",
        "readonly" : "false",
        "compress" : "false"
      }
    },
    "chef-automate-es6-ingest-service" : {
      "type" : "s3",
      "settings" : {
        "bucket" : "MY-BUCKET",
        "base_path" : "opensearch/automate-elasticsearch-data/chef-automate-es6-ingest-service",
        "readonly" : "false",
        "compress" : "false"
      }
    },
    "chef-automate-es6-automate-cs-oc-erchef" : {
      "type" : "s3",
      "settings" : {
        "bucket" : "MY-BUCKET",
        "base_path" : "opensearch/automate-elasticsearch-data/chef-automate-es6-automate-cs-oc-erchef",
        "readonly" : "false",
        "compress" : "false"
      }
    }
}
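
To compare the response against the configured base_path quickly, the leading path component of every base_path entry can be listed. This is a minimal sketch: base_path_prefixes is a hypothetical helper name, it assumes pretty-printed output, and it only looks at the first path component, so it suits single-segment values such as opensearch.

```shell
#!/usr/bin/env bash
# Sketch: read the _snapshot response on stdin and print the distinct
# first path components of all "base_path" values.
# Assumes pretty-printed JSON with one "base_path" entry per line.
base_path_prefixes() {
    sed -n 's/.*"base_path"[[:space:]]*:[[:space:]]*"\([^/"]*\).*/\1/p' | sort -u
}

# Example, run on an Automate frontend node:
#   curl -s "localhost:10144/_snapshot?pretty" | base_path_prefixes
```

If the output is a single line equal to the configured base_path, the repositories are consistent with the patched configuration.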

If the base_path value does not match, delete the existing snapshot repositories. Refer to the steps in the file system section above.