Chef

Table Of Contents

An Overview of Chef

Welcome to Chef!

Chef is a powerful automation platform that transforms complex infrastructure into code, bringing your servers and services to life. Whether you’re operating in the cloud, on-premises, or a hybrid, Chef automates how applications are configured, deployed, and managed across your network, no matter its size.

Chef is built around simple concepts: achieving desired state, centralized modeling of IT infrastructure, and resource primitives that serve as building blocks. These concepts enable you to quickly manage any infrastructure with Chef. These very same concepts allow Chef to handle the most difficult infrastructure challenges on the planet. Anything that can run the chef-client can be managed by Chef.

Chef Components

The following diagram shows the relationships between the various elements of Chef, including the nodes, the server, premium features of the server, and the workstation. These elements work together to provide the chef-client the information and instruction that it needs so that it can do its job. As you are reviewing the rest of this topic, use the icons in the tables to refer back to this image.

_images/chef_overview.png

Chef has the following major components:

Feature Description
_images/icon_big_chef_client.png A chef-client is installed on every node that is under management by Chef. The chef-client performs all of the configuration tasks that are specified by the run-list and will pull down any required configuration data from the Chef server as it is needed during the chef-client run.
_images/icon_big_workstation.png

One (or more) workstations are configured to allow users to author, test, and maintain cookbooks. Cookbooks are uploaded to the Chef server from the workstation. Some cookbooks are custom to the organization and others are based on community cookbooks available from the Chef Supermarket.

Often, a workstation is configured to use the Chef development kit as the development toolkit. The Chef development kit is a package from Chef that provides an optional (but recommended) set of tooling, including Chef itself, the chef command line tool, Kitchen, ChefSpec, Berkshelf, and more.

_images/icon_big_chef_server.png

The Chef server acts as a hub of information. Cookbooks and policy settings are uploaded to the Chef server by users from workstations. (Policy settings may also be maintained from the Chef server itself, via the Chef management console web user interface.)

The chef-client accesses the Chef server from the node on which it’s installed to get configuration data, perform searches of historical chef-client run data, and then pull down the necessary configuration data. After the chef-client run is finished, the chef-client uploads updated run data to the Chef server (as the updated node object), uploads data generated by audit-mode (for additional rules processing by Chef analytics), and generates reporting data.

Chef management console is the user interface for the Chef server. It is used to manage data bags, attributes, run-lists, roles, environments, and cookbooks, and also to configure role-based access for users and groups.

_images/icon_big_chef_analytics.png Chef analytics provides real-time visibility into what is happening on the Chef server, including what’s changing, who made those changes, and when they occurred. Details are tracked by the chef-client during the chef-client run. These details are uploaded to the Chef server at the end of the chef-client run. This data is used to build reports, run rules against the output of audit-mode, generate notifications based on the results of auditing, and visibility into messages that were generated during the chef-client run.
_images/icon_big_chef_supermarket.png Chef Supermarket is the location in which community cookbooks are authored and maintained. Cookbooks that are part of the Chef Supermarket may be used by any Chef user. How community cookbooks are used varies from organization to organization.

The premium features of the Chef server—Chef management console, Chef analytics, chef-client run reporting, high availability configurations, and Chef server replication—may all be installed and configured for use with the Chef server. Each of these premium features are easily enabled and can be run as part of any Chef server deployment!

The following sections discuss these elements (and their various components) in more detail.

Nodes

A node is any physical, virtual, or cloud machine that is configured to be maintained by a chef-client.

The following types of nodes can be managed:

Feature Description
_images/icon_node_physical.png A physical node is typically a server or a virtual machine, but it can be any active device attached to a network that is capable of sending, receiving, and forwarding information over a communications channel. In other words, a physical node is any active device attached to a network that can run a chef-client and also allow that chef-client to communicate with a Chef server.
_images/icon_node_cloud.png A cloud-based node is hosted in an external cloud-based service, such as Amazon Virtual Private Cloud, OpenStack, Rackspace, Google Compute Engine, Linode, or Microsoft Azure. Plugins are available for knife that provide support for external cloud-based services. knife can use these plugins to create instances on cloud-based services. Once created, the chef-client can be used to deploy, configure, and maintain those instances.
_images/icon_node_virtual.png A virtual node is a machine that runs only as a software implementation, but otherwise behaves much like a physical machine.
_images/icon_node_networking.png A network node is any networking device—a switch, a router—that is being managed by a chef-client, such as networking devices by Juniper Networks, Arista, Cisco, and F5. Use Chef to automate common network configurations, such physical and logical Ethernet link properties and VLANs, on these devices.

The key components Chef that are installed on nodes are:

Feature Description
_images/icon_chef_client.png

A chef-client is an agent that runs locally on every node that is under management by Chef. When a chef-client is run, it will perform all of the steps that are required to bring the node into the expected state, including:

  • Registering and authenticating the node with the Chef server
  • Building the node object
  • Synchronizing cookbooks
  • Compiling the resource collection by loading each of the required cookbooks, including recipes, attributes, and all other dependencies
  • Taking the appropriate and required actions to configure the node
  • Looking for exceptions and notifications, handling each as required

RSA public key-pairs are used to authenticate the chef-client with the Chef server every time a chef-client needs access to data that is stored on the Chef server. This prevents any node from accessing data that it shouldn’t and it ensures that only nodes that are properly registered with the Chef server can be managed.

_images/icon_ohai.png

Ohai is a tool that is used to detect attributes on a node, and then provide these attributes to the chef-client at the start of every chef-client run. Ohai is required by the chef-client and must be present on a node. (Ohai is installed on a node as part of the chef-client install process.)

The types of attributes Ohai collects include (but are not limited to):

  • Platform details
  • Network usage
  • Memory usage
  • CPU data
  • Kernel data
  • Host names
  • Fully qualified domain names
  • Other configuration details

Attributes that are collected by Ohai are automatic attributes, in that these attributes are used by the chef-client to ensure that these attributes remain unchanged after the chef-client is done configuring the node.

Workstations

A workstation is a computer that is configured to run knife, to synchronize with the chef-repo, and interact with a single Chef server. The workstation is the location from which most users will do most of their work, including:

  • Developing cookbooks and recipes (and authoring them using Ruby)
  • Keeping the chef-repo synchronized with version source control
  • Using knife to upload items from the chef-repo to the Chef server
  • Configuring organizational policy, including defining roles and environments and ensuring that critical data is stored in data bags
  • Interacting with nodes, as (or when) required, such as performing a bootstrap operation

Some important components of workstations include:

Feature Description
_images/icon_commands.png Chef incudes two important command-line tools. Use the knife command-line tool when interacting with nodes or when working with objects on the Chef server. Use the chef command line tool when working with the chef-repo, which is the repository structure in which cookbooks are authored, tested, and maintained.
_images/icon_chef_dk.png The Chef development kit is a set of tooling that is packaged by Chef. It includes the chef command line tool, Kitchen, ChefSpec, Berkshelf, and more.
_images/icon_kitchen.png

Use Kitchen to automatically test cookbook data across any combination of platforms and test suites:

  • Defined in a .kitchen.yml file
  • Uses a driver plugin architecture
  • Supports cookbook testing across many cloud providers and virtualization technologies
  • Supports all common testing frameworks that are used by the Ruby community
_images/icon_chefspec.png

Use ChefSpec to simulate the convergence of resources on a node:

  • Runs the chef-client on a local machine
  • Uses chef-zero or chef-solo
  • Is an extension of RSpec, a behavior-driven development (BDD) framework for Ruby
  • Is the fastest way to test resources and recipes
_images/icon_policy.png Policy defines how business and operational requirements, processes, and production workflows map to objects that are stored on the Chef server. Policy objects on the Chef server include roles, environments, and cookbook versions.
_images/icon_repository.png

The chef-repo is the repository structure in which cookbooks are authored, tested, and maintained.

  • Cookbooks contain recipes, attributes, resources, providers, libraries, files, templates, and so on.
  • The chef-repo should be synchronized with a version control system, such as git and then managed as if it were source code.

The directory structure within the chef-repo varies. Some organizations prefer to keep all of their cookbooks in a single chef-repo, while other organizations prefer to use a chef-repo for every cookbook.

While Chef includes tooling like the Chef development kit, encourages integration and unit testing, and defines workflow around cookbook authoring and policy, it’s important to note that you know best about how your infrastructure should be put together. Therefore, Chef makes as few decisions on its own as possible. When a decision must be made, the chef-client uses a reasonable default setting that can be easily changed. While Chef encourages the use of the tooling packaged in the Chef development kit, none of these tools should be seen as a requirement or pre-requisite for being successful using Chef.

The Chef Server

The Chef server acts as a hub for configuration data. The Chef server stores cookbooks, the policies that are applied to nodes, and metadata that describes each registered node that is being managed by the chef-client. Nodes use the chef-client to ask the Chef server for configuration details, such as recipes, templates, and file distributions. The chef-client then does as much of the configuration work as possible on the nodes themselves (and not on the Chef server). This scalable approach distributes the configuration effort throughout the organization.

In addition to node objects, policy, and cookbooks, a Chef server includes:

Feature Description
_images/icon_search.png Search indexes allow queries to be made for any type of data that is indexed by the Chef server, including data bags (and data bag items), environments, nodes, and roles. A defined query syntax is used to support search patterns like exact, wildcard, range, and fuzzy. A search is a full-text query that can be done from several locations, including from within a recipe, by using the search subcommand in knife, the search method in the Recipe DSL, the search box in the Chef management console, and by using the /search or /search/INDEX endpoints in the Chef server API. The search engine is based on Apache Solr and is run from the Chef server.
_images/icon_manager.png

Chef management console is a web-based interface for the Chef server that provides users a way to manage the following objects:

  • Nodes
  • Cookbooks and recipes
  • Roles
  • Stores of JSON data (data bags), including encrypted data
  • Environments
  • Searching of indexed data
  • User accounts and user data for the individuals who have permission to log on to and access the Chef server
_images/icon_data_bags.png A data bag is a global variable that is stored as JSON data and is accessible from a Chef server. A data bag is indexed for searching and can be loaded by a recipe or accessed during a search.

Node Objects

For the chef-client, two important aspects of nodes are groups of attributes and run-lists. An attribute is a specific piece of data about the node, such as a network interface, a file system, the number of clients a service running on a node is capable of accepting, and so on. A run-list is an ordered list of recipes and/or roles that are run in an exact order. The node object consists of the run-list and node attributes, which is a JSON file that is stored on the Chef server. The chef-client gets a copy of the node object from the Chef server during each chef-client run and places an updated copy on the Chef server at the end of each chef-client run.

Some important node objects include:

Feature Description
_images/icon_node_attribute.png

An attribute is a specific detail about a node. Attributes are used by the chef-client to understand:

  • The current state of the node
  • What the state of the node was at the end of the previous chef-client run
  • What the state of the node should be at the end of the current chef-client run

Attributes are defined by:

  • The state of the node itself
  • Cookbooks (in attribute files and/or recipes)
  • Roles
  • Environments

During every chef-client run, the chef-client builds the attribute list using:

  • Data about the node collected by Ohai
  • The node object that was saved to the Chef server at the end of the previous chef-client run
  • The rebuilt node object from the current chef-client run, after it is updated for changes to cookbooks (attribute files and/or recipes), roles, and/or environments, and updated for any changes to the state of the node itself

After the node object is rebuilt, all of attributes are compared, and then the node is updated based on attribute precedence. At the end of every chef-client run, the node object that defines the current state of the node is uploaded to the Chef server so that it can be indexed for search.

_images/icon_run_lists.png

A run-list defines all of the information necessary for Chef to configure a node into the desired state. A run-list is:

  • An ordered list of roles and/or recipes that are run in the exact order defined in the run-list; if a recipe appears more than once in the run-list, the chef-client will not run it twice
  • Always specific to the node on which it runs; nodes may have a run-list that is identical to the run-list used by other nodes
  • Stored as part of the node object on the Chef server
  • Maintained using knife, and then uploaded from the workstation to the Chef server, or is maintained using the Chef management console

Policy

Policy settings can be used to map business and operational requirements, such as process and workflow, to settings and objects stored on the Chef server:

  • Roles define server types, such as “web server” or “database server”
  • Environments define process, such as “dev”, “staging”, or “production”
  • Certain types of data—passwords, user account data, and other sensitive items—can be placed in data bags, which are located in a secure sub-area on the Chef server that can only be accessed by nodes that authenticate to the Chef server with the correct SSL certificates
  • The cookbooks (and cookbook versions) in which organization-specific configuration policies are maintained

Some important aspects of policy include:

Feature Description
_images/icon_roles.png A role is a way to define certain patterns and processes that exist across nodes in an organization as belonging to a single job function. Each role consists of zero (or more) attributes and a run-list. Each node can have zero (or more) roles assigned to it. When a role is run against a node, the configuration details of that node are compared against the attributes of the role, and then the contents of that role’s run-list are applied to the node’s configuration details. When a chef-client runs, it merges its own attributes and run-lists with those contained within each assigned role.
_images/icon_environments.png An environment is a way to map an organization’s real-life workflow to what can be configured and managed when using Chef server. Every organization begins with a single environment called the _default environment, which cannot be modified (or deleted). Additional environments can be created to reflect each organization’s patterns and workflow. For example, creating production, staging, testing, and development environments. Generally, an environment is also associated with one (or more) cookbook versions.
_images/icon_cookbook_versions.png

A cookbook version represents a set of functionality that is different from the cookbook on which it is based. A version may exist for many reasons, such as ensuring the correct use of a third-party component, updating a bug fix, or adding an improvement. A cookbook version is defined using syntax and operators, may be associated with environments, cookbook metadata, and/or run-lists, and may be frozen (to prevent unwanted updates from being made).

A cookbook version is maintained just like a cookbook, with regard to source control, uploading it to the Chef server, and how the chef-client applies that cookbook when configuring nodes.

Analytics

The Chef analytics platform is a feature of Chef that provides real-time visibility into what is happening on the Chef server, including what’s changing, who made those changes, and when they occurred. Individuals may be notified of these changes in real-time. Use this visibility to verify compliance against internal controls.

Chef analytics includes:

Feature Description
_images/icon_actions.png Actions are policy and administrative changes made to the Chef server. The Chef server gathers a lot of data—–each node object, the node run history for all nodes, the history of every cookbook and cookbook version, how policy settings, such as roles, environments, and data bags, are applied and to what they are applied, individual user data, and so on.
_images/icon_reports.png Reporting is used to keep track of what happened during the execution of chef-client runs across all of the infrastructure that is being managed by Chef. Reports can be generated for the entire organization and they can be generated for specific nodes.

Cookbooks

A cookbook is the fundamental unit of configuration and policy distribution. A cookbook defines a scenario and contains everything that is required to support that scenario:

  • Recipes that specify the resources to use and the order in which they are to be applied
  • Attribute values
  • File distributions
  • Templates
  • Extensions to Chef, such as libraries, definitions, and custom resources

The chef-client uses Ruby as its reference language for creating cookbooks and defining recipes, with an extended DSL for specific resources. A reasonable set of resources are available to the chef-client, enough to support many of the most common infrastructure automation scenarios; however, this DSL can also be extended when additional resources and capabilities are required.

Some important components of cookbooks include:

Feature Description
_images/icon_node_attribute.png An attribute can be defined in a cookbook (or a recipe) and then used to override the default settings on a node. When a cookbook is loaded during a chef-client run, these attributes are compared to the attributes that are already present on the node. Attributes that are defined in attribute files are first loaded according to cookbook order. For each cookbook, attributes in the default.rb file are loaded first, and then additional attribute files (if present) are loaded in lexical sort order. When the cookbook attributes take precedence over the default attributes, the chef-client will apply those new settings and values during the chef-client run on the node.
_images/icon_cookbook_recipes.png

A recipe is the most fundamental configuration element within the organization. A recipe:

  • Is authored using Ruby, which is a programming language designed to read and behave in a predictable manner
  • Is mostly a collection of resources, defined using patterns (resource names, attribute-value pairs, and actions); helper code is added around this using Ruby, when needed
  • Must define everything that is required to configure part of a system
  • Must be stored in a cookbook
  • May be included in a recipe
  • May use the results of a search query and read the contents of a data bag (including an encrypted data bag)
  • May have a dependency on one (or more) recipes
  • May tag a node to facilitate the creation of arbitrary groupings
  • Must be added to a run-list before it can be used by the chef-client
  • Is always executed in the same order as listed in a run-list

The chef-client will run a recipe only when asked. When the chef-client runs the same recipe more than once, the results will be the same system state each time. When a recipe is run against a system, but nothing has changed on either the system or in the recipe, the chef-client won’t change anything.

In addition to attributes, recipes, and versions, the following items are also part of cookbooks:

  • Files and templates. A template is a file written in markup language that uses Ruby statements to solve complex configuration scenarios. Files can be transferred from cookbooks to nodes.
  • Resources and providers. A resource is a package, a service, a group of users, and so on. A resource tells the chef-client which provider to use during a chef-client run for various tasks, such as installing packages, running Ruby code, or accessing directories and file systems. A resource is generic: “install program A” while a provider knows what to do with that process on Debian and Ubuntu and Microsoft Windows. A provider defines the steps that are required to bring that piece of the system into the desired state. Default providers exist that cover the most common scenarios.
  • Libraries. A library allows the use of arbitrary Ruby code in a cookbook, either as a way to extend the chef-client language or to implement a new class.
  • Definitions. A definition is used to create new resources by stringing together one (or more) existing resources.

Conclusion

Chef is a thin DSL (domain-specific language) built on top of Ruby. This approach allows Chef to provide just enough abstraction to make reasoning about your infrastructure easy. Chef includes a built-in taxonomy of all the basic resources one might configure on a system, plus a defined mechanism to extend that taxonomy using the full power of the Ruby language. Ruby was chosen because it provides the flexibility to use both the simple built-in taxonomy, as well being able to handle any customization path your organization requires.