Terraform Cloud Agents
Hands-on: Try the Manage Private Environments with Terraform Cloud Agents tutorial on HashiCorp Learn.
Note: Terraform Cloud Agents are a paid feature, available as part of the Terraform Cloud for Business upgrade package. Learn more about Terraform Cloud pricing here. The number of agents you are eligible to deploy is determined by the number of concurrent runs your organization is entitled to.
Terraform Cloud Agents allow Terraform Cloud to communicate with isolated, private, or on-premises infrastructure. By deploying lightweight agents within a specific network segment, you can establish a simple connection between your environment and Terraform Cloud which allows for provisioning operations and management. This is useful for on-premises infrastructure types such as vSphere, Nutanix, OpenStack, enterprise networking providers, and anything you might have in a protected enclave.
The agent architecture is pull-based, so no inbound connectivity is required. Any agent you provision will poll Terraform Cloud for work and carry out execution of that work locally.
Before Install
Supported Operating Systems
Agents currently only support x86_64 bit Linux operating systems. You can also run the agent within Docker using our official Terraform Agent Docker container.
Supported Terraform Versions
Agents support Terraform versions 0.12 and above. Workspaces configured to use Terraform versions below 0.12 will not be able to select the agent-based execution mode.
Hardware Requirements
The host running the agent will have varying resource requirements depending on the workspace. A host can be a dedicated or shared cloud instance, virtual machine, bare metal server, or a container. You should monitor and adjust memory, CPU, and disk space based on each workspace's usage and performance. The name of your instance type may vary depending on your deployment environment.
You can use the specs below as a reference:
- At least 4GB of free disk space
- Each run requires the agent to temporarily store local copies of the tarred repository, extracted repository, state file, any providers or modules, and the Terraform binary itself.
- At least 2GB of system memory
Networking Requirements
In order for an agent to function properly, it must be able to make outbound requests over HTTPS (TCP port 443) to the Terraform Cloud application APIs. This may require perimeter networking as well as container host networking changes, depending on your environment. The IP ranges are documented in the Terraform Cloud IP Ranges documentation.
Additionally, the agent must also be able to communicate with any services required by the Terraform code it is executing. This includes the Terraform releases distribution service, releases.hashicorp.com (supported by Fastly), as well as any provider APIs. The services which run on these IP ranges are described in the table below.
Hostname | Port/Protocol | Directionality | Purpose |
---|---|---|---|
app.terraform.io | tcp/443, HTTPS | Outbound | Polling for new workloads, providing status updates, and downloading private modules from Terraform Cloud's Private Module Registry |
registry.terraform.io | tcp/443, HTTPS | Outbound | Downloading public modules from the Terraform Registry |
releases.hashicorp.com | tcp/443, HTTPS | Outbound | Updating agent components and downloading Terraform binaries |
archivist.terraform.io | tcp/443, HTTPS | Outbound | Blob Storage |
Operational Considerations
The agent is distributed as a standalone binary which can be run on any supported system. By default, the agent will run in the foreground as a long-running process that will continuously poll for workloads from Terraform Cloud. We strongly recommend pairing the agent with a process supervisor to ensure that it is automatically restarted in case of an error.
Agents do not guarantee a clean working environment per Terraform execution. Each execution is performed in its own temporary directory with a clean environment, but references to absolute file paths or other machine state may cause interference between Terraform executions. We strongly recommend that you write your Terraform code to be stateless and idempotent. You may also want to consider using single-execution mode to ensure your agent only runs a single workload.
Updating
By default, the agent will automatically update itself to the latest minor version. Administrators are required to update the host operating system and all other installed software.
To customize this update behavior, pass the flag -auto-update
or set the environment variable TFC_AGENT_AUTO_UPDATE
. The valid options are presented in the table below.
Update Setting | Behavior |
---|---|
minor | Matches the default behavior, automatically update the agent to the latest minor version. |
patch | The agent will only be updated to the newest patch version, new minor versions will require a manual update. |
disabled | Disables automatic updates, all updates will be manual. |
Security Considerations
Agents should be considered a global resource within an organization. Once configured, any workspace owner may configure their workspace to target the organization's agents. This may allow a malicious workspace to access state files, variables, or code from other workspaces targeting the same agent, or access sensitive data on the host running the agent. For this reason, we recommend carefully considering the implications of enabling agents within an organization, and restricting access to your organization to only trusted parties.
Limitations
Agents allow you to run Terraform operations from a Terraform Cloud workspace on your private infrastructure. Agents do not support:
- Connecting to private infrastructure from Sentinel policies using the http import.
- Connecting Terraform Cloud workspaces to VCS instances that do not allow access from the public internet. For example, you cannot use agents to connect to a GitHub Enterprise Server instance that requires access to your VPN.
For these use cases, we recommend you leverage the information provided by the IP Ranges documentation to permit direct communication from the appropriate Terraform Cloud service to your internal infrastructure.
Terraform Enterprise
Terraform Enterprise supports Terraform Cloud Agents; see Terraform Cloud Agents on TFE for TFE-specific documentation and requirements.
Managing Agent Pools
Agents are organized into pools, which can be managed in Terraform Cloud's UI. Each workspace can specify which agent pool should run its workloads.
Note: You must be a member of the “Owners” team within your organization in order to manage an organization's agents in Terraform Cloud. (More about permissions.)
Create a new Agent Pool
Navigate to Organization Settings > Agents and click "New agent pool".
Give your pool a name, then click "Continue". This name will be used to distinguish your pools when changing the settings of a workspace.
Give your token a description and click "Create Token".
Note: Your token information will not be displayed again. Make sure to save it appropriately before moving to the final step.
Click "Finish".
Delete an Agent Pool
Navigate to Organization Settings > Agents and click on the name of the pool you would like to delete.
Click "Delete agent pool".
Confirm the deletion of the pool by clicking "Yes, delete agent pool".
Important: Agent pools which are still associated with a workspace are unable to be deleted. To delete these pools, first ensure the related workspace has completed all in progress runs, and remove the association to the agent pool in Workspace Settings > General Settings.
Scope an Agent Pool to Specific Workspaces
Scoping an agent pool lets you specify which workspaces are allowed to use it. The agent pool will then only be available from within the chosen workspaces; it will not appear other workspace's lists of available agent pools. Scoping an agent pool does not remove it from workspaces that are actively using it. To remove access, you must first remove the agent pool from within the workspace's settings.
To scope an agent pool to specific workspaces within the organization:
Click Settings and then Agents. A list of agent pools for the organization appears.
Click the name of the pool you would like to update.
Click Grant access to specific workspaces. A menu opens where you can search for workspaces within this organization.
Begin typing in the search field to find available workspaces and select the workspaces that should have access to the agent pool.
Click Save.
Now only the specified workspaces will be able to use the agent pool.
Revoke an Agent Token
You may revoke an issued token from your agents at any time.
Revoking a token will cause the agents using it to exit. For agents to continue servicing workspace jobs, they must be reinitialized with a new token. Under normal circumstances, it may be desirable to generate a new token first, initialize the agents using it, then revoke the old token once no agents are using it. Agent tokens display information about the last time they were used to help you decide whether they are safe to revoke.
Navigate to Organization Settings > Agents, then click on the agent pool you would like to manage.
Click "Revoke Token" for the token you would like to revoke.
Confirm the deletion of the token by clicking "Yes, delete token".
Managing Agents
The agent software runs on your own infrastructure. Agent pool membership is determined by which token you provide when starting the agent.
Download and Install the Agent
- Download the latest agent release, the associated checksum file (.SHA256sums), and the checksum signature (.sig).
- Verify the integrity of the downloaded archive, as well as the signature of the
SHA256SUMS
file using the instructions available on HashiCorp's security page. - Extract the release archive. The
unzip
utility is available on most Linux distributions and may be invoked asunzip <archive file>
. Two individual binaries will be extracted (tfc-agent
andtfc-agent-core
). These binaries must reside in the same directory for the agent to function properly.
Start the Agent
Using the Agent token you copied earlier, set the TFC_AGENT_TOKEN
and TFC_AGENT_NAME
environment variables.
export TFC_AGENT_TOKEN=your-token
export TFC_AGENT_NAME=your-agent-name
./tfc-agent
Note: The TFC_AGENT_NAME
variable is optional. If you do not specify a name here, one will not be displayed. These names are for your reference only, and the agent ID is what will appear in logs and API requests.
Once complete, your agent will appear on the Agents page and display its current status.
Optional Configuration: Running an Agent using Docker
Alternatively, you can use our official agent Docker container to run the Agent.
docker pull hashicorp/tfc-agent:latest
docker run -e TFC_AGENT_TOKEN=your-token -e
TFC_AGENT_NAME=your-agent-name hashicorp/tfc-agent
This Docker image executes the tfc-agent process as the non-root tfc-agent user. For some workflows, such as those that require the ability to install software via apt-get during local-exec scripts, you may need to build a customized version of the agent Docker image for your internal use.
FROM hashicorp/tfc-agent:latest
USER root
# Install sudo. The container runs as a non-root user, but people may rely on
# the ability to apt-get install things.
RUN apt-get -y install sudo
# Permit tfc-agent to use sudo apt-get commands.
RUN echo 'tfc-agent ALL=NOPASSWD: /usr/bin/apt-get , /usr/bin/apt' >> /etc/sudoers.d/50-tfc-agent
USER tfc-agent
An image customized in this way will permit installation of additional software via sudo apt-get.
Stopping the Agent
The agent maintains a registration and a liveness indicator within Terraform Cloud during the entire course of its runtime. When an agent is to be retired, it must deregister itself from Terraform Cloud. The agent performs deregistration automatically as part of its shutdown procedure in the following scenarios:
- If using an interactive terminal, Ctrl-C is pressed.
- One of
SIGINT
,SIGTERM
, orSIGQUIT
is sent to the agent process ID. It is important to send only one signal. If a second signal is received by the agent, it is interpreted as a forceful termination signal and will cause the agent to exit immediately.
In both cases, after initiating a graceful shutdown, the terminal user or parent program should wait for the agent to exit. The amount of time this takes depends on the agent's current workload. The agent will wait for any current operations to complete before deregistering and exiting.
It is highly recommended that the agent is only terminated using one of the above methods. Abruptly terminating an agent by forcefully killing the process, power cycling the host, etc., will not provide the agent the opportunity to deregister, and will result in an Unknown agent status. This may cause further capacity issues, as outlined below in Agent Capacity Usage.
Optional Configuration: Single-execution mode
The Agent can also be configured to run in single-execution mode, which will ensure that the Agent only runs a single workload, then terminates. This can be used in combination with Docker and a process supervisor to ensure a clean working environment for every Terraform run.
To use single-execution mode, start the agent with the -single
command line argument.
Configuring Workspaces to use the Agent
Note: You must have “Admin” access to the workspace you are configuring to change the execution mode. (More about permissions.)
Important: Changing your workspace's execution mode after a run has already been planned will cause the run to error when it is applied. To minimize the number runs that error, you should disable auto-apply, complete any runs that are no longer in the pending stage, and lock your workspace before changing the execution mode.
To configure a workspace to execute runs using an agent:
- Open the workspace from the main "Workspaces" view, then navigate to "Settings > General" from the dropdown menu.
- Select Agent as the execution mode, as well as the agent pool this workspace should use.
- Click "Save Settings" at the bottom of the page.
Run Details
Runs which are processed by an agent will have additional information about that agent in the details section of the run:
Note: Different agents may be used for the plan and apply operations, depending on agent availability within the pool.
Running Multiple Agents
You may choose to run multiple agents within your network, up to the organization's purchased agent limit. If there are multiple agents available within an organization, Terraform Cloud will select the first available agent within the target pool.
Each agent process will run a single Terraform run at a time. Multiple agent processes can be concurrently run on a single instance, license limit permitting.
Resilience
It is possible that an agent process could be terminated unexpectedly (due to killing the process forcefully, power cycling the host machine, etc.). We strongly recommend pairing the agent with a process supervisor to ensure that it is automatically restarted in case of an error.
(See Agent Capacity Usage below).
Troubleshooting
Viewing Agent Statuses
Agent status appear on the Organization Settings > Agents page and will contain one of these values:
- Idle: The agent is running normally and waiting for jobs to be available.
- Busy: The agent is running normally and currently executing a job.
- Unknown: The agent has not reported any status for an unexpected period of time. The agent may yet recover if the agent's situation is temporary, such as a short-lived network partition.
- Errored: The agent encountered an unrecoverable error or has been in an Unknown state for long enough that Terraform Cloud considers it errored. This may indicate that the agent process was interrupted, has crashed, a permanent network partition exists, etc. If the agent was in the process of running an operation (such as a plan or apply), that operation has been marked as errored. If the current agent process does manage to recover, it will be instructed to exit immediately.
- Exited: The agent exited normally, and has successfully informed Terraform of it doing so.
Agent Capacity Usage
Agents are considered active and count towards the organization's purchased agent capacity if they are in the Idle, Busy, or Unknown state. Agents which are in the Errored or Exited state do not count towards the organization's total agent capacity.
The number of active agents you are eligible to deploy is determined by the number of Concurrent Runs your organization is entitled to. Agents are available as part of the Terraform Cloud Business tier.
Agents in the Unknown state continue to be counted against the organization's total agent allowance, as this status is typically an indicator of a temporary communication issue between the agent and Terraform Cloud. Unknown agents which do not respond after a period of 2 hours will automatically transition to an Errored state, at which point they will not count against the agent allowance.
Agents may have an Unknown status if they are terminated without gracefully exiting. Agents should always be shut down according to the Stopping the Agent section to allow them to deregister from Terraform Cloud. We strongly recommend ensuring that any process supervisor, application scheduler, or other runtime manager is configured to follow this procedure to minimize Unknown agent statuses.
You can deregister agents that are Unknown, Errored, or Exited through either the Organization Settings > Agents page or through the Agent API. Deregistered agents will no longer appear in settings page or count against the organization's agent allowance.
Viewing Agent Logs
Output from the Terraform execution will be visible on the run’s page within Terraform Cloud. For more in-depth debugging, you may wish to view the agent’s logs, which are sent to stdout
and configurable via the -log-level
command line argument. By default, these logs are not persisted in any way. It is the responsibility of the operator to collect and store these logs if they are needed.