Troubleshooting Terraform Executions Using Cloudify

In the previous articles, Running Terraform with Cloudify Part 1 and Running Terraform with Cloudify Part 2, you saw how to run Terraform with Cloudify and how to deploy an existing Terraform module using Cloudify.

What You’ll Learn

This article walks through several techniques for troubleshooting Terraform executions using Cloudify. Which one you use depends on your role and access level:

  1. Following the execution in the Cloudify Console
  2. Following the Terraform execution events using the Cloudify CLI
  3. Inspecting the Cloudify Manager logs and troubleshooting locally on the Manager server

Prerequisites

You will need the following:

  1. A working Cloudify Manager. Please see our documentation for information about installing your Manager.

  2. The Cloudify CLI, installed and configured to work with your Cloudify Manager. Please see our documentation for details on how to install and configure it.

What Is Cloudify?

Cloudify is an open-source, multi-cloud orchestration platform featuring unique technology that packages infrastructure, networking, and existing automation tools into certified blueprints, so you can automate, orchestrate, and manage your workloads and environments while using all your DevOps tools in one place.

Let’s Start

At this point, you have either packaged the Terraform template as part of the blueprint and uploaded it to the Cloudify Manager, or created your blueprint from the Terraform module, and you are ready to create and instantiate your deployments. Let's see how to follow and troubleshoot a Terraform execution in Cloudify.

1. Follow Terraform Execution in the Cloudify Console

Once an execution starts, you can follow and monitor it in several areas of the Cloudify Console. In this article, we will focus on the Deployments, Executions, and Logs pages.

Let’s take a look at an execution that completed successfully. From the left sidebar, navigate to the Services widget on the Deployments page.

The Services widget gives a centralized view of all deployments. From it you can select a particular deployment and drill down into Executions/Execution Task Graph and Deployment Events/Logs to follow the execution or troubleshoot it. We will talk more about it shortly.

Now let’s take a look at a first example of an execution that did not complete successfully, and at how to find the root cause.

You can see the Execution Task Graph, which displays the exact series and order of steps Cloudify takes to convert the declarative blueprint into an actual deployment.

If anything goes wrong, you can take a look at the Deployment Events/Logs and zoom into a specific error to troubleshoot and resolve the issues.

Alternatively, you can navigate to the Executions page from the left sidebar to see where the error occurred, and click Show Error Details to read the error message itself.

Finally, you can check the logs and events in the System Set-up, System Logs widget. Filter by blueprint to see all the logs and events. You can also apply additional filters like log levels, node instances, etc. 

Please refer to the documentation for more information.

Thanks to the detailed Terraform output in the Cloudify logs, we can see the exact Terraform error in the error details. Based on the error message, we can conclude that the issue is related to the security group rules defined in the Terraform template, main.tf.

The next step is to review main.tf, fix the issue, and try again. This is easy to do with the Cloudify Terraform plugin workflows. Check out the Cloudify Terraform plugin documentation for more details.

Once your Terraform module is fixed, execute the Cloudify Terraform workflow Reload terraform template. Open the page for your deployment, select Execute Workflow > Tf > Reload terraform template, populate all the required values, and hit Execute.

Note: populate the “variables”:{} parameter value only if your fix added new input variables to the Terraform template files.

Here is another example of an error during Terraform execution. 

You can see the exact command executed by Cloudify and the detailed Terraform error raised while creating the resource aws_instance.example_vm:

/opt/manager/resources/deployments/default_tenant/tf-01-from-a-file/terraform_o10t7u/terraform apply -auto-approve -no-color -input=false -var-file /opt/manager/resources/deployments/default_tenant/tf-01-from-a-file/cloud_resources_hnqkws/tmpg42ac2tj.json, exit_code: 1, stdout: aws_instance.example_vm: Creating..., stderr: Error: Error launching source instance: VPCResourceNotSpecified: The specified instance type can only be used in a VPC. A subnet ID or network interface ID is required to carry out the request.	status code: 400, request id: ffe2407d-b93e-4a06-a5fc-dc5e4d53404e  on main.tf line 8, in resource "aws_instance" "example_vm":   8: resource "aws_instance" "example_vm"

As the error suggests, no default VPC was found in the us-east-1 region, so the instance needs an explicit subnet ID or network interface ID.

The next step is to review main.tf, fix the issue, and try again.
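Error details like the one above arrive as a single long line, which makes the Terraform part hard to spot. The following is a minimal shell sketch for splitting such a message into its fields. It assumes the message is one line and keeps the ", exit_code: N, stdout: ..., stderr: ..." layout seen in this example; split_tf_error is a hypothetical helper name, not part of Cloudify or Terraform.

```shell
# split_tf_error MESSAGE
# Prints the exit code, stdout, and stderr parts of a Cloudify Terraform
# error detail on separate lines.
# Assumption: MESSAGE is a single line using the
# ", exit_code: N, stdout: ..., stderr: ..." layout shown above.
# split_tf_error is a hypothetical helper name.
split_tf_error() {
  printf 'exit_code: %s\n' \
    "$(printf '%s\n' "$1" | sed -n 's/.*, exit_code: \([0-9][0-9]*\),.*/\1/p')"
  printf 'stdout: %s\n' \
    "$(printf '%s\n' "$1" | sed -n 's/.*, stdout: \(.*\), stderr: .*/\1/p')"
  printf 'stderr: %s\n' \
    "$(printf '%s\n' "$1" | sed -n 's/.*, stderr: \(.*\)/\1/p')"
}
```

Pass the copied error text as one quoted argument. The stderr line is usually the one that carries the Terraform error to act on.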

2. Follow the Terraform Execution Using Cloudify CLI

First, let’s retrieve the deployment ID:

cfy deployments list

Then, list the executions for that specific deployment ID:

cfy executions list -d 57b35cea-26bc-445b-86f4-fc21c1fa20a6

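In this example, 57b35cea-26bc-445b-86f4-fc21c1fa20a6 is the deployment ID, and the execution ID to look for in the output is 53f38d62-4107-45ed-848f-77c9dd3caa64.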

Now you can list the events for the specific execution ID in sequential order:

cfy events list 53f38d62-4107-45ed-848f-77c9dd3caa64

Here, 53f38d62-4107-45ed-848f-77c9dd3caa64 is the execution ID.

You can also run the command while the execution is still in progress with the --tail option, which shows the events of the specified execution in real time until it ends:

cfy events list 53f38d62-4107-45ed-848f-77c9dd3caa64 --tail
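When an execution produces many events, it can help to keep only the lines that mention a problem. The sketch below wraps the cfy events list command used above and filters its text output with grep. The helper name show_execution_errors and the error|failed pattern are assumptions made for illustration; adjust the pattern to the messages you see in your own events.

```shell
# show_execution_errors EXECUTION_ID
# Lists the events of one execution and keeps only the lines that mention
# an error or a failure (case-insensitive match).
# Assumption: show_execution_errors is a hypothetical helper; it only wraps
# "cfy events list" and filters its plain-text output.
show_execution_errors() {
  if [ -z "$1" ]; then
    echo "usage: show_execution_errors <execution_id>" >&2
    return 2
  fi
  cfy events list "$1" | grep -i -E 'error|failed'
}
```

Because the last command is grep, the function returns a non-zero status when no event matches the pattern.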

The next step is to fix the issue and try again.

3. Inspect Cloudify Manager Logs and Troubleshoot on the Manager Server Locally

Note: this option is available only to those who have SSH access to the Cloudify Manager server. Once the SSH connection to the Cloudify Manager server is established, you can find the logs in the /var/log/cloudify/mgmtworker/logs directory.

A dedicated log file is created for each deployment, where you can follow all the events and see any errors that occurred:

/var/log/cloudify/mgmtworker/logs/<deployment_id>.log

Please refer to Cloudify’s documentation for more information about Cloudify logging.

Review the specific deployment log file. You can observe the exact command that was executed when the error occurred. See below.

script_runner.tasks.ProcessException: command: /opt/manager/resources/deployments/default_tenant/57b35cea-26bc-445b-86f4-fc21c1fa20a6/terraform_9tzism/terraform apply -auto-approve -no-color -input=false -var-file /opt/manager/resources/deployments/default_tenant/57b35cea-26bc-445b-86f4-fc21c1fa20a6/cloud_resources_mar1r5/tmptlqxsyhd.json, exit_code: 1, stdout: random_id.suffix: Creating...random_id.suffix: Creation complete after 0s [id=N1rjrA]aws_security_group.allow_ssh_and_http: Creating..., stderr: Error: error updating Security Group (sg-0b5969c5a0a83c09e): error authorizing Security Group (egress) rules: InvalidGroup.NotFound: The security group 'sg-0b5969c5a0a83c09e' does not exist      status code: 400, request id: 5a704e6c-29dc-4115-b020-92ec8cf00911  on main.tf line 6, in resource "aws_security_group" "allow_ssh_and_http":   6: resource "aws_security_group" "allow_ssh_and_http" {

The error message suggests there is an error in the Terraform template file. To shorten debugging time, you can fine-tune your Terraform template locally on the Cloudify Manager.

Switch your current working directory to the Terraform template location, and you can execute the same command you’ve seen in the log above.
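To avoid copying the command out of a long log line by hand, you can extract it with sed. This is a minimal sketch, assuming the log entries keep the "ProcessException: command: <cmd>, exit_code: <n>, ..." layout shown above; failed_tf_commands is a hypothetical helper name.

```shell
# failed_tf_commands LOGFILE
# Prints the command of every ProcessException entry found in a mgmtworker
# deployment log, one command per line.
# Assumption: entries follow the
# "ProcessException: command: <cmd>, exit_code: <n>, ..." layout shown above.
# failed_tf_commands is a hypothetical helper name.
failed_tf_commands() {
  sed -n 's/.*ProcessException: command: \(.*\), exit_code: [0-9][0-9]*,.*/\1/p' "$1"
}
```

Call it with the path of the deployment log, for example /var/log/cloudify/mgmtworker/logs/<deployment_id>.log, and review the printed command before re-running it.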

Fix the main.tf file and run the command again.

During the debugging process, you can see what resources were created already using familiar Terraform commands.
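Two commands that are commonly used for this are terraform state list and terraform show. The sketch below runs both from the module directory. It assumes you pass in the path of the terraform binary taken from the logged command and the Terraform template location on the manager; both are placeholders you supply, and tf_inspect is a hypothetical helper name.

```shell
# tf_inspect TERRAFORM_BIN MODULE_DIR
# Runs "terraform state list" and "terraform show" inside the module directory.
# Assumptions: TERRAFORM_BIN is the terraform binary path taken from the
# logged command, and MODULE_DIR is the Terraform template location on the
# manager. Both are placeholders; tf_inspect is a hypothetical helper name.
tf_inspect() {
  (
    cd "$2" || exit 1
    "$1" state list       # addresses of the resources recorded in the state
    "$1" show -no-color   # attributes of those resources, human-readable
  )
}
```

The subshell keeps the directory change local, so your own working directory is unchanged after the call.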

If you need more debug output to find the issue, raise the Terraform log level and retry:

export TF_LOG=DEBUG

You can also set TF_LOG in the blueprint, in the environment_variables section of the Terraform module. Please refer to the Cloudify documentation for more details.
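When debugging in a shell on the manager, you may prefer to enable debug logging for a single run rather than for the whole session. The sketch below sets TF_LOG together with TF_LOG_PATH, another standard Terraform environment variable that writes the log to a file, for one command only. The helper name tf_debug_run and the log file location are assumptions for illustration.

```shell
# tf_debug_run LOGFILE COMMAND [ARGS...]
# Runs one command with Terraform debug logging enabled for that invocation
# only, and writes the log to LOGFILE.
# TF_LOG and TF_LOG_PATH are standard Terraform environment variables;
# tf_debug_run itself is a hypothetical helper name.
tf_debug_run() {
  tf_debug_log="$1"
  shift
  TF_LOG=DEBUG TF_LOG_PATH="$tf_debug_log" "$@"
}
```

For example, pass a log file path followed by the terraform apply command taken from the deployment log, then read the log file once the run finishes.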
