[Terraform](https://www.terraform.io/intro): "HashiCorp Terraform is an infrastructure as code tool that lets you define both cloud and on-prem resources in human-readable configuration files that you can version, reuse, and share. You can then use a consistent workflow to provision and manage all of your infrastructure throughout its lifecycle. Terraform can manage low-level components like compute, storage, and networking resources, as well as high-level components like DNS entries and SaaS features."
- Full automation: In the past, resource creation, modification and removal were handled manually or by using a set of tooling. With Terraform or other IaC technologies, you manage the full lifecycle in an automated fashion.<br>
- Modular and Reusable: Code that you write for one purpose can be reused and assembled in different ways. You can write code to create resources on a public cloud and share it with other teams, who can then use it in their own account on the same (or a different) cloud<br>
- Improved testing: Concepts like CI can easily be applied to IaC-based projects and code snippets. This allows you to test and verify operations beforehand
- Declarative: Terraform uses a declarative approach (rather than a procedural one) to define the desired end state of the resources
- No agents: as opposed to other technologies (e.g. Puppet) that use an agent-server model, Terraform uses the different APIs (of clouds, services, etc.) to perform the operations
- Community: Terraform has a strong community that constantly publishes modules and fixes when needed. This ensures good module maintenance, and users can get support quite quickly at any point
1. Write Terraform definitions: `.tf` files written in HCL that describe the desired infrastructure state (and run `terraform init` at the very beginning)
This is a manual process. Most of the time it is automated: a user submits a PR/MR to propose Terraform changes, there is a process to test these changes, and once merged they are applied (`terraform apply`).
- Infra provisioning and management: You want to automate or codify your infra so you are able to test it easily, apply it and make any changes necessary.
- Consistent environments: You manage environments such as test, production, staging, ... and are looking for a way to keep them consistent, so that any modification in one of them applies to the other environments as well
<summary>What's the difference between Terraform and technologies such as Ansible, Puppet, Chef, etc.</summary><br><b>
Terraform is considered to be an IaC technology. It's used for provisioning resources and managing infrastructure on different platforms.
Ansible, Puppet and Chef are Configuration Management technologies. They are used once there is an instance running and you would like to apply some configuration to it, like installing an application, applying a security policy, etc.
To be clear, CM tools can be used to provision resources, so for the end goal of having infrastructure, both Terraform and something like Ansible can achieve the same result. The difference is in the how. Ansible doesn't save the state of resources; it doesn't know how many instances there are in your environment, as opposed to Terraform. At the same time, while Terraform can perform configuration management tasks, it has less module support for that specific goal and it doesn't track task execution state the way Ansible does. The differences are there, and most of the time it's recommended to mix the technologies: Terraform for managing infrastructure and CM technologies for configuration on top of that infrastructure.
Run `terraform init`. This will scan the code in the directory to figure out which providers are used (in this case AWS provider) and will download them.
<summary>You've executed <code>terraform init</code> and now you would like to move forward to creating the resources but you have concerns and would like to be 100% sure about what you are going to execute. What should you be doing?</summary><br><b>
Execute `terraform plan`. That will provide detailed information on what Terraform will do once you apply the changes.
</b></details>
<details>
<summary>You've downloaded the providers, seen what Terraform will do (with <code>terraform plan</code>) and you are ready to actually apply the changes. What should you do next?</summary><br><b>
Run `terraform apply`. That will apply the changes described in your .tf files.
</b></details>
<details>
<summary>Explain the meaning of the following strings that are seen at the beginning of each line when you run <code>terraform apply</code>
* '+'
* '-'
* '-/+'
</summary><br><b>
* '+' - The resource or attribute is going to be added
* '-' - the resource or attribute is going to be removed
* '-/+' - the resource or attribute is going to be replaced
</b></details>
### Dependencies
<details>
<summary>Sometimes you need to reference some resources in the same or a separate .tf file. Why and how is it done?</summary><br><b>
Why: because resources are sometimes connected or need to be connected. For example, you create an AWS instance with the "aws_instance" resource, but at the same time you would also like to allow some traffic to it (because by default traffic is not allowed). For that, you'll create an "aws_security_group" resource and then, in your aws_instance resource, you'll reference it (see the sketch below).
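A minimal sketch of how such a reference could look (the resource names "allow_http" and "web", the AMI ID and the port are placeholders for illustration):

```
resource "aws_security_group" "allow_http" {
  name        = "allow-http"
  description = "Allow inbound HTTP traffic"

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_instance" "web" {
  ami           = "ami-12345678" # placeholder AMI ID
  instance_type = "t3.micro"

  # Referencing the security group's ID creates an implicit dependency,
  # so Terraform creates the security group before the instance
  vpc_security_group_ids = [aws_security_group.allow_http.id]
}
```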
<summary>Does it matter in which order Terraform creates resources?</summary><br><b>
Yes, when there is a dependency between different Terraform resources, you want the resources to be created in the right order and this is exactly what Terraform does.
To make it even more clear: if you have a resource X that references the ID of resource Y, it doesn't make sense to create resource X first because it won't have any ID to get from a resource that wasn't created yet.
</b></details>
<details>
<summary>Is there a way to print/see the dependencies between the different resources?</summary><br><b>
Yes, with `terraform graph`.
The output is in DOT, a graph description language.
<summary>Explain what is a "provider"</summary><br><b>
[terraform.io](https://www.terraform.io/docs/language/providers/index.html): "Terraform relies on plugins called "providers" to interact with cloud providers, SaaS providers, and other APIs...Each provider adds a set of resource types and/or data sources that Terraform can manage. Every resource type is implemented by a provider; without providers, Terraform can't manage any kind of infrastructure."
<summary>Write a configuration of a Terraform provider (any type you would like)</summary><br><b>
AWS is one of the most popular providers in Terraform. Here is an example of how to configure it to use a specific region and a specific version of the provider:
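A sketch of such a configuration (the region and the version constraint are arbitrary examples):

```
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"  # pin the provider to a specific major version
    }
  }
}

provider "aws" {
  region = "us-east-1"  # use one specific region
}
```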
<summary>How do you clean up Terraform resources? Why should the user be careful doing so?</summary><br><b>
`terraform destroy` will cleanup all the resources tracked by Terraform.
A user should be careful with this command because there is no way to revert it. Sure, you can always run "apply" again, but that can take time, generate completely new resources, etc.
Variables allow you to define a piece of data in one location instead of repeating a hardcoded value in multiple different locations. Then, when you need to modify the variable's value, you do it in one place instead of changing each one of the hardcoded values.
You can use something like the `-var` option to provide the value and avoid being prompted to insert a value. Another option is to run `export TF_VAR_<VAR_NAME>=<VALUE>`.
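A small sketch, assuming a hypothetical variable called "instance_type":

```
variable "instance_type" {
  type        = string
  description = "EC2 instance type"
  # With no default, Terraform prompts for a value unless one is supplied
}

# Ways to supply the value without being prompted:
#   terraform apply -var="instance_type=t3.micro"
#   export TF_VAR_instance_type=t3.micro
```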
<summary>What are output variables? Why do we need them?</summary><br><b>
Output variables allow you to display/print a certain piece of data as part of the Terraform execution.
The most common use case is probably printing the IP address of an instance. Imagine you provision an instance and would like to know its IP address so you can connect to it. Instead of looking for it in the console/OS, you can use an output variable and print that piece of information to the screen.
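For example, a sketch that prints the public IP of a hypothetical "aws_instance" named "web":

```
output "instance_public_ip" {
  description = "Public IP address to connect to the instance"
  value       = aws_instance.web.public_ip
}
```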
</b></details>
<details>
<summary>Explain the "sensitive" parameter of output variable</summary><br><b>
When set to "true", Terraform will avoid logging the output variable's data. The use case for it is sensitive data such as passwords or private keys.
</b></details>
<details>
<summary>Explain the "depends" parameter of output variable</summary><br><b>
It is used to set explicitly dependency between the output variable and any other resource. Use case: some piece of information is available only once another resource is ready.
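A sketch combining this and the previous answer, with a hypothetical "aws_db_instance" named "db":

```
output "db_address" {
  value     = aws_db_instance.db.address
  sensitive = true  # redact the value in Terraform's CLI output

  # Explicit dependency: only expose this output once the DB is fully ready
  depends_on = [aws_db_instance.db]
}
```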
<summary>Is it possible to modify the default lifecycle? How? Why?</summary><br><b>
Yes, it's possible. There are different lifecycle arguments one can choose from. For example, "create_before_destroy" inverts the order: it first creates the new resource, updates all the references from the old resource to the new one, and only then removes the old resource.
How to use it:
```
lifecycle {
  create_before_destroy = true
}
```
Why use it in the first place: you might have resources with a dependency where the dependency itself is immutable (= you can't modify it, hence you have to create a new one). In such a case the default lifecycle won't work, because you won't be able to remove the resource that has the dependency while it still references the old resource. AWS ASG + launch configurations is a good example of such a use case.
<summary>You've deployed a virtual machine with Terraform and you would like to pass data to it (or execute some commands). Which concept of Terraform would you use?</summary><br><b>
Provisioners (e.g. "remote-exec" to run commands on the machine or "file" to copy data to it).
<summary>Why is it often recommended to use provisioners as last resort?</summary><br><b>
Since a provisioner can run a variety of actions, it's not always feasible to plan and understand what will happen when running a certain provisioner. For this reason, it's usually recommended to use Terraform's built-in options whenever possible.
<code>terraform taint resource.id</code> manually marks the resource as tainted in the state file. So when you run <code>terraform apply</code> the next time, the resource will be destroyed and recreated.
<summary>What are output variables and what <code>terraform output</code> does?</summary><br><b>
Output variables are named values that are sourced from the attributes of a module. They are stored in the Terraform state and can be consumed by other configurations through the <code>terraform_remote_state</code> data source. The <code>terraform output</code> command prints the output values of the root module.
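A sketch of consuming another configuration's outputs via the <code>terraform_remote_state</code> data source (the bucket, key and "vpc_id" output are assumptions for illustration):

```
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state"
    key    = "network/terraform.tfstate"
    region = "us-east-1"
  }
}

# Use an output named "vpc_id" exposed by that other configuration
resource "aws_subnet" "example" {
  vpc_id     = data.terraform_remote_state.network.outputs.vpc_id
  cidr_block = "10.0.1.0/24"
}
```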
</b></details>
<details>
<summary>Explain <code>remote-exec</code> and <code>local-exec</code></summary><br><b>
Both are provisioners: <code>local-exec</code> runs a command on the machine where Terraform itself is executed, while <code>remote-exec</code> runs commands on the newly created resource (over SSH or WinRM, using a connection block).
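A sketch showing both provisioners on a hypothetical instance (the AMI, key pair, user and key path are placeholders):

```
resource "aws_instance" "web" {
  ami           = "ami-12345678"  # placeholder
  instance_type = "t3.micro"
  key_name      = "my-key"        # placeholder key pair

  # Runs on the machine where Terraform itself is executed
  provisioner "local-exec" {
    command = "echo ${self.public_ip} >> created_ips.txt"
  }

  # Runs on the newly created instance over SSH
  provisioner "remote-exec" {
    inline = ["sudo apt-get update -y"]

    connection {
      type        = "ssh"
      user        = "ubuntu"
      host        = self.public_ip
      private_key = file("~/.ssh/id_rsa")
    }
  }
}
```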
</b></details>
<details>
<summary>Explain "Remote State". When would you use it and how?</summary><br><b>
Terraform generates a `terraform.tfstate` JSON file that describes the components/services provisioned on the specified provider. Remote state stores this file in remote storage to enable collaboration among the team.
State locking is a mechanism that blocks operations against a specific state file from multiple callers, so as to avoid conflicting operations from different team members. Once the first caller's lock is released, another team member may go ahead and carry out their own operation. Terraform will still first check the state file to see whether the desired resource already exists and, if so, update it rather than create a duplicate.
[terraform.io](https://www.terraform.io/language/state): "Terraform must store state about your managed infrastructure and configuration. This state is used by Terraform to map real world resources to your configuration, keep track of metadata, and to improve performance for large infrastructures."
In other words, it's a mechanism in Terraform to track resources you've created or cleaned up. This is how terraform knows what to update/create/delete when you run `terraform apply` and also other commands like `terraform destroy`.
<summary>Where is Terraform state stored?</summary><br><b>
There is more than one answer to this question. It very much depends on whether you share it with others or keep it only local in your Terraform directory, but taking a beginner's case: when you run Terraform in a directory, the state will be stored in that directory, in a `terraform.tfstate` file.
* The representation of resources - JSON format of the resources, their attributes, IDs, ... everything that is required to identify the resource and also anything that was included in the .tf files about these resources
<summary>Why does it matter where you store the tfstate file? In your answer make sure to address the following:
* Public vs. Private
* Git repository vs. Other locations
</summary><br><b>
- tfstate may contain credentials in plain text. You don't want to put it in a publicly shared location
- tfstate shouldn't be modified concurrently, so putting it in a shared location available for everyone with "write" permissions might lead to issues. (Terraform remote state doesn't have this problem.)
- tfstate is an important file. As such, it might be better to put it in a location that has regular backups and good security.
As such, tfstate shouldn't be stored in Git repositories. Secured storage, such as secured buckets, is a better option.
In general, storing the state file on your computer isn't a problem. It starts to be a problem when you are part of a team that uses Terraform, and then you would like to make sure it's shared. In addition to being shared, you want to handle the fact that different team members can edit the file, possibly at the same time, so locking is quite an important aspect as well.
</b></details>
<details>
<summary>Mention some best practices related to tfstate</summary><br><b>
- Don't edit it manually. tfstate was designed to be manipulated by terraform and not by users directly.
- Store it in a secured location (since it can include credentials and sensitive data in general)
- Back it up regularly so you can roll back easily when needed
- Store it in remote shared storage. This is especially needed when working in a team, where the state can be updated by any of the team members
- Enable versioning if the storage where you store the state file supports it. Versioning is great for backups and rollbacks in case of an issue.
If there are two users or processes concurrently editing the state file, it can result in an invalid state file that doesn't actually represent the state of the resources.<br>
To avoid that, Terraform can apply state locking if the backend supports it. For example, the AWS S3 backend supports state locking and consistency via DynamoDB. If the backend supports it, Terraform will often make use of state locking automatically, so nothing is required from the user to activate it.
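For example, a sketch of an S3 backend with DynamoDB-based locking (the bucket, key and table names are assumptions):

```
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"  # enables state locking
  }
}
```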
<summary>Describe how you manage state file(s) when you have multiple environments (e.g. development, staging and production)</summary><br><b>
Probably no right or wrong answer here, but it seems, based on different sources, that the overall preferred way is to have a dedicated state file per environment.
* Sensitive data: some resources may specify sensitive data (like passwords and tokens) and everything in a state file is stored in plain text
* Prone to errors: when working with Git repos, you may often find yourself switching branches, checking out specific commits, performing rebases, ... all these operations may end up with you eventually performing `terraform apply` on a non-latest version of your Terraform code
A Terraform backend determines how the Terraform state is stored and loaded. By default the state is local, but it's possible to set a remote backend.
</b></details>
<details>
<summary>Describe how to set a remote backend of any type you choose</summary><br><b>
Let's say we chose to use Amazon S3 as a remote Terraform backend where we can store Terraform's state.
1. Write Terraform code for creating an S3 bucket
   1. It would be a good idea to add a lifecycle of "prevent_destroy" to it so it's not accidentally deleted
   2. Enable versioning (add a resource of "aws_s3_bucket_versioning")
   3. Encrypt the bucket ("aws_s3_bucket_server_side_encryption_configuration")
   4. Block public access
   5. Handle locking. One way is to add a DynamoDB table for it
2. At this point you'll want to run `terraform init` and `terraform apply`, to avoid an issue where you create the resources for the remote backend and switch to the remote backend at the same time
3. Once the resources are created, add the Terraform backend code
```
terraform {
  backend "s3" {
    bucket = "..."
  }
}
```
4. Run `terraform init` again, as it will configure the backend
That's true and quite a limitation as it means you'll have to go to the resources of the remote backend and copy some values to the backend configuration.
One way to deal with it is to use a partial configuration in a completely separate file from the backend itself and then load it with `terraform init -backend-config=some_backend_partial_conf.hcl`
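For example, the partial configuration file could look roughly like this (the values are placeholders), while the backend block in the Terraform code stays as an empty `backend "s3" {}`:

```
# some_backend_partial_conf.hcl
bucket         = "my-terraform-state"
key            = "prod/terraform.tfstate"
region         = "us-east-1"
dynamodb_table = "terraform-locks"
```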
[developer.hashicorp.com](https://developer.hashicorp.com/terraform/language/state/workspaces): "The persistent data stored in the backend belongs to a workspace. The backend initially has only one workspace containing one Terraform state associated with that configuration. Some backends support multiple named workspaces, allowing multiple states to be associated with a single configuration."
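One common way to use workspaces inside the configuration is through the `terraform.workspace` value, e.g. to size resources differently per environment (a sketch, with placeholder values):

```
resource "aws_instance" "web" {
  ami           = "ami-12345678"  # placeholder
  instance_type = terraform.workspace == "prod" ? "m5.large" : "t3.micro"

  tags = {
    Environment = terraform.workspace
  }
}
```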
[Terraform.io](https://www.terraform.io/language/modules/develop): "A module is a container for multiple resources that are used together. Modules can be used to create lightweight abstractions, so that you can describe your infrastructure in terms of its architecture, rather than directly in terms of physical objects."
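For example, consuming a module could look like this (the registry VPC module and its inputs are used only as an illustration):

```
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "my-vpc"
  cidr = "10.0.0.0/16"
}
```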
</b></details>
<details>
<summary>How do you test a terraform module?</summary><br><b>
There are multiple answers, but the most common one would likely be using the tool <code>terratest</code> to test that a module can be initialized, can create resources, and can destroy those resources cleanly.
</b></details>
<details>
<summary>Where can you obtain Terraform modules?</summary><br><b>
Terraform modules can be found at the [Terraform registry](https://registry.terraform.io/browse/modules)
</b></details>
<details>
<summary>There's a discussion in your team whether to store modules in one centralized location/repository or have them in each of the projects/repositories where they are used. What's your take on that?</summary><br><b>
You might have a different opinion, but my personal take is to keep modules in one centralized repository, as any maintenance or updates to a module are then done in one place instead of multiple times across different repositories.
<summary>You have a Git repository with Terraform files but no .gitignore. What would you add to a .gitignore file in Terraform repository?</summary><br><b>
```
.terraform
*.tfstate
*.tfstate.backup
```
You don't want to store the state file, nor any downloaded providers in the .terraform directory. It also doesn't make sense to share/store the state backup files.
<summary>You manage ASG with Terraform which means you also have "aws_launch_configuration" resources. The problem is that launch configurations are immutable and sometimes you need to change them. This creates a problem where Terraform isn't able to delete the old launch configuration because the ASG still references it. How do you deal with it?</summary><br><b>
Add the following to the "aws_launch_configuration" resource:
```
lifecycle {
create_before_destroy = true
}
```
This will change the order in which Terraform works: first it will create the new resource (launch configuration), then it will update other resources to reference the new launch configuration, and finally it will remove the old resource.
<summary>How would you enforce that users of your variables provide values with certain constraints? For example, a number greater than 1</summary><br><b>
Using a `validation` block:
```
variable "some_var" {
type = number
validation {
condition = var.some_var > 1
error_message = "you have to specify a number greater than 1"