From 1eda57d212d3bd66950362d87ab1aa18dc219c3b Mon Sep 17 00:00:00 2001
From: abregman
Date: Wed, 23 Oct 2019 19:54:45 +0300
Subject: [PATCH] Add SRE questions

---
 README.md | 33 +++++++++++++++++++++++++++------
 1 file changed, 27 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index 7c7dad8..5c1e0d4 100644
--- a/README.md
+++ b/README.md
@@ -3,14 +3,15 @@
 
 :information_source:  This repository contains interview questions on various DevOps related topics
 
-:bar_chart:  There are currently **413** questions
-
+:bar_chart:  There are currently **416** questions
 
 :warning:  You don't need to know how to answer all the questions in this repo. DevOps is not about knowing all :)
 
-:page_facing_up:  Different interviewers focus on different things. Some will focus on your resume while others might focus on scenario questions or specific technical questions. In this repository I tried to cover different types of questions for you to practice and test your knowledge
+:thought_balloon:  Different interviewers focus on different things. Some will focus on your resume while others might focus on scenario questions or specific technical questions. I tried to cover different types of questions for you to practice and test your skills
 
-:pencil:  You can add more questions & answers by submitting pull requests :)
+:page_facing_up:  Some questions are also relevant to similar roles like SRE and Production Engineer
+
+:pencil:  You can add more questions & answers by submitting pull requests :) You can read more about it [here](CONTRIBUTING.md)
 
 ****
 
@@ -156,11 +157,31 @@ which follows the immutable infrastructure paradigm.
-Explain monitoring. What is it? Why it's important?
+Explain monitoring. What is it? What is its goal?
-What monitoring methods are you familiar with?
+What types of monitoring outputs are you familiar with and/or have you used in the past?
+
+Alerts
+Tickets
+Logging
+
+
+##### SRE
+
+
+What is an SRE team responsible for?
+
+One can argue whether the definition is company-specific or global, but at least according to large companies such as Google, the SRE team is responsible for the availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of its services
+
+ +
+What is an error budget?
+
+ +
+What are MTTF (mean time to failure) and MTTR (mean time to repair)? What do these metrics help us evaluate?