Add Git exercises

abregman 2022-06-25 17:48:13 +03:00
parent 05f055f0c0
commit 738e582468
4 changed files with 111 additions and 1 deletions


@ -802,7 +802,7 @@ yippiekaiyay 2> ls_output.txt
<details>
<summary>Demonstrate Linux stderr to stdout redirection</summary><br><b>
yippiekaiyay 1>&2
yippiekaiyay &> file
</b></details>
<details>
@ -6813,6 +6813,10 @@ https://idiallo.com/blog/c10k-2016
<summary>Explain Dark Data</summary><br><b>
</b></details>
<details>
<summary>Explain MBR</summary><br><b>
</b></details>
## Questions you CAN ask
<a name="questions-you-ask"></a>


@ -2059,6 +2059,10 @@ No. Since AWS reserves 5 IP addresses for every subnet, Kratos will have 32-5=27
It's better if Kratos uses a subnet of size /26 but good luck telling him that.
</b></details>
<details>
<summary>In order for AWS Lambda to have internet access</summary><br><b>
</b></details>
##### AWS EC2 - ENI
<details>


@ -10,6 +10,8 @@
## Questions
### Git Basics
<details>
<summary>How do you know if a certain directory is a git repository?</summary><br><b>
@ -50,6 +52,10 @@ There are different ways to check whether a file is tracked or not:
...
</b></details>
<details>
<summary>Explain what the file <code>gitignore</code> is used for</summary><br><b>
</b></details>
<details>
<summary>How can you see which changes have been made before committing them?</summary><br><b>
@ -58,6 +64,48 @@ There are different ways to check whether a file is tracked or not:
<details>
<summary>What does <code>git status</code> do?</summary><br><b>
`git status` helps you understand the tracking status of files in your repository. Focusing on the working directory and the staging area, it shows which changes were made in the working directory, which changes are in the staging area and, in general, whether files are being tracked or not.
</b></details>
<details>
<summary>You've created new files in your repository. How do you make sure Git tracks them?</summary><br><b>
`git add FILES`
</b></details>
### Scenarios
<details>
<summary>You have files in your repository that you don't want Git to ever track. What should you do to avoid ever tracking them?</summary><br><b>
Add them to the file `.gitignore`. This makes sure these untracked files are never added to the staging area. Files that are already tracked are not affected by `.gitignore`; stop tracking them first with `git rm --cached <file>`.
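
A minimal sketch of this behavior, assuming a throwaway repository (the file names `app.py` and `debug.log` and the ignore patterns are hypothetical, chosen only for illustration):

```shell
# Throwaway repository; all file names below are made up for illustration.
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init --quiet

# Ignore every *.log file and the whole build/ directory.
printf '%s\n' '*.log' 'build/' > .gitignore

touch app.py debug.log
git add .

# The index now holds .gitignore and app.py; debug.log was skipped.
staged="$(git ls-files)"
echo "$staged"

# Show which ignore rule matched debug.log.
git check-ignore -v debug.log
```

`git check-ignore -v` is handy when it is not obvious which pattern causes a file to be ignored.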
</b></details>
<details>
<summary>A development team in your organization is using a monorepo and it has become quite big, including hundreds of thousands of files. They say that many Git operations take a long time to run (<code>git status</code>, for example). Why does that happen and what can you do to help them?</summary><br><b>
Many Git operations depend on filesystem state. `git status`, for example, runs one diff to compare the HEAD commit to the index and another diff to compare the index to the working directory. As part of these diffs, it needs to run quite a lot of `lstat()` system calls. When running on hundreds of thousands of files, this can take seconds if not minutes.
One thing you can do about it is to use Git's built-in `fsmonitor` (filesystem monitor). With fsmonitor (which can also integrate with Watchman), Git spawns a daemon that continuously watches the working directory of your repository for changes and caches them. This way, when you run `git status`, Git uses the cached information instead of scanning the whole working directory.
<p align="center">
<img src="images/design/development/git_fsmonitor.png"/>
</p>
Next, you can try to enable `feature.manyFiles` with `git config feature.manyFiles true`. This does two things:
1. Sets `index.version = 4`, which enables path-prefix compression in the index
2. Sets `core.untrackedCache=true`, which by default is set to `keep`. The untracked cache is quite an important concept. It records the mtime of all the files and directories in the working directory. This way, when the time comes to iterate over all the files and directories, Git can skip those whose mtime wasn't updated.
Before enabling it, you might want to run `git update-index --test-untracked-cache` to test it out and make sure mtime works as expected on your system.
Git also has the built-in `git maintenance` command, which optimizes a Git repository so that commands like `git add` or `git fetch` run faster and the repository takes less disk space. It's recommended to run this command periodically (e.g. each day).
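
The settings above can be sketched as follows, assuming a throwaway repository that stands in for the monorepo (the repository content and the commit identity are hypothetical). Commands whose availability depends on the platform or Git version are left as comments:

```shell
# Throwaway repository standing in for the monorepo (hypothetical).
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init --quiet
echo "hello" > README.md
git add README.md
git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "Initial commit"

# Opt in to the defaults tuned for repositories with many files.
git config feature.manyFiles true

# Optional, shown as comments because they depend on platform and Git version:
#   git update-index --test-untracked-cache   # checks that mtime behaves as the untracked cache expects
#   git config core.fsmonitor true            # built-in filesystem monitor daemon
#   git maintenance start                     # schedule background maintenance for this repository

# Run the maintenance tasks once, in the foreground.
git maintenance run

# Confirm that the setting was written to the repository configuration.
git config --get feature.manyFiles
```

`git maintenance start` is the usual way to get the periodic runs mentioned above, since it registers the repository with the system scheduler instead of relying on someone running the command by hand.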
In addition, populate only what is used or modified by developers. Some repositories include generated files that are required for the project to run properly (or to support certain accessibility options) but are never modified by the developers. In that case, having them in every working directory is futile.
In order to avoid populating those files in the working directory, one can use the `sparse checkout` feature of Git.
Finally, with certain build systems, you can know exactly which files are relevant based on the component of the project the developer is focusing on. This, together with `sparse checkout`, can lead to a situation where only a small subset of the files is populated in the working directory, which makes commands like `git add`, `git status`, etc. really quick.
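
A minimal sketch of sparse checkout, assuming a throwaway repository with a hypothetical layout (`services/api` for the code a developer works on, `generated` for files they never touch):

```shell
# Throwaway repository; the directory layout is hypothetical.
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init --quiet
mkdir -p services/api generated
echo "print('api')" > services/api/main.py
echo "generated data" > generated/data.txt
git add .
git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "Initial commit"

# Enable sparse checkout in cone mode and keep only services/api populated.
git sparse-checkout init --cone
git sparse-checkout set services/api

# generated/ is no longer present in the working directory...
ls

# ...but its files are still tracked in the index.
git ls-files
```

The files outside the selected directories stay part of the repository history; they are just not written to disk, so there is less for Git to scan.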
</b></details>
### Branches
@ -103,6 +151,12 @@ Git runs update-ref to add the SHA-1 of the last commit of the branch you're on
Using the HEAD file: `.git/HEAD`
</b></details>
<details>
<summary>What does <code>unstaged</code> mean with regard to Git?</summary><br><b>
A file (or a change to a file) that is in the working directory but has not been added to the staging area is referred to as "unstaged".
</b></details>
<details>
<summary>True or False? when you <code>git checkout some_branch</code>, Git updates .git/HEAD to <code>/refs/heads/some_branch</code></summary><br><b>
@ -249,3 +303,51 @@ False. If you would like to keep a file on your filesystem, use `git reset <file
`find .git/refs/`
</b></details>
## Git Diff
<details>
<summary>What does <code>git diff</code> do?</summary><br><b>
`git diff` shows the differences between two commits, two files, a tree and the staging area, etc.
</b></details>
<details>
<summary>Which one is faster? <code>git diff-index HEAD</code> or <code>git diff HEAD</code> </summary><br><b>
`git diff-index` is faster but, to be fair, that's because it does less. `git diff-index` won't look at the content,
only at metadata like timestamps.
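
As a rough sketch of the two commands side by side (the repository, the file name `a.txt` and the commit identity are hypothetical):

```shell
# Throwaway repository with one committed file (hypothetical).
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init --quiet
echo "one" > a.txt
git add a.txt
git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "Initial commit"

# Modify the tracked file without staging the change.
echo "two" >> a.txt

# Plumbing: compares HEAD with the working tree, trusting the stat data cached in the index.
plumbing="$(git diff-index --name-only HEAD)"

# Porcelain: refreshes the index first, then performs the comparison.
porcelain="$(git diff --name-only HEAD)"

echo "diff-index: $plumbing"
echo "diff:       $porcelain"
```

Because `git diff-index` does not refresh the index, scripts that rely on it commonly run `git update-index --refresh` beforehand so that files with stale stat data are not reported as changed.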
</b></details>
<details>
<summary>Which other Git commands use git diff?</summary><br><b>
The diff mechanism is used by `git status` to perform a comparison and let the user know which files have changed and which are being tracked.
</b></details>
## Git Internal
<details>
<summary>Describe how <code>git status</code> works</summary><br><b>
In short, it runs `git diff` twice:
1. Compares HEAD to the staging area
2. Compares the staging area to the working directory
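
The two comparisons can be reproduced by hand. This is a sketch under the assumption of a throwaway repository with two hypothetical files, one with a staged change and one with an unstaged change:

```shell
# Throwaway repository with two committed files (hypothetical).
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init --quiet
echo "a" > a.txt
echo "b" > b.txt
git add a.txt b.txt
git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "Initial commit"

# One staged change and one unstaged change.
echo "staged change" >> a.txt
git add a.txt
echo "unstaged change" >> b.txt

# 1. HEAD vs. staging area: what status lists under "Changes to be committed".
head_vs_index="$(git diff --cached --name-only)"

# 2. Staging area vs. working directory: "Changes not staged for commit".
index_vs_worktree="$(git diff --name-only)"

echo "HEAD vs. index:        $head_vs_index"
echo "index vs. working dir: $index_vs_worktree"

# The combined view of both comparisons.
git status --short
```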
</b></details>
<details>
<summary>If <code>git status</code> has to diff all the files in the HEAD commit against those in the staging area/index, and then the staging area/index against the working directory, how is it still fairly fast? </summary><br><b>
One reason is the structure of the index, commits, etc.
* Every file in a commit is stored in a tree object
* The index is then a flattened structure of tree objects
* All files in the index have pre-computed hashes
* The diff operation, then, is a comparison of hashes
Another reason is caching
* The index caches information about the working directory
* When Git has the information for a certain file cached, there is no need to look at the file in the working directory
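
The pre-computed hashes can be inspected directly. This is a sketch assuming a throwaway repository with a single hypothetical file, `file.txt`:

```shell
# Throwaway repository with one committed file (hypothetical).
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init --quiet
echo "hello" > file.txt
git add file.txt
git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "Initial commit"

# Blob hash recorded for file.txt in the tree of the HEAD commit.
head_hash="$(git rev-parse HEAD:file.txt)"

# Blob hash already stored in the index entry (no file content is read here).
index_hash="$(git ls-files --stage file.txt | awk '{print $2}')"

# Hash of the current working directory content, computed on demand.
worktree_hash="$(git hash-object file.txt)"

echo "HEAD:        $head_hash"
echo "index:       $index_hash"
echo "working dir: $worktree_hash"
```

When the HEAD and index hashes match, Git knows the file has no staged change without reading its content.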
</b></details>

Binary file not shown.