# System Resources
## 1. Overview
???note How do we define system resources?
- How fine-grained should it be?
  - One cluster of computers
  - One computer
  - One socket
  - One core
  - A portion of time on one core
  - ...
- In a nutshell:
  - What resources do you see: namespace
  - What resources can you use: cgroup
???note Namespace
- Partitions of kernel resources such that one partition is `viewable` by one
set of processes but not by another set.
- Namespace kinds:
- Mount (mnt): Powerful and flexible tool for creating per-user and per-container filesystem trees
- Process ID (pid): [Processes in different pid namespaces can have the same pid](https://manpages.ubuntu.com/manpages/xenial/man7/pid_namespaces.7.html)
- Network (net): [Provide isolation of system resources associated with networking](https://man7.org/linux/man-pages/man7/network_namespaces.7.html)
- IPC (ipc): [Isolate certain IPC resources and POSIX message queues](https://man7.org/linux/man-pages/man7/ipc_namespaces.7.html)
- UTS (uts): UTS namespaces allow a single system to appear to have different host and domain names to different processes.
- User ID (user): [Isolate security-related identifiers and attributes, in particular, user IDs and group IDs,
the root directory, keys, and capabilities](https://man7.org/linux/man-pages/man7/user_namespaces.7.html)
- Control group (cgroup): Covered in the cgroup note below.
- Time: Allow processes to see different system times.
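
On Linux, the namespaces a process belongs to are exposed as symbolic links under `/proc/<pid>/ns`, with targets of the form `pid:[4026531836]`. Two processes are in the same namespace when the bracketed inode numbers match. The following is a minimal sketch for inspecting them; the helper names `ns_inode` and `show_namespaces` are made up for illustration and are not part of any tool.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for inspecting namespace membership via /proc.

# Extract the inode number from a link target such as "pid:[4026531836]".
ns_inode() {
    local target="$1"
    target=${target#*\[}            # drop everything up to and including '['
    printf '%s\n' "${target%\]}"    # drop the trailing ']'
}

# Print "<namespace> <inode>" for each namespace of a PID (default: this shell).
show_namespaces() {
    local pid="${1:-$$}" link
    for link in /proc/"$pid"/ns/*; do
        [ -L "$link" ] || continue  # skip when /proc is not available
        printf '%-18s %s\n' "${link##*/}" "$(ns_inode "$(readlink "$link")")"
    done
}

show_namespaces
```

Running the script in two different shells and comparing the numbers shows which namespaces the shells share.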
???note cgroup (the initial c is never capitalized)
- Control groups, usually referred to as cgroups, are a Linux kernel feature which allow processes to be organized into hierarchical groups whose usage of various types of resources can then be limited and monitored.
- Man page
- Authoritative document on the design, interface, and convention of cgroupv2
- Features:
  - Resource limiting: memory limit, I/O bandwidth limit, CPU quota limit, or CPU set limit.
  - Prioritization: some groups may get a larger share of CPU utilization or disk I/O throughput.
  - Accounting: measures a group's resource usage.
  - Control: freezing groups of processes, their checkpointing and restarting
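
The hands-on exercises in this document use cgroup v1 file names such as `cpu.cfs_quota_us` and `memory.limit_in_bytes`, so it can help to check which cgroup layout a machine is running first. The sketch below assumes GNU `stat` and the conventional mount point `/sys/fs/cgroup`; the helper name `cgroup_version_from_fstype` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map the filesystem type mounted at /sys/fs/cgroup
# to a cgroup layout label.
cgroup_version_from_fstype() {
    case "$1" in
        cgroup2fs) echo v2 ;;        # unified hierarchy
        tmpfs)     echo v1 ;;        # per-controller mounts (legacy or hybrid)
        *)         echo unknown ;;
    esac
}

# stat -f reports on the filesystem; %T prints its type in readable form.
fstype=$(stat -fc %T /sys/fs/cgroup 2>/dev/null || true)
echo "cgroup layout: $(cgroup_version_from_fstype "$fstype")"
```

If the result is `v2`, the v1 paths and file names used in the exercises will likely not exist as written.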
## 2. Hands-on
???note Hands-on: cgroup CPU
- Instantiate the Docker branch
- Run the following commands
~~~bash
$ sudo apt install -y cgroup-tools
$ sudo cgcreate -g cpu:/cpulimit
~~~
- `cpu.cfs_period_us` and `cpu.cfs_quota_us`: together they specify that tasks in a cgroup can use a single CPU for `quota` out of every `period` microseconds (`us`).
~~~bash
$ sudo cgset -r cpu.cfs_period_us=1000000 cpulimit
$ sudo cgset -r cpu.cfs_quota_us=10000 cpulimit
$ sudo cgget -g cpu:cpulimit
~~~
- In the above example, tasks in the `cpulimit` group can use 10000 microseconds
out of every 1000000 microseconds of a single CPU's time, that is, 1% of one CPU.
- Create a tmux session split horizontally into two panes. Keep the bottom pane running the `top`
command.
- In the top pane, run the following commands one at a time and observe the CPU usage
in the `top` pane.
~~~bash
$ dd if=/dev/zero of=out bs=1M
~~~
and
~~~bash
$ sudo cgexec -g cpu:cpulimit dd if=/dev/zero of=out bs=1M
~~~
:::{image} ../fig/csc603/06-system-resources/cgroup-cpu-1.png
:alt: Running dd without cgroup
:class: bg-primary mb-1
:height: 300px
:align: center
:::
:::{image} ../fig/csc603/06-system-resources/cgroup-cpu-2.png
:alt: Running dd with cgroup
:class: bg-primary mb-1
:height: 300px
:align: center
:::
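
The ratio of `cpu.cfs_quota_us` to `cpu.cfs_period_us` gives the share of one CPU that the group may use, which is roughly what `top` should report for the `dd` process started through `cgexec`. A minimal sketch of that arithmetic follows; the helper name `cpu_limit_percent` is made up for illustration, and integer division rounds the result down.

```bash
#!/usr/bin/env bash
# cpu_limit_percent QUOTA_US PERIOD_US
# Hypothetical helper: percentage of a single CPU allowed by the two values.
cpu_limit_percent() {
    local quota="$1" period="$2"
    echo $(( 100 * quota / period ))
}

cpu_limit_percent 10000 1000000     # values used in this hands-on; prints 1
cpu_limit_percent 500000 1000000    # half of one CPU; prints 50
```

A result above 100 means the group may use more than one CPU's worth of time in each period.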
???note Hands-on: cgroup memory
- Check the content of the initial memory cgroup
- Create a new memory cgroup called `blue`
- Check the initial memory limit
- Set the amount of memory for tasks in the `blue` group
~~~bash
$ echo 104857600 | sudo tee /sys/fs/cgroup/memory/blue/memory.limit_in_bytes
$ cat /sys/fs/cgroup/memory/blue/memory.limit_in_bytes
~~~
- Check on the OOM (out-of-memory) killer
- Create a memory hog file with the following contents:
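~~~c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define KB (1024)
#define MB (1024 * KB)
#define GB (1024 * MB)

int main(int argc, char *argv[]) {
    char *p;

again:
    /* Allocate in 1 GB chunks until malloc fails; memset touches every
       page so the memory is actually charged to the cgroup. */
    while ((p = (char *)malloc(GB)))
        memset(p, 0, GB);
    /* Then fill whatever space remains with 1 MB and 1 KB chunks. */
    while ((p = (char *)malloc(MB)))
        memset(p, 0, MB);
    while ((p = (char *)malloc(KB)))
        memset(p, 0, KB);

    sleep(1);
    goto again;
    return 0;
}
~~~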
- Move the shell into the tasks group of cgroup `blue`:
- Compile and run `memhog` and observe how it is killed:
- Turn off the OOM killer and see how `memhog` hangs.
- Open a new window, SSH into the CloudLab node, and try to move the shell into the `blue` cgroup. You will see that the shell hangs (out of memory).
- Open yet another shell and change the `OOM` flag to re-enable the OOM killer. You will see that `memhog` is killed immediately once the flag is turned back on.
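
The limit written to `memory.limit_in_bytes` in this hands-on, 104857600, is 100 MiB, and the OOM steps revolve around the cgroup v1 file `memory.oom_control`. The sketch below assumes the v1 memory controller is mounted at `/sys/fs/cgroup/memory`, as in the commands above. The helper names `mib_to_bytes` and `oom_kill_disabled` and the `CGROUP_DIR` variable are hypothetical, and the script only reads state.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; the default path assumes the cgroup v1 memory controller.
CGROUP_DIR=${CGROUP_DIR:-/sys/fs/cgroup/memory/blue}

# Convert mebibytes to the byte count expected by memory.limit_in_bytes.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Print the oom_kill_disable flag (0 = OOM killer enabled, 1 = disabled)
# from memory.oom_control in the given cgroup directory.
oom_kill_disabled() {
    awk '$1 == "oom_kill_disable" { print $2 }' "$1/memory.oom_control"
}

mib_to_bytes 100    # prints 104857600
if [ -r "$CGROUP_DIR/memory.oom_control" ]; then
    echo "oom_kill_disable: $(oom_kill_disabled "$CGROUP_DIR")"
fi
```

In cgroup v1, writing `1` to `memory.oom_control` disables the OOM killer for the group and writing `0` enables it again, so the flag printed here should change between the last steps of the exercise.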