OSTEP: Ch5: Interlude: Process API: Library — Zhiheng Lin's Second Brain

OSTEP: Ch5: Interlude: Process API

20th August 2020 at 2:19pm

Core question	What API should an operating system provide for creating and controlling processes?
Solution	On Linux: `fork()`, `exec()`, `wait()`, and related calls

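How should an operating system's API be designed so that users can manipulate processes? The core API should cover:

Creating a process
Destroying a process
Waiting for a process to exit
Miscellaneous control, such as suspending and resuming a process
Querying a process's status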

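On Linux, a new process is created with fork(). To run a completely different binary in the newly created process, call one of the exec() family of functions.

The benefit of keeping fork() and exec() separate is that code can run after fork and before exec, for example to set environment variables for the new process or to redirect its files. For example: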
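The fork, exec, wait sequence can be sketched as below. This is a minimal sketch, not a full launcher: `run_and_wait` is a hypothetical helper name, and using `execvp` (which searches `PATH`) is an assumption made for convenience; any member of the exec family would do.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] in a child process and return its
 * exit status, or -1 on failure or abnormal termination. */
int run_and_wait(char *const argv[]) {
    fflush(NULL);                 /* flush stdio buffers before duplicating the process */
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the new program. */
        execvp(argv[0], argv);
        perror("execvp");         /* reached only if exec failed */
        _exit(127);
    }
    /* Parent: block until the child terminates. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `run_and_wait` from `main` with an argument vector such as `{"ls", "-l", NULL}` runs `ls -l` in a child and returns its exit status to the parent.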

wc p3.c > newfile.txt

For a shell command like this, the shell can, after fork and before exec'ing wc, open newfile.txt in the forked child and point the child's stdout at that file.
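A rough sketch of how a shell might implement `cmd > file` follows. It is only an illustration: `run_redirected` is a hypothetical name, and the `0644` file mode and the `dup2` approach are assumptions about one common way to do it, not a description of any particular shell.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv with stdout redirected to `path`.
 * Returns the command's exit status, or -1 on failure. */
int run_redirected(char *const argv[], const char *path) {
    fflush(NULL);
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child only: the parent's stdout is left untouched. */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) {
            perror("open");
            _exit(126);
        }
        if (fd != STDOUT_FILENO) {
            /* Make file descriptor 1 refer to the file, then drop the spare. */
            if (dup2(fd, STDOUT_FILENO) < 0) {
                perror("dup2");
                _exit(126);
            }
            close(fd);
        }
        /* The new program inherits the redirected stdout without knowing it. */
        execvp(argv[0], argv);
        perror("execvp");
        _exit(127);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With `argv` set to `{"wc", "p3.c", NULL}` and `path` set to `"newfile.txt"`, this approximates the command above. The program being run needs no changes, because the redirection happens entirely between fork and exec.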

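Notes on using fork

For a buffered output stream such as stdout, call fflush(stdout) before fork. The child receives a copy of the parent's unflushed stdio buffer, and each process flushes its own copy when it exits. When stdout is a line-buffered terminal, the trailing \n already flushes the buffer, so the problem does not show. When stdout is fully buffered (for example when redirected to a file or a pipe; see OS: I/O: Stream), the code below prints Hi twice, which is unexpected, unless fflush is called: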

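#include <stdio.h>
#include <unistd.h>

int main(void) {
    printf("Hi\n");
    fflush(stdout);  /* empty the buffer so the child does not inherit a pending "Hi" */
    fork();
    return 0;
}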

Orphan Processes and Zombie Processes

Linux organizes processes in a tree: every process except init (pid 1) has a parent, and init sits at the root.

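Suppose a process (the parent) creates another process (the child). Calling wait() in the parent blocks until the child finishes and returns the child's exit status. If the parent never calls wait() but keeps running, the child becomes a zombie process once it terminates. A large number of zombies exhausts system resources, the process table in particular, so that no new processes can be created. If the parent exits before the child finishes, the child becomes an orphan process and is adopted by init (that is, its parent becomes init), which automatically calls wait() to reap it. If the child is already a zombie and the parent exits without ever calling wait(), the child is likewise adopted by init.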
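The zombie state can be observed directly with a small experiment. The sketch below is Linux-specific and makes several assumptions: `/proc` is mounted, the helper names `proc_state` and `zombie_demo` are hypothetical, and `waitid` with `WNOWAIT` is used only to learn that the child has exited without reaping it.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read the one-letter state of `pid` from /proc/<pid>/stat (Linux only).
 * Returns '?' if the file cannot be read or parsed. */
static char proc_state(pid_t pid) {
    char path[64], buf[512];
    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    FILE *f = fopen(path, "r");
    if (!f) return '?';
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    /* The state letter follows the last ')' that closes the command name. */
    char *p = strrchr(buf, ')');
    return (p && p[1] == ' ' && p[2]) ? p[2] : '?';
}

/* Hypothetical demo: returns the child's state between its exit and the
 * parent's wait, and stores the child's exit code in *code. */
char zombie_demo(int *code) {
    pid_t pid = fork();
    if (pid < 0) return '?';
    if (pid == 0) _exit(7);                 /* child terminates immediately */

    /* Block until the child has exited; WNOWAIT leaves it unreaped. */
    siginfo_t info;
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) < 0) return '?';
    char state = proc_state(pid);           /* 'Z' is the zombie state on Linux */

    /* Reap the child: this releases its process-table entry. */
    int status;
    if (waitpid(pid, &status, 0) < 0) return '?';
    *code = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return state;
}
```

On a Linux system with `/proc` available, the returned state should be `Z`, and the exit code retrieved by the final `waitpid` is the 7 passed to `_exit`. A parent that skipped that final call and kept running would leave the entry in place.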

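Suppose a process has both a grandparent and a parent. If the parent exits early while the grandparent is still alive, the child is still adopted by init by default, not by the grandparent. (On Linux, an ancestor can opt in to adopting such orphans by marking itself a child subreaper with prctl(PR_SET_CHILD_SUBREAPER).)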
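Re-parenting of an orphan can be checked with a three-generation experiment. In this sketch, `orphan_demo` is a hypothetical name, and the polling loop with `usleep` is an assumption chosen for simplicity. The adopter's PID is often 1, but it may instead be a subreaper (for example a per-user service manager), so the code only reports it.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: orphan a grandchild and report who adopts it.
 * On success returns 0, storing the original parent's PID in *old_parent
 * and the adopter's PID in *new_parent. */
int orphan_demo(pid_t *old_parent, pid_t *new_parent) {
    int fds[2];
    if (pipe(fds) < 0) return -1;
    fflush(NULL);

    pid_t mid = fork();
    if (mid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (mid == 0) {
        /* Middle process: fork a child, then exit right away. */
        close(fds[0]);
        pid_t self = getpid();
        pid_t child = fork();
        if (child == 0) {
            /* Child: poll (for roughly 5 s at most) until its parent PID changes. */
            for (int i = 0; i < 5000 && getppid() == self; i++)
                usleep(1000);
            pid_t pair[2] = { self, getppid() };
            if (write(fds[1], pair, sizeof pair) < 0) _exit(1);
            _exit(0);
        }
        _exit(child < 0 ? 1 : 0);           /* orphan the child */
    }

    /* Original process, the "grandparent" of the orphan. */
    close(fds[1]);
    waitpid(mid, NULL, 0);                  /* reap the middle process */
    pid_t pair[2];
    ssize_t n = read(fds[0], pair, sizeof pair);
    close(fds[0]);
    if (n != (ssize_t)sizeof pair) return -1;
    *old_parent = pair[0];
    *new_parent = pair[1];
    return 0;
}
```

After a successful call, `*new_parent` differs from `*old_parent`. Unless the calling process is itself init or a subreaper, it also differs from the caller's own PID, which shows that the grandparent does not inherit the orphan.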

ASIDE: KEY PROCESS TERMS

Each process has a name; in most systems, that name is a number known as a process ID (PID).
The fork() system call is used in UNIX systems to create a new process. The creator is called the parent; the newly created process is called the child. As sometimes occurs in real life [J16], the child process is a nearly identical copy of the parent.
The wait() system call allows a parent to wait for its child to complete execution.
The exec() family of system calls allows a child to break free from its similarity to its parent and execute an entirely new program.
A UNIX shell commonly uses fork(), wait(), and exec() to launch user commands; the separation of fork and exec enables features like input/output redirection, pipes, and other cool features, all without changing anything about the programs being run.
Process control is available in the form of signals, which can cause jobs to stop, continue, or even terminate.
Which processes can be controlled by a particular person is encapsulated in the notion of a user; the operating system allows multiple users onto the system, and ensures users can only control their own processes.
A superuser can control all processes (and indeed do many other things); this role should be assumed infrequently and with caution for security reasons.