OSTEP: Ch5: Interlude: Process API

 20th August 2020 at 2:19pm
Core question: What API should the operating system provide for creating and controlling processes?
Solution: Linux fork(), exec(), wait()

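How should the operating system's API be designed so that users can manipulate processes? The core API should cover:

  • Creating a process
  • Destroying a process
  • Waiting for a process to exit
  • Control: for example, suspending and resuming a process
  • Querying process status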

On Linux, a new process is created with fork(); to run a completely different binary in the newly created process, use exec().
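A minimal sketch of how fork() behaves, assuming a POSIX system. The function name parent_value_after_fork is hypothetical and exists only for this illustration. It shows the two return values of fork() and that the child works on its own copy of the parent's memory:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the parent's view of x after the child
 * has modified its own copy. Returns -1 if fork() fails. */
int parent_value_after_fork(void) {
    int x = 100;
    pid_t pid = fork();
    if (pid < 0) {
        return -1;              /* fork failed, no child was created */
    }
    if (pid == 0) {
        /* fork() returned 0: this is the child. */
        x = 200;                /* changes only the child's copy of x */
        _exit(0);
    }
    /* fork() returned the child's PID: this is the parent. */
    waitpid(pid, NULL, 0);      /* wait for the child to finish */
    return x;                   /* the parent's copy is untouched */
}
```

The parent still sees 100 because the two processes no longer share the variable after fork().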

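The benefit of separating fork() from exec() is that work can be done after fork and before exec, such as setting environment variables for the new process or redirecting files. For example: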

wc p3.c > newfile.txt

For a shell command like this, the shell can, after fork and before exec'ing wc, open newfile.txt in the forked child and redirect stdout to that file.
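The redirection described above can be sketched roughly as follows. This is an illustration under assumptions, not how any particular shell is implemented: run_redirected is a hypothetical helper, and it uses dup2() to point stdout at the file, which is one common way to do it.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its stdout redirected to outfile.
 * Returns the child's exit code, or -1 on failure. */
int run_redirected(const char *outfile, char *const argv[]) {
    fflush(stdout);                 /* do not let the child inherit buffered output */
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Child: the window between fork and exec is where redirection happens. */
        int fd = open(outfile, O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd < 0) _exit(127);
        if (dup2(fd, STDOUT_FILENO) < 0) _exit(127);
        close(fd);                  /* stdout now refers to the file */
        execvp(argv[0], argv);      /* replaces the child's program image */
        _exit(127);                 /* reached only if exec failed */
    }
    /* Parent: wait for the command and report its exit code. */
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_redirected("newfile.txt", argv) with argv set to {"wc", "p3.c", NULL} would approximate the command above; the wc program itself needs no changes.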

Notes on using fork

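For a buffered output stream such as stdout, call fflush(stdout) before fork. Take the code below: without the fflush call, and when stdout is fully buffered (for example, redirected to a file or pipe; see OS: I/O: Stream), it unexpectedly prints two lines of Hi, because the unflushed buffer is copied into the child and each process flushes its own copy on exit: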

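#include <stdio.h>
#include <unistd.h>

int main() {
    printf("Hi\n");
    fflush(stdout);  // flush before fork so the child does not inherit buffered output
    fork();
    return 0;
}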

Orphan Processes and Zombie Processes

Linux organizes processes as a tree: every process except init (pid 1) has a parent, and init sits at the root.

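Suppose a process (the parent) creates another process (the child). Calling wait() in the parent waits for the child to finish and retrieves its exit code. If the parent never calls wait() while it keeps running, the child becomes a zombie process once it terminates; a large number of zombies can exhaust system resources, especially the process table, so that no new processes can be created. If the parent exits before the child finishes, the child becomes an orphan process and is adopted by init (its parent becomes init), which automatically calls wait() to reap it. If the child is already a zombie and the parent exits without ever calling wait(), the child is likewise adopted by init.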

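Suppose a process has both a grandparent and a parent. If the parent exits early while the grandparent is still alive, the child is still adopted by init by default, not by the grandparent.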
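The grandparent case can be checked with a rough experiment like the one below. The function adopter_of_orphan is hypothetical; the caller plays the grandparent. One caveat stated as an assumption: on Linux the adopter is normally PID 1, but a process registered as a child subreaper (via prctl with PR_SET_CHILD_SUBREAPER) can take that role instead, so the sketch only reports who the new parent is.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical experiment: returns the PID that adopts a grandchild after
 * its parent (the "middle" process) exits while the caller is still alive.
 * The middle process's PID is stored in *middle_out. Returns -1 on failure. */
pid_t adopter_of_orphan(pid_t *middle_out) {
    int fds[2];
    if (pipe(fds) < 0) return -1;

    pid_t middle = fork();
    if (middle < 0) return -1;
    if (middle == 0) {
        close(fds[0]);
        pid_t me = getpid();
        pid_t grandchild = fork();
        if (grandchild == 0) {
            /* Grandchild: poll until the middle process is gone. */
            struct timespec pause_1ms = {0, 1000000L};
            while (getppid() == me) nanosleep(&pause_1ms, NULL);
            pid_t adopter = getppid();      /* the new parent */
            ssize_t w = write(fds[1], &adopter, sizeof adopter);
            _exit(w == (ssize_t)sizeof adopter ? 0 : 1);
        }
        _exit(0);   /* middle exits right away, orphaning the grandchild */
    }

    /* Grandparent: reap the middle process, then read the orphan's report. */
    close(fds[1]);
    waitpid(middle, NULL, 0);
    pid_t adopter = -1;
    if (read(fds[0], &adopter, sizeof adopter) != (ssize_t)sizeof adopter) {
        adopter = -1;
    }
    close(fds[0]);
    *middle_out = middle;
    return adopter;
}
```

The grandparent never has to reap the orphan; whichever process adopted it collects its exit status.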

ASIDE: KEY PROCESS TERMS

  • Each process has a name; in most systems, that name is a number known as a process ID (PID).
  • The fork() system call is used in UNIX systems to create a new process. The creator is called the parent; the newly created process is called the child. As sometimes occurs in real life [J16], the child process is a nearly identical copy of the parent.
  • The wait() system call allows a parent to wait for its child to complete execution.
  • The exec() family of system calls allows a child to break free from its similarity to its parent and execute an entirely new program.
  • A UNIX shell commonly uses fork(), wait(), and exec() to launch user commands; the separation of fork and exec enables features like input/output redirection, pipes, and other cool features, all without changing anything about the programs being run.
  • Process control is available in the form of signals, which can cause jobs to stop, continue, or even terminate.
  • Which processes can be controlled by a particular person is encapsulated in the notion of a user; the operating system allows multiple users onto the system, and ensures users can only control their own processes.
  • A superuser can control all processes (and indeed do many other things); this role should be assumed infrequently and with caution for security reasons.