Core question | What API should an operating system provide for creating and controlling processes? |
---|---|
Solution | Linux fork(), exec(), wait(), etc. |
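How should an operating system's API be designed so that users can manipulate processes? The core API should cover:
- Creating a process
- Destroying a process
- Waiting for a process to exit
- Control: for example, suspending and resuming a process
- Querying process status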
On Linux, a new process is created with fork(); to run an entirely different binary in the newly created process, use exec().
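The fork/exec/wait pattern can be sketched in C as follows. This is a minimal illustration rather than a definitive implementation: `run_and_wait` is a hypothetical helper name, and `execvp` is just one member of the exec() family, chosen here because it searches PATH.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] in a child process and return its
 * exit status, or -1 if fork/wait fails or the child did not exit normally. */
int run_and_wait(char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the new program. */
        execvp(argv[0], argv);
        /* execvp returns only if it failed. */
        perror("execvp");
        _exit(127);
    }
    /* Parent: block until the child terminates, then decode its status. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `run_and_wait` with an argument vector such as `{"wc", "p3.c", NULL}` would run wc in a child while the caller waits for it.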
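The benefit of keeping fork() and exec() separate is that work can be done after fork and before exec, such as setting environment variables for the new process or redirecting files. For example: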
wc p3.c > newfile.txt
For a shell command like this, the shell can, after fork and before exec-ing wc, open newfile.txt in the forked child and redirect the child's stdout to that file.
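A shell's handling of `>` might look roughly like the sketch below. It is an illustration under stated assumptions, not how any particular shell is implemented: `run_redirected` is a hypothetical helper, and the file mode 0644 is an arbitrary choice.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv with stdout redirected to `path`.
 * Returns the child's exit status, or -1 on failure. */
int run_redirected(char *const argv[], const char *path) {
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child, after fork and before exec: open the target file. */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) {
            perror("open");
            _exit(126);
        }
        /* Make file descriptor 1 (stdout) refer to the file. */
        if (dup2(fd, STDOUT_FILENO) < 0) {
            perror("dup2");
            _exit(126);
        }
        if (fd != STDOUT_FILENO) close(fd);
        /* The new program inherits the redirected stdout unchanged. */
        execvp(argv[0], argv);
        perror("execvp");
        _exit(127);
    }
    /* Parent (the "shell"): its own stdout is untouched. */
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With `{"wc", "p3.c", NULL}` and `"newfile.txt"` as arguments, this mirrors the command above; wc itself needs no knowledge of the redirection.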
Notes on using fork
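For a buffered output stream such as stdout, call fflush(stdout) before calling fork(). In the code below, if fflush is omitted and stdout is fully buffered (for example, when it is redirected to a file or a pipe; see OS: I/O: Stream), "Hi" is unexpectedly printed twice, because the child inherits a copy of the unflushed buffer and both processes flush it on exit: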
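    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        printf("Hi\n");
        fflush(stdout);  /* flush before fork so the child does not inherit buffered output */
        fork();
        return 0;
    }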
Orphan Processes and Zombie Processes
Linux organizes processes as a tree: every process except init (pid 1) has a parent, and init sits at the root.
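Suppose a process (the parent) creates another process (the child). Calling wait() in the parent blocks until the child finishes and returns the child's exit status. If the parent never calls wait() while it keeps running, the child becomes a zombie process once it terminates; a large number of zombies can exhaust system resources, in particular process table entries, so that no new processes can be created. If the parent exits before the child finishes, the child becomes an orphan process and is adopted by init (its parent becomes init), which automatically calls wait() to reap it. If the child is already a zombie and the parent exits without ever calling wait(), the child is likewise adopted by init.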
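If a process has both a grandparent and a parent, and the parent exits early while the grandparent is still alive, the child is still adopted by init rather than by the grandparent (unless an ancestor has marked itself as a child subreaper via prctl(PR_SET_CHILD_SUBREAPER)).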
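The zombie state described above can be observed on Linux with a sketch like the following. It assumes /proc is mounted and relies on the Linux-specific layout of /proc/<pid>/stat; `proc_state` and `zombie_state_before_wait` are hypothetical helper names. `waitid` with `WNOWAIT` is used only to wait until the child has exited without reaping it.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: read the state letter of `pid` from
 * /proc/<pid>/stat (Linux-specific). Returns '?' if unavailable. */
char proc_state(pid_t pid) {
    char path[64];
    snprintf(path, sizeof path, "/proc/%d/stat", (int)pid);
    FILE *f = fopen(path, "r");
    if (!f) return '?';
    char state = '?';
    /* Line format: "pid (comm) state ..."; skip pid and comm. */
    if (fscanf(f, "%*d (%*[^)]) %c", &state) != 1) state = '?';
    fclose(f);
    return state;
}

/* Fork a child that exits at once, report the child's state while the
 * parent has not yet reaped it, then reap it with waitpid. */
char zombie_state_before_wait(void) {
    pid_t pid = fork();
    if (pid < 0) return '?';
    if (pid == 0) _exit(0);  /* child terminates right away */

    /* Block until the child has exited, but leave it unreaped. */
    siginfo_t info;
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) < 0) return '?';

    char state = proc_state(pid);  /* child is dead but still in the process table */

    waitpid(pid, NULL, 0);  /* reap: the process table entry is released */
    return state;
}
```

On a typical Linux system the returned letter should be the zombie marker that `ps` also displays in its STAT column; after the final waitpid the /proc entry for the child disappears.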
ASIDE: KEY PROCESS TERMS
- Each process has a name; in most systems, that name is a number known as a process ID (PID).
- The fork() system call is used in UNIX systems to create a new process. The creator is called the parent; the newly created process is called the child. As sometimes occurs in real life [J16], the child process is a nearly identical copy of the parent.
- The wait() system call allows a parent to wait for its child to complete execution.
- The exec() family of system calls allows a child to break free from its similarity to its parent and execute an entirely new program.
- A UNIX shell commonly uses fork(), wait(), and exec() to launch user commands; the separation of fork and exec enables features like input/output redirection, pipes, and other cool features, all without changing anything about the programs being run.
- Process control is available in the form of signals, which can cause jobs to stop, continue, or even terminate.
- Which processes can be controlled by a particular person is encapsulated in the notion of a user; the operating system allows multiple users onto the system, and ensures users can only control their own processes.
- A superuser can control all processes (and indeed do many other things); this role should be assumed infrequently and with caution for security reasons.
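As a rough illustration of signal-based process control, the sketch below stops, continues, and then terminates a child using kill() and observes each transition with waitpid(). The function name `stop_continue_terminate` and its return convention are assumptions made for this example, and it presumes the default SIGTERM disposition is in effect.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Kill and reap the child on an error path so it is not leaked. */
static void abandon_child(pid_t pid) {
    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);
}

/* Hypothetical demo: returns 0 if every transition was observed,
 * otherwise the number of the step that failed (-1 if fork failed). */
int stop_continue_terminate(void) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        for (;;) pause();  /* child: idle until a signal arrives */
    }

    int status;

    /* Step 1: suspend the child and wait for the stop to be reported. */
    kill(pid, SIGSTOP);
    if (waitpid(pid, &status, WUNTRACED) < 0 || !WIFSTOPPED(status)) {
        abandon_child(pid);
        return 1;
    }

    /* Step 2: resume the child and wait for the continue event. */
    kill(pid, SIGCONT);
    if (waitpid(pid, &status, WCONTINUED) < 0 || !WIFCONTINUED(status)) {
        abandon_child(pid);
        return 2;
    }

    /* Step 3: terminate the child and confirm it died from SIGTERM. */
    kill(pid, SIGTERM);
    if (waitpid(pid, &status, 0) < 0) return 3;
    if (!WIFSIGNALED(status) || WTERMSIG(status) != SIGTERM) return 3;

    return 0;
}
```

A process may send these signals only to processes it is permitted to control, which ties back to the user and superuser points in the list above.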