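In an earlier article I mentioned that the Linux 4.1 kernel supports a single-user mode (link:). In this mode the UID and GID are always 0 and user privileges are no longer distinguished (everything runs with root-like privileges). It is aimed at small systems such as embedded systems.
Next, let's look at how this patch implements a single-user kernel. The patch can be viewed at:
1. Commit message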
kernel: conditionally support non-root users, groups and capabilities

There are a lot of embedded systems that run most or all of their functionality in init, running as root:root. For these systems, supporting multiple users is not necessary.
(Many embedded systems always operate as root:root, so multi-user support is of little use to them.)

This patch adds a new symbol, CONFIG_MULTIUSER, that makes support for non-root users, non-root groups, and capabilities optional. It is enabled under CONFIG_EXPERT menu.
(In other words, CONFIG_MULTIUSER is a new kernel switch that makes support for non-root users, non-root groups and capabilities optional.)

When this symbol is not defined, UID and GID are zero in any possible case and processes always have all capabilities.
(With CONFIG_MULTIUSER turned off, i.e. multi-user mode disabled, UID and GID are always 0 and every process holds every capability.)

The following syscalls are compiled out: setuid, setregid, setgid, setreuid, setresuid, getresuid, setresgid, getresgid, setgroups, getgroups, setfsuid, setfsgid, capget, capset.
(These system calls are no longer built, and therefore no longer supported.)

Also, groups.c is compiled out completely.

In kernel/capability.c, capable function was moved in order to avoid adding two ifdef blocks.
(capable() is moved rather than removed, so that a single #ifdef block covers it. That #ifdef decides whether the normal implementation is used or the check simply returns.)

This change saves about 25 KB on a defconfig build. The most minimal kernels have total text sizes in the high hundreds of kB rather than low MB. (The 25k goes down a bit with allnoconfig, but not that much.)
(The saving is about 25 KB of kernel binary size with defconfig, the kernel's default config, and slightly less with allnoconfig. It matters because the smallest kernels have a total text size of only a few hundred kB, well under 1 MB.)

The kernel was booted in Qemu. All the common functionalities work. Adding users/groups is not possible, failing with -ENOSYS.
(Verified by booting in a virtual machine: the basic system calls work, and every operation that adds users or groups fails with -ENOSYS.)

Bloat-o-meter output:
add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650)

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Iulia Manda
Reviewed-by: Josh Triplett
Acked-by: Geert Uytterhoeven
Tested-by: Paul E. McKenney
Reviewed-by: Paul E. McKenney
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
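2. Analysis of the patch contents
The patch touches many lines, and many of the changes serve the same purpose, so only the key parts are covered here.
a. Some features and architectures gain a dependency on the MULTIUSER config: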
    diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
    index a5ced5c..de2726a 100644
    --- a/arch/s390/Kconfig
    +++ b/arch/s390/Kconfig
    @@ -328,6 +328,7 @@ config COMPAT
      select COMPAT_BINFMT_ELF if BINFMT_ELF
      select ARCH_WANT_OLD_COMPAT_IPC
      select COMPAT_OLD_SIGACTION
    + depends on MULTIUSER
    diff --git a/drivers/staging/lustre/lustre/Kconfig b/drivers/staging/lustre/lustre/Kconfig
    index 6725467..62c7bba 100644
    --- a/drivers/staging/lustre/lustre/Kconfig
    +++ b/drivers/staging/lustre/lustre/Kconfig
    @@ -10,6 +10,7 @@ config LUSTRE_FS
      select CRYPTO_SHA1
      select CRYPTO_SHA256
      select CRYPTO_SHA512
    + depends on MULTIUSER
    …
b. Function variants are selected with #ifdef CONFIG_MULTIUSER
    diff --git a/include/linux/capability.h b/include/linux/capability.h
    index aa93e5e..af9f0b9 100644
    --- a/include/linux/capability.h
    +++ b/include/linux/capability.h
    @@ -205,6 +205,7 @@ static inline kernel_cap_t cap_raise_nfsd_set(const kernel_cap_t a,
          cap_intersect(permitted, __cap_nfsd_set));
     }
    +#ifdef CONFIG_MULTIUSER  // multi-user build: use the real functions
     extern bool has_capability(struct task_struct *t, int cap);
     extern bool has_ns_capability(struct task_struct *t,
              struct user_namespace *ns, int cap);
    @@ -213,6 +214,34 @@ extern bool has_ns_capability_noaudit(struct task_struct *t,
              struct user_namespace *ns, int cap);
     extern bool capable(int cap);
     extern bool ns_capable(struct user_namespace *ns, int cap);
    +#else  // root-only build: every capability check simply succeeds
    +static inline bool has_capability(struct task_struct *t, int cap)
    +{
    + return true;
    +}
    …
    +static inline bool ns_capable(struct user_namespace *ns, int cap)
    +{
    + return true;
    +}
    +#endif /* CONFIG_MULTIUSER */
    diff --git a/include/linux/cred.h b/include/linux/cred.h
    index 2fb2ca2..8b6c083 100644
    --- a/include/linux/cred.h
    +++ b/include/linux/cred.h
    @@ -62,9 +62,27 @@ do { \
      groups_free(group_info); \
     } while (0)
    -extern struct group_info *groups_alloc(int);
     extern struct group_info init_groups;
    +#ifdef CONFIG_MULTIUSER  // real group functions; the #else branch stubs out groups_free, in_group_p and in_egroup_p
    +extern struct group_info *groups_alloc(int);
     extern void groups_free(struct group_info *);
    +
    +extern int in_group_p(kgid_t);
    +extern int in_egroup_p(kgid_t);
    +#else
    +static inline void groups_free(struct group_info *group_info)
    +{
    +}
    +
    +static inline int in_group_p(kgid_t grp)
    +{
    + return 1;
    +}
    +static inline int in_egroup_p(kgid_t grp)
    +{
    + return 1;
    +}
    +#endif
    diff --git a/include/linux/uidgid.h b/include/linux/uidgid.h
    index 2d1f9b6..0ee05da 100644
    --- a/include/linux/uidgid.h
    +++ b/include/linux/uidgid.h
    @@ -29,6 +29,7 @@ typedef struct {
     #define KUIDT_INIT(value) (kuid_t){ value }
     #define KGIDT_INIT(value) (kgid_t){ value }
    +#ifdef CONFIG_MULTIUSER  // real __kuid_val and __kgid_val; the #else branch always returns 0
     static inline uid_t __kuid_val(kuid_t uid)
     {
      return uid.val;
    @@ -38,6 +39,17 @@ static inline gid_t __kgid_val(kgid_t gid)
     {
      return gid.val;
     }
    +#else
    +static inline uid_t __kuid_val(kuid_t uid)
    +{
    + return 0;
    +}
    +
    +static inline gid_t __kgid_val(kgid_t gid)
    +{
    + return 0;
    +}
    +#endif
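The stub pattern in these headers can be tried outside the kernel. The following is only a userspace sketch: DEMO_MULTIUSER, demo_kuid_t, demo_caps, demo_capable and demo_kuid_val are hypothetical names that stand in for CONFIG_MULTIUSER and the kernel helpers, and the "real" branch is a deliberately simplified placeholder rather than the kernel's logic.

```c
/* Userspace sketch of the compile-time stub pattern; not kernel code.
 * DEMO_MULTIUSER is a hypothetical stand-in for CONFIG_MULTIUSER. */
#include <stdbool.h>

/* Mirrors the shape of the kernel's kuid_t wrapper type. */
typedef struct { unsigned int val; } demo_kuid_t;

#ifdef DEMO_MULTIUSER
/* "Real" variants: consult state. Simplified placeholder logic. */
static unsigned int demo_caps = 0;  /* bitmask of capabilities held */

static inline bool demo_capable(int cap)
{
    return (demo_caps >> cap) & 1u;
}

static inline unsigned int demo_kuid_val(demo_kuid_t uid)
{
    return uid.val;
}
#else
/* Stub variants: the check always succeeds and every UID reads as 0,
 * so the compiler can fold callers' branches away entirely. */
static inline bool demo_capable(int cap)
{
    (void)cap;
    return true;
}

static inline unsigned int demo_kuid_val(demo_kuid_t uid)
{
    (void)uid;
    return 0;
}
#endif
```

Compiled without -DDEMO_MULTIUSER, demo_capable() returns true for any argument and demo_kuid_val() returns 0 whatever value is stored. Because callers keep the same function names in both builds, no call site needs its own #ifdef.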
c. init/Kconfig gains the MULTIUSER option, so it shows up in make menuconfig (only when EXPERT is enabled, as the "if EXPERT" condition below indicates)
    …
    +config MULTIUSER
    + bool "Multiple users, groups and capabilities support" if EXPERT
    + default y
    + help
    +   This option enables support for non-root users, groups and
    +   capabilities.
    +
    +   If you say N here, all processes will run with UID 0, GID 0, and all
    +   possible capabilities. Saying N here also compiles out support for
    +   system calls related to UIDs, GIDs, and capabilities, such as setuid,
    +   setgid, and capset.
    +
    +   If unsure, say Y here.
    +
d. kernel/Makefile builds groups.o only when MULTIUSER is set
    diff --git a/kernel/Makefile b/kernel/Makefile
    index 1408b33..0f8f8b0 100644
    --- a/kernel/Makefile
    +++ b/kernel/Makefile
    @@ -9,7 +9,9 @@ obj-y = fork.o exec_domain.o panic.o \
      extable.o params.o \
      kthread.o sys_ni.o nsproxy.o \
      notifier.o ksysfs.o cred.o reboot.o \
    - async.o range.o groups.o smpboot.o
    + async.o range.o smpboot.o
    +
    +obj-$(CONFIG_MULTIUSER) += groups.o   # groups.c is compiled only when CONFIG_MULTIUSER is selected
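e. In capability.c, #ifdef CONFIG_MULTIUSER is added in the hunk at line 35 and #endif /* CONFIG_MULTIUSER */ in the hunk at line 386. The functions between those two points, including capable(), which is moved inside the block, are therefore defined and implemented only when CONFIG_MULTIUSER is selected.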
    diff --git a/kernel/capability.c b/kernel/capability.c
    index 989f5bf..45432b5 100644
    --- a/kernel/capability.c
    +++ b/kernel/capability.c
    @@ -35,6 +35,7 @@ static int __init file_caps_disable(char *str)
     }
     __setup("no_file_caps", file_caps_disable);
    +#ifdef CONFIG_MULTIUSER
     /*
      * More recent versions of libcap are available from:
      *
    @@ -386,6 +387,24 @@ bool ns_capable(struct user_namespace *ns, int cap)
     }
     EXPORT_SYMBOL(ns_capable);
    +
    +/**
    + * capable - Determine if the current task has a superior capability in effect
    + * @cap: The capability to be tested for
    + *
    + * Return true if the current task has the given superior capability currently
    + * available for use, false if not.
    + *
    + * This sets PF_SUPERPRIV on the task if the capability is available on the
    + * assumption that it's about to be used.
    + */
    +bool capable(int cap)
    +{
    + return ns_capable(&init_user_ns, cap);
    +}
    +EXPORT_SYMBOL(capable);
    +#endif /* CONFIG_MULTIUSER */
f. sys_ni.c gains fallback entries for the system calls listed above.
A word on what sys_ni.c is for: when a system call is retired, or is not built into a given configuration, its service routine is pointed at sys_ni_syscall. The "ni" in sys_ni_syscall stands for "not implemented". Each cond_syscall() line makes the named symbol a weak alias for sys_ni_syscall, so the call returns -ENOSYS whenever the real implementation is not compiled in.

    diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
    index 5adcb0a..7995ef5 100644
    --- a/kernel/sys_ni.c
    +++ b/kernel/sys_ni.c
    @@ -159,6 +159,20 @@ cond_syscall(sys_uselib);
     cond_syscall(sys_fadvise64);
     cond_syscall(sys_fadvise64_64);
     cond_syscall(sys_madvise);
    +cond_syscall(sys_setuid);
    +cond_syscall(sys_setregid);
    +cond_syscall(sys_setgid);
    +cond_syscall(sys_setreuid);
    +cond_syscall(sys_setresuid);
    +cond_syscall(sys_getresuid);
    +cond_syscall(sys_setresgid);
    +cond_syscall(sys_getresgid);
    +cond_syscall(sys_setgroups);
    +cond_syscall(sys_getgroups);
    +cond_syscall(sys_setfsuid);
    +cond_syscall(sys_setfsgid);
    +cond_syscall(sys_capget);
    +cond_syscall(sys_capset);
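The weak-alias fallback can be imitated in userspace with the GCC/Clang weak and alias attributes on an ELF target. This is a rough sketch of the idea, not the kernel's macro: DEMO_COND_SYSCALL, demo_ni_syscall, demo_setgroups and demo_capset are hypothetical names, and the kernel itself emits the alias with assembler directives.

```c
/* Userspace sketch of a "not implemented" fallback via weak aliases.
 * Assumes GCC or Clang on an ELF platform such as Linux. */
#include <errno.h>

/* Fallback handler: plays the role of the kernel's sys_ni_syscall(). */
long demo_ni_syscall(void)
{
    return -ENOSYS;
}

/* Hypothetical macro: declare `name` as a weak alias of the fallback.
 * A strong definition of `name` in another object file would win at
 * link time; without one, calls land in demo_ni_syscall(). */
#define DEMO_COND_SYSCALL(name) \
    long name(void) __attribute__((weak, alias("demo_ni_syscall")))

DEMO_COND_SYSCALL(demo_setgroups);
DEMO_COND_SYSCALL(demo_capset);
```

With no other definition linked in, demo_setgroups() and demo_capset() both return -ENOSYS. This matches the behaviour the commit message describes for the syscalls that are compiled out when MULTIUSER is disabled.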
To sum up, the patch implements what the commit message states:

The following syscalls are compiled out: setuid, setregid, setgid, setreuid, setresuid, getresuid, setresgid, getresgid, setgroups, getgroups, setfsuid, setfsgid, capget, capset.
Also, groups.c is compiled out completely.
In kernel/capability.c, capable function was moved in order to avoid adding two ifdef blocks.
1. Build a bzImage from the v4.18 kernel
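    # git branch
    * (HEAD detached at v4.18)
    # cp arch/x86/configs/x86_64_defconfig ./.config
    # make menuconfig    (disable MULTIUSER; the option appears only when EXPERT is enabled)
    # make bzImage -j8

After the build finishes, the kernel image is at arch/x86/boot/bzImage.
2. Boot it with qemu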
    / # adduser cuibixuan    (why can a user still be added here?)
    adduser: /home/cuibixuan: No such file or directory
    passwd: unknown uid 0
    / # su cuibixuan
    su: can't set groups: Function not implemented

As the output shows, group-related operations now fail with "Function not implemented". This indicates that the sys_setgroups entry added to kernel/sys_ni.c (+cond_syscall(sys_setgroups);) has taken effect.
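The same check can be made from a small C program instead of going through su. The sketch below assumes a libc that passes the kernel's error through in errno. demo_is_compiled_out and demo_groups_compiled_out are hypothetical helper names. The probe uses getgroups(0, NULL), which only asks for the number of supplementary groups and changes nothing, and getgroups is on the commit message's list of compiled-out syscalls.

```c
/* Userspace probe: does the running kernel report ENOSYS for a
 * group-related syscall? Helper names are hypothetical. */
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* True when a syscall wrapper failed with "Function not implemented". */
bool demo_is_compiled_out(int rc, int err)
{
    return rc == -1 && err == ENOSYS;
}

/* getgroups(0, NULL) is a read-only query, so it is safe to call
 * on any system; only its error code is of interest here. */
bool demo_groups_compiled_out(void)
{
    errno = 0;
    int rc = getgroups(0, NULL);
    return demo_is_compiled_out(rc, errno);
}
```

On a kernel built with MULTIUSER disabled, demo_groups_compiled_out() would be expected to return true, and on an ordinary multi-user kernel false. Other failures, such as EPERM, are deliberately not treated as "compiled out".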
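As for Linux's support for single-user systems, I personally think that merely dropping support for uid/gid, groups, capabilities and the related functions is not enough. For example, what should happen when the filesystem already has multiple users configured before boot (/etc/passwd and /etc/group)? And what about (security-related) system calls that are recommended to run with an ordinary user's privileges? The discussion of the patch also raised further points.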
On multiple processes, scheduling and similar issues:
Come to think of it, I look forward to the next tinification patch
that removes support for multiple processes, scheduling, and makes the only running process always have pid 1.
Or, from the discussion about threads:
The problem is then that the single userspace task can prevent
necessary kernel threads from running.