How to protect processes and kernel extensions on macOS

Hello, Habr! Today I would like to talk about how you can protect processes from attackers on macOS. This is useful, for example, for an antivirus or a backup system, especially since macOS offers several ways to “kill” a process. Read on for the details and for the protection methods.

The classic way to kill a process

The well-known way to “kill” a process is to send it the SIGKILL signal. From bash, you can run the standard “kill -SIGKILL PID” or “pkill -9 NAME”. The kill command dates back to UNIX and is available not only on macOS but also on other UNIX-like systems.
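The same thing can be done programmatically through the kill() system call. Below is a minimal sketch in which a parent forks a child and kills it with SIGKILL; the function name kill_child_and_get_signal is made up for this illustration.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that waits for signals forever, sends it SIGKILL and returns
// the number of the signal that terminated it, or -1 on any failure.
// The function name is hypothetical and used only for this illustration.
int kill_child_and_get_signal() {
    pid_t child = fork();
    if (child < 0) {
        return -1;  // fork failed
    }
    if (child == 0) {
        for (;;) {
            pause();  // child: sleep until a signal arrives
        }
    }
    assert(child > 0);  // only the parent reaches this point

    // Same effect as running "kill -SIGKILL <pid>" from the shell.
    if (kill(child, SIGKILL) != 0) {
        return -1;
    }

    int status = 0;
    if (waitpid(child, &status, 0) != child) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

The wait status reports that the child was terminated by a signal, and WTERMSIG() returns the signal number.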

As on other UNIX-like systems, a process on macOS can intercept any signal except two: SIGKILL and SIGSTOP. This article focuses on SIGKILL as the signal that causes a process to be killed.
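This restriction is easy to observe from userspace. The sketch below, with the illustrative helper try_install_handler, tries to install a handler through sigaction(): it succeeds for SIGTERM, while POSIX requires the call to fail with EINVAL for SIGKILL and SIGSTOP.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>

// Empty handler; it exists only so that we have something to install.
extern "C" void on_signal(int) {}

// Tries to install a handler for `signo`.
// Returns 0 on success or the errno value set by sigaction() on failure.
// The helper name is hypothetical and used only for this illustration.
int try_install_handler(int signo) {
    assert(signo > 0);  // signal numbers start at 1

    struct sigaction sa {};
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);

    errno = 0;
    if (sigaction(signo, &sa, nullptr) != 0) {
        return errno;
    }
    return 0;
}
```

This is why a process cannot defend itself against SIGKILL with a signal handler and needs help from the kernel side.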

macOS specifics

On macOS, the kill system call in the XNU kernel calls the psignal(SIGKILL, ...) function. Let's see which other userspace actions can end up calling psignal. We leave aside the calls to psignal made by internal kernel mechanisms (they can be non-trivial, but we will leave them for another article :): signature verification, memory errors, exit/terminate handling, file protection violations, and so on.

We start the overview with the terminate_with_payload system call and the function behind it. In addition to the classic kill call, this is an alternative approach that is specific to macOS and is not found in BSD. Both system calls work in a similar way: they call the psignal kernel function directly. Note also that before a process is killed, a “cansignal” check is performed to determine whether the calling process may send a signal to the target; for example, the system does not let an arbitrary application kill system processes.

static int
terminate_with_payload_internal(struct proc *cur_proc, int target_pid, uint32_t reason_namespace,
				uint64_t reason_code, user_addr_t payload, uint32_t payload_size,
				user_addr_t reason_string, uint64_t reason_flags)
{
...
	target_proc = proc_find(target_pid);
...
	if (!cansignal(cur_proc, cur_cred, target_proc, SIGKILL)) {
		proc_rele(target_proc);
		return EPERM;
	}
...
	if (target_pid == cur_proc->p_pid) {
		/*
		 * psignal_thread_with_reason() will pend a SIGKILL on the specified thread or
		 * return if the thread and/or task are already terminating. Either way, the
		 * current thread won't return to userspace.
		 */
		psignal_thread_with_reason(target_proc, current_thread(), SIGKILL, signal_reason);
	} else {
		psignal_with_reason(target_proc, SIGKILL, signal_reason);
	}
...
}

launchd

The standard way to create daemons at system startup and to control their lifetime is launchd. Note that the source code shown here is from the old version of launchctl, before macOS 10.10, and is given as an illustration. The modern launchctl sends its commands to launchd over XPC, and the launchctl logic has been moved into launchd.

Let's look at how exactly an application is stopped. Before sending the SIGTERM signal, the code first tries to stop the application with the proc_terminate system call.

<launchctl src/core.c>
...
	error = proc_terminate(j->p, &sig);
	if (error) {
		job_log(j, LOG_ERR | LOG_CONSOLE, "Could not terminate job: %d: %s", error, strerror(error));
		job_log(j, LOG_NOTICE | LOG_CONSOLE, "Using fallback option to terminate job...");
		error = kill2(j->p, SIGTERM);
		if (error) {
			job_log(j, LOG_ERR, "Could not signal job: %d: %s", error, strerror(error));
		} 
...
<>

Under the hood, proc_terminate, despite its name, can make psignal send not only SIGTERM but also SIGKILL.

Indirect kill - resource limit

A more interesting case is another system call, process_policy. Its standard use is to limit application resources: for example, the indexer has a CPU time limit and a memory quota so that file caching does not slow the system down noticeably. If the application reaches the resource limit, the proc_apply_resource_actions function shows that the SIGKILL signal is sent to the process.

Although this system call can potentially kill a process, the system did not adequately check the privileges of the process making the call. A check did exist, but using the alternative flag PROC_POLICY_ACTION_SET was enough to bypass it.

Hence, if you “limit” an application's CPU quota (for example, allow it only 1 ns of execution), you can kill any process in the system. Malware could thus kill any process, including the antivirus process. Another interesting effect occurs when the process with pid 1 (launchd) is killed: a kernel panic while trying to process the SIGKILL signal :)

How to solve the problem?

The most straightforward way to prevent a process from being killed is to replace the function pointer in the system call table. Unfortunately, this method is non-trivial for many reasons.

Firstly, the symbol that holds the location of sysent in memory is not only private to the XNU kernel but also cannot be found among the kernel symbols. You have to use heuristic search methods, for example, dynamically disassembling a function and searching for the pointer in it.

Secondly, the layout of the table entries depends on the flags the kernel was built with. If CONFIG_REQUIRES_U32_MUNGING is defined, the structure grows by an additional field, sy_arg_munge32. You therefore need an extra check to determine which flag the kernel was compiled with; one option is to compare function pointers with known ones.

struct sysent {         /* system call table */
        sy_call_t       *sy_call;       /* implementing function */
#if CONFIG_REQUIRES_U32_MUNGING || (__arm__ && (__BIGGEST_ALIGNMENT__ > 4))
        sy_munge_t      *sy_arg_munge32; /* system call arguments munger for 32-bit process */
#endif
        int32_t         sy_return_type; /* system call return types */
        int16_t         sy_narg;        /* number of args */
        uint16_t        sy_arg_bytes;   /* Total size of arguments in bytes for
                                         * 32-bit system calls
                                         */
};
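The idea of comparing pointers with known ones can be illustrated with a userspace simulation. The sketch below is not kernel code: the two struct names, the SysentLayout enum and detect_layout are made up for this example. It assumes that the address of the handler for some system call with a non-zero index is already known, and it checks which of the two entry sizes puts that address in the sy_call field.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using sy_call_t = int(void *, void *, int *);

// Entry layout without the 32-bit argument munger (hypothetical name).
struct sysent_plain {
    sy_call_t *sy_call;
    int32_t    sy_return_type;
    int16_t    sy_narg;
    uint16_t   sy_arg_bytes;
};

// Entry layout with the extra sy_arg_munge32 field (hypothetical name).
struct sysent_munge {
    sy_call_t *sy_call;
    void      *sy_arg_munge32;
    int32_t    sy_return_type;
    int16_t    sy_narg;
    uint16_t   sy_arg_bytes;
};

enum class SysentLayout { Unknown, Plain, Munge };

// Returns true if entry `index` fits into the table for the given entry size
// and its first field (sy_call) equals `known_call`.
static bool entry_matches(const unsigned char *table, std::size_t table_bytes,
                          std::size_t stride, std::size_t index,
                          const void *known_call) {
    if ((index + 1) * stride > table_bytes) {
        return false;
    }
    const void *call = nullptr;
    std::memcpy(&call, table + index * stride, sizeof(call));
    return call == known_call;
}

// Guesses the entry layout of a raw table from one known handler address.
SysentLayout detect_layout(const unsigned char *table, std::size_t table_bytes,
                           std::size_t index, const void *known_call) {
    assert(table != nullptr && known_call != nullptr);
    assert(index > 0);  // entry 0 starts at offset 0 in both layouts
    const bool plain = entry_matches(table, table_bytes, sizeof(sysent_plain),
                                     index, known_call);
    const bool munge = entry_matches(table, table_bytes, sizeof(sysent_munge),
                                     index, known_call);
    if (plain == munge) {
        return SysentLayout::Unknown;  // no match, or an ambiguous one
    }
    return plain ? SysentLayout::Plain : SysentLayout::Munge;
}
```

In a real kext the table would be the sysent array found heuristically, and checking several known handlers instead of one would make the guess more reliable.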

Fortunately, in modern versions of macOS, Apple provides a new API for working with processes. The Endpoint Security API allows clients to authorize many kinds of requests directed at other processes. With it, you can block any signal sent to a process, including SIGKILL.

#include <bsm/libbsm.h>
#include <EndpointSecurity/EndpointSecurity.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, const char * argv[]) {
    es_client_t* cli = nullptr;
    {
        auto res = es_new_client(&cli, ^(es_client_t * client, const es_message_t * message) {
            switch (message->event_type) {
                case ES_EVENT_TYPE_AUTH_SIGNAL:
                {
                    auto& msg = message->event.signal;
                    auto target = msg.target;
                    auto& token = target->audit_token;
                    auto pid = audit_token_to_pid(token);
                    printf("signal '%d' sent to pid '%d'\n", msg.sig, pid);
                    // Deny signals aimed at this process, allow all others; the decision is not cached
                    es_respond_auth_result(client, message, pid == getpid() ? ES_AUTH_RESULT_DENY : ES_AUTH_RESULT_ALLOW, false);
                }
                    break;
                default:
                    break;
            }
        });
        if (res != ES_NEW_CLIENT_RESULT_SUCCESS) {
            // The client could not be created (e.g. not entitled, not privileged, not permitted)
            fprintf(stderr, "es_new_client failed: %d\n", (int)res);
            return 1;
        }
    }

    {
        es_event_type_t evs[] = { ES_EVENT_TYPE_AUTH_SIGNAL };
        if (es_subscribe(cli, evs, sizeof(evs) / sizeof(*evs)) != ES_RETURN_SUCCESS) {
            fprintf(stderr, "es_subscribe failed\n");
            es_delete_client(cli);
            return 1;
        }
    }

    printf("%d\n", getpid());
    sleep(60); // could be replaced with other waiting primitive

    es_unsubscribe_all(cli);
    es_delete_client(cli);

    return 0;
}

Similarly, you can register a MAC policy in the kernel that provides a signal protection hook (the proc_check_signal policy), but this API is not officially supported.

Kernel Extension Protection

In addition to protecting processes, the kernel extension (kext) itself also needs protection. macOS provides the IOKit framework for developing device drivers conveniently. Besides tools for working with devices, IOKit provides driver stacking through instances of C++ classes. A userspace application can “find” a registered instance of a class to establish a kernel-userspace connection.

The ioclasscount utility shows the number of class instances in the system.

my_kext_ioservice = 1
my_kext_iouserclient = 1

Any kernel extension that wants to register on the driver stack must declare a class inherited from IOService, my_kext_ioservice in this case. When a user application connects, a new instance of a class inherited from IOUserClient is created, my_kext_iouserclient in this example.

When the driver is being unloaded from the system (the kextunload command), the virtual function “bool terminate(IOOptionBits options)” is called. Returning false from terminate during an unload attempt is enough to make kextunload fail.

bool Kext::terminate(IOOptionBits options)
{

  if (!IsUnloadAllowed)
  {
    // Unload is not allowed, returning false
    return false;
  }

  return super::terminate(options);
}

The IsUnloadAllowed flag can be set by IOUserClient at boot. When unloading is restricted, the kextunload command produces the following output:

admin@admins-Mac drivermanager % sudo kextunload ./test.kext
Password:
(kernel) Can't remove kext my.kext.test; services failed to terminate - 0xe00002c7.
Failed to unload my.kext.test - (iokit/common) unsupported function.

Similar protection is needed for IOUserClient. Class instances can be unloaded with the IOKitLib userspace function “IOCatalogueTerminate(mach_port_t, uint32_t flag, io_name_t description);”. You can return false from “terminate” until the userspace application dies, that is, until the clientDied function is called.

File protection

To protect files, it is enough to use the Kauth API, which allows you to restrict access to files. Apple delivers notifications about various events in the scope to developers; the operations KAUTH_VNODE_DELETE, KAUTH_VNODE_WRITE_DATA and KAUTH_VNODE_DELETE_CHILD matter to us. The easiest way to restrict access is by path: we call the “vn_getpath” API to get the file path and compare its prefix. Note that, to optimize renaming of folders that contain files, the system does not authorize access to every file but only to the renamed folder itself. You therefore also need to compare the parent path and restrict KAUTH_VNODE_DELETE for it.
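The prefix comparison itself can be sketched in userspace as a simple linear scan. This is only an illustration: the function name is_protected_path and its parameters are assumptions, the path stands for what vn_getpath would return, and a real Kauth listener would work with C strings in the kernel rather than std::string.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns true if `path` lies under one of the protected prefixes.
// Every prefix is expected to end with '/'. For a directory, a trailing '/'
// is appended so that "/foo/bar/tmp" matches the prefix "/foo/bar/tmp/".
// The function name and parameters are hypothetical.
bool is_protected_path(std::string path, bool is_directory,
                       const std::vector<std::string> &prefixes) {
    assert(!path.empty() && path.front() == '/');  // absolute paths only

    if (is_directory && path.back() != '/') {
        path.push_back('/');
    }
    for (const auto &prefix : prefixes) {
        // Compare the first prefix.size() characters of the path with the prefix.
        if (path.compare(0, prefix.size(), prefix) == 0) {
            return true;
        }
    }
    return false;
}
```

Each call compares the path against every prefix in turn, so the cost grows with the number of prefixes.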

The disadvantage of this approach can be poor performance as the number of prefixes grows. To avoid a comparison cost of O(prefix * length), where prefix is the number of prefixes and length is the length of the string, you can use a deterministic finite automaton (DFA) built from the prefixes.

Consider a way to build a DFA for a given set of prefixes. We place a cursor at the beginning of each prefix. If all cursors point to the same character, we advance every cursor by one character and record that the common string has grown by one character. If two cursors point to different characters, we split the cursors into groups by the character they point to and repeat the algorithm for each group.

In the first case (all characters under the cursors are the same), we get a DFA state with a single transition along the common string. In the second case, we get a transition table of size 256 (the number of characters and hence the maximum number of groups) leading to the subsequent states, which are obtained by calling the function recursively.

Consider an example. For the set of prefixes (“/foo/bar/tmp/”, “/var/db/foo/”, “/foo/bar/aba/”, “foo/bar/aac/”) you can get the following DFA. The figure shows only the transitions that lead to other states; all other transitions do not lead to a final state.

When walking through the DFA states, three cases are possible.

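The final state was reached: the path is protected, so we restrict the operations KAUTH_VNODE_DELETE, KAUTH_VNODE_WRITE_DATA and KAUTH_VNODE_DELETE_CHILD.
The final state was not reached, but the path “ended” (the null terminator was reached): the path is a parent path, so KAUTH_VNODE_DELETE must be restricted. Note that if the vnode is a folder, a ‘/’ must be appended to the path; otherwise the restriction may be applied to the file “/foor/bar/t”, which is incorrect.
The final state was not reached and the path did not end: none of the prefixes matches the path, so no restrictions are applied.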
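The construction and the three outcomes of the walk can be sketched in userspace as follows. This is a minimal illustration rather than the actual implementation: the names State, Match, build and classify are made up, and the sketch assumes that the caller has already appended ‘/’ to directory paths, so a path that ends before a final state counts as a parent only if it ends with ‘/’.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

enum class Match { Protected, Parent, None };

struct State {
    std::string run;        // characters on which every prefix of the group agrees
    bool is_final = false;  // a protected prefix ends right after `run`
    std::array<std::unique_ptr<State>, 256> next{};  // one slot per byte value
};

// Builds the state for `group`; all members must agree on characters [0, pos).
std::unique_ptr<State> build(const std::vector<std::string> &group, std::size_t pos = 0) {
    assert(!group.empty());
    auto state = std::make_unique<State>();
    for (;;) {
        bool all_same = true;
        for (const auto &prefix : group) {
            if (prefix.size() == pos) {  // a prefix ends here: everything below is protected
                state->is_final = true;
                return state;
            }
            all_same = all_same && prefix[pos] == group.front()[pos];
        }
        if (!all_same) break;
        state->run.push_back(group.front()[pos]);  // first case: advance all cursors together
        ++pos;
    }
    // Second case: split the cursors into groups by the character under them.
    std::map<unsigned char, std::vector<std::string>> groups;
    for (const auto &prefix : group) {
        groups[static_cast<unsigned char>(prefix[pos])].push_back(prefix);
    }
    for (const auto &entry : groups) {
        state->next[entry.first] = build(entry.second, pos + 1);
    }
    return state;
}

// Walks the automaton along `path` and reports which of the three cases applies.
Match classify(const State &root, const std::string &path) {
    // A path that ends early is a parent only if it is a directory path.
    const Match ended = (!path.empty() && path.back() == '/') ? Match::Parent : Match::None;
    const State *state = &root;
    std::size_t i = 0;
    for (;;) {
        for (char expected : state->run) {
            if (i == path.size()) return ended;
            if (path[i] != expected) return Match::None;
            ++i;
        }
        if (state->is_final) return Match::Protected;
        if (i == path.size()) return ended;
        const State *next = state->next[static_cast<unsigned char>(path[i])].get();
        if (next == nullptr) return Match::None;
        state = next;
        ++i;
    }
}
```

A lookup touches each character of the path at most once, regardless of how many prefixes are protected.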

The aim of the security solutions we develop is to raise the level of security of users and their data. On the one hand, this goal is served by developing Acronis software products that cover vulnerabilities where the operating system itself is “weak”. On the other hand, we should not neglect strengthening the security aspects that can be improved on the OS side, especially since closing such vulnerabilities increases our own stability as a product. The vulnerability was reported to the Apple Product Security Team and was fixed in macOS 10.14.5 (https://support.apple.com/en-gb/HT210119).

All of this is possible only if your utility has been officially installed in the kernel; there are no such loopholes for external and unwanted software. However, as you can see, even protecting legitimate programs such as antivirus and backup systems takes hard work. But now, new Acronis products for macOS will have additional protection against being unloaded from the system.
