Linux Malware: Defense Evasion Techniques

Introduction #

This post covers some of the techniques that can be employed to avoid malware detection on a Linux host.

After a successful compromise of a host in a network, an implant (or agent), which is malicious software, is installed to provide backdoor access or to exfiltrate data. A malware author needs to make sure that the malware remains stealthy, leaving little to no IOCs that could lead to detection. This blog will therefore look at some of the techniques that can be employed for this purpose.

The following source code snippet is the skeleton of the example C code that will be used to showcase an implementation of the techniques.

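#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <signal.h>
#include <dirent.h>
#include <fcntl.h>
#include <time.h>
#include <sys/time.h> //utimes
#include <sys/mman.h> //mprotect


#define SIZE 1048

extern char __executable_start; //linker-provided symbol; its address is the image base

//Forward declarations of the techniques implemented in the sections below.
void time_check(void);
void time_stomp(char *filename);
void masquerade(char *args);
void check_for_debugger(void);
void detect_breakpoint(void);
void vm_check(void);

//This is called when the debugger is detected.
void exit_func(void){
	puts("Debugger Detected");
	_exit(-1);
}

int main(int argc, char **argv){
	(void)argc;
	time_check(); //arm the 120 second timer
	time_stomp(argv[0]); //spoof the access/modification times of the malware on disk (needs the real path, so it runs before masquerade)
	masquerade(argv[0]); //Spoof a running process
	check_for_debugger(); //check for debugging
	detect_breakpoint(); //Look for a software breakpoint (mainly used during debugging)
	vm_check(); //check if we are running in a sand-box (VM)
	return 0;
}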

Time Stomping #

This is a technique that can be used to avoid detection by modifying the timestamps of the malware on disk. This is useful when bypassing filters that look for recently modified or added files. The following code snippet sets the access and modification times to Thursday, October 30, 2008 7:17:16 PM (GMT), which is a very special date (my birthday anniversary).

This is done using the utimes system call, whose signature is as follows:

int utimes(const char *filename, const struct timeval times[2]);


void time_stomp(char *filename){
	struct timeval tv[2];
	
	tv[0].tv_sec = 1225394236; //access time
	tv[0].tv_usec = 0;
	
	tv[1].tv_sec = 1225394236; //modification time
	tv[1].tv_usec = 0;
	
	if(!utimes(filename, tv))
		fprintf(stdout, "[+] TimeStomped!\n");
}

The utimes function takes an array of two timeval structures: times[0] sets the access time (atime) and times[1] sets the modification time (mtime). The inode change time (ctime) cannot be set this way; the kernel updates it to the current time when the call is made.
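To see which timestamps utimes actually touches, the result can be read back with stat(). The following is a minimal sketch; the helper name stomp_and_stat is made up for illustration and is not part of the original code.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/time.h>

/* Hypothetical helper: set atime/mtime of `path` to `when`, then stat() the
 * file into `out`. Returns 0 on success, -1 on error. */
int stomp_and_stat(const char *path, time_t when, struct stat *out){
	struct timeval tv[2] = {
		{ .tv_sec = when, .tv_usec = 0 }, /* tv[0]: access time */
		{ .tv_sec = when, .tv_usec = 0 }, /* tv[1]: modification time */
	};

	if (utimes(path, tv) != 0)
		return -1;
	return stat(path, out);
}
```

After the call, st_atime and st_mtime hold the spoofed value while st_ctime holds the time of the call, so a ctime that is much newer than the mtime is one of the traces this technique leaves behind.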

Process Name Masquerading (PID-Spoofing) #

To avoid detection when an analyst or user lists running processes, e.g. with ps aux, we can masquerade as a legitimate Linux daemon process. This is possible because the process name displayed when listing processes is the string at argv[0], which is passed as an argument to the entry function main(). Therefore, modifying this value on execution will do the job perfectly.

void masquerade(char *args){
        const char *name = "ffs-it-worked"; //?? Use a real Linux daemon process name e.g [kthreadd]
        size_t room = strlen(args); //never write past the original argv[0]

        memset(args, 0, room);
        strncpy(args, name, room); //a longer name is truncated; the original terminator stays in place
        fprintf(stdout, "[+] %d is now spoofed!\n", getpid());

        getchar(); //Pause execution; check processes with 'ps -aux'
}

The above code snippet copies the new process name over argv[0], so the process shows up under the spoofed name. The copy is limited to the length of the original argv[0], because the argument strings sit back to back in memory and a longer write would clobber the following arguments. A list of legitimate Linux process names can also be declared, with one chosen at random at the beginning of each execution.
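The effect can be checked from inside the process, because ps builds its command column from /proc/<PID>/cmdline, which is backed by the same memory as argv. A small sketch follows; the helper name read_own_cmdline is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the first entry of /proc/self/cmdline (the
 * argv[0] string that ps displays) into buf.
 * Returns 0 on success, -1 on error. */
int read_own_cmdline(char *buf, size_t size){
	FILE *f = fopen("/proc/self/cmdline", "r");
	if (f == NULL || size == 0) {
		if (f)
			fclose(f);
		return -1;
	}
	size_t n = fread(buf, 1, size - 1, f);
	fclose(f);
	buf[n] = '\0'; /* entries are NUL-separated, so as a C string buf ends at argv[0] */
	return n > 0 ? 0 : -1;
}
```

Calling it before and after masquerade() should show the original path first and the spoofed name afterwards.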

A screenshot to showcase a spoofed process

The code snippet provided in my GitHub here is an example of C code that enumerates all the running processes and, after each cycle of about an hour, spoofs a different one.

Analysis Evasion #

When malware is detected in a host or network, it tends to be dissected and reverse engineered with the aim of understanding how it works and the impact it has had, or may have, on an organization. Malware authors will therefore implement techniques to prevent static and dynamic analysis in case the malware is caught (a fail-safe).

Static Analysis #

There are many ways to hinder static analysis; the following are some of the techniques that can be implemented.

When a binary is stripped, either with the strip utility or by passing -s to the compiler at link time, the symbol table is removed, which prevents an analyst from recovering the names used for functions.
Obfuscation of strings, e.g. keys, passwords and shellcode, is necessary in preventing static analysis. There are different obfuscation techniques that can be used:
- IpFuscation: This is where data in the malware is obfuscated to look like IP addresses.
- MaCFuscation: This is where data in the malware is obfuscated to mimic MAC address strings.
When strings are declared as local or global variables, they should not be stored as one contiguous literal ("stacked together"). The following code snippet shows the two styles. Note that this typically only makes a difference for non-const local arrays, which the compiler fills in at run time; a const global initialised either way ends up as the same bytes in .rodata.

const char xor_key_chars[] = {'m', 'a', 'l', 'w', 'a', 'r', 'e'}; //not stacked: one character at a time, no terminating NUL
const char xor_key_literal[] = "malware"; //stacked: one contiguous string literal

Encryption, e.g. RC4, AES or XOR, can also be used to encrypt strings used in the malware.
The last option is to avoid strings altogether where possible. For example, suppose an implant connects to a specific IP; this can be represented as follows:

const char *ip_str = "127.0.0.1"; //This is easily detectable via static analysis
unsigned long ip_addr = 0x100007f; //The same IP as an integer (network byte order on a little-endian host), not easily detectable via static analysis
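As a rough sketch of how such an integer is used, the helper below (addr_from_u32, a name made up for this example) fills a sockaddr_in directly from a 32-bit value. The constant 0x100007f only equals 127.0.0.1 on a little-endian machine; writing htonl(0x7f000001) gives the same address portably.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a sockaddr_in from an IPv4 address that is
 * already in network byte order, so no dotted-quad string is needed. */
struct sockaddr_in addr_from_u32(uint32_t ip_be, uint16_t port){
	struct sockaddr_in sa;

	memset(&sa, 0, sizeof(sa));
	sa.sin_family = AF_INET;
	sa.sin_port = htons(port);   /* port is given in host byte order */
	sa.sin_addr.s_addr = ip_be;  /* stored as-is: already network byte order */
	return sa;
}
```

For example, addr_from_u32(htonl(0x7f000001), 4444) yields an address that inet_ntop() formats as 127.0.0.1.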

You can read more on these techniques in the references below.
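Of the encryption options listed above, XOR is the simplest to illustrate. The sketch below assumes a repeating-key XOR; the function name xor_buf is hypothetical, and this is an illustration of the idea rather than the scheme used by any particular sample.

```c
#include <stddef.h>

/* XOR `len` bytes of buf in place with a repeating key.
 * Applying it twice with the same key restores the original bytes. */
void xor_buf(unsigned char *buf, size_t len, const unsigned char *key, size_t key_len){
	if (key_len == 0)
		return; /* nothing to do, and avoids a modulo by zero */
	for (size_t i = 0; i < len; i++)
		buf[i] ^= key[i % key_len];
}
```

The string is stored in its XORed form and decoded just before use, so it never appears in plain text in the file on disk.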

Dynamic Analysis #

For a Reverse Engineer to understand how a malware works, the usual approach is to run it in an isolated environment, e.g. inside a VM, and use a debugger for analysis. Malware authors will therefore employ various anti-debugging techniques to try and make dynamic analysis as painful as possible.

Ptrace: Self-Debugging #

Its man page (man ptrace) describes ptrace as a system call by which one process (the tracer) may observe and control the execution of another process (the tracee); it is primarily used to implement breakpoint debugging and system call tracing.

This is the system call that is used by debuggers and other tracers like ltrace and strace. This syscall has one major limitation that can be of an advantage to a malware author.

Only one tracer can control a tracee at a time.

This means that while a tracer is tracing a process, no other tracer can be attached, which can be used to evade the debugger. A malware can therefore spawn a child process or a thread on start-up for self-debugging. When a debugger is already attached, self-debugging fails because of the limitation mentioned above, and that failure is an indication that a debugger is present.
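The simplest form of this check does not even need a child process: the process asks to be traced by its own parent with PTRACE_TRACEME, which fails if a tracer is already attached. This is a minimal sketch of that variant, not the child-process approach described above, and the function name already_traced is made up for the example.

```c
#include <sys/ptrace.h>

/* Returns 1 if the PTRACE_TRACEME request is refused (typically because a
 * tracer is already attached), 0 otherwise.
 * Side effect: on success the process becomes a tracee of its parent, so a
 * second call would fail; call it only once. */
int already_traced(void){
	return ptrace(PTRACE_TRACEME, 0, 0, 0) == -1;
}
```

Running the program under gdb or strace should make the function return 1, while a normal run returns 0.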

The Reverse Engineer can however patch out the call to ptrace() or modify its return value so that it reports success (0) even when self-debugging fails. The following are some of the approaches that can be used to mitigate this, although they can still be bypassed:

- Code patching detection.
- POSIX timer for self-debugging. The code can be found on my GitHub here

Timing Analysis #

During debugging, an analyst usually spends some time reading the assembly code and registers, or generally trying to understand the workings of the malware in depth. Timing analysis is a technique that can be used to terminate the malware or change its behaviour once a given amount of time has elapsed since its execution started.

The following source code snippet raises a SIGALRM signal after 120 seconds have elapsed. If the process is still alive at that point, for example because an analyst is single-stepping or sitting on a breakpoint, the malware terminates.

//Signal handlers take the signal number; forward to the skeleton's exit_func.
static void on_alarm(int sig){
        (void)sig;
        exit_func();
}

void time_check(void){
        struct sigevent event = {0};
        struct itimerspec timer = {0}; //zeroed: one-shot timer, no interval
        struct sigaction act = {0};

        timer_t tid;

        act.sa_handler = on_alarm;
        sigemptyset(&act.sa_mask);

        sigaction(SIGALRM, &act, NULL);

        event.sigev_notify = SIGEV_SIGNAL;
        event.sigev_signo = SIGALRM;

        if((timer_create(CLOCK_REALTIME, &event, &tid)) < 0)
                _exit(-1);

        timer.it_value.tv_sec = 120; //terminate after 120 seconds

        if((timer_settime(tid, 0, &timer, NULL)) < 0)
                _exit(-1);
}

on_alarm is the signal handler for the raised signal; here it calls exit_func, but it can be used to terminate or change the behaviour of the malware. The time check should also run before the execution of the main function, for example from a function marked with GCC's __attribute__((constructor)). On older glibc versions, timer_create requires linking with -lrt.
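As a minimal sketch of the constructor mechanism (a GCC/Clang extension), the function below runs before main() without being called explicitly. The names early_check and early_check_done are made up for this example; in the post's code, the body would call time_check().

```c
static int early_check_ran = 0;

/* Functions marked with the constructor attribute are run by the C runtime
 * before main() is entered. */
__attribute__((constructor))
static void early_check(void){
	early_check_ran = 1; /* placeholder for the call to time_check() */
}

/* Lets the rest of the program see that the constructor already ran. */
int early_check_done(void){
	return early_check_ran;
}
```

By the time main() starts, early_check_done() already returns 1, so a breakpoint on main comes too late to stop the timer from being armed.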

The timeout chosen should be longer than the time the process normally takes to complete its execution.

BreakPoint Detection #

A mechanism can be employed to look for the INT3 software breakpoint instruction (opcode 0xcc) in memory to determine the presence of a debugger. When a software breakpoint is set, the debugger overwrites the first byte at the target address with 0xcc. This is explained in much detail here

The following code snippet can therefore be used to search for the breakpoint instruction in memory.

void detect_breakpoint(void){
	long pageSize = sysconf(_SC_PAGESIZE);
	//start of the page that holds this function
	uintptr_t page = (uintptr_t) detect_breakpoint & ~((uintptr_t) pageSize - 1);

	mprotect((void *) page, pageSize, PROT_READ|PROT_EXEC); //make sure the page is readable

	//build 0xcc at run time so the byte does not appear as an immediate in .text
	volatile unsigned char half = 0x66;
	unsigned char int3 = (unsigned char)(half << 1);

	unsigned char *mem = (unsigned char *) page;

	for(long i = 0; i < pageSize; i++){
		if (mem[i] == int3){ fprintf(stderr, "BreakPoint detected!\n"); _exit(-1); }
	}
}

A breakpoint will almost certainly be set in the .text section, so the 0xcc pattern is built at run time from a value kept on the stack; that way the search pattern itself does not show up in the code being scanned.

The above function gets the page size dynamically, makes sure the page is readable and traverses it looking for a breakpoint, since the functions of a small ELF binary tend to sit in the same page in memory. Because 0xcc can also occur legitimately inside an instruction's operand or in padding, a hit is a hint rather than proof.

Check for Debug Processes #

When a process is started from the debugger, the debugger will be the parent process. The parent of a Linux process can be looked up from /proc/self/stat or /proc/self/status.

/proc/self/status has a PPid value, which is the PID of the parent process. The parent process name can then be obtained from /proc/<PID>/cmdline and checked against some of the common Linux debuggers. The /proc/<PID>/exe symlink can also be resolved to find the path of the executable that is running. The following C source code snippet can be used for this implementation.

void check_for_debugger(void){
	int fd = open("/proc/self/status", O_RDONLY);
	if(fd < 0){ _exit(-1); }

	char buffer[SIZE];
	ssize_t n = read(fd, buffer, sizeof(buffer) - 1);
	close(fd);
	if(n <= 0) _exit(-1);
	buffer[n] = 0x0; //read() does not NUL-terminate
	
	char *needle = strstr(buffer, "PPid:");
	
	if(needle == NULL) _exit(-1);
	
	//atoi skips the tab after the label and stops at the newline
	int ppid = atoi(needle + strlen("PPid:"));
	char filename[SIZE], procs[SIZE];
	
	snprintf(procs, sizeof(procs), "/proc/%d/exe", ppid);
	ssize_t len = readlink(procs, filename, sizeof(filename) - 1);
	if(len < 0) return; //the parent's exe link is not readable, e.g. it belongs to another user
	filename[len] = 0x0; //readlink() does not NUL-terminate either

	char *debugs[] = {"gdb", "radare", "r2"}; //add more debuggers
	
	for(size_t i = 0; i < sizeof(debugs)/sizeof(debugs[0]); i++){
		if((strstr(filename, debugs[i])) == NULL)
			continue;
		fprintf(stdout, "Debugger Found!\n");
		_exit(-1);
	}
	
}

The above source code snippet will basically read the PPID value from /proc/self/status and resolve the symlink for that parent process. This will then be compared against an array of debugger names to determine if the process is being debugged.

The same can be performed by reading the parent process ID from /proc/self/stat, where it is the fourth field.
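A minimal sketch of the /proc/self/stat variant is shown below; the function name ppid_from_stat is made up for this example. The second field (the command name) is wrapped in parentheses and may itself contain spaces, so the parser skips to the last ')' before reading the state and the PPID.

```c
#include <stdio.h>
#include <string.h>

/* Returns the parent PID parsed from /proc/self/stat, or -1 on error. */
int ppid_from_stat(void){
	char buf[1024];
	FILE *f = fopen("/proc/self/stat", "r");
	if (f == NULL)
		return -1;
	size_t n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';

	/* Layout: pid (comm) state ppid ... ; comm may contain spaces or ')' */
	char *p = strrchr(buf, ')');
	if (p == NULL)
		return -1;

	char state;
	int ppid;
	if (sscanf(p + 1, " %c %d", &state, &ppid) != 2)
		return -1;
	return ppid;
}
```

The returned value should match getppid() and can be fed into the same /proc/<PID>/exe lookup used above.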

The TracerPid value from /proc/self/status can also be checked to determine if a process is being debugged; it holds the PID of the tracing process, so any non-zero value indicates a debugger (or another tracer such as strace) and 0 indicates none. This implements almost the same logic as the Windows IsDebuggerPresent.
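Reading TracerPid could look like the sketch below; the function name tracer_pid is hypothetical.

```c
#include <stdio.h>

/* Returns the PID of the process tracing us (0 if there is none),
 * or -1 if the value could not be read. */
int tracer_pid(void){
	FILE *f = fopen("/proc/self/status", "r");
	if (f == NULL)
		return -1;

	char line[256];
	int pid = -1;
	while (fgets(line, sizeof(line), f) != NULL) {
		/* the line looks like "TracerPid:\t0"; the space in the format
		 * matches the tab */
		if (sscanf(line, "TracerPid: %d", &pid) == 1)
			break;
	}
	fclose(f);
	return pid;
}
```

A caller would treat any value greater than 0 as "being traced".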

SandBox/VM Evasion #

These are techniques that can be employed to detect the kind of environment the malware is running on. Based on certain heuristics, the malware can change behaviour, delete itself or terminate on execution when it detects that it is running in a sand-boxed environment or a VM.

The following are the various ‘features’ that can be checked to determine if a malware is running in a VM.

Drivers

There are specific kernel drivers that are loaded into the guest kernel when the VM is started. For VirtualBox, for example, the guest drivers create the following device nodes:

/dev/vboxguest
/dev/vboxuser

PCI Devices

Some virtualization technologies expose their own Peripheral Component Interconnect (PCI) devices to the guest as a way to share some hardware resources with the host operating system. The following shell command, for example, can be used as an indicator that we are inside VirtualBox.

lspci -d 80ee:cafe

80ee is the vendor ID (in this case VirtualBox) while cafe is the device ID. The vendor IDs can be looked up from here.
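The same lookup can be done without spawning lspci, since sysfs exposes each device's vendor ID in /sys/bus/pci/devices/<address>/vendor. The following is a sketch under that assumption; the function name has_pci_vendor is made up for this example.

```c
#include <dirent.h>
#include <stdio.h>

/* Returns 1 if a PCI device with the given vendor ID exists, 0 if not,
 * and -1 if /sys/bus/pci/devices cannot be opened. */
int has_pci_vendor(unsigned int vendor){
	DIR *dir = opendir("/sys/bus/pci/devices");
	if (dir == NULL)
		return -1;

	struct dirent *e;
	int found = 0;
	while (!found && (e = readdir(dir)) != NULL) {
		if (e->d_name[0] == '.')
			continue; /* skip "." and ".." */

		char path[512];
		snprintf(path, sizeof(path), "/sys/bus/pci/devices/%s/vendor", e->d_name);

		FILE *f = fopen(path, "r");
		if (f == NULL)
			continue;
		unsigned int id = 0;
		if (fscanf(f, "%x", &id) == 1 && id == vendor) /* file holds e.g. "0x80ee" */
			found = 1;
		fclose(f);
	}
	closedir(dir);
	return found;
}
```

has_pci_vendor(0x80ee) would then be the equivalent of checking the vendor part of the lspci filter above.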

The following is the C code snippet that can be used to check if we are indeed in a VM.


void vm_check(void){
        DIR *dir = opendir("/dev");
        char *drivers[] = {"vboxuser", "vboxguest"}; //add other VM drivers 

        if (!dir)
                _exit(-1); 
        struct dirent *e;

        //compare every entry in /dev against the known guest device names
        while((e=readdir(dir)) != NULL){
                for(size_t i = 0; i < sizeof(drivers)/sizeof(drivers[0]); i++)
                        if(strcmp(drivers[i], e->d_name) == 0){fprintf(stderr, "VM driver found!\n"); _exit(-1); }
        }	
        closedir(dir);
}

The above source code snippet looks for unusual device nodes that exist in guest machines to detect the presence of a VM.

The following are some of the other things that can be checked.

Kernel logs: The kernel log (dmesg) can contain information that may be useful in determining if we are running in a VM.
Shared libraries: /usr/lib/*
Kernel modules: /lib/modules/5.16.0-kali7-amd64/modules.builtin (the version directory depends on the running kernel).
Services and processes: There can be specific processes and services that can be checked to detect the kind of environment that we are in.
hostnamectl: The hostname can also be an indicator of the type of environment we are on.
Disk size: Most VMs are allocated smaller disks than a real Linux host.

There will be a later detailed post on Linux sandbox/VM/Container detection, keep watch.

Honorable Mentions #

Fileless Execution - A dropper can be installed to run a second staged payload using memfd_create() and fexecve() syscalls.

Hidden Artifacts - A malware may copy itself into the /dev/shm directory so that it lives only in the host’s RAM, therefore leaving no traces on disk after a reboot.

References #

https://attack.mitre.org/tactics/TA0005/