INTRODUCTION
Why write this?
Lately, I’ve settled into a pretty good set of web enumeration techniques. As I’ve been writing more and more walkthroughs, it has started to feel quite repetitive writing about the same web enumeration strategy every time. Why not skip that whole song-and-dance, and write it on a separate page that I can link to?
Most importantly, I like having a convenient reference for myself: it really speeds up recon on a box when you have exactly the right enumeration commands at your fingertips. This way, I don’t have to spend any time formulating the perfect ffuf command; I can just copy-paste from my previous work.
Disclaimer
There are many, many ways to do web enumeration. I like the techniques below because they are versatile and don’t pigeonhole you into a particular set of tools. Many professionals prefer to simply use the paid version of Burp Suite, accomplishing all of these tasks within one piece of software.
NMAP SCANS
Good recon always begins with running some nmap scans. To prevent nmap from falling back to a basic “connect” scan, all of these should be run as root. OS detection (-O) also requires root.
Burp proxy
Sometimes you’ll want Burp to build a site map as you enumerate. If you want your results to appear in Burp, the method depends on which tool you’re using:
- For Gobuster: check out this article on how to set up Burp itself to act as the proxy for your target. With this method, you’ll point gobuster at your Burp proxy with the -u flag, and have Burp relay all requests to your target.
- For Ffuf: just use -replay-proxy http://127.0.0.1:8080, and make sure Burp’s proxy listener is running (leave Intercept off so the replayed requests flow straight into the site map).
- For Feroxbuster: simply use the --burp switch. (Both of these are sketched below.)
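A minimal sketch of the Ffuf and Feroxbuster variants, assuming Burp’s proxy listener is on its default 127.0.0.1:8080 and that $WLIST and $DOMAIN are set as in the sections below:
# Ffuf: only matched results get replayed through Burp, keeping the site map clean
ffuf -w $WLIST:FUZZ -u http://$DOMAIN/FUZZ -replay-proxy http://127.0.0.1:8080 -c -v
# Feroxbuster: --burp is shorthand for proxying everything through 127.0.0.1:8080
feroxbuster -w $WLIST -u http://$DOMAIN --burp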
Port scan
I always start with a simple but broad port scan. Some people prefer to use timing flags like -T4 instead of specifying a --min-rate.
sudo nmap -p- -O --min-rate 1000 -oN nmap/port-scan-tcp.txt $RADDR
Here’s a sample of the output to expect:
Starting Nmap 7.94 ( https://nmap.org ) at 2023-08-17 12:00 IDT
Nmap scan report for 10.10.11.214
Host is up (0.24s latency).
Not shown: 65534 open|filtered udp ports (no-response), 65533 filtered tcp ports (no-response)
PORT STATE SERVICE
22/tcp open ssh
50051/tcp open unknown
Warning: OSScan results may be unreliable because we could not find at least 1 open and 1 closed port
Aggressive OS guesses: Linux 5.0 (99%), Linux 4.15 - 5.8 (94%), Linux 5.0 - 5.4 (94%), Linux 5.3 - 5.4 (94%), Linux 2.6.32 (94%), Linux 5.0 - 5.5 (93%), ASUS RT-N56U WAP (Linux 3.4) (93%), Linux 3.1 (93%), Linux 3.16 (93%), Linux 3.2 (93%)
No exact OS matches for host (test conditions non-ideal).
Network Distance: 2 hops
You can often ignore the OS detection. The most important part is the tabular data outlining which TCP ports are listening.
Script scan
The above scan puts its results into nmap/port-scan-tcp.txt. Next, we’ll take the ports that we found and perform a script scan on them. The following will run the default scripts only:
# Make a comma-separated list of the discovered TCP ports
TCPPORTS=`grep "^[0-9]\+/tcp" nmap/port-scan-tcp.txt | sed 's/^\([0-9]\+\)\/tcp.*/\1/g' | tr '\n' ',' | sed 's/,$//g'`
# Run the script scan using those ports only
sudo nmap -sV -sC -n -Pn -p$TCPPORTS -oN nmap/script-scan-tcp.txt $RADDR
Here’s a sample of what to expect:
PORT STATE SERVICE VERSION
22/tcp open ssh OpenSSH 8.9p1 Ubuntu 3ubuntu0.4 (Ubuntu Linux; protocol 2.0)
| ssh-hostkey:
| 256 96:07:1c:c6:77:3e:07:a0:cc:6f:24:19:74:4d:57:0b (ECDSA)
|_ 256 0b:a4:c0:cf:e2:3b:95:ae:f6:f5:df:7d:0c:88:d6:ce (ED25519)
80/tcp open http Apache httpd 2.4.52
|_http-title: Did not follow redirect to http://ermagerd-mern.htb/
|_http-server-header: Apache/2.4.52 (Ubuntu)
3000/tcp open http Node.js Express framework
|_http-title: Ermagerd-MERN
Service Info: Host: codify.htb; OS: Linux; CPE: cpe:/o:linux:linux_kernel
☝️ To find out exactly which scripts are “default”, try this:
grep -El 'categories = {.*default.*}' /usr/share/nmap/scripts/*.nse | xargs -I {} basename {}
Vulnerability scan
Provided that there is no web application firewall running, and no kind of intrusion detection in the way, it can be an excellent idea to get nmap to scan for known common vulnerabilities. Note that this one can take quite a bit of time:
sudo nmap -n -Pn -p$TCPPORTS -oN nmap/vuln-scan-tcp.txt --script 'safe and vuln' $RADDR
If you enjoy living dangerously, you may want to use 'vuln' instead of just 'safe and vuln' 😼
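For reference, the riskier variant is the same command with the script expression swapped out:
sudo nmap -n -Pn -p$TCPPORTS -oN nmap/vuln-scan-tcp.txt --script 'vuln' $RADDR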
UDP scan
It’s also good practice to do a UDP port scan. But it’s better to limit this one to “normal” ports. Like the vuln scan, this one can take quite a bit of time:
sudo nmap -sUV -T4 -F --version-intensity 0 -oN nmap/port-scan-udp.txt $RADDR
☝️ Note that any open|filtered ports are either open or (much more likely) filtered.
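If you want a firmer verdict on a specific open|filtered port, a targeted follow-up with version detection and a relevant NSE script helps. A sketch, assuming SNMP turned up on 161/udp:
# Hypothetical follow-up: confirm a suspected SNMP service on 161/udp
sudo nmap -sUV -p 161 --script snmp-info $RADDR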
ENUMERATION
Record the Domain
If the nmap scans show a server on port 80 or 443, I’ll go through the webserver strategy. First, I add the domain of the target to /etc/hosts. Usually, the domain of the target is shown in the nmap script scan. For example:
80/tcp open http Apache httpd 2.4.56
|_http-title: Did not follow redirect to http://ermagerd-mern.htb/
|_http-server-header: Apache/2.4.56 (Debian)
DOMAIN=ermagerd-mern.htb
echo "$RADDR $DOMAIN" | sudo tee -a /etc/hosts
☝️ I use tee -a instead of the append operator >> so that I don’t accidentally blow away my /etc/hosts file with a typo of > when I meant to write >>.
Banner grabbing
Next, I do banner-grabbing on the target. This sometimes reveals a subdomain by insisting on a redirect. More often than not, though, it shows what type of webserver is running (Apache, nginx, IIS, etc.):
whatweb $RADDR && curl -IL http://$RADDR
VHost scan
It’s very important that you perform subdomain and/or vhost enumeration before doing directory enumeration. While it is more likely you’ll find an important result using directory enumeration, the potential cost of not exploring subdomains is much, much higher. Save yourself the time, and find those vhosts/subdomains first!
My preferred way, using Ffuf:
WLIST="/usr/share/seclists/Discovery/DNS/bitquark-subdomains-top100000.txt"
ffuf -w $WLIST -u http://$RADDR/ -H "Host: FUZZ.htb" -c -t 60 -o fuzzing/vhost-root.md -of md -timeout 4 -ic -ac -v
Or here’s a similar way to do it with Gobuster:
WLIST="/usr/share/seclists/Discovery/DNS/bitquark-subdomains-top100000.txt"
gobuster vhost -w $WLIST -u http://$RADDR \
--random-agent -t 10 --timeout 5s \
--output "fuzzing/vhost-gobuster-root.txt" \
--no-error
This is a traditional vhost search using ffuf. If it returns anything, it will show which domains might be listening over HTTP. Be sure to adjust these commands if you know HTTPS is running: gobuster needs the -k flag (just like curl) to skip certificate validation, while ffuf skips it by default.
Subdomain scan
On HTB, there’s a really good chance that the above enumeration will yield no results. However, it leads into the next one, which does often yield a result:
ffuf -w $WLIST -u http://$RADDR/ -H "Host: FUZZ.boxname.htb" -c -t 60 -o fuzzing/subdomain-boxname.md -of md -timeout 4 -ic -ac -v
While this is actually still a vhost scan, this method will find subdomains of the primary domain discovered from the last scan: for example, shop.boxname.htb or registration.boxname.htb.
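☝️ Remember that any subdomain you discover this way also needs an /etc/hosts entry before you can browse or fuzz it. Following the same tee pattern as before (shop is a hypothetical example):
echo "$RADDR shop.$DOMAIN" | sudo tee -a /etc/hosts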
I find that a vhost scan turns up the same results faster than a regular DNS-based subdomain scan. If you insist on doing a regular subdomain scan, try this:
ffuf -w $WLIST:FUZZ -u http://FUZZ.$DOMAIN/ -t 80 -c -o fuzzing/subdomain-ffuf-$DOMAIN -of json -timeout 4 -v
Directory enumeration
Now comes the big one. Once vhost and subdomain enumeration have been performed, the next step is to perform directory enumeration on every vhost and subdomain that was found. Here’s how I do it with ffuf:
WLIST="/usr/share/seclists/Discovery/Web-Content/raft-small-words-lowercase.txt"
ffuf -w $WLIST:FUZZ -u http://$DOMAIN/FUZZ \
-t 80 --recursion --recursion-depth 2 -c \
-o fuzzing/directory-ffuf-$DOMAIN -of json \
-e php,asp,js,html -timeout 4 -v
Here’s a really similar way to do it with Gobuster:
WLIST="/usr/share/seclists/Discovery/Web-Content/raft-small-words-lowercase.txt"
gobuster dir -w $WLIST -u http://$DOMAIN \
--random-agent -t 10 --timeout 5s -f -e \
--status-codes-blacklist 400,401,402,403,404,405 \
--output "fuzzing/directory-gobuster-$DOMAIN.txt" \
--no-error
And if you really want to go nuts, try Feroxbuster with all the bells and whistles:
WLIST="/usr/share/seclists/Discovery/Web-Content/raft-small-words-lowercase.txt"
feroxbuster -w $WLIST -u http://$DOMAIN \
-A -d 1 -t 100 -T 4 --burp -f --auto-tune \
--collect-words --filter-status 400,401,402,403,404,405 \
--output "fuzzing/directory-feroxbuster-$DOMAIN.json"
☝️ The above Gobuster command will find directories and pages without extensions. The ffuf and feroxbuster commands will find directories and files.
API Fuzzing
Sometimes you’ll come across an API and want to know what type of HTTP requests it’s listening for. Thankfully, the set of HTTP verbs is small, making this a quick scan:
WLIST=/usr/share/seclists/Fuzzing/http-request-methods.txt
ffuf -w $WLIST:FUZZ -u https://boxname.htb/path/to/api -X FUZZ -t 80 -c -timeout 4 -v -mc all
☝️ If the results seem overwhelming, then this might not have been a useful test. In that case, it’s usually fine to simply default to using only GET and POST. Even then, POST-based APIs are far more common.
Once you’ve discovered the set of HTTP verbs that an API might be listening for, an easy way to incorporate this info is a simple bash loop:
# This is a good wordlist for fuzzing API actions
WLIST=/usr/share/seclists/Discovery/Web-Content/api/api-endpoints-res.txt
for METHOD in GET POST PUT; do
ffuf -w $WLIST:FUZZ -u https://boxname.htb/path/to/api/FUZZ \
-X $METHOD -t 80 -c -timeout 4 -v -mc all -fc 404;
done
Exploring the Website
The two most important first steps while exploring a website are:
- Fingerprint the website. What open source software does it use? What type of server is running?
- Identify restricted resources. These are usually indicated by a login page or some other form of authentication (one way to surface them is sketched just below this list).
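If you kept the JSON output from the earlier ffuf runs, jq is a quick way to pull out candidate restricted resources. A minimal sketch, assuming ffuf’s JSON output format (a top-level results array with url and status fields):
# List every fuzzed URL that answered 401 (unauthorized) or 403 (forbidden)
jq -r '.results[] | select(.status == 401 or .status == 403) | .url' fuzzing/directory-ffuf-$DOMAIN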
Fingerprinting
As previously mentioned, do some banner-grabbing. As a quick follow-up to this, check what Wappalyzer has to say. This info (almost) never leads directly to a vulnerability, but it will let you know what you’re dealing with.
Next, I look for the page footer, or some kind of “about”, “credits”, or “licenses” page. Under the terms of most open source licenses, the website must declare what software it uses. Even if the website itself is closed-source, this can give you a very good lead on the supply chain behind the website. Often, this will even indicate software versions! You can save yourself a lot of time by doing a little OSINT on the software components of your target.
Once you’ve identified the names of software components and their version numbers, head straight to Google and start searching for “softwarename version x.y.z vulnerability CVE PoC” for a quick win.
Often, vulnerability disclosures, patch notes, and infosec blog articles are extremely formulaic. Get a feel for how you can rapidly search for the nuggets of information that will lead you to an actionable exploit.
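Google isn’t the only option: if you have the exploitdb package installed, searchsploit performs the same kind of lookup against a local copy of Exploit-DB. The product and version here are placeholders borrowed from the sample scan output above:
# Search a local copy of Exploit-DB for a known product/version
searchsploit apache 2.4.52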
Checking for a Path Traversal
LFI=/usr/share/seclists/Fuzzing/LFI/LFI-Jhaddix.txt
ffuf -w $LFI:LFI -u http://$DOMAIN/known/directory/LFI -t 80 -c -timeout 4 -v
It will shock you how many websites have a directory traversal problem.
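A single manual request is usually enough to confirm a hit from that wordlist. A minimal check, assuming the same known directory as above (--path-as-is stops curl from collapsing the ../ sequences before sending):
curl --path-as-is "http://$DOMAIN/known/directory/../../../../etc/passwd"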
Also, consider using my tool Alfie, or its predecessor LFI Enumerator. It checks for directory traversals using a ton of different patterns, encodings, and string escapes. Alfie also checks for a couple of important LFIs.
For a more comprehensive and well-known tool, you can also check out fimap.
Thanks for reading
🤝🤝🤝🤝
@4wayhandshake