Information Gathering
Passive and Active Reconnaissance
Domains and Subdomains
Often, we are given a single domain or perhaps a list of domains and subdomains that belong to an organization. Many organizations do not have an accurate asset inventory and may have forgotten domains and subdomains exposed externally. This is an essential part of the reconnaissance phase: we may come across various subdomains that map back to in-scope IP addresses, increasing the overall attack surface of our engagement (or bug bounty program). Hidden and forgotten subdomains may run old/vulnerable versions of applications or dev versions with additional functionality (a Python debugging console, for example). Bug bounty programs will often set the scope as something like *.inlanefreight.com, meaning that all subdomains of inlanefreight.com are in scope (e.g., acme.inlanefreight.com, admin.inlanefreight.com, and so forth). We may also discover subdomains of subdomains: if we find admin.inlanefreight.com, we can run further subdomain enumeration against it and perhaps turn up dev.admin.inlanefreight.com as a very enticing target. There are many ways to find subdomains (both passively and actively), which we will cover later in this module.
IP ranges
Unless we are constrained to a very specific scope, we want to find out as much about our target as possible. Finding additional IP ranges owned by our target may lead to discovering other domains and subdomains and open up our possible attack surface even wider.
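A plain whois lookup against an IP already resolved for the target is a quick way to see who owns it and how large the surrounding netblock is. A minimal sketch using one of Facebook's public IPs as a stand-in target; note that field names differ between registries (ARIN uses NetRange/CIDR/OrgName, RIPE uses inetnum/netname/org-name), so the grep covers both:

```shell
# Pull the netblock/organization fields out of a whois response.
# Field names vary by regional registry, hence the broad pattern.
whois 157.240.1.35 | grep -iE 'netrange|cidr|inetnum|netname|orgname|org-name'
```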
Infrastructure
We want to learn as much about our target as possible. We need to know what technology stacks our target is using. Are their applications all ASP.NET? Do they use Django, PHP, Flask, etc.? What type(s) of APIs/web services are in use? Are they using Content Management Systems (CMS) such as WordPress, Joomla, Drupal, or DotNetNuke, which have their own types of vulnerabilities and misconfigurations that we may encounter? We also care about the web servers in use, such as IIS, Nginx, Apache, and the version numbers. If our target is running outdated frameworks or web servers, we want to dig deeper into the associated web applications. We are also interested in the types of back-end databases in use (MSSQL, MySQL, PostgreSQL, SQLite, Oracle, etc.) as this will give us an indication of the types of attacks we may be able to perform.
Virtual Hosts
Lastly, we want to enumerate virtual hosts (vhosts), which are similar to subdomains but indicate that an organization is hosting multiple applications on the same web server. We will cover vhost enumeration later in the module as well.
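As a preview of what vhost enumeration looks like, the sketch below sends the same request to one web server with different Host headers and prints the response size for each; a size that differs from the default page suggests a distinct vhost. The IP, domain, and candidate names are placeholders, and real enumeration would use a proper wordlist with a tool like ffuf or gobuster:

```shell
# Probe candidate vhosts by overriding the Host header (placeholder IP/domain/names).
# A response size different from the default page hints at a separate virtual host.
for v in admin dev staging; do
  size=$(curl -s -o /dev/null -w "%{size_download}" -H "Host: ${v}.inlanefreight.com" "http://10.129.24.93/")
  echo "${v}.inlanefreight.com: ${size} bytes"
done
```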
Passive information gathering
We do not interact directly with the target at this stage. Instead, we collect publicly available information using search engines, whois, certificate information, etc. The goal is to obtain as much information as possible to use as inputs to the active information gathering phase.
Active information gathering
We directly interact with the target at this stage. Before performing active information gathering, we need to ensure we have the required authorization to test. Otherwise, we will likely be engaging in illegal activities. Some of the techniques used in the active information gathering stage include port scanning, DNS enumeration, directory brute-forcing, virtual host enumeration, and web application crawling/spidering.
Tool - Argus
The Ultimate Information Gathering Toolkit
Online Tool
Enum TLDs
Brute Force TLD
DNS (53)
Passive Recon Script
bash netlas_domains_and_ip_recon.sh domains_IPs_CIDRs.txt
Passive DNS
Whois
crt.sh
Filter for unique subdomains:
curl -s https://crt.sh/\?q\=inlanefreight.com\&output\=json | jq . | grep name | cut -d":" -f2 | grep -v "CN=" | cut -d'"' -f2 | awk '{gsub(/\\n/,"\n");}1;' | sort -u
curl -s "https://crt.sh/?q=%25example.com&output=json" | jq -r '.[] | .name_value' | sort -u
IP addresses
for i in $(cat subdomainlist);do host $i | grep "has address" | grep inlanefreight.com | cut -d" " -f1,4;done
$ export TARGET="facebook.com"
$ curl -s "https://crt.sh/?q=${TARGET}&output=json" | jq -r '.[] | "\(.name_value)\n\(.common_name)"' | sort -u > "${TARGET}_crt.sh.txt"
Tool
CertSniff
OpenSSL
$ export TARGET="facebook.com"
$ export PORT="443"
$ openssl s_client -ign_eof 2>/dev/null <<<$'HEAD / HTTP/1.0\r\n\r' -connect "${TARGET}:${PORT}" | openssl x509 -noout -text -in - | grep 'DNS' | sed -e 's|DNS:|\n|g' -e 's|^\*.*||g' | tr -d ',' | sort -u
*.facebook.com
*.facebook.net
*.fbcdn.net
*.fbsbx.com
*.m.facebook.com
*.messenger.com
*.xx.fbcdn.net
*.xy.fbcdn.net
*.xz.fbcdn.net
facebook.com
messenger.com
Shodan
Shodan CLI:
shodan init YOUR_API_KEY
shodan host [IP]
$ for i in $(cat subdomainlist);do host $i | grep "has address" | grep inlanefreight.com | cut -d" " -f4 >> ip-addresses.txt;done
$ for i in $(cat ip-addresses.txt);do shodan host $i;done
10.129.24.93
City: Berlin
Country: Germany
Organization: InlaneFreight
Updated: 2021-09-01T09:02:11.370085
Number of open ports: 2
Ports:
80/tcp nginx
443/tcp nginx
10.129.27.33
City: Berlin
Country: Germany
Organization: InlaneFreight
Updated: 2021-08-30T22:25:31.572717
Number of open ports: 3
Ports:
22/tcp OpenSSH (7.6p1 Ubuntu-4ubuntu0.3)
80/tcp nginx
443/tcp nginx
|-- SSL Versions: -SSLv2, -SSLv3, -TLSv1, -TLSv1.1, -TLSv1.3, TLSv1.2
|-- Diffie-Hellman Parameters:
Bits: 2048
Generator: 2
10.129.27.22
City: Berlin
Country: Germany
Organization: InlaneFreight
Updated: 2021-09-01T15:39:55.446281
Number of open ports: 8
Smap
ShoLister - Subdomain enumeration
ShodanSpider
LazyHunter

Nrich
Based on Shodan - No API rate limiting
nrich IPs.txt
FOFA
Netlas.io
Hunter
ZoomEye
Censys
DNS Record
dig any inlanefreight.com
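Many resolvers refuse ANY queries these days (RFC 8482), so it is often more reliable to ask for each record type explicitly. A simple loop over the common types:

```shell
# Query common record types one by one instead of relying on ANY.
for t in A AAAA NS MX TXT SOA; do
  echo "== $t =="
  dig +short "$t" inlanefreight.com
done
```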
Metabigor
Subdomain Enumeration
DNS Subdomain Enumeration
VirusTotal

Create an account on VirusTotal (https://www.virustotal.com).
Generate or locate your API key.
Use the following endpoint to fetch URLs associated with a specific domain:
https://www.virustotal.com/vtapi/v2/domain/report?apikey=YOUR_API_KEY&domain=example.com
In the JSON response, look under the undetected_urls section. These are URLs that VirusTotal has fetched or scanned but not flagged as malicious, which is often a goldmine for sensitive endpoints.
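Put together, the report can be pulled and the URLs extracted with jq. Each undetected_urls entry is an array whose first element is the URL itself; YOUR_API_KEY is a placeholder for your own key:

```shell
# Fetch the v2 domain report and keep only the URL column of undetected_urls.
export VT_API_KEY="YOUR_API_KEY"
curl -s "https://www.virustotal.com/vtapi/v2/domain/report?apikey=${VT_API_KEY}&domain=example.com" \
  | jq -r '.undetected_urls[][0]' | sort -u
```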
Virustotalx

Urlscan.io
Search for URL:

Dorking:

# XLS
domain:redacted.com AND page.url:xlsx
domain:redacted.com AND page.url:xls
# PDF
domain:redacted.com AND page.url:pdf
# Combine with interesting URL paths like upload, uploads, private, system, data, web, internal
domain:redacted.com AND page.url:pdf AND page.url:web
# Latest Subdomains
domain:gov.in
# JS Files
domain:redacted.com AND page.url:.js
# Parameter Hunting
domain:redacted.com AND page.url:search
domain:redacted.com AND page.url:query
domain:redacted.com AND page.url:page
domain:redacted.com AND page.url:id
domain:redacted.com AND page.url:type
domain:redacted.com AND page.url:search=
domain:redacted.com AND page.url:query=
domain:redacted.com AND page.url:page=
domain:redacted.com AND page.url:id=
domain:redacted.com AND page.url:type=
etc.
# Hidden stuff
domain:redacted.com AND page.url:internal
domain:redacted.com AND page.url:private
domain:redacted.com AND page.url:hidden
domain:redacted.com AND page.url:secret
domain:redacted.com AND page.url:dashboard
domain:redacted.com AND page.url:config
domain:redacted.com AND page.url:key
domain:redacted.com AND page.url:pwd
domain:redacted.com AND page.url:token
domain:redacted.com AND page.url:eyJ
# API Endpoints
domain:redacted.com AND page.url:api
domain:redacted.com AND page.url:api AND page.url:v1
domain:redacted.com AND page.url:api AND page.url:v2
domain:redacted.com AND page.url:api AND page.url:v3
domain:redacted.com AND page.url:api AND page.url:v4
domain:redacted.com AND page.url:api AND page.url:{anyversion} AND page.url:get
domain:redacted.com AND page.url:api AND page.url:{anyversion} AND page.url:fetch
domain:redacted.com AND page.url:api AND page.url:{anyversion} AND page.url:details
domain:redacted.com AND page.url:api AND page.url:{anyversion} AND page.url:list
domain:redacted.com AND page.url:api AND page.url:{anyversion} AND page.url:payment
domain:redacted.com AND page.url:api AND page.url:{anyversion} AND page.url:order
domain:redacted.com AND page.url:api AND page.url:{anyversion} AND page.url:format
domain:redacted.com AND page.url:api AND page.url:{anyversion} AND page.url:export
domain:redacted.com AND page.url:api AND page.url:{anyversion} AND page.url:retrieve
domain:redacted.com AND page.url:api AND page.url:{anyversion} AND page.url:system
domain:redacted.com AND page.url:api AND page.url:{anyversion} AND page.url:dashboard
domain:redacted.com AND page.url:api AND page.url:{anyversion} AND page.url:admin
domain:redacted.com AND page.url:api AND page.url:{anyversion} AND page.url:internal
domain:redacted.com AND page.url:api AND page.url:{anyversion} AND page.url:private
domain:redacted.com AND page.url:api AND page.url:{anyversion} AND page.url:secret
domain:redacted.com AND page.url:api AND page.url:{anyversion} AND page.url:debug
domain:redacted.com AND page.url:api AND page.url:{anyversion} AND page.url:users
domain:redacted.com AND page.url:api AND page.url:{anyversion} AND page.url:send
# Open Redirect Endpoints
domain:redacted.com AND page.url:uri
domain:redacted.com AND page.url:url
domain:redacted.com AND page.url:http
domain:redacted.com AND page.url:2F
domain:redacted.com AND page.url:http%3A
domain:redacted.com AND page.url:redirect
domain:redacted.com AND page.url:redirect_uri
domain:redacted.com AND page.url:redirect_url
domain:redacted.com AND page.url:forwarded
domain:redacted.com AND page.url:to
# SSRF Endpoints
domain:redacted.com AND page.url:dest
domain:redacted.com AND page.url:path
domain:redacted.com AND page.url:continue
domain:redacted.com AND page.url:window
domain:redacted.com AND page.url:site
domain:redacted.com AND page.url:return
domain:redacted.com AND page.url:port
domain:redacted.com AND page.url:view
domain:redacted.com AND page.url:print
domain:redacted.com AND page.url:export
domain:redacted.com AND page.url:dir
domain:redacted.com AND page.url:out
domain:redacted.com AND page.url:callback
# File Manager Endpoints
page.url:filemanager.php
page.url:manage AND page.url:php
page.url:file AND page.url:php
page.url:document AND page.url:php
page.url:upload AND page.url:php
# other than php
aspx,asp,jsp,jspx,do,action,cgi
# S3 Buckets
page.url:s3. AND page.url:amazonaws.com AND page.url:csv
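The same dorks can also be run against urlscan.io's public search API instead of the web UI; the jq path below pulls each result's page URL. The query string mirrors the dork syntax above (spaces URL-encoded as %20):

```shell
# Query urlscan.io's search API and extract the page URLs from the results.
curl -s "https://urlscan.io/api/v1/search/?q=domain:redacted.com%20AND%20page.url:api" \
  | jq -r '.results[].page.url' | sort -u
```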
TheHarvester
$ cat sources.txt
baidu
bufferoverun
crtsh
hackertarget
otx
projectdiscovery
rapiddns
sublist3r
threatcrowd
trello
urlscan
vhost
virustotal
zoomeye
$ export TARGET="facebook.com"
$ cat sources.txt | while read source; do theHarvester -d "${TARGET}" -b $source -f "${source}_${TARGET}";done
$ cat *.json | jq -r '.hosts[]' 2>/dev/null | cut -d':' -f 1 | sort -u > "${TARGET}_theHarvester.txt"
Merge all files
$ cat facebook.com_*.txt | sort -u > facebook.com_subdomains_passive.txt
$ cat facebook.com_subdomains_passive.txt | wc -l
Google Dorks
Domain.glass

Passive - Infrastructure
Netcraft
Wayback Machine
Web Enumeration
$ go install github.com/tomnomnom/waybackurls@latest
$ waybackurls -dates https://facebook.com > waybackurls.txt
$ cat waybackurls.txt
2018-05-20T09:46:07Z http://www.facebook.com./
2018-05-20T10:07:12Z https://www.facebook.com/
2018-05-20T10:18:51Z http://www.facebook.com/#!/pages/Welcome-Baby/143392015698061?ref=tsrobots.txt
2018-05-20T10:19:19Z http://www.facebook.com/
2018-05-20T16:00:13Z http://facebook.com
2018-05-21T22:12:55Z https://www.facebook.com
2018-05-22T15:14:09Z http://www.facebook.com
2018-05-22T17:34:48Z http://www.facebook.com/#!/Syerah?v=info&ref=profile/robots.txt
2018-05-23T11:03:47Z http://www.facebook.com/#!/Bin595
<SNIP>
Waymore
Passive - others
RIPE Database
Infra and known vulnerabilities
Web-Check
LeakIX
SecurityTrails
FullHunt
Onyphe
DomLink
OSINT
Cloud
Mail
Others
IntelX
Public company information
Terms of service / terms of sale (CGU/CGV): search for names, emails, etc.
Societe.com
Social media:
LinkedIn
Facebook
Twitter
Instagram
YouTube - See Credentials in Youtube videos
etc.
Job posts - Search for technologies used, HR names, etc.
GitHub repos - See Credentials in Git Repos
Active - DNS
DNS Subdomain Enumeration
DNS - Zone Transfer
DNS (53)
Active - Infrastructure
HTTP Headers
$ curl -I "http://${TARGET}"
HTTP/1.1 200 OK
Date: Thu, 23 Sep 2021 15:10:42 GMT
Server: Apache/2.4.25 (Debian)
X-Powered-By: PHP/7.3.5
Link: <http://192.168.10.10/wp-json/>; rel="https://api.w.org/"
Content-Type: text/html; charset=UTF-8
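To collect the same fingerprinting headers across an entire subdomain list, loop curl over the file built earlier (subdomainlist) and keep only the interesting fields:

```shell
# Grab Server/X-Powered-By headers for every discovered subdomain.
while read -r host; do
  echo "== $host =="
  curl -sI --max-time 5 "http://${host}" | grep -iE '^(server|x-powered-by):'
done < subdomainlist
```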
Cookies
.NET:
ASPSESSIONID<RANDOM>=<COOKIE_VALUE>
PHP:
PHPSESSID=<COOKIE_VALUE>
JAVA:
JSESSIONID=<COOKIE_VALUE>
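A quick way to spot these session cookies is to pull the Set-Cookie headers directly; a PHPSESSID in the output, for example, would point at PHP on the back end:

```shell
# List Set-Cookie headers; the cookie name often betrays the back-end stack.
curl -sI "http://${TARGET}" | grep -i '^set-cookie:'
```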
Target Website - Source Code

Target Website - Comments

WhatWeb
$ whatweb -a3 https://www.facebook.com -v
WhatWeb report for https://www.facebook.com
Status : 200 OK
Title : <None>
IP : 31.13.92.36
Country : IRELAND, IE
Summary : Strict-Transport-Security[max-age=15552000; preload], PasswordField[pass], Script[text/javascript], X-XSS-Protection[0], HTML5, X-Frame-Options[DENY], Meta-Refresh-Redirect[/?_fb_noscript=1], UncommonHeaders[x-fb-rlafr,x-content-type-options,x-fb-debug,alt-svc]
<---SNIP--->
Wappalyzer
WAF detection
wafw00f -v https://www.tesla.com
Scan multiple subdomains
#!/bin/bash
input_file="target.subdomains"   # File containing the subdomains
output_file="no_waf_target.txt"  # Output file

echo "[*] Starting WAF scan..."

# Clear the output file before starting
> "$output_file"

while read -r domain; do
    echo "[*] Testing $domain..."
    # Run wafw00f and capture the result
    result=$(wafw00f "$domain")
    if echo "$result" | grep -q "No WAF detected"; then
        echo -e "\e[32m[✔] $domain has NO WAF\e[0m"
        echo "$domain" | tee -a "$output_file"
    else
        echo -e "\e[31m[✘] $domain is protected by a WAF\e[0m"
    fi
done < "$input_file"

echo "[✔] Scan completed. Results saved in $output_file"
Aquatone
On ubuntu:
$ sudo apt install golang chromium-driver
$ go install github.com/michenriksen/aquatone@latest
$ export PATH="$PATH":"$HOME/go/bin"
cat facebook_aquatone.txt | aquatone -out ./aquatone -screenshot-timeout 1000
wget https://github.com/michenriksen/aquatone/releases/download/v1.7.0/aquatone_linux_amd64_1.7.0.zip
unzip aquatone_linux_amd64_1.7.0.zip
Archive: aquatone_linux_amd64_1.7.0.zip
inflating: aquatone
inflating: README.md
inflating: LICENSE.txt
[Apr 07, 2024 - 03:00:24 (EDT)] exegol-CPTS /workspace # cat test.txt | ./aquatone -out ./aquatone-out -screenshot-timeout 10000


Lots of failed screenshots ==> increase the timeout option
firefox aquatone-out/aquatone_report.html &> /dev/null &

Eyewitness
exegol-CPTS /workspace # EyeWitness.py -f urls.txt --web
Gowitness
gowitness scan -l urls.txt
Slack Workspaces
Slack
Interesting Books
Kali Linux Recon & Information Gathering - Gather critical intelligence, map out networks, and track down every juicy piece of information about your target
Support this Gitbook
I hope it helps you as much as it has helped me. If you can support me in any way, I would deeply appreciate it.