Reconnaissance: What Attackers See Before They Strike

Every attack starts with a question: “What can I learn about this target?”

Reconnaissance is the intelligence-gathering phase. Before an attacker sends a phishing email, scans a port, or writes an exploit, they research. The more they know, the more targeted and effective the attack.

As a defender, understanding what attackers see helps you minimize your exposure. As a pentester, this is where every engagement begins.

Passive Reconnaissance

Passive recon gathers information without directly engaging the target: no scanning, no login attempts, no traffic that stands out from ordinary visitors. Because most of the data comes from public sources and third parties, the target generally can’t detect it.

OSINT (Open Source Intelligence)

Information freely available online:

Company information:

Website (technology stack, team pages, contact info)
Job postings (reveal which technologies they use: “Experience with Kubernetes, AWS, Terraform”)
Press releases and news articles
SEC filings (for public companies)
Social media profiles

People information:

LinkedIn profiles (employee names, roles, technologies they work with)
Email format (does the company use first.last@company.com or f.last@company.com?)
Social media (personal details useful for phishing pretexts)
Conference talks and blog posts by employees
GitHub/GitLab profiles (code, commits, sometimes accidental credential leaks)
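
Once a naming convention is known, a list of employee names turns into a list of likely addresses. A minimal sketch of that step follows; the function name and the set of patterns are illustrative assumptions, not an exhaustive list of formats companies use.

```python
def email_candidates(first: str, last: str, domain: str) -> dict[str, str]:
    """Return common corporate address patterns for one person."""
    f, l = first.strip().lower(), last.strip().lower()
    return {
        "first.last": f"{f}.{l}@{domain}",
        "f.last": f"{f[0]}.{l}@{domain}",
        "flast": f"{f[0]}{l}@{domain}",
        "first": f"{f}@{domain}",
    }


# Hypothetical employee, matching the example formats above
for pattern, address in email_candidates("John", "Smith", "example.com").items():
    print(f"{pattern:<12}{address}")
```

Defenders can run the same exercise against their own staff directory to see how guessable their addresses are.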

DNS Reconnaissance

DNS records reveal a lot about an organization’s infrastructure:

# Basic DNS lookup
nslookup example.com

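# All record types in one query
# (many servers now return only a minimal answer to ANY, per RFC 8482;
# query specific record types if that happens)
dig example.com ANY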

# Find mail servers
dig example.com MX

# Find name servers
dig example.com NS

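# Find text records (SPF, domain-verification tokens)
dig example.com TXT

# DMARC policy lives at its own name; DKIM keys sit under <selector>._domainkey
dig _dmarc.example.com TXT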

What DNS reveals:

IP addresses of servers
Mail server infrastructure
Cloud providers being used
Subdomains (each might be a separate application)
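
One way TXT records leak third-party providers is through SPF: each include: mechanism names a service allowed to send mail for the domain. The sketch below pulls those names out of TXT record strings. The function name and the sample records are made up for illustration.

```python
def spf_third_parties(txt_records: list[str]) -> list[str]:
    """Return the domains named in include: mechanisms of any SPF record."""
    providers = []
    for record in txt_records:
        if not record.lower().startswith("v=spf1"):
            continue  # not an SPF record (e.g. a site-verification token)
        for term in record.split():
            # Mechanisms may carry a qualifier prefix such as "+" or "~"
            mechanism = term.lstrip("+-~?")
            if mechanism.lower().startswith("include:"):
                providers.append(mechanism.split(":", 1)[1])
    return providers


# Sample records invented for this example
records = [
    "v=spf1 include:_spf.google.com include:sendgrid.net ~all",
    "google-site-verification=abc123",
]
print(spf_third_parties(records))
```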

Subdomain Enumeration

Subdomains are often where the interesting targets live: dev.example.com, staging.example.com, vpn.example.com, admin.example.com.

Passive methods:

# Certificate Transparency logs (publicly trusted certificates are logged)
# Check crt.sh (%25 is the URL-encoded % wildcard)
curl -s "https://crt.sh/?q=%25.example.com&output=json" | jq -r '.[].name_value' | sort -u

# Google dorking (using search operators)
# site:example.com -site:www.example.com
# site:*.example.com

# Threat intelligence platforms
# VirusTotal, SecurityTrails, Shodan

Certificate Transparency is particularly powerful. Browsers require publicly trusted SSL/TLS certificates to be recorded in public logs, so searching crt.sh for %.example.com surfaces most hostnames that have ever had such a certificate - including internal ones that were accidentally exposed. Hosts covered only by a wildcard certificate or issued by a private CA will not show up.
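
The crt.sh JSON needs a little cleanup before it is useful: a single name_value field can hold several names separated by newlines, and wildcard entries appear as *.example.com. A small sketch of that cleanup, run here on invented sample data rather than a live query (the function name is an assumption):

```python
import json


def subdomains_from_crtsh(json_text: str, domain: str) -> list[str]:
    """Extract unique hostnames under `domain` from crt.sh JSON output."""
    names = set()
    for entry in json.loads(json_text):
        # name_value can hold several names separated by newlines
        for name in entry["name_value"].splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # a wildcard only proves the parent name exists
            if name == domain or name.endswith("." + domain):
                names.add(name)
    return sorted(names)


# Invented sample in the shape crt.sh returns
sample = json.dumps([
    {"name_value": "dev.example.com\nstaging.example.com"},
    {"name_value": "*.example.com"},
    {"name_value": "DEV.example.com"},
])
print(subdomains_from_crtsh(sample, "example.com"))
```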

Credential Reconnaissance

Check if the target’s employees have appeared in data breaches:

Have I Been Pwned - Check email addresses against known breaches
Breach databases (used ethically by pentesters to demonstrate credential exposure risk)
Paste sites where leaked data sometimes appears

Why this matters: If employees reuse passwords, credentials from an old breach might work on current systems. This is how credential stuffing attacks work, as we covered in the authentication attacks post.
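
In an authorized engagement, this check often boils down to filtering a breach dataset for the client’s domain. A minimal sketch, assuming the breach data is already available as a plain list of email addresses (the addresses below are invented):

```python
def exposed_accounts(breach_emails: list[str], domain: str) -> list[str]:
    """Return the unique addresses at `domain` found in a breach list."""
    suffix = "@" + domain.lower()
    cleaned = (email.strip().lower() for email in breach_emails)
    return sorted({email for email in cleaned if email.endswith(suffix)})


# Invented breach entries for illustration
breach = ["J.Smith@example.com", "someone@other.org", "j.smith@example.com "]
print(exposed_accounts(breach, "example.com"))
```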

Technology Fingerprinting

Identify what software the target runs without scanning. A single ordinary request looks like any other visitor:

# Check HTTP response headers
curl -I https://example.com
# Server: nginx/1.18.0
# X-Powered-By: PHP/8.1

What reveals technology:

HTTP headers (server software, framework)
HTML source (meta tags, JavaScript frameworks, CDN URLs)
Error pages (default error pages reveal the technology)
Cookies (session cookie naming: PHPSESSID = PHP, JSESSIONID = Java)
/robots.txt and /sitemap.xml (reveal application structure)

Tools like Wappalyzer (browser extension) automate this detection.
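
The underlying logic is mostly pattern matching. The sketch below covers only the two header names and two cookie names mentioned above; real fingerprinting tools use far larger rule sets, and the function name here is an assumption.

```python
# Cookie-name hints taken from the list above
COOKIE_HINTS = {"PHPSESSID": "PHP", "JSESSIONID": "Java"}


def fingerprint(headers: dict[str, str], cookie_names: list[str]) -> list[str]:
    """Collect technology hints from response headers and cookie names."""
    hints = []
    lowered = {key.lower(): value for key, value in headers.items()}
    for name in ("server", "x-powered-by"):
        if name in lowered:
            hints.append(f"{name}: {lowered[name]}")
    for cookie in cookie_names:
        tech = COOKIE_HINTS.get(cookie.upper())
        if tech:
            hints.append(f"cookie {cookie}: {tech}")
    return hints


print(fingerprint({"Server": "nginx/1.18.0", "X-Powered-By": "PHP/8.1"}, ["PHPSESSID"]))
```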

Active Reconnaissance

Active recon directly interacts with the target. It’s more informative but detectable. In a pentest, this is done only after authorization.

Port Scanning

Covered in detail in the ports and protocols post. Quick recap:

# Default scan (1,000 most common TCP ports) with faster timing
nmap -T4 target.com

# Full port scan with service detection
nmap -p- -sV target.com

# Scan with OS detection and scripts
nmap -A target.com

# UDP scan (slower but important; requires root)
nmap -sU --top-ports 20 target.com

What you learn from port scanning:

Which services are running
Software versions (for vulnerability matching)
Operating system
Firewall configuration (which ports are filtered)
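
At its simplest, a port scan is just a series of connection attempts. The sketch below is a bare TCP connect check, similar in spirit to what nmap does with -sT, with none of its speed, stealth, or service detection. The function names are assumptions, and the demo only scans a listener it opens on localhost.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a full TCP handshake to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: treat all as "not open"
        return False


def connect_scan(host: str, ports: list[int]) -> list[int]:
    """Return the subset of `ports` that accept a TCP connection."""
    return [port for port in ports if tcp_port_open(host, port)]


# Demo against a throwaway local listener, so no external system is touched
listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen()
demo_port = listener.getsockname()[1]
print(connect_scan("127.0.0.1", [demo_port]))
listener.close()
```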

Service Enumeration

Once you know which ports are open, probe the services:

Web servers (80/443):

# Discover directories and files
gobuster dir -u https://example.com -w /usr/share/wordlists/dirb/common.txt

# Or with ffuf
ffuf -u https://example.com/FUZZ -w /usr/share/wordlists/dirb/common.txt

Common discoveries: /admin, /backup, /api, /test, /.git, /.env - each potentially exposing sensitive functionality or data.
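
Conceptually, directory brute-forcing is a loop: build a URL from each wordlist entry, request it, and keep anything that does not come back as 404. The sketch below models that loop with a stand-in fetch function instead of real HTTP requests. It is a simplified illustration, not how gobuster or ffuf are implemented, and all names in it are assumptions.

```python
from typing import Callable, Iterable


def enumerate_paths(base_url: str, words: Iterable[str],
                    fetch_status: Callable[[str], int]) -> dict[str, int]:
    """Return {url: status} for every candidate that is not a 404."""
    found = {}
    for word in words:
        url = f"{base_url.rstrip('/')}/{word.lstrip('/')}"
        status = fetch_status(url)
        if status != 404:
            found[url] = status
    return found


# Stand-in for a real site: only these two paths exist
fake_site = {"https://example.com/admin": 403, "https://example.com/.env": 200}
hits = enumerate_paths("https://example.com", ["admin", ".env", "test"],
                       lambda url: fake_site.get(url, 404))
print(hits)
```

A 403 is still a finding: it tells you the path exists even though access is denied.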

SMB (445):

# List shares (if anonymous access is allowed)
smbclient -L //target.com -N

# Enumerate users
enum4linux target.com

SMTP (25):

# Check which usernames the mail server confirms (only works if VRFY is enabled)
smtp-user-enum -M VRFY -U users.txt -t target.com

Vulnerability Scanning

Automated tools that check for known vulnerabilities:

# Nmap vulnerability scripts
nmap --script vuln target.com

# Nikto for web servers
nikto -h https://example.com

Dedicated vulnerability scanners like Nessus, OpenVAS, and Qualys check systems against databases of known CVEs and misconfigurations. In enterprise environments, these run regularly as part of vulnerability management.

Web Application Recon

For web applications specifically:

# Check for common files
curl -s https://example.com/robots.txt
curl -s https://example.com/.git/HEAD
curl -s https://example.com/.env
curl -s https://example.com/wp-login.php
curl -s https://example.com/admin

# Check security headers (-s hides the progress meter)
curl -sI https://example.com | grep -iE "^(content-security-policy|strict-transport-security|x-frame-options|x-content-type-options)"

What to look for:

Exposed .git directories (full source code leak)
.env files (database credentials, API keys)
Default admin panels
Outdated CMS versions
Missing security headers
Debug/development modes left enabled
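
The missing-headers check is easy to script once you have the response headers. A minimal sketch, assuming a short list of headers worth checking; which headers belong on that list is a judgment call, and this set is only an example.

```python
# Example set of headers to look for; adjust to your own baseline
EXPECTED_HEADERS = (
    "content-security-policy",
    "strict-transport-security",
    "x-frame-options",
    "x-content-type-options",
)


def missing_security_headers(headers: dict[str, str]) -> list[str]:
    """Return the expected security headers absent from a response."""
    present = {name.lower() for name in headers}  # header names are case-insensitive
    return [header for header in EXPECTED_HEADERS if header not in present]


print(missing_security_headers({"Server": "nginx", "X-Frame-Options": "DENY"}))
```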

Organizing Your Findings

Recon produces a lot of data. Organize it:

Target: example.com
├── Domains & Subdomains
│   ├── example.com → 93.184.216.34
│   ├── mail.example.com → 93.184.216.40
│   ├── dev.example.com → 93.184.216.50
│   └── vpn.example.com → 10.0.0.1 (internal? interesting)
├── Open Ports
│   ├── 22/tcp (SSH - OpenSSH 8.9)
│   ├── 80/tcp (HTTP - nginx 1.18)
│   ├── 443/tcp (HTTPS - nginx 1.18)
│   └── 3306/tcp (MySQL - should this be exposed?)
├── Technology Stack
│   ├── Server: Ubuntu 22.04
│   ├── Web: nginx + PHP 8.1
│   ├── CMS: WordPress 6.2 (outdated)
│   └── Database: MySQL 8.0
├── People
│   ├── IT Admin: John Smith (LinkedIn)
│   ├── Email format: j.smith@example.com
│   └── 3 employees in breach databases
└── Potential Issues
    ├── MySQL exposed to internet
    ├── WordPress outdated by 2 major versions
    ├── dev.example.com accessible without auth
    └── No CSP header on main site

This becomes the attack plan - or, from a defensive perspective, the remediation list.
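
Keeping findings in a structured form lets you derive some of the "Potential Issues" automatically. The sketch below flags two of the patterns from the tree above: a public hostname resolving to a private address, and a database port reachable from outside. The rule set and names are illustrative assumptions (PostgreSQL’s port is added only as a second example).

```python
import ipaddress

# Ports that usually should not face the internet (example list)
DATABASE_PORTS = {3306: "MySQL", 5432: "PostgreSQL"}


def triage(hosts: dict[str, str], open_ports: list[int]) -> list[str]:
    """Derive simple issue notes from resolved hosts and open ports."""
    issues = []
    for name, ip in hosts.items():
        if ipaddress.ip_address(ip).is_private:
            issues.append(f"{name} resolves to private address {ip}")
    for port in open_ports:
        if port in DATABASE_PORTS:
            issues.append(f"{DATABASE_PORTS[port]} exposed on port {port}")
    return issues


hosts = {"example.com": "93.184.216.34", "vpn.example.com": "10.0.0.1"}
for issue in triage(hosts, [22, 80, 443, 3306]):
    print(issue)
```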

The Defender’s Perspective

For every recon technique, there’s a defense:

Recon Technique	Defense
DNS enumeration	Minimize DNS records, use split-horizon DNS
Subdomain discovery	Remove unused subdomains, monitor Certificate Transparency
Job posting analysis	Be vague about specific technologies
LinkedIn OSINT	Security awareness training
Port scanning	Minimize exposed ports, use port knocking, IDS alerts
Directory brute-force	Custom error pages, rate limiting, WAF
Technology fingerprinting	Remove version headers, customize error pages
Credential exposure	Monitor breach databases, enforce unique passwords

Attack surface management is the defensive equivalent of reconnaissance. Know what you expose before attackers find it. Tools like Shodan, Censys, and SecurityScorecard show you what the internet sees.

What to Monitor

Watch for:
- Unusual DNS queries for internal subdomains
- Port scan patterns in firewall logs
- Brute-force directory enumeration in web logs
- Login attempts using breached credentials
- New subdomains appearing in Certificate Transparency
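
As one example, port scan patterns tend to show up as a single source touching many different ports. A rough sketch of that heuristic, assuming firewall logs have already been parsed into (source IP, destination port) pairs; the threshold of 20 is an arbitrary starting point, not a recommended value.

```python
from collections import defaultdict


def likely_scanners(events: list[tuple[str, int]], threshold: int = 20) -> list[str]:
    """Return source IPs that touched at least `threshold` distinct ports."""
    ports_by_source = defaultdict(set)
    for src_ip, dst_port in events:
        ports_by_source[src_ip].add(dst_port)
    return sorted(ip for ip, ports in ports_by_source.items()
                  if len(ports) >= threshold)


# Invented log data: one host sweeping ports 1-100, one normal HTTPS client
events = [("203.0.113.7", port) for port in range(1, 101)]
events += [("198.51.100.4", 443)] * 50
print(likely_scanners(events))
```

A production detector would also bound this by a time window, which the sketch leaves out.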

Practice Safely

Only perform active reconnaissance against systems you own or have explicit written permission to test.

For practice:

Your own lab - Scan your own VMs
HackTheBox - Machines designed for enumeration practice
TryHackMe - Guided recon rooms
OSINT Framework - osintframework.com - Collection of OSINT resources
Google Dorking practice - Use against your own domains

What’s Next

Reconnaissance gives the attacker a target. The reverse shell posts covered what happens after exploitation - getting interactive access. Now we’ll jump to what happens once you’re in but need more power: privilege escalation.

References

MITRE ATT&CK - Reconnaissance
OWASP - Web Security Testing Guide
Certificate Transparency Search - crt.sh
Shodan - Search engine for internet-connected devices

The best attackers spend more time on recon than exploitation. The best defenders think like attackers and monitor their own attack surface. Know what you expose.