<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Jer Catallo</title>
    <description>The latest articles on Forem by Jer Catallo (@jer_catallo).</description>
    <link>https://forem.com/jer_catallo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3671368%2Fd44fc7ee-06ff-4bc6-a28b-b22473231f2f.png</url>
      <title>Forem: Jer Catallo</title>
      <link>https://forem.com/jer_catallo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/jer_catallo"/>
    <language>en</language>
    <item>
      <title>Automated Web Content Discovery: How Attackers Find Hidden Paths on Your Web Server in Minutes Using Free Tools</title>
      <dc:creator>Jer Catallo</dc:creator>
      <pubDate>Fri, 08 May 2026 12:08:00 +0000</pubDate>
      <link>https://forem.com/jer_catallo/how-to-find-hidden-web-directories-and-sensitive-files-using-automated-discovery-44od</link>
      <guid>https://forem.com/jer_catallo/how-to-find-hidden-web-directories-and-sensitive-files-using-automated-discovery-44od</guid>
      <description>&lt;p&gt;Web applications often have directories and files that are not linked from the main pages. These paths can expose admin panels, backup files, logs, and config data. Automated content discovery tools like Gobuster use wordlists to test hundreds or thousands of paths quickly. Finding these paths before attackers do is a key part of web application security testing.&lt;/p&gt;

&lt;p&gt;Using the Acme IT Support practice target on TryHackMe, you can see exactly how an attacker builds up knowledge of a target step by step, starting from a small fast scan and moving to deeper coverage with file extension checks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Ethical Considerations
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Only scan systems you own or have written permission to test.&lt;/li&gt;
&lt;li&gt;Set clear scope limits before scanning, including target hosts, paths, time windows, and allowed methods.&lt;/li&gt;
&lt;li&gt;Start with safe scan settings to avoid breaking services.&lt;/li&gt;
&lt;li&gt;Handle found data with care. Do not take, share, or publish sensitive content from logs, backups, or archives.&lt;/li&gt;
&lt;li&gt;Remove IPs, tokens, credentials, usernames, and other sensitive details before sharing findings publicly.&lt;/li&gt;
&lt;li&gt;Report high-risk findings through proper disclosure channels.&lt;/li&gt;
&lt;li&gt;Follow all applicable laws, platform rules, and company policies.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 1: Run a Baseline Scan with a Small Wordlist
&lt;/h2&gt;

&lt;p&gt;You can start with a small and fast wordlist to find common directories and files. This gives you quick results without waiting too long.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gobuster &lt;span class="nb"&gt;dir&lt;/span&gt; &lt;span class="nt"&gt;--url&lt;/span&gt; http://&amp;lt;target-ip&amp;gt;/ &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-w&lt;/span&gt; /usr/share/wordlists/SecLists/Discovery/Web-Content/common.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;dir&lt;/code&gt; mode enumerates directories and files. The &lt;code&gt;--url&lt;/code&gt; flag (short form &lt;code&gt;-u&lt;/code&gt;) sets the target. The &lt;code&gt;-w&lt;/code&gt; flag points to the wordlist file. Gobuster uses 10 threads by default and treats 404 responses as negative results.&lt;/p&gt;
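
&lt;p&gt;The idea behind this kind of scan can be sketched in a few lines of Python. This is a simplified illustration, not Gobuster's actual implementation: each wordlist entry is appended to the base URL, and a response is reported unless its status code is blacklisted. The function names and the &lt;code&gt;target.example&lt;/code&gt; host are assumptions made for this example, and no requests are sent.&lt;/p&gt;

```python
def build_candidate_urls(base_url, words):
    """Append each non-empty wordlist entry to the base URL."""
    base = base_url if base_url.endswith("/") else base_url + "/"
    return [base + w.strip().lstrip("/") for w in words if w.strip()]


def is_hit(status_code, blacklist=frozenset({404})):
    """Report a path unless its status code is blacklisted (404 by default)."""
    return status_code not in blacklist


if __name__ == "__main__":
    # Hypothetical target and a three-entry wordlist; nothing is requested.
    words = ["assets", "robots.txt", "private"]
    for url in build_candidate_urls("http://target.example/", words):
        print(url)
```

&lt;p&gt;A real scanner would send one request per candidate URL and pass the returned status code to a check like &lt;code&gt;is_hit&lt;/code&gt;.&lt;/p&gt;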

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhzlh6oh2h4sqwmsc7v54.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhzlh6oh2h4sqwmsc7v54.png" alt=" " width="800" height="381"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The scan found 9 paths:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Path&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/assets&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;301&lt;/td&gt;
&lt;td&gt;Redirect, static resources directory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/contact&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;Contact page&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/customers&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;302&lt;/td&gt;
&lt;td&gt;Redirect, possible user area&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/development.log&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;Sensitive, exposed development log&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/monthly&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;Monthly content endpoint&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/news&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;News section&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/private&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;301&lt;/td&gt;
&lt;td&gt;Redirect, restricted area&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/robots.txt&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;Crawler exclusion file, useful for recon&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/sitemap.xml&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;Sitemap, reveals additional paths&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Finding &lt;code&gt;/development.log&lt;/code&gt; at status 200 shows a high-risk misconfiguration. Development logs can contain stack traces, database queries, and sometimes credentials.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remediation:&lt;/strong&gt; Remove development logs from production servers. Use proper logging systems that store logs outside the web root. Add access controls if logs must be kept on the server.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Expand Coverage with a Larger Wordlist
&lt;/h2&gt;

&lt;p&gt;You can use a bigger wordlist to find less common paths. Adding more threads and filtering noise makes the scan faster and the output cleaner.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gobuster &lt;span class="nb"&gt;dir&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt; http://&amp;lt;target-ip&amp;gt;/ &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-w&lt;/span&gt; /usr/share/wordlists/dirb/big.txt &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-t&lt;/span&gt; 50 &lt;span class="nt"&gt;-b&lt;/span&gt; 404,403 &lt;span class="nt"&gt;--no-error&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;-u&lt;/code&gt; flag sets the target URL. The &lt;code&gt;-w&lt;/code&gt; flag points to the larger dirb wordlist with over 20,000 entries. The &lt;code&gt;-t 50&lt;/code&gt; flag increases threads for faster scanning. The &lt;code&gt;-b 404,403&lt;/code&gt; flag hides not-found and forbidden responses. The &lt;code&gt;--no-error&lt;/code&gt; flag removes error output for cleaner results.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftu2iv8o4ty280lvvy4ek.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftu2iv8o4ty280lvvy4ek.png" alt=" " width="800" height="380"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This scan found 2 new paths not seen in Step 1:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Path&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/cookie-test&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;New, cookie testing endpoint exposed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/sitemap_xml&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;New, alternate sitemap path&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
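
&lt;p&gt;When you layer scans, it helps to separate what is new from what you already know. A minimal Python sketch, assuming you have collected the paths from each scan into lists; the helper name &lt;code&gt;new_paths&lt;/code&gt; is hypothetical, and the Step 2 list below is assembled for illustration from the results reported in this article:&lt;/p&gt;

```python
def new_paths(baseline, expanded):
    """Return paths seen in the expanded scan but not in the baseline, sorted."""
    return sorted(set(expanded) - set(baseline))


if __name__ == "__main__":
    step1 = ["/assets", "/contact", "/customers", "/development.log", "/monthly",
             "/news", "/private", "/robots.txt", "/sitemap.xml"]
    # Assumed for illustration: the Step 1 paths plus the two additions.
    step2 = step1 + ["/cookie-test", "/sitemap_xml"]
    print(new_paths(step1, step2))
```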

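&lt;p&gt;Using &lt;code&gt;-b 404,403&lt;/code&gt; removes noise, but hiding &lt;code&gt;403&lt;/code&gt; also hides paths that exist and are merely restricted, so consider reviewing them in a separate pass. Setting threads to 50 speeds up the scan on stable targets; lower the count if the service slows down or starts returning errors.&lt;/p&gt;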

&lt;p&gt;&lt;strong&gt;Remediation:&lt;/strong&gt; Remove internal test endpoints like &lt;code&gt;/cookie-test&lt;/code&gt; from production. Use a single sitemap path and redirect alternates to avoid confusion.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Deep Scan with File Extension Checking
&lt;/h2&gt;

&lt;p&gt;You can add file extension checking to find backup files, config files, and other sensitive file types. This multiplies your test cases and gives wider coverage.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gobuster &lt;span class="nb"&gt;dir&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt; http://&amp;lt;target-ip&amp;gt;/ &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-w&lt;/span&gt; /usr/share/wordlists/dirbuster/directory-list-2.3-medium.txt &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-x&lt;/span&gt; txt,json,bak,zip,md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;-x&lt;/code&gt; flag adds each extension to every wordlist entry. For example, &lt;code&gt;backup&lt;/code&gt; becomes &lt;code&gt;backup.txt&lt;/code&gt;, &lt;code&gt;backup.json&lt;/code&gt;, &lt;code&gt;backup.bak&lt;/code&gt;, and so on. This helps you find files that directory-only scans will miss.&lt;/p&gt;
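
&lt;p&gt;The expansion can be sketched in Python as follows. This is an illustration of the concept rather than Gobuster's code, and the function name is an assumption. Each word is tried as-is and then once per extension:&lt;/p&gt;

```python
def expand_with_extensions(words, extensions):
    """Yield each word as-is, then once per extension (word.ext)."""
    for word in words:
        yield word
        for ext in extensions:
            yield f"{word}.{ext.lstrip('.')}"


if __name__ == "__main__":
    extensions = ["txt", "json", "bak", "zip", "md"]
    for candidate in expand_with_extensions(["backup"], extensions):
        print(candidate)
```

&lt;p&gt;With five extensions, every wordlist entry produces six candidates, which is why extension scans take noticeably longer than directory-only scans.&lt;/p&gt;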

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fipbtptnp3qe7i9bdzd28.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fipbtptnp3qe7i9bdzd28.png" alt=" " width="800" height="379"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This scan found 1 critical path:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Path&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/tmp.zip&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;Critical, archive file exposed on web root&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;With &lt;code&gt;-x&lt;/code&gt;, Gobuster requests each wordlist entry on its own and once per extension, so the five extensions above turn every entry into six requests. The scan takes correspondingly longer, but finding &lt;code&gt;/tmp.zip&lt;/code&gt; shows why extension scanning matters. Backup and temp files left in web-accessible paths are a common issue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remediation:&lt;/strong&gt; Remove all backup and archive files from the web root. Use deployment scripts that clean up temp files. Store backups in secure locations outside the web server.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Findings
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Path&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;th&gt;Risk&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/development.log&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;High, may contain credentials or stack traces&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/tmp.zip&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;High, archive with unknown contents exposed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/private&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;301&lt;/td&gt;
&lt;td&gt;Medium, restricted area worth investigating&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/customers&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;302&lt;/td&gt;
&lt;td&gt;Medium, potential user data area&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/cookie-test&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;Low-Medium, exposes internal test endpoint&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/robots.txt&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;Informational, reveals disallowed paths&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/sitemap.xml&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;Informational, additional path disclosure&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Small wordlists are fast but miss many paths. You should layer scans with bigger wordlists to get better coverage.&lt;/li&gt;
&lt;li&gt;File extension scanning with &lt;code&gt;-x&lt;/code&gt; helps you find backup and archive files such as &lt;code&gt;.bak&lt;/code&gt; and &lt;code&gt;.zip&lt;/code&gt;, as well as leaked config files.&lt;/li&gt;
&lt;li&gt;Filtering noise with &lt;code&gt;-b&lt;/code&gt;, the status-code blacklist, gives cleaner output for faster review.&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;200&lt;/code&gt; response means the content is accessible. &lt;code&gt;301&lt;/code&gt; and &lt;code&gt;302&lt;/code&gt; mean redirects worth following. Even &lt;code&gt;403&lt;/code&gt; confirms a path exists.&lt;/li&gt;
&lt;li&gt;Exposed files like &lt;code&gt;development.log&lt;/code&gt; and &lt;code&gt;tmp.zip&lt;/code&gt; are real-world issues you will often see in penetration tests.&lt;/li&gt;
&lt;/ul&gt;
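
&lt;p&gt;The status code guidance above can be turned into a small triage helper. This is a rough sketch with hypothetical labels, not a complete classification of HTTP status codes:&lt;/p&gt;

```python
def triage(status_code):
    """Map an HTTP status code to a rough follow-up action."""
    if status_code == 200:
        return "accessible: review the content"
    if status_code in (301, 302):
        return "redirect: follow it and note the destination"
    if status_code == 403:
        return "forbidden: the path exists but access is restricted"
    if status_code == 404:
        return "not found: ignore"
    return "other: inspect manually"


if __name__ == "__main__":
    findings = {"/development.log": 200, "/private": 301, "/customers": 302}
    for path, status in findings.items():
        print(path, status, triage(status))
```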

&lt;p&gt;You can use these steps on your own assessments to find hidden content and improve your security posture. Always stay within scope and handle any sensitive data you find with care.&lt;/p&gt;




&lt;p&gt;If you found this helpful, share it with someone learning security. If you have questions or saw different results in your own lab, leave a comment below or connect with me on &lt;a href="https://www.linkedin.com/in/jer-catallo/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>cybersecurity</category>
      <category>tutorial</category>
      <category>todayilearned</category>
      <category>security</category>
    </item>
    <item>
      <title>OSINT Content Discovery: Why You Need to Know What's Publicly Exposed About Your Web Assets</title>
      <dc:creator>Jer Catallo</dc:creator>
      <pubDate>Thu, 07 May 2026 12:15:00 +0000</pubDate>
      <link>https://forem.com/jer_catallo/how-to-find-hidden-web-contents-using-passive-osint-techniques-3kf1</link>
      <guid>https://forem.com/jer_catallo/how-to-find-hidden-web-contents-using-passive-osint-techniques-3kf1</guid>
      <description>&lt;p&gt;Passive content discovery helps you map attack surfaces with little or no direct interaction with target systems. You can use public search engines, browser extensions, web archives, code repositories, and cloud storage references to find exposed assets. This guide covers five methods you can apply in your own authorized security assessments.&lt;/p&gt;

&lt;h2&gt;
  
  
  Ethical Considerations
&lt;/h2&gt;

&lt;p&gt;Only use these methods on assets you own or have clear written permission to test.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Get written permission before you target any domain, repo, or cloud resource.&lt;/li&gt;
&lt;li&gt;Follow all laws, platform terms, and bug bounty scope rules.&lt;/li&gt;
&lt;li&gt;Do not try to access accounts, use found credentials, steal data, or leave backdoors.&lt;/li&gt;
&lt;li&gt;Stop and report right away if you find sensitive data.&lt;/li&gt;
&lt;li&gt;Do not proceed if you are not sure about your authorization.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Google Dorking
&lt;/h2&gt;

&lt;p&gt;Google search operators let you filter results to specific domains, file types, URL paths, and page titles. These operators are passive and use only public indexed data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Use &lt;code&gt;site:&lt;/code&gt; to Scope Your Search
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;site:&lt;/code&gt; operator limits results to one domain or hosting platform.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;site:&amp;lt;target-domain&amp;gt; "&amp;lt;keyword&amp;gt;"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This query shows only pages from the target domain that contain your keyword. You can use it to find public pages hosted on a specific platform.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmtdpl1it6ckfwdnf7f68.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmtdpl1it6ckfwdnf7f68.png" alt=" " width="800" height="1191"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can see indexed GitHub Pages sites that match the keyword. This shows how &lt;code&gt;site:&lt;/code&gt; limits search to one hosting domain.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Use &lt;code&gt;filetype:&lt;/code&gt; to Find Exposed Documents
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;filetype:&lt;/code&gt; operator filters results by file extension.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"&amp;lt;target-phrase&amp;gt;" filetype:&amp;lt;extension&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This query finds indexed files of a specific type that contain your target phrase. You can use it to map exposed documents and artifacts.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F13i2tw7h8meeibthqi7v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F13i2tw7h8meeibthqi7v.png" alt=" " width="800" height="1255"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can see public Jupyter notebooks that may hold code, data samples, or analysis work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remediation:&lt;/strong&gt; Treat found documents as sensitive even if they are public. Do not copy or share private content. Report exposure through approved channels only.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Use &lt;code&gt;inurl:&lt;/code&gt; for Path-Based Discovery
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;inurl:&lt;/code&gt; operator targets pages with specific words in the URL path.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;inurl:&amp;lt;path-keyword&amp;gt; "&amp;lt;target-phrase&amp;gt;"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This query finds pages with your keyword in the URL path. You can use it to find specific page types.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs6dhdat20hyn0yrczo43.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs6dhdat20hyn0yrczo43.png" alt=" " width="800" height="1107"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can see personal or professional about pages that give more context about the target.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remediation:&lt;/strong&gt; Avoid personal targeting, doxxing, or profiling. Collect only the data you need for your security task.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Use &lt;code&gt;intitle:&lt;/code&gt; for Title-Based Discovery
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;intitle:&lt;/code&gt; operator matches pages with specific text in the HTML title tag.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;intitle:"&amp;lt;title-text&amp;gt;" "&amp;lt;keyword1&amp;gt;" "&amp;lt;keyword2&amp;gt;"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This query finds pages with your text in the title plus extra keywords. You can use it to find project pages tied to certain technologies.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk116nxfvl7bt0cptp3hi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk116nxfvl7bt0cptp3hi.png" alt=" " width="800" height="1160"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can see developer portfolio pages that list their technology stack in the page title.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remediation:&lt;/strong&gt; Keep searches within approved scope. Do not use findings to target hobby or student projects.&lt;/p&gt;
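
&lt;p&gt;The four operators above can be combined in a single query. A minimal Python sketch that assembles the query string; the function name &lt;code&gt;build_dork&lt;/code&gt; and the sample values are assumptions, and the code only builds text without sending any search:&lt;/p&gt;

```python
def build_dork(site=None, filetype=None, inurl=None, intitle=None, phrases=()):
    """Assemble a search query from optional operators and quoted phrases."""
    parts = []
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    if inurl:
        parts.append(f"inurl:{inurl}")
    if intitle:
        parts.append(f'intitle:"{intitle}"')
    parts.extend(f'"{p}"' for p in phrases)
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical in-scope domain and keyword.
    print(build_dork(site="example.com", filetype="pdf", phrases=["internal"]))
```

&lt;p&gt;Paste the resulting string into the search box and keep every query within your approved scope.&lt;/p&gt;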

&lt;h2&gt;
  
  
  Wappalyzer Technology Fingerprinting
&lt;/h2&gt;

&lt;p&gt;Wappalyzer detects web technologies from the browser. It reads HTTP headers, HTML, JavaScript files, and loaded resources to identify frameworks, CDNs, and services.&lt;/p&gt;
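
&lt;p&gt;To make the idea concrete, here is a toy Python sketch of signature-based fingerprinting. It is not how Wappalyzer is implemented; the signature table, the function name, and the sample inputs are all assumptions chosen for illustration. It looks for known substrings in response headers and page source:&lt;/p&gt;

```python
# Hypothetical signature table: (technology, where to look, substring to match).
SIGNATURES = [
    ("Express", "header:x-powered-by", "express"),
    ("nginx", "header:server", "nginx"),
    ("Angular", "html", "ng-version="),
    ("jQuery", "html", "jquery"),
]


def fingerprint(headers, html):
    """Return the technologies whose signature appears in the headers or HTML."""
    lowered = {k.lower(): v.lower() for k, v in headers.items()}
    body = html.lower()
    found = []
    for tech, source, needle in SIGNATURES:
        if source == "html":
            haystack = body
        else:
            haystack = lowered.get(source.split(":", 1)[1], "")
        if needle in haystack:
            found.append(tech)
    return found


if __name__ == "__main__":
    sample_headers = {"Server": "nginx/1.18.0", "X-Powered-By": "Express"}
    sample_html = 'script src="/js/jquery.min.js"'
    print(fingerprint(sample_headers, sample_html))
```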

&lt;h3&gt;
  
  
  Step 5: Fingerprint OWASP Juice Shop Stack
&lt;/h3&gt;

&lt;p&gt;Open the target URL in a browser with Wappalyzer installed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://juice-shop.github.io
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The extension scans the page and shows detected technologies in its panel.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvh8w4v5x4bv91jy8o5pv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvh8w4v5x4bv91jy8o5pv.png" alt=" " width="800" height="636"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Wappalyzer found front-end libraries, CDN providers, and hosting indicators on &lt;code&gt;juice-shop.github.io&lt;/code&gt;. You can use this stack data to plan your next assessment steps.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remediation:&lt;/strong&gt; Only fingerprint where recon is allowed. Do not assume you can attack just because you see stack details. Use this data for defensive testing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 6: Analyze GitHub Technology Profile
&lt;/h3&gt;

&lt;p&gt;Apply Wappalyzer to large platforms to see their technology footprint.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://github.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The extension finds frameworks, analytics tools, CDN providers, and cloud services.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flfrmnegdj1kq5d6855tg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flfrmnegdj1kq5d6855tg.png" alt=" " width="800" height="714"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Wappalyzer found React, React Router, GSAP, AWS-related services, and more on &lt;code&gt;github.com&lt;/code&gt;. This shows how fingerprinting works on large applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remediation:&lt;/strong&gt; Follow platform terms and rate limits. Do not scrape data in an abusive way. Use collected data only for authorized tasks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wayback Machine Archive Analysis
&lt;/h2&gt;

&lt;p&gt;The Wayback Machine stores historical snapshots of web pages. You can use it to find old URLs, retired endpoints, and content versions no longer on the live site.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 7: Search for Historical Snapshots
&lt;/h3&gt;

&lt;p&gt;Enter the target domain into the Wayback Machine search.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://web.archive.org/web/*/&amp;lt;target-domain&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Browse the calendar timeline to see archived snapshots from different dates.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpemit6u1bdcg4xgr8vks.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpemit6u1bdcg4xgr8vks.png" alt=" " width="800" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can use this as your entry point for historical content analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remediation:&lt;/strong&gt; Archived content does not authorize you to test current systems. Do not use archived findings to access restricted areas without approval. Check ownership and scope before you test any found endpoint.&lt;/p&gt;
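
&lt;p&gt;Besides the calendar view, the Wayback Machine exposes a CDX index that can list archived URLs for a domain. The Python sketch below only builds the query URL and does not send it. The parameter choices (&lt;code&gt;fl&lt;/code&gt;, &lt;code&gt;collapse&lt;/code&gt;, &lt;code&gt;limit&lt;/code&gt;) reflect my understanding of the CDX API and should be checked against its documentation; the function name and &lt;code&gt;example.com&lt;/code&gt; are assumptions.&lt;/p&gt;

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(domain, limit=50):
    """Build a CDX query URL listing unique archived URLs under a domain."""
    params = {
        "url": f"{domain}/*",     # everything under the domain
        "output": "json",         # JSON rows instead of plain text
        "fl": "original",         # return only the original URL field
        "collapse": "urlkey",     # one row per unique URL
        "limit": str(limit),      # cap the number of rows
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical in-scope domain.
    print(cdx_query_url("example.com", limit=20))
```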

&lt;h2&gt;
  
  
  GitHub OSINT
&lt;/h2&gt;

&lt;p&gt;GitHub search helps you find code references, config files, and metadata. Public repos often contain clues about infrastructure, dependencies, and potential misconfigurations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 8: Search GitHub for Target Artifacts
&lt;/h3&gt;

&lt;p&gt;Use the GitHub search page with targeted queries.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://github.com/search?q=&amp;lt;target-query&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can find repos, code snippets, and config files in your assessment scope.&lt;/p&gt;
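
&lt;p&gt;If you run many searches, building the URL in code keeps the encoding consistent. A minimal Python sketch; the function name is hypothetical, and the &lt;code&gt;type&lt;/code&gt; parameter is assumed to select the result tab on the GitHub search page:&lt;/p&gt;

```python
from urllib.parse import urlencode


def github_search_url(query, search_type="code"):
    """Build a github.com search URL for a query string."""
    params = {"q": query, "type": search_type}
    return "https://github.com/search?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical query for an in-scope organization name.
    print(github_search_url("example-org config"))
```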

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm6lmxc6imgfwxnu0wpk3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm6lmxc6imgfwxnu0wpk3.png" alt=" " width="800" height="432"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 9: Use GitHub Dork Patterns for Credential Discovery
&lt;/h3&gt;

&lt;p&gt;Organization-scoped searches limit results to one company's public repos.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;org:&amp;lt;company-name&amp;gt; &amp;lt;secret-keyword&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add keywords like "password", "token", "api_key", or "secret" to check for credential exposure.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="o"&gt;=========================================================&lt;/span&gt;
&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;GITHUB&lt;/span&gt; &lt;span class="nx"&gt;OSINT&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;HIGH&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;VALUE&lt;/span&gt; &lt;span class="nx"&gt;TARGET&lt;/span&gt; &lt;span class="nx"&gt;DORKS&lt;/span&gt;
&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="o"&gt;=========================================================&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="o"&gt;---&lt;/span&gt; &lt;span class="nx"&gt;Cloud&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;Infrastructure&lt;/span&gt; &lt;span class="nx"&gt;Secrets&lt;/span&gt; &lt;span class="o"&gt;---&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Searches&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;AWS&lt;/span&gt; &lt;span class="nx"&gt;Access&lt;/span&gt; &lt;span class="nx"&gt;Key&lt;/span&gt; &lt;span class="nx"&gt;IDs&lt;/span&gt; &lt;span class="nx"&gt;within&lt;/span&gt; &lt;span class="nx"&gt;PEM&lt;/span&gt; &lt;span class="nx"&gt;certificate&lt;/span&gt; &lt;span class="nx"&gt;files&lt;/span&gt;
&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;AKIA&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;extension&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nx"&gt;pem&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Locates&lt;/span&gt; &lt;span class="nx"&gt;exposed&lt;/span&gt; &lt;span class="nx"&gt;AWS&lt;/span&gt; &lt;span class="nx"&gt;credential&lt;/span&gt; &lt;span class="nx"&gt;configuration&lt;/span&gt; &lt;span class="nx"&gt;files&lt;/span&gt;
&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nx"&gt;credentials&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Finds&lt;/span&gt; &lt;span class="nx"&gt;unprotected&lt;/span&gt; &lt;span class="nx"&gt;SSH&lt;/span&gt; &lt;span class="kr"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;keys&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;server&lt;/span&gt; &lt;span class="nx"&gt;access&lt;/span&gt;
&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;BEGIN OPENSSH PRIVATE KEY&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nx"&gt;id_rsa&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Discovers&lt;/span&gt; &lt;span class="nx"&gt;Google&lt;/span&gt; &lt;span class="nx"&gt;Cloud&lt;/span&gt; &lt;span class="nc"&gt;Platform &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;GCP&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;service&lt;/span&gt; &lt;span class="nx"&gt;account&lt;/span&gt; &lt;span class="nx"&gt;credentials&lt;/span&gt;
&lt;span class="nx"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nx"&gt;config&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;google_application_credentials&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;


&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="o"&gt;---&lt;/span&gt; &lt;span class="nx"&gt;Database&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;Authentication&lt;/span&gt; &lt;span class="nx"&gt;Leaks&lt;/span&gt; &lt;span class="o"&gt;---&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Finds&lt;/span&gt; &lt;span class="nx"&gt;hardcoded&lt;/span&gt; &lt;span class="nx"&gt;MongoDB&lt;/span&gt; &lt;span class="nx"&gt;connection&lt;/span&gt; &lt;span class="nx"&gt;strings&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="nx"&gt;JavaScript&lt;/span&gt; &lt;span class="nx"&gt;files&lt;/span&gt;
&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;mongodb+srv://&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;extension&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nx"&gt;js&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Searches&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;Java&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;MySQL&lt;/span&gt; &lt;span class="nx"&gt;database&lt;/span&gt; &lt;span class="nx"&gt;connection&lt;/span&gt; &lt;span class="nx"&gt;strings&lt;/span&gt; &lt;span class="kd"&gt;with&lt;/span&gt; &lt;span class="nx"&gt;passwords&lt;/span&gt;
&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;jdbc:mysql://&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;password&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Locates&lt;/span&gt; &lt;span class="nx"&gt;WordPress&lt;/span&gt; &lt;span class="nx"&gt;configuration&lt;/span&gt; &lt;span class="nx"&gt;files&lt;/span&gt; &lt;span class="nx"&gt;containing&lt;/span&gt; &lt;span class="nx"&gt;database&lt;/span&gt; &lt;span class="nx"&gt;passwords&lt;/span&gt;
&lt;span class="nx"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nx"&gt;wp&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;php&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;DB_PASSWORD&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Finds&lt;/span&gt; &lt;span class="nx"&gt;PostgreSQL&lt;/span&gt; &lt;span class="nx"&gt;password&lt;/span&gt; &lt;span class="nx"&gt;files&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt; &lt;span class="nx"&gt;database&lt;/span&gt; &lt;span class="nx"&gt;instances&lt;/span&gt;
&lt;span class="nx"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;:.&lt;/span&gt;&lt;span class="nx"&gt;pgpass&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;localhost:5432&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;


&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="o"&gt;---&lt;/span&gt; &lt;span class="nx"&gt;API&lt;/span&gt; &lt;span class="nx"&gt;Keys&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;Tokens&lt;/span&gt; &lt;span class="o"&gt;---&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Hunts&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;hardcoded&lt;/span&gt; &lt;span class="nx"&gt;Bearer&lt;/span&gt; &lt;span class="nx"&gt;authentication&lt;/span&gt; &lt;span class="nx"&gt;tokens&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="nx"&gt;Python&lt;/span&gt; &lt;span class="nx"&gt;scripts&lt;/span&gt;
&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;authorization: bearer&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;extension&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nx"&gt;py&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Locates&lt;/span&gt; &lt;span class="nx"&gt;exposed&lt;/span&gt; &lt;span class="nx"&gt;Django&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;Python&lt;/span&gt; &lt;span class="nx"&gt;web&lt;/span&gt; &lt;span class="nx"&gt;framework&lt;/span&gt; &lt;span class="nx"&gt;secret&lt;/span&gt; &lt;span class="nx"&gt;keys&lt;/span&gt;
&lt;span class="nx"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nx"&gt;settings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;py&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;SECRET_KEY=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Finds&lt;/span&gt; &lt;span class="nx"&gt;live&lt;/span&gt; &lt;span class="nx"&gt;Stripe&lt;/span&gt; &lt;span class="nx"&gt;payment&lt;/span&gt; &lt;span class="nx"&gt;processing&lt;/span&gt; &lt;span class="nx"&gt;API&lt;/span&gt; &lt;span class="nx"&gt;keys&lt;/span&gt;
&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;api.stripe.com&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;sk_live_&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Discovers&lt;/span&gt; &lt;span class="nx"&gt;exposed&lt;/span&gt; &lt;span class="nx"&gt;Slack&lt;/span&gt; &lt;span class="nx"&gt;webhook&lt;/span&gt; &lt;span class="nx"&gt;URLs&lt;/span&gt;
&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;hooks.slack.com/services/&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;extension&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nx"&gt;js&lt;/span&gt;


&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="o"&gt;---&lt;/span&gt; &lt;span class="nx"&gt;Targeted&lt;/span&gt; &lt;span class="nx"&gt;Corporate&lt;/span&gt; &lt;span class="nx"&gt;Recon&lt;/span&gt; &lt;span class="o"&gt;---&lt;/span&gt;
&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;Replace&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;companyname&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="kd"&gt;with&lt;/span&gt; &lt;span class="nx"&gt;your&lt;/span&gt; &lt;span class="nx"&gt;target&lt;/span&gt; &lt;span class="nx"&gt;organization&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Searches&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="nx"&gt;specific&lt;/span&gt; &lt;span class="nx"&gt;organization&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;s repos for internal Jira passwords
org:companyname "jira_password"

# Finds Atlassian/Confluence access tokens for a specific target domain
"companyname.atlassian.net" "token"

# Locates terminal history files showing SSH connections to a target
filename:.bash_history "ssh user@companyname"

# Discovers internal corporate network routing or configuration files
"corp.companyname.internal" extension:conf
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The cheat sheet above groups high-value search patterns for cloud credentials, database leaks, and API tokens. It also includes org-scoped searches like &lt;code&gt;org:companyname&lt;/code&gt; for focused recon.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remediation:&lt;/strong&gt; Only use these searches in authorized training, internal audits, or approved bug bounty scopes. Never use any secrets or credentials you find. Report exposed credentials through approved incident channels right away.&lt;/p&gt;
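
&lt;p&gt;The same search strings can be pointed at your own code before it is pushed. The sketch below is a minimal example of that idea, not a replacement for a dedicated secret scanner. The function name &lt;code&gt;scan_text&lt;/code&gt; and the pattern labels are hypothetical, and the &lt;code&gt;AKIA&lt;/code&gt; length rule (prefix plus 16 uppercase letters or digits) is an assumption based on the commonly documented access key ID format.&lt;/p&gt;

```python
import re

# Patterns adapted from the search strings in the cheat sheet above.
# Assumption: an AWS access key ID is "AKIA" plus 16 uppercase letters or digits.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "openssh_private_key": re.compile(r"BEGIN OPENSSH PRIVATE KEY"),
    "mongodb_srv_uri": re.compile(r"mongodb\+srv://"),
    "stripe_live_key": re.compile(r"sk_live_"),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) pairs for every matching line."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings


if __name__ == "__main__":
    # Made-up file content; the key is the placeholder used in AWS documentation.
    sample = "\n".join([
        "DEBUG = True",
        "AWS_ACCESS_KEY_ID = 'AKIAIOSFODNN7EXAMPLE'",
        "DB_URI = 'mongodb+srv://user:pass@cluster.example.net/app'",
    ])
    for line_number, name in scan_text(sample):
        print(f"line {line_number}: {name}")
```

&lt;p&gt;Running a check like this in a pre-commit hook or CI job can catch a hardcoded key before it ever becomes searchable.&lt;/p&gt;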

&lt;h2&gt;
  
  
  S3 Bucket Discovery
&lt;/h2&gt;

&lt;p&gt;Amazon S3 buckets often show up in public references through naming patterns, source code, and config files. You can find them using search operators and verify access with the AWS CLI.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 10: Find S3 Buckets Through Public References
&lt;/h3&gt;

&lt;p&gt;Search for public S3 bucket references with Google dorking.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;site:s3.amazonaws.com "&amp;lt;target-company&amp;gt;"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also search GitHub for bucket names in source code and config files.&lt;/p&gt;
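
&lt;p&gt;Search results usually give you full URLs, not bucket names. The sketch below pulls the bucket name out of the two URL shapes that appear in this article: &lt;code&gt;bucket.s3.amazonaws.com&lt;/code&gt; and &lt;code&gt;s3.amazonaws.com/bucket&lt;/code&gt;. The helper name &lt;code&gt;bucket_from_url&lt;/code&gt; is hypothetical, and region-specific hostnames are deliberately not handled.&lt;/p&gt;

```python
from typing import Optional
from urllib.parse import urlparse

S3_HOST = "s3.amazonaws.com"


def bucket_from_url(url: str) -> Optional[str]:
    """Extract a bucket name from a global-endpoint S3 URL, or return None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == S3_HOST:
        # Path-style URL: the bucket is the first path segment.
        first_segment = parsed.path.lstrip("/").split("/", 1)[0]
        return first_segment or None
    suffix = "." + S3_HOST
    if host.endswith(suffix):
        # Virtual-hosted-style URL: the bucket is the hostname prefix.
        return host[: -len(suffix)] or None
    return None


if __name__ == "__main__":
    for url in (
        "https://companyname-assets.s3.amazonaws.com/img/logo.png",
        "https://s3.amazonaws.com/companyname-backup/db_dump.sql",
        "https://example.com/index.html",
    ):
        print(url, "->", bucket_from_url(url))
```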

&lt;h3&gt;
  
  
  Step 11: Check Bucket Access with AWS CLI
&lt;/h3&gt;

&lt;p&gt;Use the AWS CLI to check if a bucket allows public listing without credentials.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3 &lt;span class="nb"&gt;ls &lt;/span&gt;s3://&amp;lt;bucket-name&amp;gt; &lt;span class="nt"&gt;--no-sign-request&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Get the bucket ACL to see access permissions. A successful response means anonymous users can read the ACL itself; check the returned grants to see which permissions the bucket gives to everyone.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3api get-bucket-acl &lt;span class="nt"&gt;--bucket&lt;/span&gt; &amp;lt;bucket-name&amp;gt; &lt;span class="nt"&gt;--no-sign-request&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
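
&lt;p&gt;The ACL comes back as JSON, so the grants can be checked programmatically. The sketch below assumes the usual &lt;code&gt;get-bucket-acl&lt;/code&gt; output shape (a &lt;code&gt;Grants&lt;/code&gt; list whose entries carry a &lt;code&gt;Grantee&lt;/code&gt; and a &lt;code&gt;Permission&lt;/code&gt;) and looks for grants to the AllUsers group. The function name and the sample document are made up for illustration.&lt;/p&gt;

```python
import json

# Group URI that S3 ACLs use for "everyone, including anonymous users".
ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"


def public_permissions(acl_json: str) -> list[str]:
    """Return the sorted permissions granted to the AllUsers group."""
    acl = json.loads(acl_json)
    permissions = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if grantee.get("URI") == ALL_USERS_URI:
            permissions.append(grant.get("Permission", ""))
    return sorted(permissions)


if __name__ == "__main__":
    # Hypothetical ACL document; the owner ID is a placeholder.
    sample_acl = json.dumps({
        "Owner": {"ID": "example-owner-id"},
        "Grants": [
            {"Grantee": {"Type": "CanonicalUser", "ID": "example-owner-id"},
             "Permission": "FULL_CONTROL"},
            {"Grantee": {"Type": "Group", "URI": ALL_USERS_URI},
             "Permission": "READ"},
        ],
    })
    print(public_permissions(sample_acl))
```

&lt;p&gt;An empty list means the ACL grants nothing to anonymous users, although a bucket policy can still allow public access separately.&lt;/p&gt;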





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="o"&gt;=========================================================&lt;/span&gt;
&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;S3&lt;/span&gt; &lt;span class="nx"&gt;BUCKET&lt;/span&gt; &lt;span class="nx"&gt;OSINT&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;RECONNAISSANCE&lt;/span&gt;
&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="o"&gt;=========================================================&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="o"&gt;---&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="nx"&gt;Passive&lt;/span&gt; &lt;span class="nc"&gt;Discovery &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;Google&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;GitHub&lt;/span&gt; &lt;span class="nx"&gt;Dorks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;---&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Google&lt;/span&gt; &lt;span class="nx"&gt;Dork&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Finds&lt;/span&gt; &lt;span class="nx"&gt;publicly&lt;/span&gt; &lt;span class="nx"&gt;indexed&lt;/span&gt; &lt;span class="nx"&gt;S3&lt;/span&gt; &lt;span class="nx"&gt;buckets&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="nx"&gt;target&lt;/span&gt;
&lt;span class="nx"&gt;site&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nx"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;amazonaws&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;com&lt;/span&gt; &lt;span class="nx"&gt;intitle&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;index of&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;companyname&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Google&lt;/span&gt; &lt;span class="nx"&gt;Dork&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Searches&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;exposed&lt;/span&gt; &lt;span class="nx"&gt;bucket&lt;/span&gt; &lt;span class="nx"&gt;URLs&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="nx"&gt;target&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;s documents
"s3.amazonaws.com" ext:pdf "companyname"

# GitHub Dork: Locates bucket URLs hardcoded in a company&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt; &lt;span class="nx"&gt;repositories&lt;/span&gt;
&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;s3.amazonaws.com&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;org&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nx"&gt;companyname&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;GitHub&lt;/span&gt; &lt;span class="nx"&gt;Dork&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Finds&lt;/span&gt; &lt;span class="nx"&gt;custom&lt;/span&gt; &lt;span class="nx"&gt;S3&lt;/span&gt; &lt;span class="nx"&gt;endpoints&lt;/span&gt; &lt;span class="nx"&gt;mapped&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="nx"&gt;target&lt;/span&gt; &lt;span class="nx"&gt;domain&lt;/span&gt;
&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;companyname.s3.amazonaws.com&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;


&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="o"&gt;---&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="nx"&gt;Active&lt;/span&gt; &lt;span class="nc"&gt;Enumeration &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;Brute&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;Force&lt;/span&gt; &lt;span class="nx"&gt;Naming&lt;/span&gt; &lt;span class="nx"&gt;Conventions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;---&lt;/span&gt;
&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Common&lt;/span&gt; &lt;span class="nx"&gt;permutations&lt;/span&gt; &lt;span class="nx"&gt;used&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="nx"&gt;automated&lt;/span&gt; &lt;span class="nf"&gt;tools &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.,&lt;/span&gt; &lt;span class="nx"&gt;ffuf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Gobuster&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Format&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;https&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="c1"&gt;//{target}-{keyword}.s3.amazonaws.com&lt;/span&gt;

&lt;span class="nx"&gt;companyname&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;assets&lt;/span&gt;
&lt;span class="nx"&gt;companyname&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="kr"&gt;public&lt;/span&gt;
&lt;span class="nx"&gt;companyname&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="kr"&gt;private&lt;/span&gt;
&lt;span class="nx"&gt;companyname&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;dev&lt;/span&gt;
&lt;span class="nx"&gt;companyname&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;backup&lt;/span&gt;
&lt;span class="nx"&gt;companyname&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;staging&lt;/span&gt;
&lt;span class="nx"&gt;companyname&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;prod&lt;/span&gt;
&lt;span class="nx"&gt;companyname&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;www&lt;/span&gt;


&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="o"&gt;---&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="nx"&gt;Access&lt;/span&gt; &lt;span class="nc"&gt;Verification &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;AWS&lt;/span&gt; &lt;span class="nx"&gt;CLI&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;---&lt;/span&gt;
&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Testing&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;insecure&lt;/span&gt; &lt;span class="nf"&gt;permissions &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;Requires&lt;/span&gt; &lt;span class="nx"&gt;AWS&lt;/span&gt; &lt;span class="nx"&gt;CLI&lt;/span&gt; &lt;span class="nx"&gt;installed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Attempt&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="nx"&gt;list&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;contents&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="nx"&gt;bucket&lt;/span&gt; &lt;span class="nf"&gt;anonymously &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;No&lt;/span&gt; &lt;span class="nx"&gt;credentials&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nx"&gt;aws&lt;/span&gt; &lt;span class="nx"&gt;s3&lt;/span&gt; &lt;span class="nx"&gt;ls&lt;/span&gt; &lt;span class="nx"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="c1"&gt;//companyname-assets --no-sign-request&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Attempt&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="nx"&gt;copy&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="nx"&gt;sensitive&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="kr"&gt;public&lt;/span&gt; &lt;span class="nx"&gt;bucket&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt; &lt;span class="nx"&gt;machine&lt;/span&gt;
&lt;span class="nx"&gt;aws&lt;/span&gt; &lt;span class="nx"&gt;s3&lt;/span&gt; &lt;span class="nx"&gt;cp&lt;/span&gt; &lt;span class="nx"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="c1"&gt;//companyname-backup/db_dump.sql . --no-sign-request&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="nx"&gt;Attempt&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="nx"&gt;write&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="nx"&gt;harmless&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="nx"&gt;test&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;insecure&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Write&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;permissions&lt;/span&gt;
&lt;span class="nx"&gt;aws&lt;/span&gt; &lt;span class="nx"&gt;s3&lt;/span&gt; &lt;span class="nx"&gt;cp&lt;/span&gt; &lt;span class="nx"&gt;test_file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;txt&lt;/span&gt; &lt;span class="nx"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="c1"&gt;//companyname-public/ --no-sign-request&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The cheat sheet above shows passive discovery patterns for S3 references using Google and GitHub queries. It also includes naming permutation examples and CLI commands to check bucket permissions.&lt;/p&gt;
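
&lt;p&gt;The naming permutations are easy to generate from a keyword list. The sketch below only builds strings in the &lt;code&gt;{target}-{keyword}.s3.amazonaws.com&lt;/code&gt; format from the cheat sheet and sends no requests. The function name &lt;code&gt;candidate_bucket_urls&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
# Keywords copied from the permutation list in the cheat sheet.
KEYWORDS = ["assets", "public", "private", "dev", "backup", "staging", "prod", "www"]


def candidate_bucket_urls(target: str, keywords=KEYWORDS) -> list[str]:
    """Build candidate URLs in the {target}-{keyword}.s3.amazonaws.com format."""
    target = target.strip().lower()
    return [f"https://{target}-{keyword}.s3.amazonaws.com" for keyword in keywords]


if __name__ == "__main__":
    for url in candidate_bucket_urls("companyname"):
        print(url)
```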

&lt;p&gt;&lt;strong&gt;Remediation:&lt;/strong&gt; Cloud enumeration needs explicit permission from the asset owner. Do not list, download, upload, or change bucket content unless you have written authorization. If you find an exposed bucket, stop testing and report it with minimal proof.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;You now have five content discovery methods you can use in authorized assessments. Google dorking helps you map indexed content with targeted operators. Wappalyzer gives fast technology stack details. The Wayback Machine reveals historical web data and retired endpoints. GitHub OSINT uncovers code references and config metadata. S3 recon shows cloud storage discovery patterns. Most of these methods are passive, but the bucket name permutations and AWS CLI checks send requests to the target's storage. Only use any of them within authorized scopes with clear permission.&lt;/p&gt;




&lt;p&gt;If you found this helpful, drop a like and share it with someone learning security. If you have questions, ran into something different in your own lab, or want to share your results, leave a comment below. Always happy to connect and talk about security, recon techniques, or anything AppSec related.&lt;/p&gt;

&lt;p&gt;Feel free to connect with me on &lt;a href="https://www.linkedin.com/in/jer-catallo/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;&lt;/p&gt;


</description>
      <category>cybersecurity</category>
      <category>tutorial</category>
      <category>todayilearned</category>
      <category>security</category>
    </item>
    <item>
      <title>Manual Web Content Discovery: How You Can Find Hidden Paths Before Attackers Do</title>
      <dc:creator>Jer Catallo</dc:creator>
      <pubDate>Mon, 04 May 2026 12:05:00 +0000</pubDate>
      <link>https://forem.com/jer_catallo/how-to-find-hidden-web-content-using-manual-reconnaissance-5bm</link>
      <guid>https://forem.com/jer_catallo/how-to-find-hidden-web-content-using-manual-reconnaissance-5bm</guid>
      <description>&lt;p&gt;Manual content discovery is a core skill in application security testing. Instead of relying only on automated scanners, you can use simple HTTP requests and browser tools to find exposed files, hidden paths, and technology fingerprints. This article covers techniques like checking &lt;code&gt;robots.txt&lt;/code&gt;, fingerprinting favicons, reading &lt;code&gt;sitemap.xml&lt;/code&gt;, inspecting HTTP headers, and spotting framework markers in HTML source.&lt;/p&gt;

&lt;p&gt;These methods help you understand a target's structure and find information disclosure issues early, before running heavy scanning tools.&lt;/p&gt;

&lt;h3&gt;
  
  
  Ethical Considerations
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Only test systems you own or have explicit written permission to assess.&lt;/li&gt;
&lt;li&gt;Follow the defined scope, timing, and rules of engagement set by the owner.&lt;/li&gt;
&lt;li&gt;Stop immediately if you find data outside scope and report it through approved channels.&lt;/li&gt;
&lt;li&gt;Use findings for defense and remediation, not exploitation.&lt;/li&gt;
&lt;li&gt;Treat discovered paths like admin or staff portals as sensitive data. Do not brute-force or abuse them.&lt;/li&gt;
&lt;li&gt;Do not publish sensitive headers, tokens, or internal values outside approved reports.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Robots.txt Analysis
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;robots.txt&lt;/code&gt; file tells web crawlers which paths to avoid. It can accidentally reveal sensitive routes like admin panels or staff portals.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; https://&amp;lt;target-domain&amp;gt;/robots.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command fetches the robots.txt file so you can check &lt;code&gt;Disallow&lt;/code&gt; and &lt;code&gt;Allow&lt;/code&gt; directives for hidden paths.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyws6ueo9s63wtusulc3n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyws6ueo9s63wtusulc3n.png" alt=" " width="800" height="379"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The response shows a &lt;code&gt;Disallow: /staff-portal&lt;/code&gt; directive under &lt;code&gt;User-agent: *&lt;/code&gt;. This means the site owner does not want crawlers to crawl the staff portal, but the path is still visible to anyone who checks this file.&lt;/p&gt;
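
&lt;p&gt;When a &lt;code&gt;robots.txt&lt;/code&gt; file is long, it helps to pull the directives out programmatically. The sketch below parses text you have already fetched and collects the &lt;code&gt;Allow&lt;/code&gt; and &lt;code&gt;Disallow&lt;/code&gt; paths. The function name &lt;code&gt;parse_robots&lt;/code&gt; is hypothetical, and the sample mirrors the response described above.&lt;/p&gt;

```python
def parse_robots(text: str) -> dict[str, list[str]]:
    """Collect Allow and Disallow paths from robots.txt content."""
    rules = {"allow": [], "disallow": []}
    for raw_line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field = field.strip().lower()
        value = value.strip()
        if field in rules and value:
            rules[field].append(value)
    return rules


if __name__ == "__main__":
    sample = "User-agent: *\nDisallow: /staff-portal\n"
    print(parse_robots(sample))
```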

&lt;p&gt;&lt;strong&gt;Result&lt;/strong&gt;: The &lt;code&gt;/staff-portal&lt;/code&gt; route is exposed through robots.txt. While this does not mean the path is vulnerable, it gives you a starting point for further authorized testing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remediation&lt;/strong&gt;: Remove sensitive paths from &lt;code&gt;robots.txt&lt;/code&gt;. Use proper authentication and authorization controls to protect those routes instead. Security through obscurity is not a reliable protection.&lt;/p&gt;

&lt;h3&gt;
  
  
  Favicon Fingerprinting
&lt;/h3&gt;

&lt;p&gt;Favicons are small icons that browsers display in tabs. Different frameworks and products use unique favicon files, so you can calculate a hash and match it against known databases to identify the technology.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; https://&amp;lt;target-domain&amp;gt;/favicon.ico | &lt;span class="nb"&gt;md5sum&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This downloads the favicon and calculates its MD5 hash for comparison.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fskpvahzuvet828lv7gym.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fskpvahzuvet828lv7gym.png" alt=" " width="800" height="399"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The browser network tab confirms a successful HTTP 200 response for &lt;code&gt;favicon.ico&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F09m4a6mtifse579al0oo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F09m4a6mtifse579al0oo.png" alt=" " width="800" height="381"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The calculated MD5 hash is &lt;code&gt;f276b19aabcb4ae8cda4d22625c6735f&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd2zwqwgmkhq6tn16j6wu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd2zwqwgmkhq6tn16j6wu.png" alt=" " width="800" height="382"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Searching this hash in the OWASP favicon database returns a match for &lt;code&gt;cgiirc (0.5.9)&lt;/code&gt;.&lt;/p&gt;
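
&lt;p&gt;The hash-and-lookup step can also be sketched in Python. The example below hashes favicon bytes with MD5 and checks the digest against a small local table. The table holds only the one pair reported in this walkthrough, and the function names are hypothetical. A real lookup would use a full list such as the OWASP favicon database.&lt;/p&gt;

```python
import hashlib

# Single hash-to-product pair taken from this walkthrough.
KNOWN_FAVICONS = {"f276b19aabcb4ae8cda4d22625c6735f": "cgiirc (0.5.9)"}


def favicon_md5(data: bytes) -> str:
    """Return the hex MD5 digest of raw favicon bytes."""
    return hashlib.md5(data).hexdigest()


def identify(digest: str, known=KNOWN_FAVICONS) -> str:
    """Look up a digest in the local table, ignoring letter case."""
    return known.get(digest.lower(), "unknown")


if __name__ == "__main__":
    # Placeholder bytes; in practice read the downloaded favicon.ico file.
    digest = favicon_md5(b"not a real favicon")
    print(digest, identify(digest))
    print(identify("f276b19aabcb4ae8cda4d22625c6735f"))
```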

&lt;p&gt;&lt;strong&gt;Result&lt;/strong&gt;: The favicon hash maps to &lt;code&gt;cgiirc (0.5.9)&lt;/code&gt;, an IRC web client. This suggests the target may reuse assets from this product or run it in the background. You can use this information to check for known issues with this version.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remediation&lt;/strong&gt;: Replace default framework or third-party favicons with a custom one. This prevents passive technology identification through favicon hashing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Sitemap.xml Enumeration
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;sitemap.xml&lt;/code&gt; file lists pages that the site wants search engines to index. It often reveals old routes, API endpoints, or parameterized URLs you might not find through normal browsing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; https://&amp;lt;target-domain&amp;gt;/sitemap.xml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This retrieves the sitemap to find discoverable paths and endpoints.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5cr3vf2jlwnq9y4h1iwv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5cr3vf2jlwnq9y4h1iwv.png" alt=" " width="800" height="380"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The sitemap contains multiple URL entries including &lt;code&gt;/news/&lt;/code&gt;, &lt;code&gt;/contact&lt;/code&gt;, and parameterized article paths with sequential IDs like &lt;code&gt;news/article?id=1&lt;/code&gt;, &lt;code&gt;news/article?id=2&lt;/code&gt;, and &lt;code&gt;news/article?id=3&lt;/code&gt;.&lt;/p&gt;
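
&lt;p&gt;Sitemap entries can be extracted and grouped by query parameter to spot patterns like the sequential article IDs. The sketch below uses Python's standard XML parser. It builds a small sample sitemap in memory so it runs offline; the host &lt;code&gt;target.example&lt;/code&gt; and all function names are placeholders, not values from the target.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlparse

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sample_sitemap(urls: list[str]) -> str:
    """Create a minimal sitemap document for offline testing."""
    root = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        entry = ET.SubElement(root, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
    return ET.tostring(root, encoding="unicode")


def sitemap_locations(xml_text: str) -> list[str]:
    """Return the text of every loc element, with or without a namespace."""
    root = ET.fromstring(xml_text)
    return [
        element.text.strip()
        for element in root.iter()
        if element.tag.rsplit("}", 1)[-1] == "loc" and element.text
    ]


def query_parameters(urls: list[str]) -> dict[str, list[str]]:
    """Group query parameter values by parameter name."""
    found: dict[str, list[str]] = {}
    for url in urls:
        for name, values in parse_qs(urlparse(url).query).items():
            found.setdefault(name, []).extend(values)
    return found


if __name__ == "__main__":
    sample_urls = [
        "https://target.example/news/",
        "https://target.example/contact",
        "https://target.example/news/article?id=1",
        "https://target.example/news/article?id=2",
    ]
    locations = sitemap_locations(build_sample_sitemap(sample_urls))
    print(locations)
    print(query_parameters(locations))
```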

&lt;p&gt;&lt;strong&gt;Result&lt;/strong&gt;: The sitemap exposes several routes and a pattern for article IDs. You can use this to map out the content structure and check for IDOR or other parameter-based issues on these endpoints.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remediation&lt;/strong&gt;: Avoid listing sensitive or internal endpoints in &lt;code&gt;sitemap.xml&lt;/code&gt;. Only include public-facing, intended content. For parameterized URLs, validate and authorize each request server-side.&lt;/p&gt;

&lt;h3&gt;
  
  
  HTTP Header Inspection
&lt;/h3&gt;

&lt;p&gt;HTTP response headers contain metadata about the server, security configuration, and sometimes version information. Missing security headers or verbose server details can reveal weaknesses.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-I&lt;/span&gt; https://&amp;lt;target-domain&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This sends a HEAD request to get only the response headers without the full page body.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxpuxs14wl9e39fcj597t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxpuxs14wl9e39fcj597t.png" alt=" " width="800" height="381"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The headers show &lt;code&gt;Server: nginx/1.18.0 (Ubuntu)&lt;/code&gt; and a custom &lt;code&gt;X-FLAG: THM{HEADER_FLAG}&lt;/code&gt; header.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result&lt;/strong&gt;: The &lt;code&gt;Server&lt;/code&gt; header leaks the exact web server version and operating system. This helps you narrow down potential version-specific issues. The response also lacks important security headers like &lt;code&gt;Content-Security-Policy&lt;/code&gt; and &lt;code&gt;Strict-Transport-Security&lt;/code&gt;. Without a framing restriction such as a CSP &lt;code&gt;frame-ancestors&lt;/code&gt; directive, pages may be exposed to clickjacking, and without HSTS a browser can be downgraded to plain HTTP.&lt;/p&gt;
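
&lt;p&gt;The header review can be automated once you have the raw output of &lt;code&gt;curl -I&lt;/code&gt;. The sketch below parses header lines and reports which of the four security headers named in the remediation below are absent. The function names are hypothetical and the sample reuses the headers shown in the screenshot.&lt;/p&gt;

```python
# Security headers this article recommends, in lowercase for comparison.
EXPECTED_HEADERS = (
    "content-security-policy",
    "strict-transport-security",
    "x-frame-options",
    "x-content-type-options",
)


def parse_headers(raw: str) -> dict[str, str]:
    """Turn raw response header text into a lowercase-keyed dictionary."""
    headers = {}
    for line in raw.splitlines():
        if ":" not in line:
            continue  # skips the status line and blank lines
        name, value = line.split(":", 1)
        headers[name.strip().lower()] = value.strip()
    return headers


def missing_security_headers(headers: dict[str, str]) -> list[str]:
    """List the expected security headers that the response did not send."""
    return [name for name in EXPECTED_HEADERS if name not in headers]


if __name__ == "__main__":
    sample = (
        "HTTP/1.1 200 OK\n"
        "Server: nginx/1.18.0 (Ubuntu)\n"
        "X-FLAG: THM{HEADER_FLAG}\n"
    )
    parsed = parse_headers(sample)
    print(parsed.get("server"))
    print(missing_security_headers(parsed))
```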

&lt;p&gt;&lt;strong&gt;Remediation&lt;/strong&gt;: Configure your web server to suppress or mask the &lt;code&gt;Server&lt;/code&gt; header. Add security headers like &lt;code&gt;Content-Security-Policy&lt;/code&gt;, &lt;code&gt;Strict-Transport-Security&lt;/code&gt;, &lt;code&gt;X-Frame-Options&lt;/code&gt;, and &lt;code&gt;X-Content-Type-Options&lt;/code&gt;. You can use tools like &lt;a href="https://securityheaders.com" rel="noopener noreferrer"&gt;securityheaders.com&lt;/a&gt; to check your current header posture.&lt;/p&gt;
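&lt;p&gt;The header review above can also be scripted. The following is a minimal sketch, not a definitive audit: the function name &lt;code&gt;audit_headers&lt;/code&gt; and the sample dictionary are assumptions made for illustration, and the sample holds only the two headers quoted in this write-up rather than the full response.&lt;/p&gt;

```python
import re

# Header names taken from the remediation advice in this section.
RECOMMENDED_HEADERS = (
    "Content-Security-Policy",
    "Strict-Transport-Security",
    "X-Frame-Options",
    "X-Content-Type-Options",
)


def audit_headers(headers):
    """Return (missing_headers, server_leaks_version) for a dict of response headers.

    Hypothetical helper for illustration only.
    """
    # HTTP header names are case-insensitive, so compare in lowercase.
    present = {name.lower() for name in headers}
    missing = [h for h in RECOMMENDED_HEADERS if h.lower() not in present]

    # Treat a dotted number such as "1.18" in the Server value as a version leak.
    server = next((v for k, v in headers.items() if k.lower() == "server"), "")
    leaks_version = re.search(r"\d+\.\d+", server) is not None
    return missing, leaks_version


if __name__ == "__main__":
    # Partial sample: only the headers quoted in the text above.
    observed = {"Server": "nginx/1.18.0 (Ubuntu)", "X-FLAG": "THM{HEADER_FLAG}"}
    missing, leaks = audit_headers(observed)
    print("Missing:", ", ".join(missing))
    print("Server header leaks version:", leaks)
```

&lt;p&gt;Keeping the check as a pure function over a dictionary means you can feed it headers captured by any client, including the &lt;code&gt;curl -I&lt;/code&gt; output shown earlier.&lt;/p&gt;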

&lt;h3&gt;
  
  
  Framework Stack Identification
&lt;/h3&gt;

&lt;p&gt;Web frameworks often leave markers in HTML source code, such as generator comments or meta tags. These markers reveal the technology stack and sometimes the exact version.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; https://&amp;lt;target-domain&amp;gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="s2"&gt;"generated&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;framework"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This fetches the homepage HTML and filters for framework-related comments.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzrqr66g9fbfjrc4u3gno.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzrqr66g9fbfjrc4u3gno.png" alt=" " width="800" height="377"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The HTML source contains a comment showing the page was generated using the THM Framework.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fef870cqn6ga9wa4zdoxr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fef870cqn6ga9wa4zdoxr.png" alt=" " width="800" height="381"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Visiting the framework reference URL confirms it is the THM Web Framework with visible version details.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result&lt;/strong&gt;: The source comment reveals &lt;code&gt;THM Framework v1.2&lt;/code&gt; as the underlying technology. You can now research this framework for known misconfigurations, default paths, or version-specific vulnerabilities.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remediation&lt;/strong&gt;: Strip generator comments and version markers from production HTML before deployment. Configure your build pipeline or template engine to exclude debug and version metadata from rendered output.&lt;/p&gt;
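&lt;p&gt;One way to verify that rendered pages no longer carry these markers is to scan the HTML for comments before deployment. The sketch below uses Python's standard &lt;code&gt;html.parser&lt;/code&gt; module. The class and function names are hypothetical, and the keyword list simply mirrors the &lt;code&gt;grep&lt;/code&gt; pattern used above.&lt;/p&gt;

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Collect the text of every HTML comment in a document."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # Called by HTMLParser once per comment, with the delimiters removed.
        self.comments.append(data.strip())


def find_framework_comments(html_text, keywords=("generated", "framework")):
    """Return comments that mention any keyword, case-insensitively.

    Hypothetical helper: the keywords mirror the grep pattern in this section.
    """
    collector = CommentCollector()
    collector.feed(html_text)
    collector.close()
    return [
        comment
        for comment in collector.comments
        if any(word in comment.lower() for word in keywords)
    ]
```

&lt;p&gt;An empty result for every rendered page is the goal. Any returned comment is a candidate for removal from the template or build output.&lt;/p&gt;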

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Manual content discovery gives you a clear picture of a target without heavy tooling. You can see how &lt;code&gt;robots.txt&lt;/code&gt; can leak sensitive paths, favicon hashes can identify technologies, &lt;code&gt;sitemap.xml&lt;/code&gt; can map out hidden routes, HTTP headers can expose server versions and missing security controls, and HTML source comments can reveal framework details. These techniques work well as a first step before running automated scanners and help build a stronger picture of the target's attack surface.&lt;/p&gt;




&lt;p&gt;If you found this helpful, drop a like and share it with someone learning security. If you have questions, ran into something different in your own lab, or want to share your results, leave a comment below. Always happy to connect and talk about security, recon techniques, or anything AppSec related.&lt;/p&gt;

&lt;p&gt;Feel free to connect with me on &lt;a href="https://www.linkedin.com/in/jer-catallo/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Always open to connecting with people in security, development, or both. Whether you are building something, breaking something, or just getting started, feel free to reach out.&lt;/p&gt;

</description>
      <category>cybersecurity</category>
      <category>programming</category>
      <category>tutorial</category>
      <category>todayilearned</category>
    </item>
    <item>
      <title>Subdomain Enumeration: How Attackers Find What You Forgot to Hide</title>
      <dc:creator>Jer Catallo</dc:creator>
      <pubDate>Sat, 02 May 2026 11:41:50 +0000</pubDate>
      <link>https://forem.com/jer_catallo/how-to-find-hidden-subdomains-from-passive-osint-to-active-brute-force-hf2</link>
      <guid>https://forem.com/jer_catallo/how-to-find-hidden-subdomains-from-passive-osint-to-active-brute-force-hf2</guid>
      <description>&lt;p&gt;Subdomain enumeration is the process of finding all subdomains that belong to a target domain. Each subdomain is a potential entry point, making this a key step in external reconnaissance. In this write-up, we walk through the subdomain enumeration techniques tested in a hands-on lab, so you can see the tools, commands, and results along the way.&lt;/p&gt;

&lt;p&gt;There are two main approaches:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Passive enumeration&lt;/strong&gt;: Uses public data sources like search engines, certificate transparency logs, and third-party APIs. This method does not send direct requests to the target, so it has low risk of detection.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Active enumeration&lt;/strong&gt;: Sends direct requests to DNS servers or web servers using wordlists. This method finds more results but creates network traffic that can be logged or detected.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We demonstrate both approaches below, so you can see how they complement each other and why relying on only one method leaves blind spots.&lt;/p&gt;

&lt;h2&gt;
  
  
  Ethical Considerations
&lt;/h2&gt;

&lt;p&gt;Subdomain enumeration is a reconnaissance activity. It has legal and operational impact. You should follow these rules in every engagement:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Authorization is mandatory. Active DNS brute forcing or host-header fuzzing without written permission can break laws such as the US Computer Fraud and Abuse Act (CFAA) and equivalent local statutes.&lt;/li&gt;
&lt;li&gt;Passive does not always mean harmless. Passive OSINT can still show sensitive information, so handle it with care.&lt;/li&gt;
&lt;li&gt;Rate limiting matters. Use low thread counts (for example, &lt;code&gt;-t 1&lt;/code&gt;) and wait between requests to avoid service disruption.&lt;/li&gt;
&lt;li&gt;Scope validation is required. A discovered host is not automatically in scope. Always check every asset against the approved scope list.&lt;/li&gt;
&lt;li&gt;Responsible disclosure applies. Report unintentionally exposed infrastructure through authorized channels.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All activity in this write-up ran against:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Personal domain (&lt;code&gt;jercarlocatallo.com&lt;/code&gt;), owned by the author.&lt;/li&gt;
&lt;li&gt;TryHackMe lab environments, authorized training infrastructure.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Demonstration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Search Engine Dorking
&lt;/h3&gt;

&lt;p&gt;Search engine dorking is the simplest passive technique. By using the &lt;code&gt;site:&lt;/code&gt; operator, you can ask Google which subdomains it has indexed for a domain. No traffic reaches the target, and results are available in seconds.&lt;/p&gt;

&lt;p&gt;We use HackTheBox below so you can see how this technique works.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;HackTheBox query:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;site:*.hackthebox.com -site:www.hackthebox.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6mg801fh76p8zbezke0f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6mg801fh76p8zbezke0f.png" alt=" " width="800" height="1037"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Indexed results include subdomains such as &lt;code&gt;roadmap&lt;/code&gt;, &lt;code&gt;jobs&lt;/code&gt;, &lt;code&gt;ctf&lt;/code&gt;, &lt;code&gt;status&lt;/code&gt;, and &lt;code&gt;trust&lt;/code&gt;. This shows which hosts are publicly indexed and gives you a starting list to prioritize for deeper checks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt; Search engines know only what they have crawled. Results are quick to get but limited to indexed assets. This technique alone will miss subdomains that search engines never discovered.&lt;/p&gt;

&lt;h3&gt;
  
  
  Certificate Transparency Lookup
&lt;/h3&gt;

&lt;p&gt;Certificate Transparency (CT) logs record every SSL/TLS certificate issued by participating authorities. crt.sh is a public search interface for these logs. Unlike search engines, CT logs can reveal hostnames that were never indexed or are no longer active.&lt;/p&gt;

&lt;p&gt;We check tryhackme.com below so you can see how CT logs compare to search engine results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;tryhackme.com:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://crt.sh/?q=tryhackme.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fak5jkag22b8gnuvlzyeb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fak5jkag22b8gnuvlzyeb.png" alt=" " width="800" height="586"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Certificate entries show wildcard and service-specific names. You can see that CT logs include historical names not currently indexed by search engines.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt; CT logs provide deeper hostname visibility through certificate history. They often reveal names that search engines miss, including wildcard entries and expired certificates. However, CT logs cannot find subdomains that never had a certificate issued.&lt;/p&gt;
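&lt;p&gt;If you export the crt.sh results as JSON (the &lt;code&gt;output=json&lt;/code&gt; query parameter), each entry carries a &lt;code&gt;name_value&lt;/code&gt; field that can hold several names separated by newlines. The sketch below reduces such entries to a unique hostname list. It assumes that response shape, which may change, and the function name and the &lt;code&gt;example.com&lt;/code&gt; data are hypothetical.&lt;/p&gt;

```python
def hostnames_from_ct(entries, domain):
    """Return sorted unique hostnames under `domain` from crt.sh-style JSON entries.

    Hypothetical helper: assumes each entry is a dict with a "name_value" key.
    """
    domain = domain.lower()
    suffix = "." + domain
    found = set()
    for entry in entries:
        # name_value can hold several names separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            # Wildcard entries such as *.example.com carry no concrete host label,
            # so reduce them to the name they cover.
            if name.startswith("*."):
                name = name[2:]
            # Keep only names that belong to the target domain.
            if name == domain or name.endswith(suffix):
                found.add(name)
    return sorted(found)


if __name__ == "__main__":
    # Invented sample data for illustration, not real crt.sh output.
    sample = [
        {"name_value": "*.example.com\nexample.com"},
        {"name_value": "blog.example.com"},
    ]
    print(hostnames_from_ct(sample, "example.com"))
```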

&lt;h3&gt;
  
  
  Passive Aggregation with Sublist3r
&lt;/h3&gt;

&lt;p&gt;Sublist3r automates passive enumeration by querying multiple third-party sources (search engines, DNS aggregators, VirusTotal, and more) in a single run. We run it against a personal domain so you can see what public data sources know.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Command:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sublist3r &lt;span class="nt"&gt;-d&lt;/span&gt; jercarlocatallo.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fijhvx6ryzuatpvm1szkl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fijhvx6ryzuatpvm1szkl.png" alt=" " width="800" height="513"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can see 3 unique subdomains were discovered:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;www.jercarlocatallo.com&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;m.jercarlocatallo.com&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;wwww.jercarlocatallo.com&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Note that &lt;code&gt;wwww.jercarlocatallo.com&lt;/code&gt; (four w's) is a typo subdomain. It is interesting because it shows up in third-party data sources, possibly from a crawl or a misconfigured link somewhere.&lt;/p&gt;

&lt;p&gt;Two sources failed during collection: DNSDumpster (CSRF token error) and VirusTotal (blocking requests). Passive output quality depends on source availability and rate limits.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remediation:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Monitor certificate transparency logs for your domains to detect unauthorized subdomain creation.&lt;/li&gt;
&lt;li&gt;Remove unused or forgotten subdomains from DNS to reduce attack surface.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt; Sublist3r is efficient for gathering what third-party sources already know, but it cannot discover subdomains that no source has indexed. Infrastructure names like dev, admin, and vpn often do not appear in passive data because no one has publicly referenced them.&lt;/p&gt;

&lt;h3&gt;
  
  
  Virtual Host Discovery with ffuf
&lt;/h3&gt;

&lt;p&gt;Not all subdomains resolve in public DNS. Some exist only as virtual hosts behind a single IP address. ffuf can find these by fuzzing the &lt;code&gt;Host&lt;/code&gt; header in HTTP requests and filtering out default responses.&lt;/p&gt;

&lt;p&gt;We run this in a TryHackMe lab environment so you can see the technique in action.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Command:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ffuf &lt;span class="nt"&gt;-w&lt;/span&gt; /usr/share/wordlists/SecLists/Discovery/DNS/namelist.txt &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Host: FUZZ.acmeitsupport.thm"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-u&lt;/span&gt; http://10.48.142.81 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-fs&lt;/span&gt; 2395
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fek571w6imgmytrps2vmj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fek571w6imgmytrps2vmj.png" alt=" " width="800" height="402"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can see two virtual hosts were found: &lt;code&gt;delta&lt;/code&gt; and &lt;code&gt;yellow&lt;/code&gt;. This confirms that HTTP-layer discovery can expose hosts not visible in public OSINT datasets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remediation:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Configure web servers to return consistent responses for unknown Host headers.&lt;/li&gt;
&lt;li&gt;Use a default catch-all virtual host that does not leak information about other hosted services.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt; Virtual host fuzzing reveals assets that DNS and passive sources cannot find. It is an active technique that requires sending real HTTP requests, so it is more detectable but more complete for web-hosted infrastructure.&lt;/p&gt;
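&lt;p&gt;The filtering idea behind &lt;code&gt;-fs 2395&lt;/code&gt; can be sketched in a few lines: every unknown &lt;code&gt;Host&lt;/code&gt; value gets the same default page, so any response with a different size is worth a closer look. This is an illustration of the logic only, not how ffuf is implemented. The function name is hypothetical, and every size other than 2395 is invented.&lt;/p&gt;

```python
def filter_vhosts(response_sizes, baseline_size):
    """Keep candidate hosts whose response size differs from the default response.

    Hypothetical helper: response_sizes maps a candidate label to a body size in bytes.
    """
    return sorted(
        host for host, size in response_sizes.items() if size != baseline_size
    )


if __name__ == "__main__":
    # 2395 is the default response size from the command above.
    # The other sizes are invented for illustration.
    sizes = {"www": 2395, "mail": 2395, "delta": 51, "yellow": 56}
    print(filter_vhosts(sizes, baseline_size=2395))
```

&lt;p&gt;Size is a convenient signal but not a guaranteed one. If a real virtual host happens to return a page of the same size as the default, this filter will hide it.&lt;/p&gt;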

&lt;h3&gt;
  
  
  Active DNS Brute Force with Gobuster
&lt;/h3&gt;

&lt;p&gt;Gobuster sends direct DNS queries for each word in a wordlist, resolving subdomains that passive sources never collected. We run this against a personal domain to find subdomains that Sublist3r missed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Command:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gobuster dns &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; jercarlocatallo.com &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-w&lt;/span&gt; subdomains-top1million-5000.txt &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-t&lt;/span&gt; 1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resolver&lt;/span&gt; 8.8.8.8 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--protocol&lt;/span&gt; tcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Standard DNS typically uses UDP, which can drop packets under load. TCP reduces timeout noise and improves consistency for wordlist-based enumeration.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdus9p85az6jiqejh8jpg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdus9p85az6jiqejh8jpg.png" alt=" " width="800" height="515"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can see 8 subdomains resolved:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Subdomain&lt;/th&gt;
&lt;th&gt;IP Resolved&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;www.jercarlocatallo.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;64.29.17.65, 216.198.79.65&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;mail.jercarlocatallo.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;127.0.0.1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;dev.jercarlocatallo.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;127.0.0.1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;admin.jercarlocatallo.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;127.0.0.1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;vpn.jercarlocatallo.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;127.0.0.1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;api.jercarlocatallo.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;127.0.0.1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;staging.jercarlocatallo.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;127.0.0.1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;uat.jercarlocatallo.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;127.0.0.1&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;code&gt;www&lt;/code&gt; returned two public IPs (64.29.17.65 and 216.198.79.65), which suggests a round-robin or load-balanced setup. The other seven names resolved to the loopback address 127.0.0.1, so the records confirm that the names exist in public DNS but do not point to a host reachable from the internet. Some words (autos, soap, chemie) produced i/o timeouts, which is expected resolver noise. The scan completed all 4,989 words.&lt;/p&gt;
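&lt;p&gt;When a scan returns many records, it helps to separate names that resolve to public addresses from names that resolve only to loopback. The sketch below does this with Python's standard &lt;code&gt;ipaddress&lt;/code&gt; module, using the results from the table above. The function name &lt;code&gt;split_loopback&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
import ipaddress

# Results copied from the table above.
RESOLVED = {
    "www": ["64.29.17.65", "216.198.79.65"],
    "mail": ["127.0.0.1"],
    "dev": ["127.0.0.1"],
    "admin": ["127.0.0.1"],
    "vpn": ["127.0.0.1"],
    "api": ["127.0.0.1"],
    "staging": ["127.0.0.1"],
    "uat": ["127.0.0.1"],
}


def split_loopback(records):
    """Return (other_names, loopback_only_names), each sorted alphabetically.

    Hypothetical helper: records maps a label to a list of IP address strings.
    """
    other, loopback_only = [], []
    for name, addresses in records.items():
        # A name counts as loopback-only when it has addresses and all are loopback.
        if addresses and all(ipaddress.ip_address(a).is_loopback for a in addresses):
            loopback_only.append(name)
        else:
            other.append(name)
    return sorted(other), sorted(loopback_only)


if __name__ == "__main__":
    other, loopback_only = split_loopback(RESOLVED)
    print("Non-loopback:", other)
    print("Loopback only:", loopback_only)
```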

&lt;p&gt;&lt;strong&gt;Remediation:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use split-horizon DNS to separate internal and external DNS records.&lt;/li&gt;
&lt;li&gt;Avoid exposing internal infrastructure names (dev, staging, uat, admin) in public DNS.&lt;/li&gt;
&lt;li&gt;Implement DNSSEC to prevent DNS spoofing and cache poisoning attacks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt; Active brute forcing found 8 subdomains where Sublist3r found only 3. The 7 additional names (mail, dev, admin, vpn, api, staging, uat) were not indexed by any passive source. This gap shows why active enumeration is necessary for full coverage.&lt;/p&gt;
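&lt;p&gt;The comparison between the two result sets can be made explicit with set operations. The sketch below uses the labels reported in this write-up. The function name &lt;code&gt;compare_results&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
# Labels reported in this write-up.
PASSIVE = {"www", "m", "wwww"}  # Sublist3r
ACTIVE = {"www", "mail", "dev", "admin", "vpn", "api", "staging", "uat"}  # Gobuster


def compare_results(passive, active):
    """Group labels by which method found them (hypothetical helper)."""
    return {
        "both": sorted(passive.intersection(active)),
        "passive_only": sorted(passive.difference(active)),
        "active_only": sorted(active.difference(passive)),
    }


if __name__ == "__main__":
    for group, names in compare_results(PASSIVE, ACTIVE).items():
        print(group, len(names), names)
```

&lt;p&gt;The overlap is a single name, and each method also finds names the other misses, which is the practical argument for running both.&lt;/p&gt;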

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Passive techniques show what public data sources know. Active techniques show what DNS and application behavior expose. When you combine both approaches, you improve attack surface visibility and reduce blind spots.&lt;/p&gt;

&lt;p&gt;Sublist3r found 3 publicly known subdomains from third-party data, including a typo domain (&lt;code&gt;wwww.jercarlocatallo.com&lt;/code&gt;). Gobuster found 8 subdomains through direct DNS resolution, adding infrastructure names like dev, admin, vpn, api, staging, uat, and mail that no passive source had indexed. That gap between passive and active results is why using both methods together matters.&lt;/p&gt;

&lt;p&gt;For defenders, the takeaway is clear: monitor certificate logs, clean up unused DNS records, and use split-horizon DNS to keep internal names out of public resolvers.&lt;/p&gt;





</description>
      <category>security</category>
      <category>todayilearned</category>
      <category>tutorial</category>
      <category>cybersecurity</category>
    </item>
  </channel>
</rss>
