Ethical Hacking and Countermeasures
Version 6
Module IV
Module Objective
This module will familiarize you with:
• What is Google Hacking
• What a Hacker Can Do With Vulnerable Site
• Google Hacking Basics
• Google Advanced Operators
• Pre-Assessment
• Locating Exploits and Finding Targets
• Tracking Down Web Servers, Login Portals, and Network Hardware
Module Flow
• Google Hacking
• What a Hacker Can Do With Vulnerable Site
• Google Hacking Basics
• Google Advanced Operators
• Pre-Assessment
• Locating Exploits and Finding Targets
• Tracking Down Web Servers, Login Portals, and Network Hardware
• Google Hacking Tools
EC-Council
Copyright © by EC-Council
All Rights Reserved. Reproduction is Strictly Prohibited
What is Google Hacking
Google hacking is a term that refers to the art of creating complex search engine queries in order to filter through large amounts of search results for information related to computer security
In its malicious form, it can be used to detect websites that are vulnerable to numerous exploits and vulnerabilities, as well as to locate private, sensitive information about others, such as credit card numbers, social security numbers, and passwords
What a Hacker Can Do With Vulnerable Site
Information that the Google Hacking Database identifies:
• Advisories and server vulnerabilities
• Error messages that contain too much information
• Files containing passwords
• Sensitive directories
• Pages containing logon portals
• Pages containing network or vulnerability data such as firewall
Anonymity with Caches
Hackers can get a copy of sensitive data even if the plug on that pesky Web server is pulled, and they can crawl an entire website without sending a single packet to the server
If the Web server does not receive so much as a packet, it cannot write anything to its log files
Using Google as a Proxy Server
Google sometimes works as a proxy server, which requires a Google-translated URL and some minor URL modification
The translation URL is generated through Google’s translation service, located at www.google.com/translate_t
If a URL is entered into the “Translate a web page” field, by selecting a language pair and clicking on the Translate button, Google will
Directory Listings
A directory listing is a type of Web page that lists files and directories that exist on a Web server
It is designed to be navigated by clicking directory links; directory listings typically have a title that describes the current directory and a list of files and directories that can be clicked
Just like an FTP server, directory listings offer a no-frills, easy-install solution for granting access to files that can be stored in categorized folders
Problems with directory listings are:
• They do not prevent users from downloading certain files or accessing certain directories; hence they are not secure
• They can display information that helps an attacker learn specific technical details about the Web server
• They do not discriminate between files that are meant to be public and those that are meant to remain behind the scenes
Locating Directory Listings
Since directory listings offer parent directory links and allow browsing through files and folders, an attacker can find sensitive data simply by locating listings and browsing through them
Locating directory listings with Google is fairly straightforward, as they begin with the phrase “Index of,” which shows in the title
An obvious query to find this type of page might be intitle:index.of, which can find pages with the term “index of” in the title of the document
The queries intitle:index.of “parent directory” and intitle:index.of “name size” indeed provide directory listings by not only focusing on index.of in the title but also on keywords often found inside directory listings
Locating Directory Listings (cont’d)
Finding Specific Directories
This is easily accomplished by adding the name of the directory to the search query
To locate “admin” directories that are accessible from directory listings, queries such as intitle:index.of.admin or intitle:index.of inurl:admin will work well, as shown in the following figure
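The directory-listing queries described so far follow a simple pattern, so they can be assembled programmatically. The following is a minimal Python sketch, not part of the original courseware; the function name listing_query and its parameters are hypothetical and chosen only for illustration.

```python
from typing import Optional


def listing_query(phrase: Optional[str] = None,
                  directory: Optional[str] = None) -> str:
    """Build an intitle:index.of query.

    phrase    -- optional quoted phrase commonly seen inside listings,
                 e.g. "parent directory" or "name size"
    directory -- optional directory name to require in the URL, e.g. "admin"
    """
    parts = ["intitle:index.of"]
    if phrase:
        parts.append(f'"{phrase}"')
    if directory:
        parts.append(f"inurl:{directory}")
    return " ".join(parts)


if __name__ == "__main__":
    print(listing_query(phrase="parent directory"))
    print(listing_query(directory="admin"))
```

The sketch only builds query strings; it does not send anything to Google.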
Finding Specific Files
As the directory listing is in tree style, it is also possible to find specific files in a directory listing
Server Versioning
The exact software version is the information an attacker can use to determine the best method for attacking a Web server
An attacker can retrieve that information by connecting directly to the Web port of that server and issuing a request for the HTTP headers
Some typical directory listings provide the name of the server software as well as the version number at the bottom portion. This information can be faked, but an attacker may still use it to choose an attack against the Web server
The query intitle:index.of “server at” will locate all directory listings on the Web with index of in the title and server at anywhere in the text of the page
In addition to identifying the Web server version, it is also possible to determine the operating system of the server as well as modules and other software that is installed
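To illustrate what the footer of a directory listing gives away, here is a small Python sketch that pulls the software name, version, and any extra tokens (operating system or modules) out of a line of the form "Apache/1.3.27 Server at host Port 80". The regular expression, the function name parse_listing_footer, and the sample host www.example.com are assumptions made for this example; real footers vary and, as noted above, can be faked.

```python
import re
from typing import Dict, Optional, Union

# Assumed footer shape: "<software>/<version> [extras] Server at <host> Port <port>"
FOOTER_RE = re.compile(
    r"(?P<software>[A-Za-z][\w\-]*)/(?P<version>\d+(?:\.\d+)*)"
    r"(?P<extras>.*?)\s+Server at\s+(?P<host>\S+)\s+Port\s+(?P<port>\d+)"
)


def parse_listing_footer(line: str) -> Optional[Dict[str, Union[str, int]]]:
    """Return the parsed footer fields, or None if the line does not match."""
    match = FOOTER_RE.search(line)
    if match is None:
        return None
    return {
        "software": match["software"],
        "version": match["version"],
        "extras": match["extras"].strip(),  # e.g. OS name or module list
        "host": match["host"],
        "port": int(match["port"]),
    }


if __name__ == "__main__":
    sample = "Apache/2.0.52 (Fedora) Server at www.example.com Port 80"
    print(parse_listing_footer(sample))
```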
Going Out on a Limb: Traversal Techniques
Attackers use traversal techniques to expand a small foothold into a larger compromise
The query intitle:index.of inurl:“/admin/*” helps with traversal, as shown in the figure:
Directory Traversal
Clicking on the parent directory link opens that directory and the links under it. This is basic directory traversal
Besides walking through the directory tree, an attacker can also traverse outside the Google search by wandering around on the target Web server
A word in the URL is replaced with other words
Poorly coded third-party software installed on the server may accept directory names as arguments, which allows users to view files above the Web server directory
Incremental Substitution
This technique involves replacing numbers in a URL in an attempt to find directories or files that are hidden, or unlinked from other pages
By changing the numbers in the file names, the other files can be found
In some examples, substitution is used to modify the numbers in the URL to locate other files or directories that exist on the site
• /docs/bulletin/1.xls could be modified to /docs/bulletin/2.xls
• /DigLib_thumbnail/spmg/hel/0001/H/ could be changed to /DigLib_thumbnail/spmg/hel/0002/H/
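Incremental substitution is mechanical enough to sketch in a few lines of Python. The example below bumps the last run of digits in a path while keeping any zero padding; the function name increment_last_number is hypothetical, and the choice to change only the last number is an assumption made to keep the sketch short.

```python
import re


def increment_last_number(path: str, step: int = 1) -> str:
    """Return path with its last run of digits increased by step.

    Zero padding is preserved, so 0001 becomes 0002 rather than 2.
    Paths without digits are returned unchanged.
    """
    matches = list(re.finditer(r"\d+", path))
    if not matches:
        return path
    last = matches[-1]
    width = len(last.group())
    bumped = str(int(last.group()) + step).zfill(width)
    return path[:last.start()] + bumped + path[last.end():]


if __name__ == "__main__":
    print(increment_last_number("/docs/bulletin/1.xls"))
    print(increment_last_number("/DigLib_thumbnail/spmg/hel/0001/H/"))
```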
Extension Walking
The filetype operator can be used to locate files with specific file extensions
HTM files can be easily searched with a query such as filetype:HTM HTM
Filetype searches require a search parameter, and files ending in HTM always have HTM in the URL
After locating HTM files, the substitution technique is used to find files with the same file name and a different extension
The easiest way to determine the names of backup files on a server is to locate a directory listing using intitle:index.of or to search for specific files with queries such as intitle:index.of index.php.bak or inurl:index.php.bak
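Extension walking can be sketched as two small helpers: one that swaps the extension of a known file name, and one that produces the two backup-file queries mentioned above. This is an illustrative Python sketch; the function names and the default extension list are assumptions, and only the .bak queries come from the text.

```python
import posixpath
from typing import Iterable, List


def swap_extension(filename: str,
                   new_exts: Iterable[str] = ("php", "bak")) -> List[str]:
    """Return filename with its extension replaced by each candidate.

    The default candidates are placeholders for illustration only.
    """
    stem, _old_ext = posixpath.splitext(filename)
    return [f"{stem}.{ext}" for ext in new_exts]


def backup_queries(filename: str) -> List[str]:
    """Return the two .bak query forms for a given file name."""
    return [
        f"intitle:index.of {filename}.bak",
        f"inurl:{filename}.bak",
    ]


if __name__ == "__main__":
    print(swap_extension("docs/page.HTM"))
    print(backup_queries("index.php"))
```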
Google Advanced Operators
Site Operator
The site operator is absolutely invaluable during the information-gathering phase of an assessment
A site search can be used to gather information about the servers and hosts that a target hosts
Using simple reduction techniques, you can quickly get an idea about a target’s online presence
Consider the simple example of site:washingtonpost.com -site:www.washingtonpost.com
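The reduction technique amounts to searching the whole domain and subtracting hosts that are already known, which surfaces the remaining hostnames. A minimal Python sketch of that idea follows; the function name reduce_site_query is hypothetical.

```python
from typing import Iterable


def reduce_site_query(domain: str, known_hosts: Iterable[str]) -> str:
    """Search a domain while excluding hosts that were already found."""
    terms = [f"site:{domain}"]
    terms.extend(f"-site:{host}" for host in known_hosts)
    return " ".join(terms)


if __name__ == "__main__":
    print(reduce_site_query("washingtonpost.com", ["www.washingtonpost.com"]))
```

Each newly discovered host can be appended to known_hosts and the query rebuilt, repeating until no new hosts appear.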
Site Operator (cont’d)
intitle:index.of
intitle:index.of is the universal search for directory listings
In most cases, this search applies only to Apache-based servers, but due to the
Screenshot
error | warning
Error messages can reveal a great deal of information about a target
Often overlooked, error messages can provide insight into the application or operating system software a target is running, the architecture of the network the target is on, information about users on the system, and much more
Not only are error messages informative, they are prolific
error | warning (cont’d)
login | logon
Login portals can reveal the software and operating system of a target, and in many cases “self-help” documentation is linked from the main page of a login portal
These documents are designed to assist users who run into problems during the login process
Whether the user has forgotten his or her password or even username, this document can provide clues that might help an attacker
Documentation linked from login portals lists e-mail addresses, phone numbers, or URLs of human assistants who can help a troubled user regain lost access
login | logon (cont’d)
username | userid | employee.ID | “your username is”
password | passcode | “your password is”
The word password is so common on the Internet that there are over 73 million results for this one-word query
During an assessment, it is very likely that results for this query combined with a site operator will include pages that provide help to users who have forgotten their passwords
In some cases, this query will locate pages that provide policy information about the creation of a password
This type of information can be used in an intelligent-guessing or brute-force campaign against a password field
password | passcode | “your password is” (cont’d)
admin | administrator
The word administrator is often used to describe the person in control of a network or system
The word administrator can also be used to locate administrative login pages, or login portals
The phrase Contact your system administrator is fairly common on the Web, as are several basic derivations
A query such as “please contact your * administrator” will return results that reference local, company, site, department, server, system, network, database, e-mail, and even tennis administrators
If a Web user is told to contact an administrator, chances are that the data
admin login
admin login Reveals Administrative Login Pages
-ext:html -ext:htm -ext:shtml -ext:asp -ext:php
The -ext:html -ext:htm -ext:shtml -ext:asp -ext:php query uses ext, a synonym for the filetype operator, and is a negative query
It returns no results when used alone and should be combined with a site operator to work properly
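Because the negative query only makes sense together with a site operator, it is convenient to build both parts at once. The Python sketch below does that; the function name non_html_query is hypothetical, and the default extension list is the one given in the query above.

```python
from typing import Iterable


def non_html_query(site: str,
                   excluded: Iterable[str] = ("html", "htm", "shtml", "asp", "php")) -> str:
    """Restrict a search to one site and exclude common page extensions."""
    terms = [f"site:{site}"]
    terms.extend(f"-ext:{ext}" for ext in excluded)
    return " ".join(terms)


if __name__ == "__main__":
    # example.com is a placeholder target
    print(non_html_query("example.com"))
```

What remains after the exclusions are the less common file types on the site, which is the point of the negative search.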
-ext:html -ext:htm -ext:shtml -ext:asp -ext:php (cont’d)
inurl:temp | inurl:tmp | inurl:backup | inurl:bak
The inurl:temp | inurl:tmp | inurl:backup | inurl:bak query, combined with the site operator, searches for temporary or backup files or directories on a server
Although there are many possible naming conventions for temporary or backup files, this search focuses on the most common terms
Pre-Assessment
intranet | help.desk
The term intranet, despite more specific technical meanings, has become a generic term that describes a network confined to a small group
In most cases, the term intranet describes a closed or private network unavailable to the general public
Many sites have configured portals that allow access to an intranet from the Internet, bringing this typically closed network one step closer to potential attackers
Locating Exploits and Finding Targets
Locating Public Exploit Sites
One way to locate exploit code is to focus on the file extension of the source code and then search for specific content within that code
Since source code is the text-based representation of difficult-to-read machine code, Google is well suited for this task
For example, a large number of exploits are written in C, whose source files generally end in a .c extension
A query for filetype:c exploit returns around 5,000 results, most of which are exactly the types of programs you are looking for
These are the most popular sites hosting C source code containing the word exploit, so the returned list is a good start for a list of bookmarks
Using page-scraping techniques, you can isolate these sites by running a UNIX command against the dumped Google results page
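The text refers to a UNIX command for the page-scraping step; the sketch below is a Python stand-in for the same idea, not the command the courseware had in mind. It extracts every URL from a dumped results page and counts how often each host appears. The URL regular expression, the function name count_hosts, and the sample hosts are all assumptions for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Loose URL matcher: stops at whitespace, quotes, and angle brackets
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def count_hosts(dump: str) -> Counter:
    """Count how many result URLs in the dumped page belong to each host."""
    hosts = (urlsplit(url).hostname for url in URL_RE.findall(dump))
    return Counter(host for host in hosts if host)


if __name__ == "__main__":
    sample = (
        'see http://one.example/a.c and http://two.example/x.c '
        '<a href="http://one.example/b.c">more</a>'
    )
    for host, hits in count_hosts(sample).most_common():
        print(hits, host)
```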
Locating Exploits Via Common Code Strings
Another way to locate exploit code is to focus on common strings within the source code itself
One way to do this is to focus on common inclusions or header file references
For example, many C programs include the standard input/output library functions, which are referenced by an include statement such as #include <stdio.h> within the source code
A query like this would locate C source code that contained the word exploit, regardless of the file’s extension:
Locating Source Code with Common Strings
Locating Vulnerable Targets
Attackers are increasingly using Google to locate Web-based targets vulnerable to specific exploits
In fact, it’s not uncommon for public vulnerability announcements to contain Google links to potentially
Locating Targets Via Demonstration Pages
To develop a query string that locates vulnerable targets on the Web, start with the vendor’s Web site, which is a good place to discover exactly what the product’s Web pages look like
For example, some administrators might modify the format of a vendor-supplied Web page to fit the theme of the site
These types of modifications can impact the effectiveness of a Google search that targets a vendor-supplied page format
You can find that most sites look very similar and that nearly every site has a “powered by” message at the bottom of the main page
“Powered by” Tags Are Common Query Fodder for Finding Web Applications
Locating Targets Via Source Code
A hacker might use the source code of a program to discover ways to search for that software with Google
To find the best search string to locate potentially vulnerable targets, you can visit the Web page of the software vendor to find the source code of the offending software
In cases where source code is not available, an attacker might opt to simply download the offending software and run it on a machine he controls to get ideas for potential searches
Vulnerable Web Application Examples (cont’d)
Locating Targets Via CGI Scanning
One of the oldest and most familiar techniques for locating vulnerable Web servers is through the use of a CGI scanner
These programs parse a list of known “bad” or vulnerable Web files and attempt to locate those files on a Web server
Based on various response codes, the scanner can detect the presence of these potentially vulnerable files
A Single CGI Scan-Style Query
Example: search for inurl:/cgi-bin/userreg.cgi
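A CGI scanner's file list maps naturally onto one inurl query per entry. The Python sketch below shows that conversion; the function name cgi_queries is hypothetical, and the handling of blank lines and "#" comments is an assumption about how such a list might be formatted.

```python
from typing import Iterable, List


def cgi_queries(paths: Iterable[str]) -> List[str]:
    """Turn a list of known-vulnerable file paths into inurl: queries.

    Blank lines and lines starting with '#' are skipped; a leading slash
    is added when missing.
    """
    queries = []
    for raw in paths:
        path = raw.strip()
        if not path or path.startswith("#"):
            continue
        if not path.startswith("/"):
            path = "/" + path
        queries.append(f"inurl:{path}")
    return queries


if __name__ == "__main__":
    # Only userreg.cgi comes from the text; the list format is illustrative
    print(cgi_queries(["# sample list", "/cgi-bin/userreg.cgi"]))
```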
Tracking Down Web Servers, Login Portals, and Network Hardware
Finding IIS 5.0 Servers
Query for “Microsoft-IIS/5.0 server at”
Web Server Software Error Messages
Error messages contain a lot of useful information, but in the context of locating specific servers, you can use portions of various error messages to locate servers running specific software versions
The best way to find error messages is to figure out what messages the server is capable of generating
You could gather these messages by examining the server source code or configuration files or by actually generating the errors on the server yourself
The best way to get this information from IIS is by examining the source code of the error pages themselves
IIS 5 and 6, by default, display static HTTP/1.1 error messages when the server encounters some sort of problem
These are stored by default in the %SYSTEMROOT%\help\iisHelp\
Web Server Software Error Messages (cont’d)
A query such as intitle:“The page cannot be found” “please following” “Internet * Services” can be used to search for IIS servers that present a 400 error
IIS HTTP/1.1 Error Page Titles (cont’d)
“Object Not Found” Error Message
Apache Web Server
Apache Web servers can also be located by focusing on server-generated error messages
Some generic searches include “Apache/1.3.27 Server at” -intitle:index.of intitle:inf and “Apache/1.3.27 Server at” -intitle:index.of intitle:error
Application Software Error Messages
Although this ASP message is fairly benign, some ASP error messages are much more revealing
Consider the query “ASP.NET_SessionId” “data source=”, which locates unique strings found in ASP.NET application state dumps
These dumps reveal all sorts of information about the running application and the Web server that hosts that application
An advanced attacker can use encrypted password data and variable information in these stack traces to subvert the security of the application and perhaps the Web server
ASP Dumps Provide Dangerous Details
Many Errors Reveal Pathnames and Filenames
Default Pages
Another way to locate specific types of servers or Web software is to search for default Web pages
Most Web software, including the Web server software itself, ships with one or more default or test pages
These pages can make it easy for a site administrator to test the installation of a Web server or application
Google may crawl a Web server while it is in its earliest stages of installation, still displaying a set of default pages
In these cases there is generally a short window of time
Locating Default Installations of IIS 4.0 on Windows NT 4.0/OP
Default Pages Query for Web Server
Many different types of Web server can be located by querying for default pages as well
Outlook Web Access Default Portal
Query
allinurl:“exchange/logon.asp”
Searching for Passwords
Password data, one of the “Holy Grails” during a penetration test, should be protected
Unfortunately, many examples of Google queries
Windows Registry Entries Can Reveal Passwords
A query like filetype:reg intext:“internet account manager” could reveal interesting keys containing password data
Usernames, Cleartext Passwords, and Hostnames!
Search for password information, intext:(password |
Google Hacking Tools
Google Hacking Database (GHDB)
The Google Hacking Database (GHDB) contains queries that identify sensitive data such as portal logon pages, logs with network security information, and so on
Visit http://johnny.ihackstuff.com
Google Hacking Database (GHDB)
SiteDigger Tool
SiteDigger searches Google’s cache to look for vulnerabilities, errors, configuration issues, proprietary information, and interesting security nuggets on websites
Gooscan
johnny.ihackstuff.com
Gooscan is a tool that automates queries against Google search appliances
However, it can also be run against Google itself, in direct violation of Google’s Terms of Service
For the security professional, Gooscan serves as a front end for an external server assessment and aids in the information-gathering phase of a vulnerability assessment
For the Web server administrator, Gooscan helps discover what the Web community may already know about a site thanks to Google’s search appliance
Goolink Scanner
It removes the cache information from your searches and only collects and displays the links
This is very handy for finding vulnerable sites wide open to Google and Googlebots
Goolag Scanner
Tool: Google Hacks
code.google.com/p/googlehacks/
Google Hacks is a compilation of carefully crafted Google searches that expose novel functionality from Google’s search and map services
You can use it to view a timeline of your search results, view a map, search for music, search for books, and perform many other specific kinds of searches
You can also use this program to use Google as a proxy
Google Hacks: Screenshot
Google Hack Honeypot
Google Hack Honeypot is the reaction to a new type of malicious Web traffic: search engine hackers
Google Hack Honeypot: Screenshot
Tool: Google Protocol
Google Protocol is a little app that, when installed, registers two extra protocols under Windows, similar to the http: and ftp: protocols, namely google: and lucky:
URLs starting with ‘google:’ refer to the corresponding Google search
URLs starting with ‘lucky:’ refer to the top Google
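As a rough illustration of what a google: handler has to do, the Python sketch below turns a google: URL into an ordinary Google search URL. This is not the tool's actual implementation: the function name resolve_google_url is hypothetical, and treating everything after "google:" as the search terms is an assumption. The lucky: scheme is left out because the text does not say how it is resolved.

```python
from typing import Optional
from urllib.parse import quote_plus

SEARCH_BASE = "https://www.google.com/search?q="


def resolve_google_url(url: str) -> Optional[str]:
    """Map 'google:<terms>' to a Google search URL; return None otherwise."""
    scheme, sep, terms = url.partition(":")
    if not sep or scheme.lower() != "google":
        return None
    # quote_plus encodes spaces as '+' and escapes characters such as ':'
    return SEARCH_BASE + quote_plus(terms)


if __name__ == "__main__":
    print(resolve_google_url("google:intitle:index.of admin"))
```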
Google Cartography
Google Cartography uses the Google API to find Web pages referring to street names
Initial street and region criteria are combined to form a search query, which is then executed by the Google API
Each URL from the Google results is fetched and the content of the pages converted into text
The text is then processed using regular expressions designed to capture information relating to the relationship between streets
Summary
In this module, Google hacking techniques have been reviewed
The following Google hacking techniques have been discussed:
• Software Error Messages
• Default pages
• Explanation of techniques to reveal passwords
• Locating targets
• Searching for passwords