Suncoast Security Society
Google Operators / Directives
Ten Google searches
Other Google searches and Info gathering
Automated Tools
Protecting Yourself
Always use these characters without surrounding spaces!
• ( + ) force inclusion of something common
• ( - ) exclude a search term
• ( " ) use quotes around search phrases
• ( . ) a single-character wildcard
• ( * ) any word
• ( | ) boolean 'OR'
• Parentheses group queries ("master card" | mastercard)
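The character rules above can be sketched as a small query-builder helper (Python, with a hypothetical `build_query` name; this only assembles the query string, it is not a Google API):

```python
def build_query(phrases=(), exclude=(), any_of=()):
    """Assemble a Google query string using the character rules above:
    quotes around exact phrases, '-' to exclude a term, '|' for
    boolean OR. Note there is no space between the operator
    character and its term."""
    parts = ['"%s"' % p for p in phrases]           # ( " ) exact phrases
    parts += ['-%s' % t for t in exclude]           # ( - ) exclusion, no space
    if any_of:
        parts.append('(%s)' % ' | '.join(any_of))   # ( | ) OR, grouped by parens
    return ' '.join(parts)

print(build_query(phrases=["master card"], exclude=["visa"],
                  any_of=["mastercard", '"master card"']))
# "master card" -visa (mastercard | "master card")
```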
Intitle
Allintext
Inurl
Site
Filetype
Link
Inanchor
Cache
Numrange
Daterange
Info
Related
Author
Group
Insubject
Msgid
Stocks
Define
Phonebook
Searches for specific words within the title;
with Allintitle, every term or phrase must appear in the title
Ex. Intitle:"Index of" "backup files"
Returns results with "Index of" in the title and "backup files" anywhere on the page
Ex. Allintitle:"Index of" "backup files"
Returns results with both "Index of" and "backup files" in the title
Not as commonly used unless looking for
something very specific
Locates one or more strings within the text of a web page
Ex. allintext:google security
Inurl / Allinurl restricts results to those containing
the query terms you specify in the URL
Searches only within the given domain
Reads from right to left
Ex. apple.com – reads com then apple to return
results under apple.com
Ex1. store.apple.com – reads com, apple, then
store to return results under store.apple.com
Commonly used with other operators*
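The right-to-left reading that site: performs can be mimicked locally by comparing hostname labels from the end (a sketch of the matching logic only, not how Google implements it):

```python
def matches_site(hostname, site):
    """Return True if hostname falls under the site: restriction.
    Labels are compared right to left, as described above:
    'store.apple.com' matches site:apple.com (com, then apple),
    but 'apple.com.evil.net' does not."""
    host = hostname.lower().split('.')
    want = site.lower().split('.')
    # the site's labels must equal the hostname's trailing labels
    return host[-len(want):] == want

print(matches_site("store.apple.com", "apple.com"))     # True
print(matches_site("apple.com.evil.net", "apple.com"))  # False
```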
Allows you to search for specific files based on
type, e.g. doc, xls, pdf
Ex. Filetype:doc doc
13 Main File Types
According to filext.org there are over 8,000
known file types on the net
This is commonly used in conjunction with other
operators, such as SITE to find files
Allows you to search for pages that link to other
pages.
Ex. link:defcon.org
You can also search for links to deep links
Ex. link:www.blackhat.com/html/blackpages/blackpages.html
When improperly formed, such as link:linux,
Google will treat it as a regular search, although the results
may not look normal
Restricts results to pages containing the query
terms you specify in the anchor text of links to
the page
Ex. Inanchor:smashingthestack
Displays Google’s cached version of a web page,
instead of the current version of the page
cache:blackhat.com
Cache can have some unpredictable results –
You might be better off doing regular search and
then accessing the cache from there
As an alternative you can use archive.org and
the Wayback Machine
Requires two parameters, a low and high number
Ex. Numrange:12344-12346
Ex1. Numrange:12344..12346
It has been suggested that this is one of the most
dangerous searches, as it could be used to
harvest phone numbers, credit cards, etc.
In fact, Google's help makes no mention of
this directive. Be careful using it.
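Numrange's behavior can be emulated locally to see why it is considered dangerous: any run of digits in harvested text can be filtered against a low/high window (a sketch; `numrange_hits` is a made-up helper name):

```python
import re

def numrange_hits(text, low, high):
    """Emulate numrange:low..high against local text: find every
    integer that falls within [low, high] inclusive."""
    return [int(n) for n in re.findall(r'\d+', text)
            if low <= int(n) <= high]

print(numrange_hits("orders 12343, 12345 and 12350", 12344, 12346))
# [12345]
```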
These operators are for Google Groups
If used in a normal search the operator is
dropped and a regular search is performed.
*The Insubject operator sometimes works in a
normal search, however the behavior is
unpredictable.
Ext:rdp rdp
Hits for Remote Desktop Protocol
Clicking links opens RDP client
Site:[domain root] intitle:index.of passwd -ftp
Finds password files
.bash_history also reveals good stuff
http://*:*@www domain
Finds URLs that include username and password
(user:password@www.somesite.com)
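Python's standard urllib can pull apart exactly the credential-bearing URLs this search hunts for (the URL below is the slide's placeholder example, not a real site):

```python
from urllib.parse import urlsplit

def embedded_credentials(url):
    """Pull the username/password out of a URL of the form
    http://user:password@host/ -- the pattern the
    http://*:*@www search above is hunting for."""
    parts = urlsplit(url)
    if parts.username:
        return parts.username, parts.password
    return None

print(embedded_credentials("http://user:secret@www.somesite.com/"))
# ('user', 'secret')
```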
Intitle:"AXIS 240 Camera Server" intext:"server push" -help
Finds open AXIS 240 video cameras
1. site
2. intitle:index.of
3. error | warning
4. login | logon
5. username | userid | employee.id | "your username is"
6. password | passcode | "your password is"
7. admin | administrator
8. -ext:html -ext:htm -ext:shtml -ext:asp -ext:php
9. inurl:temp | inurl:tmp | inurl:backup | inurl:bak
10. intranet | help.desk
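Search #8 above, the extension-exclusion chain, is tedious to type by hand; a one-line helper can generate it (hypothetical `exclude_exts` name, sketch only):

```python
def exclude_exts(exts):
    """Build search #8 above: drop common static-page extensions
    so dynamic/server-side pages (and misconfigurations) surface."""
    return ' '.join('-ext:%s' % e for e in exts)

print(exclude_exts(['html', 'htm', 'shtml', 'asp', 'php']))
# -ext:html -ext:htm -ext:shtml -ext:asp -ext:php
```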
Now that we have seen what can be done,
here are some of the more interesting results
These sites launch a
VNC Java client so you
can connect! Even if
password protected, the
client reveals the server
name and port.
Print server
administration,
Google-style!
GooScan
Wikto
Goolag
Security Policy
Blocking Crawlers – Robots.txt
NOARCHIVE \ NOSNIPPET
Directory Listing
FTP Log Files \ Web Traffic Reports
HTML \ Code Comments
Hidden Form Fields, Javascript
This isn’t Google’s fault.
Google is very happy to remove references. See
http://www.google.com/remove.html
Follow the webmaster advice found at
http://www.google.com/webmasters/faq.html
Determine what you want out there, and then
write a security policy to match. This can cover
areas such as:
Participation in forums and user groups
Out-of-office emails
Open employment opportunities
Technologies in use
Provides a list of instructions for web crawlers
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /tmp/
Disallow: /private/
http://www.robotstxt.org/
http://www.mcanerin.com/EN/search-engine/robots-txt.asp
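The example rules above can be checked with Python's standard-library robots.txt parser before deploying them:

```python
from urllib.robotparser import RobotFileParser

# Feed the example robots.txt above to the stdlib parser and
# check what a crawler is allowed to fetch.
rules = """User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /tmp/
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("Googlebot", "/index.html"))      # True
print(rp.can_fetch("Googlebot", "/private/pw.txt"))  # False
```

Note that robots.txt is advisory: well-behaved crawlers honor it, but it also hands an attacker a list of the paths you consider sensitive.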
Prevents caching of webpage / site
Done through meta tags
<META NAME="ROBOTS" CONTENT="NOARCHIVE">
<META NAME="GOOGLEBOT" CONTENT="NOARCHIVE">
Prevents the small amount of text listed below the title from
being collected
Done through meta tags
<META NAME="ROBOTS" CONTENT="NOSNIPPET">
One side effect of NOSNIPPET is the document will not be
cached either
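A quick way to audit which pages actually carry these directives is to scan their meta tags with Python's standard HTML parser (the `RobotsMetaScanner` class name is made up for this sketch):

```python
from html.parser import HTMLParser

class RobotsMetaScanner(HTMLParser):
    """Collect ROBOTS/GOOGLEBOT meta directives (NOARCHIVE,
    NOSNIPPET) from a page, i.e. the tags described above."""
    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)  # attribute names arrive lowercased
        if tag == 'meta' and a.get('name', '').upper() in ('ROBOTS', 'GOOGLEBOT'):
            self.directives.update(
                c.strip() for c in a.get('content', '').upper().split(','))

page = '<html><head><META NAME="ROBOTS" CONTENT="NOARCHIVE"></head></html>'
s = RobotsMetaScanner()
s.feed(page)
print(s.directives)  # {'NOARCHIVE'}
```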
If a server is missing its default page, like:
index.htm, index.html, default.asp, default.aspx
And directory browsing is enabled: Google will
read the file system and make it searchable
intitle:index.of ".<extension of file type>"
Lock these files/directories down with
NOARCHIVE/NOSNIPPET
This will keep it out of Google's cache and keep
the Wayback Machine from archiving it
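Before Google finds them for you, you can walk your own docroot and flag directories that lack a default page, which is the precondition for a listing (a local sketch only; it assumes directory browsing is actually enabled on the server):

```python
import os

# the default page names mentioned above
DEFAULT_PAGES = {'index.htm', 'index.html', 'default.asp', 'default.aspx'}

def listable_dirs(docroot):
    """Walk a docroot and flag directories with no default page --
    the ones a server with directory browsing enabled would expose
    as an 'Index of' listing."""
    flagged = []
    for path, dirs, files in os.walk(docroot):
        if not DEFAULT_PAGES & {f.lower() for f in files}:
            flagged.append(path)
    return flagged
```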
So what is the big deal?
Log and traffic reports give the attacker insight
into what is going on with the site:
How much traffic
When the traffic is at its peak
What files are being accessed
Comments make code more legible and easier to
understand
The Problem
Those same comments are just as legible and
easy to understand for an attacker
Just like HTML comments, the following are
all read by the bots and made searchable:
Hidden form fields
Javascript
Includes
Directory structure
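The same bot's-eye view can be reproduced locally: a short scan with Python's standard HTML parser surfaces the comments and hidden fields described above (the `LeakScanner` name is made up for this sketch):

```python
from html.parser import HTMLParser

class LeakScanner(HTMLParser):
    """Surface what the bots see: HTML comments and hidden
    form fields, per the slide above."""
    def __init__(self):
        super().__init__()
        self.comments = []
        self.hidden = []

    def handle_comment(self, data):
        self.comments.append(data.strip())

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == 'input' and a.get('type') == 'hidden':
            self.hidden.append((a.get('name'), a.get('value')))

page = ('<!-- TODO: remove test creds -->'
        '<form><input type="hidden" name="debug" value="1"></form>')
scanner = LeakScanner()
scanner.feed(page)
print(scanner.comments, scanner.hidden)
# ['TODO: remove test creds'] [('debug', '1')]
```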