Educational CyberPlayGround ®


Ways Google is shaking the security world


Ask Google anything--what's happening to GE's stock price, how to get to 881 Seventh Ave. in New York, where Mission Impossible 3 is showing, whatever happened to Brian W. after he moved away in the ninth grade--and you'll get an answer. That's the power of this $6 billion search engine sensation, which is so good at what it does that the company name became a verb.

That kind of power keeps Google on the front page of the news--and sometimes under unfavorable scrutiny, as demonstrated by Google's recent clashes with the U.S. Department of Justice and also with critics displeased by the search giant's stance on Chinese government censorship. CSOs and CISOs have a different reason to think carefully about Google and the implications of having so much information online, instantly accessible by almost anyone. Although these issues relate to all search engine companies, Google gets most of the attention--not only because of its huge share of the Web search market but because of its unabashed ambitions to catalog everything from images and libraries to Earth, the moon and Mars.

"We always get enamored of a new technology, and it takes us a while to understand the price of that technology," says Robert Garigue, vice president of information integrity and chief security executive of Bell Canada Enterprises in Montreal. For security pros, the price is that Google can be used to dig up network vulnerabilities and locations of sensitive facilities, to enable fraud and cause other sorts of mayhem against the enterprise. Here, CSO examines the ways Google is shaking the security world, and what companies can do about them.

1. Google Hacking (strictly defined)

What it is: Using search engines to find systems vulnerabilities. Hackers can use carefully crafted searches to find things like open ports, overly revealing error messages or even (egads) password files on a target organization's computer systems. Any search engine can do this; blame the popularity of the somewhat imprecise phrase "Google hacking" on Johnny Long. The author of the well-read book Google Hacking for Penetration Testers, Long hosts a virtual swap meet where members exchange and rate intricately written Google searches.
How it works: Google works by "crawling" the Web, indexing everything it finds, caching the index information and using it to create the answers when someone runs a Web search. Unfortunately, sometimes organizations set up their systems in a way that allows Google to index and save a lot more information than they intended. To look for open ports on CSO's Web servers, for instance, a hacker could search for INURL:WWW.CSOONLINE.COM:1, then INURL:WWW.CSOONLINE.COM:2, and so on, to see if Google has indexed port 1, port 2 and others. The researcher also might search for phrases such as "Apache test page" or "error message", which can reveal configuration details that are like hacker cheat sheets. Carefully crafted Google searches sometimes can even unearth links to sloppily installed surveillance cameras or webcams that are not meant to be public.

Why it matters: Suppose someone is scanning all your ports. Normally, this activity would show up in system logs and possibly set off an intrusion detection system. But search engines like Google have Web crawlers that are supposed to regularly read and index everything on your Web servers. (If they didn't, let's face it--no one would ever visit your website.) By searching those indices instead of the systems themselves, "you can do penetration testing without actually touching the victims' sites," points out consultant Nish Bhalla, founder of Security Compass.

What to do: Beat hackers at their own game: Hold your own Google hacking party (pizzas optional). Make Google and other search engines part of your company's routine penetration testing process. Bhalla recommends having techies focus on two things: which ports are open, and which error messages are available. When you find a problem, your first instinct may be to chase Google off those parts of your property. There is a way to do this--sort of--by using a commonly agreed-upon protocol called a "robots.txt" file.
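To make the format concrete, here is a minimal sketch of a robots.txt file and of how a crawler that chooses to honor it would read the rules. It uses Python's standard urllib.robotparser module. The disallowed paths and the "ExampleBot" user-agent name are invented for illustration and do not describe any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; these paths are made up for the example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /internal/
Disallow: /reports/drafts/
"""


def allowed_paths(robots_txt: str, user_agent: str, paths: list[str]) -> dict[str, bool]:
    """Report, for each path, whether a compliant crawler would fetch it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(user_agent, path) for path in paths}


if __name__ == "__main__":
    results = allowed_paths(
        ROBOTS_TXT,
        "ExampleBot",  # hypothetical crawler name
        ["/index.html", "/internal/phones.html"],
    )
    for path, ok in results.items():
        print(path, "may be crawled" if ok else "is disallowed")
```

The sketch only models a crawler that volunteers to follow the rules; nothing in the file itself enforces them.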
This file, which is placed in the root directory of a website, contains instructions about files or folders that should not be indexed by search engines. (For a notoriously long example, view the White House's file.) Many companies that run search engines heed the instructions in this file. Notice we said "many"? Some search engines ignore robots.txt requests and simply index everything anyway. What's more, the robots.txt file tips off hackers about which public parts of your Web servers you'd prefer to keep quiet.

Meanwhile, the information that your pen testers found through Google is already out there. Sure, you can contact search engines individually and ask them, pretty please, to remove the information from their caches. But you're better off making the information useless. "The persistence of these caches is impossible to manage, so you have to assume that if it's there, it's going to be there forever," says Ed Amoroso, CISO of AT&T. His solution? Simple. "Let's say you found a file with a bunch of passwords. Change those passwords." Then, fix the underlying problem. Eliminate or hide information that shouldn't be publicly available. Long term, you'll have to do the heavy lifting too, by closing unnecessary ports or fixing poorly written applications.

Shock waves: 4 (highest). It's up to you to make sure your company isn't accidentally publishing instructions on how to hack its systems.

2. Google Hacking (loosely defined)

What it is: Using search engines to find intellectual property. It's Google intel: The researcher uses targeted Web searches to find bits and pieces of information that, when put together, form a picture of an organization's strategy. Unlike, say, launching a SQL injection attack, doing competitive intelligence using public sources is quite legal (and may in fact be good business).
How it works: The researcher scours the Web for information that might include research presented at academic conferences, comments made in chat rooms, résumés or job openings. "Companies leave bread crumb trails all over the place on the Web," says Leonard Fuld, founder of Fuld & Co. and author of the forthcoming book The Secret Language of Competitive Intelligence.

One common tactic is using search queries that reveal only specific file types, such as Microsoft Excel spreadsheets (filetype:xls), Microsoft Word documents (filetype:doc) or Adobe PDFs (filetype:pdf). This kind of search filters out a lot of noise. Say you want information about General Motors. Searching for "GENERAL MOTORS" "FINANCIAL ANALYSIS" one day in February yielded 56,400 results. Searching for "GENERAL MOTORS" "FINANCIAL ANALYSIS" FILETYPE:XLS brought up only 34 documents. One of those documents was a spreadsheet from a recruiting agency that contains the current jobs and work history (though not the names) of executives at numerous companies (including GM) who may be on the job market.

Another common approach is searching for phrases that may indicate information that wasn't intended to be public. For this, keywords such as "personal", "confidential" or "not for distribution" are invaluable. These targeted searches don't always hit pay dirt, but they can be fascinating. For instance, on that same day in February, the top hit on a search for "GENERAL MOTORS" "NOT FOR DISTRIBUTION" was a PDF from a credit-rating company with poorly redacted information that could be easily viewed by pasting the text into another document. (Oops!)

A final tactic is to target the organization's site itself for information, such as phone lists, that could be useful for social engineering scams. Researchers might use the site search function and look for the phrase "phone list" or "contact list".
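The query patterns described in this section (quoted phrases, a filetype: filter, a site: restriction) are simple enough to assemble programmatically, which can help when running the same self-audit against your own domain on a schedule. The following is only a sketch: the function names are hypothetical, example.com is a placeholder domain, and the code builds query strings without submitting them to any search engine.

```python
from typing import Iterable, List, Optional

# Phrases the article mentions as hints that a document was not meant to be public.
SENSITIVE_PHRASES = ["phone list", "contact list", "confidential", "not for distribution"]


def build_query(phrases: Iterable[str],
                filetype: Optional[str] = None,
                site: Optional[str] = None) -> str:
    """Combine exact-match phrases with optional filetype: and site: operators."""
    parts = [f'"{phrase}"' for phrase in phrases]
    if filetype:
        parts.append(f"filetype:{filetype}")
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


def audit_queries(site: str, phrases: Iterable[str] = SENSITIVE_PHRASES) -> List[str]:
    """One site-restricted query per sensitive phrase, for checking your own domain."""
    return [build_query([phrase], site=site) for phrase in phrases]


if __name__ == "__main__":
    print(build_query(["general motors", "financial analysis"], filetype="xls"))
    for query in audit_queries("example.com"):  # placeholder domain
        print(query)
```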
(An actual search might be SITE:CSOONLINE.COM "PHONE LIST", and if you run that particular search, you'll find stories CSO has published about why your company's phone directory is better kept under wraps.)

Why it matters: "If it's on Google, it's all legal," says Ira Winkler, information security consultant and author of Spies Among Us. Competitive intelligence of this sort is illegal espionage only when it involves a trade secret--and if something is public enough to appear in Google, can you really argue that it was protected like a trade secret?

What to do: That Google hacking party we mentioned earlier should involve a few site searches for sensitive files, such as financial records and documents labeled "not for distribution." Beyond your own borders, it's a good idea to know what people are saying about your organization, even if there's little you can do about it. "Using search engines to figure out what your public-facing view looks like has become a de facto element in any corporate security program," Amoroso says. Brand protection companies such as MarkMonitor and Cyveillance will work the beat for you, if you'd prefer. Creating (and enforcing) good policies about employee blogging or the use of message boards and chat rooms can also limit your exposure.

Shock waves: 3 (significant). This kind of competitive intelligence has been going on forever, and it is damaging. The Web means more information gets out, and it's easier to find.

3. Google Earth

What it is: A software download that provides highly navigable satellite and aerial photography of the entire globe. (The same images are also available through Google Maps.) The scope and resolution of the photos are eye-popping enough that Google Earth drew ire even as a beta product in 2005. Some people feel threatened that a photo of, say, their backyard is only a few clicks away, and others fear that terrorists will use the images of landmarks or pieces of the critical infrastructure to plot attacks.

How it works: After the user installs the software (the basic version is free), she can zoom to any spot on the planet, often with enough detail to see driveways, if not cars. The virtual globe can be overlaid with information on roads, train tracks, coffee shops, hotels and more. Enterprising researchers are also overlaying Google Maps with everything from locations of murders to public rest rooms that have baby-changing tables. Images are up to three years old and come from commercial and public sources, with widely varying resolution.

Why it matters: The privacy implications of having this information so readily available are certainly worth discussing as a society, but the security risks to U.S.-based companies are low. Much of the information was already available anyway. For instance, Microsoft stitched together images from the U.S. Geological Survey a decade ago with its Terraserver project. It just doesn't work as smoothly. Not only have these types of images long been available online, but they can also be easily purchased from government and private sources, says John Pike, director of a military think tank. There are only a couple of legal restrictions. First, the images must be at least 24 hours old. Second, the U.S. military has what Pike calls "shutter control": the ability to tell commercial satellite companies not to release imagery that might compromise U.S. military operations. To the best of Pike's knowledge, the U.S. military has never invoked this power, nor have the regulations governing satellite imagery changed during the Bush administration's war on terrorism. "If Rummy's not worried about it," Pike says, referring to Secretary of Defense Donald Rumsfeld, "it's hard for me to see how anyone can lose much sleep over it."

What to do: If your organization's security plan is based on no one being able to obtain aerial or satellite photography of a facility, then it probably ain't much of a plan. "Anybody who has the capacity to constitute a threat that rises much above graffiti is going to have it in their power to get imagery of a facility," Pike says. "If security managers have something that they don't want to be seen, they need to put a roof on it." Beyond that, be prepared for cocktail party banter about the risks and rewards of Google Earth and Google Maps. At the U.S. Food and Drug Administration, for instance, CISO Kevin Stine finds Google Earth personally fascinating, and he likes to muse about its potential for use in, say, disaster planning. "From a CISO perspective, I think we need to be aware of these kinds of tools," he says. But for his security group, the only impact he thinks Google Earth might eventually have, if it begins to encompass more business applications, is a drain on bandwidth. In other words, it's a concern about as big as your lawn chairs seen from space.

Shock waves: 1 (minimal). Security by obscurity is so 20th century. Google Earth just illustrates why.

4. Click Fraud

What it is: The act of manipulating pay-per-click advertising. Perpetrators inflate the number of people who have legitimately clicked an online ad, either to make money for themselves or to bleed a competitor's advertising budget.

How it works: With pay-per-click advertising, an advertiser pays each time someone clicks an ad hosted on a website. Google, Yahoo and other search engine companies make their money by selling advertisers the right to have their text-only ads appear when someone searches for a particular keyword. There are two ways to manipulate pay-per-click advertising: competitor click fraud and network click fraud.

First, the competitor variety: Let's suppose a company that sells life insurance wants to advertise on Google. The company might bid for and win rights to the phrase "life insurance". Then, when someone runs a Google search for that exact phrase, the company's ad appears next to the search results as a sponsored link. (How close to the top of the list depends on both the price per click and the superpowered algorithms that constitute Google's secret sauce.) Each time someone clicks the sponsored link, Life Insurance Co. pays the agreed-upon price to Google -- say $5. With competitor click fraud, an unscrupulous competitor tries to run up Life Insurance Co.'s advertising bill by clicking the link. A lot.

Network click fraud, on the other hand, cashes in on the fact that Google isn't the only company that hosts Google advertising. Suppose someone has a blog about insurance. She can sign up as a Google advertising affiliate and have ads for insurance run on her site. If Life Insurance Co. is paying Google $5 per click, Ms. Insurance Blogger might pocket $1 for each click her site generates. Network click fraud is when an affiliate generates fraudulent traffic in order to boost its revenue.

Google insists it is trying to keep the problem in check.
Shuman Ghosmajumder, product manager for trust and safety at Google, says the company monitors for all kinds of what it dubs "invalid clicks," and that it routinely issues refunds to advertisers and closes down fraudulent affiliates. In 2005, Google even won a lawsuit against an affiliate it charged with click fraud. But some advertisers say that Google isn't doing enough to prevent and monitor for fraud because it profits from the fraud. Google faces a class-action lawsuit led by AIT, a Web-hosting company, and is in the midst of reaching a $90 million settlement with Lane's Gifts & Collectibles, a mail-order store. (At press time, the proposed settlement was before a judge.)

Why it matters: Click fraud is following a trajectory that will be familiar to any CSO, and it's a telling example of how sophisticated and profitable electronic crime has become. First, the good guys started looking at server logs to find IP addresses in patterns that indicated fraud. The bad guys responded by creating automated bots that simulated different IP addresses and had varying time stamps. Then, the good guys improved their click-fraud detection tools, with a cottage industry sprouting up that specializes in helping online advertisers monitor for fraud. Cue "click farms," where the bad guys hire people in other countries to do the clicking in a way that looks more realistic. "It's a cat-and-mouse game," says executive editor Chris Sherman.

What to do: The first step is to put tracking measures in place. In a recent survey done by the Search Engine Marketing Professional Organization (Sempo), a trade group, 42 percent of respondents said they had been victims of click fraud, but nearly one-third of respondents said they weren't actively tracking fraud. "The way you monitor it is you look for something that doesn't make sense," explains Kevin Lee, chair of the group's research committee. "If you spent $100 every day last week, and then this week you spent $130 every day and didn't get any more conversions, or whatever your success metrics are," then you might have a problem, he says. "Usually the engines will catch the obvious fraud, and they won't even bill you for it," Lee continues. But if you have a larger problem, you may need to gather information about why you believe some of the clicks are fraudulent and ask the company hosting the ads for a refund.

Ghosmajumder says Google devotes significant resources to a team of investigators who proactively monitor for fraud and also do research about possible fraud reported by advertisers. Google also has engineers working on technical means to identify invalid clicks. According to the Sempo survey, 78 percent of advertisers that have been victims of click fraud have received credit from a paid search provider, and 40 percent of the time it was based on their request. The question, of course, is whether to bother making a request. Who better than the CSO to help the advertising department figure out whether it would cost more for the company to tamp down on the problem or simply to pay for the fraud?

Shock waves: 2 (moderate). For companies using pay-per-click, this is one to watch. Click fraud has the potential to dramatically reduce the effectiveness of online advertising. But with more than 90 percent of Google's revenue coming from advertising, the company has a serious incentive to keep the problem in check so that advertisers don't lose faith in the pay-per-click model.

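Kevin Lee's rule of thumb from the click fraud section (spending rises from $100 to $130 a day with no additional conversions) can be written as a simple check. This is a rough sketch, not a fraud detector: the function name is hypothetical, and the 20 percent threshold is an arbitrary assumption chosen for illustration. A flagged week is only a prompt to dig into the click data before asking the ad host for a refund.

```python
from statistics import mean
from typing import Sequence


def spend_looks_suspicious(prev_spend: Sequence[float],
                           prev_conversions: Sequence[int],
                           curr_spend: Sequence[float],
                           curr_conversions: Sequence[int],
                           min_spend_increase: float = 0.2) -> bool:
    """Flag a period where average daily spend jumped but conversions did not rise.

    min_spend_increase is an assumed threshold (0.2 means a 20 percent jump).
    """
    prev_avg = mean(prev_spend)
    curr_avg = mean(curr_spend)
    if prev_avg <= 0:
        raise ValueError("previous period must have positive average spend")
    spend_increase = (curr_avg - prev_avg) / prev_avg
    conversions_rose = sum(curr_conversions) > sum(prev_conversions)
    return spend_increase >= min_spend_increase and not conversions_rose


if __name__ == "__main__":
    last_week_spend = [100] * 7   # dollars per day, as in Lee's example
    this_week_spend = [130] * 7
    conversions = [10] * 7        # made-up conversion counts, unchanged week to week
    print(spend_looks_suspicious(last_week_spend, conversions,
                                 this_week_spend, conversions))
```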
5. Google Desktop

What it is: A free tool offered by Google that allows users to quickly search the contents of their hard drives. (Similar tools are offered by MSN, Yahoo and others.) The latest version can also be used to share files between computers.

How it works: After the user downloads the tool, it works in the background to index everything on his hard drive, much like Google indexes the Web. All fixed drives are indexed by default, but the user can specify folders to exclude or extra drives to add. The software can be set to return results on text files, spreadsheets, PDFs, Web history, e-mail and more. Once the indexing is done, when the user runs a Google search, items from his own computer appear at the top of the results. Alternatively, he can use the tool by itself by opening it on his desktop; he doesn't even need to be connected to the Web. A new version also has a controversial feature that allows a user to share files between computers. With this setting enabled, Google indexes the files on one computer, pulls them up on its servers, then pushes them down onto another computer (which is similarly configured with the software). Then, a search done on one computer returns results from both.

Why it matters: It's easy to see why people get all prickly about this one. Once the tool is installed and files are indexed, a snoop needs only a coffee break, rather than a lunch hour, to search someone's hard drive for files about, say, Bob Jones's salary. To make matters worse, freewheeling users may not pay attention or understand how to make sure that sensitive documents aren't indexed. To its credit, Google has tried to improve the standard configuration of the tool. An early version automatically returned results with password-protected files and secure HTTP pages; now, those types of files aren't indexed unless the user changes a setting. "People screamed about that, and Google changed it very quickly," Sherman says.
Even so, setting up appropriate exclusions can get complicated. Some companies--as well as many individuals who are concerned about their personal privacy--are also leery of making so much information available to Google. The new Search Across Computers feature only heightens these concerns. With this feature, Google says, copies of users' personal files can sit on Google's servers for up to 30 days. Google downplays this time frame. Says Matthew Glotzbach, product manager for Google Enterprise, "If both of your computers are on and syncing, [the files are on Google's servers] only a matter of minutes"--the time it takes for Google to pull up the information and push it back down onto the second computer.

But having the information saved on Google's servers at all is troubling, given that search engine companies are routinely subpoenaed by prosecutors. (Google's privacy policy states: "We may also share information with third parties in limited circumstances, including when complying with legal process, preventing fraud or imminent harm, and ensuring the security of our network and services.") In one especially charged case, Google fought a subpoena from the U.S. Department of Justice, which wanted search results to help analyze its enforcement of the Child Online Protection Act. A judge reduced the amount of information Google must turn over, and the ensuing debate raised awareness about the amount (and nature) of information that Google has in its stores.

The fact that the software is relatively untested raises additional questions. Last November, an Israeli researcher reported that he had found a vulnerability in Microsoft Internet Explorer that allowed him to illicitly access information in Google Desktop. Google fixed the problem, but legitimate concerns linger. "Anytime you install software from a third party directly on a hard drive of a particular machine, you're potentially opening up holes in the security of that machine," says Matt Brown, a Forrester senior analyst.

What to do: It's time to catch up--something that Brown says is especially important given the fact that Sarbanes-Oxley requires companies to keep tabs on where and how long their information is retained. Consider whether your users actually need desktop search for their jobs. If they do, you'll want to have a hand in how it's configured and used. (Bonus points go to the CSO who makes sure that users understand the privacy implications of all these tools, beyond just telling them to read the privacy policy.)

At the FDA, Stine is in the early stages of looking at the tool. "There have been some requests [for desktop search] here and there, but there hasn't been a user outcry," he says. If (or when) there comes a point when a lot of users have a legitimate need for desktop search, Stine says he'll look carefully at how the technology identifies, indexes and presents information. "We'd have to ensure that we still maintain complete control--at least as complete as possible--over the information," he says.

Fortunately, he'd have plenty of options. Several companies have enterprise desktop search tools that help CISOs keep tabs on the information. Google Desktop 3 for Enterprise, currently in beta, allows administrators to completely disable features such as the Search Across Computers feature. Google says it is working to make future versions of this tool easier to manage. "I don't think we anticipated such a concerned or negative response," Glotzbach says. "We've taken to heart the feedback on the Search Across Computers feature, especially in the enterprise context, and we're actively working on making it even easier for the companies to use" in a secure manner, he says.
X1 Technologies, which has partnered with Yahoo, offers a competing enterprise search tool that Brown says is more manageable from an IT perspective. "Part of the problem with these technologies is they get announced and people immediately start downloading," Brown says. "It takes companies a little while to catch on to what's happening."

Shock waves: 4 (highest). Desktop search is an untested technology with a wide potential for misuse. If your users don't need it, don't let them use it; if they do need it, consider enterprise tools that can be centrally managed and controlled.

Future Shocks

Google has shaken us by holding up a mirror and forcing us to look at what we've put online. "Google provides a lot of capability that can do you harm as well as providing you search capabilities," Winkler says. "What makes it its strength makes it its danger."

The future will make search technology only more dangerous. Bell Canada's Garigue points out that search technology is still in its very infancy, barely scratching the surface of what he calls the shallow Web. "The shallow Web is everything that's public on Web servers," he says. "The deep Web is what's hidden inside databases." From the Library of Congress to Lexis-Nexis' legal and news archives, to Medline's medical databases, the great bulk of information that people access online is still available only to subscribers, not to Google. "Google is the first generation of tools," Garigue says. As those tools get more sophisticated, the shock waves will only grow stronger.

May 16, 2006
This story is reprinted from CSO, an online resource for information executives. Story Copyright CXO Media Inc., 2006.