Thursday, August 14, 2008

To infinity and beyond? No!

Sent to you by xingxing via Google Reader:

via Google Webmaster Central Blog by Maile Ohye on 8/5/08
When Googlebot crawls the web, it often finds what we call an "infinite space". These are very large numbers of links that usually provide little or no new content for Googlebot to index. If this happens on your site, crawling those URLs may use unnecessary bandwidth, and could result in Googlebot failing to completely index the real content on your site.

Recently, we started notifying site owners when we discover this problem on their web sites. Like most messages we send, you'll find them in Webmaster Tools in the Message Center. You'll probably want to know right away if Googlebot has this problem - or other problems - crawling your sites. So verify your site with Webmaster Tools, and check the Message Center every now and then.



Examples of an infinite space

The classic example of an "infinite space" is a calendar with a "Next Month" link. It may be possible to keep following those "Next Month" links forever! Of course, that's not what you want Googlebot to do. Googlebot is smart enough to figure out some of those on its own, but there are a lot of ways to create an infinite space and we may not detect all of them.


Another common scenario is websites which provide for filtering a set of search results in many ways. A shopping site might allow for finding clothing items by filtering on category, price, color, brand, style, etc. The number of possible combinations of filters can grow exponentially. This can produce thousands of URLs, all finding some subset of the items sold. This may be convenient for your users, but is not so helpful for the Googlebot, which just wants to find everything - once!
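To get a feel for how quickly filter combinations multiply, here is a minimal Python sketch. The facet names, their values, and the /search path are hypothetical and not taken from any real site. It assumes each filter is either unset or set to exactly one value, and that parameters always appear in the same order.

```python
from itertools import product

# Hypothetical filter facets for a shopping site (illustrative values only).
FACETS = {
    "category": ["shirts", "pants", "shoes"],
    "price": ["0-25", "25-50", "50-100"],
    "color": ["red", "blue", "black", "white"],
    "brand": ["acme", "globex"],
}


def count_filter_urls(facets):
    """Count filter combinations when each facet is unset or has one value."""
    total = 1
    for values in facets.values():
        total *= len(values) + 1  # +1 for "facet not applied"
    return total


def filter_urls(facets, base="/search"):
    """Yield one URL per combination, keeping the facets in a fixed order."""
    names = list(facets)
    options = [[None] + facets[name] for name in names]
    for combo in product(*options):
        query = "&".join(
            f"{name}={value}" for name, value in zip(names, combo) if value is not None
        )
        yield f"{base}?{query}" if query else base
```

With these made-up facets, count_filter_urls(FACETS) is 4 * 4 * 5 * 3 = 240 URLs for only 12 filter values, and allowing the parameters to appear in any order would multiply that further.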

Correcting infinite space issues

Our Webmaster Tools Help article describes more ways infinite spaces can arise, and provides recommendations on how to avoid the problem. One fix is to eliminate whole categories of dynamically generated links using your robots.txt file. The Help Center has lots of information on how to use robots.txt. If you do that, don't forget to verify that Googlebot can find all your content some other way. Another option is to block those problematic links with a "nofollow" link attribute. If you'd like more information on "nofollow" links, check out the Webmaster Help Center.
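As a rough illustration of the robots.txt approach, the sketch below checks a few URLs against a small rule set using Python's standard urllib.robotparser module. The rules, paths, and example.com URLs are assumptions made up for this example. Note also that the standard-library parser does simple prefix matching, so it may not mirror Googlebot's own behavior exactly.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules that block a calendar and filtered search results.
ROBOTS_TXT = """\
User-agent: *
Disallow: /calendar/
Disallow: /search
"""


def is_crawlable(url, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Return True if the given rules allow the agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/calendar/2008/09/",
        "http://www.example.com/search?color=red",
        "http://www.example.com/products/shirt-123",
    ):
        print(url, is_crawlable(url))
```

Running a check like this over a list of your real content URLs is one way to confirm that the rules block only the infinite space and not the pages you want indexed.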

Written by Torrey Hoffman, Webmaster Tools team

How to start a multilingual site

via Google Webmaster Central Blog by Chark on 8/7/08
Have you ever thought of creating one or several sites in different languages? Let's say you want to start a travel site about backpacking in Europe, and you want to offer your content to English, German, and Spanish speakers. You'll want to keep in mind factors like site structure, geographic as well as language targeting, and content organization.

Site structure
The first thing you'll want to consider is if it makes sense for you to buy country-specific top-level domains (TLDs) for all the countries you plan to serve. So your domains might be ilovebackpacking.co.uk, ichlieberucksackreisen.de, and irdemochilero.es. This option is beneficial if you want to target the countries that each TLD is associated with, a method known as geo targeting. Note that this is different from language targeting, which we will get into a little more later. Let's say your German content is specifically for users from Germany and not as relevant for German-speaking users in Austria or Switzerland. In this case, you'd want to register a domain on the .de TLD. German users will identify your site as a local one they are more likely to trust. On the other hand, it can be pretty expensive to buy domains on the country-specific TLDs, and it's more of a pain to update and maintain multiple domains. So if your time and resources are limited, consider buying one non-country-specific domain, which hosts all the different versions of your website. In this case, we recommend either of these two options:
  1. Put the content of every language in a different subdomain. For our example, you would have en.example.com, de.example.com, and es.example.com.
  2. Put the content of every language in a different subdirectory. This is easier to handle when updating and maintaining your site. For our example, you would have example.com/en/, example.com/de/, and example.com/es/.
Matt Cutts wrote a substantial post on subdirectories and subdomains, which may help you decide which option to go with.

Geographic targeting vs. Language targeting
As mentioned above, if your content is especially targeted towards a particular region in the world, you can use the Set Geographic Target tool in Webmaster Tools. It allows you to set different geographic targets for different subdirectories or subdomains (e.g., /de/ for Germany).

If you want to reach all speakers of a particular language around the world, you probably don't want to limit yourself to a specific geographic location. This is known as language targeting, and in this case, you don't want to use the geographic target tool.

Content organization
The same content in different languages is not considered duplicate content. Just make sure you keep things organized. If you follow one of the site structure recommendations mentioned above, this should be pretty straightforward. Avoid mixing languages on each page, as this may confuse Googlebot as well as your users. Keep navigation and content in the same language on each page.

If you want to check how many of your pages are recognized in a certain language, you can perform a language-specific site search. For example, go to google.de, do a site search on google.com, and choose the option below the search box to display only German results.
If you have more questions on this topic, you can join our Webmaster Help Group to get more advice.

Posted by Charlene Perez and Juliane Stiller, Search Quality Team

It's 404 week at Webmaster Central

via Google Webmaster Central Blog by Maile Ohye on 8/11/08
This week we're publishing several blog posts dedicated to helping you with one response code: 404.

Response codes are a numeric status (like 200 for "OK", 301 for "Moved Permanently") that a webserver returns in response to a request for a URL. The 404 response code should be returned for a file "Not Found".

When a user sends a request for your webpage, your webserver looks for the corresponding file for the URL. If a file exists, your webserver likely responds with a 200 response code along with a message (often the content of the page, such as the HTML).

200 response code flow chart


So what's a 404? Let's say that in the link to "Visit Google Apps" above, the link is broken because of a typing error when coding the page. Now when a user clicks "Visit Google Apps", the particular webpage/file isn't located by the webserver. The webserver should return a 404 response code, meaning "Not Found".
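The lookup described above can be sketched in a few lines of Python. This is only an illustration of the 200 versus 404 decision, not how any particular webserver is implemented; the PAGES table and its paths are hypothetical.

```python
from http import HTTPStatus

# Hypothetical site content keyed by URL path.
PAGES = {
    "/": "<h1>Home</h1>",
    "/apps": "<h1>Google Apps</h1>",
}


def respond(path, pages=PAGES):
    """Return (status, body) for a request, as a very simple server might."""
    if path in pages:
        # The file exists: answer 200 "OK" along with the page content.
        return HTTPStatus.OK, pages[path]
    # Nothing matches the requested path: answer 404 "Not Found".
    return HTTPStatus.NOT_FOUND, ""
```

Here respond("/apps") yields a 200 status with the page body, while a mistyped link such as respond("/aps") yields a 404 status.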

404 response code flow chart


Now that we're all on board with the basics of 404s, stay tuned 4 even more information on making 404s good 4 users and 4 search engines.

Written by Maile Ohye, Developer Programs Tech Lead

Farewell to soft 404s

via Google Webmaster Central Blog by Maile Ohye on 8/12/08
We see two kinds of 404 ("File not found") responses on the web: "hard 404s" and "soft 404s." We discourage the use of so-called "soft 404s" because they can be a confusing experience for users and search engines. Instead of returning a 404 response code for a non-existent URL, websites that serve "soft 404s" return a 200 response code. The content of the 200 response is often the homepage of the site, or an error page.

How does a soft 404 look to the user? Here's a mockup of a soft 404: This site returns a 200 response code and the site's homepage for URLs that don't exist.



As exemplified above, soft 404s are confusing for users, and furthermore search engines may spend much of their time crawling and indexing non-existent, often duplicative URLs on your site. This can negatively impact your site's crawl coverage—because of the time Googlebot spends on non-existent pages, your unique URLs may not be discovered as quickly or visited as frequently.

What should you do instead of returning a soft 404?
It's much better to return a 404 response code and clearly explain to users that the file wasn't found. This makes search engines and many users happy.

Return 404 response code



Return clear message to users



Can your webserver return 404, but send a helpful "Not found" message to the user?
Of course! More info as "404 week" continues!
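As a hedged illustration of returning a real 404 together with a helpful message, here is a minimal sketch built on Python's standard http.server module. The PAGES table, the wording of the error page, and the port number are assumptions made for this example; a production site would normally configure this in its own webserver instead.

```python
import html
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content keyed by URL path.
PAGES = {"/": "<h1>Home</h1>"}


def not_found_page(path):
    """Build a friendly error page; the requested path is escaped for safety."""
    safe_path = html.escape(path)
    return (
        "<h1>Page not found</h1>"
        f"<p>Nothing exists at {safe_path}. "
        "Try the <a href='/'>homepage</a> instead.</p>"
    )


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path in PAGES:
            status, body = HTTPStatus.OK, PAGES[self.path]
        else:
            # A "hard" 404: the status code says Not Found,
            # while the body still helps the user.
            status, body = HTTPStatus.NOT_FOUND, not_found_page(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8000):
    """Run the sketch locally; the port is an arbitrary choice."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling serve() and requesting a path that is not in PAGES should return the friendly page with a 404 status rather than a 200.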

Written by Maile Ohye, Developer Programs Tech Lead

Video units more widely available

by Inside AdSense Team

via Inside AdSense by Inside AdSense Team on 8/13/08
We've heard your feedback about the availability of video units, so we've expanded the languages and geographies where they're supported. We've added French, German, and Spanish to our list of supported languages, and also Brazil, Germany, India, and Mexico to the list of supported countries. To help you determine whether you'll be able to use video units, consider these two questions:
  1. Do you have an English, French, German, Japanese, or Spanish-language AdSense account?

  2. Are you located in one of the following countries?

    Australia, Brazil, Canada, France, India, Ireland, Italy, Germany, Japan, Mexico, Netherlands, New Zealand, Poland, Spain, United Kingdom, United States
If you answered 'yes' to both questions, then video units are available to you. Sign in to your AdSense account and visit your AdSense Setup tab to get started. After linking your YouTube and AdSense accounts, you'll be able to customize and embed the video player into your website, and choose between having videos automatically targeted to your content or manually selecting content categories and content providers. Companion and text overlay ads will appear with your videos, and you'll generate earnings for valid clicks and impressions on those ads.

You can find more information about video units in our Help Center, and as always, please feel free to leave us a comment with your video unit feedback. We'll be sure to keep you posted if video units become available in any additional countries or languages.

Posted by Arlene Lee - AdSense Publisher Support

Where is Georgia on Google Maps?

by Karen

via The Official Google Blog by Karen on 8/12/08
Cross-posted from the Google LatLong Blog.

The recent conflict in Georgia has raised some questions about how Google Maps has handled mapping in that part of the world. The most obvious question is, why doesn't Google Maps show any cities or roads for Georgia, or its neighbors Armenia and Azerbaijan? The answer is we never launched coverage in those countries because we simply weren't satisfied with the map data we had available. We're constantly searching for the best map data we can find, and sometimes will delay launching coverage in a country if we think we can get more comprehensive data. Some of our customers have asked if we removed map data from any of these countries in response to the recent hostilities in that region and I can assure you that is not the case. Data for these countries were never on Google Maps in the first place.

But this has generated a lot of feedback that we are listening to and learning from. We're hearing from our users that they would rather see even very basic coverage of a country than see nothing at all. That certainly makes sense, and so we have started preparing data for the handful of countries that are still blank on Google Maps. Georgia, Armenia, Azerbaijan, as well as other significant regions of the world will benefit from this effort.

In the meantime, much of this data, including cities in Georgia and other surrounding countries, can be found in Google Earth.

Posted by Dave Barth, Product Manager

We feel your pain, and we're sorry

by Gmail Blog

via Gmail Blog by Gmail Blog on 8/11/08
Posted by Todd Jackson, Gmail Product Manager

Many of you had trouble accessing Gmail for a couple of hours this afternoon, and we're really sorry. The issue was caused by a temporary outage in our contacts system that was preventing Gmail from loading properly. Everything should be back to normal by the time you read this.

We heard loud and clear today how much people care about their Gmail accounts. We followed all the emails to our support team and user group, we fielded phone calls from Google Apps customers and friends, and we saw the many Twitter posts. (We also heard from plenty of Googlers, who use Gmail for company email.) We never take for granted the commitment we've made to running an email service that you can count on.

We've identified the source of this issue and fixed it. In addition, as with all issues that affect Gmail and our other services, we're conducting a full review of what went wrong and moving quickly to update our internal systems and procedures accordingly. We don't usually post about problems like this on our blog, but we wanted to make an exception in this case since so many people were impacted. In general, though, if you spot a problem with your Gmail account, please visit the Gmail Help Center and user group, where the Gmail Guides are your fastest source of updates.

Again, we're sorry.

Get your Google Calendar in 38 languages

by Gmail Blog

via Gmail Blog by Gmail Blog on 8/13/08
Posted by Ken Norton, Product Manager

One of our goals at Google is to give everyone the information they want in the language they speak. We've been hard at work making Google products available in as many languages as possible. Recently we launched Google Calendar in eight more languages, bringing our total number of supported languages to 38 (and closing in on Gmail's 50). The new languages are Latvian, Romanian, Filipino/Tagalog, Serbian, Ukrainian, Bulgarian, Hindi and Indonesian.

To use Google Calendar in your preferred language, just sign in, click Settings in the upper right-hand corner, and look for Language.


Clustering Photos To Make for a 3D Scene (Video)

via Google Blogoscoped by Philipp Lenssen on 8/14/08

The University of Washington and Microsoft Research took many Flickr photos of a single location and algorithmically stitched them together to let users explore them in a 3D view. [Via Reddit.]

[By Philipp Lenssen | Origin: Clustering Photos To Make for a 3D Scene (Vid ... | Comments]


Google Search Results Show Metadata for Scientific Papers

by Ionut Alex Chitu

via Google Operating System by Ionut Alex Chitu on 8/13/08
Google has started to integrate into its search results information about the scientific papers included in Google Scholar. Below the snippet, Google lists the authors, the number of citations, and links to related articles and other versions available online. The integration is not perfect and the search results look cluttered, but it's yet another class of results that have richer snippets.

Here's the top search result for buddy tree (a data structure):


... and the same result at Google Scholar:


Google also shows additional information next to videos, books, and web pages that include addresses, and it is testing the display of metadata for forums and the extraction of specialized information from web pages. While Yahoo tries to convince webmasters to make structured data explicitly available, Google has a more practical approach and uses what's already available to enhance search results.

{ via Blogoscoped Forum }

Sunday, August 3, 2008

A Week Without Google

via Google Operating System by Ionut Alex Chitu on 8/2/08
What would you miss the most if your ISP blocked all Google services for a week? Among other things, the search engine would no longer work, Gmail's web interface wouldn't load, YouTube videos would be blocked, web pages would load slower because the Analytics tracking code would no longer work, and all the Google Maps mashups would be completely useless for you.


Fortunately, you don't have to stay a week without Google, but this blog is taking a break for a week.

Saturday, August 2, 2008

Gmail Shows "Never Send It To Spam" Filter

via Google Blogoscoped by Philipp Lenssen on 7/30/08

Google's webmailer Gmail has an apparently* new filter function named "Never send it to Spam". Ticking this should ensure that a certain email – with criteria you define, like by entering your friend's name in the "From" field – will not be accidentally sorted into the spam folder. It's a nice option to have as last resort, like when you identified certain types of good mail which never see the inbox, even though naturally most of the time we'd like to have Gmail figure it out for us automatically, I guess (e.g. to perhaps not flag something as spam which someone we talked to before sent to us, unless we flagged their messages as spam later on).

[Thanks Hebbet!]

*I can't tell how new it is.

[By Philipp Lenssen | Origin: Gmail Shows "Never Send It To Spam" Filter | Comments]


Google Street View Car Pulled Over by Police

via Google Blogoscoped by Philipp Lenssen on 7/31/08

The Telegraph writes:

<<One of the cars that are currently filming every road in Britain stopped by police – for driving in a bus lane.

The distinctive vehicle – complete with roof-mounted camera pole and Google logo on the door – was stopped in the centre of Bradford at 12.40pm yesterday.

Eyewitnesses described how the Google car was followed through the city centre by a panda car with sirens blazing.>>

If there's ever going to be a movie made about Google, please, put in something from that last paragraph.

In other news, the British Information Commissioner's Office just gave Google's Street View filming the thumbs up, the Guardian reports. This organization, called a privacy watchdog by the Guardian, stated they are "satisfied that Google is putting in place adequate safeguards to avoid any risk to the privacy or safety of individuals, including the blurring of vehicle registration marks and the faces of anyone included in Street View images".

[Thanks Manoj Nahar and Dave Shaw!]

[By Philipp Lenssen | Origin: Google Street View Car Pulled Over by Police | Comments]


UK police fly out to probe Antigua murder

via CNN.com on 8/2/08
A man critically injured in a honeymoon attack that killed his wife has arrived back in the UK, as British detectives fly to Antigua to help with the investigation, according to media reports.


Therapist: Anthrax suspect had 'kill plan'

via CNN.com on 8/2/08
An anthrax researcher who committed suicide earlier this week had threatened his therapist and recently outlined a plan to kill his co-workers, according to audiotape of court testimony.

