Category Archives: Google

Google user-agent detect fail

So I was using Chrome to check my Gmail account and noticed a message at the bottom of the page, by the stats, saying that I could download the latest Google Toolbar. I was pretty excited to see this and give it a go, so I clicked the link, which took me to this page:

[Screenshot: gtoolbar-msg]

Ok, it only mentions IE and Firefox, but I kinda skimmed over that and just saw the first blue link to download the toolbar, so I clicked it:

[Screenshot: gtoolbar-download]

Ok, it mentions Firefox, but I thought, Chrome is still fairly new and there are still a lot of sites out there that are not recognizing it and are classifying it as something else. Anyway, I thought I’d continue just in case, so now, 3 clicks later, I get presented with this:

[Screenshot: gtoolbar-sorry]

Ok, now it detects my browser, but it took 3 clicks for me to discover this. If this was an ecommerce site trying to sell a product, it would have been a really disappointing experience for a user. And as we’re aware, once you give a user a disappointing experience, it’s really hard to win them back.

Now I’m assuming that the rotating message that appears at the bottom of the screen is served by some type of ad serving application, so why could it not have detected my Chrome user-agent and bypassed that ad?
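To make the idea concrete, here’s a minimal Python sketch of the kind of check I have in mind. Everything in it is an assumption for illustration: the function names, the supported-browser list (taken from the landing page, which only mentions IE and Firefox) and the sample user-agent strings. I have no idea how Google’s ad serving actually works.

```python
# Hypothetical list of browsers the promoted download supports
# (the landing page only mentions IE and Firefox).
SUPPORTED_BROWSERS = {"ie", "firefox"}

# Sample user-agent strings, used purely for illustration.
CHROME_UA = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) "
             "AppleWebKit/525.13 (KHTML, like Gecko) "
             "Chrome/0.2.149.27 Safari/525.13")
FIREFOX_UA = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) "
              "Gecko/2008070208 Firefox/3.0.1")
IE_UA = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"


def detect_browser(user_agent: str) -> str:
    """Return a coarse browser family for a user-agent string."""
    ua = user_agent.lower()
    # Chrome's user-agent also contains "safari/", so check Chrome first.
    if "chrome/" in ua:
        return "chrome"
    if "firefox/" in ua:
        return "firefox"
    if "msie" in ua:
        return "ie"
    if "safari/" in ua:
        return "safari"
    return "unknown"


def should_show_toolbar_promo(user_agent: str) -> bool:
    """Only show the promo to browsers that can actually install the toolbar."""
    return detect_browser(user_agent) in SUPPORTED_BROWSERS


print(should_show_toolbar_promo(CHROME_UA))   # False
print(should_show_toolbar_promo(FIREFOX_UA))  # True
```

Substring matching like this is crude, but it would have been enough to skip the promo before the first click rather than after the third.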

It makes you wonder how many companies are buying ad inventory which is being served to users who cannot convert.

If you’re buying ad inventory, make sure you have an exclusion list, so you can work with whoever serves your ads to maximize exposure to your specific audience, however you define it.

Google’s Speech Recognition Fails

I used the Google for Search app on my iPhone tonight for the first time. I wanted to look up the names of the three tenors, so I tapped the voice icon and said “the three tenors”. Unfortunately it did not fully understand my accent and instead thought I wanted to find “history tennis”. My family found it quite amusing when I had to repeat the search with a heavy American accent.

It would be really nice to have a settings option where I can read a passage of text so that it can better understand me. Then at least I wouldn’t have to break out my fake southern American twang in public again.

Anyone else have problems? If so, do you have an accent?

Edit 6/15/2009:
Well it looks like they implemented some updates to their speech recognition so that it also understands British and Australian accents. Well done fellas!

Google TrustRank Myth Busted!

[Update]
Well, since this post was written, it seems that Google has decided to release something else which it’s calling TrustRank. The original TrustRank confusion was related to detecting and filtering spam, while the latest iteration is to do with calculating the “trust” of users based on the quality of annotations, reviews and tags they provide. These signals may be used to reorder the ranks of pages in the results.

Bill Slawski, as usual, has a great rundown of what it is from the Google Trust Rank patent filings.

Original Post:
If you search Google for TrustRank you will find many blogs and forums talking about it and giving advice and theories about what you can do to alter it, but the fact of the matter is that it just simply does not exist.

At PubCon 2007 Suresh Babu interviewed Matt Cutts and asked him specifically to define TrustRank. Below is the video of that interview.

For those of you not able to watch the video, here’s a transcript where Matt Cutts talks about its origins and confusion between a Yahoo intern’s project and an antiphishing filter Google was developing.

What is trustrank? everybody’s curious about that. It’s kinda nice you asked because it’s good to have a chance to debunk this a little bit. So it turns out there was a summer intern who was at Yahoo and Jan Pedersen and some other people at Yahoo, and they wrote a paper about something called TrustRank; and what it does is it tries to treat reputation like it’s physical mass and see how it flows around on the web and what physical properties does trust have; and it’s really interesting stuff. But it’s completely separate from Google. So a couple of years ago at like the exact same time, Google was working on an antiphishing filter, and as part of that we needed to come up with a name for it and so they filed for a trademark, and I think they used the name TrustRank, so it was a really weird coincidence. Yahoo had a TrustRank project and we had this TrustRank trademark, and so everybody talks about TrustRank, TrustRank, TrustRank and yet if you go and ask five different SEOs you’ll have five different opinions and definitions about exactly what TrustRank is.

If you go to the US Patent and Trademark website and do a trademark search you’ll find this result:

Word Mark: TRUSTRANK
Goods and Services: (ABANDONED) IC 042. US 100 101. G & S: Computer services, namely organizing information, sites and other resources available on computer networks
Standard Characters Claimed
Mark Drawing Code: (4) STANDARD CHARACTER MARK
Serial Number: 78588592
Filing Date: March 16, 2005
Current Filing Basis: 1B
Original Filing Basis: 1B
Published for Opposition: December 6, 2005
Owner: (APPLICANT) Google Inc. CORPORATION DELAWARE 1600 Amphitheatre Parkway Mountain View CALIFORNIA 94043
Type of Mark: SERVICE MARK
Register: PRINCIPAL
Live/Dead Indicator: DEAD
Abandonment Date: February 29, 2008

If you go to the advanced published applications search page on the US Patent and Trademark website and search for TrustRank, you will find these results. Notice that none of them are filed by or assigned to Google, although there are references to Yahoo’s link-based spam detection patent application.

IDENTIFYING SOURCES OF MEDIA CONTENT HAVING A HIGH LIKELIHOOD OF PRODUCING ON-TOPIC CONTENT
Inventors: Wolters; Timothy J.; (Superior, CO) ; Setayesh; Mehrshad; (Lafayette, CO)
Assignee Name and Address: COLLECTIVE INTELLECT, INC. Boulder CO
Serial No.: 938691
Series Code: 11
Filed: November 12, 2007


SYSTEM AND METHOD FOR SECURE, ANONYMOUS, AND PERTINENT REPOSTING OF PRIVATE BLOG POSTING, ETC.

Inventors: Drayer; Jay A.; (Houston, TX) ; Howe; Grant M.; (Cypress, TX)
Serial No.: 923366
Series Code: 11
Filed: October 24, 2007

Enhanced Detection of Search Engine Spam
Inventors: Caldwell; Larry Thomas; (Annandale, VA)
Assignee Name and Address: Idalis Software, Inc. Annandale VA
Serial No.: 871539
Series Code: 11
Filed: October 12, 2007

System and method for characterizing a web page using multiple anchor sets of web pages
Inventors: Joshi; Amruta Sadanand; (Palo Alto, CA) ; Ravikumar; Shanmugasundaram; (Cupertino, CA) ; Reed; Benjamin Clay; (Morgan Hill, CA) ; Tomkins; Andrew; (San Jose, CA)
Assignee Name and Address: Yahoo! Inc. Sunnyvale CA
Serial No.: 542079
Series Code: 11
Filed: October 3, 2006

Dynamic updating of display and ranking for search results

Inventors: Ferrenq; Isabelle; (Saint Lattier, FR) ; Chevalier; Pierre-Yves; (Biviers, FR)
Assignee Name and Address: EMC Corporation
Serial No.: 522498
Series Code: 11
Filed: September 15, 2006

User-sensitive pagerank
Inventors: Berkhin; Pavel; (Sunnyvale, CA) ; Fayyad; Usama M.; (Sunnyvale, CA) ; Raghavan; Prabhakar; (Saratoga, CA) ; Tomkins; Andrew; (San Jose, CA)
Assignee Name and Address: YAHOO! INC.
Serial No.: 474195
Series Code: 11
Filed: June 22, 2006

Providing a rating for a web site based on weighted user feedback

Inventors: Repasi; Rolf; (Sunrise Beach, AU) ; Clausen; Simon; (New South Wales, AU)
Serial No.: 803922
Series Code: 11
Filed: May 16, 2007

Search engine with augmented relevance ranking by community participation
Inventors: Xu; Zhichen; (San Jose, CA) ; Berkhin; Pavel; (Sunnyvale, CA) ; Rose; Daniel E.; (Cupertino, CA) ; Mao; Jianchang; (San Jose, CA) ; Ku; David; (Palo Alto, CA) ; Lu; Qi; (Saratoga, CA) ; Walther; Eckart; (Palo Alto, CA) ; Tam; Chung-Man; (San Francisco, CA)
Serial No.: 478291
Series Code: 11
Filed: June 28, 2006

Trust propagation through both explicit and implicit social networks
Inventors: Berkhin; Pavel; (Sunnyvale, CA) ; Xu; Zhichen; (San Jose, CA) ; Mao; Jianchang; (San Jose, CA) ; Rose; Daniel E.; (Cupertino, CA) ; Taha; Abe; (Sunnyvale, CA) ; Maghoul; Farzin; (Hayward, CA)
Assignee Name and Address: Yahoo! Inc. Sunnyvale CA
Serial No.: 498637
Series Code: 11
Filed: August 2, 2006

Realtime indexing and search in large, rapidly changing document collections
Inventors: Rose; Daniel E.; (Cupertino, CA) ; Mao; Jianchang; (San Jose, CA) ; Walters; Chad; (San Francisco, CA)
Assignee Name and Address: Yahoo! Inc. Sunnyvale CA
Serial No.: 498706
Series Code: 11
Filed: August 2, 2006

Using community annotations as anchortext
Inventors: Rose; Daniel E.; (Cupertino, CA) ; Mao; Jianchang; (San Jose, CA) ; Xu; Zhichen; (San Jose, CA) ; Ku; David; (Palo Alto, CA) ; Lu; Qi; (Saratoga, CA) ; Walther; Eckart; (Palo Alto, CA) ; Tam; Chung-Man; (San Francisco, CA)
Serial No.: 498682
Series Code: 11
Filed: August 2, 2006

Link-based spam detection
Inventors: Berkhin; Pavel; (Sunnyvale, CA) ; Gyongyi; Zoltan Istvan; (Stanford, CA) ; Pedersen; Jan; (Los Altos Hills, CA)
Assignee Name and Address: Yahoo! Inc. Sunnyvale CA
Serial No.: 198471
Series Code: 11
Filed: August 4, 2005

So since Google has dropped the trademark, does not have any patent applications for it and Matt Cutts explained the confusion, I think I’d call this myth busted!

My Thoughts On Google Knol

Google could be heading down a slippery slope with YouTube and now Knol. Their organic search results are meant to be unbiased and provide the most relevant results for any given query. In 2000 Google launched Google AdWords and we started seeing sponsored search results. That was ok; the ads were clearly marked and listed alongside the search results.

Google then started releasing additional services to diversify their revenue streams, like the Google Search Appliance, site targeting, and radio and print ads.

At some point they realized that they had this great money-making scheme called Google AdSense, but that they only ever took a cut of each ad because, as a search engine, they’re designed to send traffic away. The next logical step, from a money-making point of view, is to take a larger percentage, or the entire ad revenue. They can only do that if they become a content publisher. At that point there’s a huge conflict of interest between serving relevant, unbiased results and serving up your own site, which makes money.

In Oct 2006 Google bought YouTube, then in May 2007 they launched Universal Search to provide YouTube with more exposure in the search results.

Now with Google Knol they want to extend publishing to textual content and run AdSense. To ensure quality content and to keep the spammers at bay they will probably not offer a rev share either.

Since Wikipedia has such a strong organic presence, that leaves 9 spots on page 1 for other reference-type sites. Enter Google Knol and that reduces it to 8. Throw in Google’s diversification of search results and now, for ecommerce queries, you may find that there’s only 1 spot left on page 1 for an informational article about a product.

So once Knol builds critical mass, what could happen? Well, obviously sites like about.com will lose rankings to Knol. answers.com, which already took a 28% traffic hit this year, could face a double whammy by losing rankings and having their keyword links in the results replaced with knols.

What do you think might happen? Would you want to contribute an authoritative article to Knol?

SEO Client Story

I used to work for an SEO agency in Pittsburgh and dealt with a number of interesting clients in a variety of industries, with large and small sites. There were a number of funny incidents that I encountered, which I’d like to recount here, although names will be withheld.

No Google Traffic
After taking on this client I gained access to their WebTrends reports, which showed an astounding lack of Google organic traffic. I looked over the meta tags and page content and all seemed to be targeting the right set of keywords to some degree, although the on-page optimization could still use some improvement.

I knew they weren’t doing anything advanced like IP delivery, so I used Firefox with the User Agent Switcher extension and confirmed that with my user-agent set to googlebot, slurp or msnbot I could browse the site without any problems. Then I checked the robots.txt and found that googlebot had been disallowed! When I asked the client’s developer why they had decided to ban googlebot, their response was: “It was crawling the site too often and there were errors on some of the pages that were leading to open database connections and locking up the server.”

Needless to say the developers got a quick lesson in why banning googlebot to mask their programming errors is not good business practice.
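If you want to run the same robots.txt check without eyeballing the file, Python’s standard `urllib.robotparser` module can do it. This is just a sketch: the robots.txt content below is my own reconstruction of what a googlebot ban typically looks like, not the client’s actual file, and `blocked_crawlers` is a helper name I made up.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that bans googlebot but allows everyone else.
ROBOTS_TXT = """\
User-agent: googlebot
Disallow: /

User-agent: *
Disallow:
"""


def blocked_crawlers(robots_txt, crawlers, path="/"):
    """Return the crawlers that robots.txt forbids from fetching `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in crawlers if not parser.can_fetch(bot, path)]


print(blocked_crawlers(ROBOTS_TXT, ["googlebot", "slurp", "msnbot"]))
# ['googlebot']
```

Note that this is a different test from switching your browser’s user-agent: the site happily served pages to a fake googlebot, because robots.txt is a request that well-behaved crawlers honour, not something the server enforces.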

Want to hear more stories? Do you have any of your own you’d like to share?

Thoughts On Google Sitelinks Update

Google Webmaster Tools was updated on Thursday with some new features. The one that caught my eye was Sitelinks, located under Dashboard > Links > Sitelinks. Here’s a screenshot showing the new feature; see if you can spot where I think another update is needed.

[Screenshot: Google Webmaster Tools Sitelinks]

Hint: Hmmm… looks like the developer didn’t have English as his first language.

Ok, I’ll stop poking fun, on to some proper stuff:

From the blog post:

Selecting pages to appear as sitelinks is a completely automated process. Our algorithms parse the structure and content of websites and identify pages that provide fast navigation and relevant information for the user’s query…… occasionally you might want to exclude a page from your sitelinks, for example: a page that has become outdated or unavailable, or a page that contains information you don’t want emphasized to users.

1) If Google has not yet crawled the entire site, it’s choosing the sitelinks based on partial data.

2) How does Google define “fast navigation”? Is it referencing the time spent downloading a page? The number of clicks to that page, which is deemed to be relevant to the users’ query? Or perhaps this was just worded in a strange way and what it really means is that by providing the sitelinks they’re providing a faster way for users to get to the relevant information on my site from the SERP.

3) I like that I can exclude a page, but why can’t I add pages as sitelinks? This would be much more useful for webmasters and users, since site owners will most likely know where their traffic goes and more to the point, where they really want to drive traffic.

4) If pages become outdated, it also stands to reason that there may be some new pages which are more valuable, but haven’t been automatically chosen to become sitelinks. (Yet another Google algo to look into…)

5) If pages that are listed as sitelinks become unavailable, I’d hope that Google will automatically remove the sitelinks.

6) Since Google only displays 4 sitelinks in the SERPs but offers a bigger list in GWT, it would be nice to see some examples of queries that bring up different combinations of sitelinks.

7) The anchor text of the sitelinks uses the first 25 characters of the title tag of the page it’s linking to. I’d like to be able to edit that link text to make it more useful for users and ensure some of the links aren’t truncated.

This brings up an interesting point: is it worth optimizing the title tags of pages that are listed in sitelinks? Would it make any difference? All of the ones I’ve seen are pretty accurate and don’t need adjusting.

Related: Results Changed For Me

I used to write about my Volkswagen Beetle on this site before I turned it into an SEM blog at the end of 2006. Interestingly enough, within the past few weeks Google decided to change the results for the search query [related:www.reubenyau.com].

Not long after I changed the content to a completely different topic, I had a couple of posts (1, 2) go popular on Digg, which generated many topical backlinks. Shortly after that the posts started to rank for much broader terms, especially the Analytics post, and today it’s still generating backlinks and good traffic. My WordPress post was also quoted by Matt Cutts in his PowerPoint presentation at WordCamp this year.

I’m shooting off on a tangent here, so bear with me. Google indexes .ppt files and can display them in HTML for convenience (I love that feature), although the written-out URL within the presentation doesn’t count as a backlink, according to Google Webmaster Tools. I often wonder if one day search engines will also count written-out URLs as backlinks, even if they aren’t coded as a hyperlink.

I’m pretty sure that the related: query results returned are just a snapshot in time, similar to the PageRank value given in the Toolbar. I’m also wondering if, just like the link: command, the results are somewhat cropped.

Well… at least my TouchGraph results are a little more contextually relevant now.

Website Optimizer Is Search Engine Friendly

Website Optimizer by Google AdWords is a multivariate testing tool, that is to say, it takes A/B testing one step further. Instead of testing 2 versions of a page, you can test multiple page elements and the various combinations.

When you create an experiment you can specify page elements that you want to test, for example, a page heading, intro copy or a lead image. It uses JavaScript on the landing page to swap out the test element with the other variations that you specify within Website Optimizer.
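To get a feel for how quickly the combinations add up, here’s a small Python sketch. The element names and variations are made up for illustration, and this has nothing to do with how Website Optimizer works internally; it’s just the arithmetic.

```python
from itertools import product

# Hypothetical page elements and the variations to test for each.
elements = {
    "heading": ["Original heading", "Benefit-led heading"],
    "intro_copy": ["Short intro", "Long intro"],
    "lead_image": ["Product photo", "Lifestyle photo", "No image"],
}


def all_combinations(elements):
    """Return every combination of variations as a list of dicts."""
    names = list(elements)
    return [dict(zip(names, combo)) for combo in product(*elements.values())]


combos = all_combinations(elements)
print(len(combos))  # 2 * 2 * 3 = 12
```

Three elements with a handful of variations already means 12 versions of the page, and your conversions get split across all of them.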

I participated in the beta test of this and my initial concern was that it may not be search engine friendly due to the changing page elements. However, in a short call, the Website Optimizer Product Manager confirmed that it would not have any impact on organic rankings.

If you are still concerned about it, you can always set up a specific landing page that is not linked to from your main navigation and either disallow it in robots.txt to stop search engines crawling it, or add a meta noindex tag to keep it out of the index.

Once your AdWords account is fairly well optimized, I would highly recommend trying out this tool. You will learn new things about your website, its traffic and motivators. Just make sure you carefully plan the test elements and don’t test too many elements at once to ensure that you can run through enough iterations with conversions to gain meaningful data.

Smashing Magazine Google PageRank Article Misleading

Through MyBlogLog, I came across the Smashing Magazine community and an article entitled Google PageRank: What Do We Know About It? The article does a very good job of explaining Google PageRank and has 15 points listed under how PageRank works and 13 factors impacting PageRank, but it also does a good job of confusing content, quality and ranking factors with items that affect PageRank. I started writing a comment on the post, but quickly realized that it was going to be too long, so I decided to post my comments here:

Summary: How Does PageRank Work?
1. PageRank is only one of numerous methods Google uses to determine a page’s relevance or importance.
Google uses over 200 signals to determine a page’s rank within the index. While PageRank is one of the signals, the PageRank calculation itself is not actually used to determine relevance or importance; that’s the job of other parts of the algorithm.

2. Google interprets a link from page A to page B as a vote, by page A, for page B. Google looks not only at the sheer volume of votes; among 100 other aspects it also analyzes the page that casts the vote.
In the PageRank calculation there aren’t 100 other aspects. The formula calculates links, that’s it.

3. PageRank is based on incoming links, but not just on the number of them – relevance and quality are important.
Relevance and quality are not part of the PageRank calculation. You should try to obtain relevant, quality inbound links to your website, but judging relevance and quality is a content issue.

7. Bad incoming links don’t have impact on Page Rank.
The PageRank calculation does not understand what is good and bad, it just calculates a value.

8. Page Rank considers site age, backlink relevancy and backlink duration.
Site age and relevancy are ranking factors rather than PageRank factors, although backlink duration does play a part: just as you can accrue PR through links, it can also be diminished when links are removed.

14. Google calculates pages PRs once every few months.
Actually it’s calculated all the time, but what we see in the Google Toolbar (or other online PR tools) is a snapshot in time which is updated every 3 months or so.
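To show what “the formula calculates links, that’s it” means, here’s a minimal Python sketch of the PageRank formula as published in the original Brin and Page paper, PR(A) = (1 - d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)). It is not Google’s production system. The three-page graph, the damping factor of 0.85 and the iteration count are my assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iterate the published PageRank formula over a link graph.

    `links` maps each page to the list of pages it links to.
    Pages with no outbound links simply pass nothing on in this sketch.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    ranks = {page: 1.0 for page in pages}
    for _ in range(iterations):
        new_ranks = {}
        for page in pages:
            # Each linking page splits its rank evenly across its links.
            incoming = sum(
                ranks[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_ranks[page] = (1 - damping) + damping * incoming
        ranks = new_ranks
    return ranks


# Hypothetical three-page site.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

The only input is who links to whom. There is nowhere in the calculation to plug in content, relevance, quality or site age.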

Summary: Impact on Google PageRank
1. Frequent content updates don’t improve Page Rank automatically.
They don’t improve it manually either. Content is not part of the PR calculation.

6. Wikipedia links don’t improve PageRank automatically (update: but pages which extract information from Wikipedia might improve PageRank).
Content has nothing to do with the PageRank calculation. Creating great content will earn you links, but referencing Wikipedia here is misleading.

8. Efficient internal onsite linking has an impact on PageRank.
I think this needs to be explained further. When developing a website, you should strive to make it search engine friendly and especially Google-friendly by ensuring that your site does a good job of linking to its internal pages. This passes on PageRank from page to page, which can help keep internal pages out of the supplemental index. This is important because those pages aren’t spidered anywhere near as frequently and are indexed slightly differently than pages in the main index. So good internal linking has an effect on the PageRank of pages deeper in your site.

9. Related high ranked web-sites count stronger.
Ideally you want to acquire links from pages with a high PageRank, but again, the reference to related sites (in terms of topicality) does not have an impact on PageRank. A page with high PageRank may actually pass you less if it has more links, because it’s spread too thin. A page with a PR of 4 could feasibly pass on more PR than a PR8 page depending on the number of links that are on the page.

10. The anchor text of a link is often far more important than whether it’s on a high PageRank page.
This is a ranking factor, not something specific to calculating PageRank. Again, the content or topicality has nothing to do with the PR calculation.

11. Links from and to high quality related sites have an impact on Page Rank.
Topicality again, see previous comments.
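The “spread too thin” point in my comment on item 9 is easy to see with a little arithmetic. In the published formula, a page passes on its score multiplied by the damping factor and divided by its number of outbound links. The raw scores and link counts below are invented for illustration; they are not toolbar PR values, which are only a coarse display of whatever Google computes internally.

```python
def pagerank_passed_per_link(page_score, outbound_links, damping=0.85):
    """Share of a page's score passed through each of its outbound links."""
    if outbound_links <= 0:
        return 0.0
    return damping * page_score / outbound_links


# Hypothetical raw scores: a strong page with 200 links vs. a weaker page with 5.
strong_but_crowded = pagerank_passed_per_link(10.0, 200)
weak_but_focused = pagerank_passed_per_link(2.0, 5)

print(strong_but_crowded)  # roughly 0.0425
print(weak_but_focused)    # roughly 0.34
```

In this made-up case the weaker page passes about eight times as much per link, simply because it links out to fewer pages.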

Google Analytics Authorized Consultant (GAAC) Requirements

I looked into this a while back and wrote to Google Analytics support. They responded with this set of requirements for becoming a Google Analytics Authorized Consultant:

  • In business for 1 year
  • At least one dedicated person for Google Analytics support
  • Must provide full service i.e. setup, support, training, and consultation for Google Analytics
  • Must have an online ticketing system that Google can access
  • Must provide support for both Google Analytics and Urchin software
  • Must have a web site of sufficient Google Analytics/Urchin content and quality
  • Proven background in Analytics and Search Engine Marketing (SEM)
  • Must have at least one Google AdWords Certified employee
  • Must attend training sessions at a Google office – usually once per year

While I have most of these items covered, I just need to find another 6 hours in a day and I may be able to get the rest done.