Definite's Extractor

My findings on Life, Linux, Open Source, and so on.

Tag Archives: comparison

Review on “Chinese Eye Tracking Study: Baidu Vs Google”

Today I saw an “interesting” blog post about an eye tracking study comparing Google and Baidu. That post does make some valid points, such as the reasoning in Q: What’s the difference in user experience between Baidu and Google? and the first two factors in Q: Why choose Baidu?.

However, that post has several major reasoning flaws:
1. The third factor in Q: Why choose Baidu? is misleading. Multi-tab browsing is wanted by users all over the world, not only by Chinese users.

2. The actual third factor is the G.F.W. (Great Firewall). Baidu follows Chinese policy closely and usually does not show target pages that are blocked by the firewall; Google, on the other hand, does not comply as much as Baidu does, so its search results are more likely to lead to “dead” links, which upsets ordinary end users.

3. It claims that Chinese is hard to skim because it has too many characters and no spaces to split the text into comprehensible units. It even tries to emphasize this point with the following all-uppercase, no-space paragraph:

TOTRYTOPUTINAWESTERNCONCEPTUALFRAMEWORK,IMAGINEHOW DIFFICULTITWOULDBETOSCANMEANINGFROMTHISPARAGRAPHIF OURALPHABETWASEXTENDEDTO2000CHARACTERS,PRESENTEDIN BLOCKLETTERSANDALLTHESPACESBETWEENWORDSWEREREMOVED

(To try to put in a Western conceptual framework, imagine how difficult it would be to scan meaning from this paragraph if our alphabet was extended to 2000 characters, presented in block letters and all the spaces between words were removed.)

That reasoning is quite silly. If it were true, the Chinese would have abandoned that writing system eons ago, as few would be able to understand it or willing to pass it on through the generations.

Actually, as comment 1 said, most concepts in Chinese can be represented in no more than two characters, and native names seldom exceed three characters; European languages, on the other hand, often require you to read through many more characters for one meaningful word.

Comparing display length, Chinese text looks much shorter than English, yet carries the same amount of information. Using his example:

(To try to put in a Western conceptual framework, imagine how difficult it would be to scan meaning from this paragraph if our alphabet was extended to 2000 characters, presented in block letters and all the spaces between words were removed.)

(以西方的概念架構來說,很難想像如果我們的字母增加至2000個,全以大寫顯示,移除空白的話,要怎麼讀這段文章。 )

As you can see, it doesn’t even occupy half the visual area at the same font size. Sharp-eyed readers might also notice that punctuation marks provide the necessary breaks for scanning the meaning of the paragraph. 😛
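For the curious, here is a rough way to check the “half the visual area” claim. It is only a sketch: it assumes a monospace-style layout in which every wide East Asian character takes two columns and everything else takes one, which is an approximation of real font rendering. The helper name `display_width` is my own.

```python
import unicodedata

english = (
    "To try to put in a Western conceptual framework, imagine how difficult "
    "it would be to scan meaning from this paragraph if our alphabet was "
    "extended to 2000 characters, presented in block letters and all the "
    "spaces between words were removed."
)
chinese = "以西方的概念架構來說,很難想像如果我們的字母增加至2000個,全以大寫顯示,移除空白的話,要怎麼讀這段文章。"


def display_width(text):
    """Approximate width in columns: wide/fullwidth characters count as 2."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


if __name__ == "__main__":
    en, zh = display_width(english), display_width(chinese)
    print("English columns:", en)
    print("Chinese columns:", zh)
    print("Chinese / English ratio: %.2f" % (zh / en))
```

Under that assumption the Chinese sentence comes out at well under half the width of the English one, even though each Chinese character is counted as double width.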

The anime “Red Pig” (Porco Rosso) also provides another visual comparison among major languages in its opening. 🙂

Google vs. Baidu

They say Google is leaving China. Consequently, more people have started to compare Google and Baidu, the current leader in the Chinese search market.

Two days ago, I visited this “interesting” blog. The post measures the effectiveness of search engines by their “recall”, i.e. the number of results they return, regardless of whether the results are truly relevant.

The author uses three phrases as test set:
1. 许霆 (Xu Ting: A Chinese citizen who was recently involved in a controversial criminal case)
2. 次级房贷 (Subprime mortgage)
3. 看羹吃饭 (Kan-Geng-Chi-fan: A phrase used and recognized by a relatively small number of Chinese, meaning that you have to think carefully before taking action)

He concludes that Baidu is superior to Google, as its results are more favorable for mainland Chinese users. Well, I don’t agree with him.

For 许霆 (Xu Ting), I think Baidu wins here, because it also returns a link to an article in Baidu’s encyclopedia that introduces Xu Ting, a professor at a Chinese university, while Google returns only the suspect Xu Ting in its first 10 pages. Variety is good, because it does not assume what you are looking for. For example, if I were searching for “smart”, I would prefer to choose among an adjective for a cunning man, a car, a package management system, or a disk error reporting mechanism.

For 次级房贷 (Subprime mortgage), yes, he made a typo in English, but let’s ignore that. He said that since Simplified Chinese is preferred in China, showing Traditional Chinese characters is bad. But AFAIK, the default search preference is “showing both Simplified and Traditional”, which means it is a bug if only Simplified results are shown. 😛

For 看羹吃饭 (Kan-Geng-Chi-fan): well, if you don’t know how to use Google, right, you get the result he got. But if you remember to put the phrase you care about in quotes (“”), you get a good result.
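To make the quoting trick concrete, here is a small sketch that builds an exact-phrase search URL. It assumes the usual `q` query parameter on `https://www.google.com/search`; the function name `exact_phrase_url` is just something I made up for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def exact_phrase_url(phrase, base="https://www.google.com/search"):
    """Wrap the phrase in double quotes and URL-encode it as the q parameter."""
    quoted = '"%s"' % phrase
    return "%s?%s" % (base, urlencode({"q": quoted}))


if __name__ == "__main__":
    url = exact_phrase_url("看羹吃饭")
    print(url)
    # Decode the URL again to confirm the quotes survived the encoding.
    print(parse_qs(urlsplit(url).query)["q"][0])
```

The quotes ask the engine to match the four characters as one phrase instead of as loosely related terms, which is the difference between his result and mine.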

Oh, let’s forgive him, as he has not received proper training in telling which search is good. 🙂

Browser comparison.

Today I installed Google Chrome on my Windows XP box. It’s an HP Compaq nx6120 with no modifications except that I changed to a 100 GB hard disk.
I opened a few web pages that I often visit and compared the memory consumption of Firefox 3.5.3, Opera 10.0.0, and Chrome using about:memory. I was quite surprised by what I saw:

But a few minutes later, it became:

It appears that Chrome does some optimizing behind the scenes.