Definite's Extractor

My findings on Life, Linux, Open Source, and so on.

Monthly Archives: January 2010

Review on “Chinese Eye Tracking Study: Baidu Vs Google”

Today I saw an “interesting” blog post about an eye tracking study comparing Google and Baidu. That post does make some valid points, such as the reasoning in “Q: What’s the difference in user experience between Baidu and Google?” and the first two factors in “Q: Why choose Baidu?”.

However, that post has several major reasoning flaws:
1. The third factor in “Q: Why choose Baidu?” is misleading. Multi-tab browsing is wanted by users all over the world, not only by Chinese users.

2. The actual third factor is the G.F.W. (Great Firewall). Baidu follows Chinese policy closely, and usually does not show target pages that are blocked by the firewall; Google, on the other hand, does not comply as much as Baidu does, so its search results are more likely to lead to “dead” links, which upsets ordinary end users.

3. It claims that Chinese is hard to skim because it has too many characters and no spaces to split the text into comprehensible units. It even tries to emphasize this point with the following all-uppercase, no-space analogy:


(To try to put in a Western conceptual framework, imagine how difficult it would be to scan meaning from this paragraph if our alphabet was extended to 2000 characters, presented in block letters and all the spaces between words were removed.)

That reasoning is quite silly. If it were true, the Chinese would have abandoned that writing system eons ago, as few would be able to understand it or be willing to pass it down through the generations.

Actually, as comment 1 said, most concepts in Chinese can be represented in no more than two characters, and native names seldom exceed three characters; European languages, on the other hand, often require you to read through many more characters to get one meaningful word.

Comparing display length, Chinese text looks much shorter than English, yet carries the same amount of information. Using his example:

(To try to put in a Western conceptual framework, imagine how difficult it would be to scan meaning from this paragraph if our alphabet was extended to 2000 characters, presented in block letters and all the spaces between words were removed.)

(以西方的概念架構來說,很難想像如果我們的字母增加至2000個,全以大寫顯示,移除空白的話,要怎麼讀這段文章。 )

As you can see, it doesn’t even occupy half the visual area at the same font size. Sharp-eyed readers might also notice that punctuation marks provide the necessary breaks for scanning the meaning of the paragraph. 😛

The anime “Red Pig” (Porco Rosso) also provides a visual comparison among major languages at its beginning. 🙂
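
The display-length comparison above can be checked roughly in code. Here is a minimal Python sketch, assuming a terminal-style model in which East Asian wide and fullwidth characters take two columns and everything else takes one. The helper name `display_width` is made up for this illustration, and real proportional fonts will give somewhat different numbers.

```python
import unicodedata

# The two versions of the example sentence, copied from the comparison above.
ENGLISH = (
    "(To try to put in a Western conceptual framework, imagine how difficult "
    "it would be to scan meaning from this paragraph if our alphabet was "
    "extended to 2000 characters, presented in block letters and all the "
    "spaces between words were removed.)"
)
CHINESE = "(以西方的概念架構來說,很難想像如果我們的字母增加至2000個,全以大寫顯示,移除空白的話,要怎麼讀這段文章。 )"


def display_width(text: str) -> int:
    """Approximate column count: wide/fullwidth characters count 2, others 1.

    This is an assumed model of monospace rendering, not a font measurement.
    """
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


if __name__ == "__main__":
    en = display_width(ENGLISH)
    zh = display_width(CHINESE)
    print(f"English: {en} columns, Chinese: {zh} columns, ratio: {zh / en:.2f}")
```

Even with every Chinese character counted as two columns wide, the Chinese version comes out at less than half the width of the English one under this model.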

Google vs. Baidu

They say Google is leaving China. Consequently, more people have started to compare Google and Baidu, the current leader in the Chinese search market.

Two days ago, I visited this “interesting” blog. That post measures the effectiveness of search engines by their “recall”, i.e. the number of results they return, regardless of whether the results are truly relevant or not.

The author uses three phrases as test set:
1. 许霆 (Xu Ting: A Chinese citizen who was recently involved in a controversial criminal case)
2. 次级房贷 (Subprime mortgage)
3. 看羹吃饭 (Kan-Geng-Chi-fan: A phrase used and recognized by a relatively small number of Chinese, meaning that you have to think carefully before taking action)

He concludes that Baidu is superior to Google, as its results are more favorable for mainland Chinese users. Well, I don’t agree with him.

For 许霆 (Xu Ting), I think Baidu wins here, because it also returns a link to an article in Baidu’s encyclopedia that introduces another Xu Ting, a professor at a Chinese university, while Google returns only the suspect Xu Ting in its first 10 pages. Variety is good, because it does not assume what you are looking for. For example, if I search for “smart”, I would like to choose among an adjective for a cunning man, a car, a package management system, and a disk error reporting mechanism.

For 次级房贷 (Subprime mortgage): yes, he made a typo in the English, but let’s ignore that. He said that since Simplified Chinese is preferred in China, showing Traditional Chinese characters is bad. But AFAIK, the default search preference is “show both Simplified and Traditional”, which means showing only Simplified is the bug. 😛

For 看羹吃饭 (Kan-Geng-Chi-fan): well, if you don’t know how to use Google, then right, you get the result he got. But if you remember to wrap the phrase you care about in quotation marks (“”), you get a good result.

Oh, let’s forgive him, as he has not received proper training in telling which search is good. 🙂
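
For the curious, the quoting trick mentioned above is easy to script. This is only a small Python sketch: the function name `exact_phrase_url` is hypothetical, and the `https://www.google.com/search?q=` URL shape is an assumption used for illustration. All it does is wrap the phrase in double quotes before URL-encoding it.

```python
from urllib.parse import urlencode


def exact_phrase_url(phrase: str) -> str:
    """Build a search URL that asks for the phrase as an exact match.

    The base URL is an assumed, illustrative endpoint.
    """
    quoted = f'"{phrase}"'  # the surrounding double quotes request an exact phrase
    return "https://www.google.com/search?" + urlencode({"q": quoted})


if __name__ == "__main__":
    print(exact_phrase_url("看羹吃饭"))
```

The double quotes end up percent-encoded as `%22` on both sides of the phrase, which is what a browser sends when you type the quotes yourself.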

Who Benefits from Digital Rights Management (DRM)?

Supposedly, DRM protects the rights of the media’s copyright holders (e.g. movie makers) and their followers (e.g. retailers). But does it?

I am a fan of MacGyver, and I have already purchased four series on DVD. When I tried to find the fifth, I was told that it is not possible to buy it in retail shops, because only four series have been released in Zone 4.

Now, three ways lie before me:
1. Grab a pirated copy from the internet: I either get low-quality video, or spend ages downloading the images (blame the Australian network), plus add to my carbon footprint and electricity bill. Doesn’t sound good. It’s also obvious that local retailers cannot earn my money, and neither can the series maker.

2. Get the DVD from somewhere else: Good for me, but bad for local retailers.

3. Lose interest completely: Actually, that is my current mood. Not that bad for me, as there are many other things for me to do. But definitely bad for local retailers and, consequently, the video makers.

So come on, video makers, abolish DRM: it does evil to you (and invites evil, too).