OK, this spam slightly scares me ...

This message slightly frightens me: Spam Message: Chen Shui-bian. Why does it worry me?  It's not like I'm the only one who's gotten this message.  Far from it. What interests me, and forgive me for not doing the research, is that the names dropped suggest that the spammer has some idea who I am and what I'm interested in. Chen Shui-bian is the former president of Taiwan who was imprisoned last year for corruption.  Since I read Chinese and am interested in Taiwan, I knew this, and it made sense to me that Chen might have some money he might want to get rid of.  I wonder if this spammer somehow has access to some information about my browsing habits.

Eh, probably just one of those rare clever spam attacks that actually has some intelligence behind it.  I doubt they're targeting anyone in particular.

Google has left the Mainland ...

... and moved into Hong Kong.

I haven't posted anything for a while, but since I still get a little trickle of views I thought I'd finally follow up on something I covered a while back.  I've been following the Google China news, and as of yesterday, Google officially left its China offices and redirected google.cn to a simplified-character version of google.com.hk.  The theory was that since Hong Kong has uncensored Internet, they would be able to provide uncensored search from their Hong Kong servers.  That is, until they get blocked, and they apparently already have been -- at least selectively.

News hype has been pretty big up to this point, and there was a popular response: a group of Chinese netizens put up an open letter (Chinese) to the government asking to have a say in the case (English translation here).  Anyway, this topic is already being discussed everywhere, so here are a few links I've rounded up on the matter:

Anyone in mainland China reading this, have you had any issues with Google in the mainland so far?  If so, I'd love to hear about it.

UPDATE: ChinaSMACK has some translated netizen reactions from various Chinese forums.

UPDATE 2: The BBC has a short write-up on some anti-Google Chinese reactions, though it's slightly unclear.  Also, Han Han (韩寒, a famous Chinese writer and blogger) has something about it in his latest post:
事实上,无论谷歌是做这个决定的真正原因是什么,在展现给公众的说法上,谷歌有一个失策,谷歌说,他不想再接受敏感内容的审核了。注意,这里说的敏感内容 其实不是指情色内容, , , 这里所谓的敏感内容只是指不利于政府利益的内容。但是所谓的开放所有审查结果,现实的中国人有多少人在乎呢?这在正常的国家可以感动国人的理由,在中国看似不太管用。

Actually, it doesn't matter what the real reason for Google's decision is.  In the explanation it presented to the public, Google made a misstep: it said it no longer wants to submit to the censoring of sensitive content.  Note, "sensitive content" is not a reference to pornographic content...  This so-called "sensitive content" is just content that does not benefit the government's interests. But as to this so-called opening-up of all these censored results, how many Chinese people in the real world care?  A reason that could move the people of a normal country doesn't seem to be very effective in China.

Google's China problems: worth leaving over?

Yesterday Google put up a blog post (via Jason Morrison) mentioning cyber attacks originating in China that appeared to be targeting human rights activists.  Since entering China, Google has been in the precarious position of balancing the Chinese government's insistence on censored results with their own mission to make information free and available to everyone.  It is a little surprising, though, to see their conclusion here:
These attacks and the surveillance they have uncovered--combined with the attempts over the past year to further limit free speech on the web--have led us to conclude that we should review the feasibility of our business operations in China. We have decided we are no longer willing to continue censoring our results on Google.cn, and so over the next few weeks we will be discussing with the Chinese government the basis on which we could operate an unfiltered search engine within the law, if at all. We recognize that this may well mean having to shut down Google.cn, and potentially our offices in China.

I don't pretend to know what Google means by this, whether they are seriously considering shutting down their China offices or just trying to draw attention to China's censorship policies.  Cyber attacks can come from anywhere, and pulling out of China will not make Chinese cyber attacks go away, whether they really are government attacks or just nationalist Chinese vigilantes.  In any case, I'm sure it will get some of the authorities going.  ChinaSMACK poster Python seemed just as confused and skeptical about the issue (note: go to that post to get some translated Chinese reactions to the news):
The reasons provided by Google for the closing of their Chinese offices are rather vague if not unpersuasive.

  • Yes, cyber attacks exist in China and some originated from this country, but Google is not the only victim and even its major opponent Baidu recently got DNS hijacked by the so-called “Iranian Cyber Army”.

  • Second, isn’t it Google’s responsibility to utilize all its technical might to protect users’, including human rights activists’, privacy? Saying “we will retreat because some of our users’ email account were monitored” is like admitting their own disadvantage in technical strength and persuading users to switch to other companies.

  • Third, I fail to see why compromise of some users’ computers due to their own lack of sense in internet security is a fault of Google itself: anyone using ANY email system could be hacked if the user acts like a security newbie, and it doesn’t matter where the login portal pages are hosted (I remember Google doesn’t have a data center in China).

Anyway, we'll see whether this leads to any real policy changes on Google's part.  The ChinaSMACK article linked above was recently updated with a translation of a Sina blog post (original Chinese here) calling Google's announcement "psychological warfare", and I'm inclined to agree, considering that the announcement itself said that this information was shared partly to contribute to a "much bigger global debate about freedom of speech."  If that's the case, let's hope someone gets the message.

MIT makes your bike a hybrid

So apparently MIT scientists have made a special bike wheel that stores braking energy in a battery, then uses that for an assist motor.  Interesting, but the fact is there are already electric bikes, and they never caught on in the US.  They're all over China, though -- you can't cross the street in Hangzhou without almost getting hit by one.

China bans individuals from registering .cn domains

So, just as the Internet has had a major democratization globally, China steps in with another ham-handed attempt at restricting it.  Authorities in China have banned individuals from registering .cn domains. (via Shanghaiist):
According to the latest report published by CNNIC, users who want to apply for domain names should provide written application materials to domain name registration service providers while submitting applications online. The application materials include a domain name registration application form with official seal (original); an enterprise business license or organization code certificate (copy); and the identification of registration contact (copy).

I'm not sure how much it matters, since there's nothing that prevents you from going through a commercial registrar to get an international .com, .org, or .net domain.  I don't know exactly what the authorities are attempting to accomplish with this ban; maybe someone who knows more about domain names can help me out.  I suppose individual countries have the right to put whatever restrictions they want on their country code domains, but letting businesses and organizations register but not individuals seems odd to me.  If I were to restrict a country code, I would restrict it to government sites.

Join the Wave!

So, I've been invited to Google Wave, Google's not-so-super-secret collaborative editor and communication tool that they hope will replace email.  The system gives me eight email invites, and since I don't want to waste them on random people I've decided I'll just post about it.  Anyone who wants an invite can post a comment (Facebookers, click through to the blog so all comments end up in one place), and the first eight can get the invites, if they want them.

Make sure to look at the info on it.  Not too many people are on Wave yet, so it's not practical for most things, just a toy to play with for now, but I got into a Wave conversation and it's pretty cool.

Chinese government departments fighting over WoW

I'm a little late in talking about this, but word is that the Ministry of Culture has recently admonished the General Administration of Press and Publications for blocking the latest expansion of World of Warcraft in China.

A little background: Fully half of WoW's userbase is in China, about 5 million players, and the game has penetrated the culture enough that there is even a WoW-themed restaurant in Beijing. Earlier this year Blizzard switched its local operations from The9 to NetEase, causing a long server outage. Later GAPP suspended operations in China, citing "gross violations". WoW has had trouble in China before, having been required to flesh out skeletons*, change the color of blood, and even hide skulls in icons and models behind bags or boxes, and the second expansion, "Wrath of the Lich King", was initially rejected because of "a city raid** and skeleton characters". You can argue about how damaging a cartoon skeleton may or may not be to young gamers' minds (and most of WoW really is a cartoon, nothing in the game is terribly realistic in its art style), but there are always suspicions about ulterior motives, especially considering that limits have been put on overall foreign investment in gaming.

Now the Ministry of Culture has publicly accused GAPP of "overstepping its authority". Public feuds between government departments in China aren't common, but it's not the first time the Internet has inspired this sort of infighting: The recent Green Dam project, which would have required all computers sold in China to come with some particularly awful and virus-prone filtering software, was openly criticised in Communist Party news outlets and was eventually abandoned, mostly due to popular pressure and revelations about the actual flaws in the software.

In any case, the ban on WoW, like many internet regulations in China, isn't too hard to skirt. It's always been possible to connect to foreign game servers in China (I've done so to check mail on characters, etc. on US servers), so at times when WoW has been unavailable, many Chinese gamers moved to Taiwanese servers to play.

*A small note: I have asked several Chinese friends why skeletons are particularly targeted for censorship in China. So far I haven't gotten any real answers. I had been wondering if there was some specific cultural reason for this, or whether it was one of those cases where the moral authorities have an odd focus on one particular thing (such as when the FCC back home allows large amounts of blood and gore to appear on television, both real [in news reports] and fake [watch some of the earlier Heroes episodes], but under no circumstances may you show female breasts or utter taboo words without editing them out.) Also, one friend noted that skeletons do appear in Chinese media -- not sure where that is.

**Avid WoW players know exactly which city raid they are talking about, of course. I presume anyone else reading this doesn't particularly care :P

Chinese domain names on the way

ICANN, the international organization that maintains Internet domain names, has announced that they are going to begin allowing scripts other than the Latin script to be used for top-level domains (that is, the extensions such as .com, .org, etc). Up until now, domains have been restricted mostly to the 26 letters of the English alphabet plus the 10 numeral glyphs of the Hindu-Arabic number system. CNET has a good write-up on all this:

IDNs will allow domain names to be to be written in native character sets, such as Chinese, Arabic, and Greek. In charge of managing domain names, ICANN has argued that IDNs are necessary to expand use of the Web in regions where people don't understand English. Since its inception, the Internet has been limited to the Latin character set used by the U.S. and many other nations.


To expedite the new plan, ICANN will launch a Fast Track process on November 16. At that time, the organization will begin accepting applications from countries for new top level domains, or Internet extensions, based on each nation's character set.

Initially, the change will apply only to local country codes, such as .kr for Korea and .ru for Russia. Major top level domains (TLDs) such as .com, .net., and .org won't see non-Latin editions just yet. But ICANN is pushing to make progress on these major TLDs and hopes to include them in the IDN system before long.

This is definitely an important event in the history of the Internet. Evan Osnos of the New Yorker predicts a new .中国 domain (zhong1 guo2 = China), though I hope for simplicity's sake they keep it .中. According to Wikipedia the effort to allow character sets other than the basic ASCII set began with a proposal in 1996, and started bearing fruit in 1998. However, though they list several domains as accepting Chinese characters, I have yet to see a second-level domain (the main part of the URL) using them; usually I see domain names written in Latin letters or pinyin. If Chinese-character top-level domains become available, that may cause Chinese-character names to be used more, as Chinese users won't need to switch out of their IMEs to finish the address.
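For the curious, here is a rough sketch of how a Chinese-character label can travel over a DNS that only understands ASCII: the label is converted to an ASCII-compatible form that starts with "xn--". This uses Python's built-in "idna" codec, which follows the older IDNA 2003 rules, so treat it as an illustration rather than a description of what ICANN or any registry actually runs. The helper names to_ascii and to_unicode are made up for this example, and 中国 is simply the label discussed above.

```python
# Sketch: convert a Chinese-character domain label to its ASCII form and back.
# Relies on Python's built-in "idna" codec (IDNA 2003 rules).
# to_ascii / to_unicode are hypothetical helper names for this example.

def to_ascii(name: str) -> str:
    """Return the ASCII-compatible ("xn--...") form of a domain name."""
    return name.encode("idna").decode("ascii")


def to_unicode(name: str) -> str:
    """Return the Unicode form of an ASCII-compatible domain name."""
    return name.encode("ascii").decode("idna")


label = "中国"                      # the label from the post, used as an example
ascii_label = to_ascii(label)
print(ascii_label)                  # an ASCII string beginning with "xn--"
print(to_unicode(ascii_label))      # converts back to the original characters
```

The point is that the Chinese characters only need to exist in the address bar; what gets looked up is still plain letters, digits, and hyphens.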

Still, I wonder how many of these new domains we'll see used, other than companies grabbing their own brand names to make sure they have them. Many Chinese speakers do not use a "Chinese keyboard", to use a phrase from Tom Merritt's commentary on Buzz Out Loud, but instead a pinyin-based IME, most of which have an "English" setting for typing Latin characters (the default Windows IME works this way). You still have to switch the punctuation type, unless ICANN finds a way to map 。 to the Western-style period (.) they use as the "dot" in "dot-com". Still, convenient or not, I think Osnos has a point that nationalism and cultural significance will drive Chinese sites to use and advertise their Chinese-character domain names.
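On the 。 question: as far as I can tell, the IDNA standard (RFC 3490) already lists the ideographic full stop (U+3002) as something software should treat as a dot between labels, so the mapping may happen in the browser rather than at ICANN. A small sketch with Python's built-in "idna" codec, which follows that spec; normalize_dots is a made-up helper name and example。com is just a placeholder address:

```python
# Sketch: the ideographic full stop typed from a Chinese IME is treated as a
# label separator by Python's built-in "idna" codec (IDNA 2003 / RFC 3490).
# normalize_dots is a hypothetical helper name for this example.

def normalize_dots(name: str) -> str:
    """Encode a domain name to ASCII, turning IDNA dot variants into '.'."""
    return name.encode("idna").decode("ascii")


typed = "example。com"            # "。" entered without switching punctuation
print(normalize_dots(typed))      # the separator comes out as a plain "."
```

Whether a given browser or IME does this for you is a separate matter; the sketch only shows that the standard anticipates the problem.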

Final note: I am by no means an expert on any of this. If I'm off base in saying there aren't so many character-domain names, or if I have misunderstood something about the availability of Chinese character domains, please call me out. I'm still a little confused about the history here, so I might be off on some things.

Wolfram|Alpha and language info

Wolfram Alpha came out today.  For anyone who hasn't heard of it, Wolfram Alpha is a "computational knowledge engine" created by Stephen Wolfram which draws information from various Internet sources.  Anyway, I thought I'd play around with what it does with language data.

Searching languages is fairly straightforward.  Search English or Spanish, for example, and you get a form giving basic info -- number of speakers, character frequency, lexical similarity (what languages share the most cognates with it), genetic classification, and major regions where it's spoken.  If I search "Chinese", it defaults to Mandarin Chinese but also gives me an option to go to Chinese languages, which gives me links to several Chinese/Sinitic languages as well as a link to Pidgin English Chinese.  That link leads to a comparison of English, Chinese, and Hawaii Creole English, which I had thought referred to a creole of English and Hawai'ian.  The Wikipedia page doesn't mention Mandarin, though it says Cantonese was a major source language, along with Portuguese, Japanese, and several others.  That page also gives you the option to go to Tok Pisin, which also doesn't seem to have such a strong Mandarin influence according to Wikipedia.

A couple of other things I found: Each language entry lists the numbers 1-10 (taken from Zompist's list) but no mention is made of what numerical base the language uses.  While pretty much all major languages use a decimal base, there are a number of languages, particularly in Mesoamerica, that use a vigesimal base (base-20), and there are other bases in use.  (I should note that "vigesimal" leads to a dictionary definition, and a search for "base 20" seems to take you to something about DNA base-pairs.)
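To make the base-20 idea concrete, here is a minimal sketch that breaks a number into vigesimal digits by repeated division. The function name to_base is made up for this example, and the output is a list of digit values (0-19) rather than any particular language's number words.

```python
# Sketch: split a non-negative integer into digits of a given base
# (base 20 by default). to_base is a hypothetical helper name.
from typing import List


def to_base(n: int, base: int = 20) -> List[int]:
    """Return the digits of n in the given base, most significant first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if base < 2:
        raise ValueError("base must be at least 2")
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)   # peel off the lowest digit
        digits.append(remainder)
    return digits[::-1]                  # reverse to most-significant-first


print(to_base(1999))   # [4, 19, 19], i.e. 4*400 + 19*20 + 19
```

So where a decimal system groups by tens, hundreds, and thousands, a vigesimal one groups by twenties and four-hundreds.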

Another issue is with writing systems.  Several language entries correctly identify the writing system (Latin alphabet for English and Spanish), but others are not so helpful.  The entry for Japanese lists the "Chinese script", apparently ignoring the kana.  Searching for Latin alphabet and various other scripts turns up nothing useful, and what's more, the system seems to associate a script only with a particular language.  I found that Devanagari has some useful information listed, but Cyrillic only has a dictionary definition.  "Chinese characters", "hanzi", "kanji", and "hanja" all turned up nothing.  You can find something with "Chinese script", but it's really only a Mandarin-specific block of Unicode code points.

Basically Wolfram Alpha looks like a very nice little tool for research, but it has some kinks.  This is really just a small segment of what the system can do, and given that it is built on a mathematics engine I'm sure that it does a lot better with things like equations and statistics.  For example, each of my language searches turned up good info on number of speakers, character frequency, and even an estimate of translation length (based on character count).  The semantic search could definitely use some improvement, and I'm sure it will get it in the future.  Ultimately I think this system will find a niche or maybe several.  But it's definitely not going to kill Google or Wikipedia.  In fact, I can really only see it being a complement to all the info we already have on the web.