The Art of Language Invention: a review

                I received an advance copy of David J Peterson’s The Art of Language Invention, which after some procrastination I have recently finished. It’s a book I highly recommend to both newbie and advanced conlangers, as well as to anyone who might be interested in conlanging. Before I get started with my review, I will point out that William Annis has a great overview of the book on his Tumblr, and Gretchen McCulloch of All Things Linguistic livetweeted her reading, and her take, as a non-conlanger, is really fun.

                David is very good at introducing basic concepts of linguistics in an entertaining and understandable way. There are, of course, those points that make me go, “Ah, yes, that’s the simplified lie we tell the undergrads,” but that part of the book isn’t for me, and I highly recommend that anyone interested in conlanging read the book, especially if you don’t know anything about linguistics. I grew up in conlanging with the web version of The Language Construction Kit (years before it became a book), and I wish I’d had this book as well. It really is beautifully laid out and easy to digest.

                All that said, as a linguist and a moderately skilled conlanger, I found the most valuable part of the book to be the copious examples and especially the case studies. I was interested to see how David’s approach differs from mine, and what I could learn from him. Some of these things are just a function of different experiences with language. For instance, in his case study on Irathient, he discusses how he wanted to make the language “slow”, and that his prototype for a language with a slow speaking rate was Inuktitut, a language with a massive derivational system that packs a large amount of meaning into a word via derivation. Irathient is by no means like Inuktitut – morphologically it has more in common with Bantu languages – but I wonder if I might have approached that problem differently. The reason is that my idea of a prototypical “slow” language is Mandarin Chinese – almost the opposite of Inuktitut in that it is an analytic language where almost every morpheme is a single, complete syllable, and the majority of derivation is in the form of two-syllable compounds. In addition, whereas David feels the need to build in agreement in case lines get cut down in editing, my experience with Mandarin would make me comfortable with a language where you have no agreement but can find missing elements by context. Neither of these approaches is right or wrong, and I think increasing one’s repertoire of tricks and language structures can only be good for a conlanger.

                Another side of it is David’s focus on lexicon building and historical derivation. This is a place where I have to say David is far better than me. I’ve only recently started building a conlang from a lexicon-centric position, and seeing how David builds his words is very helpful. Look out for his example of an entry in his Sondiv dictionary, which already surpasses any entries I’ve made in a conlang dictionary for completeness and number of terms (of course, it is an entry for a triconsonantal root, but I think any derivation system should be built in a way that can handle this complexity).

                At the end of the book, David speaks briefly on the status of conlanging as an art form, and also on how an economy for professional conlanging can evolve. David encourages authors of speculative fiction to collaborate with conlangers, even if all they can offer is a percentage of royalties or the conlanger’s name on the front cover. I like this idea. I’d be more than willing, given the opportunity and the time (oh where to find time!), to collaborate with an author in that way myself, and I think there are a lot of very good conlangers who would as well. I would like to add, though, that even if you are a conlanger yourself and writing some creative work, I think that collaborating with other conlangers can be a benefit. Lots of conlangers have skills and knowledge applicable to a particular type of conlang (e.g. non-humanoid alien languages, or historical a posteriori languages). But definitely, definitely, I wholeheartedly agree that creators should partner with conlangers.

                The Art of Language Invention is not a comprehensive guide. If you are a beginning conlanger, this book is your starting point. As McCulloch put it, it’s “a geek’s guide to linguistics”, and something that makes for a good introductory text. If you are an experienced conlanger (or an intermediate one like me), it’s a window into how another conlanger does his craft, and being exposed to different approaches can only be beneficial to your own work. All in all, I recommend it to anyone interested in our weird little hobby.

Drawing from an urn

Today's XKCD had a joke with an interesting linguistic angle:

I think this kind of humor illustrates two important things about words: First, you can't separate a word from its associations and connotations. Second, those associations are different for different people in different situations.

For most English speakers, an urn is primarily a container for the ashes of a deceased individual, but for this teacher, in a discussion of statistics, it's just a container for randomized balls in an urn problem. But it's not a pun and these aren't homophones; both of these people are probably picturing a ceramic jar of some kind, so they have the same core meaning for the word. It's just the clash of associations that makes the joke.

Language is a way of describing the world, and our understanding of the world is always affecting how we interpret it.


Adventures in Linguistics: He is X nor Y

I had a curious experience in semantics class today.  We were covering the scope of negation, and the professor had presented us with three sentences:

(1) Pat isn't a plumber and isn't an architect.

(2) Pat is not a plumber or an architect.

(3) Pat is neither a plumber nor an architect.

Part of what we were discussing was the fact that all three of these sentences mean the same thing  (that is, the sentence is true only if Pat does not belong to either of these professions), but it seems that (1) and (2) derive that meaning differently, and we were working on which of those sets of rules apply to (3).

I won't bore people with the technical details, but along the discussion, one of my classmates brought up an example of their own:

(4) Pat is neither a plumber or an architect.

Which was grammatical to her, though I find it slightly questionable.  This encouraged me to bring up an example that I had been mulling over in my head for about 10 minutes:

(5) Pat is a plumber nor an architect.

Though I thought that (5) was good and means the same as (3), apparently no other native English speaker in the class agreed with me that (5) was grammatical at all.  One person thought it may have to do with me being from "the South" -- which still amuses me, since I never did consider the part of West Virginia I come from particularly Southern (I suppose it looks very different from Wisconsin).  In any case, it did lead to a short discussion of what could possibly be going on with my dialect of English to cause this construction.

It's funny how these things pop up.  I've had a moment like this before, when the double-modal might could was brought up in syntax class (that one I know is common in Appalachia and the South, but not up here), and I'm sure these things will happen again.

From Aeruyo to Malviz: Where is this phonology even going?

So, one issue I realized I would have to deal with in deriving a language historically is the fact that I would have to go back and analyze the phonology to figure out what the heck phonemes were there, anyway.  So, after a lot of wrangling, I managed to get Zounds to apply my changes to my entire lexicon.  Now after doing that, I'm finding the analysis will be a daunting task, and I'm feeling lazy.  So I thought, why not show the word list to my conlanger friends and see what they think of it?

So, with no explanation, I have here a list of all the words, in phonetic transcription, without the original Aeruyo words.  I won't tell you anything about what I've worked out from my initial eyeballing to keep it pure.  So, if you feel like doing some phonological analysis: here is the word list.

I'll come back when I've had time and desire to work up my own analysis, as well as any tweaks I've made.

Conlang Language Options in Minecraft?

While looking over the patch notes for Minecraft 1.2.4, I noticed a section under the known bugs labelled "Translation Related".  There, in addition to a lot of notes about Spanish translations that mostly seemed to involve correcting names (including some interesting juggling of the terms castellano and español that might be deserving of its own post), I found this curious and rather amusing line:

The translation [Quenya (Arda)] has "Lever" labeled as "Mechanic Pen*s"

A quick check reveals that Minecraft is actually available in three constructed languages: Esperanto [listed as "Esperanto (Mondo)"], Quenya ["Quenya (Arda)"], and Klingon ["tlhIngan Hol (US)"] ...  Why Klingon's listing is US and not some term for the Klingon Empire or their homeworld Kronos/Qo'noS I wouldn't know.

The trivia on Minepedia's Language* page does not redact the term, so I presume that some joker did indeed name the Lever element "Mechanical Penis" (Minecraft uses a crowdsourcing site for translations, and it has gotten them in bigger trouble than this).  However, the problem was apparently fixed: when I jumped in the game using the Quenya UI and made a lever, the mouseover text read "Turolwen" as shown in the image below.

I can't vouch for the accuracy of any of these translations of course, though the Quenya is obviously incomplete, as a few English words and phrases are still being used.  Of course, I'm sure that many of the words Minecraft needs would not be in any canonical Tolkien source, and I think the Elven language people tend to be a little touchy about coinages -- it's just one of the things that can get them arguing.

In any case, it's cool to see people having fun with some conlangs.  In addition to the proper conlangs listed above, there is also a hilarious joke language called Pirate English in the options, and it's pretty much exactly what you would expect it to be.  And of course, there is a wide array of natural languages, too, which will of course benefit Minecraft a bit more.

*Which, as I write this, does not list Esperanto, though I'm sure that will be corrected.

From My Conlanging Past

I was cleaning out my room today and found an old binder done up as a "spellbook" in Aerol (the predecessor to my constructed language Aeruyo -- which I am in the process of putting finishing touches on a grammar for).  It's been so long now that I have trouble deciphering the old Aerol script.  It doesn't help that I created a horrific featural monstrosity that I hope no one with dyslexia would attempt to learn -- literally distinguishing characters by rotation.

But anyway, I thought I'd share some images:


I believe this cover reads "Sagal tan Hatal", roughly "Summonings and Wishes" or some such nonsense.  The small text might mean "Written in the Aerol Language, by Fondor", Fondor being a pseudonym I used to use (what's here is the Aerol reflex of "Fondor", of course).  The following are the spells I wrote, which for the life of me I cannot read right now, and I don't feel like taking the time to decipher them (I lost the key to this script a long time ago, and it will take some time to figure it out).

Design Parameters for Romanization

As some of you may know, I am one of those übergeeks who actually likes to create languages for fun.  I even produce and host a podcast about the art of creating languages.  During that podcast, one particular topic has come up tangentially more than once.  That topic is romanization.  Many of the constructed languages I have seen have quite odd romanizations, though most have been understandable.  Of course, an odd romanization scheme is not necessarily a deal breaker:  Indeed, quite a few natural languages have quite annoying problems with romanization -- particularly those languages for which the Latin alphabet simply isn't well suited (and there are a great many of those).

It has struck me that there are four competing design goals that a language creator (or indeed, a field linguist) needs to consider when creating a romanization scheme.  I will do my best to explain them: 

  • Elegance:  One of my priorities is to have as elegant a romanization scheme as possible.  This means trying my best to keep to a ratio of one grapheme per phoneme, minimize the number of digraphs and diacritics, and overall make the romanization as simple as possible while expressing all the necessary information.  Certain aspects of your language's phonology can affect just how elegant your romanization can be.  For instance, if you have a large vowel inventory, you will have to resort to digraphs or diacritics.  If you have a three-way voiced-voiceless-aspirated distinction, you are probably going to have to use digraphs for one part of that, and if you make any significant use of tone you are almost certain to use diacritics.  This is also the pressure that militates against unnecessary apostrophes that have no phonetic use.  Ultimately, an elegant romanization will have as few graphemes as possible while still leaving the phonemes of any given word explicit and unambiguous.
  • Accessibility:  If you want your conlang to be appreciated by people who are not linguistically savvy (an uphill battle at the start) or use it in a context where non-linguists will need to read the words, such as in fiction, then your romanization needs to be accessible.  This means that the graphemes you use should be easily understood by speakers of the target audience's language.  For instance, an English-speaking audience should fairly easily understand that <kh> represents /x/ or something like it, and will be less likely to make a mistake than if you use <ch> or <x>.  However, for a Spanish-speaking audience, <j> is an even better choice, as it is used in Spanish exclusively for /x/.  Accessible romanizations, like elegant romanizations, will try to reduce ambiguity, but for accessibility one needs to consider not only the ambiguity among the language's own phonemes, but with the target audience's language as well.  Thus, languages that would use <c> for /k/ in all positions lose some accessibility with an English-speaking audience (though Welsh speakers would have no problem).  I should note that accessibility need not militate toward giving readers the correct native pronunciation, which is often not possible purely through orthography (how do you tell an English speaker there is an ejective in a word without some explanation?).  They merely need to be able to produce a passable approximation, or an appropriate Anglicization/Hispanicization/etc, particularly where proper names are concerned.  How often do you hear a news announcer pronounce a foreign name in a non-Anglicized manner?  How about when those names are not Spanish or French in origin?
  • Aesthetics:  Many language creators will use certain artistic preferences when designing an orthography.  For instance, someone may not like the letter <y> and prefer to use <j> or <i> for all instances of /j/ for no other reason.  In my experience, aesthetic considerations are among the most frequent reasons for language creators to make odd choices in romanization.  Why else would Teonaht use <ht> for /θ/ if not for an odd aesthetic preference on the part of the author?  And since artistic preferences are all over the map, a priority placed on aesthetics can lead to some pretty strange orthographies.
  • History:  This is not actual history, but world-internal history.  Some conlangers derive their languages from real world languages written in the Latin alphabet, and thus understandably derive their spellings from those real world spellings.  Others develop complex histories for their languages, and thus may decide to make certain choices based on spellings that would have made sense in earlier forms of the language, particularly when such choices jibe with the native script.  This seems much less common in constructed languages than in the real world, though part of that may come from the fact that many real-world romanization schemes were actually created at an earlier stage of the language (think of the Postal Map romanization of Chinese, which uses <k> for both /k/ and /tɕ/ because the sound change that produced /tɕ/ was still in progress when the romanization was devised).
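
To make the elegance goal a little more concrete, here is a minimal sketch in Python.  The phoneme inventory and both romanization schemes are hypothetical, invented purely for illustration.  The function lists the multi-letter graphemes a scheme uses, the graphemes that carry a diacritic, and any multigraph that could also be read as a sequence of two other graphemes (the kind of ambiguity an elegant scheme tries to avoid).

```python
import unicodedata

# Hypothetical schemes for an invented inventory: phoneme -> grapheme.
SCHEME_A = {"p": "p", "t": "t", "k": "k", "x": "kh", "s": "s", "ʃ": "sh", "h": "h"}
SCHEME_B = {"p": "p", "t": "t", "k": "c", "x": "x", "s": "s", "ʃ": "š", "h": "h"}


def elegance_report(scheme):
    """Summarize how far a scheme strays from one plain letter per phoneme."""
    # Normalize so that a letter plus diacritic counts as one character.
    inventory = {unicodedata.normalize("NFC", g) for g in scheme.values()}

    # Graphemes written with more than one character (digraphs, trigraphs).
    multigraphs = sorted(g for g in inventory if len(g) > 1)

    # Graphemes that contain a combining mark once decomposed.
    diacritics = sorted(
        g for g in inventory
        if any(unicodedata.combining(ch) for ch in unicodedata.normalize("NFD", g))
    )

    # Multigraphs that can also be read as two graphemes in a row,
    # e.g. <kh> as /x/ or as /k/ followed by /h/.
    splittable = sorted(
        g for g in multigraphs
        if any(g[:i] in inventory and g[i:] in inventory for i in range(1, len(g)))
    )

    return {"multigraphs": multigraphs, "diacritics": diacritics, "splittable": splittable}


if __name__ == "__main__":
    print("A:", elegance_report(SCHEME_A))
    print("B:", elegance_report(SCHEME_B))
```

Scheme A is the more accessible of the two for English speakers but pays for it with two digraphs that can be misparsed; scheme B keeps one letter per phoneme at the cost of a diacritic and of <c> for /k/.  Neither count settles the question; it only makes the trade-off visible.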

 The above design goals are by no means the only factors involved in creating a romanization.  Obviously the phonology of a language is a key factor.  As I mentioned above, many phonological choices can severely limit how elegant you can make your romanization, and it also can put a limit on how accessible it can be made.  Certain phonological features might be treated differently depending on priorities, however.  For instance, a conflict between elegance and accessibility to English speakers seems to be the reason some romanizations of Japanese represent /si/ as <si> and others write it as <shi> (though differing opinions on how to analyze Japanese [ʃi] may also come into play -- romanizing natlangs is soo much more complicated).

 Think of a language with heavy lenition.  A conlanger who prioritized elegant romanizations would likely represent the lenited consonants the same as the underlying phonemes in all cases.  Someone concerned with accessibility would probably represent the various lenited forms differently from the underlying phonemes.  Someone interested in aesthetics would choose whatever they felt looked better, perhaps even creating a deliberately obtuse system for denoting lenited forms because they felt like it.  And the historical conlanger might decide to represent them according to the older forms, perhaps before the sound changes leading to lenition occurred, thus producing something similar to the schema used by the elegant conlanger.
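
 As a rough sketch of the first two approaches, assume a hypothetical language in which /p t k/ lenite to [f θ x] between vowels.  The lenition rule, the vowel set, and the spellings below are all invented for illustration.  The "elegant" function spells the underlying phonemes, while the "accessible" one spells the surface phones.

```python
# Hypothetical lenition: /p t k/ become [f θ x] between vowels.
LENITION = {"p": "f", "t": "θ", "k": "x"}
VOWELS = set("ai")

# Hypothetical spelling for every phone that can surface.
SPELLING = {
    "p": "p", "t": "t", "k": "k",
    "f": "f", "θ": "th", "x": "kh",
    "a": "a", "i": "i",
}


def surface_form(phonemes):
    """Apply intervocalic lenition to a list of underlying phonemes."""
    out = []
    for i, ph in enumerate(phonemes):
        between_vowels = (
            0 < i < len(phonemes) - 1
            and phonemes[i - 1] in VOWELS
            and phonemes[i + 1] in VOWELS
        )
        out.append(LENITION.get(ph, ph) if between_vowels else ph)
    return out


def romanize_elegant(phonemes):
    """Spell the underlying phonemes and leave lenition to the reader."""
    return "".join(SPELLING[ph] for ph in phonemes)


def romanize_accessible(phonemes):
    """Spell what is actually pronounced after lenition applies."""
    return "".join(SPELLING[ph] for ph in surface_form(phonemes))


if __name__ == "__main__":
    word = list("taki")  # underlying /taki/
    print(romanize_elegant(word), romanize_accessible(word))
```

 With underlying /taki/, the elegant spelling stays <taki> while the accessible one becomes <takhi>.  The aesthetic and historical conlangers would swap in their own spelling tables, which is exactly where the design priorities show up.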

 Some language creators may apply different design priorities in different areas.  For instance, Tolkien bowed to aesthetics over accessibility when he chose to use <c> for /k/ in nearly all positions in his Elven languages, a fact known painfully by any fan who mistakenly pronounced Celeborn as /sɛlɛbɔ˞n/ and was corrected for it, but he admittedly introduced the dieresis for reasons of accessibility, saying it was to disambiguate vowels that could be interpreted by English speakers as part of a digraph, part of a diphthong rather than a sequential vowel, or silenced -- such as <e> at the end of a word after a consonant. (How successful he was is hard to say, given that English speakers often ignore diacritics.)  I doubt that anyone could really be described as relying purely on one design parameter.  Even someone who cares only about aesthetics might need some way to break a tie between two graphemes they like equally for a given sound.

My own preferences hew toward prioritizing elegance and accessibility, with English speakers as my target audience.  Thus, I try to represent as many phonemes as possible with a single letter, never use <c> for /k/, only use <'> for the glottal stop, etc.  As for the lenition example above, I would represent them as their underlying form except where the lenited forms also exist as phonemes in the language, in which case I would represent those phones as the phoneme associated with the lenited form.  Thus, I strike a balance between elegance and accessibility.  I don't necessarily advocate that position, as I cared much more about aesthetics and very little for elegance when I started conlanging, and I don't find a particular problem with it, despite my tendency to have negative feelings toward <c> for /k/.  I hope that people who read this might simply use it to better understand people's romanization choices, or even as a way to think about their own choices, since, in my opinion, mindful art is often better art.  And romanization really is an art, particularly in the world of conlanging.

EDIT:  I made an error in the previous version of this post, claiming that Wade-Giles uses <k> for /tɕ/.  In fact it actually uses <ch>, making it more-or-less up-to-date. If anything, Wade-Giles is simply less elegant than modern pinyin (with some attempt to be accessible, though it's difficult to make a Chinese romanization truly accessible).

WOTD Defense: Don't be a statistic

Today's WOTD was the usage of statistic in various stock phrases along the lines of "I don't want to be a statistic," or "Don't be a statistic."  The rationale for hating on this, according to the email read on the show, was that you cannot avoid being a statistic: no matter what you do, you are part of one statistical group or another.

This brings in one of the most common fallacies by usage mavens and regular folks everywhere -- trying to apply mathematical logic to language.  It's the same logic that is used to argue against "double negatives" (which I prefer to call negative concord or negative agreement, but I won't get into that here) by claiming that "two negatives equal a positive".  In this case, the peevologist is applying a strict definition of statistic something along the lines of "a member of a statistical group".  I would argue that there are two more useful ways of approaching this problem:

  1. You could propose that statistic has a secondary, figurative meaning of "someone who, through action or inaction on known risk factors, has put themselves in a negative statistical group" (e.g. smokers with lung cancer).  This allows us to explain these various phrases all at once, though it does require the qualifier that this usage is fairly restricted.
  2. Alternately, you could consider the phrase be a statistic to be an idiom.  In linguistics, an idiom is a phrase that has a meaning that cannot be arrived at by analyzing the components.  For example, nothing in the idiom kick the bucket tells us that death is involved; native speakers simply memorize the definition "to die" for the whole phrase.
I think that the second is the more elegant explanation.  But whichever way you slice it, be a statistic seems to, in fact, be a great way to express an idea that would otherwise take much longer: "to be negatively affected by something due to known risk factors that I failed to mitigate through personal behavior".  Just tell me, which of those would you rather type?

WOTD Defense: Unpack and Netiquette

So, today the Word of the Day on the Morning Stream was unpack in the sense of "to analyze (a news announcement, event, speech, etc.)".  Scott Johnson specifically stated that unpack should only be used for luggage.  That seems to me to be an unnecessary limitation.  Words take on figurative meanings all the time; it's part of how language extends itself.  What's more, it has a less formal feel than the synonym analyze (which is derived from Greek, whereas unpack uses a native Germanic root).  I suppose that another synonym, break down, might have worked just as well, but I don't see how using unpack in this sense causes any confusion.

I also want to talk a little about yesterday's discussion on netiquette, which I didn't watch live, and so didn't write a defense for.  Netiquette itself is one of those wonderful neologisms of the Internet age, a portmanteau of net + etiquette.  A lot of people hate these words simply because 1) they are new (or perceived as new) and 2) they represent the Internet culture that is "rotting our children's minds".

What I found more interesting was the discussion during that segment on the role of dictionaries.  Many people seem to have some sort of odd mysticism about dictionaries, as if inclusion in a dictionary somehow makes a word "real".  This also leads some people to object to "unworthy" words being included.  It might make sense for a usage dictionary or a technical dictionary to be selective in that way, but dictionaries are ultimately about documentation.  The Oxford English Dictionary in particular draws negative attention for inclusion of certain words, despite the fact that the mission of the OED is quite the opposite of a language authority:  It is a historical record of the English language.  Thus, inclusion in the OED means nothing other than the fact that a word is common enough in their corpus to be included.  (They have some criteria, but it's mainly that.)  Criticizing the Oxford English Dictionary for recording a word is a bit like criticizing Scott for making podcasts: it's exactly what they set out to do.


WOTD Defense: Anticlima(c)tic

I really, really enjoy The Morning Stream.  If you haven't heard of it, it's a morning show at 8 am Mountain Time (10 am ET), done by Scott Johnson and Brian Ibbott of the Frogpants Studios Network.  In addition to the livestream, it is also put out as a podcast for those who can't listen in the morning.  It's the perfect stuff to put on in the background as I do other, usually undemanding, things, like check on my podcast site, fill in dictionary entries on Aeruyo, or even do important but tediously boring paperwork.  Do be prepared for long episodes, though -- especially on Thursdays.

That said, I would like to say I hate, hate, HATE the Word of the Day segment at the beginning of the show, where they choose a word, usage of a word, or a variant of a word and decide to ban it.  You see, I am a bit of a linguistics geek, and as such I almost always take the descriptive approach to language -- I do not see alternate variations as "wrong".  In fact, they are often interesting in their own right.

Don't get me wrong, everything Scott and Brian do is all in good fun; they are taking a common trope in the media of making highly personal and emotionally charged usage advice and having fun with it.  I have no doubt that they don't actually expect the words they "ban" to disappear from the lexicon.  However, the words they discuss often come from interesting processes.  So I thought maybe taking a moment to discuss where a word comes from might be more interesting than this simple "Oh, man, I hate that word soo muuuch!"

So let's get to it

The Word of the Day today is a phonological variant of anticlimactic, /ˌæn.ti.klajˈmæ.tɪk/, that is, anticlimactic pronounced without a /k/ before the second <t>.  I think the argument against it involves it being confused with *anticlimatic, which I am not certain exists as a common word, though it could conceivably be created with the same rules that created anticlimactic.  I would argue, however, that given what I would guess of meanings for *anticlimatic, context will very easily clear up the distinction in almost all cases.

What is happening in anticlima(c)tic is just a simplification of consonant clusters.  /kt/ is a somewhat difficult cluster, consisting of two consecutive stops pronounced at two very different points of articulation (places in the mouth).  It only makes sense that some speakers would simplify this difficult cluster by deleting one of the sounds.  This is fairly common in English, given its very large number of allowable clusters -- it's the reason you might delete the second /f/ in fifth or not pronounce the plural marker -s in a complex word like ghosts or strengths, especially in running speech.
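
For the programmers in the audience, cluster simplification can be sketched as a list of rewrite rules applied to a broad transcription.  The rule list below is my own illustrative assumption, covering only the examples mentioned above; it is not a real model of English phonology, and it ignores context such as stress and speech rate.

```python
# Illustrative cluster-reduction rules: (cluster, simplified form).
# Assumed for this sketch only; real speech is far more variable.
RULES = [
    ("kt", "t"),    # anticlimactic -> anticlimatic
    ("fθ", "θ"),    # fifth without the second /f/
    ("sts", "st"),  # ghosts without the final plural -s
]


def simplify_clusters(transcription, rules=RULES):
    """Apply each cluster-reduction rule, in order, to a transcription."""
    for cluster, reduced in rules:
        transcription = transcription.replace(cluster, reduced)
    return transcription


if __name__ == "__main__":
    for word in ["æntiklajmæktɪk", "fɪfθ", "ɡoʊsts"]:
        print(word, "->", simplify_clusters(word))
```

Sound change appliers used by conlangers work on the same basic idea, just with richer rule syntax and conditioning environments.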

In summary, given the fact that English speakers regularly simplify difficult clusters with no problem, and the fact that the alternate pronunciation of anticlimactic with a simplified cluster is not likely to cause confusion, I would say that this word does not need to be banned.  In fact, in the future I predict one of two things -- either the simplified variant of anticlimactic will be the norm, or, if the more complex form persists far in the future, the /k/ will perhaps be dropped and replaced by another distinction -- perhaps the /t/ will geminate, or lengthen, or perhaps English will develop a tone system like Chinese languages have, with the historical /k/ affecting the tone of the previous syllable.

In short -- don't ban this pronunciation!

What tense for a video game manual?

A while back, a friend of mine asked me for some writing advice.  She was writing documentation for a video game and, English being her second language, she was unsure of what tense to use when writing a narrative.  I mentioned that, depending on how the story is presented, either past tense or present tense could be appropriate.  Past tense, of course, is typical for written fiction, and is what is used by almost all Anglophone authors, but present tense seemed more appropriate for what she was going for, since she planned to describe the player's expected actions within the description, something that wouldn't sound right as a past-tense narrative.

Then she asked me "Yes, that is what the player going to do, so why not future tense?"

That got me thinking.  Future tense narrative is very rare, but I'm not entirely sure why.  The only reason I can think of is that English actually does not have a dedicated future tense.

Confused?  If you've had an introductory linguistics class you might have learned that while traditional grammarians refer to past, present, and future tenses, English in reality only has two tenses: past and non-past.  What is traditionally referred to as the "future tense of the verb" is a construction of "will + V".  But "will" doesn't really mark simply for future tense.  It is a modal verb with a whole list of usages (you can find a good list on Wikipedia.)

But that doesn't quite explain it.  Spanish does have a true future tense, albeit not commonly used, but as far as I know, future tense narratives aren't too common there either.  This makes me curious about other languages with tense systems.  Maybe future tense narrative isn't common anywhere.  After all, most stories are told about events in the past -- we can't really know the future in that kind of detail.

Anyway, what my friend and I settled on was actually a hybrid present-future narrative.  The general background of the game was given in present-tense, while the expected actions of the player used a future narrative.  This seemed like a fairly natural narrative to me for this specific purpose: the actions of the player are future events, because the player (who may be reading the synopsis) hasn't actually started playing yet.  I would be curious as to how others would approach the problem, though.

Two Tsunami Posts

Usually, I prefer to do separate topics for Mil Palabras and 千字作文, maybe with a connecting theme, maybe not.  But this week I decided that I would do something different.  I decided to do both pieces on the Sendai earthquake and subsequent tsunami.  I did this not because of laziness, but because I was personally affected by the news, despite the fact that I was nowhere near any of the affected areas, and as far as I know none of my friends has been harmed (though I do worry about at least one Japanese friend).

Of the two posts, what I wrote for 千字作文 (仙台地震:全太平洋的灾难) is a little more "newsy", while what I wrote in Mil Palabras (El tsunami y yo) gets more into my personal feelings.  But one thing I suggested in both is: do whatever you can to help, if you can.  And there are tools to help if you can't find someone.  That is all.

Thet foreign talk is up

Today I have posted my first entries for Mil Palabras and 千字作文.  Well, actually, I had already posted a 千字作文 for last week, so now I have two.  My 千字作文 for last week covered a recent story where a few mummies and other artifacts from Xinjiang were pulled from an exhibit in Philadelphia at the request of the Chinese government.  This week's posts are less "newsy", with both going for personal accounts about language learning.  In this week's Mil Palabras (the first!) I talked about how my experience learning Spanish helped me along when I decided to learn Chinese.  In 千字作文 for this week, I talked about some other advantages I had in learning Chinese.  Feel free to read, enjoy, correct and complain.

I'm feeling pretty good about these projects as ways to build and maintain my language skills.  Time will tell whether I will be able to keep up with two essays a week into the future.  For now, I think I already have a good topic for next week's Mil Palabras, thanks to a friend in Mexico -- so long as nothing else strikes me.  If anyone else has suggestions for topics, don't hesitate to send them to me.

Signs in Chinese

So, Egypt is in everyone's news today, but I came across a particularly curious story that tickles a couple of my fancies.  Victor Mair posted on Language Log today a couple photos of protesters holding signs that feature Chinese.  Here are the signs:

I won't bore people with translations and analysis of errors when Mr. Mair has already done that job, but I do find the use of Chinese interesting here.  The theme seems to be "Hosni Mubarak doesn't seem to understand Arabic", as a proxy for the sentiment that he doesn't understand the Egyptian people.  I've heard of similar uses of English in the protests, so I'm guessing these protesters decided to add the second most widely spoken language to cover more bases.  What's next?  Spanish? Hindi?  Or maybe something more obscure.  In any case, the Egyptian people are making it very clear that they want President Mubarak to leave.

Two new personal projects

As I look ahead toward graduation, I realize that I must very soon start finding more opportunities to practice my language skills so that I don't lose them.  One of the hardest things to find practice for outside of formal classes is writing, so I have decided to start two new blogs specifically to practice writing in my two secondary languages.  Each week, I will write one thousand characters in Chinese and one thousand words in Spanish.  Both will be posted here on the site for everyone to read, comment on, and correct errors.  So, for Chinese-speaking readers, please check out 千字作文, and Spanish-speaking readers please look at Mil Palabras. My first proper essays will be coming either this week or next, as time permits.

Bing is not (only) "disease" in Chinese

There is an urban legend that the Chevy Nova had to change its name in Mexico because it could be interpreted as "no va".  It's a cute story, but it's false.

Now, apparently there is a little claim going around that Bing, the brand name of Microsoft's search engine, in fact sounds like the word for "disease".  And this is based on what?  A fortune cookie.  Of course, the fortune cookie is right in this case (they usually aren't bad, but never take them seriously): there is a character 病 that is pronounced bìng* and means "disease."  However, one thing that you can always count on in Chinese, especially with single-character words, is that there are homophones and near-homophones that are just as likely.  Lots of them:


What matters is what characters you use to transliterate it.  It should be noted that Google could have been transliterated to mean "skeleton", but the company wisely found a couple characters that could loosely mean something like "valley song" (in other words, a nonsense phrase with non-offensive characters).

Of course, Microsoft seems to just want to dodge it altogether.  Their China site has no transliteration of their name, just the name in Latin characters:


Not sure what the deal is there.  Maybe they expect Chinese people to pronounce it as pinyin and be done with it (ok for mainlanders, but what about Taiwanese people who don't learn pinyin in school?).  In any case, if Microsoft has any Chinese speakers on their marketing staff, I'm sure they will never, ever brand themselves as "disease".  Some nasty netizen might make fun of their name, but I don't think there's much chance of avoiding that in Chinese.

*the pronunciation may or may not be the same as various English realizations of <Bing> but I won't get into that right now.


Last weekend I finally got a chance to see Avatar.  The film had been delayed in China until January 2, and from what I hear about it, it's unlikely that I would have been able to see it at that time, if I had tried (as it was, I just waited until I was back in the States).

I'd already read a few reviews of it, both positive and negative, so I knew what to expect.  The story was actually a bit better than I had thought from the reviews, but it was still very much suffering from the noble savage and white guilt tropes (those aren't necessarily bad, though), and I do see why people have objected to the hero being a white American who not only assimilates into Na'vi culture but becomes better than them at everything they do in a very short time (the second bit is the key to the objection).  However, I had to agree with my brother who mentioned the Avatar body as being "liberating" for the paraplegic protagonist.

I was impressed by the depth of the world and the alienness of the creatures living there.  The world of Pandora is beautifully rendered and at no time did I detect a flaw in the CGI -- in fact, I didn't even think about it most of the movie.  Like others, I noticed the conspicuousness of the humanoid Na'vi on a planet where all other land animals have six limbs, a second pair of eyes, and breathing orifices on the underside of the body, particularly when much of the world uses realistic science to make fantastic landscapes (those floating mountains are not magical in the least).  I do, however, think it is a good alien design for the purpose -- there are a few things that will take people out of their comfort zone (the neural link takes on a whole different meaning when you find it not only links to other animals, but is also used during mating -- though in my mind it makes it more plausible as far as evolution goes).

Plus, too much alienness in the Na'vi could have messed with one of the reasons I saw the movie: the language.  I've tried creating languages, or conlanging, a bit myself, and when I read that a linguist consultant was hired to construct the language I knew I wanted to see the movie, and I think this language could possibly achieve its goal of "out-Klingon Klingon".  I have tried to find as much information about it ever since.  The consultant, Paul Frommer, posted a sketch of the language at Language Log, and I know of a fan site that is trying to make sense of what materials have come out.  Certain bits of the romanization (which I hear were decided from above) irk me (x marks ejectives when ' is being used for the glottal stop?), but I do think that the language has a beautiful sound to fit the beauty of the Na'vi while still being somewhat unconventional.  I would like someday to see a developed constructed language for aliens that actually used some non-human sounds, but I can understand Cameron's desire for actors to perform their lines without manipulation.  In any case, don't be surprised if you hear me calling someone a "skxawng" (if I can get the pronunciation down, that is :P ).

Do Chinese-Americans speak better Chinese than me?

I recently got into an interesting discussion at Ben Ross' Blog about language competency and Huaqiao*.  The original post topic was on how to get a job using one's Chinese skills, and the main point of it I am in total agreement on:  Unless you are specifically interested in something like translation, knowing a foreign language is not enough to get you a job.  Plenty of people have told me that my languages (English, Chinese, and Spanish) would make me a good candidate for companies that do international business and they will be valuable in getting a job, but as I approach graduation I have got to thinking that, while speaking more than one language is definitely an advantage and I would encourage anyone to learn a foreign language, I haven't quite learned any skills that I can apply that language to.

But that's for another post, what sparked the discussion was this:
What this means is that not even counting the hundreds of thousands of American currently studying Chinese as a second language, there are already over two million Americans, who by virtue of growing up speaking Chinese, speak the language better than you ever will, regardless of how much you study.

Another commenter and I took issue with that statement.  While there are a large number of people in the United States who speak Chinese at home, their children are not necessarily going to be that good at Chinese.  Here's a basic sample of what I put in:
I’m here at a special language program at Zhejiang University and have met several huaqiao here studying Chinese for one of two reasons:

1) They spoke a fangyan at home and had little or no exposure to Mandarin.
2) They can speak Mandarin, but never learned to read.

I think as more Mandarin-speakers move out into the diaspora and more Mandarin-language schools start popping up, there will gradually be more huaqiao that are competent in Mandarin, but it won’t necessarily mean they will all be better than a non-native. Language loss happens in a lot of immigrant groups — I also speak Spanish and I have met a few Hispanics with limited vocabulary or who never really learned Spanish at all (at home, that is), in the US the general rule is that immigrants lose their “mother language” in the third generation.

To expand a bit, I have met Huaqiao here who have no literacy in Chinese to speak of, and others who had noticeable foreign accents or who spoke no Chinese at all.  Back in the States I have met Hispanics with very good Spanish ability, but also a Mexican American who did not learn Spanish until college and now speaks Spanish with a very noticeable West Virginia accent.  So, while there are definitely people back home that speak either of those languages natively, they wouldn't represent the whole of the immigrant group, and even those who do speak natively may not have any professional vocabulary to speak of.  A native speaker of Mandarin would definitely have an edge on me -- my Mandarin is still nowhere near where it would need to be to actually do any kind of serious professional work.  A native Spanish speaker would too, if he has the professional vocabulary to back it up (and that still might not take much, as my Spanish is slipping through lack of exposure).

Ben later clarified that "expanding ones ideolect to include intelligent terminology and industry jargon is not one of the more demanding aspects of language learning."  I tend to agree with that statement: once you know the language to a certain point, new vocabulary is not so difficult.  But native fluency is definitely not a golden ticket, just as having a second language in itself is not a golden ticket into international business.

Anyway, enough recap.  If I haven't bored you to tears and you are really interested in this stuff, read the original discussion thread.  Poke around on the other posts over there, too: Ben has a lot more experience than I have with China and Chinese learning, and his stuff is a lot more informative than mine.

*Huaqiao 华侨 "Chinese diaspora" (#6 here)

English Names of Chinese people

ChinaSMACK has a contest up to win a copy of the book In China, my name is... which focuses on the phenomenon of Chinese people choosing English names for various reasons.  All you have to do is post a comment to their contest post with a story or stories about Chinese people's English names or foreigners' Chinese names.  My (slightly obnoxious) posting can already be found in the comment thread (Sorry Burr, I stole your story before I thought about sharing the link, maybe you can post a more accurate version :S ).

Note: Mom I think you know this, but just to remind, ChinaSMACK is not safe for middle school (sorry, but you'll see why pretty quick).