linguistics – sardonick
http://motespace.com/blog
Disclaimer: The following web space does not contain my own opinions, merely linguistic representations thereof.

The Problem with Linguistics
http://motespace.com/blog/2006/12/10/the-problem-with-linguistics/
Sun, 10 Dec 2006 21:34:14 +0000

via LanguageLog:

Linguistics will become a science when linguists begin standing on one another’s shoulders instead of on one another’s toes.

–from Stephen R. Anderson’s A-Morphous Morphology (Cambridge University Press, 1992)

Finished the Dissertation Proposal
http://motespace.com/blog/2006/06/19/finished-the-dissertation-proposal/
Mon, 19 Jun 2006 07:37:08 +0000

Ahhh, I’m done. Now, don’t that feel good. 71 pages on building a computational model of language learner errors. Phew, now to sleep.

CALICO 2006 Call for Papers
http://motespace.com/blog/2005/09/21/calico-2006-call-for-papers/
Wed, 21 Sep 2005 16:32:31 +0000

CALL FOR PARTICIPATION
CALICO 2006 ANNUAL SYMPOSIUM
Online Learning: Come Ride the Wave
Hosted by University of Hawaii at Manoa, Honolulu, Hawaii
May 16-20, 2006

Preconference Workshops: Tuesday, May 16 - Wednesday, May 17
Courseware Showcase: Thursday, May 18
Presentation Sessions: Thursday, May 18 - Saturday, May 20

Use CALICO's on-line proposal submission form at http://calico1.modlang.txstate.edu or click on CALICO 2006 on the homepage: http://calico.org
You will need to register on the site ("Proposer registration") before being able to submit.

DEADLINE FOR PROPOSALS: OCTOBER 31, 2005

All presenters must be current members of CALICO by the time of the conference and are responsible for their own expenses, including registration fees.

The Computer Assisted Language Instruction Consortium (CALICO) is a professional organization dedicated to the use of technology in foreign/second language learning and teaching. CALICO's symposia bring together educators, administrators, materials developers, researchers, government representatives, vendors of hardware and software, and others interested in the field of computer-assisted language learning.

For more information or if you have questions or problems, contact:
Mrs. Esther Horn, CALICO Coordinator
214 Centennial Hall, 601 University Drive, San Marcos, TX 78666
Phone: 512/245-1417
Fax: 512/245-9089
http://calico.org
E-mail: info@calico.org or ec06@txstate.edu

Learner Language Modeling at NICT
http://motespace.com/blog/2005/08/22/learner-language-modeling-at-nict/
Mon, 22 Aug 2005 19:06:11 +0000

About a month ago, some researchers from NICT (Japan’s National Institute of Information and Communications Technology) came to visit ISI and give a series of short presentations on their work. Among those presenting was Emi Izumi, a woman who is involved in research very similar to mine. She, and a few others over there, have been working on modeling mistakes in learner language–specifically, typical Japanese school-taught learners of English. I expect they have run into far fewer logistical difficulties than we have with tactical language (namely, a shortage of language-learner speakers, native-speaker annotators, and pre-existing speech data models)… lucky them.

Interestingly, their work is very complementary to my own–while I have concentrated on phonology-related errors, they have put more effort into syntax and morphosyntax. It looks like there’s a lot of room for future cooperation here =).

They have also created a healthy-sized annotated database of learner speech, the NICT JLE Corpus. In keeping with their research focus, the corpus is rich in annotated syntactic errors (but, unfortunately, mispronunciations are replaced with the learner-intended words where they are understandable).

I’m curious how I can use this corpus to benefit my own research. While I expect many errors to be language-dependent–unique to the interaction between the particular L1 and L2 involved–I am sure some language universals come into play as well. And since I’m dealing with a paucity of data, I can at least use a Japanese-learner model as a bootstrap.
Of course, once I get enough data, it will be really cool to compare relative statistics–get a glimpse of what exactly is universal…
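
To make the "bootstrap" idea a bit more concrete, here's a rough sketch of one way it could work: interpolate the sparse error-category counts I have in-domain with relative frequencies estimated from a bigger out-of-domain corpus like the JLE. Everything below is illustrative only; the category names, counts, and interpolation weight are made up and have nothing to do with the actual JLE tag set or my actual model.

```python
# Illustrative sketch: smooth sparse in-domain error-category estimates with
# an out-of-domain prior (e.g. frequencies from a Japanese-learner corpus).
# Categories, counts, and lambda are invented for this example.

from collections import Counter

def error_distribution(counts):
    """Relative frequency of each error category."""
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()}

def bootstrapped_distribution(in_domain, prior, lam=0.3):
    """Jelinek-Mercer-style mix: lam * in-domain estimate + (1 - lam) * prior,
    taken over the union of error categories seen in either corpus."""
    p_in = error_distribution(in_domain)
    p_prior = error_distribution(prior)
    cats = set(p_in) | set(p_prior)
    return {c: lam * p_in.get(c, 0.0) + (1 - lam) * p_prior.get(c, 0.0)
            for c in cats}

# Toy counts: a handful of annotated in-domain utterances vs. a large prior corpus.
in_domain = Counter({"article_omission": 3, "vowel_epenthesis": 5})
jle_prior = Counter({"article_omission": 1200, "verb_tense": 800, "vowel_epenthesis": 150})

print(bootstrapped_distribution(in_domain, jle_prior))
```

As more in-domain data comes in, the mixing weight could be pushed toward the in-domain estimates, which is also where comparing the two distributions (the "what exactly is universal" question) would start to get interesting.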

I have uploaded a few of Izumi’s papers here, to my citeulike page.

Mad Paper-Reading
http://motespace.com/blog/2005/06/11/mad-paper-reading/
Sun, 12 Jun 2005 01:36:11 +0000

Expanding our Eurospeech paper (which we found out last week was accepted!) on modeling language learners’ spoken disfluencies into a full journal paper. Been re-acquainting myself with the masses of related work this weekend.

A fun side effect is that I don’t feel so alone in my research any more. Here, where I sit, at the dovetail of Second Language Acquisition, Natural Language Processing, Artificial Intelligence and Automatic Speech Recognition, I don’t get to meet people who deal with the same questions I deal with day-to-day. It’s not exactly pure interdisciplinary work, but it definitely involves the same kind of merging of academic cultures and demands.

But it’s always nice to realize there are other people (even if only a handful) who’ve trodden this same road.

Basal Ganglia, Birds and Humans, Chirps and Words
http://motespace.com/blog/2005/04/26/basal-ganglia-birds-and-humans-chirps-and-words/
Wed, 27 Apr 2005 04:55:53 +0000

Via Great Minds Working and Slashdot: neurobiologists at MIT are studying the role of the basal ganglia in bird song, in an effort to learn more about the BG’s role in human L1A (first language acquisition) and language processing.

Here’s MIT’s press release

I think this comparison between birds and humans is of particular interest, especially in the way it calls established evolutionary theories into question. Almost all the evolutionary accounts of language I’ve read say that human language arises out of gesture and social interaction in primates. Given, however, that birds are so distant from humans from an evolutionary standpoint, this “parallel evolution” between chirps and words, songs and paragraphs, could tell a different story about the low-level functioning of the BG. It could be parallel evolution, or it could be that the human basal ganglia wasn’t honed by evolutionary pressures of social interaction after all–that it was already sufficiently developed for these tasks in the brains of animals simpler than primates.

Here’s the full research study (which I will read in my copious spare time).

Thoughts on Cognition, Language Acquisition, Hard and Soft Sciences
http://motespace.com/blog/2005/03/22/thoughts-on-cognition-language-acquisition-hard-and-soft-sciences/
Tue, 22 Mar 2005 23:44:53 +0000

Last week marked the end of my Second Language Acquisition class with Dr. John Schumann over at UCLA. The class was amazingly good. Dr. Schumann is an old-school applied linguist who, halfway through his career, decided that studying applied linguistics from a cognitive psychology background was futile without more practical grounding in how the brain actually works. So he decided to pick up neuroanatomy, in his spare time. He now does research in second language acquisition, but with a heavily neuroanatomical slant. The majority of the class, therefore, was spent contrasting the acquisition of language from a psycholinguistic perspective with the acquisition of language from a neurolinguistic perspective.

Now, the continuum of science in academia has always been a fight between “hard” and “soft” sciences. On the soft side, we have the humanities and social sciences. On the hard side, we have chemistry and physics (aside: is String Theory a hard or soft science? On one hand it tries to be theoretically robust; on the other hand, it hasn’t been verifiable/falsifiable, so it cannot be grounded in reality). Because hard sciences are usually taken more seriously, I would say that most sciences on the fringe between hard and soft (e.g. psychology, linguistics, and specifically second language acquisition) usually try to establish themselves as “hard” rather than “soft”. Thus sprang behaviorism and psycholinguistics out of their mother fields. Second Language Acquisition (especially in the context of applied rather than theoretical linguistics), attempting to align itself with the “hard” side of things, has also cozied up to behaviorism and cognitive psychology in an attempt to “prove” itself to itself and the rest of the world.

The problem with this, Schumann says, is that no matter how far researchers follow cognitive psychology, it won’t ever lead Second Language Acquisition into being a “hard science”. What is missing, and what cognitive psych can’t ever give, is indexicalization of the brain–that is, saying “this area processes grammar”, “that area processes phonemic distinction”. (Wernicke’s area and Broca’s area are vast oversimplifications of what really goes on, so don’t get me started on that. But I digress.) The problem with cognitive psychology from a “hard science” perspective is that it creates these cognitive models, these black boxes (it calls them “cognition”, “motivation”, and other names), but they are just models that approximate observed behavior in humans–and, like theoretical linguistics, it’s easier to get caught up in the pursuit of a model than in finding empirical proof of the physical, real-world counterparts to the models (and that, right there, was my major bugaboo with theoretical linguistics). Where, Schumann asks, is the localized referent for “mind”? Where in the brain is the “cognitive” center? And do we dare even begin to ask about “consciousness”?

This is not to say that no theoretical models have specific physical referents in the brain. It is well established, for instance, that the temporal lobe is involved in vision and simple motivation. But vision (and even more so, language) is spread through a bunch of places in the brain. It seems like the system is too emergent for us to pinpoint specific areas to which to tie down our black boxes.

What is cognition? Cognitivists say it is “computation on representation”. Neurolinguists, by contrast, say it’s not such a unified thing–that what we call cognition is an amalgam of specialized functions (perception, memory, fear analysis, reasoning, …). And so, with no unified referent for a theoretical model, “cognition” (and second language acquisition motivated by cognitivist theory) can never become a truly hard science.

Structurally, it’s the difference between the brain being a Turing machine and being merely Turing-compatible.

TactLang on Slashdot
http://motespace.com/blog/2005/01/04/tactlang-on-slashdot/
Tue, 04 Jan 2005 19:28:38 +0000

Neat: our Tactical Language project was featured on Slashdot this morning.

The main link in the article pointed to a journal paper written by Ravi Purushotma, documenting his vision of customizing The Sims to teach German. From reading his paper (and his updates to the paper), it is unclear whether this Sims system is more vapor-/concept-ware or an actual implemented system, but it looks interesting. I’m not sure how pedagogically effective it could be, though. The Sims is certainly a good motivator that will encourage learners to use the software, but is the vocabulary the learner is exposed to going to be useful beyond the game environment?

This is one thing that working on TactLang has really impressed upon me: the user does not have an infinite amount of time, especially not to spend on any language teaching program that you might want to develop. If we’re trying to give the learner basic language functionality through a scant 80 hours of teaching, we’d better be very intentional about what we do/don’t include in our curriculum, and very confident that our AI-driven pedagogical feedback is as effective as possible.

And, as an aside: also mentioned in the article is a link to MIT’s SCILL, which focuses on computer-aided language pedagogy for Mandarin Chinese. I saw these guys in Venice at the InSTIL conference last summer–very promising stuff they have. We were originally thinking of focusing on Chinese as a target language for TactLang, but speech recognition over a tonal language was a headache we didn’t want to worry about =p.

Home English Home
http://motespace.com/blog/2004/12/12/home-english-home/
Mon, 13 Dec 2004 02:21:43 +0000

A wonderful bit of web-based English pedagogy. Perhaps I can use this as inspiration for my work on our Tactical Language Training System.

Modeling Second Language Learner Speech
http://motespace.com/blog/2004/12/03/modeling-second-language-learner-speech/
Fri, 03 Dec 2004 20:50:30 +0000

A week from now I’m giving a talk on creating a language model for second-language-learner speech (basically, my PhD research up to this point, and what will eventually become my thesis).

Information:

Speaker: Nick Mote
Date: 10 Dec 04
Time: 3:00pm – 4:30pm
Location: Information Sciences Institute (Marina Del Rey, California)

Abstract:

ISI’s Tactical Language Project is a system designed to teach Americans how to speak Arabic through a video game environment. We’ve taken an FPS engine (Unreal 2003), added skins and maps so it looks like you’re in a typical Lebanese village, taken away the guns, added speech recognition, and set the player in the middle of it all. The theory is that if you learn well in a classroom, you’ll perform well in a classroom–but if you learn well in a pseudo-naturalistic environment, you’ll perform better in real life. My research comes into play because speech recognition is a hard thing–especially when you’re trying to understand language-learner speech, with all of its mispronunciations, disfluencies, and grammatical errors. Understanding speech is hopeless unless you have a good approximation of what kinds of mistakes learners make, and can anticipate them.
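
One common way to "anticipate" mistakes, sketched very roughly below, is to expand the recognizer's pronunciation lexicon with mispronunciation variants predicted by L1-to-L2 phone substitution rules. The phone symbols, rules, probabilities, and the example word here are all invented for illustration; they are not the project's actual rule set or phone inventory.

```python
# Illustrative sketch: generate learner pronunciation variants from simple
# phone-substitution rules, so the recognizer can score them alongside the
# canonical forms. Phones, rules, and probabilities are made up.

from itertools import product

# target_phone -> [(learner_phone, probability), ...]
SUBSTITUTION_RULES = {
    "q": [("q", 0.6), ("k", 0.4)],   # assumed: uvular stop often fronted by English speakers
    "H": [("H", 0.5), ("h", 0.5)],   # assumed: pharyngeal /H/ reduced to plain /h/
}

def pronunciation_variants(canonical):
    """Enumerate variants of a canonical phone string, each with a probability
    equal to the product of the per-phone rule probabilities."""
    options = [SUBSTITUTION_RULES.get(p, [(p, 1.0)]) for p in canonical]
    variants = []
    for combo in product(*options):
        phones = [ph for ph, _ in combo]
        prob = 1.0
        for _, p in combo:
            prob *= p
        variants.append((phones, prob))
    return variants

# Toy lexicon entry: one canonical pronunciation, expanded into weighted variants.
for phones, prob in pronunciation_variants(["m", "a", "r", "H", "a", "b", "a"]):
    print(" ".join(phones), round(prob, 2))
```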

Say an English learner says “Water”. Is he asking you for water? Is he telling you there’s a puddle in front of you? Is he saying his name is “Walter”, but mispronouncing it? There’s a lot of ambiguity involved. In order to disambiguate, we need to look at context, the learner’s past language performance, and details about the learner’s mother language as it relates to English, to be able to guess what he is actually trying to say.
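
A toy version of that disambiguation, just to pin down the idea: score each candidate intent by a context prior times the likelihood of the recognized string given that intent, and pick the argmax. The intents, priors, and likelihoods below are invented numbers, not anything from the actual system.

```python
# Illustrative sketch of the "water vs. puddle vs. Walter" disambiguation:
# choose the intent maximizing P(intent | context) * P(utterance | intent).
# All values below are made up for the example.

def best_intent(utterance, context_prior, likelihood):
    scores = {
        intent: context_prior.get(intent, 1e-6) * likelihood.get((utterance, intent), 1e-6)
        for intent in context_prior
    }
    return max(scores, key=scores.get)

# Context: the learner was just asked to introduce himself, so the
# name-giving intent gets a high prior; past performance could shift this too.
context_prior = {"request_water": 0.1, "warn_puddle": 0.05, "give_name_walter": 0.85}

# How likely each intent is to surface as the recognized string "water",
# given typical mispronunciation patterns (e.g. a dropped /l/ in "Walter").
likelihood = {
    ("water", "request_water"): 0.9,
    ("water", "warn_puddle"): 0.3,
    ("water", "give_name_walter"): 0.4,
}

print(best_intent("water", context_prior, likelihood))  # -> give_name_walter
```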

And then, of course, once we have a good guess at what the learner has said, what do we do about it? How do we correct him? How serious are different speech disfluencies in terms of native listener comprehension, pedagogical objectives, and social politeness? (The Lebanese word ra’iib (sergeant) is dangerously close to the word rahiib (terrible); we want to take special corrective care to make sure learners don’t make errors like that.) And how do we compensate for poorly-performing speech recognition? (ASR works great with a lot of data, but there isn’t much annotated data of Americans learning specific subdialects of Arabic.)
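
One way to turn those three considerations into a decision about which error to correct first is a simple weighted severity score, sketched below. The error list, dimension scores, and weights are all illustrative assumptions, not the system's actual error taxonomy or tuning.

```python
# Illustrative sketch: rank detected errors by a weighted combination of
# comprehension cost, pedagogical value, and politeness risk, then correct
# the worst one first. Numbers and weights are invented for the example.

from dataclasses import dataclass

@dataclass
class DetectedError:
    description: str
    comprehension_cost: float   # how much it hurts a native listener's understanding (0-1)
    pedagogical_value: float    # how central fixing it is to the current lesson (0-1)
    politeness_risk: float      # how socially damaging the error is (0-1)

def severity(err, weights=(0.4, 0.3, 0.3)):
    return (weights[0] * err.comprehension_cost
            + weights[1] * err.pedagogical_value
            + weights[2] * err.politeness_risk)

errors = [
    DetectedError("dropped definite article", 0.2, 0.5, 0.0),
    DetectedError("ra'iib pronounced as rahiib", 0.7, 0.4, 0.9),
]

# Correct the most severe error first; minor ones might be let through silently.
for err in sorted(errors, key=severity, reverse=True):
    print(round(severity(err), 2), err.description)
```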

This is basically what I’m doing. I use a lot of Natural Language Processing–primarily statistical NLP, with a bit of pedagogy theory and linguistic (SLA and phonology) theory sprinkled in.

Let me know if you want to come; I can give you more details. I’ll also put my presentation slides up here (once I’m finished writing them).
