Computer automatically deciphers ancient language

Viggen · July 1, 2010

A new system that took a couple of hours to decipher much of the ancient language Ugaritic could help improve online translation software. In his 2002 book Lost Languages, Andrew Robinson, then the literary editor of the London Times

Ludovicus · July 1, 2010

Best of luck to scientists trying to decipher ancient languages. However, computers have been inadequate in producing reliable translations from one modern language to another modern language. If you are bilingual, try looking at one of your languages translated into the second by any of the online translating engines, such as Babelfish, etc.

I tried translating a simple American greeting, "What's up?" into Spanish. The site <http://babelfish.yahoo.com/translate_txt> failed miserably.

Another, "He broke her heart," was rendered as "He (literally) tore apart her heart." Babelfish failed to translate the idiomatic expression. If I sound a bit overly vexed by the post, it's because at my school the office staff believed they could get away with using computer-generated translations to communicate with parents. The final products were largely incomprehensible and, in some cases, very insulting.

What a computer program will do with writing from a culture with which we have no living contact is not to be trusted.

Bryaxis Hecatee · July 1, 2010

I took courses on this particular topic, going deep into the way those tools work, and indeed a lot of tools can't be trusted, because they often work on very large corpora and can't determine semantic values except when the corpus is limited: they are very good, for example, at translating medical knowledge bases. One exception is Google Translate, which does not use grammar and lexicon rules like everyone else but statistical analysis, which is one of the main reasons for some of Google's projects, such as Google Books.

docoflove1974 · July 1, 2010

Heh, this is what I've been telling my students for 11 years now...and many more before me have as well...as of now, most computer translations don't work.

There are only a couple of dictionaries I use with my Spanish classes: one larger one (SpanishDict.com, which uses definitions from various sources) and a smaller one (WordReference.com), which is used in SpanishDict.com. And even still, for a single word they're great, but for a translation they're not good. Nothing is.

However...the issue of corpora is an interesting one. More and more there are larger and larger online corpora for many languages, particularly ancient languages. Perhaps with a dead language it's possible to do some further analysis into translation, simply because you don't have a live morphology or syntax that is changing. I'd be curious to see what really comes of all this.

Ludovicus · July 1, 2010

Very well put. Though computer programs do a poor job translating most texts, technology can analyze many features of written language: word frequency, pattern analysis, morphology, etc.
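To make that kind of analysis concrete, here is a minimal Python sketch of two of the features mentioned above: word frequency, and word-ending counts as a very crude stand-in for morphology. The function names and the sample sentence are made up for illustration; real corpus tools are far more sophisticated.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Count how often each word occurs in the text."""
    return Counter(tokenize(text))


def suffix_frequencies(text, length=2):
    """Count word endings of the given length.

    This is only a rough proxy for morphology: it ignores words that are
    not longer than the suffix and knows nothing about real affixes.
    """
    return Counter(w[-length:] for w in tokenize(text) if len(w) > length)


# Hypothetical sample text, reusing the idiom discussed earlier in the thread.
sample = "He broke her heart. She broke his heart."
freq = word_frequencies(sample)
endings = suffix_frequencies(sample)

print(freq.most_common(2))
print(endings.most_common(2))
```

Counts like these say nothing about meaning, which is the point being made: the machine can measure the text reliably long before it can translate it.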

Kosmo · July 2, 2010

There have been some serious developments in this area, including software good enough to play Jeopardy or to take part in a dialog. Modern computers have the processing power and database size to handle translations, but the programs are not good enough yet, maybe because there are not many economic incentives to put resources into that.

Bryaxis Hecatee · July 2, 2010

Oh, there are massive incentives, with a good deal of the EU research budget going into this realm (try managing a whole set of countries with at least 20 languages without some level of automatic translation): that's how Systran and SGML/XML were born or made really public, thanks to EU money! Without it, I can assure you there would be far fewer linguists in university departments, and linguistics would be far less advanced than it is now. My university did fail a fairly big project in that realm at the end of the '80s/start of the '90s.

But what one has to bear in mind is that translation is truly the matching of two different linguistic fields, grammar and semantics, and both need specific attention. The current efforts in ancient languages on treebanks (e.g. http://nlp.perseus.tufts.edu/syntax/treebank/ ) and automatic word analysis (e.g. http://www.aclweb.org/anthology/W/W08/W08-2117.pdf ) are being made because these are areas for which the grammar 1) is stable and 2) is well studied, with about 400 years of scholarship behind it. But often, even with these dictionaries, grammars, etc., the systems are really efficient only on the corpus they've been trained on (such as a specific author, a literary genre, or texts on a specific topic). Typically, in a text like "the artefact was a long, slim, piece of bone. I identified it as a fibula" (to use a recent topic on UNRV), the word "fibula" could refer either to a human bone or to a clothing fastener, depending on whether we are reading a medical description or an archaeological text...

Statistical analysis of the potential semantic value of words in the same area, along with semantic analysis of the structure of the document (as encoded in the TEI http://www.tei-c.org/index.xml or EpiDoc http://epidoc.sourceforge.net/ text formats), might help the system determine the lexical area in light of which the text must be translated, but even that isn't certain and requires human help. That's why Google, which was the first to have enough data to analyze modern languages, used the statistical approach and offered to translate any page, then asked its users for possible corrections: this gives Google a very good, cheap way to improve its statistics.
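As a rough illustration of the idea (not how Google or any real system does it), here is a small Python sketch that guesses the lexical area of a text by counting cue words and then picks a sense for "fibula". The cue-word lists, the sense labels, and the function names are all invented for this example; a real system would learn such statistics from large corpora.

```python
import re

# Hypothetical cue words for each lexical area (assumed, not from a real lexicon).
DOMAIN_CUES = {
    "medical": {"bone", "fracture", "leg", "tibia", "patient", "surgery"},
    "archaeological": {"artefact", "brooch", "excavation", "bronze", "grave", "pin"},
}

# Hypothetical sense of "fibula" in each area.
FIBULA_SENSES = {
    "medical": "calf bone",
    "archaeological": "brooch (clothing fastener)",
}


def score_domains(text):
    """Count how many tokens of the text are cue words for each area."""
    words = re.findall(r"[a-z]+", text.lower())
    return {area: sum(w in cues for w in words) for area, cues in DOMAIN_CUES.items()}


def pick_fibula_sense(text):
    """Return the sense for the best-scoring area, or None when undecided.

    None means no cue words were found or two areas tied, which is where
    a human would have to step in.
    """
    scores = score_domains(text)
    best = max(scores.values())
    winners = [area for area, s in scores.items() if s == best]
    if best == 0 or len(winners) > 1:
        return None
    return FIBULA_SENSES[winners[0]]


ambiguous = "the artefact was a long, slim, piece of bone. I identified it as a fibula"
medical = "The patient had a fracture of the fibula after surgery"

print(pick_fibula_sense(ambiguous))  # "artefact" and "bone" tie, so undecided
print(pick_fibula_sense(medical))
```

With these made-up word lists, the sentence from the post above comes out as a tie, since "artefact" points one way and "bone" the other, which is exactly the case where the statistics alone are not enough.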

According to the test we ran during my course on automatic language processing, Google performed about 60% better than other tools, but was more efficient on short texts, since those call for less morphological and grammatical analysis.

docoflove1974 · July 2, 2010

Bryaxis, this is very interesting; I knew that Google was big in the document translation arena, and that with their resources they are probably able to move to the forefront. I know that on LinguistList there is a bevy of job listings for computational linguistics, especially in translation, and no doubt there will be a breakthrough shortly.
