docoflove1974 Posted September 19, 2006
This came over the wires from LinguistList; for those of you who are knowledgeable in the lingua latina, this might be something to look for in future...or perhaps even contribute to!
Call for Collaboration: Latin Treebank
The Perseus Project has recently received a planning grant from the NSF to investigate the costs and labor involved in constructing a multimillion-word Latin treebank (a large collection of syntactically parsed sentences), along with its potential value for the linguistics and Classics community. While our initial efforts under this grant will focus on syntactically annotating excerpts from Golden Age authors (Caesar, Cicero, Vergil) and the Vulgate, a future multimillion-word corpus would comprise writings from the pre-Classical period up through the Early Modern era.
To date we've annotated a total of 12,000 words in a style that's predominantly informed by two sources: the dependency grammar used by the Prague Dependency Treebank (itself based on Mel'cuk 1988), and the Latin grammar of Pinkster 1990.
While treebanks provide valuable training data for computational tasks such as grammar induction and automatic syntactic parsing, they also have the potential to be used in traditional research areas. Large collections of syntactically parsed sentences have the potential to revolutionize lexicography and philology, as they provide the immediate context for a word's use along with its typical syntactic arguments (this lets us chart, for example, how the meaning of a verb changes as its predominant arguments change). Treebanks enable large-scale research into structurally based rhetorical devices of particular interest to Classicists (such as hyperbaton), and they provide the raw data for research in historical linguistics (such as the move in Latin from classical SOV word order to Romance SVO).
The eventual Latin treebank will be openly available to the public; we should, therefore, come to a consensus on how it should be built. To that end we encourage input from the linguistics and Classics community on the treebank design (including the syntactic representation of Latin) and welcome contributions by annotators (for which limited funding is available). Interested collaborators should contact David Bamman (David.Bamman@tufts.edu) at the Perseus Project.
Primus Pilus Posted September 20, 2006
Very interesting news. Though I would suggest that the Perseus Project as a whole upgrade their servers and potentially even the database software. It's a shame that such an invaluable project is so abysmal for surfing, page loading and searching. I rarely even try using it anymore (though the mirrors are a bit better).