Yet another Macronizer

Alatius
Textkit Fan
Posts: 278
Joined: Mon May 14, 2007 11:21 am
Location: Upsalia, Suecia

Post by Alatius »

I invite everyone to test my latest creation, a new Latin Macronizer:
http://stp.lingfil.uu.se/~jowi4905/macronizer/
Apart from marking long vowels, it also has the ability to convert the orthography to use v and/or j, if so desired.

This project was at least partly inspired by Felipe Vogel's Māccer program. As I see it, Māccer has two drawbacks that I wanted to overcome: it relies on macronized texts for training, and (as far as I understand) it doesn't take the context of an ambiguous word into consideration. My macronizer instead gets its information about vowel lengths from a morphological analyzer (Morpheus, of the Perseus Project). To choose between ambiguous forms, the text is tagged with a part-of-speech (POS) tagger, trained on the Latin Dependency Treebank.
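
To make the idea concrete, here is a rough sketch of how a tag from a tagger can pick one of several macronized forms of the same spelling. It is only an illustration: the tiny `LEXICON`, the tag names, and the `macronize` function are made up for this example and are not the actual Morpheus output, the treebank tagset, or the code of the macronizer.

```python
# Toy illustration only. The lexicon entries and tag names below are
# hypothetical; they are not Morpheus output or the treebank's tagset.
LEXICON = {
    "in": {"prep": "in"},
    "re": {"abl.sg.fem": "rē"},
    "publica": {"nom.sg.fem": "pūblica", "abl.sg.fem": "pūblicā"},
}


def macronize(tagged_tokens):
    """Macronize a list of (word, tag) pairs, as a POS tagger might produce.

    For simplicity the output is lowercased; a real tool would have to
    restore the original capitalization.
    """
    result = []
    for word, tag in tagged_tokens:
        analyses = LEXICON.get(word.lower(), {})
        if tag in analyses:
            # The tag selects one of the competing macronized forms.
            result.append(analyses[tag])
        elif len(analyses) == 1:
            # Only one known form, so the tag does not matter.
            result.append(next(iter(analyses.values())))
        else:
            # Unknown word, or ambiguous with no matching tag: leave as is.
            result.append(word)
    return " ".join(result)


print(macronize([("in", "prep"), ("re", "abl.sg.fem"), ("publica", "abl.sg.fem")]))
```

In this sketch the quality of the result depends entirely on the tag: if the tagger labels publica as nominative instead of ablative, the final a comes out short.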

Any suggestions for improvements are very welcome. I'm especially interested in reports of faults that are due to errors in the Morpheus lexicon. Short of true artificial intelligence, it is of course unreasonable to expect a perfect result from the macronizer, so if you find that it somewhere mixes up, say, mīseris and miserīs, that is only to be expected, and there is little I can do about it. If, however, you notice a macronized word form which shouldn't exist, or which is very rare and thus highly improbable compared to the correct form, then please report it. (For example, I noticed just today that paria is often erroneously marked as parīa, which is an alternative form of parēas, a kind of snake! That will be corrected.)

Nesrad
Textkit Fan
Posts: 315
Joined: Thu Nov 01, 2012 1:10 pm

Re: Yet another Macronizer

Post by Nesrad »

This is a very nice tool. Thanks a lot!

cb
Textkit Zealot
Posts: 762
Joined: Tue Sep 18, 2007 3:52 pm

Re: Yet another Macronizer

Post by cb »

hi, this is great thanks. i noticed that if you put in by itself "in re publica", it macronises the a, but if you put in the first sentence of cicero's de oratore, it doesn't macronise the a in "in re publica", and gives:

Cōgitantī mihi saepe numerō et memoriā vetera repetentī perbeātī fuisse, Quīnte frāter, illī vidērī solent, quī in optimā rē pūblica, cum et honōribus et rērum gestārum glōriā flōrērent, eum vītae cursum tenēre potuērunt, ut vel in negōtiō sine perīculō vel in ōtiō cum dignitāte esse possent;

perhaps look into that? cheers, chad

Nesrad
Textkit Fan
Posts: 315
Joined: Thu Nov 01, 2012 1:10 pm

Re: Yet another Macronizer

Post by Nesrad »

Actually it has issues with the final a/ā, so that's one thing that needs to be proofread systematically.
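
For anyone who wants to do that proofreading semi-automatically, a small helper along these lines could list every word ending in a or ā so they can be checked one by one. This is just a sketch; the function name `final_a_words` is my own and is not part of the macronizer.

```python
import re
import unicodedata


def final_a_words(text):
    """Return every word in the text that ends in a or ā, in order."""
    # Normalize to NFC so that "a" + combining macron becomes a single "ā";
    # otherwise the pattern below would miss decomposed characters.
    text = unicodedata.normalize("NFC", text)
    return re.findall(r"\b\w*[aā]\b", text)


sample = "quī in optimā rē pūblica, cum et honōribus et rērum gestārum glōriā flōrērent"
for word in final_a_words(sample):
    print(word)
```

Run on the clause from the De Oratore example, this prints optimā, pūblica, and glōriā, which makes the unmarked a in pūblica easy to spot next to optimā.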

bedwere
Global Moderator
Posts: 5101
Joined: Fri Mar 07, 2008 10:23 pm
Location: Didacopoli in California

Re: Yet another Macronizer

Post by bedwere »

New location:

http://stp.lingfil.uu.se/~winge/macronizer/

But if you are somewhat Linux savvy, I recommend a local installation:

https://github.com/Alatius/latin-macronizer

PS

In my case I moved the latin-macronizer directory under /usr/local and recursively changed its ownership to root:
sudo chown -R root:root /usr/local/latin-macronizer
Otherwise my version of apache2 would not execute the Python script macronize.py.
