Have you thought about doing aoidoi-like commentaries?

psilord
Textkit Member
Posts: 184
Joined: Fri Dec 24, 2004 9:38 pm
Location: Madison, WI

Post by psilord »

Hold on, my new method might be working... Now I just need a solid block of time... :)

Post by psilord »

Sigh.

I think I'm going to give up on this project, simply because the output of the CGI scripts at Perseus is anything but machine-readable. :(

The best I could probably do is give an 80% or so hit rate on scanning the words. I could turn ἔβαινες into ἔ‐βαιν‐ες, but as for getting the morphology index out, it is so freeform (both in HTML and dumped ASCII) that parsing and typechecking it are prohibitively expensive to code.

This project would go faster if 10 people split up the first book into 60 lines each and did it by hand, writing the morphological expansion into a special markup file which then got converted into HTML.

If Perseus made their DB and schema accessible, then this project would go a million times faster, but that probably isn't going to happen...

Given all of the wondrous information Perseus has, it boggles the mind that they give you such a small view of it.

chad
Textkit Zealot
Posts: 757
Joined: Tue Jul 22, 2003 2:55 am

Post by chad »

hi, could you pls email me whatever mess comes out of perseus, i want to see it and see if i can clean it up using find/replace macros, i've done stuff like that before, thx :)

parsing the iliad could be done manually for sure, but people have been talking about this probably for 2.8k years and it hasn't happened yet as far as i know: getting it from perseus will more likely make it happen i think.

Post by psilord »

I don't think find and replace macros are going to help you very much. If you aren't a programmer, what I have isn't going to help you.

As for what I have, I have a program which will grab the full morphological index page for every single word you feed it, and cache it so you don't have to do multiple lookups on the same word. You feed it books of the Iliad (or whatever in Greek) and it builds the page index associated with the words found in the text. This was the "front end" to the thing I was writing.

With a teeny bit of work, I could make it grab the normal morphological lookup as well, thereby giving you a definition of the word in question (or multiple definitions if the dictionary was confused) in addition to the stems and stuff. Maybe I'll do this, since it really is a small amount of work.
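
For illustration only, here is a minimal sketch of a caching front end like the one described, written in Python rather than the Perl mentioned later in the thread. The names `CachedLookup`, `fetch_page`, and the on-disk layout are hypothetical. The actual request to Perseus is left as an injected function, since the thread does not give the URL format.

```python
import hashlib
from pathlib import Path


class CachedLookup:
    """Fetch one page per distinct word and keep a copy on disk."""

    def __init__(self, cache_dir, fetch_page):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        # Hypothetical callable: takes a word, returns the page text.
        self.fetch_page = fetch_page

    def _path_for(self, word):
        # Hash the word so Beta Code characters such as ")" and "/"
        # never end up in a filename.
        digest = hashlib.sha1(word.encode("utf-8")).hexdigest()
        return self.cache_dir / (digest + ".html")

    def lookup(self, word):
        path = self._path_for(word)
        if path.exists():
            # Cache hit: no second request for the same word.
            return path.read_text(encoding="utf-8")
        page = self.fetch_page(word)
        path.write_text(page, encoding="utf-8")
        return page

    def build_index(self, text):
        """Map each distinct whitespace-separated word to its page,
        keeping the order in which words first appear in the text."""
        return {word: self.lookup(word) for word in dict.fromkeys(text.split())}
```

Passing the fetch function in from outside keeps the cache logic testable without touching the network; a repeated word in the input costs one fetch, not two.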

Then, what was supposed to happen in the backend was that I was to parse the cached web pages (in some manner) and spew out the information for each word as I find it in the text. Problem is, I can't parse the web pages properly since they are amazingly freeform.

I suppose what I could do is this.... I could provide an *extremely* rough draft _text_ file output of the information, but hands down a human will have to go through it and collapse the ambiguities.

If I got that part done, then someone else (probably me) could reformat the text file into a markup language (of my own design) which I could then spew out into HTML/PDF/whatever.

Would this be sufficient?

I was trying to minimize the human element in this, but Perseus made it impossible. :(

I'll see if I can get the *very* raw output done tomorrow. I've written enough Perl already that it shouldn't be too much trouble to complete, but this project has had surprising time sinks in it, so we'll see.

Post by psilord »

Ok, here is a sample of what the file will contain:

WORD: a)/nac
Form: a)/nac Dictionary Entry: a)/nac
Stem: a)na masc c_ktos
Ending: c masc nom/voc sg c_ktos
---
Candidate: a)/nac
Definition: a lord, master
Speech: masc nom sg
Speech: masc voc sg
---

Ugly, eh? :)

Sometimes the "Speech" aspect has more pertinent information than the "Ending" one, but usually "Speech" is a subset of "Ending". Oh well.
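
As a rough sketch of how a file in this shape could be read back in, the Python below parses the sample into nested dictionaries. The field layout is inferred from the single sample above, so real output may contain cases it does not handle; `parse_records` and the dictionary keys are hypothetical names.

```python
SAMPLE = """\
WORD: a)/nac
Form: a)/nac Dictionary Entry: a)/nac
Stem: a)na masc c_ktos
Ending: c masc nom/voc sg c_ktos
---
Candidate: a)/nac
Definition: a lord, master
Speech: masc nom sg
Speech: masc voc sg
---
"""


def parse_records(text):
    """Group lines into one record per WORD, each holding a list of
    analyses (Form/Stem/Ending) and candidates (Definition/Speech)."""
    records = []
    record = analysis = candidate = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "---":
            continue  # separators carry no data of their own
        key, _, value = line.partition(":")
        value = value.strip()
        if key == "WORD":
            record = {"word": value, "analyses": [], "candidates": []}
            records.append(record)
            analysis = candidate = None
        elif record is None:
            continue  # ignore anything before the first WORD line
        elif key == "Form":
            # "Form: X Dictionary Entry: Y" carries two fields on one line.
            form, _, entry = value.partition("Dictionary Entry:")
            analysis = {"form": form.strip(), "entry": entry.strip()}
            record["analyses"].append(analysis)
        elif key in ("Stem", "Ending") and analysis is not None:
            analysis[key.lower()] = value
        elif key == "Candidate":
            candidate = {"candidate": value, "speech": []}
            record["candidates"].append(candidate)
        elif key == "Definition" and candidate is not None:
            candidate["definition"] = value
        elif key == "Speech" and candidate is not None:
            candidate["speech"].append(value)
    return records
```

Keeping every "Speech" line as a separate list entry leaves the ambiguities visible, so a human can still collapse them by hand afterwards.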

I'll start the process of producing the harris-ified text for book one of the Iliad tonight. Since I'm being a good neighbor while datamining Perseus, it'll take a day or so to download the data and produce the output. The words will be emitted into the file in the order they are found in the text.
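
One common way to be a "good neighbor" is to space requests out. The Python sketch below shows that idea only; the thread does not say how the actual downloads were paced, and `Throttle`, `fetch_all`, `fetch_page`, and the interval value are all hypothetical.

```python
import time


class Throttle:
    """Enforce a minimum gap between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds; pick a value the server can tolerate
        self.clock = clock
        self.sleep = sleep
        self._last = None

    def wait(self):
        now = self.clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self._last = now


def fetch_all(words, fetch_page, throttle):
    """Fetch each word in order, pausing between requests."""
    pages = {}
    for word in words:
        throttle.wait()
        pages[word] = fetch_page(word)
    return pages
```

The clock and sleep functions are parameters so the pacing can be checked without actually waiting.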

I'm sleepy. Good night.

Post by chad »

perfect, i think that'll be really useful, cheers Peter :)

Post by psilord »

There is a surprise in your mailbox, Chad.
