Does anyone know of a website where I could find - in full dictionary format - the 1400 words that are reputed to make up the core vocabulary for at least 80% of any Latin text?
I have a list of the words, but not in full dictionary format, which makes it much less useful: manus, for example, is listed without its genitive, gender and meaning.
It would doubtless be good for the soul to fill out this information for myself, but if it already exists, why reinvent the wheel?
I heard that most beginners’ textbooks compile their vocabularies from a set of the most common words. You might be able to get more information from a publisher or something.
You are welcome. And please note - MORE vocabulary lists are in the works, as well as Greek. I’ve got two additional books I’d like to extract the vocabulary from & post into the vocabulary service. One book from 1919 (?) gathered its words from a NY state exam list IIRC.
Back in 1939, Paul B. Diederich compiled and submitted to the faculty of the University of Chicago, for the partial completion of the requirements for his master’s or doctorate (I forget which), a 100-page or so book of Latin words with frequencies. He explained the method he used to compile it and, in the back, gave a selection of some 1400 words with translations grouped by theme as a beginner’s vocabulary. If I remember right, he only gave the stem of the words for pedagogical reasons of his own.
All that goes by way of introduction to this, A Dual-Source Database of Word Frequencies in Latin compiled by James H. Dee. It integrates the results of his work and that of another man, and presents the results in the form of a plaintext or Excel document. However, to get a list of the most frequent, I think you’ll have to do your own scraping, and for translation…well, you could see about dumping it through the Words program and capturing the output. If you can find a copy of the two sources for the database, you might do better to work with those; as I said, the Diederich one has a vocabulary in the back selected to cover some 85% of all word occurrences. (Well, I didn’t give a number before, but I believe it is somewhere around there. After that, the percent increase per word learned goes down a bit too much.)
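For what it’s worth, the “do your own scraping” step is mostly just sorting by frequency and keeping words until the running total reaches the coverage you want. Here is a minimal Python sketch of that idea. The function name `core_vocabulary`, the 85% target and the sample counts are all made up for illustration; I am not assuming anything about the actual layout of the Dee database, so you would have to read the lemma/count pairs out of the plaintext or Excel file yourself first.

```python
def core_vocabulary(freqs, target=0.85):
    """Pick the most frequent lemmata until they cover `target` of all occurrences.

    `freqs` maps a lemma to its occurrence count (hypothetical input shape).
    Returns the chosen lemmata and the fraction of occurrences they cover.
    """
    total = sum(freqs.values())
    chosen = []
    covered = 0
    # Walk the lemmata from most to least frequent.
    for lemma, count in sorted(freqs.items(), key=lambda kv: kv[1], reverse=True):
        if covered >= target * total:
            break  # coverage target already reached
        chosen.append(lemma)
        covered += count
    return chosen, covered / total


# Invented counts, purely for illustration (not real Latin frequencies).
sample = {"et": 50, "in": 30, "sum": 10, "manus": 6, "bellum": 4}
words, coverage = core_vocabulary(sample, target=0.85)
print(words, round(coverage, 2))
```

With real counts, you could lower or raise `target` to see where the percent gained per extra word starts to tail off.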
The Diederich one was compared to the College Board vocabulary list given for the Latin test they must have been administering at the time, so some copies of that may be floating about, as well.
I have all the words in the AQA A-level word list in an Excel file with genitives, principal parts etc. Unfortunately I had to put it together myself.
It’s not the 1500 most used words (more like 1000), but it is, I think, enough to be getting on with.
I’ll post it as soon as BT get my broadband connection sorted out, which could be a week or so. I hope one can post attachments here, or that exceptions can be made.
Let’s see if I can dredge up my knowledge of Intellectual Property from when I studied it at university. The principle in copyright law, in the US as well as in the UK, is that copyright is not available for the “mere sweat of the brow”. In one case, for instance, a company tried to copyright a telephone directory. It failed because the work must involve a minimal level of creativity:
(a) include numerous other words in the list, so as to transform it into a work of my own.
(b) ask AQA if they will grant permission… a long shot but it may work.
I think a critical factor will be that the words are merely listed in AQA’s document - no attempt is made to define them, which I concede definitely would raise the compilation into an original work.
Anyway, I’ve no desire to transform myself into a barrack-room lawyer, or to subject Textkit to legal action, so of course your decision as moderator is final.
Reading the report of that case it seems adding extra words to the list would be quite sensible.
I’m reading the De Bello Gallico anyway, and I’ve made a spreadsheet wordlist for that. I think it would be wise to conflate the two. Copyright lawsuits can be nasty. Did you know for instance that they are one of the few circumstances in which English courts will award punitive damages? Nasty stuff.
And here’s another list to be getting on with…at least six weeks’ work, I think…
Thanks, Hugh. I am familiar with these lists. I am particularly interested in the 1400 Words list, as it contains the lemmata from Gonzalez Lodge in digital format. The only place I have found them so far.