The Fourteen Hundred words

Does anyone know of a website where I could find - in full dictionary format - the 1400 words that are reputed to make up the core vocabulary for at least 80% of any Latin text?

I have a list of the words, but not in full dictionary format, which makes it much less useful: e.g. manus is listed, but without genitive, gender and meaning. The list would be so much more useful to me in full dictionary format.

It would doubtless be good for the soul to fill out this information for myself but if it already exists, why reinvent the wheel??

I’ve heard figures as low as 500 words to as many as several thousand. Whitaker has a version of “Words”, a downloadable Latin/English dictionary here: http://www.erols.com/whitaker/words.htm, or you can use Perseus’ online dictionary: http://www.perseus.tufts.edu/cgi-bin/ptext?doc=Perseus%3Atext%3A1999.04.0060

Last, but certainly not least, there are several Latin lists for your use and education here on Textkit: http://www.textkit.com/vocabulary/

Thanks for this info Barrius.

I already subscribe to the Textkit vocab service but will follow up the other two.

I heard that most beginners’ textbooks compile their vocabularies from a set of the most common words. You might be able to get more information from a publisher or something.

You are welcome. And please note - MORE vocabulary lists are in the works, as well as Greek. I’ve got two additional books I’d like to extract the vocabulary from & post into the vocabulary service. One book from 1919 (?) gathered its words from a NY state exam list IIRC.

And I’m currently working on a military vocab list from N & H “Latin Prose Composition” but I am very slow about it. :blush:

Back in 1939, Paul B. Diederich compiled and submitted to the faculty of the University of Chicago, for the partial completion of the requirements for his master’s or doctorate (I forget which), a 100-page or so book of Latin words with frequencies. He explained the method he used to compile it and, in the back, gave a selection of some 1400 words with translations grouped by theme as a beginner’s vocabulary. If I remember right, he only gave the stem of the words for pedagogical reasons of his own.

All that goes by way of introduction to this, A Dual-Source Database of Word Frequencies in Latin compiled by James H. Dee. It integrates the results of his work and that of another man, and presents the results in the form of a plaintext or Excel document. However, to get a list of the most frequent, I think you’ll have to do your own scraping, and for translation…well, you could see about dumping it through the Words program and capturing the output. If you can find a copy of the two sources for the database, you might do better to work with those; as I said, the Diederich one has a vocabulary in the back selected to cover some 85% of all word occurrences. (Well, I didn’t give a number before, but I believe it is somewhere around there. After that, the percent increase per word learned goes down a bit too much.)

The Diederich one was compared to the College Board vocabulary list given for the Latin test they must have been administering at the time, so some copies of that may be floating about, as well.
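
For anyone attempting the "do your own scraping" step on a frequency database like the one above, here is a rough Python sketch. It assumes the plaintext export can be reduced to one lemma and one count per line, separated by a tab; I have not checked the actual layout of the Dee file, so that column format and the function names (read_counts, core_vocabulary) are made up for illustration. It keeps the most frequent words until they cover a target share of all occurrences, such as the roughly 85% mentioned for the Diederich selection.

```python
def read_counts(text):
    """Parse 'lemma<TAB>count' lines (an assumed layout) into (lemma, count) pairs."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        lemma, count = line.split("\t")
        pairs.append((lemma, int(count)))
    return pairs


def core_vocabulary(pairs, target=0.85):
    """Return the most frequent lemmata whose counts together reach `target` coverage."""
    ranked = sorted(pairs, key=lambda p: p[1], reverse=True)
    total = sum(count for _, count in ranked)
    selected, running = [], 0
    for lemma, count in ranked:
        # Stop once the words chosen so far cover the target share of occurrences.
        if total == 0 or running / total >= target:
            break
        selected.append(lemma)
        running += count
    return selected


if __name__ == "__main__":
    # Toy data only; these counts are invented, not taken from any real list.
    sample = "et\t50\nmanus\t10\nin\t30\nrex\t10\n"
    print(core_vocabulary(read_counts(sample), target=0.85))
```

The resulting list of lemmata could then be fed to a dictionary program to pick up genitives, genders and meanings, though that step is not shown here.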

Not as slow as I am about approving any of them. Now I know where to go to verify!

Interesting! I downloaded it, and will look over it.

I have all the words in the AQA A-level word list in an Excel file with genitives, principal parts etc. Unfortunately I had to put it together myself.

It’s not the 1500 most used words (more like 1000), but it is, I think, enough to be getting on with.

I’ll post it as soon as BT get my broadband connection sorted out, which could be a week or so. I hope one can post attachments here, or that exceptions can be made.

I think AQA would have the copyright on that vocab list, meaning it can’t be mass distributed without their permission :frowning:

Let’s see if I can dredge up my knowledge of Intellectual Property from when I studied it at university. The principle in copyright law, in the US as well as in the UK, is that copyright is not available for the “mere sweat of the brow”. In one case for instance, a company tried to copyright a telephone directory. It failed because the work must involve a minimal level of creativity:

link

If you like I could:

(a) include numerous other words in the list, so as to transform it into a work of my own.

(b) ask AQA if they will grant permission… a long shot but it may work.

I think a critical factor will be that the words are merely listed in AQA’s document - no attempt is made to define them, which I concede definitely would raise the compilation into an original work.

Anyway, I’ve no desire to transform myself into a barrack-room lawyer, or to subject Textkit to legal action, so of course your decision as moderator is final.

Hmmm…

Reading the report of that case it seems adding extra words to the list would be quite sensible.

I’m reading the De Bello Gallico anyway, and I’ve made a spreadsheet wordlist for that. I think it would be wise to conflate the two. Copyright lawsuits can be nasty. Did you know for instance that they are one of the few circumstances in which English courts will award punitive damages? Nasty stuff.

And here’s another list to be getting on with…at least six weeks’ work I think…

link

oooh, the power!



mwahahahahahahahahahha




cough



I’m ok now. :slight_smile:

http://dekart.f.bg.ac.yu/~vnedeljk/TL/apropos/wordlist.html

Does anyone have this Dual-Source Database compiled by James H. Dee? Would appreciate a link, or a download somehow. Thanks.


https://web.archive.org/web/20120623093540/www.uic.edu/las/clas/LF_database.html

Here is a list of nearly 1000:

https://dcc.dickinson.edu/latin-core-list1

To show where this comes from:

https://dcc.dickinson.edu/vocab/core-vocabulary

And here are some other lists:

http://hiberna-cr.wikidot.com/downloads

Thanks, will.dawe.

I have cleaned up the file a bit and dropped it into a LibreOffice spreadsheet. Shared here:

https://www.4shared.com/s/f8DT1q0O-jq
https://www.4shared.com/s/fTb16jk6Cjq


Thanks, Hugh. I am familiar with these lists. I am particularly interested in the 1400 Words list, as it contains the lemmata from Gonzalez Lodge in digital format. The only place I have found them so far.