The most useful tool is probably the vocabulary feature on Perseus, which allows you to create word frequency lists for books, or even chapters, and to customize the information it gives you. I’ve never tried to do a frequency list for their entire Grk corpus.
There are also books available. There is one for Homeric Grk by Owen and Goodspeed, called Homeric Vocabularies. And I think I have a photocopy somewhere of a list of the most frequent classical words, which was more of a pamphlet than a book. Neither of these gives the freq number for each word, but instead lumps them into categories like “10-50 times”, “50-100 times”, etc., and arranges them alphabetically within each category.
swiftnicholas, are you positive that the photocopy you have is that of a Classical Greek frequency list? It sounds very reminiscent of some books that deal with NT Greek.
I have played with this feature at Perseus, but was never too happy with it, if I remember correctly.
Yes, I’m sure that it is for Classical Greek, but I forget exactly how it is arranged. I don’t think it gives frequency numbers for each word, but it might have broad categories; in any case it’s a list of the most common words. I’ll dig it up and give you more details.
You reminded me of another small book though, by Bruce Metzger, something like “Lexical Aids for Students of NT Greek”. Perhaps this is what you have in mind?
I found the photocopy. It’s just a short pamphlet that lists the 2500 most frequent words in alphabetical order. It doesn’t indicate any frequency other than that. But I discovered that I actually have two photocopies, so if you think it’s something that would interest you, I would be glad to send you one; contact me with a private message.
It was published in 1962 by “Thrift Publishing”, a small press in Ithaca, New York, so I’m not sure about its availability.
swiftnicholas, thanks for the offer. I might take you up on it. I just want to make sure they don’t have this book at the library. What is the exact title?
This is the short preface at the beginning: "This Compact Greek Vocabulary contains about 2500 of the most frequently used words of Classical Greek. For the benefit of a wider audience some words of NT Greek have been included. Special attention has been paid to important word families. It is hoped that this small treatise proves useful to the study of a much neglected subject. To stress certain linguistic relationships, the alphabetical order has been ignored occasionally."
Like I said before, it doesn’t break down the frequency within the list. It is 24 pages long, and it has a page for numerals at the back, and four pages listing important stems.
It might be available in libraries; I’m not really sure. If you can’t find it and want it from me, send me a private message.
Hi, I know this is an extreme long-shot given this was posted nearly 20 years ago, but I’m very interested in collecting information on Greek pedagogy and this list sounds valuable while being extremely hard to obtain! If by any chance you still have a copy, please do let me know.
How does one realistically use this? I just tried to generate one for Epictetus and it gives ALL instances of every word… what am I missing? I don’t want a list with 15,000 instances of “kai”. How do you run it so that it only shows one copy of each word?
Just for fun, I thought I’d see if ChatGPT could do it for me. Here was its response:
Generating a precise list of Greek vocabulary words from the works of Epictetus, ordered from most frequent to least frequent, requires access to his complete works and the ability to analyze them thoroughly. Unfortunately, I don’t have the capability to access specific texts or perform corpus analysis directly.
However, some common Koine Greek words found in the works of Stoic philosophers like Epictetus include:
ἐπιθυμία (epithymia) - desire
ἀπάθεια (apatheia) - freedom from passion
φύσις (physis) - nature
ἀρετή (aretē) - virtue
πάθος (pathos) - passion
διανοία (dianoia) - understanding
αἴσθησις (aisthesis) - sensation
ἀγαθός (agathos) - good
προαίρεσις (proairesis) - moral choice
λόγος (logos) - reason or rational principle
To obtain a more comprehensive and accurate list, you may need to consult academic resources or linguistic studies that have analyzed the vocabulary of Epictetus’ works.
Make sure ‘Word or phrase’ is selected. Enter Epictetus in the author field under the header ‘Find Texts in the Collection by Author or Title’. Leave the other fields blank. Then hit enter.
This should generate lists of lemmata and word forms and their frequencies for each work. If you want to combine some or all of them, you can try pasting them into a single sheet in Excel and then use conditional formatting to identify all duplicates. These can be quickly selected for deletion, etc. by sorting your column on the cell properties you defined in your conditional rule.
Sounds complicated, but it’s actually very quick and easy.
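For anyone who would rather script the merge than do it in Excel, here is a minimal sketch in Python. Note that it does something slightly different from the conditional-formatting approach: instead of flagging duplicates for deletion, it adds up the counts for each lemma across the lists. The function name and the sample counts are made up for illustration; you would have to get the real per-work lists out of the tool yourself.

```python
from collections import Counter


def merge_frequency_lists(lists):
    """Combine several (lemma, count) lists into one, summing counts per lemma."""
    totals = Counter()
    for rows in lists:
        for lemma, count in rows:
            totals[lemma] += count
    # most_common() returns (lemma, total) pairs sorted by descending total
    return totals.most_common()


# Hypothetical counts for two works, for illustration only
discourses = [("καί", 120), ("λόγος", 15), ("φύσις", 9)]
enchiridion = [("καί", 30), ("φύσις", 4), ("ἀρετή", 2)]

merged = merge_frequency_lists([discourses, enchiridion])
for lemma, total in merged:
    print(lemma, total)
```

Each lemma then appears once in the merged list, with its combined count.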
Alternatively, you can generate one list based on all his works by clicking on ‘frequency tables’ at the top of the page (or use this link: https://anastrophe.uchicago.edu/perseus/GreekFrequency.html). Enter Epictetus as your author, then specify ‘Lemma’ in the ‘Search by’ dropdown box.
I’ve mentioned both methods, as some may want lists per work or per author, depending on what they’re reading, have already read, etc.
Copy & paste dump of the top 50% by weighted frequency from the Perseus tool -
ὁ 16,863 4,639 5887.98 5887.98 0.077 the following LSJ, Middle Liddell, Slater, Autenrieth
εἰμί 17,120 1,458 2187.5 8075.48 0.087 sum LSJ, Middle Liddell, Slater, Autenrieth
οὗτος 1,654 1,594 1609 9684.48 0.0101 this LSJ, Middle Liddell, Slater, Autenrieth
τίς 14,800 0 1588.48 11272.96 0.0752 who? which? Middle Liddell, Slater, Autenrieth
δέ 1,561 1,561 1561 12833.96 0.0071 but LSJ, Middle Liddell, Slater, Autenrieth
αὐτός 1,460 1,138 1218.5 14052.46 0.0074 self LSJ, Middle Liddell, Slater, Autenrieth
τις 12,990 0 1135.98 15188.45 0.0727 any one, any thing LSJ, Middle Liddell
τις 12,990 0 1135.98 16324.43 0.0727 anyone, anything Slater
καί 15,648 0 978 17302.43 0.0795 and now LSJ
καί 15,648 0 978 18280.43 0.0795 and indeed, and further LSJ
καί 15,648 0 978 19258.43 0.0795 and LSJ, Autenrieth
καί 15,648 0 978 20236.43 0.0795 and so LSJ, Middle Liddell, Slater
ἐγώ 2,026 792 972.67 21209.1 0.024 I at least, for my part, indeed, for myself LSJ, Middle Liddell, Slater, Autenrieth
σύ 2,981 323 805.25 22014.35 0.0229 thou LSJ, Slater, Autenrieth
οὖν 850 764 785.5 22799.85 0.0096 certainly, in fact LSJ, Middle Liddell, Slater, Autenrieth
ὅς 8,251 0 750.58 23550.43 0.0419 yas, yā, yad, LSJ, Slater, Autenrieth
ὅς 8,251 0 750.58 24301.01 0.0419 his, her, LSJ, Middle Liddell, Slater, Autenrieth
γάρ 740 740 740 25041.01 0.0045 for LSJ, Middle Liddell, Slater, Autenrieth
πρός 1,157 596 599.03 25640.04 0.0089 on the side of, in the direction of, LSJ, Middle Liddell, Slater, Autenrieth
ἄλλος 4,639 0 557.2 26197.24 0.0284 other, another Slater
ἄλλος 4,639 0 557.2 26754.44 0.0284 y LSJ, Middle Liddell, Autenrieth
μέν 490 490 490 27244.44 0.0027 indeed, of a truth LSJ, Middle Liddell, Slater, Autenrieth
ἐθέλω 484 470 473.5 27717.94 0.025 to be willing LSJ, Middle Liddell, Slater, Autenrieth
σός 2,411 45 446 28163.94 0.0965 thy, thine LSJ, Middle Liddell, Slater, Autenrieth
οὐδείς 431 431 431 28594.94 0.007 not one LSJ, Middle Liddell, Slater, Autenrieth
περί 405 405 405 28999.94 0.005 round about, all round LSJ, Middle Liddell, Slater, Autenrieth
ἐκεῖνος 532 340 388 29387.94 0.0055 the person there, that person LSJ, Middle Liddell, Autenrieth
οὕτως 381 381 381 29768.94 0.0053 in this way LSJ, Middle Liddell
γίγνομαι 427 363 378.17 30147.11 0.0035 come into a new state of being LSJ, Middle Liddell, Autenrieth
ποιέω 889 285 351.07 30498.17 0.0096 make LSJ, Middle Liddell, Autenrieth
μής 4,382 0 308.33 30806.51 0.029 [unavailable] LSJ
ἐπί 1,220 0 305 31111.51 0.0094 being upon LSJ
ἐπί 1,220 0 305 31416.51 0.0094 on, upon with gen., dat., and acc. Middle Liddell, Slater, Autenrieth
ὅστις 2,518 24 296.48 31712.98 0.0259 that LSJ, Middle Liddell, Slater, Autenrieth
εἶμι 5,709 0 280 31992.99 0.0378 to go Middle Liddell
εἶμι 5,709 0 280 32272.99 0.0378 ibo LSJ, Slater, Autenrieth
ἄνθρωπος 585 220 280 32552.99 0.0152 man, LSJ, Middle Liddell, Slater, Autenrieth
ἀλλά 3,377 0 279.78 32832.77 0.0276 otheruise LSJ, Middle Liddell, Slater, Autenrieth
ὅτι 2,494 0 272.48 33105.25 0.0362 that LSJ, Middle Liddell, Slater, Autenrieth
ὅτι 2,494 0 272.48 33377.73 0.0336 that Middle Liddell, Autenrieth
εἰς 2,971 0 272.17 33649.89 0.0151 into LSJ, Middle Liddell, Slater, Autenrieth
ἑαυτοῦ 278 270 272 33921.89 0.0063 Stadtrecht von Gortyn LSJ, Middle Liddell
ὅταν 262 262 262 34183.89 0.0207 whenever LSJ, Middle Liddell, Slater, Autenrieth
μή 4,124 0 257.75 34441.64 0.0402 mā´ LSJ, Middle Liddell, Slater
μή 4,124 0 257.75 34699.39 0.0402 lest perchance LSJ, Autenrieth
μή 4,124 0 257.75 34957.14 0.0402 [unavailable] LSJ
ἔχω 2,465 0 244.97 35202.11 0.0163 check LSJ, Autenrieth
ἔχω 2,465 0 244.97 35447.08 0.0163 bear, carry, bring LSJ, Middle Liddell, Slater
ἐάν 3,519 0 237.3 35684.38 0.0233 if haply, if LSJ, Middle Liddell
ἵνα 218 218 218 35902.38 0.0148 in that place, there LSJ, Middle Liddell, Slater, Autenrieth
ἀγαθός 335 173 213.5 36115.88 0.0124 good: LSJ, Middle Liddell, Slater, Autenrieth
δύναμαι 580 146 212.83 36328.71 0.0194 to be able, strong enough LSJ, Middle Liddell, Slater, Autenrieth
τοιοῦτος 212 212 212 36540.71 0.006 such as this, LSJ, Middle Liddell, Slater, Autenrieth
ἐν 1,782 3 200.67 36741.38 0.0109 in, into. LSJ, Middle Liddell, Slater, Autenrieth
κακός 498 110 193 36934.38 0.0176 bad LSJ, Middle Liddell, Slater, Autenrieth
πολύς 197 181 183.33 37117.71 0.0017 many, LSJ, Middle Liddell, Slater, Autenrieth
λέγω 1,895 1 182.92 37300.63 0.0235 to say, speak Middle Liddell
λέγω 1,892 1 182.58 37483.21 0.0245 pick up LSJ, Middle Liddell, Slater
μόνος 374 150 180.92 37664.13 0.0066 alone, solitary LSJ, Middle Liddell, Slater
λέγω 1,888 0 180.42 37844.54 0.0224 lay LSJ, Middle Liddell, Autenrieth
οἶδα 520 130 176.42 38020.97 0.0133 to know LSJ, Middle Liddell, Slater, Autenrieth
ἄν 2,816 0 176 38196.97 0.0245 he came, LSJ, Middle Liddell, Slater, Autenrieth
οἷος 176 176 176 38372.97 0.0055 such as, of what sort, LSJ, Middle Liddell, Autenrieth
ἄν 2,816 0 176 38548.97 0.026 LSJ, Middle Liddell, Autenrieth
ἀνά 2,816 0 176 38724.97 0.026 on board LSJ, Middle Liddell, Slater, Autenrieth
ὑπό 209 173 175.25 38900.22 0.0017 úpa LSJ, Middle Liddell, Slater, Autenrieth
εἶπον 235 139 163 39063.22 0.0048 said LSJ, Middle Liddell, Slater, Autenrieth
ἔτι 161 158 158.33 39221.55 0.0036 yet, still LSJ, Middle Liddell, Slater, Autenrieth
ἔρχομαι 207 148 156.5 39378.05 0.0098 ibo LSJ, Middle Liddell, Slater, Autenrieth
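A rough sketch of how a "top 50%" cutoff like this can be computed, in Python. I am assuming the cutoff works on a running total of the weighted frequencies (the fifth column above looks like one); I have not confirmed that this is how the Perseus tool actually does it. The function name and the sample numbers are made up.

```python
def top_share(rows, share=0.5):
    """Return the leading rows whose cumulative weight first reaches `share` of the total."""
    total = sum(weight for _, weight in rows)
    # Rank from most to least frequent before accumulating
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    selected, running = [], 0.0
    for lemma, weight in ranked:
        if running >= share * total:
            break
        selected.append((lemma, weight))
        running += weight
    return selected


# Hypothetical weighted frequencies, for illustration only
sample = [("ὁ", 50.0), ("καί", 30.0), ("δέ", 15.0), ("γάρ", 5.0)]
top_half = top_share(sample)
top_eighty = top_share(sample, share=0.8)
```

With these made-up numbers, the single most frequent word already covers half of the total, which is why a 50% list can be so short relative to the full vocabulary.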
It’s lemmatized. You get multiple entries for καί because there are multiple dictionary entries for καί.
This tool is what the other one is based on (Perseus under Philologic), and it has more functionality and is more flexible. In your case, you can get a frequency list for all Epictetus works in one go rather than separate lists for different works as on Philologic.
EDIT: I massively underestimated the capabilities of the Philologic tool, thanks to the user below for pointing this out. I have little experience with it so my comparison is likely wrong. But the point that Perseus’ list also works still stands.
What are you basing this on? Philologic is far more sophisticated and offers the user ways of searching the Perseus data that aren’t remotely possible elsewhere (e.g. on the Perseus website at Tufts or Scaife). A random example – find all instances of the aorist optative spoken by Neoptolemus in the Philoctetes.
The frequency lists generated by Philologic are also far more accurate since the data have been disambiguated, whereas the Perseus vocabulary tool at Tufts very often has to make guesses (cf. the weightings it generates), resulting in large numbers of lemmata that don’t actually occur in a given text or author, or that are reported at far higher frequencies than they actually occur.
My advice is to read the manual.
? I explained how to do exactly this in Philologic in my first post in this thread:
I add that it’s customary for new members to introduce themselves on this forum before posting as a matter of courtesy.
I was being hyperbolic, not literal (I thought that would be obvious). When I tried to generate a list from the Perseus site I was getting many repeated words–perhaps I was doing it wrong. But my point is that I don’t want to have to go through the list and manually eliminate all of the duplicates for every single word. I want a list that has each word listed once, not multiple times.
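If the goal is just one row per word from a pasted dump like the one above, a small script can collapse the repeats. This is a minimal sketch in Python, with two assumptions of mine: that the first whitespace-separated field on each line is the headword, and that the fourth field is the weighted frequency (inferred from the sample rows, not from any documentation). It keeps the first row for each headword and drops the rest, so genuine homographs such as the separate λέγω entries get collapsed too. The function name is made up.

```python
def dedupe_by_headword(lines):
    """Keep only the first row for each headword in a pasted frequency dump."""
    seen = set()
    kept = []
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        headword = fields[0]
        if headword in seen:
            continue  # later dictionary entries for the same headword
        seen.add(headword)
        # Assumed column layout: headword, max, min, weighted, ...
        weighted = float(fields[3].replace(",", ""))
        kept.append((headword, weighted))
    return kept


# Three rows copied from the dump earlier in the thread
rows = [
    "καί 15,648 0 978 17302.43 0.0795 and now LSJ",
    "καί 15,648 0 978 19258.43 0.0795 and LSJ, Autenrieth",
    "γάρ 740 740 740 25041.01 0.0045 for LSJ, Middle Liddell, Slater, Autenrieth",
]

unique_rows = dedupe_by_headword(rows)
for headword, weighted in unique_rows:
    print(headword, weighted)
```

Because the comparison is on the exact headword string, τίς and τις stay separate, which is probably what you want.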