Searching between the fields in online lexica?

Is there a way to search for parts of the entries which are neither lemmata nor meanings in an online version of LSJ? I would like to be able to search for something as non-specific as “ii” and get heaps of nonsensical results, and then perhaps search within the returned results.

Both Perseus and TLG seem to be too efficient and clean in their categorisation of elements into data fields. I realise that the quality and consistency of data in lexica varies quite a bit for different pieces of information given in different entries, with lemmata and meanings being the most reliable, but I also want to search for a few things that do not belong in those two fields.

Would wildcards work, something like ii or perhaps enclosing your search string in quotation marks, “ii”?

Download the full text of the LSJ here and search in your favorite text editor.
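If a text editor gets unwieldy, the same idea can be sketched in a few lines of Python. This is only a rough illustration: the function names are made up for the example, and it assumes the download is a plain UTF-8 text file. The second call shows the "search within the returned results" step, since the output of one search can be fed straight into the next.

```python
from pathlib import Path


def load_numbered(path):
    """Read a UTF-8 text file and return a list of (line_number, text) pairs."""
    text = Path(path).read_text(encoding="utf-8")
    return list(enumerate(text.splitlines(), start=1))


def search_lines(lines, needle, ignore_case=False):
    """Return the (line_number, text) pairs whose text contains needle.

    `lines` can be the output of load_numbered() or of an earlier
    search_lines() call, which is how results get narrowed down.
    """
    if ignore_case:
        needle = needle.lower()
        return [(n, t) for n, t in lines if needle in t.lower()]
    return [(n, t) for n, t in lines if needle in t]


# Small in-memory sample so the sketch runs without any download.
sample = [
    (1, "II. ἀναδῆσαι τὴν πατριὴν ἐς ἑκκαιδέκατον θεόν, Hdt.2.143."),
    (2, "a line with nothing of interest"),
    (3, "a scientific name ending in -ii"),
]
broad = search_lines(sample, "ii", ignore_case=True)  # lines 1 and 3
narrow = search_lines(broad, "Hdt.")                  # line 1 only
```

For the real file, replace `sample` with `load_numbered(...)` pointed at wherever the download was saved.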

I get a lot of results when searching for “ii” at TLG’s full corpus textual search. Of course most of the results are inflected and not dictionary forms, and you need to have access to the full corpus search to get it. I don’t know if that would be useful for you? I get things like “διιπετέος”, “διιστᾶι”, “Εὐποιίας” and “Διί”.

An interesting experiment.

In TLG, with or without the wildcard markers, a search still only returns ὡροσκόπησις, apparently because of a corrupt or incorrect inclusion of the “II” in the meaning field.

In Perseus, one specifies where in the word a string should occur, rather than using wildcards. If just a double ii is specified, then only ἀναδέω is returned, presumably also because of misaligned data in this section:

  • II. ἀναδῆσαι τὴν πατριὴν ἐς ἑκκαιδέκατον θεόν trace one’s family to a god in the sixteenth generation, Hdt.2.143.

In Perseus, with the wildcard option, a search returns Latin genitives in scientific names, etc., i.e. words with “ii” included in one word of the definition.

Both attempts at wildcards confirm my belief that searching is limited to fields.

Perhaps a clearer example is that in LSJ, “Corinth” is included in the dictionary entry for ἄναξ, but it will not turn up if the English of the definitions is searched for “Corinth”, because “IG4.236 (Corinth)” is a type of data other than the definitional data.
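To make the field-versus-full-text distinction concrete, here is a toy sketch in Python. The XML snippet is invented for illustration: the element names (`entryFree`, `orth`, `sense`, `tr`, `bibl`) and the `key` attribute are my assumption about roughly how a TEI-encoded lexicon entry might look, not a copy of any real file, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented toy markup; tag and attribute names are assumptions.
SAMPLE = """<div>
  <entryFree key="anax">
    <orth>ἄναξ</orth>
    <sense n="I"><tr>lord, master</tr> <bibl>IG4.236 (Corinth)</bibl></sense>
  </entryFree>
  <entryFree key="toy-entry">
    <orth>toy</orth>
    <sense n="I"><tr>a placeholder entry</tr></sense>
  </entryFree>
</div>"""


def search_field(root, needle, tag="tr"):
    """Keys of entries where needle occurs inside one element type only."""
    return [
        entry.get("key")
        for entry in root.iter("entryFree")
        if any(needle in "".join(el.itertext()) for el in entry.iter(tag))
    ]


def search_full_text(root, needle):
    """Keys of entries where needle occurs anywhere in the entry's text."""
    return [
        entry.get("key")
        for entry in root.iter("entryFree")
        if needle in "".join(entry.itertext())
    ]


root = ET.fromstring(SAMPLE)
in_definitions = search_field(root, "Corinth")   # definition field only
anywhere = search_full_text(root, "Corinth")     # every field of the entry
```

In the toy data the definition-only search comes back empty, while the full-text search finds the ἄναξ entry through its citation, which is the behaviour I am after.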

That search capability of the lemmata is something that I will have to try in the future. Trying it in the public version only reveals data errors in ἑξᾰμηνιαῖος and φρυκτωρός.

Out of curiosity, in the full version is it also possible to search for, say, ϝάναξ or Κόρϝα from within their respective entries?

For me, for now, Perseus is capable of a similar search of the lemmata. In this case, though, the diairesis on the second iota of the LSJ entries is part of a long-standing indexing problem for characters with the diairesis: search results do not link up with the dictionary entries in the Perseus LSJ search, although they do for entries going to the Middle.

I can’t search for ϝ anywhere in TLG.

For the entries with ϝ, see this post: http://discourse.textkit.com/t/in-homer/14956/1

I no longer work there, but (a) I could have sworn we mapped ϝ back to Beta code for search (which the texts are underlyingly in); (b) if in doubt, try Beta Code. Digamma in Beta code is V.
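For anyone who would rather not transliterate by hand, here is a rough Python sketch that turns Unicode Greek into Beta Code for use in a search box. It is my own illustration, not TLG code: the function name is made up, capitals are written as `*` followed by any diacritics and then the letter, and anything outside the small tables (length marks, punctuation) is passed through unchanged.

```python
import unicodedata

# Greek base letters to Beta Code letters (digamma is V).
BASE = {
    "α": "A", "β": "B", "γ": "G", "δ": "D", "ε": "E", "ζ": "Z",
    "η": "H", "θ": "Q", "ι": "I", "κ": "K", "λ": "L", "μ": "M",
    "ν": "N", "ξ": "C", "ο": "O", "π": "P", "ρ": "R", "σ": "S",
    "ς": "S", "τ": "T", "υ": "U", "φ": "F", "χ": "X", "ψ": "Y",
    "ω": "W", "ϝ": "V",
}

# Combining diacritics to Beta Code signs.
MARKS = {
    "\u0313": ")",   # smooth breathing
    "\u0314": "(",   # rough breathing
    "\u0301": "/",   # acute
    "\u0300": "\\",  # grave
    "\u0342": "=",   # circumflex
    "\u0308": "+",   # diairesis
    "\u0345": "|",   # iota subscript
}


def to_beta_code(text):
    """Convert Unicode Greek to Beta Code (simplified sketch)."""
    chars = unicodedata.normalize("NFD", text)  # split letters from marks
    out = []
    i = 0
    while i < len(chars):
        ch = chars[i]
        lower = ch.lower()
        if lower not in BASE:
            out.append(ch)  # not a Greek letter we know: leave as-is
            i += 1
            continue
        # Collect the diacritics that follow this base letter.
        j = i + 1
        marks = ""
        while j < len(chars) and chars[j] in MARKS:
            marks += MARKS[chars[j]]
            j += 1
        if ch != lower:
            out.append("*" + marks + BASE[lower])  # capital letter
        else:
            out.append(BASE[lower] + marks)
        i = j
    return "".join(out)


digamma_word = to_beta_code("ϝάναξ")
```

With this, ϝάναξ should come out as `VA/NAC` and Κόρϝα as `*KO/RVA`.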

Indeed: a lot of the ii’s used to denote different meaning fields, for example, were TEI markup in the version of the LSJ that the TLG got from Perseus; and the TLG version reflects years of debugging the markup further, so it would have gotten even more “clean”. (Not perfect still, as Hekebolos reports!)

Joel is right: your best bet is to do a text search through a downloaded version of the marked up LSJ. The Morpheus distro from Perseus should include somewhere their original TEI version; it will not have the TLG corrections and improvements, but it will still have more searchable text than an archive.org OCR.