The preceding sections on the lexical database treat the structure of a lexicon while it is being elaborated. A dictionary bears a complex relation to the lexical database from which it is generated:
- Of the set of records of the lexical database, only a proper subset is exported into the dictionary. Those entries of the database that are to make up (the entry list of) the dictionary are selected according to criteria dealt with in the section on lemma selection. This is essentially done manually. Technically, a mark is set in a dedicated field of each selected entry. The field is of boolean type and means ‘print (y/n)’.
- Of the set of fields making up a record of the lexical database, only a proper subset is exported into the dictionary. For instance, the methodology fields in the microstructure are normally not printed. Those fields that shall make up (the microstructure of) a dictionary entry are selected.
- Homonym numbering is redone for the selected entries.
- The layout of the microstructure of the dictionary is defined. This concerns
  - the order of the elements of information,
  - the automatic addition of material, like numbers of multiple senses or ‘see:’ for cross-references,
  - the automatic subtraction of information, e.g. by abbreviating categorizations,
  - the typographic marking of the elements by font modifications (bold, italics, small capitals etc.), by reduction to subscript or superscript,
  - the separation of the elements by punctuation, parentheses and square, angled or curly brackets.

  To repeat, none of this material or markup appears in the lexical database.
- An export template is defined that produces the desired layout in the output file.
- In a multilingual (including bilingual) dictionary, the export template should mark the language of each element of an entry. This is needed, among other things, for correct automatic word division at line breaks (hyphenation).
- The selected entries are sorted according to a subset of their fields. For a general dictionary, relevant sorting criteria are:
  - the lemma,
  - the homonym number,
  - the part of speech.
- If the readings of a polysemous expression constitute database entries of their own, they must be combined into a complex entry when the dictionary file is compiled. See the corresponding section.
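The selection, field reduction, homonym renumbering and sorting steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`lemma`, `homonym`, `pos`, `gloss`, `method_note`, `print`), the sample records and the convention that homonym number 0 means "no homonym" are all hypothetical and not taken from any particular database system. Plain string comparison also stands in for the language-specific alphabetical order a real dictionary would need.

```python
from collections import defaultdict

# Hypothetical database records; all field names are assumptions.
DATABASE = [
    {"lemma": "bank", "homonym": 3, "pos": "n", "gloss": "financial institution",
     "method_note": "checked against corpus", "print": True},
    {"lemma": "bank", "homonym": 1, "pos": "n", "gloss": "river side",
     "method_note": "", "print": True},
    {"lemma": "bank", "homonym": 2, "pos": "v", "gloss": "to tilt",
     "method_note": "", "print": False},
    {"lemma": "apple", "homonym": 0, "pos": "n", "gloss": "a fruit",
     "method_note": "", "print": True},
]

# Fields exported to the dictionary; methodology fields are left out.
EXPORT_FIELDS = ("lemma", "homonym", "pos", "gloss")


def renumber_homonyms(entries):
    """Renumber homonyms among the exported entries, keeping their old order.

    Assumed convention: 0 means the lemma has no homonyms in the dictionary.
    """
    groups = defaultdict(list)
    for entry in entries:
        groups[entry["lemma"]].append(entry)
    for group in groups.values():
        group.sort(key=lambda e: e["homonym"])
        for number, entry in enumerate(group, start=1):
            entry["homonym"] = number if len(group) > 1 else 0


def export_entries(records, fields=EXPORT_FIELDS):
    """Select, reduce, renumber and sort records for the dictionary."""
    # 1. Keep only the records whose 'print' mark is set.
    selected = [r for r in records if r["print"]]
    # 2. Keep only the fields that belong to the printed microstructure.
    #    New dicts are built, so the database records stay unchanged.
    entries = [{f: r[f] for f in fields} for r in selected]
    # 3. Redo the homonym numbering for the selected entries.
    renumber_homonyms(entries)
    # 4. Sort by lemma, homonym number and part of speech.
    entries.sort(key=lambda e: (e["lemma"], e["homonym"], e["pos"]))
    return entries


if __name__ == "__main__":
    for entry in export_entries(DATABASE):
        print(entry)
```

In this sample, the verb "bank" is not marked for printing, so the two remaining homonyms are renumbered from 1 and 3 to 1 and 2, and the methodology note never reaches the output.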
Naturally one wants to automate these processes as much as possible. Even so, at the present state of the art of data processing, much manual work remains to be done.
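The layout step and the combination of readings are among the parts that lend themselves to automation. The following sketch shows one possible form of an export template, written as a Python function. It makes several assumptions that are not part of the text above: HTML-like tags (`<b>`, `<i>`, `<sup>`) stand in for whatever markup the typesetting system expects, readings are taken to belong together when they share lemma, homonym number and part of speech, and the field names and the abbreviation table are hypothetical.

```python
from html import escape

# Hypothetical abbreviations for part-of-speech categories.
POS_ABBREVIATIONS = {"noun": "n.", "verb": "v.", "adjective": "adj."}

# Hypothetical readings stored as separate database entries.
READINGS = [
    {"lemma": "bank", "homonym": 1, "pos": "noun", "gloss": "river side",
     "see": "shore"},
    {"lemma": "bank", "homonym": 1, "pos": "noun", "gloss": "slope"},
]


def merge_readings(readings):
    """Combine readings into complex entries with a list of senses.

    Assumption: readings sharing lemma, homonym number and part of
    speech belong to the same polysemous expression.
    """
    merged = {}
    for r in readings:
        key = (r["lemma"], r["homonym"], r["pos"])
        if key not in merged:
            merged[key] = {"lemma": r["lemma"], "homonym": r["homonym"],
                           "pos": r["pos"], "senses": [], "see": r.get("see")}
        merged[key]["senses"].append(r["gloss"])
    return list(merged.values())


def render_entry(entry):
    """Produce the printed form of one entry; none of the added
    material or markup is stored in the database itself."""
    # Lemma in bold, homonym number as superscript (omitted when 0).
    head = f"<b>{escape(entry['lemma'])}</b>"
    if entry["homonym"]:
        head += f"<sup>{entry['homonym']}</sup>"
    # Categorization abbreviated and set in italics.
    pos = POS_ABBREVIATIONS.get(entry["pos"], entry["pos"])
    # Sense numbers are added only if there is more than one sense.
    senses = [escape(s) for s in entry["senses"]]
    if len(senses) > 1:
        senses = [f"{n}. {s}" for n, s in enumerate(senses, start=1)]
    text = f"{head} <i>{escape(pos)}</i> " + "; ".join(senses)
    # 'see:' and the parentheses are added around a cross-reference.
    if entry.get("see"):
        text += f" (see: {escape(entry['see'])})"
    return text + "."


if __name__ == "__main__":
    for entry in merge_readings(READINGS):
        print(render_entry(entry))
```

Keeping the order of elements, the added material and the punctuation inside the rendering function means that the layout can be changed without touching the database records.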