Dictionary Domains


The Oxford English Dictionary (OED) tags senses with domains but does not define them; the reader is left to work out what each domain means, which is not easy.

Description: law
  Category: Noun
  Feature count: 1
   Feature 0: Subcategorization: AggregateForm
  Sense count: 4
   Sense 0
    Definition: the system of rules which a particular country or community recognizes as regulating the actions of its members and which it may enforce by the imposition of penalties
    Domain count: 1
     Domain 0: Law

In this case, the definition of the domain is the same as one of the definitions of the word.

Making the connection explicit would give more meaning to the objects in the domain, such as:
Description: abscond
  Category: Verb
  Feature count: 1
   Feature 0: Subcategorization: Intransitive
  Sense count: 1
   Sense 0
    Definition: leave hurriedly and secretly, typically to avoid detection of or arrest for an unlawful action such as theft
    Domain count: 1
     Domain 0: Law
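
One way to make the connection explicit can be sketched in Python. This is only an illustration, not our implementation: the class names, the DOMAIN_DEFINED_BY table and both helper functions are hypothetical, and the choice of sense 0 of "law" as the defining sense is taken from the entry shown earlier.

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    definition: str
    domains: list = field(default_factory=list)  # domain labels, e.g. ["Law"]


@dataclass
class Entry:
    word: str
    category: str
    senses: list


# Hypothetical link table: domain label -> (headword, sense index) that defines it.
DOMAIN_DEFINED_BY = {"Law": ("law", 0)}

lexicon = {
    "law": Entry("law", "Noun", [
        Sense("the system of rules which a particular country or community "
              "recognizes as regulating the actions of its members and which "
              "it may enforce by the imposition of penalties", ["Law"]),
    ]),
    "abscond": Entry("abscond", "Verb", [
        Sense("leave hurriedly and secretly, typically to avoid detection of "
              "or arrest for an unlawful action such as theft", ["Law"]),
    ]),
}


def domain_definition(domain, lexicon):
    """Return the definition of the sense that a domain label points at."""
    word, index = DOMAIN_DEFINED_BY[domain]
    return lexicon[word].senses[index].definition


def explain_domains(word, lexicon):
    """Map each domain label used by a word's senses to that domain's definition."""
    return {
        domain: domain_definition(domain, lexicon)
        for sense in lexicon[word].senses
        for domain in sense.domains
    }
```

With this link in place, looking up "abscond" yields not just the label Law but the definition that the label stands for, so the domain is no longer an undefined tag.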

We moved from the OED to Wiktionary, impressed by the number of words it handles and by the fact that a dictionary lookup can be processed locally, so unknown words encountered while parsing text can be integrated into the semantic network on the fly.

Wiktionary does not use the notion of domains, and it also does something very bad:

Description: law
  Category: Noun
  Feature count: 3
   Feature 0: Number: Singular                    XXX
   Feature 1: Mass: Countable                      XXX
   Feature 2: Mass: Uncountable                 XXX

This jumbles together “law” meaning the whole of law (uncountable) and “law” meaning a single law (countable). We will need to unjumble them so the distinction isn’t lost; the effort is worth it if it allows local processing of unknown words. Thousands of individual laws exist within the domain of law, and many of them need connecting to a library of other laws (the Crimes Act, for example).
 

Sense count: 13
   Sense 0
    Definition: The body of binding rules and regulations, customs, and standards established in a community by its legislative and judicial authorities

The definition of law as a domain
    Sub Sense count: 2
     Sub Sense 0
      Definition: The body of such rules that pertain to a particular topic
      Example count: 2
       Example 0: property law
A subdomain of law
   Sense 1
Sense 1 is a repeat of Sense 0
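
The unjumbling could look something like the Python sketch below. It is a minimal illustration under stated assumptions, not our actual code: the Reading class, split_by_countability, PARENT_DOMAIN and is_within are hypothetical names, and we assume that the uncountable reading names the domain while the countable reading names a single member of it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    word: str
    countability: str  # "Countable" or "Uncountable"
    role: str          # "domain" (all of law) or "member" (a single law)


def split_by_countability(word, features):
    """Create one Reading per Mass feature instead of merging them."""
    readings = []
    for name, value in features:
        if name != "Mass":
            continue
        # Assumption: the uncountable reading names the domain,
        # the countable reading names an individual member of it.
        role = "domain" if value == "Uncountable" else "member"
        readings.append(Reading(word, value, role))
    return readings


# Hypothetical parent links between domains, taken from the "property law" example.
PARENT_DOMAIN = {"property law": "law"}


def is_within(domain, ancestor):
    """Walk parent links to test whether a domain sits inside another."""
    while domain is not None:
        if domain == ancestor:
            return True
        domain = PARENT_DOMAIN.get(domain)
    return False


# The three features Wiktionary reports for the noun "law".
features = [("Number", "Singular"), ("Mass", "Countable"), ("Mass", "Uncountable")]
readings = split_by_countability("law", features)
```

Keeping the two readings as separate objects means an individual law can be attached to the domain as a member, and a subdomain such as property law can be attached to it as a child, without either being confused with the domain itself.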


There is a great deal of logic embedded in English to hold it all together, logic that users are mostly unaware of because it is handled by their Unconscious Mind. We have to implement all that logic explicitly so that the machine’s understanding mirrors the human’s (even though people are not consciously aware of the complexity and, never having had to think about it consciously, will often deny that it even exists).

This logic highlights the difference between a semantic approach and an LLM: the LLM does not understand a word, let alone the logic joining the words together.

