Introduction to Clumping and Wordgroups



Our minds automatically and unconsciously resolve ambiguities in sentences as we encounter them in order to settle upon a coherent interpretation. Most of the time this process succeeds, but ambiguity resolution is nowhere near as straightforward as it feels. In the phrase “the lamp near the painting of the house that was damaged in the flood”, was it the lamp that was damaged in the flood? Or was it the painting that was damaged? Or perhaps the painting is of a house that was damaged? Each of these interpretations is potentially valid, and the one you favor is determined by the way that you clump the words in the sentence into objects.

Clumping is the process by which two or more linked words come to form a unitary construct that acts and is acted upon as a whole. Clumps can be very diverse in their composition and in their role within a sentence. Their formation is functionally necessary to determine precisely which objects are affected by which modifiers in a given sentence.

Clumps are formed of articles, modifiers, and the object of modification. The most basic sort of clump is a simple article-noun combination. Adding a single adjective to that noun only marginally increases the complexity of the clump, but things quickly become complicated from there. Not only can the modifiers folded into a clump be numerous, they can also be entire phrases unto themselves. A prepositional phrase or a relative clause may take on the role of an adjective and serve as a single modifying unit. In this case, the phrase acting as a modifier is already a clump itself, resulting in multiple layers of nested clumps. For example, in the clump “the house on the side of the hill”, the prepositional phrase “on the side of the hill” is itself a clump (with smaller clumps embedded within it) that modifies “the house”. In “the house that Jack built with his winnings from gambling”, a prepositional phrase (“with…”) is embedded within a relative clause (“that…”), which acts as a clumped adjectival modifier to form an overarching clump with the base object (“the house”).
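One way to picture this nesting is as a small recursive data structure. The sketch below is only an illustration, not how Active Structure actually represents clumps: the `Clump` and `Phrase` classes and their field names are hypothetical, and only prepositional phrases are modeled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Clump:
    """Hypothetical clump: an article, modifiers, and the object of modification."""
    head: str                                                     # object of modification
    article: str = ""
    pre_modifiers: List[str] = field(default_factory=list)        # e.g. adjectives
    post_modifiers: List["Phrase"] = field(default_factory=list)  # phrases after the head

    def text(self) -> str:
        # Rebuild the surface words of the clump, including nested clumps.
        words = [self.article, *self.pre_modifiers, self.head]
        words += [m.text() for m in self.post_modifiers]
        return " ".join(w for w in words if w)

    def depth(self) -> int:
        # Number of clump layers, counting this one.
        return 1 + max((m.object.depth() for m in self.post_modifiers), default=0)


@dataclass
class Phrase:
    """Hypothetical prepositional phrase: a linking word plus the clump it introduces."""
    link: str
    object: Clump

    def text(self) -> str:
        return f"{self.link} {self.object.text()}"


# "the house on the side of the hill", built from the innermost clump outward
hill = Clump("hill", "the")
side = Clump("side", "the", post_modifiers=[Phrase("of", hill)])
house = Clump("house", "the", post_modifiers=[Phrase("on", side)])

print(house.text())
print(house.depth())
```

Here “the hill” sits inside “the side of the hill”, which in turn sits inside the overarching clump headed by “the house”, giving three layers in all.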


In the above example, the base object, “transfer”, goes on to form Clump 1 with the modifier “funds”. This clump is nested within a second clump when “electronic” goes on to modify the type of “funds transfer” that is happening. The hyphenated modifiers “same-institution” and “person-to-person” are themselves stand-ins for entire clumps, as becomes apparent if the sentence is rephrased to place them behind the base object (pictured below for “same-institution”).



A distinction must be made between created clumps and wordgroups (which include the class “open compound noun”). Clumps are objects that have been created for use within the context of a particular sentence, while wordgroups are stable units with their own independent entry that may function in many different contexts. In the case of clumps, all of the words involved have independent entries. They are consolidated in order to identify a multi-word unit that is affected by a modifier as a whole. This flexibility is an advantage, as the way that clumps form (and the role that they play within a sentence) is heavily dependent upon the grammatical context in which they occur.

Accounting for the grammatical dependencies of clumping can quickly become a complicated process. In “he put the money on the table”, the money is not on the table until he puts it there. Here we have the interaction of a verb, a noun, a preposition, and a noun. We could have “he put the money on the table in his pocket”, where “the money on the table” is a valid clump, but we still have the problem of linking a verb, a clump, a preposition and a clump. Though the words used are exactly the same in these two examples, in the former case “the money” and “the table” have not been clumped, while in the latter case they have. Interactions with other words and phrases in the sentence (in this case the prepositional phrase “in his pocket”) change the way that clumping will occur. Context thus plays an essential role in determining the way that object clumps will form in any given case.
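The two readings can be written out side by side to make the difference concrete. This is a minimal sketch with a hypothetical `PutFrame` record (not a structure from the original system) that only stores which words were grouped into the object and which into the destination.

```python
from typing import NamedTuple


class PutFrame(NamedTuple):
    """Hypothetical record of how the arguments of "put" were clumped."""
    subject: str
    obj: str          # the clump being moved
    destination: str  # the prepositional phrase saying where it ends up


def words(frame: PutFrame) -> list:
    # Flatten the frame back into the surface word sequence.
    return f"{frame.subject} put {frame.obj} {frame.destination}".split()


# "he put the money on the table": "on the table" is the destination
first = PutFrame("he", "the money", "on the table")

# "he put the money on the table in his pocket": "the money on the table" is one clump
second = PutFrame("he", "the money on the table", "in his pocket")

shared = words(first)
print(words(second)[:len(shared)] == shared)  # the opening words are identical
print(first.obj, "|", second.obj)             # but they are clumped differently
```

The opening words are the same in both sentences, yet “on the table” fills the destination slot in the first frame and is folded into the object clump in the second.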

Wordgroups too appear as multiple words and are acted on as a single object, but they possess their own distinct entry, and their meaning may be something other than what would be implied if the words involved were interpreted independently. For example, the meanings of “real estate”, “ice cream”, and “lip service” cannot be intuited from the definitions of the words making up these concepts. For this reason, each needs a separate entry, and that entry must be favored during processing over the potential meaning offered by the individual components. Additionally, wordgroups may be made up of combinations of words and other preexisting wordgroups. The features distinguishing a wordgroup from a regular clump are both its validity beyond a particular sentence (stable cross-context meaning) and its potential to convey something that does not directly follow from the individual definitions of the combined words.
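Favoring the wordgroup entry over its component words can be sketched as a longest-match lookup. The table and function below are hypothetical stand-ins (the glosses are illustrative, and this is not the actual lookup used by the system described here); the point is only that a matching multi-word entry is tried before falling back to single words.

```python
# Hypothetical wordgroup table; the glosses are illustrative only.
WORDGROUPS = {
    ("real", "estate"): "property consisting of land and buildings",
    ("ice", "cream"): "a frozen dessert",
    ("lip", "service"): "insincere verbal support",
}
MAX_LEN = max(len(key) for key in WORDGROUPS)


def group_tokens(tokens):
    """Merge known wordgroups into single units, trying the longest match first."""
    out = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, down to two words.
        for n in range(min(MAX_LEN, len(tokens) - i), 1, -1):
            candidate = tuple(t.lower() for t in tokens[i:i + n])
            if candidate in WORDGROUPS:
                out.append(" ".join(tokens[i:i + n]))
                i += n
                break
        else:
            # No wordgroup starts here, so keep the single word.
            out.append(tokens[i])
            i += 1
    return out


units = group_tokens("she sells real estate and ice cream".split())
print(units)
```

With this table, “real estate” and “ice cream” come out as single units while the remaining words stay separate.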

It should be noted that the list of wordgroups can be extended with commonly used clumps, so the distinction between the two should not be thought of as immutable. For example, it may be prudent to create a stable wordgroup for a technical term such as the “electronic funds transfer” clump pictured above if it is in common usage in a document of interest. In some technical areas, such as medicine, there are thousands of wordgroups that may not be used outside of that domain (or may mean something entirely different if they are). The ability to convert temporarily constructed clumps into stable wordgroups is valuable for simplifying future processing. The plasticity of Active Structure allows for both the flexible creation of word clumps and the modification of wordgroup lists to accommodate frequently encountered clumps.
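A rough sketch of that promotion step follows, assuming a simple frequency threshold. The function name, the threshold of 3, and the use of plain strings for clumps are all assumptions made for illustration; the text does not say how or when a clump should be promoted.

```python
from collections import Counter


def promote_frequent_clumps(observed_clumps, wordgroups, threshold=3):
    """Return the wordgroup set extended with clumps seen at least `threshold` times.

    The threshold is an assumed heuristic, not a rule taken from the text.
    """
    counts = Counter(observed_clumps)
    promoted = {
        clump for clump, n in counts.items()
        if n >= threshold and clump not in wordgroups
    }
    return wordgroups | promoted


observed = ["electronic funds transfer"] * 3 + ["the money on the table"]
updated = promote_frequent_clumps(observed, {"real estate"})
print(sorted(updated))
```

A clump that recurs in a document of interest, such as “electronic funds transfer”, joins the stable list, while a one-off clump like “the money on the table” remains a temporary construct.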
