Decreasing the Effect of Verbal Noise in Analyzing Cognitive Activity of a Design Process

In studying cognitive activity in design, it is common practice to use designers' verbalizations during a design process to elicit the reasoning behind design actions. These verbalizations are segmented to enable a quantifiable analysis of the cognitive processes. Researchers have shown how Shannon's entropy can be applied to coded verbal data to provide a measure of the creativity of those processes. We applied this method in a pilot study investigating the effects of different design tools on creativity in the context of architectural design. Participants had to complete three design tasks of isomorphic nature, each with a different tool, in one design session. The results showed that a significant number of verbal comments were repetitions of already established ideas. Such comments brought nothing new to the sequence of activities but affected the value of information carried within that process, which biased the measure of creativity. The paper regards these utterances as verbal noise. It proposes the use of corpus linguistic tools, together with a coding scheme that can depict the hierarchical relationship of cognitive patterns used in the process, to eliminate verbal noise from analysis. The method was applied to one participant's data, and the outcome is a promising step towards increasing the veracity of using verbal data in analyzing cognitive activity.


INTRODUCTION
In a protocol study, verbalizations are segmented based on the intention of the designer and then coded using a coding scheme. Each segment represents an idea, and because design is known to be a reflection-in-action process, an idea may reoccur in different forms at different points within the process. The semantic connection between such occurrences leads to a linked network of ideas. In any cognitive system, all of the segments (ideas) have the potential to be linked together, depending on conditions such as external stimuli or the characteristics of the designer.
Nevertheless, for every given idea, the more links that can be made to subsequent ideas, the richer that system is in terms of idea generation. Likewise, the more able the system is to link distant segments (ideas) together, the greater the integration of links and the cohesiveness of the creative system. Kan and Gero [6] use the term forelinks to describe the former and horizonlinks for the latter, and suggest that a creative cognitive system is one that reaches an equilibrium between the number of generated ideas and their cohesiveness. In this context, segments represent information, each of which bears a different value for that system. Kan and Gero therefore suggest employing Shannon's entropy to measure the sum value of these different link categories as a measure of creativity. Accordingly, if every idea is linked to every other idea, then the outcome of that system is known and there is no degree of uncertainty, so the entropy measures 0. This is also the case, however unlikely, where an idea fails to make any connections. The maximum entropy of 1 is reached when each segment makes half of its potential links [6].
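The boundary conditions described above (entropy 0 when all or none of the potential links are made, and 1 when half are made) are those of a binary Shannon entropy over the linked and unlinked states. The following is a minimal sketch under that assumption, not a reproduction of the software used in the study; the function names and the representation of links as (earlier, later) index pairs are hypothetical.

```python
import math


def binary_entropy(p_linked: float) -> float:
    """Shannon entropy (in bits) of a linked/unlinked outcome."""
    if p_linked <= 0.0 or p_linked >= 1.0:
        return 0.0  # outcome is certain, so there is no uncertainty
    p_unlinked = 1.0 - p_linked
    return -(p_linked * math.log2(p_linked) + p_unlinked * math.log2(p_unlinked))


def forelink_entropy(links: set, segment: int, n_segments: int) -> float:
    """Entropy of the links one segment makes to the segments after it.

    `links` holds (earlier, later) index pairs; this layout is an assumption.
    """
    potential = n_segments - 1 - segment  # number of later segments
    if potential == 0:
        return 0.0
    made = sum(1 for (a, b) in links if a == segment and b > segment)
    return binary_entropy(made / potential)


# Segment 0 links to 2 of its 4 possible successors: half of its potential links.
example_links = {(0, 1), (0, 2)}
half_linked = forelink_entropy(example_links, segment=0, n_segments=5)
```

In this sketch a repeated comment adds one more pair to `links`, which shifts the proportion of links made and therefore the entropy, regardless of whether the comment carries a new idea.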
When a certain idea reoccurs, or a semantically related idea is generated, a link with the original idea is made and the entropy of the cognitive system increases. So a verbal comment, however few links it makes, can affect the measurement of creativity. Imagine that you are asked to solve a problem. If it is your first encounter with such a problem, you may not find the answer, and you will often find yourself exhausted after many seemingly discrete trials and errors of ideation. However, if you have adequate familiarity with the concepts involved, then even though you may never have solved the problem before, you tend to be able to provide a more coherent presentation of your answer in the same time frame. In the second case, you would probably provide more verbal comments, as you are less engaged with the actual solving of the problem. You may find yourself repeating yourself unnecessarily, in an effort to show that you are producing thought. Such utterances are deemed noise in this paper. Now consider two processes for which similar patterns in the entropy of the cognitive system can reasonably be expected, but whose graphs of dynamic change indicate otherwise. The viability of our proposed method for eliminating noise can then be assessed by how closely the graphs of dynamic entropy of the two processes map onto each other.

PILOT PROTOCOL STUDY
A pilot study was conducted to compare the effect of three different design tools (Rhinoceros 3D and SketchUp as Computer Aided Design tools, and freehand sketching) on the distribution of cognitive activity and creativity for two architecture students with similar problem solving approaches. Each student engaged in three successive design tasks of isomorphic structure in one design session, with a single tool assigned to each task, and was required to think aloud.
The analysis proceeded by using the Function-Behavior-Structure coding scheme (developed by Gero) [3] to code the data, and then by linking the segments twice and arbitrating the results using the Delphi method. The linkoder [4] software was then used to calculate the mean dynamic entropies for each process. As shown in figure 1, in comparing the mean dynamic entropies of the different tasks, both students displayed a significant drop in forelink entropy in the last task but reached their highest horizonlink entropy in this task.
Further examination of the processes shows parallel graphs for the two students in the change from the second task to the third, although each student used a different tool in the third task. This indicates that as the project develops, the students become less reliant on the tool and more reliant on their mental imagery to shape a concept. By this stage, therefore, many of the students' verbalizations were a result of previously conceived ideas. In other words, the outcomes of the third task revealed the learning effect that successive tasks have on one another.
Moreover, the approach and succession of reasoning in the final task seemed to closely resemble that of the brief reading stage. Some of the verbal comments were identical to ideas provided earlier on, and in some instances they were repeated more than a couple of times within the course of the final task. An example is student A's comment: "but it was like towards the[..]creation of mirrors, you will reflect all the flames, like onto the pavilion", "but it was like towards the[..]creation of mirrors, you will reflect all the flames, like onto the pavilion", "and then you would have like a couple of mirrors. All the inside of these walls that will be reflecting like all these flames." With regard to creativity and idea generation when designing with tools such as sketching or with mental imagery alone, Bilda et al. [2] found no significant difference between the two. Their study was conducted with professional designers. Other studies indicate that experienced designers tend to approach design from their preconceived ideas, in a breadth-first, depth-later manner [8,10], so that the design tool acts as a peripheral aid to cognition. On a smaller scale, a similar conclusion can be drawn for less experienced designers such as students, who gather experience during the course of a task and become less dependent on their external tools in design thinking as the task proceeds.
The expectation was therefore to see a similar pattern in the structure of the forelink and horizonlink entropy graphs between the brief reading stage and the final task, yet this was not the case. It was therefore concluded that a considerable number of the verbal comments were noise.

PROPOSITION FOR REDUCING VERBAL NOISE
In order to reduce the amount of verbal noise, two requirements were recognized: 1) to identify semantically similar ideas using AntConc [1], a concordance tool used in corpus linguistic analysis; 2) to investigate their significance in the succession of events, by considering their pattern structure and their role in the distribution of cognitive information across the internal and external constructs of the cognitive system.
AntConc enables the extraction of keywords from a reference corpus in order to distinguish between semantic concordances within a target corpus or text. For this research, the verbal comments of the brief reading stage formed the reference corpus and those from the final task formed the target corpus. The method for eliciting pattern structure was derived by tailoring, for design processes, Zhang and Norman's [12] method for analyzing the effect of representations in distributed cognitive tasks. The proposed method enabled the decomposition of the cognitive process into its internal and external components, so that the different functions of internal and external representations could be identified. It also illustrated the hierarchical relationship between the patterns and the ideas derived from them, and allowed for an analysis parallel to that done via the F-B-S coding scheme.
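To make the notion of a concordance "hit" concrete, the sketch below lists every occurrence of a keyword in a target text together with a few words of context on either side (a keyword-in-context view). It is a simplified stand-in for what a concordance tool displays, not AntConc's implementation; the tokenizer, the function names and the sample sentence are all hypothetical.

```python
import re


def tokenize(text: str) -> list:
    """Lowercase the text and split it into word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def concordance(tokens: list, keyword: str, window: int = 4) -> list:
    """Return (position, left context, keyword, right context) for each hit."""
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((i, left, token, right))
    return hits


# Invented sample text, used only to show the shape of the output.
target_tokens = tokenize("I can see something dark, you see?")
see_hits = concordance(target_tokens, "see", window=2)
```

Each hit can then be traced back to the coded segment that contains it, which is the step at which the pattern structure of the segment is examined.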

Distributed cognition and representational analysis
Distributed cognition theory describes cognition as an aggregation of functions constructed by the internal and external constructs of a cognitive system [5]. On this basis, Zhang and Norman argue that the internal and external representational spaces together form a distributed representational space, which is the representation of the abstract task space. Tasks which are isomorphic (structured the same but represented differently) will have a similar structure when analyzed at the level of the abstract task space, but the distribution of patterns across the two spaces and their integration may differ. To analyze this, they show that the information a person has to remember to solve a problem resides in the internal task space, whereas the information in the external task space is that which is implicitly derived from the internal information or which the problem solver produces unconsciously.

The cognitive pattern structure of a design process
In design studies, a brief may provide the initial set of information needed for problem solving, but a considerable amount of information is produced as a result of evaluation and reflection in action. This is demonstrated by studies that use coding schemes which categorize coded segments into new and old [11]. The modification and regeneration of ideas denotes a hierarchical order for the cognitive patterns contributing to an idea. Similar to pattern recognition in the neocortex (refer to [9]), information in the brief acts as a corpus of primary patterns, which prepares the mind for the recognition of higher level patterns that are combinations of the primary ones. The more a particular pattern is used and the more the designer becomes conscious of it, the higher the possibility of its contribution to higher level patterns. The patterns which have settled in the conscious part of memory can be communicated verbally.
Therefore, in the pilot study conducted, a keyword generator was used to extract the main primary patterns at the center of the designer's attention in the brief reading stage. What the designer takes directly from the brief and tries to memorize forms the internal patterns, and what the designer infers from it, depending on other experiences, forms the external ones. Consequently there are three groups of patterns: primary, modified and generated. So where other coding schemes code at the level of the artifact in production and analyze the structure of the abstract task space, this research's method of coding provides an insight into the distribution of patterns and the cognitive complexities involved in producing the artifact. Table 1 describes the different pattern categories used in this research.

Table 1. Categories of a pattern of cognition in design

Reducing verbal noise for student A's final task
Student A's verbal comments from the brief reading stage were fed into AntConc as a reference corpus, and a series of keywords was produced based on the likelihood of their use. Amongst these, keywords which referred to functional, behavioral or structural considerations of the design were extracted. These words, in the order of their likelihood, are [sketchup, stuff, flames, see, mirror, corner, floor, opening, slab, wall, center, glass, geometry, material, organic, weaving, stairs, triangle, bricks, comfort, covered, movement, reflection and window]. The concordances for each keyword were then examined. Figure 2 shows that the word "see", as an example, had 23 hits in the final task. Hits 9 and 10 are the student's comments about "seeing something dark in the inside". Both of these hits relate to segments which had received the same F-B-S code (Bs). Each segment was analyzed in relation to the pattern structure of its preceding and succeeding segments. Both segments (hits) use the same distribution of internal-external patterns (IRm9+IRm14/ERpe17). The pattern IRm14 was newly generated in hit 9, and that segment affects the pattern structure of the segment that follows it. However, hit 10 uses no new pattern and has no effect on the next segment. In the case of hit 10, IRm14 was also used in the preceding segment, so the reuse of that pattern combination was expected and added no significant value to the succession of events; hence it was regarded as noise.

Figure 2: Concordances of Student A's verbalization for the search term "see"

Table 1 (surviving fragments): "When no new pattern is added to the task space but a drawing action is executed on the current shape"; "Task Space evaluation (ERs'3)"; "When new patterns are introduced to the task space or old patterns refined as a result of transforming actions such as rotation."

In total, 58 noise utterances were identified and eliminated from the data. The link relationships between the remaining segments were then revised, the forelink and horizonlink entropies were recalculated, and the graphs were redrawn. As shown in figure 3, the pattern of change in dynamic entropies after noise reduction displays a closer resemblance to that of the brief reading stage. This outcome suggests that the proposed method is a promising approach for eliminating confounding variables from protocol studies.
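The reasoning applied to hits 9 and 10 can be written as a simple rule: a segment is treated as noise when it introduces no new pattern, repeats the F-B-S code and pattern combination of the preceding segment, and has no effect on the segment that follows. The sketch below encodes this rule under stated assumptions. The `Segment` structure, the `affects_next` flag (which stands for the analyst's judgment and is not computed here) and the segment preceding hit 9 are hypothetical; only the codes Bs, IRm9, IRm14 and ERpe17 come from the example above.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Segment:
    index: int
    fbs_code: str         # F-B-S code assigned to the segment, e.g. "Bs"
    patterns: frozenset   # pattern codes used in the segment
    affects_next: bool    # analyst's judgment: does it shape the next segment?


def is_noise(prev: Optional[Segment], cur: Segment, seen: Set[str]) -> bool:
    """Apply the three-part noise rule to one segment."""
    introduces_new = not cur.patterns <= seen
    repeats_prev = (
        prev is not None
        and cur.fbs_code == prev.fbs_code
        and cur.patterns == prev.patterns
    )
    return (not introduces_new) and repeats_prev and not cur.affects_next


def remove_noise(segments: List[Segment]) -> List[Segment]:
    """Keep the segments that are not classified as noise."""
    kept, seen, prev = [], set(), None
    for seg in segments:
        if not is_noise(prev, seg, seen):
            kept.append(seg)
        seen |= seg.patterns  # patterns stay "known" even if the segment is dropped
        prev = seg
    return kept


combo = frozenset({"IRm9", "IRm14", "ERpe17"})
segments = [
    Segment(0, "Bs", frozenset({"IRm9", "ERpe17"}), True),  # hypothetical earlier segment
    Segment(1, "Bs", combo, True),    # like hit 9: IRm14 is newly generated
    Segment(2, "Bs", combo, False),   # like hit 10: same combination, no effect
]
kept_indices = [s.index for s in remove_noise(segments)]
```

Because `affects_next` is supplied by hand, this sketch does not remove the subjectivity of the judgment; it only makes the decision rule explicit so that it can be applied consistently across segments.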

CONCLUSION AND FURTHER WORK
Although verbalization may be criticized as a way of understanding genuine cognitive activity, it is still a main tool for creating a general image of what goes on in the mind. It is therefore important that analysis methods which use verbal data be refined. This study showed the relevance of understanding pattern structure and its internal-external cognitive distribution as an aid in distinguishing between genuine verbal data and noise. However, the method itself needs to be refined and tested with a larger study group. In particular, for both eliminating noise and revising links, a less subjective, more automated approach should be sought. Relevant to this, in another study Kan and Gero [7] reduced subjectivity in linking segments by developing a LISP program that uses an English language corpus to make semantic links. Integrating corpus linguistic methods and cognitive pattern recognition is therefore a step in a promising direction.