====== Improving knowledge collaboratively ======

Software development is a knowledge-intensive activity [(:ref:P.N99)]. Developers become learners (and thus knowledge acquirers) when they need to locate potentially relevant source code and understand how to modify it to solve the task at hand.

Software development is also a social activity [(:ref:NK02)]. The activity is carried out by a group of developers, forming a community and engaging in collective creative knowledge work [(:ref:NOY00)]. It is a social activity mediated through artifacts, which are, primarily, source code and documents. Although sharing knowledge and information within a community of developers is indispensable, the primary means for developers to obtain knowledge is not through communicating with their peers, but through artifacts. Developers invest great effort recovering implicit knowledge by exploring code and documents. If this fails, they turn to their social network [(:ref:LVD03)]. But how can the community provide what the artifacts couldn’t? And if it can, how does it translate into useful, relevant knowledge to solve the task at hand? Can we trust others, just because we have no other choice?

===== Going for the crowd =====

In his book, //The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations// [(:ref:Sur04)], James Surowiecki presents an extensive analysis of how the knowledge and reasoning of a group of people provide better results, on average, than those of an informed, expert individual. He hypothesises that people should act collectively to make decisions and to solve problems in matters of general interest (to the community). He states that, although we are unaware of it, //“we are collectively smart”// and intellectually superior to the isolated individual. When dealing with groups of people, concerns arise not only about size and uniformity, but also about the cognition, coordination and cooperation of the individuals. Nevertheless, //“groups do not need to be dominated by exceptionally intelligent people in order to be smart”//, and they perform better at deciding between possible solutions than at coming up with them.

He then presents the four pillars that sustain the //wisdom of the crowd//: **diversity**, **independence**, **decentralisation** and **aggregation**. These are further detailed next.

==== Diversity ====

The best collective decisions come from disagreement. In a diverse group, each person should have some private information, even if it’s just an eccentric interpretation of the known facts, to add perspective that would otherwise be absent. Diversity makes it easier for individuals to say what they really think, consequently generating many alternatives, most of which turn out to be //losers//. Large collectives have the inherent ability to recognise these //losers// quickly and //kill them off//.

==== Independence ====

It may come as a paradox, but each member of the group should act as independently as possible. They should be free from the influence of others, as people’s opinions should not be determined by the opinions of those around them. This generates new, unfamiliar, ungeneralised data and keeps mistakes uncorrelated. Nevertheless, there are hindrances to this independence that arise from our own //fabric// as persons in a group:

  * **Social Proof**. We are social beings who, most of the time, think that //“if everybody is doing it, there must be a good reason.”// In group behaviour, when things are uncertain, the safest thing to do seems to be to follow along. This is also called herding: sticking with the crowd and failing small, rather than trying to innovate and running the risk of failing big. As far as reputation goes, it is better to fail conventionally than to succeed unconventionally.
  * **Information Cascade**. This phenomenon happens when people base their decisions on the judgment of those who came before, assuming it to be right even when it is poor. This spawns a sequence of uninformed choices, so that collectively the group ends up making a bad decision. A cascade is not always harmful, provided the other members are good judges and spot the occurrence.
  * **Imitation**. Most of the time, as a rational response to our cognitive limits, we //piggyback// on the wisdom of others and, most of the time, it works. But it shouldn’t be a //slavish// imitation, where blind mimicry hurts the group. It should be an //intelligent// imitation that, if used well, is an effective and powerful tool to spread good ideas fast. This kind of imitation requires that individuals have a wide array of options and information, and the willingness to put their own judgment ahead of the group’s. It can break negative cascades by consciously identifying bad choices.

Independence can be enforced and promoted by making sure, as much as possible, that decisions are made simultaneously (or nearly so) rather than sequentially, so that people pay much less attention to what everyone else is saying. By //keeping the ties loose//, forming groups that range across hierarchies and exposing individuals to as many diverse sources of information as possible, independence can be maintained.

==== Decentralisation ====

People are able to specialise and draw on local knowledge. Decentralisation fosters (and is fed by) specialisation, increasing the scope and diversity of the opinions and information in the system. The closer a person is to a problem, the more likely they are to have a good solution for it, although there is no guarantee that all information reaches everyone. Decentralisation also allows for tacit knowledge input. This is very valuable knowledge: a person has it because they’ve been there, but they can’t really explain or communicate it. Individual knowledge remains resolutely specific and local, but becomes globally and collectively useful.

==== Aggregation ====

Somehow, there must be a mechanism to compute, aggregate and broadcast the private judgments into a collective decision. Such a mechanism needs to be available to all members, even if it cannot reliably assure that the information reaches its destination (a member might ignore that knowledge). Whether it is some kind of whiteboard or a global, shareable communication infrastructure is not important, as long as it serves its purpose.
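As a minimal sketch of what such a mechanism might compute, the private judgments could be combined with a median (for numeric estimates) or a plurality tally (for choosing between candidate solutions). Both choices are assumptions made for illustration and are not prescribed by [(:ref:Sur04)]; the function names and the sample data are hypothetical.

```python
from collections import Counter
from statistics import median


def aggregate_estimates(estimates):
    """Collective numeric judgment; the median tolerates a few outliers."""
    if not estimates:
        raise ValueError("at least one estimate is required")
    return median(estimates)


def aggregate_votes(votes):
    """Collective choice: (option, count) pairs, most voted option first."""
    if not votes:
        raise ValueError("at least one vote is required")
    return Counter(votes).most_common()


# Made-up private judgments from five independent members.
estimates = [90, 100, 110, 120, 180]
votes = ["patch A", "patch B", "patch A", "patch C", "patch A"]

print(aggregate_estimates(estimates))   # collective estimate
print(aggregate_votes(votes)[0])        # winning option and its vote count
```

Note that the sketch only covers the computing and aggregating steps; broadcasting the result back to the members is left to whatever shared infrastructure is in place.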

Resorting to a wise crowd, or community, to solve problems can be advantageous. But how do we ask the community for help and effectively capture its answers, i.e., knowledge? Even if the community is only composed of experts, can we effectively tap into their collective knowledge? How do we acquire that knowledge?

====== Grasping the collective knowledge ======

Effectively capturing expertise from several heterogeneous sources in a social environment is the goal of the Collaborative Knowledge Acquisition field of study, a spin-off of the Knowledge Acquisition domain. A succinct description is presented next.

===== Knowledge acquisition =====

The Knowledge Acquisition //(KA)// field deals with the process of extracting, structuring, and organising knowledge from human experts so that their problem-solving expertise can be captured and transformed into a computer-readable form. This captured knowledge forms the basis for the reasoning process of an expert system. KA has three main concerns: (i) involvement of appropriate human experts, (ii) proper knowledge elicitation techniques and (iii) a structured acquisition approach [(:ref:Wat86)][(:ref:Lio92b)]. The term comes from the field of Expert Systems, where it names the task of gathering the required knowledge from human experts, turning it into a computable form and feeding it to the expert system. KA is a complex task with several identified issues that capturing techniques should address [(:ref:Lio92b)] [(:ref:MD85)]:

   * **Most (but not all) knowledge is in the heads of experts**. Capturing and sharing this knowledge increases its already high value, although it should be shared in such a way to allow non-experts to understand it.
   * **Experts have vast amounts of knowledge**. It is therefore important to focus on the essential knowledge.
   * **Each expert doesn’t know everything**. Knowledge should be gathered and collated from different experts, and these should be allowed to interact.
   * **Experts have a lot of tacit knowledge**. An expert knows more than he/she can account for. Besides being hard (or nearly impossible) to describe, tacit knowledge is also hard to capture.
   * **Experts are very busy and valuable people**. Capturing techniques should take experts off the job only for short periods of time or, ideally, not at all, by being seamlessly integrated into their working environment.
   * **Knowledge has a “shelf life”**. Knowledge evolves. Experts find new knowledge. Therefore knowledge should be maintained and validated over time.

As such, KA is a difficult and time-consuming process that frequently creates a bottleneck for building expert systems. By applying the right tools and methodologies, it is possible to mitigate this bottleneck.

In [(:ref:Cor89)], Cordingley provides a survey of knowledge acquisition methods and procedures, with suggestions about the circumstances in which different methods are useful. These methods range from informal techniques such as user observation, through common social science methods (interviews, questionnaires, and discourse analysis), to more formal techniques used in KA for expert systems. The reason for so many techniques lies in the fact that experts possess many different types of knowledge, and different techniques are required to access each type. This is referred to as the //Differential Access Hypothesis// [(:ref:And04)], and it is supported by experimental evidence. More recently, new developments in methodologies [(:ref:SAA00)], the emergence of ontologies, improved software tools, and the expansion of knowledge management [(:ref:Dav98)] beyond expert systems have brought new insights into KA.

===== Collaborative knowledge acquisition: abandoning the useless =====

Knowledge acquisition in a social environment shares the issues seen earlier. Additionally, the developer has to rely on distributed knowledge resources (artifacts and people) where not everyone is an expert. This becomes even harder if the community scope goes beyond the team of developers and extends to the web, where other developers may have the answer to a specific problem regarding a well-known shared software artifact, API or framework.

The quality of the retrieved knowledge is evaluated by the behaviour of the community towards that knowledge. **If it is useful, it is used; if not, it is abandoned**. One way of capturing this behaviour is to give the community ways of expressing its intent explicitly, whether through rating or commenting. Alternatively, the community behaviour can be captured implicitly, through page //hits//((The number of web users that visit that page.)) or //social bookmarking//. This is known as //**Collaborative Knowledge Acquisition**// [(:ref:Lio92a)], as it gathers information from several heterogeneous sources, in line with the morphology of the Internet.
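The following is a hedged sketch of how explicit signals (ratings) and implicit signals (hits, bookmarks) might be folded into a single usefulness score used to rank knowledge items. The ''KnowledgeItem'' type, the weights and the logarithmic damping are all assumptions made for illustration; they are not taken from [(:ref:Lio92a)] or from any particular system.

```python
from dataclasses import dataclass, field
from math import log1p


@dataclass
class KnowledgeItem:
    """Hypothetical record of community behaviour towards one item."""
    title: str
    ratings: list = field(default_factory=list)  # explicit: e.g. 1 to 5 stars
    hits: int = 0                                # implicit: page visits
    bookmarks: int = 0                           # implicit: social bookmarks


def usefulness(item, w_rating=1.0, w_hits=0.5, w_bookmarks=1.0):
    """Weighted mix of signals; the weights are arbitrary assumptions."""
    avg_rating = sum(item.ratings) / len(item.ratings) if item.ratings else 0.0
    # log1p damps raw counts so a popular page does not drown out ratings.
    return (w_rating * avg_rating
            + w_hits * log1p(item.hits)
            + w_bookmarks * log1p(item.bookmarks))


def rank(items):
    """Most useful items first; unused items sink to the bottom."""
    return sorted(items, key=usefulness, reverse=True)


# Made-up data: one item the community uses, one it has abandoned.
used = KnowledgeItem("How to extend the parser", ratings=[5, 4], hits=100, bookmarks=10)
abandoned = KnowledgeItem("Old build notes", hits=2)

for item in rank([abandoned, used]):
    print(item.title, round(usefulness(item), 2))
```

An item nobody rates, visits or bookmarks scores zero, which mirrors the idea that knowledge that is not used is effectively abandoned.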

Systems that enable this kind of knowledge acquisition are called [[collectiveknowledgesystem | Collective Knowledge Systems]]. The DRIVER environment and toolset can be characterised as such a system.

[[start|< back to start page]]