Toolbox: Managing knowledge

"How would you store thirty years'-worth of experience in a database?"
[quote from a senior research scientist at one of our clients]

Knowledge is more than information

That sounds obvious, perhaps, but the implications aren't - especially where knowledge-management is concerned. One of our regular clients has a significant knowledge-management issue: they're an aircraft-engineering research group, and much of their knowledge-base has a data-life of fifty years or more. That may not seem that long a time - given that the products of some civil-engineering projects may have to last for centuries - but it's actually almost as long as the entire history of commercial computing. And there've been a few changes in that time...

In practice, the knowledge-base needs to be stored, expanded, protected and retrieved over a period which is longer than any of the usual lifetime measures - a period which:

  • exceeds computer hardware lifetime (three to five years);
  • exceeds software implementation lifetime (three to ten years);
  • exceeds database architecture lifetime (five to twenty years);
  • exceeds typical staff-contract lifetime (five to twenty years);
  • exceeds typical government/policy lifetime (seven to twenty years);
  • exceeds maximum permitted working-lifetime (forty to forty-five years).

This isn't a computing issue as such - treating it as one is a common mistake! (In fact most of their legacy data was on paper: a lot easier to maintain than computer-stored information, though much, much harder to search...) Current knowledge-management systems obviously depend in part on database technology to provide storage, search and cross-reference, access-control, and usage and performance metrics. But knowledge-management depends just as much on:

  • leadership - a commitment to organisational quality
  • change-management - creating a 'learning organisation'
  • culture - creating a habit of sharing knowledge and exploring its potential for re-use

At xio we believe that knowledge-management works best as an extension of the organisation's quality-system:

  • quality-system provides purpose
  • work-details and practices may change over time, but the basic nature of the work does not change
  • policies, procedures, work-instructions define data and metadata to be recorded during work
  • quality-system provides standards and mechanisms for knowledge-sharing and knowledge-maintenance

For knowledge-sharing to happen within an organisation, knowledge acquired in each team's work needs to be linked with the organisational knowledge-management system - which often doesn't exist! So knowledge-management is a fundamental systems-level issue, relying on the human aspects of the system as much as on technical matters.


All of these depend on an understanding of the overall data-content - an understanding of what knowledge actually is. In practice, the data-content of knowledge may be partitioned into three categories (with a rough code sketch after the list):

  • data: objective, usually quantitative ('what') - information derived from instruments and other objective processes; provides information content
  • metadata: usually descriptive 'information about information' ('who', 'when', 'where', 'how') - information describing the context and/or derivation of a data item; identifies information context
  • connection: subjective, usually qualitative ('why') - information describing perceived relationships between data-items; derived from experience; indicates information meaning.
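
To make the categories concrete, here is a minimal sketch in Python (ours, not part of the original toolkit - the class names, fields and example values are purely illustrative): each data-item carries its metadata as context, and each connection records the subjective 'why' that links two items.

    from dataclasses import dataclass, field
    from typing import Dict

    @dataclass
    class DataItem:
        """Objective data: the 'what', e.g. a reading from an instrument."""
        item_id: str
        value: float
        units: str
        # Metadata: the 'who', 'when', 'where' and 'how' - context for the value.
        metadata: Dict[str, str] = field(default_factory=dict)

    @dataclass
    class Connection:
        """Subjective connection: the 'why' - a perceived relationship between
        two data-items, derived from someone's experience."""
        source_id: str
        target_id: str
        rationale: str      # why the items are believed to be related
        recorded_by: str    # whose experience this judgement reflects

    # An objective reading, with its context...
    reading = DataItem(
        item_id="TEST-042/strain-gauge-3",
        value=1.27e-3,
        units="strain",
        metadata={"recorded_by": "J. Smith", "date": "1998-06-12",
                  "instrument": "strain gauge 3",
                  "method": "constant-amplitude fatigue test"},
    )

    # ...and a subjective link from it to a much older observation.
    link = Connection(
        source_id="TEST-042/strain-gauge-3",
        target_id="INSPECTION-1987/panel-cracking",
        rationale="Similar strain pattern to cracking seen in the 1987 wing-panel inspection.",
        recorded_by="J. Smith",
    )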

Conventionally, only objective data are maintained in databases - partly because they are the easiest type of data to manage. In recent times there has been an increasing awareness, in the scientific community and elsewhere, of the importance of maintaining metadata, particularly for traceability and quality-management; but at present there still appears to be only poor awareness of the importance or value of maintaining subjective data - particularly subjective associations between data-items. Objective data are often strongly emphasised in scientific and commercial knowledge-management, and are often mistakenly considered to be the only valid part of the overall data; yet in essence they only provide source-material for a subjective interpretation of meaning:

objective data are meaningless without metadata to provide context, and without subjective data to provide interpretation.

This is especially true in long-term knowledge-management, where dependence on human memory to 'fill in the gaps' becomes increasingly unreliable over time, and by definition unavailable for the kind of data-lifetimes required by this client of ours - and many others.

A real knowledge-management system needs to support all three data-categories, maintaining appropriate distinctions and separations between them whilst also preserving the contextual links between them. None of the commercial 'knowledge-management systems' we've seen as yet actually manages to do this - especially the proper long-term management of connections. But it should happen fairly soon: the technology already exists in XML/XLL (eXtensible Markup Language / eXtensible Linking Language), for example. What's often harder is finding the commitment - especially in senior management - to actually doing it...!


Short-term data-safety is mainly concerned with keeping appropriate control over users' access to functionality - ensuring that records can't be created or changed inadvertently or inappropriately, for example. Data-safety over longer periods - for the full required data-lifetime - depends less on control of access to function, and more on appropriate design and management of data structure:

  • supports item detail and associations (i.e. data, metadata and subjective connections) - describes raw information, the context of that information, and interpretations of the meaning of the information;
  • uses self-describing data structures - data-structures include sufficient information to reconstruct the entire data-context from within the structures themselves;
  • formatted to stable standards - although proprietary 'standards' for interfaces may be used, data-storage and internal notation should use non-proprietary (and preferably international) standards for data storage wherever practicable;
  • supports data-migration to succeeding systems - data-structure designs should support clean migration of all data to the various systems which succeed each other during the full data-lifetime.

Appropriate standards for data-content and structure (at least in transfer forms for data-sharing and data-migration) include the National Library of Australia and Dublin Core standards on metadata, and the NCSA and XML standards for data-structure and formatting of transferred data content. In this sense, an XML file would be close to fully 'self-describing', especially if it contained (rather than referenced) the respective Document Type Definition (DTD).
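
As a minimal sketch of that idea (the element names and the Dublin Core-style metadata fields below are our own choices, not a prescribed schema), a record can carry its own DTD in the file's internal subset, so that its structure can be reconstructed from the file alone; Python's standard parser will read such a file, although it does not validate against the DTD.

    import xml.etree.ElementTree as ET

    # A record that carries its own Document Type Definition (DTD) inside the
    # file, so the structure is recoverable without any external reference.
    RECORD = """<?xml version="1.0"?>
    <!DOCTYPE record [
      <!ELEMENT record (data, metadata, connection*)>
      <!ELEMENT data (#PCDATA)>
      <!ATTLIST data units CDATA #REQUIRED>
      <!ELEMENT metadata (creator, date, coverage)>
      <!ELEMENT creator (#PCDATA)>
      <!ELEMENT date (#PCDATA)>
      <!ELEMENT coverage (#PCDATA)>
      <!ELEMENT connection (#PCDATA)>
      <!ATTLIST connection target CDATA #REQUIRED>
    ]>
    <record>
      <data units="strain">0.00127</data>
      <metadata>
        <creator>J. Smith</creator>
        <date>1998-06-12</date>
        <coverage>TEST-042, strain gauge 3</coverage>
      </metadata>
      <connection target="INSPECTION-1987/panel-cracking">
        Similar strain pattern to the 1987 wing-panel inspection.
      </connection>
    </record>
    """

    root = ET.fromstring(RECORD)            # parses, but does not validate against, the DTD
    print(root.find("data").get("units"))   # -> strain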


Over the full required data-lifetime, physical and logical systems will inevitably need replacement. Appropriate plans and procedures - the human side of systems - need to be in place to support migration of data from older systems to their replacements. These include the following steps (sketched roughly in code after the list):

  • data review - identify legacy data to be transferred to the new system;
  • data-structure review - identify any differences between old and new data-structures;
  • data-migration review - identify structure-conflicts and other migration issues, such as default data for new fields or data-objects;
  • data-migration implementation - execute data-migration, and validate transferred data in new system.
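
Here is a purely illustrative skeleton of those four steps in Python - the function names, field handling and default values are our own assumptions, not a prescribed procedure:

    from typing import Dict, List

    def review_data(old_records: List[dict]) -> List[dict]:
        """Data review: decide which legacy records are worth carrying forward."""
        return [r for r in old_records if r.get("status") != "superseded"]

    def review_structure(old_fields: set, new_fields: set) -> Dict[str, set]:
        """Data-structure review: which fields the new system drops, and which it adds."""
        return {"dropped": old_fields - new_fields, "added": new_fields - old_fields}

    def plan_defaults(added_fields: set) -> Dict[str, str]:
        """Data-migration review: choose defaults for fields the old system never recorded."""
        return {f: "UNKNOWN (pre-migration record)" for f in added_fields}

    def migrate(old_records: List[dict], new_fields: set) -> List[dict]:
        """Data-migration implementation: transform each surviving record, then validate it."""
        old_fields = set().union(*(r.keys() for r in old_records)) if old_records else set()
        defaults = plan_defaults(review_structure(old_fields, new_fields)["added"])
        migrated = []
        for rec in review_data(old_records):
            new_rec = {f: rec.get(f, defaults.get(f)) for f in new_fields}
            missing = [f for f, v in new_rec.items() if v is None]
            if missing:
                raise ValueError(f"cannot migrate {rec!r}: no value for {missing}")
            migrated.append(new_rec)
        return migrated

In practice the two 'review' steps are as much human judgement as code; the point of the sketch is only that each step produces something explicit that can be checked before the old system is retired.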

The importance of clear requirements

The general experience is that trying to implement a knowledge-management system in one go is a bad idea:

  • it's usually too expensive
  • it often doesn't work, because requirements aren't clear
  • it's usually too much of a 'culture shock'

So in practice, implementation of a full knowledge-management system should always be iterative - often over several years. And at xio we advise our clients that before trying to implement any of the system, it's essential to be clear about what the system is expected to achieve, both overall and at each iteration. Clearly-defined requirements provide a means to measure progress; and prioritised requirements maximise leverage and 'return on investment' at each stage. Requirements which are relevant to knowledge-sharing would include:

  • work-tracking and workflows (i.e. quality-system)
  • tests and processes (i.e. what was done)
  • test-results and test-data (i.e. what was found)
  • information-items (documents, images, test-specimens, fragments, data-extracts, or whatever else the business handles)
  • platforms, components, materials, defect-types, widespread damage, repeated problems, processes
  • descriptive articles - theory guidelines, local best practice
  • connections, comments and 'threads of interest' (i.e. perceived meaning)


In planning for knowledge-sharing, it's useful to ask:

  • what knowledge do we expect to share? (knowledge components: data, metadata, connections)
  • with whom do we expect to share this knowledge? (others in the present: clients, other alliances, academics, industry? others in the future: including ourselves!)
  • what other uses might others find for our knowledge? (knowledge re-use: from within our usual scope, or well outside of it, many years from now?)
  • how could we make it easier to re-use our knowledge? (examples: additional metadata fields, overview notes, links to material on current theory or practice)

A kind of conclusion...

So to come back to that question with which we started this item in the toolkit: how could we store thirty years of experience in a database, in a way that others could use?

   - by making it easy to do so, for everyone, every day;
   - by providing secure database support not just for data, but also for metadata and connections;
   - by providing facilities for organisation-wide cross-reference and search;
   - and by acknowledging our responsibility to the future as well as to the present!

No particular suggestions for resources on this one: one of the most useful we've found was a set of white papers by Dataware Corporation, sadly no longer available since their takeover by LiveQuest Corporation.