The Semantic Organization: Knowing What You Know

How do we get from here to there?

Corporations have a tremendous amount of stored information. On top of this, new information is being created every day. A small but critical portion of this information is stored in highly structured and well-defined formats in relational databases. However, most of the information is on paper, in e-mail, in word processing documents, in spreadsheets, in PDF files, in engineering diagrams, and so on.

Ever since the initial XML draft in 1996, there has been an ongoing discussion of the semantic Web. A Google search for the exact expression "the semantic Web" returns about 1.2 million Web pages. Clearly there has been, and continues to be, a lot of discussion about it. However, the semantic Web is still a work in progress, mainly because very little information on the Web has been semantically tagged. It may be more prudent to start on a smaller scale than the Web.

There is clear precedent for this. Web services and the underlying technologies (UDDI, WSDL, SOAP) were all originally intended for the Web at large. However, they have been implemented most successfully inside organizations. Similarly, it makes sense to create a semantic organization rather than taking on the whole Web.

Here are two examples of what a semantic organization might look like:

  • A new RFP comes in to an advertising agency. It is from a consumer goods company that is looking to launch a new food product. The RFP manager could quickly locate similar RFPs that the agency has received, similar responses, similar project-related documents, and resumes of people who have worked on similar projects. All of these documents would be returned with the relevant passages highlighted. Documents for the launch of a food-grade plastic container would be successfully filtered out. In most organizations today, gathering all of this information can be half of the effort, and many relevant documents can be overlooked.
  • A human services agency has been asked to provide a summary of the impact of a change in eligibility laws. Searching all case documents provides a summary of the clients who will be affected by the proposed changes. The specific combination of attributes that is needed can be queried directly: for example, people who have lived in the community for more than 10 years, who have worked for more than 15 years, and who have one of several diagnosis codes. Without semantically tagged case documents, answering this would require a manual search of the documents.
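
The second example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names (`yearsInCommunity`, `yearsWorked`, `diagnosisCode`), the case ids, and the diagnosis codes are all invented here, and a real agency would define its own tags. The point is that once case documents carry tags like these, the combination of attributes becomes a simple filter rather than a manual search.

```python
import xml.etree.ElementTree as ET

# Hypothetical tagged case documents. All tag names, ids, and
# diagnosis codes are made up for illustration.
CASES = """
<cases>
  <case id="A-1">
    <yearsInCommunity>12</yearsInCommunity>
    <yearsWorked>20</yearsWorked>
    <diagnosisCode>D10</diagnosisCode>
  </case>
  <case id="A-2">
    <yearsInCommunity>4</yearsInCommunity>
    <yearsWorked>18</yearsWorked>
    <diagnosisCode>D10</diagnosisCode>
  </case>
  <case id="A-3">
    <yearsInCommunity>15</yearsInCommunity>
    <yearsWorked>16</yearsWorked>
    <diagnosisCode>X99</diagnosisCode>
  </case>
</cases>
""".strip()


def affected_cases(xml_text, diagnosis_codes, min_community=10, min_worked=15):
    """Return the ids of cases that satisfy all three criteria."""
    root = ET.fromstring(xml_text)
    matches = []
    for case in root.iter("case"):
        in_community = int(case.findtext("yearsInCommunity", default="0"))
        worked = int(case.findtext("yearsWorked", default="0"))
        code = case.findtext("diagnosisCode", default="")
        # "More than" in the text is read as a strict comparison.
        if (in_community > min_community
                and worked > min_worked
                and code in diagnosis_codes):
            matches.append(case.get("id"))
    return matches


print(affected_cases(CASES, {"D10", "D11"}))
```

Only case A-1 meets all three conditions in this sample data; A-2 fails the residency test and A-3 has a diagnosis code outside the requested set.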

Seven key areas are needed to support a semantic organization. The first is a common model of the organization. The remaining six are data tagging, data location, data relationships, data access, data storage, and data transformation. XML can play a role in three of these areas: data tagging, data location, and data relationships. XML is not the correct tool for the common model, data access, data storage, or data transformation. (See Table 1.)

How do we get from here to there?

The first step in implementing a semantic organization is to look for the low-hanging fruit. There are numerous areas in any organization where simple tagging of some information will provide a tremendous benefit. For example, a company could tag all of the proposals that it has sent out with some simple information such as company, project, and type of project. In these cases, the common model would be informal. Going beyond this will require a greater effort and support from vendors. The biggest challenge is the creation of a common model.
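
As a rough sketch of what this kind of low-hanging-fruit tagging could look like, the Python below attaches company, project, and project type to each proposal and builds a small lookup index. Everything here is hypothetical: the `Proposal` class, the `build_index` and `find` helpers, the file paths, and the company names are invented for illustration, and the "common model" is nothing more than an informal agreement on three field names.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    """One proposal document plus its informal tags (all hypothetical)."""
    path: str
    company: str
    project: str
    project_type: str


def build_index(proposals):
    """Map (tag name, lower-cased tag value) to the set of matching paths."""
    index = defaultdict(set)
    for p in proposals:
        index[("company", p.company.lower())].add(p.path)
        index[("project", p.project.lower())].add(p.path)
        index[("project_type", p.project_type.lower())].add(p.path)
    return index


def find(index, **tags):
    """Return sorted paths of proposals that match every given tag."""
    sets = [index.get((name, value.lower()), set())
            for name, value in tags.items()]
    return sorted(set.intersection(*sets)) if sets else []


proposals = [
    Proposal("proposals/acme-snack-launch.doc", "Acme Foods",
             "Snack launch", "product launch"),
    Proposal("proposals/acme-rebrand.doc", "Acme Foods",
             "Rebrand", "rebranding"),
    Proposal("proposals/globex-container.doc", "Globex Plastics",
             "Container launch", "product launch"),
]

index = build_index(proposals)
print(find(index, company="Acme Foods", project_type="product launch"))
```

Even this informal scheme lets someone narrow three documents down to the one that matches both tags. The harder problem the article points to, agreeing on a shared model across the whole organization, is not addressed by a sketch like this.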

There are a number of tools on the market that support one or more of the areas required for a semantic organization. However, they tend to be special-purpose tools that require extensive setup. The process of tagging, storing, and retrieving documents should be built into the tools that we use every day. It should be a basic part of what everyone does.

Getting there will give organizations tremendous insight into what they know. The information that is being generated will be widely available and useful, unlocked for the benefit of the organization. Organizations will be able to know what they know.

More Stories By Michael Wacey

Michael Wacey is a partner with CSC Consulting and has been involved in the data processing industry since 1982. He has worked as a CTO, CIO, and project leader in numerous areas, including the telecommunications, pharmaceutical, chemical, and financial industries.
