Counter Commentary: 'Web' content management is a large subject

CONTENT DEVELOPMENT AND MANAGEMENT : THE CONCEPT AND THE STRATEGY

J P S AHUJA, M.Sc. (Statistics); M.L.I.Sc.

Email: jatinder@sify.com (or) jpsahuja@yahoo.com

Abstract:

This article describes the concept of content management and its importance in any web-site development process. It stresses the necessity of creating and managing the content of a web-site and defines the terminology used in content management. It also correlates the publishing processes of the print media with those of web-site management, and in that context discusses publication aspects such as purpose, publisher, author, audiences, format and structure. The editorial/Metatorial framework around which a content management system works is explained in detail. The article gives an insight into the essential parts of a content management system and explains the functions of each process involved. It emphasizes the need for better web-site and content management skills on the part of developers and of the LIS professionals who are responsible for guiding users to the content so developed and managed.

Introduction

Many factors contribute to the success or failure of a web-site; the major one considered here is content. Fresh content, daily updating and customer-specific information are talked about as the most important ingredients for the success of any site on the Web. Developing and managing content seems to be a problem for most Web-based institutions as well as corporations. This is not due to a lack of people who could take up the job, but to the fact that most writers are yet to understand the functioning of the Internet and the Web. Writers, publishers and information managers are used to the print media style of functioning. Web sites require short, crisp stories, and better website development and management skills on the part of library and information managers. Web content management is a large subject. A Content Management System (CMS) is a concept, not a product, and each individual's concept may be different. It can form part of a large-scale approach to information management in an institution, or be limited to specific forms of communication such as the web.

What is information and what is content?

Computers have only recently become ubiquitous in the world of information. Traditionally, computers have been tasked with handling data. As opposed to data, which is a fairly concrete term, information is a very vague term. Just about any communication (including data) can be described as information. For the purposes of this discussion, information will be taken to mean all the common forms of recorded communication: writing, recorded sound, images, video, and animations. Content, stated as simply as possible, is information put to use. Information is put to use when it is packaged and presented (published) for a specific purpose. More often than not, content is not a single "piece" of information, but a conglomeration of pieces of information put together to form a cohesive whole. A book has content, which is comprised of multiple chapters, paragraphs, and sentences. Newspapers contain content: articles, advertisements, indexes, and pictures. The newest entry to the media world, the Web, is just the same; sites are made of articles, advertisements, indexes, and pictures – all organized into a coherent presentation.

What is content management?

Content management is the practice of effectively collecting and managing information and making it available in targeted publications.

A content management system helps organize and automate collection, management, and publishing processes. A content management system is needed when:

Core Concepts Behind Content Management

Content management is a discipline that involves the collection, management, and publication of content. Content management concepts include the following:

The Purpose of Your Content

A good place to begin discussing content management concepts is with the content domain. The content domain is the scope or range of information that is intended to be captured, managed, and published. It is directly related to the overall goals of the content management system. In fact, the content domain is the realm of information that needs to be controlled in order to meet the stated goals. Conversely, it can be asked, "How will the stated goals be met?" The answer is, "By providing content and functionality." Functionality is the set of features and abilities provided to an audience for getting to content and for performing transactions (monetary or information transfer) with an organization. Content is of interest only if it falls within the stated content domain.

All content management systems should have a concise domain statement. Effective domain statements are no more than a few sentences. On hearing one, a person can immediately imagine what is part of the content and what is not, and will know at once whether the content is of interest and what questions or interests it should satisfy.

Content Components

Once a content domain has been established and there is a clear idea of all of the types of content, the content can then be broken up into its component pieces. Components divide information into convenient and manageable chunks. They are a set of discrete objects whose creation, maintenance, and distribution can be automated. They typically share some common attributes, such as format or length, and they should be able to "stand on their own." In other words, a component should have meaning in and of itself, without needing the context of other components to make it meaningful.

Each component travels through a content system as a unit. New content is created one entire component at a time. Content is archived or deleted by component. A page is created by pulling together one or more components into a page frame or template.

Content objects, or components, are based on the same basic idea as objects in object-oriented programming. Content components are small reusable pieces of content that can be linked together to achieve larger results.

In a content management system, content components are used in the same way a programmer would use objects. Small pieces are linked together to make a larger whole. Users of content management systems learn how to work with components, not the information within the components, to achieve the same sort of independence that the programmer gets with object-oriented programming. To build a page, what is inside the component does not need to be known, only how the standardized container of the component operates.
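The component idea above can be sketched in code. The following is a minimal illustration in Python, not taken from any particular content management system; the names Component, PAGE_FRAME and build_page, and the sample text, are all hypothetical. It shows that a page builder needs to know only the standardized container of a component (here, a render method), not what is inside it.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class Component:
    """A self-contained chunk of content (hypothetical structure)."""
    component_id: str
    kind: str  # e.g. "article" or "notice"
    body: str

    def render(self) -> str:
        # The standardized "container": callers rely only on this method,
        # never on what the body contains.
        return f'<div class="{self.kind}" id="{self.component_id}">{self.body}</div>'


# Hypothetical page frame with one slot for the rendered components.
PAGE_FRAME = Template("<html><body><h1>$title</h1>$slots</body></html>")


def build_page(title: str, components: list[Component]) -> str:
    """Pull one or more components together into the page frame."""
    slots = "".join(c.render() for c in components)
    return PAGE_FRAME.substitute(title=title, slots=slots)


news = Component("c1", "article", "Library opens new reading room.")
notice = Component("c2", "notice", "Closed on public holidays.")

home_page = build_page("Home", [news, notice])
about_page = build_page("About", [notice])  # the same component, reused
```

The notice component is written once and appears on both pages, which is the kind of re-use the text describes.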

Some content management systems store components in files. Most store them in a relational database, and a few store them in object databases that use XML hierarchies rather than relational tables to store the components.
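As an illustration of the relational option, the sketch below stores components in an in-memory SQLite database using Python's standard sqlite3 module. The table name, column names and sample rows are assumptions made for the example, not a schema prescribed by any product.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE component (
        id    INTEGER PRIMARY KEY,
        kind  TEXT NOT NULL,
        title TEXT NOT NULL,
        body  TEXT NOT NULL
    )
    """
)

# Made-up sample components.
rows = [
    ("article", "Opening hours", "The library opens at 9 am."),
    ("article", "New arrivals", "Fifty new titles this month."),
    ("advert", "Book fair", "Visit the annual book fair."),
]
conn.executemany(
    "INSERT INTO component (kind, title, body) VALUES (?, ?, ?)", rows
)


def fetch_by_kind(kind):
    """Return (title, body) pairs for every component of one kind."""
    cur = conn.execute(
        "SELECT title, body FROM component WHERE kind = ? ORDER BY id", (kind,)
    )
    return cur.fetchall()


articles = fetch_by_kind("article")
```

Each row is one component, so creation, retrieval and deletion all happen a whole component at a time.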

The correct method of dividing content into components is the one that gives the organization using it the biggest advantage within its current abilities and resources. This is not to say that this is an ad hoc process. On the contrary, the content must be divided according to well-defined and universally understood rules. The rules can and will change but, at any instant, there has to be one set of rules so that the content is organized at all times.

Target Publications

The audience (the viewers or readers of content) doesn’t care about content components, nor does it care about the way these components are collected and managed. In fact, all the audience cares about is that it receives a coherent presentation of the information it wants. When all is said and done, an audience expects to see what it is used to seeing – a normal publication, like a book, magazine, or Web site.

Publishing is simply releasing information that was previously being developed. Following this definition, any content shown can be called a publication. However, that definition opens up the concept so far that it is not very helpful. To narrow it a little, every publication should have these aspects: a purpose, publishers, authors, audiences, a format, and a structure.

These qualities, format and structure especially, define a publication for this discussion.

Publication Purpose

All of what has been said about purpose applies as much to a target publication as it does to the content as a whole. Moreover, when the whole system exists to produce a single publication (e.g., a Web site), the purpose of the content system is the same as the purpose of the site.

Publication Publishers

The publishers make sure that the publication comes together and gets out to the audience. They ensure that the publication serves its purpose and, in the electronic world, that the publication works (does not crash or do other nasty things). Broadly speaking, electronic publishing groups have these sorts of players:

Publication Authorship

Every published work comes from somewhere. Nothing is ever read, watched, or listened to without some consideration of who is responsible for it. There is not much argument about that statement, but there should be. What happens when a publication is not authored by any single person or group? Is there authorship at the search engine site Lycos.com? At first glance the answer is clearly no. Lycos does not create the content; it dispassionately displays it. Lycos is merely an aggregation point where everyone else’s authored works appear.

Look again. Lycos is the author of its site. When a user can’t find what she wants, she blames Lycos and switches to Yahoo! Furthermore, there are very tangible artifacts of the authors at Lycos. Lycos’ name is prominently displayed. There is the style and layout of the site, the way it provides search capabilities, and the way it displays hits. Finally, there are the moneymaking ads and inducements by which portals like Lycos earn their living.

What Lycos and the other portals understand, classical publishers have always known and Net publishers tend to forget. The user assumes authorship whether it is planned or not. Why? Because authorship is a pivotal part of the context put around any content to determine its credibility and meaning. Consider this:

The world is flat. – Joe Nobody (A raving guy you passed on the street)

The world is flat. – Alan Greenspan (Chairman of the U. S. Federal Reserve Bank)

You can win. – Shiv Khera (An eminent management guru)

In the first case, one is led to believe that the comment is about geography, and one is not likely to believe it. In the second case (given that Alan Greenspan is a very famous economist), one is led to believe that the comment is about financial performance, and one is likely to believe it. In the last case, the statement is taken to be about winning life's battles, because it comes from the eminent management guru Shiv Khera. Thus, authorship is useful and unavoidable.

Publication Audiences

A publication is an asynchronous conversation. It does not happen in real time. The authors imagine as they write what a reader is likely to think next, and respond to it in the next sentence. Readers then read and respond in their heads to the author’s words. To the extent that the authors can guess what the reader will think next, and can convey it in text, and to the extent that the user can understand and wants to understand the text, the conversation continues. In fact, it might be said that a quality of great works is that they inspire a lively exchange with the author in the user’s mind.

So, just as every publication has an author, it also has an audience; and just as electronic publications can forget to account for the author, they also tend to forget to account for the audience. The subject of audience analysis deals with how to understand the target audience and how to tailor content to them. It deals with some seemingly simple questions:

In the electronic publication world, a single audience does not exist. First, because of the ubiquity of the Web, a publication can reach a huge range of people. It is not easy to say who will be drawn to a publication and why. Second, the technical ability to detect and record the personal profile of each Web site visitor now exists. Armed with this information, publications can be tailored to every visitor, effectively creating as many different audiences as there are people viewing the site.

Regardless of the wide variety of people potentially reached by a site, the true audience is not who will be reached, but who the publisher wants to reach. There is always a group of people that are of the most interest. There may also be other groups in which there is less, but at least some, interest. The rest of the world can’t be taken into account if a publication is to be created that is of particular interest to a core audience.

The fact that each user can now be targeted individually and the fact that not everyone can be catered to may seem contradictory, but they are not. In fact, the former depends deeply on the latter. If an audience isn’t narrow enough, it is impossible to provide what its members really want: sufficiently precise profile information will not be collected from them, and the content provided can never be adequately segmented.

Publication Format

Format has two basic flavors. Format is the way information is encoded, and format is the codes used to determine how to visually render information. A publication needs both of these. Take a printed magazine for example. The publishers might use a product such as Quark® Xpress. Quark has a particular way that it encodes the information entered into it. The publisher works with files that only Quark can decode and manipulate. Additionally, Quark has a set of formatting features (character, paragraph, section, etc.) that can be applied to the information entered that determine how the information will be rendered on the printed page. Again, only Quark knows how to decode and manipulate these formatting features.

The main content management issue with publication formats comes when there is more than one format. If all that’s being produced is a Web site, then everything can be standardized into HTML. However, if the goal is to create both a Web site and a set of brochures from the same content, and the brochures are produced from Quark files, and the site is HTML, publication requires a more involved solution.

Publication Structure

Even the simplest publication has some structure. Look at any flyer tacked to a telephone pole. There is usually large type on the flyer announcing the flyer’s purpose. This is the title. There are also likely to be details subordinate to the title, differentiated by type size or position. Finally, there is some indication of the author or provider of the flyer, generally at the top or the bottom. Thus, the flyer can be broken down into a title, detail, and author info: its structure. Of course, as the complexity of the publication increases, so does its structure, and more and more of the structural techniques discussed in this paper are required to manage the complexity.

As with format, the big issues with structure arise when there is more than one structure. When there are two publications, three structures are needed: one for each publication and one for the content, since the base content should not be constrained by one format or another. In order to produce more than one publication from the same content base, the content base itself must be structured neutrally enough that the structure of any of the end publications is derivable from it. Extensible Markup Language (XML) is becoming an increasingly common way of doing this.
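A small sketch may make the idea of a neutral content base concrete. It reuses the flyer example (title, detail, author) as a tiny XML document and derives two differently structured publications from it using Python's standard xml.etree.ElementTree module. The element names, the sample text and the two output layouts are assumptions chosen for illustration only.

```python
import xml.etree.ElementTree as ET

# Neutral content base: it says what each piece is, not how it looks.
SOURCE = """
<item>
  <title>Book Fair</title>
  <detail>Saturday, 10 am, Main Hall</detail>
  <author>City Library</author>
</item>
""".strip()


def to_html(item):
    """Derive a Web-page fragment from the neutral structure."""
    return (
        f"<h1>{item.findtext('title')}</h1>"
        f"<p>{item.findtext('detail')}</p>"
        f"<address>{item.findtext('author')}</address>"
    )


def to_flyer(item):
    """Derive a plain-text flyer from the same neutral structure."""
    title = item.findtext("title").upper()
    return "\n".join(
        [title, "=" * len(title), item.findtext("detail"), "-- " + item.findtext("author")]
    )


item = ET.fromstring(SOURCE)
web_version = to_html(item)
print_version = to_flyer(item)
```

Neither output structure is stored in the content base; both are derived from it, so a third publication could be added without touching the source.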

An Editorial and Metatorial™ Framework

All professional publishing groups use an editorial and backbone framework to guide their work. This framework consists of these types of rules:

Without such a framework, publications lack unity and can seem disorganized and unprofessional. For static publications (such as books and magazines), which are manually produced and hand crafted, the editorial framework is all that you need to assure unity.

In dynamic publications, like Web sites, which are often automatically generated, not hand crafted, additional rules are needed to assure that the publication stays organized. Chase Bobko has defined these additional rules as the Metatorial™ framework. Whereas editorial rules govern the presentation of the content, Metatorial rules govern its management and accessibility. In static publications, content is managed and accessed by hand. In other words, the rules about how content is managed and accessed are in someone’s head. In dynamic publications, these rules must be made explicit so that the computer can use them to manage and access the content.

The Metatorial framework is a system of meta information. Meta information means information about the information you have created. For example, meta information about an article might be its publication date, author, the section in which it belongs on the site, its target audience, etc. This framework provides the "handles" on the content needed to find it, evaluate its appropriateness, and assemble it into useful publications. Whereas an editorial framework provides rules for creating content, a Metatorial framework provides rules for tagging previously created content with meta information.
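The role of meta information as "handles" can be sketched as follows. This Python fragment is only an illustration: the TaggedArticle type, the select function and the sample articles are hypothetical, and the four tags are simply the ones named in the example above (publication date, author, section, target audience).

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TaggedArticle:
    """An article plus the meta information attached to it."""
    title: str
    author: str
    published: date
    section: str
    audience: str


# Made-up sample content.
ARTICLES = [
    TaggedArticle("Exam timetable", "Registrar", date(2003, 3, 1), "notices", "students"),
    TaggedArticle("New journals", "Librarian", date(2003, 4, 12), "library", "faculty"),
    TaggedArticle("Reading room hours", "Librarian", date(2003, 5, 2), "library", "students"),
]


def select(articles, *, section=None, audience=None, since=None):
    """Find content by its tags alone, newest first."""
    chosen = [
        a for a in articles
        if (section is None or a.section == section)
        and (audience is None or a.audience == audience)
        and (since is None or a.published >= since)
    ]
    return sorted(chosen, key=lambda a: a.published, reverse=True)


library_page = select(ARTICLES, section="library")
student_page = select(ARTICLES, audience="students", since=date(2003, 4, 1))
```

The selection never reads the article text; it works entirely from the tags, which is what allows a computer rather than a person to assemble the publication.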

The Metatorial framework has rules for tagging content with these types of meta information:

Just as an editorial framework has as its product an editorial guide, the Metatorial framework results in a Metatorial guide. The Metatorial guide establishes the guidelines that staff use to divide and tag all of the content that crosses their desks.

To create the guide, two overlapping analyses are used:

Essential Parts of a Content Management System

Content management is the collection, management, and publication of content. A content management system starts with a purpose and a set of target publications. From these, a set of content components is derived that serve the stated purpose, and can be combined to create any of the target publications. A Metatorial framework is then built around these components to allow them to be created, managed, and drawn into publications by a staff whose actions are guided by a set of codified procedures called workflows. To make the content available, the system creates publications such as Web sites, printed documents, and email newsletters. A content management system is needed when there is too much information to collect, manage and publish by hand.

Collection

The collection system comprises the tools, procedures, and staff that are employed to gather content and to provide editorial and Metatorial processing.

When content is collected, it is brought inside the content management system. The content collection process is one of adding new components to the existing repository. Content collection can be broken into these categories:

Management

The management system is the repository of all content and meta information, together with the processes and tools employed to access and manage the collected content and meta information.

Repositories perform the following functions:

Workflow

The workflow system comprises the tools, procedures, and staff employed to ensure that the entire process of collection, storage, and publication runs effectively and efficiently, according to well-defined timelines and actions.

A workflow system supports the creation and management of business processes. In the context of a content management system, the workflow system sets and administers the chain of events around collecting, storing, and publishing content.
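One simple way to picture such a chain of events is as a set of stages with rules about which stage may follow which. The sketch below is a minimal illustration in Python under that assumption; the stage names, the TRANSITIONS table and the function names are hypothetical and would differ in any real workflow.

```python
# Hypothetical stages and allowed moves; a real workflow would define its own.
TRANSITIONS = {
    "collected": {"in_review"},
    "in_review": {"collected", "approved"},  # a reviewer may send work back
    "approved": {"published"},
    "published": {"archived"},
    "archived": set(),
}


class WorkflowError(Exception):
    """Raised when content is moved along a path the rules do not allow."""


def advance(current, target):
    """Move to the target stage only if the rules permit it."""
    if target not in TRANSITIONS.get(current, set()):
        raise WorkflowError(f"cannot move from {current!r} to {target!r}")
    return target


def run(steps, start="collected"):
    """Apply a sequence of moves and return every stage visited."""
    state = start
    history = [state]
    for step in steps:
        state = advance(state, step)
        history.append(state)
    return history


history = run(["in_review", "approved", "published"])
```

Because the rules are explicit data rather than habits in someone's head, the system can refuse a shortcut such as publishing content that was never reviewed.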

To be successful, the workflow system should:

Publishing

Content publishing describes the process by which content is drawn out of the repository and formatted into Web sites and other publications. To be flexible enough to produce a wide range of publications, the publishing system must include:

There are content management systems in existence today that meet the requirements described here, to varying degrees. There are dozens of commercial products available, as well as developers who are willing and able to build custom content management systems. The choice of "to buy or to build" depends largely on the results of a content and publication analysis. As the discipline of content management matures, the field of players will likely narrow to a few comprehensive and relatively easy-to-use solutions, similar to the way word processing, project management, and accounting software have evolved over the years. In the meantime, a system must be chosen carefully to ensure that it has both the comprehensiveness and the flexibility needed to deliver the appropriate publications.

The conceptualization of the content management process

There is also a grey area between content management and website management: some CMSs include functions like link checking, feedback reporting and search indexing. Another grey area is with web 'applications'. There is an argument that sophisticated websites are now just groups of online applications (web apps) anyway. Some CMSs provide forums, chat systems and calendaring. So a CMS can include web apps, or itself be a web app.

Limiting our attention to the web, basic content management is about saving the webmaster time! Most of us are familiar with traditional 'content management': the file system with browse / search / delete / rename etc., file editors and file transfer protocols. We are also familiar with its limitations beyond a few files and a single author.

The next step is the implementation of content re-use, e.g. for web page templates and repeated information. The 'high end' page design tools include this kind of system. Furthermore, anyone working directly on a web server, or using Macromedia Dreamweaver, can use Server Side Includes (SSIs) to reuse content. This increases the 'granularity' of web content below the 'file' level: e.g. a standard copyright notice, or a navigation bar 'BRANDING' the pages, can exist in a single file but be included in all web pages as an SSI. Team this up with the user and file level security of, say, UNIX, and you have the bones of a multi-user CMS.
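To show what an include does, the sketch below imitates the expansion step in Python. It is not how a web server implements SSI; it only recognises the include directive in its virtual="..." form and replaces it with a stored fragment. The fragment paths and their contents are made up for the example, and the fragments are held in a dictionary rather than read from files.

```python
import re

# Matches directives of the form <!--#include virtual="/path" -->
INCLUDE = re.compile(r'<!--#include\s+virtual="([^"]+)"\s*-->')

# Hypothetical shared fragments, each maintained in one place.
FRAGMENTS = {
    "/includes/nav.html": "<nav>Home | Catalogue | Contact</nav>",
    "/includes/copyright.html": "<footer>Copyright notice</footer>",
}


def expand_includes(page, fragments):
    """Replace every include directive with the fragment it names."""
    return INCLUDE.sub(lambda match: fragments[match.group(1)], page)


PAGE = (
    '<body><!--#include virtual="/includes/nav.html" -->'
    "<p>Welcome.</p>"
    '<!--#include virtual="/includes/copyright.html" --></body>'
)

expanded = expand_includes(PAGE, FRAGMENTS)
```

Changing the copyright fragment once changes every page that includes it, which is the saving the webmaster is after.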

Contributing the content: the designing processes

However, content management on an institutional scale has many wider issues, not least that most people never want to design web pages, but need to contribute content. Thus, when we want to enable people to contribute, we need to manage the process and make it 'self service'. The core processes for content management can be structured into three distinct groups.

1. Content authorship

2. Structured data

3. Dynamic 'delivery'

In publishing a database to the web, it is evident that the contents of the resulting web page are extracted database fields. An idea common to most CMSs is to extend this way of storing information to the structural and navigational components that make up the HTML web page (or other format), as well as the 'data'. A database is only one form of storage: the filesystem or an XML file can also act as a repository of such structural or formatting elements. The idea is not to store information directly in HTML, but to generate relevant output using the correct components for the required medium, thus allowing re-use of content in multiple files and in multiple file formats.

BUY OR BUILD: THE CORE ISSUES IN CONTENT MANAGEMENT

This is a common question when researching the development of a CMS. Within academia, the common answer has been to build, using open source components where possible. Building your own CMS can lead to a highly tailored solution for your needs, and institutions often have access to staff with excellent programming skills and to staff able to provide high-quality, precise information for website development. However, building such a system requires time and skills that are not always available.

'Buy and build' is a more flexible approach, so long as the product(s) you buy allow significant customisation, or have an understandable 'application programming interface' (API). Large scale CMSs are often perceived as expensive.

Care must also be taken regarding the various layers of the website building process, so that effective liaison is maintained between the thinkers and the doers.

Most of the larger scale CMSs cover these areas well, though conversion of standard documents is not usually included unless the document conforms to a very specific structure. The devolution of direct authorship (and authority) removes the 'webmaster bottleneck' and can transform a website. Where contributors know web page design tools, this is easy; however, the majority of users have no such skills and other approaches are needed. A special case here is the management of 'virtual learning environments' (VLEs). In sites with VLE components, the input and tracking of content by students should be regarded as a core part of the CMS, and not as the add-on web app usually seen on informational websites.

Think before you build

Before you invest time and/or money in a CMS, think carefully about its method of delivery to the web. Are important pages available to search engines? Can they be bookmarked? Some CMSs dynamically generate all pages on the fly. Because each page is a series of database calls, the 'URL' is not always specific to the page. A more flexible approach is to use pre-rendered files that act like flat HTML (e.g. .asp, .cfml) but contain database calls within them.

There are two layers to this type of content management:

  1. the method of building the page template, navigation and branding elements BEFORE deployment to the live website;
  2. the method of accessing the content itself AFTER deployment to the live website.

In cases where the content does not change frequently (e.g. a descriptive webpage), using database calls on the live website is inefficient. In this case it is often preferable to use flat or pre-rendered HTML. On the other hand, a staff list on a website may need hundreds of flat pages. In this case there are obvious benefits to using a single page to browse the staff database or display query-dependent results. Essentially, a CMS should allow you to mix the dynamic with the static on the live website, to give your users the most efficient service.
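The mix of static and dynamic delivery can be sketched as follows. This is a simplified illustration in Python, not the design of any particular CMS: the descriptive page is rendered once before deployment, while the staff list is built from a database query on each request. The paths, table layout, staff names and function names are all assumptions made for the example.

```python
import sqlite3
from html import escape

# Hypothetical staff database, queried at request time.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE staff (name TEXT, department TEXT)")
db.executemany(
    "INSERT INTO staff VALUES (?, ?)",
    [("A. Rao", "Library"), ("B. Singh", "Library"), ("C. Mehta", "Physics")],
)

# Rarely changing content, keyed by the path it will be served from.
STATIC_SOURCES = {"/about.html": "About the institution."}


def prerender(sources):
    """Build flat HTML once, BEFORE deployment to the live website."""
    return {
        path: f"<html><body><p>{escape(text)}</p></body></html>"
        for path, text in sources.items()
    }


def staff_page(department):
    """Build one page per request from a live query, AFTER deployment."""
    rows = db.execute(
        "SELECT name FROM staff WHERE department = ? ORDER BY name", (department,)
    ).fetchall()
    items = "".join(f"<li>{escape(name)}</li>" for (name,) in rows)
    return f"<html><body><ul>{items}</ul></body></html>"


flat_pages = prerender(STATIC_SOURCES)


def handle(path, query=None):
    """Serve flat pages where they exist; fall back to the dynamic page."""
    if path in flat_pages:
        return flat_pages[path]
    if path == "/staff":
        return staff_page(query or "")
    return "<html><body><p>Not found</p></body></html>"
```

Here the descriptive page costs no database call when it is requested, and a single dynamic page stands in for what would otherwise be one flat page per department.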

CONCLUSION:

As long as there has been communication, there has been content. And, while there have been various methods for managing that content, the discipline of Content Management has only surfaced recently with the advent of electronic publications, specifically the Web. Thus the discipline of Content Management has not fully matured. That said, Content Management, as we know it today, is understandable, useful, and truly critical for anyone who wishes to communicate on a large scale. Fortunately, with knowledge of the core concepts described in this paper, combined with the wide variety of content management system options, the information beast can be tamed today. With Content Management, information can be delivered in a manner that produces a richer, more timely, more targeted experience for audiences, and a rational, cost-effective process for the publisher.

The above 'process flow' is from analysis of informational corporate / e-commerce web sites. In the preparation for this paper, I had to rely on the 'literature' available, so many of the features and processes described come from the e-commerce or corporate sectors. Similar strategies can be further developed to manage the content of the academic/ educational web sites.

Educational institutions also face a different priority. The authoring stage needs to be a 'self-service' part of the website, feeding information back to the core CMS. This feature is also important for institutional Intranets.
