Metadata
Why we need Metadata
The library is the traditional repository where people expect to find most of the
information they need. Library cataloguing is the method of organizing bibliographic information about library
collections to facilitate their identification, location, access, and use.
But the Internet has suddenly become a major mechanism for connecting users to a
tremendous amount of digital resources directly from their own desktops. The Web itself
has become a huge free information database, a virtual library, and a path to
resources held in other databases on the network.
Along with this excitement comes a problem: most Internet
information is poorly organized and unstable. It is difficult for ordinary users to search
or browse this huge wealth of information efficiently. There is an increasing demand for a cataloguing
mechanism for Internet information.
Several questions arise when facing this challenge:
Are the methods of organizing digital collections similar to those of organizing
traditional library materials? Do we need new descriptive tools and techniques to catalogue
digital resources? What type of standard should we deploy? Will the descriptions be created and maintained
by professional librarians or by ordinary users? What skills will be needed for this work,
and how can users acquire these skills?
Metadata could be the main, if not the only, way to answer those questions
in the digital world. It helps provide the needed descriptions for electronic resources,
decreases users' search and discovery time, and, most importantly, increases
the recall and precision of retrieval.
What is Metadata?
The most common definition of Metadata is "data about
data." The ALA Committee on
Cataloging: Description and Access lists 27 definitions
of Metadata from different groups and projects [1]. Organizations
very commonly begin by creating their
own digital collections according to the specific
structure or format of the resources inside. But there is
now strong demand for sharing those digital resources
to save time and investment. Users who want to
retrieve and use resources from different locations
and services through a single interface can hardly
know the specific retrieval requirements of
each type of resource in advance. So there should be a
mechanism that removes the barriers between different
systems, provides common search terms, access points,
and data structures, and makes the differences between systems
transparent to end users.
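
One way to picture such a mechanism is a crosswalk that maps each system's local field names onto a shared set of search terms. The sketch below is illustrative only: the two collections, their field names, and the `to_common` and `search` helpers are hypothetical and are not taken from any particular standard.

```python
# Hypothetical local field names for two collections (assumptions for illustration).
CROSSWALKS = {
    "photo_archive": {"caption": "title", "photographer": "creator", "taken": "date"},
    "report_library": {"heading": "title", "author": "creator", "issued": "date"},
}


def to_common(system: str, record: dict) -> dict:
    """Translate a local record into the shared field names."""
    mapping = CROSSWALKS[system]
    # Fields without a mapping are dropped rather than guessed at.
    return {mapping[k]: v for k, v in record.items() if k in mapping}


def search(records: list[dict], field: str, term: str) -> list[dict]:
    """Case-insensitive substring search on one shared field."""
    term = term.lower()
    return [r for r in records if term in str(r.get(field, "")).lower()]


photo = {"caption": "Harbour at dawn", "photographer": "A. Lee", "taken": "1998", "film": "35mm"}
report = {"heading": "Harbour traffic survey", "author": "B. Cruz", "issued": "2001"}

# After translation, one query works across both collections.
unified = [to_common("photo_archive", photo), to_common("report_library", report)]
hits = search(unified, "title", "harbour")
print([h["creator"] for h in hits])
```

The user searches on `title` without needing to know that one collection calls it `caption` and the other `heading`; the differences stay inside the crosswalk table.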
Metadata is typically designed to describe
the bibliographic information and data structure of resources
across different systems, so that one item can be distinguished from
another and users can easily locate, access, and transfer
data between systems. According to Prairie
Village, the term "metadata" has been used
since the early 1980s by the computer software and systems
development community to describe the information required
to document the characteristics of information contained
within databases. Today, however, the resources described
by Metadata are not limited to documents; they include
text, images, audio, video, and any other format.
A typical Metadata standard should
have a complete element set, in which each element represents
one property of the resource described (normally called the entity);
a structure that defines how these elements are
organized; and a controlled vocabulary that defines the names
and values of the elements. This sounds very familiar to librarians
and cataloguers, since the MARC standard that is widely
used in the library community can be treated as an example of
Metadata. The concept of Metadata is spreading from
library science to other domains such as computer science,
and is becoming commonly accepted. Since the distribution of
the first draft version of the Text
Encoding Initiative (TEI) Guidelines in 1990, there
have been a number of Metadata standards or schemas created,
covering a wide range of communities. Some of these metadata
schemas are general, such as MARC
or the Dublin Core,
designed for electronic resources used in various disciplines.
Other metadata schemas deal with a specific discipline
or domain, such as the Government
Information Locator Service (GILS) and the FGDC
Content Standard for Digital Geospatial Metadata.
The most common feature among these schemas
is that they all use a set of defined elements to describe
the properties of individual entities. As Vellucci
indicated, there are three basic characteristics common
to all metadata schemes:
1. syntax
2. semantics
3. structure
To identify each entity within a collection clearly, each element can have
one or more qualifiers, which provide additional semantics for
the element's values and strengthen authority control so that
different entities can be distinguished.
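
As a rough illustration of how an element set, qualifiers, and a controlled vocabulary fit together, the sketch below validates a small record. The element names, the allowed qualifiers, and the type vocabulary are all assumptions made up for this example; they loosely resemble the style of general-purpose schemas but are not copied from any of them.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical element set, qualifiers, and vocabulary (assumptions for illustration).
ELEMENTS = {"title", "creator", "date", "type"}
ALLOWED_QUALIFIERS = {"date": {"created", "modified"}, "title": {"alternative"}}
TYPE_VOCABULARY = {"text", "image", "audio", "video"}


@dataclass(frozen=True)
class Statement:
    element: str                      # which property of the entity is described
    value: str                        # the value recorded for that property
    qualifier: Optional[str] = None   # optional refinement of the element's meaning


def validate(stmt: Statement) -> list:
    """Return a list of problems; an empty list means the statement is valid."""
    problems = []
    if stmt.element not in ELEMENTS:
        problems.append(f"unknown element: {stmt.element}")
    allowed = ALLOWED_QUALIFIERS.get(stmt.element, set())
    if stmt.qualifier is not None and stmt.qualifier not in allowed:
        problems.append(f"qualifier {stmt.qualifier!r} not allowed on {stmt.element}")
    if stmt.element == "type" and stmt.value not in TYPE_VOCABULARY:
        problems.append(f"value {stmt.value!r} not in type vocabulary")
    return problems


record = [
    Statement("title", "Harbour at dawn"),
    Statement("date", "1998-05-02", qualifier="created"),
    Statement("date", "2003-11-17", qualifier="modified"),
    Statement("type", "image"),
]
errors = [p for s in record for p in validate(s)]
print(errors)
```

Here the two `date` statements would be ambiguous without their qualifiers; `created` and `modified` supply the extra semantics that tell them apart.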
In other respects, however, the schemas differ considerably: in the number of elements they define,
in how those elements describe the entity, and in how they are
structured. Besides the traditional descriptive function,
some schemas also include administrative and
structural metadata elements to manage the entity, the links
between entities, and the access points and access rights
to the entities. Because metadata is independent of the entity
it describes, the metadata elements can be stored together
with the entity in the same file, or separately as a document
in another physical location.
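
The two storage options can be sketched as follows. This is a minimal illustration under stated assumptions: the first-line JSON header, the `.meta.json` sidecar suffix, and the helper names are hypothetical conventions invented for the example, not part of any metadata standard.

```python
import json
import tempfile
from pathlib import Path

metadata = {"title": "Harbour traffic survey", "creator": "B. Cruz"}
body = "Survey text goes here.\n"

workdir = Path(tempfile.mkdtemp())

# Option 1: embedded. The first line of the file holds the metadata as JSON.
embedded = workdir / "survey_embedded.txt"
embedded.write_text(json.dumps(metadata) + "\n" + body, encoding="utf-8")

# Option 2: separate. The resource is untouched; a sidecar file describes it.
resource = workdir / "survey.txt"
resource.write_text(body, encoding="utf-8")
sidecar = workdir / "survey.txt.meta.json"
sidecar.write_text(json.dumps(metadata), encoding="utf-8")


def read_embedded(path: Path):
    """Split an embedded file into (metadata, content)."""
    header, _, rest = path.read_text(encoding="utf-8").partition("\n")
    return json.loads(header), rest


def read_sidecar(path: Path) -> dict:
    """Load the metadata stored next to the resource."""
    return json.loads(Path(str(path) + ".meta.json").read_text(encoding="utf-8"))


meta_a, text_a = read_embedded(embedded)
meta_b = read_sidecar(resource)
print(meta_a == meta_b)
```

Embedding keeps the description and the resource from drifting apart, while a separate record leaves the resource unchanged and lets the metadata be updated or harvested on its own.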