Branka Kostic
The Categorisation of Ethnomusicological Data in Multimedia Databases
The persistent problem of categorising ethnomusicological data also applies to contemporary digital multimedia bases. These bases, which allow different types of data to be searched and connected, owe their existence to contemporary digital technology.
The same technology opens new perspectives in the archiving and searching of ethnomusicological data. Determining the structure of each database and defining its elements is therefore of crucial importance to its future use and value. This essay focuses on that problem: the categorisation of ethnomusicological data in multimedia bases.
This interest originates from the practical knowledge we gained while structuring the multimedia database of the Firfov Collection. This fairly large project, which is still in progress, began at the School of Music in Skopje in 2001 (Kostic 2001, 2002).
The arrival of the digitised recording data and other data from this collection demanded a clearly defined methodology and a precise definition of the data fields entering the multimedia base. Once the architecture of the base and its components had been set and worked through, the entry of the data began (Kostic 2002:80).
While defining the macrostructure of this multimedia base, two things were taken into consideration:
- the features of the archived material;
- the technical abilities and capacities of the medium in which the archiving will take place.
Accordingly, the Firfov Collection multimedia base contains the following sections:
- audio section (where primary data of strictly sonic nature is stored);
- text section (where secondary data of descriptive (textual) nature is stored);
- graphic section (where secondary data of visual nature is stored).
Even though it was initially thought that all the sections of the base were of equal importance, the audio section is de facto its basic part and the reason for its formation. The remaining sections are merely its attributes, or arise from analytical steps in the processing of the archived material. In this new approach, guided by the recommendations of IASA 2001, the data can be classified into primary (data that refers exclusively to the audio, i.e. the strictly sonic aspects of the works) and secondary (the remaining data, i.e. metadata for the primary data). According to IASA 2001, the secondary data can take many forms (text, music and video graphics), and together with the primary data forms the concept of ‘cultural inheritance’. In some cases the secondary data is part of the work itself (for example, the label of a CD), while in others it requires additional compiling. The importance of the secondary data depends on the content, the type of carrier and the future needs of the users, i.e. the use (IASA 2001).
Because the categorisation into primary and secondary data developed from the practical use of archives of cultural inheritance, we developed this concept further by adding a part for analytical data. Even though the Firfov archives contain data with analytical features, such data is generally placed in the secondary data group. Besides archiving, however, a theoretical interest in the processing of primary data persists. This indicated a need to create a third category of data, known as tertiary data (Kostic 2002:64-5).
Given the size of our methodological task, our interest has so far been focused solely on defining the fields of the textual section of the database. The next part of this essay presents the most important findings and conclusions made during the process of entering the textual data in the Firfov Collection.
The categories of textual data
The fields of the textual bases were defined in response to the need to organise them in a way that would allow broad searching and identification of the entered material.
Because the textual data is essentially part of the secondary data, it is also a type of metadata. In digital archiving, metadata means data about data, i.e. a detailed and specific expansion of cataloguing practice (IASA 2001; Buzarovski 2002). According to IASA, metadata plays a vital part in the use and control of digital collections. Its preservation should therefore be a key component in the handling of any digital collection.*
When defining the fields of the textual bases, one must keep in mind their compatibility and their ability to be converted into other formats, for example for network use (such as the Internet). We therefore decided to start from the existing systems for the global standardisation of metadata. We selected the Dublin Core as the most widely used and accepted system.
The Dublin Core system was developed by the Dublin Core Metadata Initiative (DCMI). Its central task is to develop a modus operandi which will ease the searching of data in the systems of artificial intelligence (www.purl.oclc.org/metadata/dublin_core). The elements of the Dublin Core comply with the standards of vertical specific semantic information on the WEB-based resources.
Thus the definition for the Dublin Core:
“Metadata used to supplement existing methods for searching and indexing WEB-based metadata, regardless of whether the corresponding resource is an electronic document or a ‘real’ physical object”.
For this purpose, a set of 15 elements was produced, the Dublin Core Metadata Element Set (DCMES), which are essentially descriptive semantic definitions (www.purl.org/metadata/dublin_core_elements). The elements are designed to suit a wide range of fields and purposes, and they are mainly general. This makes it easy to convert other bases, including those with many more elements, into the Dublin Core. In that case, one field of the Dublin Core incorporates several fields from the other base.
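The many-to-one conversion described above can be sketched as a simple crosswalk table. This is only an illustration under stated assumptions: the 15 element names are those of the Dublin Core Metadata Element Set, but the local field names, the mapping choices and the function `to_dublin_core` are hypothetical and are not taken from the Firfov Collection base.

```python
from collections import defaultdict

# The 15 element names of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical crosswalk: local field name -> Dublin Core element.
# Both the local names and the mapping choices are assumptions.
CROSSWALK = {
    "audio_file_id": "identifier",
    "author": "creator",
    "arrangement": "contributor",
    "performer": "contributor",
    "language": "language",
    "copyright": "rights",
}


def to_dublin_core(record):
    """Collapse a detailed local record into Dublin Core elements.

    Several local fields may land in the same element, so each
    element holds a list of values. Unmapped fields are skipped.
    """
    dc = defaultdict(list)
    for field_name, value in record.items():
        element = CROSSWALK.get(field_name)
        if element is not None and value:
            dc[element].append(value)
    return dict(dc)


if __name__ == "__main__":
    # Placeholder values, not real archive data.
    local_record = {
        "audio_file_id": "AF-001",
        "author": "Author A",
        "arrangement": "Arranger B",
        "performer": "Performer C",
    }
    print(to_dublin_core(local_record))
```

In this sketch the two local fields for arrangement and performer both end up under `contributor`, which is the kind of merging that a general element set makes possible. Going the other way, from Dublin Core back to the detailed fields, would lose that distinction.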
With this as our starting point, we proceeded to define the structure of the textual base of the Firfov Collection, leaving aside the question of its software format for the time being.
Initially, the entire textual base of secondary data was conceived as a single file. Experience showed the need to divide it into several parts, which allowed us to avoid entering the same data more than once. Thanks to the computer format, these parts can be linked to one another, i.e. data can be read from one part into another when the need arises.
Hence our structural division of the textual section of the multimedia base:
- a secondary data file referring to the audio-files;
- a secondary data file for the persons whose names appear in the base;
- a tertiary analytical data file for the archived works.
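The way separate files avoid repeated entry of the same data can be sketched as follows. This is a minimal illustration, assuming that each recording stores only an identifier that points into the file for persons. The class names, field names and sample values are hypothetical placeholders, not the actual structure of the base.

```python
from dataclasses import dataclass


@dataclass
class Person:
    """One row of the (hypothetical) file for persons."""
    person_id: int
    name: str
    place_of_birth: str


@dataclass
class Recording:
    """One row of the (hypothetical) file for audio files."""
    audio_file_id: str
    title: str
    performer_id: int  # link into the persons file, not a copy of the data


def performer_of(recording, persons):
    """Follow the link from a recording to the person record."""
    return persons[recording.performer_id]


# Placeholder data: the performer is entered once and referenced twice.
persons = {1: Person(1, "Performer A", "Village X")}
recordings = [
    Recording("AF-001", "Song one", 1),
    Recording("AF-002", "Song two", 1),
]

if __name__ == "__main__":
    for rec in recordings:
        print(rec.title, "-", performer_of(rec, persons).name)
```

Because both recordings point to the same person record, a correction to the performer's details needs to be made in only one place.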
When defining the names of the fields, we bore in mind that English is unquestionably the lingua franca of today's world. We therefore decided to use it exclusively in the creation of the base.
As already mentioned with regard to the tertiary data, the development of that file was postponed, whereas the structure of the first three types of files was fully worked out.
The secondary data file for digitised recordings
The part which contains the data on the audio files comprises 18 categories. The fields cover different types of data in different formats: numbers, dates, names, titles, etc. In order to economise on space, some data was coded.
The first field defines the identification number of the corresponding audio file. Three fields follow which determine its time components: the times of beginning and end, the duration, and markers. Markers are of particular importance because they allow fast searching and identification of specific works, their parts, etc.
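The relationship between the time fields can be sketched in a few lines. This is only an illustration: the timestamp format (`MM:SS` or `HH:MM:SS`), the class `TimeComponents` and the marker labels are assumptions, since the text does not specify how these fields are encoded.

```python
from dataclasses import dataclass, field


def to_seconds(timestamp):
    """Convert 'MM:SS' or 'HH:MM:SS' into a number of seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


@dataclass
class TimeComponents:
    """Hypothetical time fields of one piece within an audio file."""
    start: str
    end: str
    markers: dict = field(default_factory=dict)  # label -> timestamp

    @property
    def duration(self):
        """Duration in seconds, derived from start and end."""
        return to_seconds(self.end) - to_seconds(self.start)

    def marker_position(self, label):
        """Position of a named marker, in seconds from the file start."""
        return to_seconds(self.markers[label])


if __name__ == "__main__":
    piece = TimeComponents("00:30", "03:45", {"second verse": "01:10"})
    print(piece.duration)                         # 195
    print(piece.marker_position("second verse"))  # 70
```

In this sketch the duration is computed rather than stored, which is one possible design. The base described here keeps it as a field of its own.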
The following two fields, textual beginning and melodic beginning, also relate to identification, since they allow a specific piece to be defined precisely and distinguished from other pieces or variants.
The eighth field covers data about the language in which the vocal and vocal-instrumental pieces are performed.
The following three categories (9 to 11), Author, Arrangement and Performer, concern the so-called creative subject, i.e. the individuals who took part in the creation of the pieces. As already mentioned, this type of file does not process the data about the individuals who are in some way connected to the archived material (such as authors, recorders, researchers, etc.). Its main purpose is to present the basic data from which links to the files containing data about the individuals can be created. Nevertheless, the need to define the more important data more accurately resulted in the development of several subcategories. For instance, the subcategory of age was introduced in order to record the age of the performer at the time of the performance.
The twelfth field, Original recording, provides through its subcategories information about the details of the original recordings, i.e. the digitised and archived materials. As far as the methods of recording are concerned, we drew on the experience of Dr. Dietrich Schueller of the Phonogram Archive in Vienna.
According to Schueller, the gathering of primary data, i.e. the process of recording itself, may be conducted in an explorative or a documented manner (Schueller 1993:77-8). The explorative method denotes outdoor or studio recording in which the performers perform a specific task, such as a piece of work from a different genre. The documented method refers to the recording of real events (customs, festivals, etc.). Later, at the Phonogram Archive in Vienna, these two categories developed into three: explorative, actual and simulative recording. Actual recording refers to the direct recording of events (festivals, concerts, fairs, customs, etc.), whereas simulative recording refers to the performers simulating a specific event (Buzarovski 2002:9-10).
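A field with a small, closed set of values, such as the recording method, lends itself to the kind of coding used to save space. The sketch below shows one way this might look. The three category names come from the text, but the one-letter codes and the function `decode_method` are hypothetical and do not reflect the codes actually used in the base.

```python
from enum import Enum


class RecordingMethod(Enum):
    """The three recording methods, with assumed one-letter codes."""
    EXPLORATIVE = "E"  # performers carry out a task set by the recorder
    ACTUAL = "A"       # direct recording of a real event
    SIMULATIVE = "S"   # performers simulate a specific event


def decode_method(code):
    """Turn a stored code into a recording method.

    Raises ValueError for an unknown code, so that entry
    mistakes surface instead of passing silently.
    """
    return RecordingMethod(code.strip().upper())


if __name__ == "__main__":
    print(decode_method("a").name)
```

Storing the short code and decoding it on display keeps the file compact while still restricting entries to the defined categories.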
The next category of data, titled Score, refers to the score of the piece. Its subcategories contain the information about the author of the score, the editor, the publisher, its date of printing and re-printing, etc.
The following group yields complete information about the process of the digitisation of folklore recordings. This category is similar to the one which unites the data related to the original recordings, but has modifications caused by the characteristics of digital technology.
The fifteenth category in this file of the multimedia base covers the details of copyrights.
The sixteenth category, Additional Materials, gives information about the existence of extra materials in the archived collection which are not of an auditive nature.
The seventeenth category, Notes, provides space for the entry of data important for future research that has not been included in the previous fields.
The last category of this file is reserved for the names of the individuals who completed the entry of data.
The file for individuals
This section of the base covers all the subjects that in one way or another contributed to the creation, performance, recording, processing or analysis of the material in the base. It is composed of 19 categories, containing data about: the name and age of the interpreter, his/her nickname, the ensemble in which he/she takes part or which he/she directs, sex, place of birth, ethnicity, religion, native and other languages in which the individual creates, level of education, profession, place of birth, upbringing, and current place of residence of the performer, his/her parents or ancestral heritage, additional comments, special notes and data for the archiver.
Data for the digital copies
The third model covers the digital copies. It consists of seven fields, most of which are also covered in the first file. Here we find fields in which data about the authors of the pieces, the duration of the songs and their location in the collection and in the audio file is entered. In addition, data about the digital copy is incorporated (such as the type and number of the carrier of the audio file and the number of the audio file on the specific carrier).
Microsoft Excel software was used to implement the form of the textual section of the multimedia base. In designing the models we ensured that they were concise and clear for potential users.
The categorisation of ethnomusicological data in multimedia bases should be perceived as an open process which demands constant redefinition and adjustment. Bearing in mind that the Firfov Collection textual database is the first of its kind here in the field of ethnomusicology, we may expect further work on its development.
References: