Computer-Based Thesauri

NISO/ASI/ALCTS

Workshop on Electronic Thesauri: Planning for a Standard


Report on the Electronic Thesauri Workshop



DATE: November 4-5, 1999
PLACE: Embassy Suites at The Chevy Chase Pavilion

4330 Military Rd, NW

(Wisconsin Ave. at Western Ave., Friendship Heights Metro stop)

Washington, DC 20015
TELEPHONE: 202-362-9300
FAX: 202-686-3405

ABOUT THE WORKSHOP

NISO, the American Society of Indexers (ASI), and the Association for Library Collections and Technical Services (ALCTS) are convening an invitational workshop to investigate the desirability and feasibility of developing a standard for electronic thesauri. Registration for the Workshop is closed.



WORKSHOP SCOPE


The definition of "thesaurus" for purposes of this meeting is broader than that of the present standard for thesauri, ANSI/NISO Z39.19-1993 (R1998). The meeting will consider vocabularies that meet two basic criteria:

use to facilitate analysis of texts and their subsequent retrieval (or retrieval of the information which they contain); and

inclusion of a rich set of semantic relationships among their constituent terms.

The scope includes (among others): standard thesauri, subject heading lists, semantic networks, and taxonomies (Internet directories). It excludes: simple term lists, with or without equivalence relationships; lists of terms whose only relationship is that of co-occurrence in documents; and lists of terms whose primary purpose is to provide definitions (e.g., dictionaries and glossaries).
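
As an illustration only, the second criterion can be sketched in code. The sketch below assumes the relationship codes commonly used in thesaurus practice (BT, NT, RT, USE, UF); the class name `Thesaurus` and its methods are hypothetical and come from neither the workshop materials nor any standard. It records each relationship together with its reciprocal and checks whether a vocabulary has any relationship beyond equivalence, which is roughly what separates an in-scope vocabulary from a simple term list.

```python
from collections import defaultdict

# Reciprocal of each relationship code (assumed, following common practice):
# BT = broader term, NT = narrower term, RT = related term,
# USE = see preferred term, UF = used for.
RECIPROCAL = {"BT": "NT", "NT": "BT", "RT": "RT", "USE": "UF", "UF": "USE"}


class Thesaurus:
    """Hypothetical in-memory store of term relationships."""

    def __init__(self):
        # term -> relationship code -> set of related terms
        self.links = defaultdict(lambda: defaultdict(set))

    def relate(self, term, code, other):
        """Record a relationship and its reciprocal in one step."""
        if code not in RECIPROCAL:
            raise ValueError(f"unknown relationship code: {code}")
        self.links[term][code].add(other)
        self.links[other][RECIPROCAL[code]].add(term)

    def related(self, term, code):
        """Return the terms linked to `term` by `code`, sorted."""
        return sorted(self.links[term][code])

    def has_semantic_structure(self):
        """True if any link goes beyond equivalence (USE/UF)."""
        return any(
            targets
            for rels in self.links.values()
            for code, targets in rels.items()
            if code in {"BT", "NT", "RT"}
        )


thesaurus = Thesaurus()
thesaurus.relate("cats", "BT", "mammals")
thesaurus.relate("felines", "USE", "cats")
print(thesaurus.related("mammals", "NT"))
print(thesaurus.has_semantic_structure())
```

Storing the reciprocal at the time a link is recorded keeps the two directions consistent; a list holding only USE/UF links would fail the `has_semantic_structure` check, in line with the exclusion of simple term lists above.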




ISSUES TO BE CONSIDERED

Four general and four specific issues will be considered.

General issues:

The need for (and feasibility of developing) a standard that speaks to
criteria and/or methods for generating thesauri by machine-aided or
automatic means.

The need for (and feasibility of developing) a standard set of tools
which show semantic relationships among terms, as aids to text and
information analysis and retrieval.

The need for (and feasibility of developing) a standard structure that
supports a variety of electronic thesaurus displays.

The need for (and feasibility of developing) a standard that supports
interoperability protocols, structures, and/or semantics applicable to
thesauri.

Specific issues:

Prescriptions for term structure

Schemata for relationships (not limited to those of the present standard)

Structures to support both vocabulary control* and vocabulary management*

Guidance on incorporating leaf nodes*

Controlled vs. managed vocabularies

*Some working definitions are:

Vocabulary control: The process of organizing a list of indexing terms to indicate the preferred term of a group of synonyms, and to distinguish between homographs. (Modified from definition in present thesaurus standard, Z39.19-1993)


Vocabulary management: The application of rules and policies for format, structure, and acceptability of indexing terms, without prescribing acceptability of individual terms.


Leaf node: Term for a specific entity, usually based on an authority list, generally located at the narrowest (most specific) level of a taxonomy.
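
The working definitions above can be made concrete with a minimal sketch. Everything in it is an assumption for illustration: the class `ControlledVocabulary`, its method names, the use of parenthetical qualifiers to separate homographs, and the simplification that a leaf node is any term with no narrower terms (the definition above also ties leaf nodes to authority lists, which this sketch does not model).

```python
class ControlledVocabulary:
    """Hypothetical sketch of vocabulary control with leaf-node lookup."""

    def __init__(self):
        self.preferred = {}  # entry term (lowercased) -> preferred term
        self.narrower = {}   # term -> set of narrower terms

    def add_term(self, term, synonyms=(), broader=None):
        """Register a preferred term, its synonyms, and an optional parent."""
        self.preferred[term.lower()] = term
        for synonym in synonyms:
            self.preferred[synonym.lower()] = term
        self.narrower.setdefault(term, set())
        if broader is not None:
            self.narrower.setdefault(broader, set()).add(term)

    def resolve(self, entry):
        """Return the preferred term for an entry term, or None if unknown."""
        return self.preferred.get(entry.lower())

    def homographs(self, word):
        """Return terms that share `word` and differ only by qualifier."""
        prefix = word.lower() + " ("
        return sorted(t for t in self.narrower if t.lower().startswith(prefix))

    def leaf_nodes(self):
        """Return terms with no narrower terms (simplified leaf-node test)."""
        return sorted(t for t, kids in self.narrower.items() if not kids)


vocab = ControlledVocabulary()
vocab.add_term("Planets")
vocab.add_term("Metals")
vocab.add_term("Mercury (planet)", broader="Planets")
vocab.add_term("Mercury (element)", synonyms=["quicksilver"], broader="Metals")

print(vocab.resolve("Quicksilver"))
print(vocab.homographs("mercury"))
print(vocab.leaf_nodes())
```

Vocabulary management, by contrast, would constrain only the format of what `add_term` accepts (for example, capitalization or qualifier syntax) without deciding which individual terms are allowed.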

The meeting will be expected to develop recommendations on:

Whether a standard should be developed; and,

Assuming a standard is to be developed, its scope (kinds of products it will cover, methods of production and display), and the communities which need to be involved in the standards development process.

Such a standard, if developed, will be for tools intended for implementation in operational text analysis and information retrieval systems, not for tools designed primarily for research in system design.



AGENDA

Thursday, November 4, 1999

1:00 - 1:30    Introduction (Margie Hlava)

1:30 - 2:30    Keynote (Jim Anderson)

2:30 - 3:00    Break

3:00 - 4:00    Presentation and discussion of Issue 1 (Joyce Ward):

The need for (and feasibility of developing) a standard that speaks to criteria and/or methods for generating thesauri by machine-aided or automatic means.

4:00 - 5:00    Presentation and discussion of Issue 2 (Dagobert Soergel):

The need for (and feasibility of developing) a standard set of tools which show semantic relationships among terms, as aids to text and information analysis and retrieval.

Friday, November 5, 1999

8:00am      Continental breakfast

8:30 - 9:00    Recap of work to this point

9:00 - 10:00   Presentation and discussion of Issue 3 (Eric Johnson):

The need for (and feasibility of developing) a standard structure that supports a variety of electronic thesaurus displays.

10:00 - 10:30  Break

10:30 - 11:30  Presentation and discussion of Issue 4 (John Kunze):

The need for (and feasibility of developing) a standard that supports interoperability protocols, structures, and/or semantics applicable to thesauri.

11:30 - 12:30  Workshop breakout groups:

1. Prescriptions for term structure (Stuart Nelson)
2. Schemata for relationships (not limited to those of the present standard) (Diane Vizine-Goetz)
3. Structures to support both vocabulary control and vocabulary management (Gail Hodge)
4. Guidance on incorporating leaf nodes (Joseph Busch)

12:30 - 1:30   Lunch

1:30 - 2:30    Reporting on workshop breakout groups

2:30 - 3:00    General discussion

3:00 - 3:30    Break

3:30 - 5:00    General discussion (continued), wrap-up, recommendations to NISO



ELECTRONIC THESAURI: PLANNING FOR A STANDARD
BACKGROUND READING

Buckland, Michael. "Mapping Entry Vocabulary to Unfamiliar Metadata Vocabularies." D-Lib Magazine 5, no. 1, January 1999. (http://www.dlib.org/dlib/january99/buckland/01buckland.html)



Cochrane, Pauline A., and Eric Johnson, eds. Visualizing Subject Access for 21st Century Information Resources, Proceedings of the 1997 Annual Clinic on Library Applications of Data Processing. Urbana-Champaign, IL: Graduate School of Library and Information Science, University of Illinois, 1998. (especially papers by Allen, Belkin, Busch, Dubin, Johnson, Liddy, Milstead, and Vizine-Goetz)



Fellbaum, Christiane, ed. WordNet. Cambridge, MA: MIT Press, 1998.
Reviewed by Dagobert Soergel in D-Lib:
http://www.dlib.org/dlib/october98/10bookreview.html


Hodge, Gail M., and Jessica L. Milstead. Computer Support to Indexing.
Philadelphia: NFAIS, 1998. (esp. chapters 3 and 5)


Kramer, Ralf, Ralf Nikolai, and Corinna Habeck. "Thesaurus federations:
loosely integrated thesauri for document retrieval in networks based
on Internet technologies." International Journal on Digital Libraries
1(2):122-131, September 1997.


Liddy, Elizabeth D. "Enhanced Text Retrieval Using Natural Language
Processing." Bulletin of the American Society for Information
Science 24(4):14-16, April/May 1998.


Meta Data Coalition. "Knowledge Management Model - Knowledge
Descriptions." July 15, 1999.
(http://www.mdcinfo.com/OIM/models/KDM.html)


Milstead, Jessica L. "Use of Thesauri in the Full-Text Environment."
September 1998. (http://www.jelem.com/full.htm)
(updated version of paper in Cochrane & Johnson)


Murray-Rust, P., and L. West. "Terminology in a Global Context: VHG
and XML Part II." (http://www.vhg.org.uk/pub/vhgnews2.html)


National Information Standards Organization. Guidelines for the
Construction, Format, and Management of Monolingual Thesauri.
Bethesda, MD: NISO Press, 1994. 69p. (ANSI/NISO Z39.19-1993 R1998)


Nikolai, R., A. Traupe, and R. Kramer. "Thesaurus Federations:
A Framework for the Flexible Integration of Heterogeneous,
Autonomous Thesauri." In ADL '98: Advances in Digital Libraries,
Proceedings of the IEEE International Forum on Research and
Technology. New York: IEEE Computer Society, 1998. p. 46-55.


Olson, Tony, and Gary Strawn. "Mapping the LCSH and MeSH Systems."
Information Technology and Libraries 16(1):5-15, March 1997.


Shapiro, Celia D., and Puck-Fai Yan. "Generous Tools: Thesauri in Digital
Libraries." In 17th National Online Meeting. Proceedings. Medford, NJ:
Information Today, 1996. p. 323-332.


Schatz, Bruce R., Eric H. Johnson, Pauline A. Cochrane, and Hsinchun
Chen. "Interactive Term Suggestion for Users of Digital Libraries:
Using Subject Thesauri and Co-occurrence Lists for Information
Retrieval." In DL '96: Proceedings of the 1st ACM International
Conference on Digital Libraries, March 20-23, 1996, Bethesda, MD.
New York: ACM, 1996. p. 126-133.
(http://www.acm.org/pubs/contents/proceedings/dl/226931/index.html)


Taylor, Mike, et al. "Zthes: a Z39.50 Profile for Thesaurus Navigation."
July 26, 1999.
(http://lcweb.loc.gov/z3950/agency/profiles/zthes-03.html)




ROSTER OF PARTICIPANTS

Workshop Organizer:

Jessica Milstead (JELEM Company)

Planning Committee:

Joseph Busch (DataFusion)
Peter Ciuffetti (SilverPlatter KnowledgeCite Library)
Margie Hlava (Access Innovations)
Gail Hodge (Information International Associates)
Nancy Knight (NISO)
Kate Mertes
Stuart Nelson (NLM)
Geraldine Ostrove (Library of Congress)
Diane Vizine-Goetz (OCLC)
Joyce Ward (Northern Light)

Participants:

Jim Anderson (Rutgers)
Karen Anspach (EOS International)
Denise Bedford (World Bank)
Terri Bernhardt (APA)
Jean Bowers (NTIS)
Mike Casey (Kluwer Academic Publishers)
Lois Chan (University of Kentucky)
Pauline Cochrane (University of Illinois)
Barbara E. Cohen (Consultant)
Bruce Croft (CIIR, University of Massachusetts)
Cindy Cunningham (amazon.com)
John Dickert (DTIC)
Emily Fayen (Rowecom)
Gregory Grazevich (Modern Language Association)
Pat Harris (NISO)
Quinn Hart (CERES project)
Linda Hill (Alexandria Digital Library Project, UCSB)
Susanne Humphrey (Lister Hill Center)
Eric Johnson (University of Illinois)
Bob Keating (AOL)
Pat Kuhr (H.W. Wilson Company)
John Kunze (University of California-San Francisco)
Elizabeth Liddy (Syracuse University)
Ruthanne Lowe (Cisco Systems)
Dorothy McGarry (UCLA)
Dee Andy Michel (Consultant)
Tim Miller (Derwent London)
Joan Mitchell (OCLC Forest Press)
Pat Molholt (Columbia University)
Deana Parks (National Agricultural Library)
Jose Perez-Carballo (Rutgers)
Joshua Powers (Semio Corp.)
Roberta Rand (National Agricultural Library)
Lydia Reid (NARA)
Carlen Ruschoff (Georgetown University)
Etta Russell (Center for Army Lessons Learned)
Dagobert Soergel (University of Maryland)
Arlene Taylor (University of Pittsburgh)
Rick Thoroughgood (DTIC)
Tami Trotter (ExLibris)
Mark Tuttle (Lexical)
Claude Vogel (Semio Corp.)
Alvin Walker (American Psychological Association)
Bella Hass Weinberg (St. John's University)
Marcia Lei Zeng (Kent State University)
