Planet Cataloging

August 24, 2016

Coyle's InFormation

Catalogs and Context: Part V

This entire series is available as a single file on my web site.

Before we can posit any solutions to the problems that I have noted in these posts, we need to at least know what questions we are trying to answer. To me, the main question is:

What should happen between the search box and the bibliographic display?

Or as Pauline Cochrane asked: "Why should a user ever enter a search term that does not provide a link to the syndetic apparatus and a suggestion about how to proceed?"[1] I really like the "suggestion about how to proceed" that she included there. Although I can think of some exceptions, I do consider this an important question.

If you took a course in reference work at library school (and perhaps such a thing is no longer taught - I don't know), then you learned a technique called "the reference interview." The Wikipedia article on this is not bad, and defines the concept as an interaction at the reference desk "in which the librarian responds to the user's initial explanation of his or her information need by first attempting to clarify that need and then by directing the user to appropriate information resources." The assumption of the reference interview is that the user arrives at the library with either an ill-formed query, or one that is not easily translated to the library's sources. Bill Katz's textbook "Introduction to Reference Work" makes the point bluntly:

"Be skeptical of the information the patron presents" [2]

If we're so skeptical that the user could approach the library with the correct search in mind/hand, then why do we think that giving the user a search box in which to put that poorly thought out or badly formulated search is a solution? This is another mind-boggler to me.

So back to our question, what SHOULD happen between the search box and the bibliographic display? This is not an easy question, and it will not have a simple answer. Part of the difficulty of the answer is that there will not be one single right answer. Another difficulty is that we won't know a right answer until we try it, give it some time, open it up for tweaking, and carefully observe. That's the kind of thing that Google does when they make changes in their interface, but we have neither Google's money nor its network (we depend on vendor systems, which define what we can and cannot do with our catalog).

Since I don't have answers (I don't even have all of the questions), I'll pose some questions, but I really want input from any of you who have ideas on this, since your ideas are likely to be better informed than mine. What do we want to know about this problem and its possible solutions?

(Some of) Karen's Questions

Why have we stopped evolving subject access?

Is it that keyword access is simply easier for users to understand? Did the technology deceive us into thinking that a "syndetic apparatus" is unnecessary? Why have the cataloging rules and bibliographic description been given so much more of our profession's time and development resources than subject access has? [3]

Is it too late to introduce knowledge organization to today's users?

The user of today is very different to the user of pre-computer times. Some of our users have never used a catalog with an obvious knowledge organization structure that they must/can navigate. Would they find such a structure intrusive? Or would they suddenly discover what they had been missing all along? [4]

Can we successfully use the subject access that we already have in library records?

Some of the comments in the articles organized by Cochrane in my previous post were about problems in the Library of Congress Subject Headings (LCSH), in particular that the relationships between headings were incomplete and perhaps poorly designed.[5] Since LCSH is what we have as headings, could we make them better? Another criticism was the sparsity of "see" references, once dictated by the difficulty of updating LCSH. Can this be ameliorated? Crowdsourced? Localized?

We still do not have machine-readable versions of the Library of Congress Classification (LCC), and the machine-readable Dewey Decimal Classification (DDC) has been taken off-line (and may be subject to licensing). Could we make use of LCC/DDC for knowledge navigation if they were available as machine-readable files?

Given that both LCSH and LCC/DDC have elements of post-composition and are primarily instructions for subject catalogers, could they be modified for end-user searching, or do we need to develop a different instrument altogether?

How can we measure success?

Without Google's user laboratory apparatus, the answer to this may be: we can't. At least, we cannot expect to have a definitive measure. How terrible would it be to continue to do as we do today and provide what we can, and presume that it is better than nothing? Would we really see, for example, a rise in use of library catalogs that would confirm that we have done "the right thing?"


Notes

[1]*Modern Subject Access in the Online Age: Lesson 3
Author(s): Pauline A. Cochrane, Marcia J. Bates, Margaret Beckman, Hans H. Wellisch, Sanford Berman, Toni Petersen and Stephen E. Wiberley, Jr.
Source: American Libraries, Vol. 15, No. 4 (Apr., 1984), pp. 250-252, 254-255
Stable URL: http://www.jstor.org/stable/25626708

[2] Katz, Bill. Introduction to Reference Work: Reference Services and Reference Processes. New York: McGraw-Hill, 1992. p. 82 http://www.worldcat.org/oclc/928951754. Cited in: Brown, Stephanie Willen. The Reference Interview: Theories and Practice. Library Philosophy and Practice 2008. ISSN 1522-0222

[3] One answer, although it doesn't explain everything, is economic: the cataloging rules are published by the professional association and are a revenue stream for it. That provides an incentive to create new editions of rules. There is no economic gain in making updates to the LCSH. As for the classifications, the big problem there is that they are permanently glued onto the physical volumes making retroactive changes prohibitive. Even changes to descriptive cataloging must be moderated so as to minimize disruption to existing catalogs, which we saw happen during the development of RDA, but with some adjustments the new and the old have been made to coexist in our catalogs.

[4] Note that there are a few places online, in particular Wikipedia, where there is a mild semblance of organized knowledge and with which users are generally familiar. It's not the same as the structure that we have in subject headings and classification, but users are prompted to select pre-formed headings, with a keyword search being secondary.

[5] Simon Spero did a now famous (infamous?) analysis of LCSH's structure that started with Biology and ended with Doorbells.

by Karen Coyle (noreply@blogger.com) at August 24, 2016 09:03 PM

Catalogs and Content: an Interlude

"Editor's note. Providing subject access to information is one of the most important professional services of librarians; yet, it has been overshadowed in recent years by AACR2, MARC, and other developments in the bibliographic organization of information resources. Subject access deserves more attention, especially now that results are pouring in from studies of online catalog use in libraries."
American Libraries, Vol. 15, No. 2 (Feb., 1984), pp. 80-83

Having thought and written about the transition from card catalogs to online catalogs, I began to do some digging in the library literature, and struck gold. In 1984, Pauline Atherton Cochrane, one of the great thinkers in library land, organized a six-part "continuing education" series to bring librarians up to date on the thinking regarding the transition to new technology. (Dear ALA - please put these together into a downloadable PDF for open access. It could make a difference.) What is revealed here is both stunning and disheartening, as the quote above shows; in terms of catalog models, very little progress has been made, and we are still spending more time organizing atomistic bibliographic data while ignoring subject access.

The articles are primarily made up of statements by key library thinkers of the time, many of whom you will recognize. Some responses contradict each other, others fall into familiar grooves. Library of Congress is criticized for not moving faster into the future, much as it is today, and yet respondents admit that the general dependency on LC makes any kind of fast turn-around of changes difficult. Some of the desiderata have been achieved, but not the overhaul of subject access in the library catalog.

The Background

If you think that libraries moved from card catalogs to online catalogs in order to serve users better, think again. Like other organizations that had a data management function, libraries in the late 20th century were reaching the limits of what could be done with analog technology. In fact, as Cochrane points out, by the mid-point of that century libraries had given up on the basic catalog function of providing cross references from unused to used terminology, as well as from broader and narrower terms in the subject thesaurus. It simply wasn't possible to keep up with these, not to mention that although the Library of Congress and service organizations like OCLC provided ready-printed cards for bibliographic entries, they did not provide the related reference cards. What libraries did (and I remember this from my undergraduate years) is they placed near the card catalog copies of the "Red Book". This was the printed Library of Congress Subject Heading list, which by my time was in two huge volumes, and, yes, was bound in red. Note that this was the volume that was intended for cataloging librarians who were formulating subject headings for their collections. It was never intended for the end-users of the catalog. The notation ("x", "xx", "sa") was far from intuitive. In addition, for those users who managed to follow the references, it pointed them to the appropriate place in LCSH, but not necessarily in the catalog of the library in which they were searching. Thus a user could be sent to an entry that simply did not exist.

[Image: The "Red Book" today]

From my own experience, when we brought up the online catalog at the University of California, the larger libraries had for years had difficulty keeping the card catalog up to date. The main library at the University of California at Berkeley regularly ran from 100,000 to 150,000 cards behind in filing into the catalog, which filled two enormous halls. That meant that a book would be represented in the catalog about three months after it had been cataloged and shelved. For a research library, this was a disaster. And Berkeley was not unusual in this respect.

Computerization of the catalog was both a necessary practical solution and a kind of holy grail. At the time that these articles were written, only a few large libraries had an online catalog, and that catalog represented only a recent portion of the library's holdings. (Retrospective conversion of the older physical card catalog to machine-readable form came later, culminating in the 1990's.) Abstracting and indexing databases (DIALOG, PRECIS, and others) had preceded libraries in automating, and these gave librarians their first experience in searching computerized bibliographic data.

This was the state of things when Cochrane presented her 6-part "continuing education" series in American Libraries.

Subject Access

The series of articles was stimulated by an astonishingly prescient article by Marcia Bates in 1977. In that article she articulates both concerns and possibilities that, quite frankly, we should all take to heart today. In Lesson 3 of Cochrane's articles, Bates is quoted from 1977 as saying:
"...with automation, we have the opportunity to introduce many access points to a given book. We can now use a subject approach... that allows the naive user, unconscious of and uninterested in the complexities of synonymy and vocabulary control, to blunder on to desired subjects, to be guided, without realizing it, by a redundant but carefully controlled subject access system." 
and
"And now is the time to change -- indeed, with MARC already so highly developed, past time. If we simply transfer the austerity-based LC subject heading approach to expensive computer systems, then we have used our computers merely to embalm the constraints that were imposed on library systems back before typewriters came into use!"

This emphasis on subject access was one of the stimuli for the AL lessons. In the early 1980's, studies done at OCLC and elsewhere showed that over 50% of the searches being done in the online catalogs of that day were subject searches, even those going against title indexes or mixed indexes. (See footnotes to Lesson 3.) Known item searching was assumed to be under control, but subject searching posed significant problems. Comments in the article include:
"...we have not yet built into our online systems much of the structure for subject access that is already present in subject cataloging. That structure is internal and known by the person analyzing the work; it needs to be external and known by the person seeking the work."
"Why should a user ever enter a search term that does not provide a link to the syndetic apparatus and a suggestion about how to proceed?"
Interestingly, I don't see that any of these problems has been solved in today's systems.

As a quick review, here are some of the problems, some proposed solutions, and some hope for future technologies that are presented by the thinkers that contributed to the lessons.

Problems noted

Many problems were surfaced, some with fairly simple solutions, others that we still struggle with.
  • LCSH is awkward, if not nearly unusable, both for its vocabulary and for the lack of a true hierarchical organization
  • Online catalogs' use of LCSH lacks syndetic structure (see, see also, BT, NT). This is true not only for display but also in retrieval: a search on a broader term does not retrieve items with a narrower term (which would be logical to at least some users)
  • Libraries assign too few subject headings
  • For the first time, some users are not in the library while searching, so there are no intermediaries (e.g. reference librarians) available. (One of the flow diagrams has a failed search pointing to a box called "see librarian", something we would not think to include today.)
  • Lack of a professional theory of information seeking behavior that would inform systems design. ("Without a blueprint of how most people want to search, we will continue to force them to search the way we want to search." Lesson 5)
  • Information overload, aka overly large results, as well as too few results on specific searches
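
The retrieval problem in the second item above (a search on a broader term does not retrieve items under narrower terms, and an unused term leads nowhere) can be illustrated with a rough sketch. Everything in it is hypothetical: the tiny thesaurus, the "see" reference, the record ids and the function names are invented for illustration and are not taken from LCSH or from any catalog system.

```python
# Hypothetical mini-thesaurus; the terms and links are invented for illustration.
NARROWER = {
    "Pets": ["Cats", "Dogs"],
    "Cats": ["Cat breeds"],
}
# "See" references lead from an unused term to the heading actually used.
SEE = {"Felines": "Cats"}

# Hypothetical records, each listing its assigned subject headings.
RECORDS = {
    "record-1": ["Pets"],
    "record-2": ["Cats"],
    "record-3": ["Cat breeds"],
    "record-4": ["Dogs"],
}


def expand(term):
    """Return the used heading for `term` plus all of its narrower terms."""
    start = SEE.get(term, term)  # follow a "see" reference if there is one
    found, stack = [], [start]
    while stack:
        heading = stack.pop()
        if heading not in found:
            found.append(heading)
            stack.extend(NARROWER.get(heading, []))
    return found


def search(term, use_syndetic_structure):
    """Return ids of records whose headings match the (optionally expanded) term."""
    wanted = expand(term) if use_syndetic_structure else [term]
    return sorted(rid for rid, heads in RECORDS.items()
                  if any(h in wanted for h in heads))


print(search("Cats", use_syndetic_structure=False))     # ['record-2']
print(search("Cats", use_syndetic_structure=True))      # ['record-2', 'record-3']
print(search("Felines", use_syndetic_structure=False))  # []
print(search("Felines", use_syndetic_structure=True))   # ['record-2', 'record-3']
```

Without the expansion step the unused term retrieves nothing, and the system offers no suggestion about how to proceed, which is the situation the lessons complain about.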

Proposed solutions

Some proposed solutions were mundane (add more subject headings to records) while others would require great disruption to the library environment.
  • Add more subject headings to MARC records
  • Use keyword searching, including keywords anywhere in the record.
  • Add uncontrolled keywords to the records.
  • Make the subject authority file machine-readable and integrate it into online catalogs.
  • Forget LCSH, instead use non-library bibliographic files for subject searching, such as A&I databases.
  • Add subject terms from non-library sources to the library catalog, and/or do (what today we call) federated searching
  • LCSH must provide headings that are more specific as file sizes and retrieved sets grow (in the document, a retrieved set of 904 items was noted with an exclamation point)

Future thinking 

As is so often the case when looking to the future, some potential technologies were seen as solutions. Some of these are still seen as solutions today (e.g. artificial intelligence), while others have been achieved (storage of full text).
  • Full text searching, natural language searches, and artificial intelligence will make subject headings and classification unnecessary
  • We will have access to back-of-the-book indexes and tables of contents for searching, as well as citation indexing
  • Multi-level systems will provide different interfaces for experts and novices
  • Systems will be available 24x7, and there will be a terminal in every dorm room
  • Systems will no longer need to use stopwords
  • Storage of entire documents will become possible

End of Interlude

Although systems have allowed us to store and search full text, to combine bibliographic data from different sources, and to deliver world-wide, 24x7, we have made almost no progress in the area of subject access. There is much more to be learned from these articles, and it would be instructive to do an in-depth comparison of them to where we are today. I greatly recommend reading them; each is only a few pages long.

----- The Lessons -----

*Modern Subject Access in the Online Age: Lesson 1
by Pauline Atherton Cochrane
Source: American Libraries, Vol. 15, No. 2 (Feb., 1984), pp. 80-83
Stable URL: http://www.jstor.org/stable/25626614

*Modern Subject Access in the Online Age: Lesson 2
by Pauline A. Cochrane
Source: American Libraries, Vol. 15, No. 3 (Mar., 1984), pp. 145-148, 150
Stable URL: http://www.jstor.org/stable/25626647

*Modern Subject Access in the Online Age: Lesson 3
Author(s): Pauline A. Cochrane, Marcia J. Bates, Margaret Beckman, Hans H. Wellisch, Sanford Berman, Toni Petersen and Stephen E. Wiberley, Jr.
Source: American Libraries, Vol. 15, No. 4 (Apr., 1984), pp. 250-252, 254-255
Stable URL: http://www.jstor.org/stable/25626708

*Modern Subject Access in the Online Age: Lesson 4
Author(s): Pauline A. Cochrane, Carol Mandel, William Mischo, Shirley Harper, Michael Buckland, Mary K. D. Pietris, Lucia J. Rather and Fred E. Croxton
Source: American Libraries, Vol. 15, No. 5 (May, 1984), pp. 336-339
Stable URL: http://www.jstor.org/stable/25626747

*Modern Subject Access in the Online Age: Lesson 5
Author(s): Pauline A. Cochrane, Charles Bourne, Tamas Doczkocs, Jeffrey C. Griffith, F. Wilfrid Lancaster, William R. Nugent and Barbara M. Preschel
Source: American Libraries, Vol. 15, No. 6 (Jun., 1984), pp. 438-441, 443
Stable URL: http://www.jstor.org/stable/25629231

*Modern Subject Access In the Online Age: Lesson 6
Author(s): Pauline A. Cochrane, Brian Aveney and Charles Hildreth
Source: American Libraries, Vol. 15, No. 7 (Jul. - Aug., 1984), pp. 527-529
Stable URL: http://www.jstor.org/stable/25629275

by Karen Coyle (noreply@blogger.com) at August 24, 2016 09:02 PM

Catalog and Context Part III

In the previous two parts, I explained that much of the knowledge context that could and should be provided by the library catalog has been lost as we moved from cards to databases as the technologies for the catalog. In this part, I want to talk about the effect of keyword searching on catalog context.

KWIC and KWOC

If you weren't at least a teenager in the 1960's you probably missed the era of KWIC and KWOC (neither a children's TV show nor a folk music duo). These meant, respectively, KeyWords In Context, and KeyWords Out of Context. These were concordance-like indexes to texts, but the first done using computers. A KWOC index would be simply a list of words and pointers (such as page numbers, since hyperlinks didn't exist yet). A KWIC index showed the keywords with a few words on either side, or rotated a phrase such that each term appeared once at the beginning of the string, and then were ordered alphabetically.

If you have the phrase "KWIC is an acronym for Key Word in Context", then your KWIC index display could look like:

 KWIC is an acronym for Key Word In Context
Key Word In Context
acronym for Key Word In Context
KWIC is an acronym for
acronym for Key Word In Context

To us today these are unattractive and not very useful, but to the first users of computers these were an exciting introduction to the possibility that one could search by any word in a text.
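
As a rough illustration of the two index types, here is a minimal sketch that builds both from the example phrase. The stopword list, the " / " separator that marks where the rotated phrase wraps around, and the function names are my own assumptions; the historical indexes had their own layouts and their own rules for which words to skip.

```python
PHRASE = "KWIC is an acronym for Key Word In Context"
# A hypothetical stopword list; real indexes chose their own.
STOPWORDS = {"is", "an", "for", "in"}


def kwoc(lines):
    """KWOC: map each keyword to the numbers of the lines that contain it."""
    index = {}
    for number, line in enumerate(lines, start=1):
        for word in line.split():
            key = word.lower()
            if key in STOPWORDS:
                continue
            pointers = index.setdefault(key, [])
            if number not in pointers:
                pointers.append(number)
    return dict(sorted(index.items()))


def kwic(line):
    """KWIC: rotate the phrase so each keyword leads once, then sort."""
    words = line.split()
    rotations = []
    for i, word in enumerate(words):
        if word.lower() in STOPWORDS:
            continue
        head = " ".join(words[i:])   # the keyword and everything after it
        tail = " ".join(words[:i])   # the words that came before the keyword
        rotations.append(head + (" / " + tail if tail else ""))
    return sorted(rotations, key=str.lower)


for entry in kwic(PHRASE):
    print(entry)
print(kwoc([PHRASE]))
```

The KWIC output has five entries, one per keyword, beginning with "acronym for Key Word In Context / KWIC is an"; the KWOC output is just the same five keywords, each pointing at line 1.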

It wasn't until the 1980's, however, that keyword searching could be applied to library catalogs.

Before Keywords, Headings


Before keyword searching, when users were navigating a linear, alphabetical index, they were faced with the very difficult task of deciding where to begin their entry into the catalog. Imagine someone looking for information on Lake Erie. That seems simple enough, but entering the catalog at L-A-K-E E-R-I-E would not actually yield all of the entries that might be relevant. Here are some headings with LAKE ERIE:

Boats and boating--Erie, Lake--Maps. 
Books and reading--Lake Erie region.
Lake Erie, Battle of, 1813.
Erie, Lake--Navigation

Note that the lake is entered under Erie, the battle under Lake, and some instances are fairly far down in the heading string. All of these headings follow rules that ensure a kind of consistency, but because users do not know those rules, the consistency here may not be visible. In any case, the difficulty for users was knowing with what terms to begin the search, which was done on left-anchored headings.
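
A left-anchored lookup of this kind can be sketched as finding a position in an alphabetically filed list. This is only an approximation under stated assumptions: the function name is hypothetical, and plain lowercase string ordering stands in for real library filing rules, which are considerably more involved.

```python
import bisect

# The four headings quoted above, filed alphabetically (simplified filing order).
HEADINGS = sorted([
    "Boats and boating--Erie, Lake--Maps.",
    "Books and reading--Lake Erie region.",
    "Lake Erie, Battle of, 1813.",
    "Erie, Lake--Navigation",
], key=str.lower)


def left_anchored(prefix):
    """Return the headings that begin with `prefix`, located by file position."""
    keys = [h.lower() for h in HEADINGS]
    start = bisect.bisect_left(keys, prefix.lower())  # the entry point in the file
    matches = []
    for heading in HEADINGS[start:]:
        if not heading.lower().startswith(prefix.lower()):
            break
        matches.append(heading)
    return matches


print(left_anchored("Lake Erie"))   # ['Lake Erie, Battle of, 1813.']
print(left_anchored("Erie, Lake"))  # ['Erie, Lake--Navigation']
```

Neither entry point reaches the two headings in which the lake appears only as a subdivision; those are found only by a user who already knows to begin at "Boats and boating" or "Books and reading".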

One might assume that finding names of people would be simple, but that is not the case either. Names can be quite complex with multiple parts that are treated differently based on a number of factors having to do with usage in different cultures:

De la Cruz, Melissa
Cervantes Saavedra, Miguel de

Because it was hard to know where to begin a search, see and see also references existed to guide the user from one form of a name or phrase to another. However, it would inflate a catalog beyond utility to include every possible entry point that a person might choose, not to mention that this would make the cataloger's job onerous. Other than the help of a good reference librarian, searching in the card catalog was a kind of hit-or-miss affair.

When we brought up the University of California online catalog in 1982, you can imagine how happy users were to learn that they could type in LAKE ERIE and retrieve every record with those terms in it regardless of the order of the terms or where in the heading they appeared. Searching was, or seemed, much simpler. Because it feels simpler, we all have tended to ignore some of the downsides of keyword searching. First, words are just strings, and in a search strings have to match (with some possible adjustment like combining singular and plural terms). So a search on "FRANCE" for all information about France would fail to retrieve other versions of that word unless the catalog did some expansion:

Cooking, French
France--Antiquities
Alps, French (France)
French--America--History
French American literature
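
To make the string-matching point concrete, here is a small sketch run against the five headings above. The expansion table and the function names are hypothetical; a real catalog would need a far larger table, or some other mechanism entirely.

```python
import re

HEADINGS = [
    "Cooking, French",
    "France--Antiquities",
    "Alps, French (France)",
    "French--America--History",
    "French American literature",
]

# Hypothetical expansion table, invented for this example.
EXPANSIONS = {"france": {"france", "french"}}


def words(text):
    """Split a heading into a set of lowercase words, dropping punctuation."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_search(term, expand=False):
    """Return headings containing `term` (or one of its expansions) as a word."""
    wanted = {term.lower()}
    if expand:
        wanted = EXPANSIONS.get(term.lower(), wanted)
    return [h for h in HEADINGS if words(h) & wanted]


print(keyword_search("FRANCE"))               # two of the five headings
print(keyword_search("FRANCE", expand=True))  # all five headings
```

The unexpanded search finds only "France--Antiquities" and "Alps, French (France)", because those are the only headings in which the literal string "france" occurs.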

The next problem is that retrieval with keywords, and especially the "keyword anywhere" search which is the most popular today, entirely misses any context that the library catalog could provide. A simple keyword search on the word "darwin" brings up a wide array of subjects, authors, and titles.

Subjects:
Darwin, Charles, 1809-1882 – Influence
Darwin, Charles, 1809-1882 — Juvenile Literature
Darwin, Charles, 1809-1882 — Comic Books, Strips, Etc
Darwin Family
Java (Computer program language)
Rivers--Great Britain
Mystery Fiction
DNA Viruses — Fiction
Women Molecular Biologists — Fiction

Authors:
Darwin, Charles, 1809-1882
Darwin, Emma Wedgwood, 1808-1896
Darwin, Ian F.
Darwin, Andrew
Teilhet, Darwin L.
Bear, Greg
Byrne, Eugene

Titles:
Darwin
Darwin; A Graphic Biography : the Really Exciting and Dramatic 
    Story of A Man Who Mostly Stayed at Home and Wrote Some Books
Darwin; Business Evolving in the Information Age
Emma Darwin, A Century of Family Letters, 1792-1896
Java Cookbook
Canals and Rivers of Britain
The Crimson Hair Murders
Darwin's Radio

It wouldn't be reasonable for us to expect a user to make sense of this, because quite honestly it does not make sense.

In the first version of the UC catalog, we required users to select a search heading type, such as AU, TI, SU. That may have lessened the "false drops" from keyword searches, but it did not eliminate them. In this example, using a title or subject search the user still would have retrieved items with the subjects DNA Viruses — Fiction, and Women Molecular Biologists — Fiction, and an author search would have brought up both Java Cookbook and Canals and Rivers of Britain. One could see an opportunity for serendipity here, but it's not clear that it would balance out the confusion and frustration.
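
The effect of choosing a search type can be sketched with three records assembled from the lists above. The pairing of authors, titles and subjects into records is my assumption for the purpose of illustration, and the field codes and function names are hypothetical rather than those of the actual UC system.

```python
import re

# Three records built from the lists above; the pairing of fields is assumed.
RECORDS = [
    {"AU": "Darwin, Ian F.", "TI": "Java Cookbook",
     "SU": ["Java (Computer program language)"]},
    {"AU": "Darwin, Andrew", "TI": "Canals and Rivers of Britain",
     "SU": ["Rivers--Great Britain"]},
    {"AU": "Bear, Greg", "TI": "Darwin's Radio",
     "SU": ["DNA Viruses--Fiction", "Women Molecular Biologists--Fiction"]},
]


def words(value):
    """Return the set of lowercase words in a string or a list of strings."""
    values = value if isinstance(value, list) else [value]
    return {w for v in values for w in re.findall(r"[a-z]+", v.lower())}


def search(term, index=None):
    """Return titles of records with `term` in one index (AU, TI or SU),
    or in any field when no index is chosen ("keyword anywhere")."""
    fields = [index] if index else ["AU", "TI", "SU"]
    return [r["TI"] for r in RECORDS
            if any(term.lower() in words(r[f]) for f in fields)]


print(search("darwin", "AU"))  # ['Java Cookbook', 'Canals and Rivers of Britain']
print(search("darwin", "TI"))  # ["Darwin's Radio"]
print(search("darwin"))        # all three titles
```

Typing the search narrows the result, but each typed result is still about Java, canals or fictional viruses rather than about Charles Darwin.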

You may be thinking right now "But Google uses keyword searching and the results are good." Note that Google now relies heavily on Wikipedia and other online reference books to provide relevant results. Wikipedia is a knowledge organization system, organized by people, and it often has a default answer for search that is more likely to match the user's assumptions. A search on the single word "darwin" brings up:

In fact, Google has always relied on humans to organize the web by following the hyperlinks that they create. Although the initial mechanism of the search is a keyword search, Google's forte is in massaging the raw keyword result to bring potentially relevant pages to the top. 

Keywords, Concluded

The move from headings to databases to un-typed keyword searching has all but eliminated the visibility and utility of headings in the catalog. The single search box has become the norm for library catalogs and many users have never experienced the catalog as an organized system of headings. Default displays are short and show only a few essential fields, mainly author, title and date. This means that there may even be users who are unaware that there is a system of headings in the catalog.

Recent work in cataloging, from ISBD to FRBR to RDA and BIBFRAME, focuses on modifications to the bibliographic record, but does nothing to model the catalog as a whole. With these efforts, the organized knowledge system that was the catalog is slipping further into the background. And yet, we have no concerted effort taking place to remedy this.

What is most astonishing to me, though, is that catalogers continue to create headings, painstakingly, sincerely, in spite of the fact that they are not used as intended in library systems, and have not been used in that way since the first library systems were developed over 30 years ago. The headings are fodder for the keyword search, but no more so than a simple set of tags would be. The headings never perform the organizing function for which they were intended. 

Next


Part IV will look at some attempts to create knowledge context from current catalog data, and will present some questions that need to be answered if we are to address the quality of the catalog as a knowledge system.

by Karen Coyle (noreply@blogger.com) at August 24, 2016 09:02 PM

Catalog and Context, Part II

In the previous post, I talked about book and card catalogs, and how they existed as a heading layer over the bibliographic description representing library holdings. In this post, I will talk about what changed when that same data was stored in database management systems and delivered to users on a computer screen.

Taking a very simple example, in the card catalog a single library holding with author, title and one subject becomes three separate entries, one for each heading. These are filed alphabetically in their respective places in the catalog.

In this sense, the catalog is composed of cards for headings that have attached to them the related bibliographic description. Most items in the library are represented more than once in the library catalog. The catalog is a catalog of headings.

In most computer-based catalogs, the relationship between headings and bibliographic data is reversed: the record, with bibliographic and heading data, is stored once; access points, analogous to the headings of the card catalog, are extracted to indexes that all point to the single record.
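
The reversal can be shown with a minimal data-structure sketch. The record content is a placeholder and the variable names are my own; this illustrates the two shapes of the data and makes no claim about how any particular library system is implemented.

```python
from collections import defaultdict

# One hypothetical holding; author, title and subject are placeholders.
record = {
    "id": 1,
    "author": "Author, A.",
    "title": "A title",
    "subject": "A subject",
    "description": "brief bibliographic description",
}
FIELDS = ("author", "title", "subject")

# Card catalog model: one card per heading, each carrying the description,
# all filed alphabetically by heading.
cards = sorted((record[field], record["description"]) for field in FIELDS)

# Database model: the record is stored once, keyed by its id...
records = {record["id"]: record}
# ...and each access point is extracted to an index that points back to it.
indexes = defaultdict(lambda: defaultdict(list))
for field in FIELDS:
    indexes[field][record[field]].append(record["id"])

print(len(cards))                       # 3
print(len(records))                     # 1
print(indexes["subject"]["A subject"])  # [1]
```

In the first structure the headings are the catalog, and the user reads through them. In the second they are lookup keys, and nothing requires the system to show them to anyone.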

This in itself could be just a minor change in the mechanism of the catalog, but in fact it turns out to be more than that.

First, the indexes of the database system are not visible to the user. This is the opposite of the card catalog where the entry points were what the user saw and navigated through. Those entry points, at their best, served as a knowledge organization system that gave the user a context for the headings. Those headings suggested topics to users once they found a starting point in the catalog.

When this system worked well for the user, she had some understanding of where she was in the virtual library that the catalog created. This context could be a subject area or it could be a bibliographic context such as the editions of a work.

Most, if not all, online catalogs do not present the catalog as a linear, alphabetically ordered list of headings. Database management technology encourages the use of searching rather than linear browsing. Even if one searches in headings as a left-anchored string of characters, a search results in a retrieved set of matching entries, not a point in an alphabetical list. There is no way to navigate to nearby entries. The bibliographic data is therefore not provided either in the context or the order of the catalog. After a search on "cat breeds" the user sees a screenful of bibliographic records that lack context, because most default displays do not show the user the headings or text that caused the item to be retrieved.

Although each of these items has a subject heading containing the words "Cat breeds" the order of the entries is not the subject order. The subject headings in the first few records read, in order:

  1. Cat breed
  2. Cat breeds
  3. Cat breeds - History
  4. Cat breeds - Handbooks, manuals, etc.
  5. Cat breeds
  6. Cat breeds - Thailand
  7. Cat breeds

Even if the catalog uses a visible and logical order, like alphabetical by author and title, or most recent by date, there is no way from the displayed list for the user to get the sense of "where am I?" that was provided by the catalog of headings.

In the early 1980's, when I was working on the University of California's first online catalog, the catalogers immediately noted this as a problem. They would have wanted the retrieved set to be displayed as:

(Note how much this resembles the book catalog shown in Part I.) At the time, and perhaps still today, there were technical barriers to such a display, mainly because of limitations on the sorting of large retrieved sets. (Large, at that time, was anything over a few hundred items.) Another issue was that any bibliographic record could be retrieved more than once in a single retrieved set, and presenting the records more than once in the display, given the database design, would be tricky. I don't know if starting afresh today some of these features would be easier to produce, but the pattern of search and display seems not to have progressed greatly from those first catalogs.
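
One possible reading of a heading-ordered display is sketched below. It is a guess at the general idea, not a reconstruction of what the catalogers asked for: the record ids and the function name are hypothetical, the headings are borrowed from the "cat breeds" example, and the matching is a simple substring test.

```python
from collections import defaultdict

# Hypothetical retrieved set; the ids are invented, the headings come from
# the "cat breeds" example above.
RETRIEVED = [
    {"id": "rec-1", "subjects": ["Cat breeds - History", "Cat breeds"]},
    {"id": "rec-2", "subjects": ["Cat breeds - Thailand"]},
    {"id": "rec-3", "subjects": ["Cat breeds"]},
]


def heading_display(records, query):
    """Group records under each matching subject heading, in heading order."""
    wanted = query.lower().split()
    by_heading = defaultdict(list)
    for rec in records:
        for heading in rec["subjects"]:
            # Simple substring test: every query word must occur in the heading.
            if all(word in heading.lower() for word in wanted):
                by_heading[heading].append(rec["id"])
    return sorted(by_heading.items(), key=lambda item: item[0].lower())


for heading, ids in heading_display(RETRIEVED, "cat breeds"):
    print(heading)
    for rec_id in ids:
        print("    " + rec_id)
```

Here rec-1 is listed twice, once under "Cat breeds" and once under "Cat breeds - History", which is the repeated-record situation described above. Sorting three headings is trivial; the difficulty at the time was doing the same for retrieved sets of hundreds of items.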

In addition, it is in any case questionable whether a set of bibliographic items retrieved from a database on some query would reproduce the presumably coherent context of the catalog. This is especially true because of the third major difference between the card catalog and the computer catalog: the ability to search on individual words in the bibliographic record rather than being limited to seeking on full left-anchored headings. The move to keyword searching was both a boon and a bane because it was a major factor in the loss of context in the library catalog.

Keyword searching will be the main topic of Part III of this series.


by Karen Coyle (noreply@blogger.com) at August 24, 2016 09:01 PM

Catalog and Context, Part I

This multi-part post is based on a talk I gave in June, 2016 at ELAG in Copenhagen.

Imagine that you do a search in your GPS system and are given the exact point of the address, but nothing more.

Without some context showing where on the planet the point exists, having the exact location, while accurate, is not useful.



In essence, this is what we provide to users of our catalogs. They do a search and we reply with bibliographic items that meet the letter of that search, but with no context about where those items fit into any knowledge map.

Because we present the catalog as a retrieval tool for unrelated items, users have come to see the library catalog as nothing more than a tool for known item searching. They do not see it as a place to explore topics or to find related works. The catalog wasn't always just a known item finding tool, however. To understand how it came to be one, we need a short visit to Catalogs Past.

Catalogs Past


We can't really compare the library catalog of today to the early book catalogs, since the problem that they had to solve was quite different to what we have today. However, those catalogs can show us what a library catalog was originally meant to be.

A book catalog was a compendium of entry points, mainly authors but in some cases also titles and subjects. The bibliographic data was kept quite brief as every character in the catalog was a cost in terms of type-setting and page real estate. The headings dominated the catalog, and it was only through headings that a user could approach the bibliographic holdings of the library. An alphabetical author list is not much "knowledge organization", but the headings provided an ordered layer over the library's holdings, and were also the only access mechanism to them.

Some of the early card catalogs had separate cards for headings and for bibliographic data. If entries in the catalog had to be hand-written (or later typed) onto cards, the easiest thing was to slot the cards into the catalog behind the appropriate heading without adding heading data to the card itself.

Often there was only one card with a full bibliographic description, and that was the "main entry" card. All other cards were references to a point in the catalog, for example the author's name, where more information could be found.

Again, all bibliographic data was subordinate to a layer of headings that made up the catalog. We can debate how intellectually accurate or useful that heading layer was, but there is no doubt that it was the only entry to the content of the library.

The Printed Card


In 1902 the Library of Congress began printing cards that could be purchased by libraries. The idea was genius. For each item cataloged by LC a card was printed in as many copies as needed. Libraries could buy the number of catalog card "blanks" they required to create all of the entries in their catalogs. The libraries would use as many as needed of the printed cards and type (or write) the desired headings onto the top of the card. Each of these would have the full bibliographic information - an advantage for users who then would no longer need to follow "see" references from headings to the one full entry card in the catalog.


These cards introduced something else that was new: the card would have at the bottom a tracing of the headings that LC was using in its own catalog. This was a savings for the libraries as they could copy LC's practice without incurring their own catalogers' time. This card, for the first time, combined both bibliographic information and heading tracings in a single "record", with the bibliographic information on the card being an entry point to the headings.


Machine-Readable Card Printing


The MAchine Readable Cataloging (MARC) project of the Library of Congress was a major upgrade to card printing technology. By including all of the information needed for card printing in a computer-processable record, LC could take advantage of new technology to stream-line its card production process, and even move into a kind of "print on demand" model. The MARC record was designed to have all of the information needed to print the set of cards for a book; author, title, subjects, and added entries were all included in the record, as well as some additional information that could be used to generate reports such as "new acquisitions" lists.

Here again the bibliographic information and the heading information were together in a single unit, and it even followed the card printing convention of the order of the entries, with the bibliographic description at top, followed by headings. With the MARC record, it was possible to not only print sets of cards, but to actually print the headers on the cards, so that when libraries received a set they were ready to go into the catalog at their respective places.
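The idea of one record driving a whole set of cards can be sketched as follows. This is a loose Python illustration, not MARC: the dictionary keys, the sample record, the card layout, and the function name `card_set` are all hypothetical simplifications chosen for the example.

```python
# Hypothetical, much-simplified record; a real MARC record uses
# numbered fields and subfields rather than these named keys.
record = {
    "author": "Doe, Jane",
    "title": "Keeping Bees",
    "imprint": "Example Press, 1970",
    "subjects": ["Bees"],
    "added_entries": ["Roe, Richard"],
}

def card_set(rec):
    """Return one card per heading, each with the full description and tracing."""
    description = f'{rec["title"]} / {rec["author"]}. {rec["imprint"]}'
    # The tracing at the bottom of every card lists the other headings.
    tracing = "; ".join(rec["subjects"] + rec["added_entries"])
    headings = [rec["author"]] + rec["subjects"] + rec["added_entries"]
    # Heading printed at the top, then the description, then the tracing.
    return [f"{heading}\n{description}\n{tracing}" for heading in headings]

for card in card_set(record):
    print(card)
    print("-" * 40)
```

Each card in the set carries the same description and tracing and differs only in the heading printed at the top, which is what lets the cards be filed at their respective places without further typing.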

Next, we'll look at the conversion from printed cards to catalogs using database technology.

-> Part II

by Karen Coyle (noreply@blogger.com) at August 24, 2016 09:01 PM

OCLC Next

Celebrating 45 years of WorldCat

Ohio University’s Alden Library was the first library to use WorldCat to catalog a book online. It was August 26, 1971, the day the OCLC Online Union Catalog and Shared Cataloging System began operation. Catalogers at Ohio University cataloged 133 books online from a single terminal that day.

Our contribution and participation in the creation of WorldCat with the submission of the first record is an incredible legacy and an incredible part of our history. And what WorldCat has become in the 45 years since is just as extraordinary. It speaks to the dedication and the hard work of librarians everywhere.

I know firsthand that sense of dedication.

A window into worldwide scholarship

In the early 1980s, while I was working on my MLS, I worked at OCLC. At that time, OCLC ran a retrospective conversion operation, and libraries from all over the world would send shelflist cards to OCLC in Dublin, Ohio, USA. I was a part of a small team of conversion experts who went card by card, searched the database, and either added holdings to an existing record or created a new record for those shelflist materials. In the OCLC database, there are probably tens of thousands of records with my initials on them that I’ve created or upgraded or added holdings to.

It was an extraordinary experience, a window into worldwide scholarship, because libraries had collected all of this incredible material. To go through it card by card, record by record, gave me a real insight into the way the database was built—a living database that’s constantly being added to, constantly being upgraded. It was a wonderful experience.

Sharing the world’s collected knowledge

There are several sets of materials I remember vividly. Probably the most distinctive was material from the Free Library of Philadelphia. We were sent cataloging cards, shelflist cards that had been created in the late 1700s and early 1800s. They were handwritten in an old-fashioned librarian’s script. The cards would nearly crumble in your hand as you handled them, they were that old.




Another set of materials I remember were Bach scores and sound recordings, particularly the cantatas. For several weeks, I simply attached holdings to these materials as part of a conversion job for a library. After working on the cantatas and searching the database for several weeks, I sat down and finally just went to a library, checked out the Bach cantatas, and sat and listened through them all in one weekend. It was my introduction to Bach, and I’ve been a Bach lover ever since.

This week, as we mark the anniversary of this wonderful database, what I think is important to remember about WorldCat is how WorldCat has shaped and continues to shape scholarship.

To have the world’s knowledge cataloged and made available is just an extraordinary miracle. And to have been the first contributor is almost unthinkable. It’s such an honor.

Guest contributor Scott Seaman is Dean of Libraries, Ohio University.

Question…What unique items has your library added to WorldCat? What are your favorite collections in WorldCat? Share your answers with hashtag #OCLCnext

The post Celebrating 45 years of WorldCat appeared first on OCLC Next.

by Scott Seaman at August 24, 2016 02:05 PM

August 23, 2016

TSLL TechScans

Getting to Know TS Law Librarians: Eric Parker



1. Introduce yourself
I'm Eric Parker, Associate Director for Collection and Bibliographic Services at the Pritzker Legal Research Center, Northwestern University Pritzker School of Law.

2. Does your job title actually describe what you do? Why/why not?
I believe it actually describes the day-to-day more or less. Essentially, I head up the technical services operation at our library. It maybe doesn't clearly indicate some of my responsibilities, such as convening/coordinating the library's selection team, and running certain projects. This past year, we've been doing some weeding, off-site storage, and reclassification projects for the print collection.

3. What are you reading right now?
I am currently reading "City of Fortune" by Roger Crowley. It's about the history of Venice as a maritime and imperial power in the late Middle Ages, and has been fascinating.

4. You suddenly have a free day at work, what project would you work on?
It's hard to imagine what having a free day at work would be like! I'm sure many readers can relate. I'd really like to take the time to get more involved with the library's marketing and outreach efforts again. I had been somewhat involved with them for a while, but have had to step back from them in order to devote attention to the projects mentioned above. 

by noreply@blogger.com (Lauren Seney) at August 23, 2016 06:24 PM

August 21, 2016

Coyle's InFormation

Wikipedia and the numbers fallacy

One of the main attempts at solutions to the lack of women on Wikipedia is to encourage more women to come to Wikipedia and edit. The idea is that greater numbers of women on Wikipedia will result in greater equality on the platform; that there will be more information about women and women's issues, and a hoped for "civilizing influence" on the brutish culture.

This argument is so obviously specious that it is hard for me to imagine that it is being put forth by educated and intelligent people. Women are not a minority - we are around 52% of the world's population and, with a few pockets of exception, we are culturally, politically, sexually, and financially oppressed throughout the planet. If numbers created more equality, where is that equality for women?

The "woman problem" is not numerical and it cannot be solved with numbers. The problem is cultural; we know this because attacks against women can be traced to culture, not numbers: the brutal rapes in India, the harassment of German women by recently arrived immigrant men at the Hamburg railway station on New Year's Eve, the racist and sexist attacks on Leslie Jones on Twitter -- none of these can be explained by numbers. In fact, the stats show that over 60% of Twitter users are female, and yet Jones was horribly attacked. Gamergate arose at a time when the number of women in gaming is quite high, with data varying from 40% to over 50% of gamers being women. Women gamers are attacked not because there are too few of them, and there does not appear to be any safety in numbers.

The numbers argument is not only provably false, it is dangerous if mis-applied. Would women be safer walking home alone at night if we encouraged more women to do it?  Would having more women at frat parties reduce the rape culture on campus? Would women on Wikipedia be safer if there were more of them? (The statistics from 2011 showed that 13% of editors were female. The Wikimedia Foundation had a goal to increase the number to 25% by 2015, but Jimmy Wales actually stated in 2015 that the number of women was closer to 10% than 25%.) I think that gamergate and Twitter show us that the numbers are not the issue.

In fact, Wikipedia's efforts may have exacerbated the problem. The very public efforts to bring more women editors into Wikipedia (there have been and are organized campaigns both for women and about women) and the addition of more articles by and about women is going to be threatening to some members of the Wikipedia culture. In a recent example, an edit-a-thon produced twelve new articles about women artists. They were immediately marked for deletion, and yet, after analysis, ten of the articles were determined to be suitable, and only two were lost. It is quite likely that twelve new articles about male scientists (Wikipedia greatly values science over art, another bias) would not have produced this reaction; in fact, they might have sailed into the encyclopedia space without a hitch. Some editors are rebelling against the addition of information about women on Wikipedia, seeing it as a kind of reverse sexism (something that came up frequently in the attack on me).

Wikipedia's culture is a "self-run" society. So was the society in the Lord of the Flies. If you are one of the people who believe that we don't need government, that individuals should just battle it out and see who wins, then Wikipedia might be for you. If, instead, you believe that we have a social obligation to provide a safe environment for people, then this self-run society is not going to be appealing. I've felt what it's like to be "Piggy" and I can tell you that it's not something I would want anyone else to go through.

I'm not saying that we do not want more women editing Wikipedia. I am saying that more women does not equate to more safety for women. The safety problem is a cultural problem, not a numbers problem. One of the big challenges is how we can define safety in an actionable way. Title IX, the US statute mandating equality of the sexes in education,  revolutionized education and education-related sports. Importantly, it comes under the civil rights area of the Department of Justice. We need a Title IX for the Internet; one that requires those providing public services to make sure that there is no discrimination based on sex. Before we can have such a solution, we need to determine how to define "non-discrimination" in that context. It's not going to be easy, but it is a pre-requisite to solving the problem.

by Karen Coyle (noreply@blogger.com) at August 21, 2016 09:44 PM

August 17, 2016

Coyle's InFormation

Classification, RDF, and promiscuous vowels

"[He] provided (i) a classified schedule of things and concepts, (ii) a series of conjunctions or 'particles' whereby the elementary terms can be combined to express composite subjects, (iii) various kinds of notational devices ... as a means of displaying relationships between terms." [1]

"By reducing the complexity of natural language to manageable sets of nouns and verbs that are well-defined and unambiguous, sentence-like statements can be interpreted...."[2]

The "he" in the first quote is John Wilkins, and the date is 1668.[3] His goal was to create a scientifically correct language that would have one and only one term for each thing, and then would have a set of particles that would connect those things to make meaning. His one and only one term is essentially an identifier. His particles are linking elements.

The second quote is from a publication about OCLC's linked data experiments, and is about linked data, or RDF. The goals are so obviously similar that it can't be overlooked. Of course there are huge differences, not the least of which is the technology of the time.*

What I find particularly interesting about Wilkins is that he did not distinguish between classification of knowledge and language. In fact, he was creating a language, a vocabulary, that would be used to talk about the world as classified knowledge. Here we are at a distance of about 350 years, and the language basis of both his work and the abstract grammar of the semantic web share a lot of their DNA. They are probably proof of some Chomskian theory of our brain and language, but I'm really not up to reading Chomsky at this point.

The other interesting note is how similar Wilkins is to Melvil Dewey. He wanted to reform language and spelling. Here's the section where he decries alphabetization because the consonants and vowels are "promiscuously huddled together without distinction." This was a fault of language that I have not yet found noted in Dewey's work. Could he have missed some imperfection?!





*Also, Wilkins was a Bishop in the Anglican church, and so his description of the history of language is based literally on the Bible, which makes for some odd conclusions.



[1]Schulte-Albert, Hans G. Classificatory Thinking from Kinner to Wilkins: Classification and Thesaurus Construction, 1645-1668. Quoting from Vickery, B. C. "The Significance of John Wilkins in the History of Bibliographical Classification." Libri 2 (1953): 326-43.
[2]Godby, Carol J, Shenghui Wang, and Jeffrey Mixter. Library Linked Data in the Cloud: OCLC's Experiments with New Models of Resource Description. 2015.
[3] Wilkins, John. Essay Towards a Real Character, and a Philosophical Language. S.l: Printed for Sa. Gellibrand, and for John Martyn, 1668.

by Karen Coyle (noreply@blogger.com) at August 17, 2016 05:23 PM

First Thus

BIBFRAME The Objects of Cataloguing and the objects of BIBFRAME (was Re: BIBFRAME Life after MARC?)

Posting to Bibframe

On 8/17/2016 05:58, Xu, Amanda wrote:

Under of influence of the five user tasks, e.g. Find, Identify, Select, Obtain, and Navigate as defined by the ICP principles, we are only capable to give people results, not the answers to their information inquiry.

Even this is theoretical, in my experience. For instance, I have seen that the overwhelming use among searchers of our catalogs is to serve as entry points into the collection where they can browse the materials themselves. Let’s imagine that someone wants “something” by Cicero. They go to the catalog where they find his main call number, so that they can walk to the shelves where they can browse and examine the materials themselves. This is what people want to do, and why they enjoy libraries so much. They don’t understand all they are missing when they do this, and often they only get angry when you point it out. But it is when they are standing in front of the shelves where the “identifying, selecting, and obtaining” takes place, all done at once and according to their own tastes which often have very little or nothing at all to do with any information that is in the catalog. One item may be harder to read or one may fit more easily in the hand. Another may be bound too tightly.

This is even clearer with online materials, where I do a full-text search and find something that may be interesting, but I don’t even know–and cannot know–what it is until I have already obtained it. It is only after I have obtained a resource that I discover that it is a political screed from a think tank or some crazy blog post. This is how I/S/O takes place in the real world. It is possible to obtain something but deselect it before identifying it.

I have written extensively that the task of “find” or “search” is changing in fundamental ways. Many believe that “find/search” should be–and is becoming–a push technology as our intellectual and emotional lives are being subjected to ever more intense algorithmic analysis, the goal of which is to present us with choices before we are even aware that we want them. It is important to note that lots of people want this.

My own opinion is that “navigate”, i.e. “4.5. to navigate within a catalogue and beyond (that is, through the logical arrangement of bibliographic and authority data and presentation of clear ways to move about, including presentation of relationships among works, expressions, manifestations, items, persons, families, corporate bodies, concepts, objects, events, and places).” [See: http://www.ifla.org/files/assets/cataloguing/icp/icp_2009-en.pdf], is not unique but merely a variation on “find/search”.

But what you say is very true: the catalog is neither designed, nor capable of giving answers to an information inquiry. Very few people who come to the library are seeking the bibliographic information in the catalog. The information they want is inside the books, articles, maps and so on, that make up the collection.

Yet the bigger question is: are these the tasks that users really want to do? Although I have no doubt people want to do this once in a while, are these tasks the primary ones wanted by users? I ask myself if this is what I have wanted. From my experience, people want something quite different.

This is not to say that there should be no changes to MARC. Of course MARC should change and should have been the very first thing to change 25 or so years ago. But it is a pity that Bibframe is becoming so complex that I fear very few web designers will use it. Again, what is the purpose of Bibframe and who is it for? I still don’t know.


by James Weinheimer at August 17, 2016 09:34 AM

August 16, 2016

Coyle's InFormation

The case of the disappearing classification

I'm starting some research into classification in libraries (now that I have more time due to having had to drop social media from my life; see previous post). The main question I want to answer is: why did research into classification drop off at around the same time that library catalogs computerized? This timing may just be coincidence, but I'm suspecting that it isn't.

 I was in library school in 1971-72, and then again in 1978-80. In 1971 I took the required classes of cataloging (two semesters), reference, children's librarianship, library management, and an elective in law librarianship. Those are the ones I remember. There was not a computer in the place, nor do I remember anyone mentioning them in relation to libraries. I was interested in classification theory, but not much was happening around that topic in the US. In England, the Classification Research Group was very active, with folks like D.J. Foskett and Brian Vickery as mainstays of thinking about faceted classification. I wrote my first published article about a faceted classification being used by a UN agency.[1]

 In 1978 the same school had only a few traditional classes. I'd been out of the country, so the change to me was abrupt. Students learned to catalog on OCLC. (We had typed cards!) I was hired as a TA to teach people how to use DIALOG for article searching, even though I'd never seen it used, myself. (I'd already had a job as a computer programmer, so it was easy to learn the rules of DIALOG searching.) The school was now teaching "information science". Here's what that consisted of at the time: research into term frequency of texts; recall and precision; relevance ranking; database development.

I didn't appreciate it at the time, but the school had some of the bigger names in these areas, including William Cooper and M. E. "Bill" Maron. (I only just today discovered why he called himself Bill - the M. E., which is what he wrote under in academia, stands for "Melvin Earl". Even for a nerdy computer scientist, that was too much nerdity.) 1978 was still the early days of computing, at least unless you were on a military project grant or worked for the US Census Bureau. The University of California, Berkeley, did not have visible Internet access. Access to OCLC or DIALOG was via dial-up to their proprietary networks. (I hope someone has written or will write that early history of the OCLC network. For its time it must have been amazing.)

The idea that one could search actual text was exciting, but how best to do it was (and still is, to a large extent) unclear. There was one paper, although I so far have not found it, that was about relevance ranking, and was filled with mathematical formulas for calculating relevance. I was determined to understand it, and so I spent countless hours on that paper with a cheat sheet beside me so I could remember what uppercase italic R was as opposed to lower case script r. I made it through the paper to the very end, where the last paragraph read (as I recall): "Of course, there is no way to obtain a value for R[elevance], so this theory cannot be tested." I could have strangled the author (one of my profs) with my bare hands.

Looking at the articles, now, though, I see that they were prescient; or at least that they were working on the beginnings of things we now take for granted. One statement by Maron especially strikes me today:
A second objective of this paper is to show that about is, in fact, not the central concept in a theory of document retrieval. A document retrieval system ought to provide a ranked output (in response to a search query) not according to the degree that they are about the topic sought by the inquiring patron, but rather according to the probability that they will satisfy that person's information need. This paper shows how aboutness is related to probability of satisfaction.[2]
This is from 1977, and it essentially describes the basic theory behind Google ranking. It doesn't anticipate hyperlinking, of course, but it does anticipate that "about" is not the main measure of what will satisfy a searcher's need. Classification, in the traditional sense, is the quintessence of about. Is this the crux of the issue? As yet, I don't know. More to come.

[1]Coyle, Karen (1975). "A Faceted Classification for Occupational Safety and Health". Special Libraries. 66 (5-6): 256–9.
[2]Maron, M. E. (1977) "On Indexing, Retrieval, and the Meaning of About". Journal of the American Society for Information Science, January, 1977, pp. 38-43

by Karen Coyle (noreply@blogger.com) at August 16, 2016 06:14 PM

OCLC Next

Breaking the curse of knowledge

There are many experts out there on technology, customer service, management, information science and more. These experts may be deeply immersed in their efforts to explore a subject and push the boundaries of what may be possible. But bringing an expert’s deep knowledge into the context of working professionals can be a challenge.

We forget what it’s like to “not know”

Teacher/blogger Chris Reddy speaks to the “teacher curse;” that is, the challenge of helping others learn when you are well versed in a topic. He says, “Knowledge is a curse. Knowing things isn’t bad itself, but it causes unhealthy assumptions—such as forgetting how hard it was to learn those things in the first place. It’s called the Curse of Knowledge.” Chris provides some fantastic suggestions around how to thoughtfully and effectively design the learning experience for others.

Lorcan Dempsey touched on this same idea in his recent blog post “Libraries and the Curse of Knowledge,” where he discusses how the curse of knowledge can manifest itself within libraries. As he shares, “To be clearly understood, one cannot presume that potential users already understand how the library works, or is structured, or what jargon is used to describe services.”

Know where your learners are starting

Training and instructional design is all about working through the curse of knowledge and helping learners where they are, not necessarily where the trainer’s (or subject matter expert’s) knowledge ends. The most effective teachers are driven by curiosity when approaching adult learning, and will explore a lot of questions to determine the appropriate scope of training. Fundamental questions we ask when designing training are:

  • Who is the audience?
  • Where are learners starting in their work or experience?
  • What do they need to learn to move forward?
  • How do we build a meaningful learning experience?

Make learning a conversation

I enjoy working with adult learners. Through WebJunction, I help library staff navigate their learning journey and supply them with practical, tangible skills to apply at work. As an open learning community, WebJunction provides online resources, programming and learning opportunities that build the knowledge, skills and confidence that library staff need to power relevant, vibrant libraries. A program of OCLC Research, WebJunction designs and delivers transformational programs that connect public library service to community needs such as lifelong learning, health and wellness, and economic success. More than 70 percent of all US public libraries across all 50 states have participated in WebJunction programs and learning since 2003.

A benefit of helping adults learn is that we’re all working with more daily context and background than we have as children, and I can draw on learners’ often deep stores of knowledge and experiences, including ideas or solutions I’d never considered. At its best, training can be a wonderful conversation between instructor and learner, blending past experience with present need. We learn together around a topic that is relevant to the work at hand.




Repetition is a good teacher

At WebJunction, we know that learning sticks best when a learner can try something out, make mistakes and try again. We also know that reflection or self-assessment can be a powerful learning tool. We encourage learners to answer these questions for themselves and for the trainers:

  • What is at least one new thing that they learned?
  • How will they apply this new knowledge?
  • What more do they need to learn or practice?
  • What past experiences and insight can they provide for others to learn from?

During our webinars, attendees use live chat to ask questions, share personal experiences, and simply not feel alone. Through blended-learning (a combination of independent study, facilitated discussions, and group interaction), learners can review recorded sessions as much as they need, ask as many questions as they need, and use the resources they find most helpful.

Start a virtual community to spread ideas

For some of our programs, we bridge the live online training sessions with a dedicated online space for learners to reflect on the training material and their library practice, read others’ reflections, and offer or receive peer support. This kind of virtual community of practice allows for new ideas to be shared and knowledge to be spread from library to library.

When you work through the curse of your knowledge, you can help an overwhelmed learner see that:

  • It’s going to be alright.
  • We’re here to help.
  • I know you can learn this.

We can better help competent adults learn new things when we honor their existing knowledge and offer new information and skills to build upon that base.

Question…What’s your advice for escaping the curse of library knowledge when communicating with your users? How do you overcome the curse of knowledge with users who have outdated views of your library? Share your answers with hashtag #OCLCnext

The post Breaking the curse of knowledge appeared first on OCLC Next.

by Kathleen Gesinger at August 16, 2016 02:16 PM

Coyle's InFormation

This is what sexism looks like: Wikipedia

We've all heard that there are gender problems on Wikipedia. Honestly there are a lot of problems on Wikipedia, but gender disparity is one of them. Like other areas of online life, on Wikipedia there are thinly disguised and not-so thinly disguised attacks on women. I am at the moment the victim of one of those attacks.

Wikipedia runs on a set of policies that are used to help make decisions about content and to govern behavior. In a sense, this is already a very male approach, as we know from studies of boys and girls at play: boys like a sturdy set of rules, and will spend considerable time arguing whether or not rules are being followed; girls begin play without establishing a set of rules, develop agreed rules as play goes on if needed, but spend little time on discussion of rules.

If you've been on Wikipedia and have read discussions around various articles, you know that there are members of the community that like to "wiki-lawyer" - who will spend hours arguing whether something is or is not within the rules. Clearly, coming to a conclusion is not what matters; this is blunt force, nearly content-less arguing. It eats up hours of time, and yet that is how some folks choose to spend their time. There are huge screaming fights that have virtually no real meaning; it's a kind of fantasy sport.

Wiki-lawyering is frequently used to harass. It is currently going on to an amazing extent in harassment of me, although since I'm not participating, it's even emptier. The trigger was that I sent back for editing two articles about men that two wikipedians thought should not have been sent back. Given that I have reviewed nearly 4000 articles, sending back 75% of those for more work, these two are obviously not significant. What is significant, of course, is that a woman has looked at an article about a man and said: "this doesn't cut it". And that is the crux of the matter, although the only person to see that is me. It is all being discussed as violations of policy, although there are none. But sexism, as with racism, homophobia, transphobia, etc., is almost never direct (and even when it is, it is often denied). Regulating what bathrooms a person can use, or denying same-sex couples marriage, is a kind of lawyering around what the real problem is. The haters don't say "I hate transsexuals"; they just try to make them as miserable as possible by denying them basic comforts. In the past, and even the present, no one said "I don't want to hire women because I consider them inferior"; they said "I can't hire women because they just get pregnant and leave."

Because wiki-lawyering is allowed, this kind of harassment is allowed. It's now gone on for two days and the level of discourse has gotten increasingly hysterical. Other than one statement in which I said I would not engage because the issue is not policy but sexism (which no one can engage with), it has all been between the wiki-lawyers, who are working up to a lynch mob. This is gamer-gate, in action, on Wikipedia.

It's too bad. I had hopes for Wikipedia. I may have to leave. But that means one less woman editing, and we were starting to gain some ground.

The best read on this topic, mainly about how hard it is to get information that is threatening to men (aka about women) into Wikipedia: WP:THREATENING2MEN: Misogynist Infopolitics and the Hegemony of the Asshole Consensus on English Wikipedia

I have left Wikipedia, and I also had to delete my Twitter account because they started up there. I may not be very responsive on other media for a while. Thanks to everyone who has shown support, but if by any chance you come across a kinder, gentler planet available for habitation, do let me know. This one's desirability quotient is dropping fast.

by Karen Coyle (noreply@blogger.com) at August 16, 2016 03:00 PM

First Thus

BIBFRAME Life after MARC?

Posting to Bibframe

On 8/13/2016 01:01, Karen Coyle wrote:

One of the key questions, for which we do not currently have an answer, is: What are the goals of the catalog? Cutter had his, in 1876

The latest version of the International Cataloging Principles has these goals

I think these ICP principles are pretty bizarre

I completely agree, and I think it is important for librarians and catalogers to accept that these are pretty bizarre goals. At least with Charles Cutter, he made it very clear that the goals of the catalog come from the questions asked by the users and he delineated those questions. See http://tinyurl.com/3x4xpom, where he provides the questions (asked in 1876):

1st. Has the library such a book by a certain author? Have you Bell on the Brain? Have you John Brent, by Theodore Winthrop?
2d. What books by a certain author has it? What other books by Winthrop have you?
3d. Has it a book with a given title? Have you John Brent?
4th. Has it a certain book on a given subject? Have you a pamphlet on the bull-frog, by Professor - I’ve forgotten his name?
5th. What books has it on a given subject? Have you anything on glaciers? What have you on philosophy? I wish to see all the books.
6th. What books has it in a certain class of literature? What plays have you? What poems?
7th. What books have you in certain languages? What French books have you? How well provided are you with German literature?

Cutter’s catalog was created to answer most of these types of questions asked by people in 1876. While such questions are still asked today, the range of questions has become much broader–and I suspect the questions asked by the public were already much broader than those provided by Cutter in his day as well. But Cutter had to start somewhere.

An additional purpose of the catalog, and one that I think everyone can agree on, is that it serves as an inventory of (most of) the materials in the library. Beyond an inventory tool, in the early 21st century “search” is changing rapidly (Google changes its algorithm about twice a day) and people’s expectations are changing as well.

The ICP rules are just modern reinterpretations of Cutter’s purposes. It seems to me that if we want to build something for the public, then we should find out what kinds of questions they are asking and analyze them, just as Cutter did. The people who have the best knowledge of the public’s questions are reference librarians–reference librarians of all kinds: academic, public, children’s and so on.

People who are building the tools normally want to find out what the users of their tools want to do and what they expect, especially when there are tons of alternatives available that people find more attractive. (An interesting article in this regard is “The Strange Affliction of ‘Library Anxiety’ and What Librarians Do to Help” http://www.atlasobscura.com/articles/the-strange-affliction-of-library-anxiety-and-what-librarians-do-to-help, which describes an anxiety I have certainly seen myself.)

If we insist on building tools without knowing what modern society needs or wants (or we insist that we already know better) then it shouldn’t be surprising that anything we come up with will be bizarre.

by James Weinheimer at August 16, 2016 01:12 PM

August 12, 2016

First Thus

Comment to Emflix – Gone Baby Gone

Comment to Netanel Ganin’s Emflix – Gone Baby Gone, Code4Lib Journal Issue 33, 2016-07-19.

Summary:
“Enthusiasm is no replacement for experience. This article describes a tool developed at the Emerson College Library by an eager but overzealous cataloger. Attempting to enhance media-discovery in a familiar and intuitive way, he created a browseable and searchable Netflix-style interface. Though it may have been an interesting idea, many of the crucial steps that are involved in this kind of high-concept work were neglected. This article will explore and explain why the tool ultimately has not been maintained or updated, and what should have been done differently to ensure its legacy and continued use.”

Bravo! Every single development process follows a similar path, but unless you are involved you don’t know about it.

The developers will rarely tell the story. Most prefer to talk about their successes–that is, they talk about the final version of their successful projects (to be even more precise: they focus on the positive points of their most successful projects and rarely mention the bad points)–and finally, they almost never talk about the ones that wind up in the trash can, with all the distress those caused.

Such attitudes are understandable because it is painful to acknowledge and analyse problems, or it can be politically unwise to let others know. But there are consequences. Many of the conference papers we read, or the articles published about a project, should be considered more as advertising than as unbiased critique.

The author of this paper has risen above all of that, and I have nothing but praise for him. Even though I don’t like failure, my experience has been that I have learned even more from failures–my own and others’–than from successes.

by James Weinheimer at August 12, 2016 10:10 AM