Planet Cataloging

April 17, 2015

What's the point?

Last post

This is the last post for this blog.

I maintained this blog as an adjunct to my professional life and discussed only professional matters within it; and I had a second blog for all the other stuff. Now that I am retired, my life no longer falls into two halves; it is a merry mash-up of thoughts about libraries (because I still think about them from time to time) and thoughts about all sorts of other things.

And so I am going to close this blog. It has been fun, but it’s over. I hope that some of you will follow me into my new world, and carry on reading at Now and then, as and when. See you there!

by Me at April 17, 2015 11:43 AM

April 16, 2015

025.431: The Dewey blog is coming back

Thank you to our members for your recent feedback regarding unavailability of, one of the first library linked data resources when it launched several years ago. I’m excited that you find this experimental service so valuable and I’m happy to share that is coming back!

At this time I do not have an exact date of availability, but please know that our OCLC team is actively working on reliability and scalability improvements needed to bring this service back online as soon as possible. I will keep you informed and appreciate your patience as I continue to learn more.

by Michael Panzer at April 16, 2015 08:05 PM

Mod Librarian

5 Things Thursday: DAM, Getty Images, Kickstarter

Here are five things for spring:

  1. Another DAM Podcast with Dave Ginsberg of the Sundance Institute.
  2. Working with librarians on humanizing search. Check out Wonder, the search guided by librarians.
  3. Fascinating infographic on 20 years of Getty Images.
  4. A Kickstarter campaign fueled the Recirculated library podcast.
  5. DAM is a service oriented profession.

April 16, 2015 12:41 PM

April 15, 2015


VIAF RDF Changes

Here's a contribution from Jeff Young, who manages the RDF aspects of VIAF:

Since Wikidata’s introduction to the Linked Data Web in 2014 and subsequent integration of Freebase, it has become a premier example of how to publish and manage Linked Data. Like VIAF, Wikidata uses as its core RDF vocabulary and both datasets publish using Linked Data best practices. This consistency should allow applications to treat both datasets as complementary. The main difference will be in the coverage of entities/information, based on their respective sources.

The VIAF RDF changes outlined on the Developer Network blog are intended to further enrich and align the common purpose. Some of the VIAF changes provide additional information to help disambiguate entities, such as schema:location and schema:description. Where possible, schema:names are now language tagged, which should make it easier for applications to select a language-appropriate label for display.
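As a rough illustration of why language-tagged names help, here is a minimal Python sketch of how an application might pick a display label. The sample names, the `pick_label` helper, and the fallback order (preferred languages first, then an untagged name, then anything) are assumptions for illustration and are not taken from the VIAF documentation.

```python
from typing import Dict, Iterable, Optional, Sequence, Tuple

# Hypothetical sample data: (name, language tag) pairs such as an application
# might collect from the schema:name values of a single entity description.
NAMES = [
    ("Léon Tolstoï", "fr"),
    ("Leo Tolstoy", "en"),
    ("Лев Толстой", "ru"),
    ("Tolstoj, Lev", None),  # a name with no language tag
]


def pick_label(
    names: Iterable[Tuple[str, Optional[str]]],
    preferred: Sequence[str],
) -> Optional[str]:
    """Pick a display label by language preference.

    Tries each preferred language in order, then falls back to an
    untagged name, then to whichever name was seen first.
    """
    by_lang: Dict[Optional[str], str] = {}
    for value, lang in names:
        key = lang.lower() if lang else None
        by_lang.setdefault(key, value)  # keep the first name seen per language

    for lang in preferred:
        if lang.lower() in by_lang:
            return by_lang[lang.lower()]
    if None in by_lang:
        return by_lang[None]
    return next(iter(by_lang.values()), None)


if __name__ == "__main__":
    print(pick_label(NAMES, ["de", "en"]))
```

Without language tags, the application would have no principled way to choose among the four names; with them, the choice becomes a simple lookup.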

The biggest change, though, is in the “shape of the data” that gets returned via Linked Data requests. Previously, this was a record-oriented view rather than a concise description of the entity. Like Wikidata, the new response will focus on the entity itself and depend on the related entities to describe themselves.

Alignment with Wikidata is a major step in the evolution of VIAF, which started with RDF/XML representations of name authority clusters in 2009 and transitioned to “primary entities” in 2011.  The introduction of VIAF as in 2014 extends the audience and integration with Wikidata further strengthens industry standard practices. These steps should help ensure that VIAF remains an authoritative source of entity identifiers and information in the linked web of data.


Note: We expect these RDF changes to be visible on April 16, 2015.  The bulk distribution will follow shortly after that.


by Thom at April 15, 2015 02:28 PM

April 14, 2015

TSLL TechScans

Preservation Week 2015

Preservation Week is quickly approaching; this year it is the week of April 26 - May 2. Sometimes the preservation activities of an institution are not visible to the users of the library's materials, so this week is a great time to promote the activities your institution is undertaking to ensure continued access to its collections, both analog and digital.

It's also a great time to take advantage of preservation training. This year ALA is sponsoring 3 FREE webinars on different preservation topics:
  • Moving Image Preservation 101
  • Digital Preservation for Individuals and Small Groups
  • Disaster Response Q&A
There are additional preservation videos available on the ALCTS Preservation play list.

by Lauren Seney at April 14, 2015 12:58 PM

April 13, 2015

Resource Description & Access (RDA)

April 12, 2015

Terry's Worklog

Building a better MarcEdit for Mac users

This all started with a conversation over Twitter about a week ago: a discussion about why the current version of MarcEdit is so fragile when run on a Mac. The short answer has been that MarcEdit utilizes a cross-platform toolset when building the UI, which works well on Linux and Windows systems but tends to be less refined on Mac systems. I’ve known this for a while, but to really do it right, I’d need to develop a version of MarcEdit that uses native Mac APIs, which would mean building a new version of MarcEdit for the Mac (at least, the UI components). And I’ve considered it – mapped out a road map – but what’s constantly stopped me has been a lack of interest from the MarcEdit community and a lack of a Mac system. On the community side, I can count on two hands the number of times I’ve had someone request a version of MarcEdit specifically for a Mac. And since I’ve been making a Mac App version of MarcEdit available, its use has been fairly low (though this could be due to the struggles noted above). With an active community of over 20,000, I try to put my time where it will make the most impact, and up until a week ago, better support for Mac systems didn’t seem to be high on the list. The second reason is I don’t own a Mac. My technology stack is made up of about a dozen Windows and Linux systems embedded around my house because they play surprisingly well together, whereas Apple’s walled garden just doesn’t thrive within my ecosystem. So, I’ve been waiting and hoping that the cross-platform toolset would get better and that in time, this problem would eventually go away.

I’m giving that background because apparently I’ve been misreading the MarcEdit community. As I said, this all started with this conversation on Twitter – and out of that, two co-conspirators, Whitni Watkins and Francis Kayiwa, set out to see just how much interest there actually was in having a dedicated version of MarcEdit for the Mac. The two set out to see if they could raise funds to acquire a Mac to do this development and, indirectly, demonstrate that there was actually a much larger slice of the community interested in seeing this work done. And so, off they went – and I sat back and watched. I made a conscious decision that if this was going to happen, it was going to be because the community wanted it, and in that, my voice wasn’t necessary. And after 8 days, it’s done. In all, 40 individuals contributed to the campaign, but more importantly to me, I heard directly from 200+ individuals who were hopeful that this project would proceed.

Development Roadmap

Now the hard work starts. MarcEdit is a program that has been under constant development since 1999 – so even just rewriting the UI components of the application will be a significant undertaking. So, I’m breaking up this work in chunks. I figure it would take approximately 8-12 months to completely port the UI, which is a long time. Too long…so I’m breaking the development into 3-month “sprints”. The first sprint will target the 80%: the functionality that would make MarcEdit productive when doing MARC editing. This means porting the functionality for all the resources found in the MARC Tools and much of the functionality found in the MarcEditor components. My guess is these two components are the most important functional areas for catalogers – so finishing those would allow the tool to be immediately useful for doing production cataloging and editing. After that, I’ll be able to evaluate the remainder of the program and begin working on functional parity between all versions of the application.

But I’ll admit, at this point, the road map is somewhat cloudy even to me. See, I’ve written up a document and shared it with Whitni and asked her to work with other Mac users to refine the list and let me know what falls into that 80%. So, I’ll be interested to see where their list differs from my own. In the meantime, I’ll be starting work on the port – creating wireframes and spending time over the next week hitting the books and familiarizing myself with Apple’s API docs and the UI best practices (though I will be trying to keep the program looking very familiar to the current application – best practices be damned). Coding on the new UI will start in earnest around May 1 – and by August 1, 2015, I hope to have the first version built specifically for a Mac available. For those interested in following the development process – I’ll be creating a build page on the MarcEdit website and will be posting regular builds as new areas of the application are ported so that folks can try them and give feedback.

So, that’s where this stands at this point. For those interested in providing feedback, feel free to contact me directly. And for those of you who reached out or participated in the campaign to make this happen, my sincere thanks.


by reeset at April 12, 2015 04:40 AM

April 11, 2015

First Thus

ACAT Subject heading question: Transnational identity of Europeans?

On 12/03/2015 14.53, Shorten, Jay wrote:
> I’ve upgraded OCLC #904588879 with some new subject headings, but I’m wondering if I’m missing something. The book is an analysis of survey data among the 27 European Union countries as to what people think about European integration and about their transnational identity, i.e. how and whether and who thinks of themselves as Europeans besides citizens of their country. This is what I have so far:
>
> European Union $x Public opinion
> Europe $x Economic integration $x Public opinion << the original heading
> Europe $x Economic integration $x Social aspects
> Public opinion $z European Union countries
> Transnationalism [this cannot be subdivided geographically]

These are the sorts of things where I type “european identity” into WorldCat and see what pops up. Here are a few that have been used:
National characteristics, European.
Group identity — European Union countries.
Citizenship — European Union countries.
Nationalism — European Union countries.
European cooperation — Social (psychological) aspects

By the way, I think that tools could be made to help the catalogers find headings far more efficiently by being able to see them all in something like a word cloud, instead of having to open each record, scroll down, etc. It could save a load of tedium!
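A minimal Python sketch of the kind of tool suggested here: tally the subject headings across a set of search results so they can be scanned at a glance (the counts could feed a word cloud or a simple ranked list). The `RECORDS` sample, the `heading_counts` and `summarize` helpers, and the assumption that each record has already been reduced to a list of heading strings are all hypothetical; no real catalog API is used.

```python
from collections import Counter
from typing import Iterable, List

# Hypothetical input: each search result reduced to its subject heading strings.
RECORDS = [
    ["National characteristics, European",
     "Group identity -- European Union countries"],
    ["Group identity -- European Union countries",
     "Citizenship -- European Union countries"],
    ["Nationalism -- European Union countries",
     "Group identity -- European Union countries"],
]


def heading_counts(records: Iterable[Iterable[str]]) -> Counter:
    """Count in how many records each subject heading appears."""
    counts: Counter = Counter()
    for headings in records:
        counts.update(set(headings))  # count a heading once per record
    return counts


def summarize(records: Iterable[Iterable[str]], top: int = 10) -> List[str]:
    """Return 'count  heading' lines, most frequent heading first."""
    return [
        f"{n:3d}  {heading}"
        for heading, n in heading_counts(records).most_common(top)
    ]


if __name__ == "__main__":
    print("\n".join(summarize(RECORDS)))
```

The cataloger would then see the most frequently used headings first, without opening each record in turn.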

James Weinheimer
First Thus
First Thus Facebook Page
Cooperative Cataloging Rules
Cataloging Matters Podcasts


by James Weinheimer at April 11, 2015 02:42 PM

April 09, 2015

Mod Librarian

5 Things Thursday: More DAM, AI, Metadata is a Love Note...

Here are five more things:

  1. An article by my idol John Horodyski on metadata maturity. I especially like the nine metadata aspects.
  2. Ralph Windsor on artificial intelligence and metadata cataloging.
  3. Ever wondered how to identify an average metadata editing time?
  4. Read about the successes of DAM Guru Lara Hiller.
  5. What is the Museum of Forbidden Technologies?

And remember, DAM NY is fast approaching…

April 09, 2015 12:48 PM

April 08, 2015

First Thus

BIBFRAME Linked data

On 09/03/2015 0.56, Karen Coyle wrote:
> On 3/8/15 12:07 PM, James Weinheimer wrote:
>> They can sit there quite literally, all day long and not have learned anything about the War. All they see are *catalog records* and if they are to learn about the war itself, they need to get into the books of the collection.
> Great. You just described Amazon and the Apple APP store, two of the most successful sites on the web.

I don’t really understand the comment. I didn’t make a criticism of anything; I just made a statement of fact. It is a problem that users experience all the time, and it is why some students–those who actually came to trust me enough to be honest–told me that library catalogs were a huge waste of their time. Even some faculty agree. (See the excellent articles: “The-3-Click-Dilemma” in the Chronicle and its follow-up.)

Another example of a similar problem: one person (I can’t remember where, but I could find it) mentioned that she had taught an information literacy session to undergraduates, explained how the catalog worked, what was on the library’s various websites etc., and then asked them: Tell me what hours the library is open on Sunday.

Most people searched *in the catalog* “What are the hours the library is open on Sunday?” We can laugh, but what they did was completely logical and understandable. When we consider it more carefully, this represents a disaster for libraries. I am sure that these sorts of things happen all the time; people get weird results, or nothing at all, and the only conclusion the searcher can come to is that the library catalog is bad. What other conclusion could they possibly come to? In an earlier time a person might think: maybe I did something wrong. Maybe I should ask, but those days appear to be over. It seems that nobody ever asks.

Today, one text box looks like any other text box, so how is someone to know that one search works with full-text and another with “summary records” (as I called catalog records)? Understanding that these are different types of information is difficult (obviously). Merging the two types of data would only seem to me to complicate the problem further–but my mind is open and we should find out. We need to be honest in appraising the results, however, basing them on what the users think, and not on what we think.

While Amazon may be popular among the public, I haven’t met any instructors who want students to use it as a research tool.

James Weinheimer
First Thus
First Thus Facebook Page
Cooperative Cataloging Rules
Cataloging Matters Podcasts


by James Weinheimer at April 08, 2015 09:22 AM

April 07, 2015

First Thus

BIBFRAME Linked data

On 3/7/2015 9:37 PM, Martynas Jusevičius wrote:
> I find these statements hard to believe. Data is just data. Data, metadata – there is no difference.
> People are using RDF to describe proteins, semiconductor products, horoscope signs, antique coins and who knows what else. What makes you think libraries are special? Again, I mean real technical limitations – all the history and the “traditional ways of doing things” are irrelevant here.

There are different types of data, and we experience it in all kinds of ways every day. I have gone into greater detail in those podcasts and presentations I mentioned, but I’ll try to redo a little of it here. The differences are subtle, but clear.

Before I begin however, what you have claimed to be history, and traditional ways of doing things, is not history at all. Whether we like it or not, what I described is the way libraries still work. It is what users are supposed to do when they use a library, and if people don’t do it, they will get bad results. Of course, few people do it and this explains a lot of the frustration that people currently have with library catalogs.

The solution that libraries have tried is called “information literacy” and “bibliographic instruction” which, instead of fixing library tools to work in a modern environment, means to teach everybody how to use our tools the way they are. In my own opinion, this hasn’t worked and everything needs to be rethought, but what I described is not history–unfortunately it is still happening today.

About catalog data, it isn’t that it is special, but it is different from the other types of data that you point out. When someone comes to a library, they don’t come specifically to search the catalog (or at least, those that do are exceedingly rare). Instead, the vast majority are there because they have a question and want information. My example has been “What were the causes of the War of the Spanish Succession”. The catalog does not contain the information I want–the information that can answer my question is contained in the books, journal articles, and other materials in the collection–but if I use the catalog correctly, it can direct me to the resources that have the information I want. In this way, the information found in a catalog is similar to information found on … traffic signs.

If you want to drive from Rome to Paris, you need signs to help you get there. The better the signs, the better, the easier, and the more enjoyable the trip. Poor signs, or the absence of them (which happens in Italy all the time), can lead to frustration, anger or even disasters.

So, people want and need decent and reliable road signs, but they are very rarely interested in the signs themselves: who made them, where and when, what materials they are composed of and so on. Still, those in charge of the road signs need to know that information, so that they can replace them, update them, add to them, etc.

Using this same reasoning with catalogs and how things are changing, compare this with the person who is interested in the “War of the Spanish Succession” and searches the library catalog. They can sit there, quite literally, all day long and not have learned anything about the War. All they see are *catalog records* and if they are to learn about the war itself, they need to get into the books of the collection. But when they search Google, in just half an hour they have gotten some real information. This leads them to expect that library tools will work similarly to what works (apparently) so easily and simply on Google, which seems logical but is completely wrong.

Google works with a different type of information: content. Library catalogs work by giving people directional information, so even when the searcher does everything correctly, all they see are directions: for general books on the War, look here; for books on the politics, look there; for battles, look here; etc.

Those who use catalogs incorrectly are practically doomed to disaster; for them it is similar to a driver who hasn’t seen a road sign for hours, and ends up at the end of a road in the middle of a field at midnight.

Believe me, this happens to students all the time when they are researching their papers at the last minute! Both end up in tears and/or almost screaming.

Catalogers see this difference in information clearly because they work with the actual materials that people want: the books, the recordings, the maps, etc. all go through their hands. The mistake that many catalogers make (again in my opinion) is that they believe people who care about the information in the collection (i.e. who want to learn about the War of the Spanish Succession) also care about the catalog records they make. Of course for the public, these records are the equivalent of road signs that help them get where they want to go. They don’t care about the road signs and once they reach their destination, they completely forget about all the helpful road signs. I confess I remember only the frustrations and anger during the trips that had lousy signs. I think the same thing happens with catalog records.

While our methods still “work” in a sense, they are strange for people in the 21st century. They need to be, in a sense, translated so that they work in today’s environment.

So, all data is definitely not equal. I think there is still a need for our type of data but it needs to be reconsidered. Tools that work well for content data don’t work so well with directional data. And with linked data, I am very skeptical about the usefulness of mixing content data with our directional data. Nevertheless, we should try it, to find out what happens. I would be very happy to be proven wrong.

There are other options, too.

James Weinheimer
First Thus
First Thus Facebook Page
Personal Facebook Page
Google+
Cooperative Cataloging Rules
Cataloging Matters Podcasts
The Library Herald


by James Weinheimer at April 07, 2015 07:07 PM

April 06, 2015

First Thus

BIBFRAME Linked data

Posting to Bibframe

On 3/6/2015 10:59 PM, Ross Singer wrote:
I’ve spent the last 20 years working to make computers understand AACR2. Theoretically, I could be “licensed librarian” in two (full disclosure: I am not a librarian). Are you saying that as a 20 year library technologist I can’t crack the marc/aacr2 nut? If so, wouldn’t that be more damning of MARC/aacr2?

I guarantee that if you actually worked with the MARC records that are in the majority (and didn’t have a stake in a company that supplies a subset of quality MARC data) you would never suggest relying on the fixed fields for this.

Fixed fields could potentially help me limit to what’s been published in England. If I’m not interested in things published in London, Oxford, or Cambridge (which, let’s be realistic, will be the lion’s share), the burden’s on me to wade through the results to find what I want. And yet, all it takes is one interested (possibly non-library) person to identify the publishers in the West Midlands and all of the libraries in Birmingham, warwickshire, worcestershire, staffordshire, etc. has a curated set locally published resources.

Something libraries are completely unable to do currently.

These are some revealing comments. I would like to analyze them since I think they show some of the basic differences of a systems person vs. a librarian/cataloger. I have discussed this problem at some length in a podcast of mine, and also in a presentation I gave for students at La Sapienza here in Rome.

You have asked a question: what has been published in West Midlands, and then say that libraries are completely unable to answer that currently. But is that correct? Can libraries provide that answer? Yes, but they do it in a way different from how a systems person might expect.

To expect the library catalog to do it is actually using the wrong tool. The catalog was never designed to answer such questions, and (probably) never will be. So, expecting the library catalog to answer such a question is (to me as a librarian/cataloger) much the same as expecting a hammer to help you examine the rings of Saturn.

That comparison may be shocking and seem incorrect, but to expect a library catalog to do what it is not designed to do is just as shocking to the cataloger. It is unfair to expect a tool to do what it is not designed to do.

It should be clear that if you want to examine the rings of Saturn, you must use another tool–and if the hammer does not help you, you should realize you are using the wrong tool, and not conclude that the hammer is therefore worthless.

A catalog does not contain data in the normal IT sense of the word. That is probably another strange idea, but it is nevertheless a fact. It contains information (data) that will help you find the information you want. In other words, it contains directional information to what you want, but it does not contain the information itself. Let’s see how this works in reality.

Going through the process of answering your question as a reference librarian can illustrate it. If you want materials published in that area of the world, can the library catalog help?

Yes, but you need to know how to use it. There is a subject heading “Publishers and publishing” that can be subdivided geographically. There are also lots of narrower terms and a nice scope note as well.

Following these headings (i.e. by browsing), we eventually come to “Publishers and publishing–England” with further geographic subdivisions, and we should not forget “Publishers and publishing–Great Britain”. It is very possible that the “data” you want is in some of these sources.

For movies, there is the subject heading “Motion picture producers and directors” that can also be subdivided geographically. So yes, libraries very definitely can do what you want. And they do it–every single day. Your question is not at all unique nor especially difficult. But it takes the entire library to do it and to focus only on the catalog is incorrect, and unfair.

Then, let’s add the idea of a new tool you suggested, i.e. the example you give of the “… one interested (possibly non-library) person to identify the publishers in the West Midlands and all of the libraries in Birmingham, warwickshire, worcestershire, staffordshire, etc. has a curated set locally published resources”. How does the library handle this?

It would be handled in the following way. His/her work would be selected by a library selector to ensure quality, and then the cataloger would make a record, and at least one subject would be under “Publishers and publishing” with the appropriate geographic subdivision. The cataloger would not add the actual information from that work into the catalog itself–because the catalog was never designed to work that way.

To be honest, I am not that great of a reference librarian–there are many others who are much better than I am. They may be able to help in better ways than I can, but I at least know there is this heading, although there are probably others.

Of course, it’s not easy for a user to find a heading such as “Publishers and publishing”. Everybody has known that from the beginning, and the solution was the extremely important job of the reference librarian.

Therefore, a catalog and the collection are designed to work very closely together, and to separate the two makes both practically unusable. Perhaps people don’t like this, but that’s the way it all works.

Still, your question illustrates the basic problem very well: you expect the library catalog to do what the library collection is designed to do. I agree that lots of people expect this too, and then we arrive at a major problem for libraries and their catalogs in the 21st century: New user expectations.

I think this shows how the “data” in the library catalog is fundamentally different from the “data” in other kinds of databases. And it also illustrates how the normal tools used for “data mining” and “data extraction” that work fairly well in other venues are more or less doomed to failure when applied to library catalogs. They contain a different kind of data.

So, to compare the situation to earlier times, it is like someone who wants to know what has been published in a certain part of the world, walks into a library, thumbs through the catalog, and walks out angry because they have decided that the information they want isn’t there, all without asking anybody anything.

Unfortunately, this happens all the time today with visitors on the web so of course people have bad experiences. Times have changed and the scenario I described–once the norm–happens less and less, so I agree that something must be done. Do we conclude that the “directional information” found in the catalog is a useless relic of the past? Or do we try to re-imagine what can be done with the tools we have? Libraries have always had lots of tools.

I would like to think people are trying for the latter, but it seems as if the current trends are for the former. In either case, the first step is to understand where the real problems are and then it may be possible to find solutions–or maybe not.

There are many other related issues of course. What do we want from a library catalog?

For a deeper discussion, there is my podcast “Cataloging Matters No. 17: Catalog Records as Data”, and also my (shorter) presentation to La Sapienza.


by James Weinheimer at April 06, 2015 07:20 PM

April 05, 2015

First Thus

BIBFRAME Linked data

On 06/03/2015 15.42, Martynas Jusevičius wrote:
> There is MARCXML, which can be used as a transitional format and transformed using XSLT into RDF/XML. Once in RDF, data can be transformed using SPARQL CONSTRUCT. That has been discussed before.
>
> I want to underline one point: it is the transition to RDF that matters, and not to BIBFRAME or any other specific vocabulary. You don’t have to wait for some consolidated one-size-fits-all solution, you can use RDF and enjoy its advantages already now, and mix vocabularies as you see fit. It should be an agile development, not a waterfall model.
> …
> So, what are the real problems? Libraries, just get the Linked Data out there!:)

Thanks for putting it so much better than I have managed to do. Libraries could have been doing this for a long time now and learned a lot from how everybody used/didn’t use/transformed our data.
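To make the quoted suggestion concrete, here is a minimal Python sketch of getting from MARCXML to RDF. It is not the XSLT route described in the quote: it swaps in Python's standard library and emits N-Triples instead of RDF/XML. The sample record, the `http://example.org/record/` base URI, the helper names, and the choice to map only field 245 $a to the Dublin Core `title` property are assumptions for illustration; a real conversion would cover far more fields.

```python
import xml.etree.ElementTree as ET
from typing import List

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}  # MARCXML namespace
DCT_TITLE = "http://purl.org/dc/terms/title"          # Dublin Core Terms title
BASE_URI = "http://example.org/record/"               # hypothetical base URI

# Hypothetical sample record, trimmed to the two fields used below.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <controlfield tag="001">12345</controlfield>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">Sense and sensibility /</subfield>
      <subfield code="c">Jane Austen.</subfield>
    </datafield>
  </record>
</collection>"""


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def marcxml_to_ntriples(xml_text: str) -> List[str]:
    """Emit one title triple per record that has both 001 and 245 $a."""
    root = ET.fromstring(xml_text)
    lines = []
    for record in root.iter("{http://www.loc.gov/MARC21/slim}record"):
        control = record.find("marc:controlfield[@tag='001']", MARC_NS)
        title = record.find(
            "marc:datafield[@tag='245']/marc:subfield[@code='a']", MARC_NS
        )
        if control is None or title is None:
            continue
        if not control.text or not title.text:
            continue
        clean = title.text.rstrip(" /:")  # drop trailing ISBD punctuation
        subject = BASE_URI + control.text.strip()
        lines.append(f'<{subject}> <{DCT_TITLE}> "{escape_literal(clean)}" .')
    return lines


if __name__ == "__main__":
    print("\n".join(marcxml_to_ntriples(SAMPLE)))
```

Once the data is in triples, any other vocabulary can be layered on later, which is the point being made in the quoted message.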

James Weinheimer
First Thus
First Thus Facebook Page
Cooperative Cataloging Rules
Cataloging Matters Podcasts


by James Weinheimer at April 05, 2015 03:12 PM

First Thus

BIBFRAME Linked data

Ross Singer wrote:

Counterpoint: if libraries can do “anything they want” with their data and have had 40+ years to do so, why haven’t they done anything new or interesting with it for the past 20?

How, with my MARC records alone, do I let people know that they might be interested in “Clueless” if they’re looking at “Sense and Sensibility”? How do I find every Raymond Carver short story in the collection? The albums that Levon Helm contributed to? How can I find every introduction by Carl Sagan? What do we have that cites them?

How, with my MARC records alone, can I definitively limit only to ebooks? What has been published in the West Midlands?

You *could* make a 3-D day-glo print of a MARC record, I suppose – but that seems like exactly the sort of tone deaf navel gazing that has rendered our systems and interfaces more and more irrelevant to our users.

Why haven’t libraries done anything new or interesting with our data for the past 20 years? Is it because it has been *impossible* due to our formats, even though we now have XML? You ask an excellent and important question that I was hoping somebody would bring up. It deserves a separate discussion. But first I want to emphasize: I am not saying that we need to work with MARC records alone–never said that at all. What I am saying is that for the library community, that is, the people who already know and understand–and even control–MARC format, changing the format they already control to Bibframe will not give them any new capabilities over what they have been able to do with MARCXML. *Librarians* understand the MARC codes and that means they can work with MARCXML to fold in their records with what else exists on the Internet; they can do that now, and they’ve been able to do it for a while. Changing to Bibframe/RDF will not change anything for librarians, but it will change matters for non-librarians who may want to use our data for their purposes. Nevertheless, a *lot* of work will remain to be done. It isn’t as if, after we change to Bibframe, we can fly onto the deck of the aircraft carrier festooned with banners that proclaim “Mission Accomplished”. It will only be the beginning of a vast amount of work and expense. It seems to me to make sense to talk about that now.

So, if we can already do anything and haven’t, the obvious question is: why will anything change with Bibframe/RDF? Again, I stress: this concerns *the library community*. Non-librarians will have new options but there will not be any new capabilities for the library community. Perhaps Bibframe will be a catalyst for change among librarians, providing a needed kick-in-the-pants to get them to do something they haven’t until now. OK, I’d go along with that. But let’s be fair and say that it is just as possible that it won’t. Going back to the reason why we haven’t done anything interesting in the last 20 years: maybe it’s money, maybe it’s imagination, maybe it’s proprietary catalogs, maybe it’s power…. I don’t know, but there may be a whole host of other reasons.

Perhaps with Bibframe the non-librarian community will come riding to the rescue and they will figure out what to do. We can hope.

I wrote that message on Autocat to combat the popular idea that libraries haven’t done anything new or interesting because of the limitations of the format. That was true until MARCXML arrived; then it became possible to do all sorts of new things. MARCXML may be nasty and difficult to work with, but no matter: if somebody wants to, it *can* be worked with *within the library community*. And people have worked with it, as we see in catalogs that use Lucene indexing (built, in these catalogs, from MARCXML records) to create the facets we see in different library catalogs. (That is one thing that has been done in the last 20 years, and it is due to XML.)
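To make the facet idea concrete, here is a minimal sketch of how facet counts can be derived from MARCXML. It is only an in-memory stand-in for what a Lucene-based index does at scale, not how any particular catalog is implemented. The sample records and the function name `subject_facets` are made up for illustration; the only real things are the MARCXML namespace and the 650$a topical subject field.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# The MARCXML ("MARC21 slim") namespace.
NS = {"m": "http://www.loc.gov/MARC21/slim"}

# Two invented records, carrying only subject headings (650 $a).
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <datafield tag="650" ind1=" " ind2="0"><subfield code="a">Whales</subfield></datafield>
    <datafield tag="650" ind1=" " ind2="0"><subfield code="a">Sea stories</subfield></datafield>
  </record>
  <record>
    <datafield tag="650" ind1=" " ind2="0"><subfield code="a">Whales</subfield></datafield>
  </record>
</collection>"""


def subject_facets(marcxml):
    """Return (heading, record count) pairs, most frequent first."""
    root = ET.fromstring(marcxml)
    counts = Counter()
    for rec in root.findall("m:record", NS):
        subfields = rec.findall("m:datafield[@tag='650']/m:subfield[@code='a']", NS)
        # Use a set so a heading repeated within one record counts once.
        headings = {sf.text.strip() for sf in subfields if sf.text}
        counts.update(headings)
    return counts.most_common()


if __name__ == "__main__":
    for heading, n in subject_facets(SAMPLE):
        print(f"{heading} ({n})")
```

A real index adds ranking, stemming, and speed, but the facet list in the sidebar of a catalog is, at bottom, this kind of count over fields librarians already know how to read.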

I gave the example of printing in day-glo colors merely to emphasize that we can do anything we want right now; of course, I was not suggesting we should waste our time on that. I want to try to open people’s minds to what *can* be possible. *Anything* is a tremendous concept that is difficult to grasp. Once we accept and begin to comprehend the idea that “anything can be done”, the question of what would be better, or worse, uses of our labor and resources becomes far more complex and takes on different subtleties. Those who believe that our problems stem from the *format*, and that a “better format” will therefore solve them, will be sadly disillusioned.

Finally, in answer to some other posts, I repeat once again that I am FOR the library community’s implementation of linked data but we need to do it with our eyes open. I’ll copy that part of my original message:
“I want again to emphasize that libraries should go into linked data, but when we do so, there will probably be more question marks than exclamation points. Just as when a couple is expecting a baby and they experience pregnancy: at least when I experienced it, I imagined that the birth of my son would be an end of the pregnancy. But suddenly, I had a crying baby on my hands! Linked data will be similar: it will be a beginning and not an end.”

James Weinheimer First Thus First Thus Facebook Page Cooperative Cataloging Rules Cataloging Matters Podcasts


by James Weinheimer at April 05, 2015 01:36 PM

April 04, 2015

Resource Description & Access (RDA)

April 03, 2015

First Thus

ACAT linked data question

On 03/03/2015 19.24, Anderson, William wrote:
> It’s interesting that the article “Ending the Invisible Library”, posted earlier in the thread, from my quick read seems to focus on the use of linked data the other way round: libraries sharing their catalog data with others over the open web. In other words, it is we that are the invisible library.
>
> Our MARC data, beyond being a communication standard, does indeed consist of “semantic mark up”, but “mark up” semantically unreadable by the vast majority of computers beyond our own systems.
>
> The upshot being we are judged a “trusted source” that just happens to speak in the data equivalent of an obscure dialect (MARC) only spoken in a few towns up in the mountains (i.e. the library profession). It is not our wisdom that is questioned, but our communication skills. A facetious analogy, but hopefully apt.

There are some points to keep in mind when considering linked data/semantic web. The new formats (schema.org, Bibframe) are *not* there for libraries to be able to do new and wonderful things with their own data. Why? Because libraries already understand and control all of that data. Right now, so long as we have XML formats (and we have that now with MARCXML), we can do *anything* we want with the data. MARCXML is not perfect, but it is still XML, and that means librarians can search that data however we want, manipulate it however we want, transform it however we want, sort it however we want, and display it however we want. If we want to search by the fiction code in the fixed fields and sort by number of pages or by 100/700$q, we can.

We can print out reams of entire records, or any bits and pieces of them we could want, collate them in any number of ways (or not), and print them out on 3D printers in day-glo colors, display them with laser beams on the moon, or work with them in virtual reality “wearable technology”. We could do all of that and more *right now* if we wanted. We’ve been able to do it for a long time. We don’t need schema.org or Bibframe to enhance our own capabilities, because we can do anything with our own data now.
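As an illustration of the “search by the fiction code and sort by number of pages” claim, here is a minimal sketch using only the Python standard library. The records are invented, and the helper names (`field_008`, `record`, `fiction_by_pages`) are hypothetical. It assumes the MARC 21 conventions for books: literary form in 008 position 33, where `1`, `f` and `j` indicate fiction, novels and short stories, and an extent in 300$a of the simple form “512 p.”. Real extents are messier than that.

```python
import re
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/MARC21/slim"}
FICTION_CODES = {"1", "f", "j"}  # 008/33: fiction, novels, short stories


def field_008(literary_form):
    # Positions 00-17 (dates, place), 18-32 left blank,
    # 33 = literary form, 34-39 (biography, language, source).
    return "850101s1985    nyu" + " " * 15 + literary_form + " eng d"


def record(title, extent, literary_form):
    # Build one invented MARCXML record for the sample collection.
    return f"""<record>
  <controlfield tag="008">{field_008(literary_form)}</controlfield>
  <datafield tag="245" ind1="1" ind2="0"><subfield code="a">{title}</subfield></datafield>
  <datafield tag="300" ind1=" " ind2=" "><subfield code="a">{extent}</subfield></datafield>
</record>"""


SAMPLE = ('<collection xmlns="http://www.loc.gov/MARC21/slim">'
          + record("A long novel", "512 p.", "1")
          + record("A cataloging manual", "210 p.", "0")
          + record("A short novel", "96 p.", "1")
          + "</collection>")


def fiction_by_pages(marcxml):
    """Return (title, pages) for fiction records, shortest first."""
    root = ET.fromstring(marcxml)
    hits = []
    for rec in root.findall("m:record", NS):
        f008 = rec.findtext("m:controlfield[@tag='008']", default="", namespaces=NS)
        if f008[33:34] not in FICTION_CODES:
            continue
        title = rec.findtext("m:datafield[@tag='245']/m:subfield[@code='a']",
                             default="", namespaces=NS)
        extent = rec.findtext("m:datafield[@tag='300']/m:subfield[@code='a']",
                              default="", namespaces=NS)
        pages = re.search(r"(\d+)\s*p", extent)
        if pages:  # skip records whose extent has no page count
            hits.append((title, int(pages.group(1))))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    print(fiction_by_pages(SAMPLE))
```

Nothing here needs a new format. It does need someone who knows what 008/33 and 300$a mean, which is exactly the knowledge librarians have and outsiders don’t.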

So, who are schema.org and Bibframe for? Non-librarians, i.e. people who neither understand nor control our data. Libraries will allow others to work with our data in a form that they can understand a bit more readily than MARC. Non-librarians cannot be expected to understand 240$k or 700$q, but with schema.org or Bibframe, it is supposed to be easier for them, although it still won’t be easy. Nevertheless, they will be able to take our data and do with it as they will, as they cannot do now with our MARC/ISO2709 records. It’s interesting to note that the LC book catalog in this format has been in the Internet Archive for a while now, but I haven’t heard that any developers have used it.
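A rough sketch of what “easier for non-librarians” means in practice: coded subfields become named properties. The mapping below is a deliberately tiny, hypothetical one that I made up for illustration. It is not an official crosswalk, and it assumes a schema.org-style vocabulary (`name`, `author`, `publisher`, `isbn` on a `Book`) as the target. A real conversion would model authors as entities rather than plain strings.

```python
import json

# Hypothetical mapping from (MARC tag, subfield code) to property names.
MARC_TO_SCHEMA = {
    ("245", "a"): "name",
    ("100", "a"): "author",
    ("260", "b"): "publisher",
    ("020", "a"): "isbn",
}


def to_schema_org(marc_fields):
    """Turn [((tag, code), value), ...] into a JSON-LD-style dict."""
    doc = {"@context": "https://schema.org", "@type": "Book"}
    for key, value in marc_fields:
        prop = MARC_TO_SCHEMA.get(key)
        if prop is None:
            continue  # anything without a mapping is silently lost
        # Crude removal of trailing ISBD punctuation such as " /" or ",".
        doc[prop] = value.rstrip(" /:,.")
    return doc


if __name__ == "__main__":
    fields = [
        (("245", "a"), "Moby Dick /"),
        (("100", "a"), "Melville, Herman,"),
        (("240", "k"), "Selections"),  # no mapping above, so it is dropped
    ]
    print(json.dumps(to_schema_org(fields), indent=2))
```

Note what happens to 240$k in this sketch: it simply disappears. Whatever the final mappings look like, deciding what carries over and what is lost is part of the work that remains.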

With Bibframe and schema.org, people will be able to merge our data with other parts of the linked data universe (oops! Not Freebase or DBpedia. They’ll have to go to Wikidata! Wonder how long that will last!) or with all kinds of web APIs. Web programmers can then put these things together to create something absolutely new, e.g. bring together library data with eBay so that people can see if something on eBay is available in the library, or vice versa. But remember that those web programmers will also be able to manipulate our data as much as we can, so the final product they create may look and work completely differently than we would imagine, or than we would like. As a result, libraries and catalogers will lose the control of their data that they have always enjoyed. For better or worse, that is a necessary consequence of sharing your data.

Then come what are, I think, the two major questions of linked data for libraries. First: OK, we add the links, but what do we link *to*? Will linking into our authority files appeal to the public? I personally don’t think so, since there is so little there other than the traditional syndetic structures found in our traditional catalogs (i.e. the UF, BT, NT, RT for subjects, the earlier/later names of corporate bodies and series, the other names of people). This is not what people think of when they think of the advantages of linked data. While those things may be nice for us, I don’t know if they will be so appealing to the public. If they are to become appealing to the public, somebody somewhere will have to do a lot of work to make them so.

Concerning VIAF, it’s nice to know the authorized forms in Hebrew, French, Italian, and so on, but again, is that so appealing to the *public*? It may be, but that remains to be proven.

Second, there is no guarantee at all that anyone will actually do anything with our data. I certainly hope someone will, but it could just sit and go unused.

I want again to emphasize that libraries should go into linked data, but when we do so, there will probably be more question marks than exclamation points. Just as when a couple is expecting a baby and they experience pregnancy: at least when I experienced it, I imagined that the birth of my son would be an end of the pregnancy. But suddenly, I had a crying baby on my hands! Linked data will be similar: it will be a beginning and not an end.



by James Weinheimer at April 03, 2015 10:37 PM