Planet Cataloging

March 23, 2017

Terry's Worklog

MarcEdit and Alma Integration: Working with holdings data

Ok Alma folks,

I’ve been thinking about a way to integrate holdings editing into the Alma integration work in MarcEdit.  Alma handles holdings via MFHDs, but honestly, the process for getting to holdings data seems a little quirky to me.  Let me explain.  When working with bibliographic data, the workflow to extract records for editing and then update them looks like the following:


To extract:

  1. Records are queried via Z39.50 or SRU
  2. Data can be extracted directly to MarcEdit for editing

To update:

  1. Data is saved, and then turned into MARCXML
  2. If the record has an ID, I have to query a specific API to retrieve specific data that will be part of the bib object
  3. Data is assembled in MARCXML, and then updated or created.


Essentially, an update or create takes 2 API calls.
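For instance, the two-call flow can be sketched in Python roughly like this; the base URL, helper names, and error handling here are my assumptions, not MarcEdit's actual code:

```python
import urllib.parse

# Hypothetical sketch of the two-call bib update against the Alma REST Bibs API.
ALMA_BASE = "https://api-na.hosted.exlibrisgroup.com/almaws/v1"

def bib_url(mms_id, api_key):
    """Build the URL used for both the GET (retrieve) and PUT (update) calls."""
    query = urllib.parse.urlencode({"apikey": api_key})
    return f"{ALMA_BASE}/bibs/{urllib.parse.quote(mms_id)}?{query}"

def update_bib(session, mms_id, marcxml, api_key):
    """Call 1: GET the existing bib object; call 2: PUT the edited MARCXML back."""
    url = bib_url(mms_id, api_key)
    current = session.get(url)  # call 1: retrieve the bib object
    current.raise_for_status()
    updated = session.put(      # call 2: send the reassembled record
        url, data=marcxml, headers={"Content-Type": "application/xml"})
    return updated.status_code
```

The `session` object is anything with `get`/`put` methods (e.g., a `requests.Session`), kept as a parameter so the flow can be exercised without a live Alma instance.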

For holdings, it’s a much different animal.


  1. Search via Z39.50/SRU
  2. Query the Bib API to retrieve the holdings link
  3. Query the holdings link api to retrieve a list of holding ids
  4. Query each holdings record API individually to retrieve a holdings object
  5. Convert the holdings object to MARCXML and then into a form editable in the MarcEditor
    1. As part of this process, I have to embed the bib_id and holding_id into the record (I’m using a 999 field) so that I can do the update
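The five-step retrieval chain can be sketched like so, with the HTTP layer stubbed out behind a `fetch` callable; the endpoint paths follow the pattern of the Alma Bibs API, but the function and field names are hypothetical:

```python
def harvest_holdings(fetch, mms_id):
    """Walk bib -> holdings list -> individual holdings objects (steps 2-5)."""
    bib = fetch(f"/bibs/{mms_id}")                # step 2: bib record (holdings link)
    listing = fetch(f"/bibs/{mms_id}/holdings")   # step 3: list of holding ids
    records = []
    for holding_id in listing["holding_ids"]:
        obj = fetch(f"/bibs/{mms_id}/holdings/{holding_id}")  # step 4: one call per record
        # step 5a: carry along the ids needed later for the update
        obj["bib_id"], obj["holding_id"] = mms_id, holding_id
        records.append(obj)
    return records
```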


For Update/Create

  1. Convert the data to MARCXML
  2. Extract the ids and reassemble the records
  3. Post via the update or create API
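Since step 2 depends on the ids embedded in the 999 field, a minimal sketch of pulling them back out might look like this; the subfield layout ($a for the bib id, $b for the holding id, $d as the update flag) is an illustrative guess, not MarcEdit's documented format:

```python
import re

def parse_999(line):
    """Extract the embedded ids from a 999 field in mnemonic format."""
    subfields = dict(re.findall(r"\$(.)([^$]*)", line))
    return {
        "bib_id": subfields.get("a"),
        "holding_id": subfields.get("b"),
        "is_update": "d" in subfields,  # $d present -> existing record to update
    }
```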


Extracting the data for edit is a real pain.  I’m not sure why so many calls are necessary to pull the data.

Anyway – let me give you an idea of the process I’m setting up.

First – you query the data:

A couple of things to note: to pull holdings, you have to click on the download all holdings link, or right-click on the item you want to download.  Or, select the items you want to download and then press CTRL+H.

When you select the option, the program will ask whether you want it to create a new holdings record if one doesn’t exist.


The program will then either download all the associated holdings records or create a new one.

A couple of things I want you to notice about these records.  There is a 999 field added, and you’ll notice that I’ve created this in MarcEdit.  Here’s the problem: I need to retain the BIB number to attach the holdings record to (it’s not in the holdings object), and I need the holdings record number (again, not in the holdings object).  This is a required field in MarcEdit’s process.  I can tell whether a holdings item is new or updated by the presence or absence of the $d.


Anyway – this is the process that I’ve come up with, and it seems to work.  I’ve got a lot of debugging code to remove because I was having some trouble with the Alma API responses and needed to see what was happening underneath.  If you are an Alma user, I’d be curious whether this process looks like it will work.  As I say, I have some cleanup left to do before anyone can use this, but I think I’m getting close.



by reeset at March 23, 2017 11:52 AM

March 22, 2017

Terry's Worklog

Truncating a field by a # of words in MarcEdit

This question came up on the listserv, and I thought that other folks might find it generically useful.  Here’s the question:

I’d like to limit the length of the 520 summary fields to a maximum of 100 words and adding the punctuation “…” at the end. Anyone have a good process/regex for doing this?
=520  \\$aNew York Times Bestseller Award-winning and New York Times bestselling author Laura Lippman’s Tess Monaghan—first introduced in the classic Baltimore Blues—must protect an up-and-coming Hollywood actress, but when murder strikes on a TV set, the unflappable PI discovers everyone’s got a secret. {esc}(S2{esc}(B[A] welcome addition to Tess Monaghan’s adventures and an insightful look at the desperation that drives those grasping for a shot at fame and those who will do anything to keep it.{esc}(S3{esc}(B—San Francisco Chronicle When private investigator Tess Monaghan literally runs into the crew of the fledgling TV series Mann of Steel while sculling, she expects sharp words and evil looks, not an assignment. But the company has been plagued by a series of disturbing incidents since its arrival on location in Baltimore: bad press, union threats, and small, costly on-set “accidents” that have wreaked havoc with its shooting schedule. As a result, Mann’s creator, Flip Tumulty, the son of a Hollywood legend, is worried for the safety of his young female lead, Selene Waites, and asks Tess to serve as her bodyguard. Tumulty’s concern may be well founded. Recently, a Baltimore man was discovered dead in his home, surrounded by photos of the beautiful—if difficult—aspiring star. In the past, Tess has had enough trouble guarding her own body. Keeping a spoiled movie princess under wraps may be more than she can handle since Selene is not as naive as everyone seems to think, and instead is quite devious. Once Tess gets a taste of this world of make-believe—with their vanities, their self-serving agendas, and their remarkably skewed visions of reality—she’s just about ready to throw in the towel. But she’s pulled back in when a grisly on-set murder occurs, threatening to topple the wall of secrets surrounding Mann of Steel as lives, dreams, and careers are scattered among the ruins.
So, there isn’t really a true expression that can break on a number of words, in part because how we define word boundaries varies between languages.  Likewise, the MARC formatting can pose a challenge.  So the best approach is to look for good enough – and in this case, good enough likely means breaking on spaces.  My suggestion is to look for 100 spaces, and then truncate.
In MarcEdit, this is easiest to do using the Replace function.  The expression would look like the following:
Find: (=520.{4})(\$a)(?<words>([^ ]*\s){100})(.*)
Replace: $1$2${words}…
Check the “use regular expressions” option (image below).
So why does this work?  Let’s break it down.
(=520.{4}) – this matches the field number, the two spaces related to the mnemonic format, and then the two indicator values.
(\$a) – this matches on the subfield a
(?&lt;words&gt;([^ ]*\s){100}) – this is where the magic happens.  You’ll notice two things about this.  First, I use a nested expression, and second, I name one.  Why do I do that?  Well, the reason is that the group numbering gets wonky once you start nesting expressions.  In those cases, it’s easier to name them.  So, in this case, I’ve named the group that I want to retrieve, and then created a subgroup that matches on characters that aren’t a space, followed by a space.  I then use the quantifier {100}, which means the subgroup must match exactly 100 times.
(.*) — match the rest of the field.
Now when we do the replace, putting the field back together is really easy.  We know we want to reprint the field number, the subfield code, and then the group that captured the 100 units.  Since we named the 100 units, we call that directly by name.  Hence,
$1 — prints out =520  \\
$2 — $a
${words} — prints 100 words
… — the literals
And that’s it.  Pretty easy if you know what you are looking for.
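For anyone who wants to experiment outside MarcEdit, the same truncation can be tried in Python; the one wrinkle is that Python spells named groups (?P&lt;name&gt;...) where the .NET engine behind MarcEdit also accepts (?&lt;name&gt;...):

```python
import re

def truncate_520(line, word_count=100):
    """Keep the field tag, indicators, $a, and the first `word_count` space-delimited words."""
    pattern = re.compile(
        r"(=520.{4})(\$a)(?P<words>(?:[^ ]*\s){%d})(.*)" % word_count)
    # \1 = field number + indicators, \2 = $a, \g<words> = the captured words
    return pattern.sub(r"\1\2\g<words>…", line)
```

Fields with fewer than `word_count` words simply fail to match and pass through unchanged, which is exactly the behavior you want.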

by reeset at March 22, 2017 08:34 PM


Build joy into your library’s website

What libraries can learn from eCommerce

I’m passionate about Web analytics. This passion ignited before I came to OCLC as I’ve spent most of my career working on eCommerce teams for brands like American Eagle Outfitters and DSW. eCommerce teams use web analytics to optimize experiences for shoppers to ensure that they can find what they are looking for and ultimately click that purchase button.

Honestly, we often pushed past passion to complete obsession. We used to get our key metrics emailed to us every hour on the hour before one VP requested that the emails stop coming out after midnight so the team could get some sleep. Since I’ve been here at OCLC, I’ve found that a lot of what we do in eCommerce can be leveraged for improving library websites as well.

Joyful stacks

When I first joined OCLC, I reflected upon how my new calling intersected with my favorite user experience quote by Don Norman, who has done a lot of interesting things in his career, including work in the Psychology Department at University of California, San Diego. I love this quote from him:

“It is not enough that we build products that function, that are understandable and usable, we also need to build products that bring joy and excitement, pleasure and fun, and, yes, beauty to people’s lives.”

That sure fits into our mission, doesn’t it? It reminds me of the first time I ever experienced a library. I was around five and still remember that magical promise: “Choose any book you want.” As I entered our small public library, I saw a long book sticking out of the bottom shelf. I remember pulling it out and seeing the Jumanji cover.

I loved that book, and it’s one of the most joyful memories of my childhood. Do you remember your first memory of a library?

So, libraries’ physical spaces already bring joy and beauty to people’s lives. I think our goal should be to make our online presences as amazing and as joyful as the in-person experiences. Web analytics can help.

You can’t improve what you don’t measure

Web analytics will help you spot trends and behaviors about your users so that you can make adjustments that improve their experiences. We recently did a survey of our members about library website redesign projects. The majority of respondents indicated that redesign projects were top of mind, either in-flight or just completed.

Surprisingly, 41% of those working on website redesign improvements told us that they did NOT plan to use web analytics to track those improvements. If that’s the case, how will you know if you’ve logically organized the content for your users?

Are people wandering out of your library?

Imagine a person coming into your library, getting as far as the front lobby, then turning around and leaving. Another person does that…then another. That would make you reconsider your physical layout and procedures, wouldn’t it?

How can simple web analytics help make your library’s website as joyful as an in-person experience?

Without web analytics, these lost souls are invisible on the web. Analytics gives us the opportunity to intercept the poor, confused people wandering around your website, to build them an experience that they find intuitive and engaging, and to get them to the beautiful, joyful materials and services they need.

So where do you start?

Analyzing traffic patterns and page views is a great place to start. In fact, many of the surveyed librarians said that they already look at these metrics. However, the magic really begins to happen when you look at how successful your users are at completing key workflows using conversion funnel analysis.

To analyze key workflows:

  1. Identify a key workflow (e.g., scheduling a consultation, searching the catalog, etc.)
  2. Identify each step/page in that workflow
  3. Tag pages and establish a baseline
  4. Identify points of failure and tweak the experience
  5. Measure for improvements against the baseline

And then, of course, repeat.
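As a toy illustration of steps 3 through 5, here is how you might compute step-to-step drop-off from page-view counts; the workflow and numbers are invented:

```python
def funnel_dropoff(steps):
    """Given (step_name, views) pairs in order, return each later step with its
    views and the percentage of the previous step's visitors who reached it."""
    return [
        (name, views, round(100 * views / prev_views, 1))
        for (_, prev_views), (name, views) in zip(steps, steps[1:])
    ]

# Hypothetical "schedule a consultation" workflow:
steps = [("home", 1000), ("consult form", 400), ("confirmation", 120)]
```

A big drop between two adjacent steps marks your point of failure: tweak that page, then re-measure against the baseline.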

We do this all the time at OCLC. For example, we added a “did you mean?” suggestion feature to searches within WorldCat Discovery and moved the “Did you mean?” phrase from here:


to here:


We DOUBLED the number of clicks on that option, because having the call to action up front is usually a good idea from a user experience standpoint. Sometimes it really can be that simple. Small tweaks can yield huge results for our users.

Treat your library site like another branch. Measure what works and what doesn’t. Then improve it a little bit, every time you make a change. Soon, your online presence will be just as much a place of joy and beauty as your most beloved physical space.

The post Build joy into your library’s website appeared first on OCLC Next.

by Cathy King at March 22, 2017 02:00 PM

March 21, 2017

Terry's Worklog

MarcEdit Update Notes

MarcEdit Update: All Versions

Over the past several weeks, I’ve been working on a wide range of updates to MarcEdit. Some of these updates have dealt with how MarcEdit handles interactions with other systems, some with integrating the new bibframe processing into the toolkit, and some with adding more functionality around the program’s terminal tools and SRU support. In all, this is a significant update that required the addition of ~20k lines of code to the Windows version, and almost 3x that in the macOS version (as I was adding SRU support). All told, I think the updates provide substantial benefit. The updates completed were as follows:


* Enhancement: SRU Support — added SRU support to the Z39.50 Client
* Enhancement: Z39.50/SRU import: Direct import from the MarcEditor
* Enhancement: Alma/Koha integration: SRU Support
* Enhancement: Alma Integration: All code needed to add Holdings editing has been completed; TODO: UI work.
* Enhancement: Validator: MacOS was using older code — updated to match Windows/Linux code (i.e., moved away from original custom code to the shared validator.dll library)
* Enhancement: MARCNext: Bibframe2 Profile added
* Enhancement: BibFrame2 conversion added to the terminal
* Enhancement: Unhandled Exception Handling: MacOS handles exceptions differently — I created a new unhandled exception handler to make it so that if there is an application error that causes a crash, you receive good information about what caused it.

A couple of specific notes about changes in the Mac update.

Validation – the Mac program was using an older set of code to handle validation. The code wasn’t incorrect, but it was out of date. At some point, I’d consolidated the validation code into its own namespace and hadn’t carried those changes over to the Mac side. This was unfortunate. Anyway, I spent time updating the process so that all versions now share the same code and will receive updates at the same pace.

SRU Support – I’m not sure how I missed adding SRU support to the Mac version, but I had. So, while I was updating ILS integrations to support SRU when available, I added SRU support to the macOS version.

BibFrame2 Support – One of the things I was never able to get working in MarcEdit’s Mac version was the Bibframe XQuery code. There were some issues with how URI paths resolved in the .NET version of Saxon. Fortunately, the new bibframe2 tools don’t have this issue, so I’ve been able to add them to the application. You will find the new option under the MARCNext area or via the command-line.


* Enhancement: Alma/Koha integration: SRU Support
* Enhancement: MARCNext: Bibframe2 Profile added
* Enhancement: Terminal: Bibframe2 conversion added to the terminal.
* Enhancement: Alma Integration: All code needed to add Holdings editing has been completed; TODO: UI work.

Windows changes were specifically related to integrations and bibframe2 support. On the integrations side, I enabled SRU support when available and wrote a good deal of code to support holdings record manipulation in Alma. I’ll be exposing this functionality through the UI shortly. On the bibframe front, I added the ability to convert data using either the bibframe2 or bibframe1 profiles. Bibframe2 is obviously the default.

With both updates, I made significant changes to the Terminal and wrote up some new documentation. You can find the documentation, and information on how to leverage the terminal versions of MarcEdit at this location: The MarcEdit Field Guide: Working with MarcEdit’s command-line tools

Downloads can be picked up through the automated updating tool or from the downloads page.

by reeset at March 21, 2017 03:28 PM

March 16, 2017

TSLL TechScans

Core Competencies for Cataloging and Metadata Librarians

The CaMMS Competencies and Education for a Career in Cataloging Interest Group presented Core Competencies for Cataloging and Metadata Professional Librarians at ALA Midwinter in Atlanta. The document supplements the American Library Association's Core Competencies in Librarianship. The document outlines Knowledge, Skill & Ability, and Behavioral Competencies and is meant to define a "baseline of core competencies for LIS professionals in the cataloging and metadata field."

Knowledge competencies are those providing understanding of conceptual models upon which cataloging standards are based. Skill & ability competencies include not just the application of particular skills and frameworks, but also the ability to "synthesize these principles and skills to create cohesive, compliant bibliographic data that function within local and international metadata ecosystems." Behavioral competencies are those "personal attributes that contribute to success in the profession and ways of thinking that can be developed through coursework and employment experience."

Of particular note is emphasis on cultural awareness in the introductory section.  "Metadata creators must possess awareness of their own historical, cultural, racial, gendered, and religious worldviews ... Understanding inherent bias in metadata standards is considered a core competency for all metadata work."

Full text of the competencies document is available via ALA's institutional repository. Slides from the presentation at ALA Midwinter are also available.

by (Jackie Magagnosc) at March 16, 2017 08:26 PM

March 15, 2017

TSLL TechScans

Linked Data Catalog at Oslo Public Library

The Oslo Public Library, Deichmanske bibliotek, has developed a library services platform based on linked data. It can be seen in action at the library's website, and the source code is available on GitHub.

The platform uses a work-based model for its public-facing catalog; for an example of this "FRBR-ized" interface, see the display for the film Harry Potter and the Order of the Phoenix. The display provides prominently positioned work-level information, and then shows information for two different DVD versions of the movie, as well as a Blu-Ray version. It also very nicely highlights the film's position in the Harry Potter series, by providing "continues" and "continued in" links to the appropriate films. It also includes a "based on" link to the book of the same name. Following this link brings you to an even more impressive display of various print and audiobook holdings for this title.

More information about the behind-the-scenes cataloging work can be found in this post from 2014 on the library's blog.

by (Emily Dust Nimsakont) at March 15, 2017 04:13 PM


Bringing order to the chaos of digital data


530 million songs. 90 years of high-definition video. 250,000 Libraries of Congress. That’s how much data we produce every day—2.5 exabytes according to Northeastern University. I guess that’s not surprising, given the amount of activity that goes on in social media, websites, email messages and texting.

Much of that data, though, is personal and ephemeral. Videos, photos, tweets and stories that can be passed along and deleted without any thought or care about accuracy or archiving.

But in the scholarly community, a similar and perhaps more significant explosion of digital data is occurring. Here the stakes may be much higher. Without trusted stewardship, data from research will not be effectively collected and preserved for reuse. And when this happens, research innovation and advancement slows significantly.

This is new territory in many ways. Data have been collected and preserved for thousands of years, but never at the volume we see today, nor with some of the deliberate (and in some cases, legally mandated) intentions for reuse.

Given their expertise, library and archives professionals are well-suited to provide support and many have taken a leading role in developing new ways to serve their campus communities’ needs to manage, curate and preserve research data. It’s an exciting opportunity and one for which our community is well-equipped.

Managing and curating data with reuse in mind

For almost 10 years, I’ve been studying data reuse in academic communities. It has evolved into studying academics’ data management and sharing practices and the library’s role in supporting these kinds of activities. My latest research focuses on data sharing and reuse in the social science, archaeological and zoological communities, and what these traditions and practices mean for repository data curation. My goal is to identify the common drivers and unique elements of sharing and reuse in each discipline, and what applications those might have for library and archives professionals in their role as data curators.

I recently published my findings with Elizabeth Yakel in chapter four of a new book published by ACRL, Curating Research Data, Volume 1: Practical Strategies for your Digital Repository.

What mediating roles can librarians and archivists take on between data production and reuse?

What we found throws some light on how librarians and archivists can play a more active role in this new environment:

  • Trust in both the data and the repository plays a major role in whether data are reused. Perceptions and opinions about documentation quality, data producer reputation and repository reputation are formed over time as researchers gain experience with the discipline, data and repository. As data curators, librarians and archivists have the power to shape researchers’ perceptions and opinions about these trust markers, particularly repository reputation and documentation quality.
  • Important mediating roles occur between data production and reuse. Engaging with researchers who are producing as well as reusing data provides a full view of upstream and downstream data needs and the opportunity to better align support and services, such as data deposit and curation activities, user interface design, data management instruction modules or other scaffolding to improve researchers’ experiences. In addition, intervening at the point of data production would allow for a negotiation of curation goals that serve to better satisfy the needs of multiple stakeholders—data producers, curators and reusers.
  • Data repositories do not have to house all of the information about the data to be effective. However, they do have to provide provenance information or chronology of ownership, chain of custody information and location if the information or related data is housed elsewhere. We see this within the zoological community where decisions about data stewardship and data services are made across multiple repositories resulting in partnerships among several institutions. Seeking partnerships that extend institutional capabilities adds value to the research community.

In each case, what we see is that trained, engaged people making connections with others play a key role in the success of these projects. And to me, that sounds a lot like the work librarians and archivists do.

The role of the library and the archives in aggregating and servicing these assets is increasing. Our challenge is to provide new services that respond to the abundant and sometimes chaotic flow of digital data and the evolving patterns of data sharing and data reuse.

The post Bringing order to the chaos of digital data appeared first on OCLC Next.

by Ixchel M. Faniel, Ph.D. at March 15, 2017 02:38 PM

March 14, 2017

TSLL TechScans

New BIBFRAME Components Available from the Library of Congress

The Library of Congress has made available new BIBFRAME 2.0 components. Developed during the Library of Congress' own MARC to BIBFRAME conversion project, these components are being released for public use to assist other libraries with their own BIBFRAME projects.

The new components include:

  • BIBFRAME 2.0 Vocabulary Update
    The BIBFRAME Vocabulary has been updated to meet the needs of the Library of Congress project. It also includes suggestions from other members of the BIBFRAME community. Other new elements were added for testing and possible permanent inclusion.  
  • MARC to BIBFRAME 2.0 Specifications 
    Written from the MARC side so that all MARC tags are considered for inclusion in BIBFRAME, this specification consists of a series of spreadsheets covering each MARC tag field group. MS Word explanatory documents are also included on the site.
  • MARC to BIBFRAME Conversion Programs
    Developed by Index Data for the Library of Congress, these conversion programs will be updated as the BIBFRAME project progresses.

For more details, see the full Library of Congress press release.

by (Travis Spence) at March 14, 2017 08:46 PM

March 11, 2017

Resource Description & Access (RDA)

RDA Cataloging News and RDA Bibliography

RDA Cataloging News and RDA Bibliography is a compilation of news, events, workshops, seminars, conferences, training, articles, books, presentations, videos, web articles, blog posts, reviews, etc. on Resource Description and Access (RDA), Anglo-American Cataloging Rules (AACR2), MARC 21, and Functional Requirements for Bibliographic Records (FRBR).

by Salman Haider at March 11, 2017 05:07 AM

March 10, 2017


Together, we move forward

EMEA Regional Council Meeting

It was great to see everyone in Berlin last month at the EMEA Regional Council meeting. More than 250 guests from 28 countries attended this eighth annual membership meeting, which was held at the European School of Management and Technology (ESMT Berlin). The theme was Libraries at the Crossroads: Resolving Identities, and we explored the trends that are shaping the future of libraries through a rich program of 65 presentations led by 67 thought leaders.

Thank you to all who planned and attended this powerful event. It was a great chance to share knowledge around this important theme while getting to know each other better. A special congratulations goes to the Lightning Talk winner Katrin Kropf from the Public Library of Chemnitz, Germany.

We look forward to seeing you next year on 20–21 February in Edinburgh, Scotland.

EMEA Regional Council 2017 meeting

Photos courtesy of Toni Kretschmer.

The post Together, we move forward appeared first on OCLC Next.

by Eric van Lubeek at March 10, 2017 02:50 PM