Planet Cataloging

April 28, 2017

OCLC Next

Inspiring breakthroughs in librarianship worldwide


Managing the IFLA/OCLC Fellowship program, which began in 2001, is one of the most professionally rewarding experiences of my career. The Program provides advanced continuing education and exposure to a broad range of issues in information technologies, library operations and global cooperative librarianship. With the five Fellows from the 2017 class, the Program has welcomed 85 librarians and information science professionals from 38 countries.

Each year a new class of Fellows brings a new wave of enthusiasm and energy to the program, which we sponsor with IFLA. This class was no exception.

They are:

  • Patience Ngizi-Hara, The Copperbelt University, Zambia
  • Eric Nelson Haumba, YMCA Comprehensive Institute, Uganda
  • Sharisse Rae Lim, National Library of the Philippines
  • Jerry Mathema, Masiyephambili College, Zimbabwe
  • Nguyen Van Kep, Hanoi University, Vietnam

These Fellows will inspire breakthroughs in librarianship around the world and within their communities, and they are equally inspiring to all of us who had the opportunity of interacting with them. They will go on to become library leaders and mentors to others in their home countries. And they will advance the sharing of knowledge around the world.


Video: the 2017 IFLA/OCLC Fellows describe challenges & opportunities for libraries in their countries.


As we were closing out this year’s program, we asked the Fellows to share their experiences and to describe the challenges and opportunities facing libraries in their countries. I invite you to take a look at this moving two-minute video. Congratulations and thank you, IFLA/OCLC class of 2017.

 


by Nancy Lensenmayer at April 28, 2017 01:23 PM

April 27, 2017

Terry's Worklog

MarcEdit Linked Data Rules Files Enhancements

I’m working on a couple enhancements to the MarcEdit Linked Data rules file to accommodate proposals being generated by the PCC regarding the usage of the $0 and the $1 in MARC21 records.  Currently, MarcEdit has the ability to generate $0s for any controlled data within a MARC record, so long as a webservice for the vocabulary has been profiled in the MarcEdit rules file.  This has allowed folks to begin embedding URIs into their MARC records…but one of the areas of confusion is where services like VIAF fit into this equation.

The $0, as described in MARC, should represent the control number or URI of the controlled vocabulary.  This will likely never (or only in rare cases) be VIAF.  VIAF is an aggregator, a switching point, an aggregation of information and services about a controlled term.  It's also a useful service (as are other aggregators), and this has led folks to start adding VIAF data into the $0.  This is problematic, because it dilutes the value of the $0 and makes it impossible to know whether the data being linked to is the source vocabulary or an aggregating service.

To that end, the PCC will likely be recommending the $1 for URIs that are linked data adjacent.  This would allow users to embed references to aggregations like VIAF, while still maintaining the $0 and the URI to the actual vocabulary — and I very much prefer this approach.

So, I'm updating the MarcEdit rules file to allow users to denote multiple vocabularies to query against a single MARC field, and to denote, on a vocabulary-by-vocabulary basis, the subfield for embedding the retrieved URI data.  Previously, this was set as a global value within the field definition set.  This means that if a user wanted to query LCNAF for the main entry (in the 100) and then VIAF to embed a link in a $1, they would just need to use:

<field type="bibliographic">
<tag>100</tag>
<subfields>abcdqnpt</subfields>
<uri>0</uri>
<special_instructions>personal_name</special_instructions>
<vocab subfield="0">naf</vocab>
<vocab subfield="1">viaf</vocab>
</field>

The subfield attribute now defines, for each vocabulary element, where the URI should be stored, and the uri element continues to act as the global subfield setting if no value is defined in the vocab element.  Practically, this means that if you had the following data in your MARC record:

=100  1\$6880-01$aHu, Zongnan,$d1896-1962,$eauthor.

With the rules file updates, the tool will now automatically generate both a $0 and a $1:

=100  1\$6880-01$aHu, Zongnan,$d1896-1962,$eauthor.$0http://id.loc.gov/authorities/names/n84029846$1http://viaf.org/viaf/70322743

But users could easily add other data sources if they have interests beyond VIAF.
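
To make the placement logic concrete, here is a toy sketch in Python. This is emphatically not MarcEdit's actual code (MarcEdit is a .NET application); the data structure and function names are invented for illustration, and the URIs are simply the ones from the example above. It shows the resolution rule just described: the per-vocabulary subfield wins, and the field-level uri value is the fallback.

# Toy illustration only -- not MarcEdit's implementation.
# Mirrors the <field> definition above: a global <uri> subfield ("0")
# plus per-vocabulary overrides ("0" for naf, "1" for viaf).
FIELD_RULE = {
    "tag": "100",
    "uri": "0",  # global default from the <uri> element
    "vocabs": [
        {"name": "naf", "subfield": "0"},
        {"name": "viaf", "subfield": "1"},
    ],
}

def place_uris(marc_line, resolved_uris, rule):
    """Append each vocabulary's URI in its configured subfield."""
    for vocab in rule["vocabs"]:
        uri = resolved_uris.get(vocab["name"])
        if not uri:
            continue
        subfield = vocab.get("subfield") or rule["uri"]  # fall back to the global setting
        marc_line += "$" + subfield + uri
    return marc_line

print(place_uris(
    "=100  1\\$6880-01$aHu, Zongnan,$d1896-1962,$eauthor.",
    {"naf": "http://id.loc.gov/authorities/names/n84029846",
     "viaf": "http://viaf.org/viaf/70322743"},
    FIELD_RULE,
))

Running it reproduces the $0/$1 record shown above.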

These changes will be available in all versions of MarcEdit when the next update is posted in a week or so.  This update will also include an updated rules file that will embed viaf elements in the $1 for all main entry content (when available).

 

–tr

 

 

by reeset at April 27, 2017 08:16 PM

April 26, 2017

TSLL TechScans

Who's Responsible for Digital Preservation?

Many libraries are currently being restructured and rethought to support new campus goals, in response to concerns, or as staffing levels change. That doesn't mean that our core functions change - we're still here to describe materials, provide access to items in a variety of formats, and assist or manage the preservation of materials. What is less clear as we move forward is who is really responsible for what, especially when it comes to digital preservation. Should the library handle it, or should it fall under the realm of the IT department, a different department within your institution, or even an external department or company? These questions are hard enough to answer, but as we take on new roles, both comfortable and unfamiliar, finding the solution has become even more challenging.

To help answer these questions, Harvard Library's Digital Repository Service (DRS) has recently produced and shared documentation of how the management of digital assets is being handled. There is a blog post available describing the process of developing the documentation, and the full document is available on the DRS wiki.

by noreply@blogger.com (Lauren Seney) at April 26, 2017 07:51 PM

April 25, 2017

025.431: The Dewey blog

Chemical weapons and chemical warfare

Where are works on chemical weapons and chemical warfare classed? That depends on what aspect is covered. We’ll touch on a few of the many aspects.

In international law, control of chemical weapons is classed in 341.735 Control of chemical and biological weapons, e.g., The chemical weapons convention: A commentary.

In military science, works about chemical warfare are classed in 358.34 Chemical warfare, e.g., Chemical warfare: A study in restraints. A general history of chemical warfare covering multiple wars (e.g., A history of chemical warfare) is classed in 358.3409 (built with 358.34 plus standard subdivision T1—09 History).

In military engineering, works about chemicals used as destructive agents are classed in 623.4592 Chemical agents. The upward hierarchy includes:

623.4 Ordnance

623.45 Ammunition and other destructive agents

623.459 Nonexplosive agents

At 623.459 is the class-here note: "Class here detection of nonexplosive agents"; that note has hierarchical force. Hence Detection technologies for chemical warfare agents and toxic vapors is classed in 623.4592.

What about history of chemical warfare in a specific war? The Manual note at 930-990 vs. 355.009, 355-359 begins with relevant advice:

930-990 vs. 355.009, 355-359 Military topics and war

Use 930-990 for works on military history that deal with the outcome of significant events in wars, e.g., the use of tanks on the Eastern Front and how their use affected various battles 940.54217. Use the history standard subdivisions in 355-359 for works emphasizing military history or topics without consideration of the general course of a war, e.g., changes in tank tactics during the course of World War II 358.18409044. If in doubt between 930-990 and 355-359, prefer 930-990.

Works like Hell in Flanders Fields: Canadians at the Second Battle of Ypres and Trial by gas: The British Army at the Second Battle of Ypres are much concerned with the outcome of the Second Battle of Ypres in World War I; hence those works are classed in 940.424 1915, western and Austro-Italian fronts.  The upward hierarchy includes:

940.4 Military history of World War

940.42 Land campaigns and battles of 1914-1916

In contrast, Behind the gas mask: The U.S. Chemical Warfare Service in war and peace covers the years 1917-1929; it begins with World War I but goes beyond it to treat continuing policy issues. It is classed in 358.34097309041, built as follows:

358.34 Chemical warfare

plus standard subdivision T1—09

plus notation 73 from T2—73 United States, as instructed at T1—093-T1—099 Specific continents, countries, localities; extraterrestrial worlds

plus notation 090 from the add table entry T1—093-T1—099:0901-0905 Historical periods

plus notation 41 from T1—09041 1900-1919, as instructed in the add table entry

The historical period notation 41 for 1900-1919 is correct by the first-of-two rule.

Remember the key advice in the Manual note: "If in doubt between 930-990 and 355-359, prefer 930-990."

by Juli at April 25, 2017 04:27 PM

TSLL TechScans

Getting to Know TS Librarians: Joni Cassidy



1. Introduce yourself.
Joni Lynn Cassidy, President, Cassidy Cataloguing Services, Inc.

2. Does your job title actually describe what you do? Why/why not?
My job title doesn't really describe my work.  I design and oversee all the projects we work on for the various libraries and institutions we provide cataloguing and technical services support for.  Every day, I meet with the members of my staff to discuss the details of their assignments, and I am also responsible for new client development and the general direction our company is going.

3. What are you reading right now?
I love mysteries and am presently reading "The Detective's Secret" by Lesley Thomson.

4. If you could work in any library (either a type of library or specific one), what would it be? Why?
If I could work anywhere, I would be cataloging the books in the two libraries in Hearst Castle in San Simeon, CA.  This is an elegant place, and at lunchtime I would find a place to sit where I could watch the ocean.

by noreply@blogger.com (Lauren Seney) at April 25, 2017 02:57 PM

April 20, 2017

OCLC Next

The legendary “apple cake” record story


You can find links to thousands (maybe millions) of recipes using the WorldCat Cookbook Finder. And it’s the work of many, many catalogers over 50 years that makes that possible, of course. The metadata entered into WorldCat is what powers the Cookbook Finder and many other library services.

But can you find an actual recipe within those 394 million bibliographic records?

For one brief time (or more?) during OCLC's history, yes, you could.

And today? Well … read on to find out!

It all began in 1974 …

Details are vague. Memories are hazy. And while some may claim to be the first cataloger to create a WorldCat MARC record with the details of an apple cake recipe—and others may deny it!—there is no proof one way or the other. No one knows the motivation behind creating the record other than the sense of humor of the cataloging community.

What we do know is this. At some time in 1974, someone did just that. The record had this in the notes field:

[Image: the apple cake recipe as entered in the record's 505 contents note]

As you might imagine, some catalogers took exception to using metadata fields for actual information. Some argued that the record should remain; others that it would only invite additional shenanigans. Eventually, the record was removed … but not before nearly 200 locations chose to list their libraries as holding the item.

But wait! There’s more!

Over the next few years, the record was added and deleted at least once more. By some accounts, it came and went with alarming frequency as the two sides of the argument struggled for apple cake supremacy.

One step along the way was when the recipe was added to The OCLC Employees’ Cookbook in 1990 by OCLC’s Ron Gardner, who retired earlier this year.

And to celebrate the 25th anniversary of WorldCat in 1996, librarians from the Pittsburgh Theological Seminary Clifford E. Barbour Library actually prepared an in realia copy of the OCLC apple cake.

But what about the record itself? What was the final conclusion of the cataloging community?

In the early 1990s, it seemed as if the recipe would have to remain within the confines of the library material itself, not the WorldCat bibliographic data.

Or would it …


Where do you stand on the Great WorldCat Apple Cake Record discussion?


The apple cake record is alive!

A record with the aptly named title of “apple-cake” was added for an archival material item by Southern Methodist University in Dallas, Texas, USA, in September 2015. And the recipe itself is described in the 505 field.

[Image: the WorldCat apple cake record]

Another apple cake record, this one numbered 826,054,444, was added by the US Army Corps of Engineers Library in Alexandria, Virginia, USA. This record also has the recipe in the notes field.

So for now, the “keep the recipe in WorldCat” team has the upper hand.

Whatever the final disposition of the record, however, one thing is for sure: it’s a tasty cake.


by Kem Lang at April 20, 2017 06:11 PM

April 19, 2017

Terry's Worklog

MarcEdit, Windows XP, and what the way too early log analysis says

Related post: MarcEdit and the Windows XP Sunsetting conversation

So, this has been kind of interesting.  Since Sunday, I've been running a script against my access logs to get a better idea of what the MarcEdit XP user community may look like.  I timed this with the update, because I've found that in general, a large number of users will update the program within the first couple of weeks of the update.  Equally interesting, even though MarcEdit has an automatic updating tool, there are users that stay with very old versions of the 6.x series.

Anyway, a couple of interesting things of note since this active log analysis began:

  1. Since Sunday, MarcEdit's automatic update script has been pinged north of 250,000 times.  That means that since Sunday, users who make use of the autoupdate notifications have opened the program to do work at least a quarter of a million times.
  2. Since Sunday, MarcEdit has been updated ~7000 times.  I’ve estimated that the active user community is around 11-12k users, but that year over year, the update service will report a unique user count in the high 20,000s.  Assuming that these are the most active users, I think this gives an idea of who will be most immediately impacted by any potential change in XP support.
  3. Of those approximately 7,000 users, 9 run XP.  The largest number of these users come from China and India, but surprisingly, there are US users (at academic institutions, based on the IPs reported) that still run XP…my god.
  4. Surprisingly, the most widely used Windows operating system reported was Windows 8.1.  I honestly expected to see Windows 7.  But by usage, the breakdown was Windows 8.x, Windows 10, then Windows 7.  I really thought the order was going to be Windows 7, 10, then 8.x.
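
For the curious, the kind of operating-system tally behind numbers like these can be approximated in a few lines of script. The sketch below is illustrative only and is not the actual analysis script: it assumes Apache-style access logs in which the User-Agent string is the last quoted field and carries the usual "Windows NT x.y" token (5.1 = XP, 6.1 = 7, 6.2/6.3 = 8.x, 10.0 = 10); the real log format and logging details are assumptions here.

import re
import sys
from collections import Counter

# Map the "Windows NT x.y" token found in user-agent strings to a friendly name.
WINDOWS_VERSIONS = {
    "5.1": "Windows XP",
    "5.2": "Windows XP x64 / Server 2003",
    "6.0": "Windows Vista",
    "6.1": "Windows 7",
    "6.2": "Windows 8",
    "6.3": "Windows 8.1",
    "10.0": "Windows 10",
}

ua_pattern = re.compile(r'"([^"]*)"\s*$')  # last quoted field = user agent (assumed log layout)
nt_pattern = re.compile(r"Windows NT (\d+\.\d+)")

counts = Counter()
for line in sys.stdin:
    ua_match = ua_pattern.search(line)
    if not ua_match:
        continue
    nt_match = nt_pattern.search(ua_match.group(1))
    if nt_match:
        counts[WINDOWS_VERSIONS.get(nt_match.group(1), "Other Windows")] += 1

for version, hits in counts.most_common():
    print(f"{version}: {hits}")

Piped a log on stdin, it prints hits per Windows version; hits are not unique users, so a real analysis would also need to de-duplicate by IP or by update token.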

 

So what can I learn from this way-too-early analysis of the update/usage information?  First, it confirms a suspicion that I've had for a while that XP is very much in the minority among MarcEdit's most active users…and that's actually about it.  What I anticipate seeing is that the initial trend (can 7 users be a trend?) will hold: XP usage largely concentrated in China and India (since these are both large user communities).  I'm honestly also curious if this will extend to the Middle East.  Year over year, the Middle East represents the 3rd highest concentration of MarcEdit users.  The question for me as this picture becomes more clear is how widespread XP use actually is — and that I just don't know yet.  Oh, and third…I keep thinking there is a paper to be written here…if I could just find the time.  🙂

–tr

by reeset at April 19, 2017 04:18 PM

April 17, 2017

Coyle's InFormation

Precipitating Forward

Our Legacy, Our Mistake


If you follow the effort taking place around the proposed new bibliographic data standard, BIBFRAME, you may have noticed that much of what is being done with BIBFRAME today begins with our current data in MARC format and converts it to BIBFRAME. While this is a function that will be needed should libraries move to a new data format, basing our development on how our legacy data converts is not the best way to move forward. In fact, it doesn't really tell us what "forward" might look like if we give it a chance.

We cannot define our future by looking only at our past. There are some particular aspects of our legacy data that make this especially true.          

I have said before (video, article) that we made a mistake when we went from printing cards using data encoded in MARC, to using MARC in online catalogs. The mistake was that we continued to use the same data that had been well-adapted to card catalogs without making the changes that would have made it well-adapted to computer catalogs. We never developed data that would be efficient in a database design or compatible with database technology. We never really moved from textual description to machine-actionable data points. Note especially that computer catalogs fail to make use of assigned headings as they are intended, yet catalogers continue to assign them at significant cost.

One of the big problems in our legacy data that makes it hard to take advantage of computing technology is that the data tends to be quirky. Technology developers complain that the data is full of errors (as do catalogers), but in fact it is very hard to define, algorithmically, what is an error in our data.  The fact is that the creation of the data is not governed by machine rules; instead, decisions are made by humans with a large degree of freedom. Some fields are even defined as being either this or that, something that is never the case in a data design. A few fields are considered required, although we've all seen records that don't have those required fields. Many fields are repeatable and the order of fields and subfields is left to the cataloger, and can vary.

The cataloger view is of a record of marked-up text. Computer systems can do little with text other than submit it for keyword indexing and display it on the screen. Technical designers look to the fixed fields for precise data points that they can operate on, but these are poorly supported and are often not included in the records since they don't look like "cataloging" as it is defined in libraries. These coded data elements are not defined by the cataloging code, either, and can be seen as mere "add-ons" that come with the MARC record format. The worst of it is that they are almost uniformly redundant with the textual data yet must be filled in separately, an extra step in the cataloging process that some cannot afford.

The upshot of this is that it is very hard to operate over library catalog data algorithmically. It is also very difficult to do any efficient machine validation to enforce consistency in the data. If we carry that same data and those same practices over to a different metadata schema, it will still be very hard to operate over algorithmically, and it will still be hard to do quality control as a function of data creation.

The counter argument to this is that cataloging is not a rote exercise - that catalogers must make complex decisions that could not be done by machines. If cataloging were subject to the kinds of data entry rules that are used in banking and medical and other modern systems, then the creativity of the cataloger's work would be lost, and the skill level of cataloging would drop to mere data entry.

This is the same argument you could use for any artisanal activity. If we industrialize the act of making shoes, the skills of the master shoe-maker are lost. However, if we do not industrialize shoe production, only a very small number of people will be able to afford to wear shoes.

This decision is a hard one, and I sympathize with the catalogers who are very proud of their understanding of the complexity of the bibliographic world. We need people who understand that complexity. Yet increasingly we are not able to afford to support the kind of cataloging practices of which we are proud. Ideally, we would find a way to channel those skills into a more efficient workflow.

There is a story that I tell often: In the very early days of the MARC record, around the mid-1970's, many librarians thought that we could never have a "computer catalog" because most of our cataloging existed only on cards, and we could NEVER go back and convert the card catalogs, retype every card into MARC. At that same time, large libraries in the University of California system were running over 100,000-150,000 cards behind in their filing. For those of you who never filed cards... it was horribly labor intensive. Falling 150,000 cards behind meant that a book was on the shelf THREE MONTHS before the cards were in the catalog. Some of this was the "fault" of OCLC which was making it almost too easy to create those cards. Another factor was a great increase in publishing that was itself facilitated by word processing and computer-driven typography. Within less than a decade it became more economical to go through the process of conversion from printed cards to online catalogs than to continue to maintain enormous card catalogs. And the rest is history. MARC, via OCLC, created a filing crisis, and in a sense it was the cost of filing that killed the card catalog, not the thrill of the modern online catalog.

The terrible mistake that we made back then was that we did not think about what was different between the card catalog and the online catalog, and we did not adjust our data creation accordingly. We carried the legacy data into the new format which was a disservice to both catalogers and catalog users. We missed an opportunity to provide new discovery options and more efficient data creation.

We mustn't make this same mistake again.

The Precipitant

Above I said that libraries made the move into computer-based catalogs because it was uneconomical to maintain the card catalog. I don't know what the precipitant will be for our current catalog model, but there are some rather obvious places to look for that straw that will break the MARC/ILS back. These problems will probably manifest themselves as costs that require the library to find a more efficient and less costly solution. Here are some of the problems that I see today that might be factors that require change:

  • Output rates of intellectual and cultural products are increasing. Libraries have already responded to this through shared cataloging and purchase of cataloging from product vendors. However, the records produced in this way are then loaded into thousands of individual catalogs in the MARC-using community.
  • Those records are often edited for correctness and enhanced. Thus they are costing individual libraries a large amount of money, potentially as much or more than libraries save by receiving the catalog copy.
  • Each library must pay for a vendor system that can ingest MARC records, facilitate cataloging, and provide full catalog user (patron) support for searching and display.
  • "Sharing" in today's environment means exporting data and sending it as a file. Since MARC records can only be shared as whole records, updates and changes generally are done as a "full record replace" which requires a fair amount of cycles. 
  • The "raw" MARC record as such is not database friendly, so records must be greatly massaged in order to store them in databases and provide indexing and displays. Another way to say this is that there are no database technologies that know about the MARC record format. There are database technologies that natively accept and manage other data formats, such as key-value pairs.

There are some current technologies that might provide solutions:

  • Open source. There is already use of open source technology in some library projects. Moving more toward open source would be facilitated by moving away from a library-centric data standard and using at least a data structure that is commonly deployed in the information technology world. Some of this advantage has already been obtained with using MARCXML.
  • The cloud. The repeated storing of the same data in thousands of catalogs means not being able to take advantage of true sharing. In a cloud solution, records would be stored once (or in a small number of mirrors), and a record enhancement would enhance the data for each participant without being downloaded to a separate system. This is similar to what is being proposed by OCLC's WorldShare and Ex Libris' Alma, although presumably those are "starter" applications. Use of the cloud for storage might also mean less churning of data in local databases; it could mean that systems could be smaller and more agile.
  • NoSQL databases and triple stores. The current batch of databases are open source, fast, and can natively process data in a variety of formats (although not MARC). Data does not have to be "pre-massaged" in order to be stored in a database or retrieved and the database technology and the data technology are in sync. This makes deployment of systems easier and faster. There are NoSQL database technologies for RDF. Another data format that has dedicated database technology is XML, although that ship may have sailed by now.
  • The web. The web itself is a powerful technology that retrieves distributed data at astonishing rates. There are potential cost/time savings on any function that can be pushed out to the web to make use of its infrastructure.

The change from MARC to ?? will come and it will be forced upon us through technology and economics. We can jump to a new technology blindly, in a panic, or we can plan ahead. Duh.



by Karen Coyle (noreply@blogger.com) at April 17, 2017 08:13 PM

If It Ain't Broke

For the first time in over forty years there is serious talk of a new metadata format for library bibliographic data. This is an important moment.

There is not, however, a consensus within the profession on the need to replace the long-standing MARC record format with something different. A common reply to the suggestion that library data creation needs a new data schema is the phrase: "If it ain't broke, don't fix it." This is more likely to be uttered by members of the cataloging community - those who create the bibliographic data that makes up library catalogs - than by those whose jobs entail systems design and maintenance. It is worth taking a good look at the relationship that catalogers have with the MARC format, since their view is informed by decades of daily encounters with a screen of MARC encoding.

Why This Matters

When the MARC format was developed, its purpose was clear: it needed to provide the data that would be printed on catalog cards produced by the Library of Congress. Those cards had been printed for over six decades, so there was no lack of examples to use to define the desired outcome. In ways unimagined at the time, MARC would change, nay, expand the role of shared cataloging, and would provide the first online template for cataloging.

Today work is being done on the post-MARC data schema. However, how the proposed new schema might change the daily work of catalogers is unclear. There is some anxiety in the cataloging community about this, and it is understandable. What I unfortunately see is a growing distrust of this development on the part of the data creators in our profession. It has not been made clear what their role is in the development of the next "MARC," not even whether their needs are a driving force in that development. Surely a new model cannot be successful without the consideration (or even better, the participation) of the people who will spend their days using the new data model to create the library's data.

(An even larger question is the future of the catalog itself, but I hardly know where to begin on that one.)


If it Ain't Broke...

The push-back against proposed post-MARC data formats is often seen as a blanket rejection of change. Undoubtedly this is at times the case. However, given that there have now been multiple generations of catalogers who worked and continue to work with the MARC record, we must assume that the members of the cataloging community have in-depth knowledge of how that format serves the cataloging function. We should tap that knowledge as a way to understand the functionality in MARC that has had a positive impact on cataloging for four decades, and should study how that functionality could be carried forward into the future bibliographic metadata schema.

I asked on Twitter for input on what catalogers like about MARC, and received some replies. I also viewed a small number of presentations by catalogers, primarily those about proposed replacements for MARC. From these I gathered the following list of "what catalogers like about MARC." I present these without comment or debate. I do not agree with all of the statements here, but that is no matter; the purpose here is to reflect cataloger perspectives.

(Note: This list is undoubtedly incomplete and I welcome comments or emails with your suggestions for additions or changes.)


What Catalogers Like/Love About MARC



There is resistance to moving away from using the MARC record for cataloging among some in the Anglo-American cataloging community. That community has been creating cataloging data in the MARC formats for forty years. For these librarians, MARC has many positive qualities, and these are qualities that are not perceived to exist in the proposals for linked data. (Throughout the sections below, read "library cataloging" and variants as referring to the Anglo-American cataloging tradition that uses the MARC format and the Anglo-American Cataloging Rules and its newer forms.)

MARC is Familiar

Library cataloging makes use of a very complex set of rules that determine how a resource is described. Once the decisions are made regarding the content of the description, those results are coded in MARC. Because the creation of the catalog record has been done in the MARC format since the late 1970's, working catalogers today have known only MARC as the bibliographic record format and the cataloging interface. Catalogers speak in "MARC" - using the tags to name data elements - e.g. "245" instead of "title proper".

MARC is WYSIWYG

Those who work with MARC consider it to be "human readable." Most of the description is text, therefore what the cataloger creates is exactly what will appear on the screen in the library catalog. If a cataloger types "ill." that is what will display; if the cataloger instead types "illustrations" then that is what will display. In terms of viewing a MARC record on a screen, some cataloger displays show the tags and codes to one side, and the text of those elements is clearly readable as text.

MARC Gives Catalogers Control

The coding is visible, and therefore what the cataloger creates on the screen is virtually identical to the machine-readable record that is being created. Everything that will be shown in the catalog is in the record (with the exception of cover art, at least in some catalogs). The MARC rules say that the order of fields and subfields in the record is the order in which that information should be displayed in the catalog. Some systems violate this by putting the fields in numeric order, but the order of subfields is generally maintained. Catalogers wish to control the order of display and are frustrated when they cannot. In general, changing anything about the record with automated procedures can undo the decisions made by catalogers as part of their work, and is a cause of frustration for catalogers.

MARC is International

MARC is used internationally, and because the record uses numerics and alphanumeric codes, a record created in another country is readable to other MARC users. Note that this was also the purpose of the International Standard Bibliographic Description (ISBD), which instead of tags uses punctuation marks to delimit elements of the bibliographic description. If a cataloger sees this, but cannot read the text:

  245 02   |a לטוס עם עין אחת / |c דני בז.

it is still clear that this is a title field with a main title (no subtitle), followed by a statement of the author's name as provided on the title page of the book.

MARC is the Lingua Franca of Cataloging

This is probably the key point that comprises all of the above, but it is important to state it as such. This means that the entire workflow, the training materials, the documentation - all use MARC. Catalogers today think in MARC and communicate in MARC. This also means that MARC defines the library cataloging community in the way that a dialect defines the local residents of a region. There is pride in its "library-ness". It is also seen as expressing the Anglo-American cataloging tradition.

MARC is Concise

MARC is concise as a physical format (something that is less important today than it was in the 1960s when MARC was developed), and it is also concise on the screen. "245" represents "title proper"; "240" represents "uniform title"; "130" represents "uniform title main entry". Often an entire record can be viewed on a single screen, and the tags and subfield codes take up very little display space.

MARC is Very Detailed

MARC21 has about 200 tags currently defined, and each of these can have up to 36 subfields. There are about 2000 subfields defined in MARC21, although the distribution is uneven and depends on the semantics of the field; some fields have only a handful of subfields, and in others there are few codes remaining that could be assigned.

MARC is Flat

The MARC record is fairly flat, with only two levels of coding: field and subfield. This is a simple model that is easy to understand and easy to visualize.

MARC is Extensible

Throughout its history, the MARC record has been extended by adding new fields and subfields. There are about 200 defined fields which means that there is room to add approximately 600 more.

MARC has Mnemonics

Some coding is either consistent or mnemonic, which makes it easier for catalogers to remember the meaning of the codes. There are code blocks that refer to cataloging categories, such as the title block (2XX), the notes block (5XX) and the subject block (6XX). Some subfields have been reserved for particular functions, such as the use of the numeric subfields in 0-8. In other cases, the mnemonic is used in certain contexts, such as the use of subfield "v" for the volume information of series. In other fields, the "v" may be used for something else, such as the "form" subfield in subject fields, but the context makes it clear.

There are also field mnemonics. For example, all tagged fields that have "00" in the second and third places are personal name fields. All fields and subfields that use the number 9 are locally defined (with a few well-known exceptions).

MARC is Finite and Authoritative

MARC defines a record that is bounded. What you see in the record is all of the information that is being provided about the item being described. The concept of "infinite graphs" is hard to grasp, and hard to display on a screen. This also means that MARC is an authoritative statement of the library bibliographic description, whereas graphs may lead users to sources that are not approved by or compatible with the library view.

by Karen Coyle (noreply@blogger.com) at April 17, 2017 08:05 PM

Terry's Worklog

Can my ILS be added to MarcEdit’s ILS Integration?

This question has shown up in my email box a number of times over the past couple of days.  My guess is that it's related to the YouTube videos recently posted demonstrating how to set up and use MarcEdit directly with Alma.

  1. Windows Version: https://youtu.be/8aSUnNC48Hw
  2. Mac Version: https://youtu.be/6SNYjR_WHKU

 

Folks have been curious how this work was done, and if it would be possible to do this kind of integration on their local ILS system.  As I was answering these questions, it dawned on me that others may be interested in this information as well — especially if they are planning to speak to their ILS vendor.  So, here are some common questions currently being asked, and my answers.

How are you integrating MarcEdit with the ILS?

About 3 years ago, the folks at Koha approached me.  A number of their users make use of MarcEdit, and had wondered if it would be possible to have MarcEdit work directly with their ILS system.  I love the folks over in that community — they are consistently putting out great work, and had just recently developed a REST-based API that provided read/write operations into the database.   Working with a few folks (who happen to be at ByWaters, another great group of people), I was provided with documentation, a testing system, and a few folks willing to give it a go — so I started working to see how difficult it would be.  And the whole time I was doing this, I kept thinking – it would be really nice if I could do this kind of thing with our Innovative Interfaces (III) catalog.  While III didn’t offer an API at the time (and for the record, as of 4/17/2017, they still don’t offer a viable API for their product outside of some toy API for dealing primarily with patron and circulation information), I started to think beyond Koha and realized that I had an opportunity to not just create a Koha specific plugin but use this integration as a model to develop an integration framework in MarcEdit.  And that’s what I did.  MarcEdit’s integration framework can potentially handle the following operations (assuming the system’s API provides them):

  1. Bibliographic and Holdings Records Search and Retrieval — search can be via API call, SRU or Z39.50
  2. Bibliographic and Holdings Records creation and update
  3. Item record management

 

I’ve added tooling directly into MarcEdit that supports the above functionality, allowing me to plug and play an ILS based on the API that they provide.  The benefit is that this code is available in all versions of MarcEdit, so once the integration is created, it works in the Windows version, the Linux version, and the Mac version without any additional work.  If a community was interested in building a more robust integration client, then I/they could look at developing a plugin — but this would be outside of the integration framework, and takes a significant amount of work to make cross-platform compatible (given the significant differences in UI development between Windows, the MacOS, and Linux).

This sounds great, what do you need to integrate my ILS with MarcEdit?

This has been one of the most common questions I've received this weekend.  Folks have watched or read about the Alma integration, and wondered if I can do it with their ILS.  My general answer, and I mean this, is that I'm willing to integrate any ILS system with MarcEdit, so long as it provides the API endpoints that make it possible to:

  1. Search for bibliographic data (holdings data is a plus)
  2. Allow for the creation and update of bibliographic data
  3. Utilize an application friendly authentication process, that hopefully allows the tool to determine user permissions

 

This is a pretty low bar.  Basically, an API just needs to be present; and if there is one, then integrating the ILS with MarcEdit is pretty straightforward.
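
As a rough illustration of how low that bar is, here is what that minimal read/write surface might look like expressed as generic REST calls in Python. Everything in this sketch is hypothetical: the host, paths, parameters, media types, and token handling are invented for illustration and do not correspond to MarcEdit's integration framework or to any particular vendor's API.

import requests

BASE = "https://ils.example.edu/api"   # hypothetical ILS API root
TOKEN = "an-api-key-or-oauth-token"    # however the vendor handles authentication

def search_bibs(query):
    """1. Search for bibliographic data (this could equally be SRU or Z39.50)."""
    resp = requests.get(f"{BASE}/bibs", params={"q": query},
                        headers={"Authorization": f"Bearer {TOKEN}"})
    resp.raise_for_status()
    return resp.json()  # e.g. a list of record ids plus brief metadata

def get_bib(record_id):
    """Retrieve one record, ideally as MARCXML or MARC-in-JSON."""
    resp = requests.get(f"{BASE}/bibs/{record_id}",
                        headers={"Authorization": f"Bearer {TOKEN}",
                                 "Accept": "application/marcxml+xml"})
    resp.raise_for_status()
    return resp.text

def update_bib(record_id, marcxml):
    """2. Create or update bibliographic data."""
    resp = requests.put(f"{BASE}/bibs/{record_id}", data=marcxml,
                        headers={"Authorization": f"Bearer {TOKEN}",
                                 "Content-Type": "application/marcxml+xml"})
    resp.raise_for_status()
    return resp.status_code

If an API can do roughly this, plus tell the tool what the authenticated user is allowed to touch, it can in principle be wired into the kind of integration framework described above.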

OK, so my ILS system has an API, what else do I need to do?

This is where it gets a bit trickier.  ILS vendors tend not to work well with folks who are not their customers, or who are not other corporations.  I'm generally neither, and for the purposes of this type of development, I'll always be neither.  This means that getting this work to happen generally requires a local organization within a particular ILS community to champion the development, and by that, I mean either providing the introductions to the necessary people at the ILS, or providing access to a local sandbox so that development can occur.  This is how the Alma integration was first initiated.  There were some interested folks at the University of Maryland who spent a lot of time working with me and with ExLibris to make it possible for me to do this integration work.  Of course, after we got started and this work gained some interest, ExLibris reached out directly, which ultimately made this a much easier process.  In fact, I'm rarely impressed by our ILS community, but I've been impressed by the individuals at ExLibris for this specifically.  While it took a little while to get the process started, they do have open documentation, and once we got started, they have been very approachable in answering questions.  I've never used their systems, and I've had other dealings with the company that have been less positive, but in this, ExLibris's open approach to documentation is something I wish other ILS vendors would emulate.

I’ve checked, we have an API and our library would be happy to work with you…but we’ll need you to sign an NDA because the ILS API isn’t open

Ah, I neglected above to mention one of my deal-breakers and why I have not, at present, worked with the APIs that I know are available in systems like Sirsi.  I won't sign an NDA.  In fact, in most cases, I'll likely publish the integration code for those that are interested.  But more importantly, and this I can't stress enough, I will not build an integration into MarcEdit for an ILS system where the API is something that must be purchased as an add-on service, or requires an organization to purchase a license to "unlock" the API access.  API access is a core part of any system, and the ability to interact, update, and develop new workflows should be available to every user.  I have no problem with ILS vendors working with closed source systems (MarcEdit is closed source, even though I release large portions of the components into the public domain, to simplify supporting the tool), but if you are going to develop a closed source tool, you have a responsibility to open up your APIs and provide meaningful gateways into the application to enable innovation.  And let's face it, ILS systems have sucked at this, much to the library community's detriment.  This really needs to change, and while the ability to integrate with a tiny, insignificant tool like MarcEdit isn't going to make an ILS system more open, I personally get to make that same choice, and I have made the choice that I will only put development time into integration efforts on ILS systems that understand that their community needs choices and that actively embrace the ability for their communities to innovate.  What this means, in practical terms, is that if your ILS system requires you or me to sign an NDA to work with the API, I'm out.  If your ILS system requires you or its customers to pay for access to the API through an additional license, training, or as an add-on to the system (and this one particularly annoys me), I'm out.  As an individual, you are welcome to develop the integrations yourself as a MarcEdit plugin, and I'm happy to answer questions and help individuals through that process, but I will not do the integration work in MarcEdit itself.

I’ve checked, my ILS system API meets the above requirements, how do we proceed?

Get in touch with me at reeset@gmail.com.  The actual integration work is pretty insignificant (I'm just plugging things into the integration framework); usually, the most time-consuming part is getting access to a test system and documenting the process.

Hopefully, that answers some questions.

–tr

by reeset at April 17, 2017 02:32 PM

April 16, 2017

Terry's Worklog

MarcEdit Updates Posted

Change log below:

 

Mac Updates:

2.3.12
**************************************************
** 2.3.12
**************************************************
* Update: Alma Integration Updates: New Create Holdings Record Template
* Update: Integration Framework refresh: corrects issues where folks were getting undefined function errors.
* Update: the #xx field syntax will be available in the Edit field and Edit indicator functions.  This means users will be able to edit all 6xx fields using the edit field function by using 6xx in the field textbox.
* Update: SRU Library updates to provide better error checking (specific for Windows XP)
* Update: Adding support for the Export Settings command.  This will let users export and import settings when changing computers.

Windows Updates:
6.3.2
* Update: Alma Integration Updates: New Create Holdings Record Template
* Update: Integration Framework refresh: corrects issues where folks were getting undefined function errors.
* Update: the #xx field syntax will be available in the Edit field and Edit indicator functions.  This means users will be able to edit all 6xx fields using the edit field function by using 6xx in the field textbox.
* UI Updates: All comboboxes that include 0-999 field numbers in the Edit Field, Edit Subfield, Swap Field, Copy Field, etc. have been replaced with Textboxes.  Having the dropdown boxes just didn't seem like good UX design.
* Enhancement: RunAs32 bit mode on the 64 bit systems (for using Connexion) has been updated.  Also, I'll likely be adding a visual cue (like adding * 32 to the Main window title bar) so that users know that the program is running in 32 bit mode while on a 64 bit system.
* Enhancement: MarcEdit 7 Update Advisor
* Update: SRU Library updates to provide better error checking (specific for Windows XP)

–tr

by reeset at April 16, 2017 09:12 PM

MarcEdit 7 Upgrade Advisor

This post is related to the: MarcEdit and the Windows XP Sunsetting conversation

I'll be updating this periodically, but I wanted to make this available now.  One of the biggest changes related to MarcEdit 7 is that I'm interested in building against an updated version of the .NET framework.  Tentatively, I'm looking to build against the 4.6 framework, but would be open to building against the 4.5.2 framework.  To allow users to check their local systems and provide me with feedback, I'm including an upgrade advisor.  This will be updated periodically as my plans related to MarcEdit 7 take shape.

You can find the upgrade advisor under the Help menu item on the Main MarcEdit Window.

Upgrade Advisor Window:

As I’ve noted, my plan is to build against the 4.6 .NET Framework.  This version of the .NET framework is supported on Windows Vista-Windows 10.

–tr

 

by reeset at April 16, 2017 07:12 PM

MarcEdit and the Windows XP Sunsetting conversation

I'm again thinking about Windows XP and whether MarcEdit should continue to support the nearly 20-year-old OS.  What is pushing this again is the integration work I've been doing with Alma.  None of this work will be available to XP users, in part because ExLibris (rightly) requires TLS 1.2 for connections to its services.  TLS 1.2 isn't supported on XP (and never will be), so the connection cannot be made to the Alma service.  I have a feeling more and more of these kinds of problems are going to come up, and the way to "fix" this on my end is to migrate MarcEdit's Windows version to the .NET 4.6 framework (the Mac version is already there).  Right now, I'm using 4.0 as a base because it is supported on Windows XP SP3, but I have to do a number of hacks to keep things working as new technology keeps becoming available.  So, I think I'm ready to let Windows XP go, and say that it is time for the MarcEdit community to put this system behind us.  I'll likely leave the last version of MarcEdit with XP support available for download, and maybe this change will be marked by a version number change (MarcEdit 7), but it really needs to happen.

At the same time, I'm sensitive to the fact that XP has lived so long because it has been the primary system used in a number of developing countries for a very long time.  This is part of the reason I wanted to let folks know that I'm planning to make this change, and give folks an opportunity to provide some feedback.  I'll also be doing a few things between now and June.  I have a lot of log data, and I'll be looking at these logs to identify actual Windows XP use.  A quick glance at my website stats tells me that Windows XP is a minority operating system — but I want to dig deeper.  MarcEdit is run hundreds of thousands of times over the course of a month (per the update logs), and I've tweaked the logging to preserve the user's operating system (currently, I keep nothing but a general geographic location).  My hope is that libraries will have long since let go of Windows XP, but I want to do some due diligence to make sure this is the case.  If I find that there is still significant XP usage, I'd likely extend my date before dropping XP support from my development code branch, but usage will need to be pretty significant.  But this is also why I'm welcoming feedback.  I do want to be sensitive to the user community and try to make sure that no one gets left behind.

As of right now, my plan is to leave the last XP-compatible version of 6.x available for download, but this would be the end-of-life version, which would receive no further updates.  The benefit for the MarcEdit community is that this will allow me to remove large sections of code that exist only because I'm dragging XP along, as well as provide an improved and slimmed-down installer.  Likewise, I'll be able to take advantage of some new coding structures that will allow me to more easily do some semantic web integrations, as well as make it easier to support some of the 3rd party integrations that we see.  Additionally, it will bring the Windows codebase into sync with the Linux and MacOS codebases.  By default, both of those systems make use of the 4.6 framework, which would make testing much easier on my end, as right now I'm testing code across 10 operating systems and 4 different .NET framework versions.  Finally, since this will be a breaking change (MarcEdit 7 would be developed to work with Windows 7+ on the Windows side; I believe it would also work on Vista, but I would no longer be formally testing for that OS given its historically low uptake), I would ensure that users could install the MarcEdit 6.x and MarcEdit 7.x programs side by side.  This is the same approach that I took when moving from MarcEdit 5 to 6, and when I dropped support for pre-XP Service Pack 3 machines (of which there were still a handful).

In a couple of months, I'll start detailing formal plans.  I should also have enough data (from log analysis and active logging of OS usage for the next 2 months) to have a good idea of what the impact might be and whether the timetable needs to be shifted.

I’m tentatively planning to make this change in Fall 2017, though I’ll very likely have a working build of the MarcEdit 7.x branch in early June.

–tr

by reeset at April 16, 2017 02:26 PM