Doubles: Good and Bad

Doubles, doppelgängers, dupes ... in a book or a movie, these are often code for trouble — a malevolent force in the world that looks and acts like you, that people mistake for you, that steals your life.

Each ONIX record fully describes one saleable product — typically a single book format associated with a unique ISBN. Within each book record, the ONIX standard supports unique data points, normally a one-to-one correspondence between a uniquely defined code and its equally unique value. A few sections like Subjects have the same code (a subject scheme) supporting multiple entries (unique subject codes). There are also a few individual tags that can repeat to support multiple unique values from its code list (Product Form Detail is an example), or to maintain uniqueness and avoid duplication you can provide multiple unique codes to show the associated value covers more than one use (the Product Relation Code in Related Product is an example). In each case something in the structure of ONIX repeats to maintain something unique and meaningful as metadata.

A code only exists because a code list (multiple codes) exists. Anywhere a code and a value are provided, there's the potential to provide a different code with another value, typically by repeating the container element (a composite, in ONIX speak). It’s what drives the full potential of the metadata.

It requires data receivers to:

- select what they need, choosing the uniquely defined code and value combos that are meaningful to them; and
- ignore or not select what they don’t need.

It requires data senders to:

- send uniquely defined code and value combinations that match the definition provided in the code list; and
- fully exploit their ability to send multiple code and value combinations to provide support for multiple needs and use cases.

At BookNet, we like to say that ONIX 3.0 wants to use data more accurately than 2.1. One of its big advantages is that it offers more ways to create unique data points, with greater clarity, than 2.1 does. Here’s an example (one that applies to ONIX 2.1 as well).

Case study: The Not-So-Simple Case of “Every book has a publisher”

It’s a primary data point, an expected value: the publisher is the business responsible for publishing the book and it should be named in the metadata. Every book might have a publisher, but ONIX Code List 45 Publishing Role carries 18 codes, or 18 ways to use the Publisher composite to give unique values, and only two of them can be used to name the book’s publisher. The data receiver’s need to know who published the book is pitted against the data sender’s complex co-publishing relationships, their need to support marketing by naming the important publisher of the original-language version, or their desire to honestly call out a sponsor’s close relationship to a project.

The joy of metadata comes from its depth and breadth and the bases it can cover. So long as it's meaningful, accurate, and usable it should be there.

The expected value should also be there.

Senders have a responsibility to ensure that the expected value, the publisher of the book, is easily found, and for the Publisher composite it's obvious: the primary publisher, the contact publisher, should be identified using the expected “01” Publisher code. Every ONIX record should have it.
Receivers should select that “01” with an expectation of uniqueness, but they have an additional responsibility. They should also plan for the use of “02” Co-publisher entries and occasional bad data. If receivers choose to ignore the wealth of marketing and business potential offered by the other 16 codes, that may be a good decision for them.

Receivers have an extra responsibility because of the atypical case where books have multiple publishers and the data they receive can reflect atypical situations. To be clear, I would strongly counsel data senders to create entries for multiple publishers by providing a single publisher “01” and listing the second (or remaining) as “02” co-publisher(s). That would cleanly define the primary contact publisher using the expected code association and automatically create an easily understood hierarchy. Anytime you enable data loading to be done “normally” you’re creating a supply chain asset.

However, pragmatic EDItEUR creates ONIX and they allow for:

- single “01” Publisher entry (typical); or
- single “01” Publisher entry plus one or more “02” Co-publisher entries; or
- multiple “02” Co-publishers without a “01” Publisher entry (all publishers are equally responsible and therefore all co-publishers).
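
On the receiving side, those three patterns can be handled with a few lines of code. What follows is only a minimal sketch, not anyone's production loader: it assumes ONIX 3.0 reference tag names (Publisher, PublishingRole, PublisherName), leaves out the XML namespace a real file would carry, and uses made-up publisher names. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment of one record; names are invented for illustration
# and the ONIX namespace is omitted to keep the sketch short.
SAMPLE = """
<PublishingDetail>
  <Publisher>
    <PublishingRole>02</PublishingRole>
    <PublisherName>Second House</PublisherName>
  </Publisher>
  <Publisher>
    <PublishingRole>01</PublishingRole>
    <PublisherName>First House</PublisherName>
  </Publisher>
  <Publisher>
    <PublishingRole>03</PublishingRole>
    <PublisherName>Generous Sponsor</PublisherName>
  </Publisher>
</PublishingDetail>
"""


def publishers_by_role(publishing_detail):
    """Group publisher names by their PublishingRole code."""
    by_role = {}
    for pub in publishing_detail.findall("Publisher"):
        role = pub.findtext("PublishingRole", default="").strip()
        name = pub.findtext("PublisherName", default="").strip()
        by_role.setdefault(role, []).append(name)
    return by_role


def display_publishers(publishing_detail):
    """Return the names a receiver might show as the book's publisher."""
    by_role = publishers_by_role(publishing_detail)
    # Prefer the "01" entry, then list any "02" co-publishers after it.
    # Other roles (sponsor and so on) are ignored here, not rejected.
    return by_role.get("01", []) + by_role.get("02", [])


detail = ET.fromstring(SAMPLE)
print(display_publishers(detail))  # ['First House', 'Second House']
```

Selecting by role code, and not by position in the file, is what lets the same loader cope with all three allowed patterns.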

If it’s complicated, read their Best Practice documentation. For instance, it carefully distinguishes between co-publishing and co-editions. The latter would never be published under the same ISBN or appear in the same ONIX record.

EDItEUR does this because they understand publishing contracts are unpredictable. ONIX users should always look beyond the code’s description to the notes column available in the full ONIX Code Lists for more information.

This example is to highlight that even in the simplest case of providing the Publisher’s name, both senders and receivers of metadata have a responsibility to provide and accept metadata that reflects reality.

But for the sake of supply chain sanity what’s unique must make sense within the metadata!

Each Publisher composite should represent a single entity described by a specific code — that’s the typical case throughout ONIX. Don’t create new businesses by hyphenating two companies together as one entry.
Receivers MUST select the “01” composite and understand that “02” entries are possible too. They cannot ignore the Role Code and require that each record have a single Publisher Name entry. That wouldn’t allow the sender to provide the full 18 possible unique values offered by ONIX Code List 45 Publishing Role. Asking for uniqueness to be supported that way is called “a flavour of ONIX,” a special file made just for them. It leaves the bitter taste of unnecessary expense in its wake. But it's completely reasonable for a data receiver to expect to find a single “01” entry most of the time and to be willing to deal with exceptions.
Senders MUST participate by providing the ability to select uniqueness with clear intent: you’ve got to read the definition of the code, both description and notes as supplied by EDItEUR, and match it. You might be providing multiple “01” Publisher entries (an error based on the “02” code’s Notes on what’s a co-publisher); a single entry as a “02” Co-publisher (a clear error that contradicts the code’s Notes); multiple “02” codes without a “01” (definitely allowed, but unclear which publisher should be the contact for queries and payment); or only a “03” Sponsor (?!?! an absolute error that makes no sense given the importance of the publisher metadata). If so, you’d better know it and consider the problems this will create for data receivers.
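
A sender could catch the cases listed above before a file ever leaves the building. Here is a rough sketch of such a check, assuming the Publishing Role codes for one record have already been pulled out into a plain list. The function name and the warning wording are my own, not anything defined by EDItEUR.

```python
from collections import Counter


def check_publishing_roles(roles):
    """Return warnings for the list of PublishingRole codes in one record."""
    counts = Counter(roles)
    warnings = []
    if counts["01"] > 1:
        warnings.append("multiple '01' Publisher entries")
    if counts["01"] == 0 and counts["02"] == 1:
        warnings.append("single '02' Co-publisher without an '01' Publisher")
    if counts["01"] == 0 and counts["02"] > 1:
        # Allowed, but worth a second look before sending.
        warnings.append("only '02' Co-publishers: no clear contact publisher")
    if counts["01"] == 0 and counts["02"] == 0:
        warnings.append("no '01' or '02' entry: the book's publisher is not named")
    return warnings


print(check_publishing_roles(["01", "02", "03"]))  # []
print(check_publishing_roles(["03"]))
```

The check only warns. Whether a warning blocks the file is a business decision, since one of these patterns is explicitly allowed.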

Most of the time, do the expected thing and ensure a consistently presented Publisher Name is provided as a single “01” value, supported as necessary by a co-publisher. Then, have truly excellent reasons when you supply multiple publisher entries as separate “02” entries without a “01” entry. You’re not making it easy to get paid, after all! Then, should you need to, use any of the other 16 codes offered by ONIX Code List 45 Publishing Role to further describe this magnificent and most particular book. Do it knowing that unsent metadata does nothing. If it’s accurate and meaningful include it, but don’t dilute, obscure, or contradict the “expected” data.

Double Plus Good

Animated gif of the 1984 Apple Computer ad

Having provided a detailed case study, here are more ways that ONIX 3.0 allows for more flexibility and better communication with your trading partners.

Product Contributors

What’s unique: Each contributor has their own section. Each should be treated as an individual (or as a single group entity as a corporate contributor).

Why being unique is important:

ONIX supports an incredible variety of ways to support that specific author record, and if identity-based marketing is a goal, a unique contributor entry is a need.
- Contributor Role Code — repeats so an individual’s specific record can support their having multiple roles describing their authorship of this book.
- Contributor Place — unique codes that can be paired with locations. You layer information that can be searched and retrieved as data points. Geography and identity are linked.
- Identifiers — offer a database its way to maintain uniqueness but consider what we might be able to do to support authors if ISNI were consistently available.
- Dates — born, died, and flourished are supported because adding a “when” helps identify which John Smith is the author of this book. A publication date is when an ISBN was released not when the author was writing. If the author isn’t contemporary, their “when” is linked with their identity.
- Alternate names — the main record would match the presentation on the book but this section supports a pseudonym or real name (depending on what’s on the book), an authority name (for libraries!); transliteration and so on. This is a diversity consideration of increasing importance.
- Professional Affiliation — academic credentials are as unique as the author and primary for some types of publishing.
- Prizes — if you’re lucky enough to have an author who has lifetime awards you really should brag using the Contributor Prize section.
- Contributor Biographical Note — the single most important support for marketing your author. It’s best supplied as part of the author’s record (though you can supply a “book” author biography). Free text can capture the subtlety needed to sell a reader on this individual and particular author.

Data considerations and problems

Authors are important — their names don’t follow any single culture’s expectations. Use the tools provided by ONIX to “name the parts” correctly to allow both indexing and display to work.
- Warning: Supporting only Person Name and Person Name Inverted doesn’t support an individual's need to be appropriately presented in data.
Ordering is important — each contributor record should include a sequence number to ensure that the correct order of authors is maintained.
- Warning: ONIX 3.0’s XML Schema will invalidate files if there is duplication in Contributor Sequence Number. For the book record to support the contributor order, each sequence number has to be unique.
Contributor Role is important — the contributor's relationship to this book is described by role and different types of publishing prioritize different roles. A novelist has a different role than the creator of a graphic novel; academic publishing uses different criteria for role than trade books.
- Warning: The contributor listed as first in the ordering sequence is the book’s primary contributor regardless of the role listed. Retailers should NOT privilege certain role codes to order a book’s authorship but should rely on the Sequence Number supported by almost all senders of ONIX.
Consider supporting Contributor Statement — All of the above provides data receivers with granular data that can be indexed and searched for unique values. This is how you can also provide a single contributor statement using HTML formatting for display.
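
To make the ordering point concrete, here is a small sketch of how a receiver might order contributors and spot duplicated sequence numbers. It assumes the contributor data has already been extracted into dictionaries. The dictionary keys, the names, and the function names are hypothetical. The role values are only sample codes.

```python
from collections import Counter

# Invented contributors for one record; note the repeated sequence number 2.
contributors = [
    {"sequence": 2, "role": "A12", "name": "Illustrator Example"},
    {"sequence": 1, "role": "A01", "name": "Author Example"},
    {"sequence": 2, "role": "B06", "name": "Translator Example"},
]


def duplicate_sequence_numbers(contribs):
    """Return sequence numbers used by more than one contributor."""
    counts = Counter(c["sequence"] for c in contribs)
    return sorted(n for n, count in counts.items() if count > 1)


def in_display_order(contribs):
    """Order contributors by sequence number only, ignoring role."""
    return sorted(contribs, key=lambda c: c["sequence"])


print(duplicate_sequence_numbers(contributors))       # [2]
print(in_display_order(contributors)[0]["name"])      # Author Example
```

Sorting on the sequence number alone, with no special treatment for any role code, is the behaviour the warning above asks retailers to adopt.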

Product Title

What’s unique: Every book has an associated title, but it’s not necessarily a unique ordering of words (though hopefully the title plus author together are close to unique). The ONIX metadata standard provides a Title composite that can repeat, offering different options to meet the distinct needs of the supply chain when it uses title metadata.

ONIX Code List 15 Title Type offers nine different ways to provide a purpose-driven book title entry, as well as five other codes for specific needs like translations, former titles, and the naming of serials. Further, each Title composite contains a Title Element composite that repeats to support breaking a title out into its component parts. This option can, as an example, add clarity to an omnibus by providing each sub-book title as a fielded component. Another use is to allow a brand name, a part number, or a series name to be presented as ordered components within the main book title.

Title Statement — ONIX 3.0 supports, separate from and outside of the Product Title composites, an XHTML-enabled data element for a “display title”. You can provide a simple, unique HTML-formatted data point to support online display outside of the structured and indexed components used in the Title composite.

Data considerations and problems

Book Titles have multiple purposes and may need multiple entries to support different purposes.
- Book titles are important — The “01” Distinct Book title should reliably match the title of the published book. Libraries and other users need access to the title that actually matches the book. This would be the title that is displayed online, at least if the Title Statement (more below) is not available.
- Book titles are important for identification — Consider supporting an alternate title version such as Title Type “10” Distributor’s title – defined as “the title carried in a book distributor’s title file: frequently incomplete, and may include elements not properly part of the title.” This simple addition to your metadata would support sales reps and order forms (if limited to 30 characters or less) that need a title chosen to identify the product, not to match the published product.
  - Warning: The fielded data provided within a Title composite is there for indexing purposes and therefore cannot include HTML. ONIX supports special XHTML-enabled data elements, and the Title Statement is where a formatted title entry can be provided. Using HTML tags within the Title composite (or any element in ONIX not designated for XHTML use) violates the ONIX 3.0 XML schema and can cause delays in loading files.
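
A quick pre-flight check for markup in title text might look like the sketch below. It assumes the ONIX 3.0 reference tag TitleText and treats both embedded child elements and escaped tags as a problem. The regular expression is a rough heuristic of my own, not a full HTML detector, and the function name is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Rough heuristic: something that looks like an opening or closing tag.
TAG_PATTERN = re.compile(r"</?[A-Za-z][^>]*>")


def title_has_markup(title_text_element):
    """Return True if a TitleText element appears to carry HTML markup."""
    # Child elements mean markup was embedded directly in the XML.
    if len(title_text_element) > 0:
        return True
    # Escaped markup (&lt;i&gt;) arrives as a literal "<i>" in the text.
    return bool(TAG_PATTERN.search(title_text_element.text or ""))


clean = ET.fromstring("<TitleText>Moby Dick</TitleText>")
embedded = ET.fromstring("<TitleText><i>Moby</i> Dick</TitleText>")
escaped = ET.fromstring("<TitleText>&lt;i&gt;Moby&lt;/i&gt; Dick</TitleText>")

print(title_has_markup(clean), title_has_markup(embedded), title_has_markup(escaped))
```

Anything this flags is a candidate for moving the formatted version into the Title Statement and leaving plain text in the Title composite.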

Related Product

What’s unique: Related product entries are one of the most important ways to create direct code-driven links between a product record and other ISBNs. Often the same ISBN will have more than one relationship to the product record and what’s unique is the ability to handle that ISBN as a single Related Product entry supported by multiple Product Relation Codes (ProductRelationCode / x455).

Best practices — EDItEUR creates a “strict” XML schema to highlight best ONIX metadata practices. A best practice, by definition, implies that other practices exist, and a “best practice” may not be optimal for every business or supply chain segment. EDItEUR’s “strict” schema flags ONIX files that repeat an ISBN across Related Product composites, to highlight that repeating the Product Relation Code within one composite is an “ONIX best practice.” Want to know more? BookNet Canada can run strict schema validation; ask for more information.
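
If your system stores one relation per row, collapsing the rows by ISBN before export gets you to the best-practice shape. This is only a sketch under that assumption: the input is a list of (relation code, ISBN) pairs, the ISBNs are placeholders, the relation codes are sample values, and the function name is hypothetical.

```python
def merge_related_products(entries):
    """Collapse (relation_code, isbn) pairs into one entry per ISBN.

    Returns a dict mapping each ISBN to its relation codes, keeping the
    order in which ISBNs and codes first appear and dropping repeats.
    """
    merged = {}
    for code, isbn in entries:
        codes = merged.setdefault(isbn, [])
        if code not in codes:
            codes.append(code)
    return merged


# Placeholder ISBNs and sample relation codes, one relation per row.
entries = [
    ("06", "9780000000002"),
    ("13", "9780000000002"),
    ("03", "9780000000019"),
    ("06", "9780000000002"),
]

print(merge_related_products(entries))
# {'9780000000002': ['06', '13'], '9780000000019': ['03']}
```

Each key in the result would then become a single Related Product composite carrying all of its Product Relation Codes.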

Record Reference

What’s unique: An ONIX record typically represents a single ISBN. A single ISBN represents a specific saleable book product: a single format, something that can be ordered. The ONIX Record Reference element is another part of this chain: it is the unique identifier for the ONIX record that supports the unique ISBN that supports the unique saleable product. While the Record Reference identifier isn’t used extensively within regular metadata exchanges (the focus would be on the ISBN in the record), there are a number of special cases in metadata exchange where identifying the record instead of the ISBN can be important. Good ONIX depends on this being maintained as a unique identifier. The best source of information is EDItEUR’s ONIX Best Practice documentation, where this and other important record-level identification is covered in detail.

Warning: ONIX files will fail XML schema validation if the Record Reference is duplicated. In particular, database systems that support marketing materials that share the same ISBN (a.k.a. Web PDFs) sometimes distribute these metadata records (in error or because they are needed?), and do not maintain the uniqueness of the corresponding Record Reference. The resulting file failure can cause delays.
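
Duplicated Record References are easy to find before a file is sent. A minimal sketch, assuming ONIX 3.0 reference tag names and a message with no XML namespace or header block (a real file would have both); the reference values and the function name are invented:

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Hypothetical three-record message; the first and third share a reference.
SAMPLE = """
<ONIXMessage>
  <Product><RecordReference>com.example.0001</RecordReference></Product>
  <Product><RecordReference>com.example.0002</RecordReference></Product>
  <Product><RecordReference>com.example.0001</RecordReference></Product>
</ONIXMessage>
"""


def duplicate_record_references(message):
    """Return Record Reference values used by more than one Product."""
    refs = [
        product.findtext("RecordReference", default="").strip()
        for product in message.findall("Product")
    ]
    counts = Counter(refs)
    return sorted(ref for ref, n in counts.items() if n > 1)


print(duplicate_record_references(ET.fromstring(SAMPLE)))  # ['com.example.0001']
```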

Two more ONIX elements that can repeat

We’ve already noted Contributor Role and Product Relation Code as repeating data points, but there are a number of other ONIX elements that can repeat to cover special cases. Most are unlikely to see much use, such as the ability to repeat FromLanguage / x412 or ToLanguage / x413 to cover translations made from more than one language. EditionType / x419 is another, with much more potential, that is seldom used as a repeating data point in North America.

Here are two I recommend for use by data senders.

Product Form Detail

After supplying a single Product Form (ProductForm / b012), which is mandatory for all records, the ONIX standard allows for one or more Product Form Detail (ProductFormDetail / b333) codes to further describe the product.
- A BC (Paperback) might be described with multiple Product Form Details such as
  - B102 (trade paperback)
  - B221 (picture book)
  - B307 (reinforced binding)
  - B412 (flexibound)
  - B504 (with flaps)
Note that while there may be codes that could describe many different book formats (B221 picture book is one), in general there is a one-to-one correspondence between the first letter of the Product Form and the PF-Detail. For example: “B_” Product Forms generally match to “B###” PF-Details.
Warning: Duplication of Product Form Detail codes will cause the file to fail schema validation and cause delays.
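
Removing repeated Product Form Detail codes before export is a one-liner in most languages. A tiny sketch, assuming the codes for one record are held in a list (the function name is hypothetical, and the sample values use the upper-case form found in the published code list):

```python
def dedupe_preserving_order(codes):
    """Drop repeated codes while keeping the first occurrence of each."""
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(codes))


details = ["B102", "B504", "B102", "B307"]
print(dedupe_preserving_order(details))  # ['B102', 'B504', 'B307']
```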

Primary Content Type & Product Content Type

These elements are intended for use with digital products where it’s not obvious what the file contains. After supplying a single Primary Content Type (PrimaryContentType / x416), the ONIX standard allows for one or more Product Content Type (ProductContentType / b385) codes to further describe the product.
- Both values are taken from List 81 so a typical entry might be:
  - Primary Content Type code “10” Text (to show the ebook is a normal book of mostly text)
  - To show what else is found in the product, Product Content Type codes can be supplied as appropriate for things like
    - “19” Figures, diagrams, charts, graphs
    - “14” Extensive links to external content
Warning: This data point is associated with XML schema validation errors in ONIX 3.0. It’s likely due to bad conversion from ONIX 2.1 data (2.1 only supports a repeating Product Content Type without the Primary Content Type value). For whatever reason, ONIX 3.0 data often supplies the same code as both the Primary and a Product Content Type. That is an unnecessary duplication flagged by the ONIX 3.0 XML schema. Files with validation errors fail to process and that may cause data delays.
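
If the duplication comes from an ONIX 2.1 conversion, the fix can be applied at the point where the 3.0 values are built. Here is a sketch of one way to do it, assuming the first 2.1 content type has already been chosen as the primary. The function name is hypothetical.

```python
def extra_content_types(primary, product_content_types):
    """Return Product Content Type codes with the primary and repeats removed."""
    seen = {primary}
    extras = []
    for code in product_content_types:
        if code not in seen:
            seen.add(code)
            extras.append(code)
    return extras


# A converted list that repeats the primary "10" and one other code.
print(extra_content_types("10", ["10", "19", "14", "19"]))  # ['19', '14']
```

The primary code is sent once as Primary Content Type, and only the returned list is sent as Product Content Type.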

This topic can extend to almost every part of the ONIX standard, but hopefully you’re getting the point that something is unique and findable within every part of the ONIX record. If you’re ever looking at your ONIX and thinking “Isn’t this confusing! This isn’t unique,” that’s an excellent reason to ask questions. BookNet Canada is waiting!