[UPDATE: The most recent version of the ONIX codelists may be found here.]
First, an apology: BNC BiblioShare’s Quality Reports are designed with print books in mind, so producers of e-book data need to ignore some of the chaff in the report. I’m looking to improve this and BookNet is seeking input from e-book publishers and retailers.
That said, what are the general guidelines for producing “good” ONIX support for e-books? Quality data exchange requires two things: clarity about how Product Identifiers are being used and the use of the ONIX 3.0 standard. Given that we are lacking both of these things, the answer to how to make a “good” e-book is a bit ambiguous. Use the following as a starting point, ask questions of your trading partners, and—please!—comment on this post.
My recommendations for best practices in ONIX 2.1 for e-book data are:
Review and follow as much of the guideline for digital products in ONIX 3.0 as possible.
ONIX 2.1 can’t support much of it but you should be aware of what’s proposed in order to begin tracking it.
Make sure you’re using the current Codelists for ONIX 2.1. While support for e-books has shifted to ONIX 3.0, the ONIX 2.1 codelists aren’t moribund just yet—so use them fully whenever you can. A new ONIX Code List (Issue 12) is due out this week or next so look for announcements shortly.
Keep your e-book feed separate from your print book feed. Mixing them just invites problems, and the supply chain for each is pretty distinct anyway.
Use the product form code DG, for electronic book. In ONIX 2.1 this is the basic way a record declares that the product is an e-book.
Specify the rights and territorial availability. ONIX 2.1 supports a lot of potential detail and I’d strongly recommend as much clarity in your data as you can possibly muster with regard to this. E-tailers are using it. Ask questions of them.
Don’t include meaningless dimension measures for e-books, such as height, width or weight. They just invite confusion. Extent is the one possible exception, because consumers don’t really have a good measure for the relative size of an e-book. You should include file size, as it’s an important value for the supply chain, but it doesn’t mean much to consumers, and even if it did, images would skew it. Some vendors still use a page count, and it remains recommended. Use a reasonable estimate if you don’t have a real book to reference.
Use free-text description fields. When in doubt I like to recommend making use of any free-text description field available to describe in clear terms what you’re selling, in particular PR.10.4 Edition statement, or possibly PR.3.8 Product form description.
Use PR.10.1 Edition type code if appropriate. There’s a proposal for a new “enhanced” e-book code for editions with audio or video material, but it’s not available just yet.
Use PR.23, the Related Product composite. It allows you to define how your other Product EANs relate to the product you are describing. There are special codes for POD, e-book and print editions. This is a simple-to-use and effective way for the supply chain to know what you’re doing by being able to see all your products based on the same content in one spot. Its use is highly recommended. See the Product Identifier section below for some considerations.
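To make the basics above concrete, here is a minimal Python sketch that builds a partial ONIX 2.1 Product record using reference tag names. It is an illustration, not a complete or validated record: it omits title, contributor and supply detail. The helper names, the RecordReference scheme and both EANs are invented placeholders. The code values other than DG (NotificationType 03, ProductIDType 03, SalesRightsType 01, RelationCode 13) are my assumptions about the relevant lists, so check them against the current code lists before relying on them.

```python
import xml.etree.ElementTree as ET


def add(parent, tag, text=None):
    """Append a child element with optional text and return it."""
    child = ET.SubElement(parent, tag)
    if text is not None:
        child.text = text
    return child


def build_ebook_product(ebook_ean, print_ean, edition_statement, countries):
    """Build a partial ONIX 2.1 <Product> for an e-book (illustrative only)."""
    product = ET.Element("Product")
    # Hypothetical record reference scheme: any stable, unique string will do.
    add(product, "RecordReference", "example.com." + ebook_ean)
    add(product, "NotificationType", "03")  # assumed: confirmed on publication

    ident = add(product, "ProductIdentifier")
    add(ident, "ProductIDType", "03")  # assumed: GTIN-13 (EAN)
    add(ident, "IDValue", ebook_ean)

    add(product, "ProductForm", "DG")  # electronic book, as recommended above
    add(product, "EditionStatement", edition_statement)  # PR.10.4 free text

    rights = add(product, "SalesRights")
    add(rights, "SalesRightsType", "01")  # assumed: for sale, exclusive rights
    add(rights, "RightsCountry", " ".join(countries))  # space-separated codes

    related = add(product, "RelatedProduct")  # PR.23
    add(related, "RelationCode", "13")  # assumed: e-publication based on print
    related_id = add(related, "ProductIdentifier")
    add(related_id, "ProductIDType", "03")
    add(related_id, "IDValue", print_ean)
    return product


# Placeholder EANs with valid check digits, not real products.
product = build_ebook_product(
    "9780000000002", "9780000000019", "EPUB edition, text only", ["CA", "US"]
)
print(ET.tostring(product, encoding="unicode"))
```

The elements are appended in the order I understand the ONIX 2.1 record to expect them (identifiers, form, edition, sales rights, related products). If you validate against the DTD or schema, that is the first thing to confirm.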
All of the above should be easily accommodated in any ONIX feed from a print publisher; these are things you probably support now. But there are less typical ONIX 2.1 sections that can help you communicate more. Should you implement them or shift to ONIX 3.0? Well, maybe both… the industry needs to talk about this.
PR.4.1 Epublication type code. List 10 lets you specify EPUB (and an Adobe EPUB), PDF, Kindle etc.
PR.3.11 Product content type code. This one is important in ONIX 3.0 and supported in ONIX 2.1. It allows you to specify that the content is ‘text’, along with other similarly clear values.
PR.3.3-5 Product Form Feature. This one is supported in both ONIX 2.1 and 3.0 and includes ways to specify Operating System and system requirements, and it’s slated for additional code values as well. The less standard your e-book the more you may need these.
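As a rough sketch of how these three less typical sections might sit together in a record, the Python below adds a Product Form Feature, a Product content type code and an Epublication type code to a Product fragment. The specific code values (029 for EPUB in List 10, 10 for text in List 81, 07 for other system requirements in List 79) are my assumptions, so verify each against the current code lists. The function name and the requirements text are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed code values: verify each against the current ONIX code lists.
EPUB_TYPE_EPUB = "029"            # List 10, Epublication type (PR.4.1)
CONTENT_TYPE_TEXT = "10"          # List 81, Product content type (PR.3.11)
FEATURE_OTHER_SYSTEM_REQ = "07"   # List 79, Product form feature type


def build_format_details(system_requirements):
    """Return a partial <Product> carrying the e-book format elements."""
    product = ET.Element("Product")
    ET.SubElement(product, "ProductForm").text = "DG"

    # PR.3.3-5: a feature type plus a free-text description of the requirement.
    feature = ET.SubElement(product, "ProductFormFeature")
    ET.SubElement(feature, "ProductFormFeatureType").text = FEATURE_OTHER_SYSTEM_REQ
    ET.SubElement(feature, "ProductFormFeatureDescription").text = system_requirements

    ET.SubElement(product, "ProductContentType").text = CONTENT_TYPE_TEXT
    ET.SubElement(product, "EpubType").text = EPUB_TYPE_EPUB
    return product


fragment = build_format_details("Requires a reading system with EPUB support")
print(ET.tostring(fragment, encoding="unicode"))
```

Keeping the code values in named constants makes it easier to update them when a new code list issue comes out.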
Product Identifiers and ebooks: The only recommendation BookNet Canada can make at this point is that publishers should control their own product identifiers and make sure they are usable for ordering.
Unfortunately, that’s not exactly how the supply chain for e-books is working either. You’ve carefully assigned an EAN to your Kobo edition, your Kindle edition, and 5 other distinct DRM versions of the same EPUB source file. They are all effectively different books as defined above. Consider that you’ve listed them in your Related Product composite. Who’s using that information? Who is loading ALL of those EANs into their system so that they can make use of your metadata on your overall publishing program? BNC BiblioShare, Bowker and, I expect, Google would… but I’m not really sure who else. It’s not like Amazon wants Kobo’s metadata, is it? Are library wholesalers using it?
Can you even send an e-tailer an order listing just an EAN to get a product back?
There are committees talking about it and better guidelines may be available soon. In the meantime, BookNet Canada recommends that you try as much as possible to assign ISBNs in the same way as you do for print books. A different edition in a meaningfully different format should have its own ordering number. And e-tailers must respect the need, intent and simplicity of ISBNs by supporting them as well.
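One small, mechanical part of keeping identifiers usable for ordering is making sure every EAN or ISBN-13 in your feed has a correct check digit. The sketch below uses the standard EAN-13 weighting (alternating 1 and 3 across the first twelve digits). The function name is my own, and it says nothing about whether the number has actually been assigned to a product.

```python
def is_valid_ean13(value):
    """Return True if value is 13 digits with a correct EAN-13 check digit."""
    digits = value.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    # Weights alternate 1, 3, 1, 3, ... over the first twelve digits.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == int(digits[12])


print(is_valid_ean13("978-0-306-40615-7"))
print(is_valid_ean13("9780306406158"))
```

Running a check like this over both the main Product Identifier and every Related Product identifier catches keying errors before a trading partner does.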
A final note: the actual e-book contains its own metadata—not nearly as rich as ONIX or as thought out, but there nevertheless. Try to match your ONIX feed to the embedded metadata. It can’t possibly hurt the supply chain to have agreement in sources.
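If you want to check that agreement mechanically, something like the following Python sketch could compare a single ONIX 2.1 Product record with the metadata embedded in an EPUB. It assumes you have already extracted the OPF package document from the EPUB file, that the ONIX record uses reference tag names without a namespace, and that title and identifier are the only fields you care about. The function names are invented for illustration.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"  # Dublin Core namespace used in OPF


def normalize_id(value):
    """Strip a urn:isbn: prefix, hyphens and spaces so identifiers compare."""
    value = (value or "").strip()
    if value.lower().startswith("urn:isbn:"):
        value = value[len("urn:isbn:"):]
    return value.replace("-", "").replace(" ", "")


def onix_summary(product_xml):
    """Pull the title and the product's own identifiers from an ONIX record."""
    product = ET.fromstring(product_xml)
    ids = product.findall("ProductIdentifier/IDValue")  # skips RelatedProduct
    return {
        "title": (product.findtext("Title/TitleText") or "").strip(),
        "ids": {normalize_id(e.text) for e in ids},
    }


def opf_summary(opf_xml):
    """Pull the title and identifiers from an EPUB's OPF package document."""
    package = ET.fromstring(opf_xml)
    return {
        "title": (package.findtext(".//" + DC + "title") or "").strip(),
        "ids": {normalize_id(e.text) for e in package.iter(DC + "identifier")},
    }


def find_mismatches(product_xml, opf_xml):
    """Return the names of fields where ONIX and embedded metadata disagree."""
    onix, opf = onix_summary(product_xml), opf_summary(opf_xml)
    problems = []
    if onix["title"] != opf["title"]:
        problems.append("title")
    if not onix["ids"] & opf["ids"]:  # no identifier in common
        problems.append("identifier")
    return problems
```

The comparison is deliberately strict on titles and lenient on identifiers (one shared value is enough), since the embedded file may carry extra identifiers that never appear in your ONIX feed.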