Limitations of the conversion

We could, in theory, add more passes through getElement to capture the publisher name, volume number, issue number, and so on, but the returns diminish and the risk of inaccurate results grows. For example, one number looks very much like another within a string, so a volume number is easily mistaken for an issue number or a page number.
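To illustrate the ambiguity, here is a minimal Python sketch that lists every run of digits in a reference. The reference text is invented for illustration, and the function name numeric_candidates is hypothetical; it is not part of our conversion.

```python
import re

# Invented reference text, for illustration only.
reference = "Author, A. (1999) 'Title of article', Journal Name, 12(3), 45-67."


def numeric_candidates(text):
    """Return each run of digits in the text together with its start offset."""
    return [(m.group(), m.start()) for m in re.finditer(r"\d+", text)]


# Keep just the digit strings to show how many candidates there are.
candidates = [value for value, _ in numeric_candidates(reference)]
print(candidates)
```

The sketch finds five candidates in a single short reference. Nothing in the digits themselves says which is the volume, the issue, or a page number; only punctuation and position hint at that, and those conventions vary between citation styles.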

Adding more elements to the citation doesn't necessarily improve the results from an OpenURL link either. We've noticed that including more parameters in an OpenURL can result in the link failing to match an item in a library catalogue. Unless each parameter matches a corresponding field in the catalogue metadata, the OpenURL will not return a result. The sweet spot for references to books seems to be surname of first author, title, and publication date.
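As a rough sketch of what such a three-field link might look like, the following Python builds an OpenURL 1.0 query in key/encoded-value form using the book metadata format. The resolver address, the function name build_book_openurl, and the sample values are assumptions made for illustration; our own links are generated elsewhere and may differ in detail.

```python
from urllib.parse import urlencode

# Hypothetical resolver address; a real link would point at a library's own resolver.
RESOLVER_BASE = "https://resolver.example.org/openurl"


def build_book_openurl(surname, title, date, base=RESOLVER_BASE):
    """Build an OpenURL 1.0 (KEV) link for a book from three fields only."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),
        ("rft.aulast", surname),  # surname of the first author
        ("rft.btitle", title),    # book title
        ("rft.date", date),       # publication date
    ]
    # Leave out empty values, so that a field we failed to capture
    # never becomes a parameter that the catalogue cannot match.
    query = urlencode([(key, value) for key, value in params if value])
    return f"{base}?{query}"


print(build_book_openurl("Darwin", "On the Origin of Species", "1859"))
```

Omitting empty parameters follows from the observation above: each extra parameter is one more condition that the catalogue record has to satisfy.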

We have so far chosen only to identify the first contributor to a referenced work, and only their surname. The format of the @author attribute of the <bibItem> element is surname, forenames. This doesn't always match the format in the text of the reference, which could be forenames surname. We have to weigh the value of identifying other contributors and their forenames against the effort of enhancing the conversion templates.
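The surname-only approach can be sketched in Python as follows. This is an illustration under stated assumptions, not our conversion code: it assumes the @author value holds a single name in the form surname, forenames, and the function names and the sample reference are hypothetical.

```python
def first_author_surname(author_attr):
    """Return the surname from an @author value of the form 'surname, forenames'."""
    surname, _, _forenames = author_attr.partition(",")
    return surname.strip()


def locate_surname(reference_text, author_attr):
    """Return (start, end) offsets of the surname in the reference text, or None."""
    surname = first_author_surname(author_attr)
    if not surname:
        return None
    start = reference_text.find(surname)
    if start == -1:
        return None
    return start, start + len(surname)


# Invented example: the attribute is inverted, the reference text is not.
text = "Jane Austen, Pride and Prejudice (London, 1813)"
span = locate_surname(text, "Austen, Jane")
print(span, text[span[0]:span[1]])
```

Searching for the surname alone works whichever way round the name appears in the text. Matching the forenames as well would mean handling initials, reordering, and abbreviation, which is where the additional effort lies.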

Limiting the conversion to just a few fields gives us a good chance of capturing enough information with enough accuracy to support OpenURL linking.

These few fields are not enough to support the Initiative for Open Citations (I4OC)[7] fully, since the markup is incomplete. I4OC is an initiative to promote the availability of citation data, which is facilitated by depositing bibliographic references as part of the metadata associated with a DOI. CrossRef[4] supports the deposit of unstructured references, which allows us to provide limited support for I4OC, but ideally we would provide fully structured references.

One benefit of the OUP data model that we lose in the conversion to BITS is the ability to handle partial references. An author referring repeatedly to the same work may use the word Ibid. to save space on subsequent references, or substitute a couple of em-dashes for the author name. Using a <bibItem> for these allows us to capture the full information even on a partial reference, but the BITS model doesn't let us do that. The situation is increasingly rare, however, as authors are discouraged from using these styles.