One such session was called “Everyone’s a Player: Creation of Standards in a Fast-Paced Shared World,” which discussed the work of NISO and the development of new standards and “best practices.” Marshall Breeding discussed the ongoing development of the Open Discovery Initiative (ODI), a project that seeks to identify the requirements of web-scale discovery tools, such as Summon. Breeding pointed out that it makes no sense for libraries to spend millions of dollars on subscriptions if nobody can find anything. So, in this context, it makes sense for libraries to spend tens of thousands on discovery tools. But since these tools are still so new, there are no standards for how they should function and operate with each other. ODI plans to develop a set of best practices for web-scale discovery tools, and is beginning this process by developing a standard vocabulary as well as a standard way to format and transfer data. The project is still in its earliest phases and will have its first work available for review this fall. Also at this session, Regina Reynolds from the Library of Congress discussed her work with the PIE-J initiative, which has developed a draft set of best practices that is ready for comment. PIE-J stands for the Presentation & Identification of E-Journals, and is a set of best practices that gives guidance to publishers on how to present title changes, issue numbering, dates, ISSN information, publishing statements, etc. on their e-journal websites. Currently, it’s pretty much the Wild West out there, with publishers following unique and puzzling practices. PIE-J hopes to help clean up the mess.
Another session that was quite useful was on “CONSER Serials RDA Workflow,” where Les Hawkins, Valerie Bross and Hien Nguyen from the Library of Congress discussed the development of RDA training materials at the Library of Congress, including CONSER serials cataloging materials and general RDA training materials from the PCC (Program for Cooperative Cataloging). I haven’t had a chance yet to root around on the Library of Congress website, but these materials are available for free, and include a multi-part course called “Essentials for Effective RDA Learning” that includes 27 hours (yikes!) of instruction on RDA, including a 9-hour training block on FRBR, a 3-hour block on the RDA Toolkit, and 15 hours on authority and description in RDA. This is for general cataloging, not specific to serials. Also, because LC is working to develop a replacement for the MARC formats, there is a visualization tool called RIMMF available at marcofquality.com that allows for creating visual representations of records and record relationships in a post-MARC record environment. It sounds promising, but I haven’t had a chance to play with it yet. Also, the CONSER training program, which focuses on serials cataloging, is developing a “bridge” training plan to transition serials catalogers from AACR2 to RDA, which will be available this fall.
Another interesting session I attended was “Automated Metadata Creation: Possibilities and Pitfalls” by Wilhelmina Randtke of Florida State University Law Research Center. She pointed out that computers like black and white decisions and are bad with discretion, while creating metadata is all about identifying and noting important information. Randtke said computers love keywords but are not good with “aboutness” or subjects. So, in her project, she tried to develop a method to use computers to generate metadata for graduate theses. Some of the computer talk got very technical and confusing for me, but her discussion of subject analysis was fascinating. Using certain computer programs for automated indexing, Randtke did a data scrape of the digitally encoded theses and identified recurring keywords. This keyword data was run through ontologies/thesauruses to identify more accurate subject headings, which were applied to the records. A person needs to select the appropriate ontology/thesaurus for the item(s) and review the results, but the basic subject analysis can be performed by the computer. Randtke found that the results were cheap and fast, but incomplete. She said, “It’s better than a shuffled pile of 30,000 pages. But, it’s not as good as an organized pile of 30,000 pages.” So her approach showed some promise, but it still needs some work.
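For anyone who wants to picture the general idea, here is a minimal sketch in Python of the two steps described above: count the recurring keywords in a text, then look them up in a thesaurus to suggest subject headings. This is not Randtke’s actual method or software. The stopword list, the toy thesaurus, the headings in it, and the function names are all assumptions I made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "of", "and", "a", "in", "to", "is", "for", "on", "that", "with", "by"}

# Toy thesaurus (an assumption, not a real vocabulary file):
# maps a keyword to a preferred subject heading.
TOY_THESAURUS = {
    "copyright": "Copyright",
    "patent": "Patent laws and legislation",
    "patents": "Patent laws and legislation",
    "privacy": "Privacy, Right of",
}

def recurring_keywords(text, min_count=2):
    """Return non-stopword terms that occur at least min_count times."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return {w: n for w, n in counts.items() if n >= min_count}

def suggest_subjects(text, thesaurus, min_count=2):
    """Map recurring keywords to headings; a person still reviews the result."""
    keywords = recurring_keywords(text, min_count)
    headings = {thesaurus[w] for w in keywords if w in thesaurus}
    return sorted(headings)

sample = (
    "Copyright law and fair use. This thesis examines copyright terms "
    "and how patents differ. Patents protect inventions; privacy is mentioned once."
)
print(suggest_subjects(sample, TOY_THESAURUS))
# prints ['Copyright', 'Patent laws and legislation']
```

Note that “privacy” appears only once in the sample, so it never reaches the thesaurus lookup. That is a small-scale version of the “cheap and fast, but incomplete” result, and it is why a person still has to pick the right thesaurus and review what comes out.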
Of course, there were a number of other interesting presentations, but I have to leave something for Chris and Derrik to write about. One idea that particularly struck me came from Rick Anderson during his thought-provoking all-conference vision session on the final day: “To bring simplicity to our patrons means taking on an enormous level of complexity for us.” That basic idea has been something of an obsession of mine for the last few months while wrestling with authority control and RDA and considering the semantic web. To make our materials easily discoverable by the non-expert (and even the expert) user, we have to make sure our data is rigorously structured, and that requires a lot of work. It’s almost as if there’s a certain quantity of work that has to be done to find stuff, and we either push it off onto the patron or take it on ourselves. I’m in favor of taking it on ourselves.
The slides for all of the conference presentations are available here: http://www.slideshare.net/NASIG/tag/nasig2012 for anyone who is interested. You do not need to be a member of NASIG to check them out.