By Bill Kasdorf, Vice President, Apex, and General Editor, The Columbia Guide to Digital Publishing
You may have missed one of the most intriguing new technologies to be shown at the recent SSP 30th Annual Meeting in Boston. Demonstrated in the session on accessibility opened by Rick Bowes, and part of the presentation by Bob Kelly and John Gardner on how the American Physical Society (APS) is making its content accessible to the print-disabled, it was the software at the core of ViewPlus Technologies’ IVEO system for making images—especially diagrams and graphs—accessible.
Before you say “That sounds nice, but it doesn’t apply to me,” let me correct that all-too-common assumption up front. What this technology does could be of enormous benefit to all publishers whose content includes technical graphics—and the users of that content: researchers, scholars, and librarians. That includes most SSP members.
Accessible technologies suffer from an “Isn’t that wonderful, glad I don’t have to bother about it” syndrome. They’re admired but easily dismissed. Everybody agrees that it’s important to make published content accessible to people who have problems with print—the blind, people with low vision, people with learning disabilities (dyslexia in varying degrees is surprisingly common), and so forth. But we tend to think it’s somebody else’s problem, mostly an issue limited to K-12 educational publishers, for whom there are laws that mandate making instructional materials accessible and available to all who need them.
We certainly don’t tend to think of physicists first. And we don’t imagine that we are anywhere close to making content as technical as physics research accessible. But that’s exactly what APS and ViewPlus Technologies are doing. The strides they’re making are turning out to have benefits for physicists of all sorts and, by extension, for any publishers and consumers of images.
When most content is published digitally, the images are often neglected while the rest of the content is made more and more sophisticated. In the early days, digitization meant just scanning print pages and converting them to bitmapped images; although that is still common, it’s viewed as primitive and unacceptable to most users. The text in those pages has progressively been made more sophisticated—and more valuable. First, it was OCR’d (Optical Character Recognition) so it was searchable; then it was cleaned up and tagged with structural information (typically now in XML, ideally born digital in an XML workflow), which made it more navigable; metadata were added, and that made it more discoverable; and today, increasingly, semantic information is added to make it more meaningful, more useful, more powerful.
But all this time, the poor images usually still sat there as bitmaps—typically TIFF or JPEG files—with little metadata, no structure, no semantics, no searchability. That’s what John Gardner, CTO and founder of ViewPlus, set out to fix. (Dr. Gardner is a renowned solid-state physicist who went blind at the age of 48.)
The ViewPlus technology creates image files that can be rendered in tactile form (via a tactile touchpad or printed out with an embossing printer) and also in audio (via text-to-speech technology). At the core of this technology is an image format known as SVG (Scalable Vector Graphics). Unlike bitmapped image formats such as TIFF and JPEG, SVG is an XML-based format that describes images as vector graphics (the same kind of graphics at the core of PostScript and PDF), with text and metadata carried in the XML.
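To make the contrast with bitmaps concrete, here is a minimal sketch in Python of a hand-written SVG graph and a few lines that pull the text out of it. The sample graphic, its labels, and the function name `extract_text` are invented for illustration; none of this is the ViewPlus software or its file layout. The only thing assumed is the standard SVG vocabulary (`title`, `desc`, `text`, `polyline`).

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical graph, written by hand for this example.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg" width="200" height="120">
  <title>Resistance versus temperature</title>
  <desc>Hypothetical line graph with one data series.</desc>
  <polyline points="10,100 60,80 110,50 160,20" fill="none" stroke="black"/>
  <text x="10" y="115">Temperature (K)</text>
  <text x="165" y="20">Sample A</text>
</svg>"""


def extract_text(svg_source):
    """Collect the title, description, and visible labels of an SVG document."""
    root = ET.fromstring(svg_source)
    found = {"title": None, "desc": None, "labels": []}
    for elem in root.iter():
        if elem.tag == f"{{{SVG_NS}}}title":
            found["title"] = elem.text
        elif elem.tag == f"{{{SVG_NS}}}desc":
            found["desc"] = elem.text
        elif elem.tag == f"{{{SVG_NS}}}text":
            # Labels are ordinary XML text, so no OCR is needed to read them.
            found["labels"].append(elem.text)
    return found


if __name__ == "__main__":
    info = extract_text(SAMPLE_SVG)
    print(info["title"])
    print(info["labels"])
```

Because the labels are plain XML text rather than pixels, the same few lines could feed a search index or a text-to-speech engine; a TIFF or JPEG of the same graph would need OCR first.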
The ViewPlus technology does what Dr. Gardner referred to as the “best possible conversion” from any given image format. When the images are already in vector form (for example, as EPS files within a PostScript or PDF file), it converts them to very accessible SVGs. When dealing with the more common TIFF or JPEG files, the software detects and OCRs the text (including labels within the graphics) and does the best it can with the rest of the image. Having done so, it enables the software, or the user, to infer further information about what the graphic shows. In fact, the next version of the ViewPlus software will enable authoring, so that users can add descriptive information to the SVG file.
The benefits to the print-disabled user are obvious. Dr. Gardner demonstrated how a graph that was otherwise inaccessible to a print-disabled user was made meaningful: He could feel the slopes of the various lines on the graph, and as he did so, the software read labels describing the lines, including the values of datapoints, as he touched them.
The most electrifying moment of Gardner’s presentation came when he pointed out how excited his sighted physicist colleagues were when they saw this demonstrated with a graphic. What the ViewPlus software had done with that image was nothing less than adding the semantics that takes it from being “dumb” to dynamic data. Imagine a whole collection of such images in which a researcher could use a computer to search for certain patterns, values, and features and do comparisons or calculations on them. This is Tim Berners-Lee’s vision of the Semantic Web: information that a computer can understand. Not just store, find, and deliver, but understand.
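As a rough illustration of what searching images for patterns could mean once a graph's line is stored as vector data, the sketch below reads the points of a `polyline` from an SVG and reports the slope of each segment. It is an assumption-laden example, not a description of how ViewPlus works: the sample graphic and the function names `polyline_points` and `segment_slopes` are hypothetical, the points are assumed to be written as space-separated `x,y` pairs, and the slopes are in SVG drawing units rather than the graph's real data values (mapping to those would need axis calibration, which the article does not describe).

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical graph with a single line, written by hand for this example.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg" width="120" height="120">
  <title>Hypothetical rising curve</title>
  <polyline points="0,100 50,75 100,25" fill="none" stroke="black"/>
</svg>"""


def polyline_points(svg_source):
    """Return the first top-level polyline's points as (x, y) float pairs."""
    root = ET.fromstring(svg_source)
    line = root.find(f"{{{SVG_NS}}}polyline")
    if line is None:
        return []
    points = []
    # Assumes "x,y x,y ..." formatting; SVG permits other separators too.
    for pair in line.get("points", "").split():
        x, y = pair.split(",")
        points.append((float(x), float(y)))
    return points


def segment_slopes(points):
    """Slope of each segment, with y flipped so that 'up' is positive."""
    slopes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x1 == x0:
            slopes.append(None)  # vertical segment: slope undefined
        else:
            # SVG's y axis points down, so subtract in reverse order.
            slopes.append((y0 - y1) / (x1 - x0))
    return slopes


if __name__ == "__main__":
    print(segment_slopes(polyline_points(SAMPLE_SVG)))
```

Run over a collection of such files, a test as simple as "every slope is positive" would pick out the rising curves, which is the kind of query a bitmap cannot answer without image analysis.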
That was what got the physicists geeked. But there are benefits to publishers as well. Once an image is in SVG form, it is much easier to correct the problems that plague author-submitted graphics: wrong fonts, type too small, lines too fine, unusable colors, and so forth. SVGs are designed to render beautifully at all sizes (that’s what “scalable” means) and in all media. SVG is a fundamental graphic format for the new EPUB standard for e-books, for example, and it is used on cell phones to render graphics. Plus, because they contain XML text and metadata, SVG graphics can be much better integrated with the text they accompany, and their publishers can mine them for further value. In fact, Gardner found that many of his sighted colleagues preferred to have their computers “read” them information from the images in a paper so they could look at something else at the same time, perhaps an instrument in an experiment they were conducting.
ViewPlus is a great example of a technology that originated to benefit the print-disabled but promises great benefit for all of us, sighted or not.
Bill Kasdorf is vice president of Apex Content Solutions, a leading supplier of business services, data conversion, editorial, production, and support services to publishers and other organizations worldwide. A past SSP president, Bill has led seminars and spoken widely for publishing industry organizations such as SSP, the Association of American Publishers, the Association of American University Presses, Seybold Seminars, the Council of Science Editors, the Association of Learned and Professional Society Publishers, the Library of Congress, and the International Association of Scientific, Technical, and Medical Publishers.
Article compliments of the Society for Scholarly Publishing