There is some writing from my previous degrees with which I am sufficiently happy that I might share it in a From the Personal Archives series any month I don’t run the Revisiting series. This one comes from the same class as my essay on the universal library’s myriad problems, Dr. Richard Arias-Hernandez’s Fall 2014 course on Digital Libraries at the University of British Columbia iSchool. This time, I was to examine a digital library’s workflow and metadata standards, and I chose the Marxists Internet Archive as an exercise in connecting library practice with ideological and institutional constraints.
A Library without Librarians?: The Marxists Internet Archive’s Policies and Standards
The Marxists Internet Archive collects texts, and fragments of texts, from writers who have had some impact on Marxist, communist, socialist, and allied movements. Most of these texts are simple HTML documents viewed directly in a web browser; a number of them are available for download in other formats, most commonly PDF but also PRC, MOBI, EPUB, and ODT. Most, but not all, documents have a set of metadata, which the Archive presents in a standard way but which does not always contain the same fields: fields might include when the text was written, when it was first published, the source, who transcribed the text, who proofread the text, who applied HTML markup, and so on. Although the scope appears to be fairly well-defined as original texts or text fragments by Marxist, communist, socialist, or anarchist thinkers, a few scientific and feminist documents are also in the collection with little or no substantive explanation.
The texts cannot be searched through a catalogue; indeed, every text sits on its own static webpage, and links to each page are gathered under pages dedicated to a particular author (or, occasionally, topic).1 These author pages are accessible through a “Select Author” drop-down menu on the homepage or through one of several indexes. There are a few different indexes (“Selected Marxists,” “Library,” “History,” and “Subjects”), and it is not always clear how the index creators decided what to include in each index or what distinctions they intended each index to capture. Further, the “Library” index also has smaller indexes nested within it, and it uses different sizes and colours for different links, either inconsistently or according to a standard that is not readily apparent. The Archive does have an embedded Google Site Search function, but it is only as effective as any plug-in designed for another context can be. In general, while the Marxists Internet Archive has a wide array of texts, its organization either is inconsistent or adheres to a standard which is clear only to its creators.
The Marxists Internet Archive is run by volunteers. Two kinds of volunteer are involved: permanent volunteers sit on the Steering Committee and make largely autonomous decisions based on their charter and bylaws, and occasional volunteers do much of the digitization, transcription, translation, and proofreading work (“Who We Are”). The bulk of the Steering Committee are not academics (ibid.) and, judging by the state of the Archive, not many of them appear to be information professionals either. There is no “centralized planning,” and volunteers do whatever tasks they wish to do (ibid.), although the “Volunteer Workshop” page encourages volunteers to consult the Steering Committee before beginning, in order to check that no one else is already engaged in the same project. The website offers guidelines for creating HTML documents and downloadable eBook formats; the instructions for the latter are particularly extensive. For instance, the instructions ask volunteers to create eBooks only of texts exceeding 200,000 words and note that the resulting files ought to be between 1.5 and 10 MB (“eBook Production Guidelines”). Further, fairly complete instructions are given regarding associated metadata and file formats (ibid.), though these instructions would not meet professional standards.
Although the words “standard” and “policy” are not used explicitly, the volunteer workshop pages nonetheless describe both a standard for object and metadata creation and an ad hoc workflow. The Steering Committee’s charter and bylaws could also be considered a policy of sorts, as they determine the scope of the collection and the roles the Archive’s volunteers will play. However, the volunteers do not appear to follow these standards consistently. Indeed, the reliance on volunteer labour and the Steering Committee’s reluctance to use centralized planning appear to be the Archive’s largest barriers to applying the consistent standards expected of a professional digital library. One can only speculate whether the Steering Committee’s reluctance is practical, ideological, or some of each, but it has resulted in an Archive which espouses and encourages specific standards without meeting them consistently. Other challenges might include the lack of a larger institution’s support in the form of financial resources, software and catalogue architecture, and expertise. Dedicated individuals will almost certainly continue to build digital libraries outside of the information professional community and outside of institutional support structures. The question, then, is whether such groups can achieve what that community expects of a digital library.
Marxists Internet Archive Admin Committee. “eBook Production Guidelines.” Marxists Internet Archive. n.d. Web. Accessed 8 October 2014. <https://www.marxists.org/admin/volunteers/index.htm>
Marxists Internet Archive Admin Committee. “Who We Are.” Marxists Internet Archive. 2 January 2010. Web. Accessed 8 October 2014. <https://www.marxists.org/admin/volunteers/index.htm>
Marxists Internet Archive Admin Committee. “Volunteer Workshop.” Marxists Internet Archive. n.d. Web. Accessed 8 October 2014. <https://www.marxists.org/admin/volunteers/index.htm>
1. According to certain definitions, the Marxists Internet Archive is therefore not a digital library, because digital libraries use catalogues, not lists of links to fixed pages. Perhaps this is true, though if it reliably offered PDF versions of its texts it might count. But outside of narrow definitional concerns, the Marxists Internet Archive offers an interesting opportunity to look at the particular challenges involved in creating a digital library that relies on volunteer labour, whether because of a lack of institutional backing or because of ideological motivations.