Open Access: posting and reuse. [what is predatory?]

There has been some recent discussion about what it means to be a predatory journal, and who suffers as a result of these unscrupulous behaviors. And, while publishers bask in the safety of an illusory free market, Institutional Repositories are criticized for posting the manuscript version of articles with proper citation but without the specific publisher’s unique requirements, even after a 12-month embargo.

While trying to further the positive impact of spreading knowledge created at our higher education institutions, repository managers spend a great deal of time checking copyright, applying metadata, and adding publisher-approved statements to articles so readers are fully aware of the original place of publication. We do this because it makes articles more useful, but also because publishers demand it in their often lengthy, shifting, and unique sets of requirements for publishing in repositories.

This is partially why, this spring, when colleagues found their article “Building Library Community Through Social Media” indexed in Google Scholar but pointing to a copy in ProQuest behind a paywall, I felt a particular frustration.

The authors felt similarly frustrated and expressed it on Twitter. ProQuest replied on Twitter, stating that “ProQuest provides many types of content, including OA,” which would be acceptable if they left the content OPEN. They also stated that “Our goal is to simplify the workflow for our users, to avoid looking in multiple places for quality content.” That is interesting, because we found this link on the popular site Google Scholar, which does a pretty good job of simplifying the search workflow, and because the article is already easily found through its original open access publication in LITA.

Further, ProQuest asked that our author “Please DM your contact details if you’d like to talk with us.” Maybe they thought that making a public (Twitter) conversation private would be a nice analogy of their actions around the article.

ProQuest failed to link to the published version. Their citation:

Building Library Community Through Social Media Young, Scott W H; Rossmann, Doralyn. Information Technology and Libraries (Online)34.1 (Mar 2015): 20-37.

You will note that there is no DOI. And they are creating HTML versions of the ITAL articles which, if you manage to get behind the paywall, omit all non-textual material (images, charts, graphs, etc.). This is very troubling.

I helped my colleague write a response to ProQuest outlining our concerns; it is reproduced below.

My library colleagues, who are active advocates of open publication, are left frustrated with ProQuest. And while we remain hopeful that current (and future) open access library journals continue to provide options for open publication, that does not fix the issue at hand: ProQuest’s unethical indexing. ProQuest has not added a DOI to the citation in question or changed their practice in any visible way based on this interaction. If one definition of predatory is “seeking to exploit others”, then I would bet that ProQuest fits that pretty well.

NOTE: Some issues with the web indexing from the journal Information Technology and Libraries (ITAL) led to the ProQuest version being the first search result for a few weeks. [Of the 7 versions currently available through Google Scholar, 6 link to the ITAL page or directly to the .pdf of their article from the ITAL page.] Although ProQuest’s unethical linking practice is now muted by the availability of the open access versions of the paper, the exchange should be a call to arms for OA options to flood the market where there clearly is a need and for authors to speak up when their content is hidden behind a paywall for commercial gain [without their permission].

The email exchange:


Dear [ProQuest],

I write to you in response to the following exchange I had recently with the ProQuest Twitter account.

The following article is available through a fully Open Access publication, which means that it is freely available to anyone with internet access.

Subsequent to publication, this article has been indexed by ProQuest and is currently made available through ProQuest, though it is behind a paywall with standard options for access to subscription content. This is troubling, as the content is available through an Open Access journal that is not routinely indexed by ProQuest.

When ProQuest replied on Twitter that they provide access to content, including OA, it is as if you confuse OA with a topic, like Economics or Biofilm. Open Access is about the freely available use and reuse of knowledge, which ProQuest’s paywall actively denies.

This is, in part, the justification for the CC-BY 3.0 license attached to this publication, which allows for reuse contingent on attribution of authorship. Your use falls within this license mostly, although I would argue that the lack of a link back to the original posting is improper attribution: you have posted an incomplete citation with this article.

Reposting a freely-available article behind a paywall is poor practice. Your response on Twitter stated that “Our goal is to simplify the workflow for our users, to avoid looking in multiple places for quality content”, yet picking one article from a journal issue and posting it for subscription access does not seem to work in favor of that “simplification,” but rather adds to potential confusion for a researcher who finds yet another source for this article.

In light of this situation, I have a few questions: How did ProQuest come to index this particular article? Why does the citation on ProQuest’s preview page not include a link to the original posting? How does ProQuest justify reposting a freely-available article behind a paywall?

Please help me better understand this situation.

Thank you for your time addressing this matter. I look forward to your response.

The response from ProQuest:

Thank you for your patience and the time you’ve allowed me to ensure that I was providing you accurate details in my reply.

The journal Information Technology and Libraries has been indexed in Library and Information Science Abstracts (LISA) from ProQuest and its predecessors since 1968, when it appeared in print as the Journal of Library Automation. To make this information more apparent, we have asked our sister company Ulrich’s to update their coverage details, and will also request the journal’s publisher, the American Library Association, include this information on its publication site.

ProQuest includes many open access titles in our databases. Where the full text is made available inside the ProQuest platform, this is always through a formal license agreement with the publisher. In the case of Information Technology and Libraries, we have been licensing the full text from the American Library Association since 1987 and continued to license the journal after it moved to the new open access, online-only format in 2012. It’s an important title and widely used by researchers, who are accustomed to finding it within the ProQuest platform.

ProQuest is a key resource within the scholarly workflow and we have found that inclusion of high quality OA content in ProQuest boosts its dissemination and discovery by the academic community. Our goal is not to hide OA content behind paywalls, but to integrate it so that it’s discovered in context with other relevant scholarly content. We use end-users as our guides for decisions such as these, consulting usability studies that we conduct and also those from organizations focused on the research workflow.

We fully understand and appreciate your perspective on the matter of linking to the original version. It’s a thorny issue as there are no industry standards for citing articles appearing in open access journals, and the entire community is adapting to an OA landscape that is changing rapidly. Here at ProQuest, we are evaluating and testing models that work for publishers, authors, as well as libraries and their patrons. Please know the feasibility of linking to author versions is of prime concern.

I hope this answers your questions and concerns. I’m happy to discuss this further.


After reviewing the CrossRef and EZID fee structures with our Executive Team, we’ve decided to subscribe to EZID. A major factor in this choice is our recent commitment to minting DOIs for every record in ScholarWorks. With several thousand records now needing DOIs, we would have ended up paying a substantial amount with CrossRef’s per-DOI fee structure. EZID’s flat fee seemed like a better fit—simpler and more cost-effective for our needs. After exchanging a couple of emails and signing a service agreement, we are up and running. Our first DOIs will be minted this week!

DOIs and ARKs: What Are They, and Why Use Them?

This year, we at Publication and Data Services have started looking into adding unique identifiers to the content in our digital collections, including ScholarWorks. Almost as soon as scholarly content began to be published online, there arose the problem of “reference rot,” when links to online content no longer work. A recent study in PLOS ONE looked at millions of articles and found that one in five reference links were broken. (And just yesterday, an update was published on the Impact of Social Sciences blog.) A New York Times article from last year highlighted reference rot in Supreme Court cases. It’s not only inconvenient to click on a link that leads to a 404 error page, it threatens our scholarly legacy. To combat this problem, several persistent identifier formats have been developed, including the Handle System, Uniform Resource Identifier/Locator/Name (URI/URL/URN), Digital Object Identifier (DOI), Archival Resource Key (ARK), and Universally Unique Identifier (UUID). Academic journals and digital libraries now commonly use these persistent identifiers in order to make sure that their digital content is available into the future. (Just a note: we use the word “persistent” in order to hedge against the more forceful “permanent.” These identifiers are designed to help combat reference rot, but they are only as permanent as the institutions that mint and maintain them. If you’d like to learn more, here’s an interesting blog post discussing some of the nuances of DOIs.)

Right now, we use a hybrid of Handles and URIs in ScholarWorks. When we upload a record to the repository, it automatically gets a unique URI that contains a Handle (for example, the URI has Handle 1/3413). The idea is that this URI is a persistent link for citation purposes. But a couple of factors have gotten us thinking about alternatives to our system. First, while looking into how to make our DSpace repository better looking, we realized that it might help to switch from the XMLUI interface to the JSPUI interface of DSpace. We don’t have to get into the differences between these interfaces. But you can see in the example URI from ScholarWorks that “XMLUI” is actually part of our unique identifier. If we were to switch interfaces, all of our so-called “persistent” URIs would break.
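To illustrate why the interface name in the link matters, here is a minimal sketch. The host name and the exact path layout (an interface segment followed by "handle" and the Handle itself) are assumptions made up for this example, not our actual ScholarWorks URIs, and extract_handle is a hypothetical helper. The point is only that the Handle stays the same while the full URI changes.

```python
from urllib.parse import urlsplit


def extract_handle(url):
    """Return the 'prefix/suffix' that follows a 'handle' path segment, or None."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if "handle" in segments:
        i = segments.index("handle")
        # A Handle needs two more segments after 'handle': prefix and suffix.
        if len(segments) >= i + 3:
            return "/".join(segments[i + 1:i + 3])
    return None


# Hypothetical URIs: same record, before and after an interface switch.
old_uri = "https://repository.example.edu/xmlui/handle/1/3413"
new_uri = "https://repository.example.edu/jspui/handle/1/3413"

if __name__ == "__main__":
    print(old_uri == new_uri)                                   # the cited link changes
    print(extract_handle(old_uri) == extract_handle(new_uri))   # the Handle does not
```

Anyone who cited the first URI would hit a dead link after the switch, even though the Handle inside it never changed.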

Our second consideration is that Digital Object Identifiers (DOIs) are quickly becoming the standard for scholarly articles and data sets, and a few recent publications have shown DOIs to be robust persistent identifiers, especially for data. We’ve also seen some examples of ARKs being used for digital archival content, and right now our digital photos and documents don’t have persistent identifiers at all. We decided to look into assigning DOIs to our articles and data, and assigning ARKs to our digital collections.

The main difference between DOIs and ARKs is that DOIs are generated and managed by a few specific organizations, whereas ARKs can be generated and managed by any institution. The process of becoming a DOI-minting agency is expensive, and therefore DOIs are only offered by a couple of services. DOIs are used more often by publishers and online data providers, and the DOI agencies make most of the technical decisions surrounding DOI minting and metadata. On the other hand, it is free to procure a Name Assigning Authority Number (NAAN) in order to generate ARKs, and open source software can be used to mint ARKs and create associated metadata. ARKs tend to be used by cultural institutions, and each ARK-generating institution is free to define its own policies and services.
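For readers who haven’t seen these identifiers up close, here is a rough sketch of what each one looks like as a link. The DOI and ARK values are placeholders, not real records, and the function names are our own. The sketch assumes the public doi.org resolver for DOIs and the N2T resolver (n2t.net) for ARKs; an institution could just as well resolve ARKs from its own server.

```python
def doi_url(doi):
    """Build a resolver link for a DOI such as '10.5555/example'."""
    doi = doi.strip()
    # Every DOI starts with '10.' and has a prefix and suffix separated by '/'.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    return "https://doi.org/" + doi


def ark_url(naan, name, resolver="https://n2t.net"):
    """Build a resolver link for an ARK from its NAAN and assigned name."""
    return f"{resolver}/ark:/{naan}/{name}"


if __name__ == "__main__":
    # Placeholder identifiers for illustration only.
    print(doi_url("10.5555/example"))
    print(ark_url("12345", "x1"))
```

In both cases the identifier itself (the part after the resolver address) is what gets cited, and the resolver is responsible for sending readers to wherever the content currently lives.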

Right now there are two main minters of DOIs: the California Digital Library (CDL) EZID service and CrossRef. For a PhD-granting research institution like MSU, EZID’s annual subscription fee is $2500, with a million DOIs and unlimited ARKs included. CrossRef’s pricing is determined by publishing revenue; since we make less than $1 million per year from our publishing ventures, CrossRef would only cost us $275 per year, with an additional fee of $1.00 for each publication and $0.06 for each data set. Since, at this point, we don’t plan to mint many DOIs, it looks as though CrossRef might be the way to go. The one hitch in the plan is if we still want to assign ARKs to our digital archival collections. CrossRef doesn’t provide an ARK generating service, so we’d have to get a Name Assigning Authority Number from CDL and set up a system for creating our own.
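The comparison comes down to simple arithmetic, sketched below with the fees quoted in this post. The sketch assumes that CrossRef’s per-item fees are charged once, when the DOI is registered, and that we stay under EZID’s million-DOI allowance; the function names and the sample volumes are ours, for illustration only.

```python
# Fees as quoted in this post, stored in cents to avoid rounding surprises.
EZID_ANNUAL_FEE = 250_000          # $2500 flat
CROSSREF_ANNUAL_FEE = 27_500       # $275 membership
CROSSREF_PER_PUBLICATION = 100     # $1.00 per publication
CROSSREF_PER_DATASET = 6           # $0.06 per data set


def crossref_first_year_cost(publications, datasets=0):
    """First-year CrossRef cost in cents, assuming one-time per-item fees."""
    return (CROSSREF_ANNUAL_FEE
            + publications * CROSSREF_PER_PUBLICATION
            + datasets * CROSSREF_PER_DATASET)


def ezid_first_year_cost():
    """First-year EZID cost in cents (flat fee, independent of volume)."""
    return EZID_ANNUAL_FEE


def break_even_publications():
    """Publications (no data sets) at which CrossRef's cost reaches EZID's."""
    return (EZID_ANNUAL_FEE - CROSSREF_ANNUAL_FEE) // CROSSREF_PER_PUBLICATION


def dollars(cents):
    """Format a cent amount as a dollar string."""
    return f"${cents / 100:,.2f}"


if __name__ == "__main__":
    # Illustrative volumes only, not projections.
    for n in (100, 1000, 3000):
        print(n, dollars(crossref_first_year_cost(n)), dollars(ezid_first_year_cost()))
    print("break-even publications:", break_even_publications())
```

Under these assumptions the two services cost the same in the first year at 2,225 publications; below that volume CrossRef is cheaper, and above it EZID’s flat fee wins.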

Is this all too much trouble? Should we just pay the $2500 for EZID and call it good? The answer, of course, has to be determined by our administrators. I’m meeting with our Executive Team tomorrow, and I’ll post an update next week about the decision.

Welcome to the library’s Publication and Data Services blog!

(photo by Kelly Gorham)

We’re passionate about all the innovative research happening at MSU, from the pathogens Sheila Nielsen-Preiss’s team sent to the International Space Station for microgravity studies, to Sarah Vogt’s research on how heat changes the structure of Mozzarella and Cheddar cheeses.

One of our most exciting projects is ScholarWorks, MSU’s open access institutional repository. ScholarWorks aims to capture our university’s intellectual work, and it is a central point of discovery for accessing, collecting, sharing, preserving, and distributing knowledge to the MSU community and the world.

Here at the blog, we’ll be sharing about MSU research, scholarly communication, open access, and data. We’ll also be posting citations and links to articles published by MSU researchers. We’re constantly amazed by the breadth of scholarly inquiry happening on our campus.

We hope you’ll join the conversation!

Sara and Leila