At the core of what we do at scrazzl is data mining and document analysis. With more than 5 million full-text articles and more than 20 million abstracts, we have no shortage of raw data. Undoubtedly, analysis of content and content usage has the potential to provide insights of great benefit to the research community. Globally, efforts to develop alternative metrics (altmetrics) to supplement impact factors and citation counting are gathering pace and are being spearheaded by organisations like PLoS, CiteULike and Mendeley. In parallel, publisher-led initiatives like the Usage Factor project also promise to provide useful insights into article usage and quality.
It has been clear for some time that the traditional methods of establishing article quality and impact, namely peer review, journal impact factor (JIF) and citation count, are straining under the weight of new content being produced every year. The emergence of altmetrics as a supplementary measure of quality, taking account of storage, links, bookmarks and conversations, certainly has the potential to alleviate some of the problems caused by the continuing data tsunami. Collectively, journal- and article-level metrics are designed to help readers assess research quality at the level of the article.
Apart from the question “Is this article worth reading?”, researchers ask quite a number of other questions that could in the future benefit from improvements in the analysis of content and content usage. For example, DataCite is working to promote citation and attribution of primary datasets, thus making it easier for researchers to source good primary data. At scrazzl our focus is on the provision of qualitative metrics to support decision-making in experimental science and medicine. Our working hypothesis is that by collating data related to material usage in a large sample of articles, it is possible to derive qualitative measures and insights into the use and application of those materials. Through analysis of articles in combination with metrics related to article quality and researcher influence, we are working to help our users answer questions like:
- How good is this product? Should I use it or prescribe it? Recommendation engines like TripAdvisor have revolutionised how we make decisions about the hotels we book or the places we visit. However, the fragmented nature of the scientific supply market, coupled with the fact that scientists do not purchase like ordinary consumers (usually they buy through procurement systems), has meant that there is little by way of qualitative information relating to experimental materials. Knowing that other scientists have successfully used the tools and technologies that you are interested in can have a significant bearing on your decision to buy a specific product. In the competitive world of research, the cost of failed experiments is high. An initial focus of our article analysis work has been on extracting usage data for experimental materials. We render these derivative metrics through our Product Metrics Widget.

- What are the optimal conditions for the experiment that I plan to run? Gathering optimal working conditions for materials is slightly more technically challenging than gathering usage statistics on those same materials in articles. Take, for example, a specific antibody whose use has been reported more than 500 times in published research. As a researcher, you are interested in the range of dilutions that have been successful for this antibody. Providing this normally fragmented and disparate information in a structured and actionable form is a current R&D focus.
- Who are the researchers that have the experience or resources that I need? Sites like LinkedIn have been successful because they have built a community of skilled individuals and made it really easy to create a network within that community. In science there have been many efforts to create a “Facebook for science”, most of which have failed. Those that are succeeding, Mendeley and CiteULike for example, are doing so because their core value proposition has never been to create a Facebook for science. These platforms work to solve other issues, such as reference management and social bookmarking, and allow users to build a social network as a by-product. For scientists seeking to find specific material resources or technical experts, arguably the most complete source of information is the existing literature. At scrazzl, by mapping the associations between scientists and the experimental tools that they document in their research, we are making it easier for scientists to find the people and answers that they require.
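To make the three questions above a little more concrete, here is a minimal Python sketch of how per-material summaries might be collated from extracted article data. It is an illustration only, not a description of how scrazzl's pipeline works: the record fields (`article`, `authors`, `material`, `dilution`), the sample values and the `summarise` function are all hypothetical, and the dilution is assumed to be stored as a simple factor (500 meaning 1:500).

```python
from collections import defaultdict

# Hypothetical records, as might be extracted from the methods sections of
# articles. Field names and values are illustrative assumptions only.
records = [
    {"article": "A1", "authors": ["Lee", "Okoro"], "material": "antibody-X", "dilution": 500},
    {"article": "A2", "authors": ["Okoro"], "material": "antibody-X", "dilution": 1000},
    {"article": "A3", "authors": ["Silva"], "material": "antibody-X", "dilution": 2000},
    {"article": "A3", "authors": ["Silva"], "material": "reagent-Y", "dilution": None},
]


def summarise(records):
    """Collate usage count, reported dilution range and users per material."""
    articles = defaultdict(set)    # material -> articles reporting its use
    dilutions = defaultdict(list)  # material -> reported dilution factors
    users = defaultdict(set)       # material -> researchers who documented it

    for record in records:
        material = record["material"]
        articles[material].add(record["article"])
        users[material].update(record["authors"])
        if record["dilution"] is not None:
            dilutions[material].append(record["dilution"])

    summary = {}
    for material in articles:
        reported = dilutions[material]
        summary[material] = {
            "article_count": len(articles[material]),
            "dilution_range": (min(reported), max(reported)) if reported else None,
            "researchers": sorted(users[material]),
        }
    return summary


summary = summarise(records)
```

Here `article_count` speaks to the first question (has this product been used by others?), `dilution_range` to the second (under what conditions?) and `researchers` to the third (who has experience with it?). A real system would also need entity extraction and disambiguation, which this sketch leaves out entirely.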
Currently we are at the start of a transformation in how the quality and impact of research are assessed. As new standards of research qualification (usage factor, altmetrics) mature, it is likely that researchers will demand more tools that can unlock the collective insights housed within content and content usage to enable better decision-making. There will be a greater demand for standardised metrics of quality in many areas of scientific enquiry. For our part, we will be working hard to unlock the benefits of large-scale article analysis in all areas of experimental science.


