The ubiquitous pile of Excel files serving as a research result “database” clearly demonstrates that current assay data management systems fail to offer the flexibility that scientific research requires. Both commercial and academic research have a great need for a new generation of assay data management software that not only manages data effectively, but does so without constraining research workflows and data analysis; otherwise, scientists will continue to be forced to rely on Excel for data management. This blog post discusses the main unmet needs and suggests a different approach to the implementation of Assay Data Management Systems.


Microsoft Excel, GraphPad Prism and R are very powerful and flexible tools for processing experiment results. However, they are wholly insufficient as data management systems. Assay data management systems exist, and they are quite good at the data management part, but their result processing is rudimentary and inflexible, often falling far short of the real-world requirements of scientific research. Thus, in many cases, scientists have a strong (and legitimate) preference for Excel, Prism or R. With Excel et al. being the only feasible approach to result processing, scientists face a dilemma: give up on effective data management, or go through a tedious, error-prone and largely manual data transfer between the result processing tools and the data management system.

Elements of an Assay Data Management System

Before addressing the improvements that are necessary, let’s have a look at the elements that go into assay data management:

  • Workflow- and order-management: Communication with the project or (internal) customer who requested the analysis. Coordination with upstream and downstream activities.
  • Sample logistics, including barcode support, storage management, and transfer of sample information (e.g. concentration, buffer composition etc).
  • Experiment setup, often something that can be greatly enhanced for the end user with custom functionality in the IT system.
  • Data capture, including proper linking to the experiment setup; often requiring customisation to handle different file formats and data logistics.
  • Data processing, turning instrument output into useful results, including curve fitting and other statistical analysis, maybe in several stages done by different software.
  • Result storage in a database, with support for the very diverse nature of results, as described in last week's post.
  • Reporting results back to the customer/project.
  • Aggregated analysis of results from many experiments, in combination with data from other systems, maybe long after the experiment was performed when the scientist in charge is no longer with the organisation.

Improvements

The main areas where a next generation assay data management system must improve over the current generation are:

  • Easily let scientists choose different statistics/analysis software for processing experimental results, e.g. Excel, Prism, R, or custom built analysis software.
  • Give IT-savvy scientists and in-house research informatics staff much stronger support for customizing the system in an agile fashion, so that it keeps up with quickly changing research workflows. Customisation pays off in greater convenience (and hence adoption by end users), time saved for scientists, and higher data quality.
  • Reduce the barrier to entry: long implementation times, the risk of software lock-in and a steep learning curve make system adoption a big decision. It should be easy to start using a system and equally easy to leave it if it doesn’t work out. Microsoft Excel is the gold standard for a low barrier to entry, and any software system should be measured against Excel for ease of adoption.
  • System components must be replaceable by alternatives the customer already uses, and each component must also work on its own as a building block in a larger data management infrastructure. This enables much smoother integration than is possible today.

A Simple Example of Integrating Excel-Based Result Processing

Support for different result analysis tools, including Excel and Prism, is a key element in making a flexible system. This screenshot from Scirex illustrates one way to integrate Excel processing into an assay workflow:

The “Download Result Template” button auto-generates an Excel file containing relevant information from the experiment setup, including the laboratory step ID and sample IDs. This enables tracking of results against specific samples. The creation of this Excel file is also one of the major customization points, since the file can be tailored to each type of assay or workflow and thus support laboratory execution as well as assay data capture and processing.

Within the template, the scientist is free to use all the features of Excel for data analysis, as long as the final results are placed in the right location in the template so that the system can extract them.

After data processing, the result file is dropped on the “Drop Result File Here” area. The system reads the results from the file and links them to the input samples and the relevant workflow step.
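To make the round trip concrete, here is a minimal sketch of the idea in Python. It is not how Scirex implements it: it uses a CSV file as a stand-in for the Excel workbook so that it runs with the standard library only, and the function names, the file layout and the IDs (`STEP-001`, `S1`, `S2`) are all assumptions made up for illustration.

```python
import csv
import io


def build_result_template(step_id, sample_ids):
    """Return template text holding the step ID, the sample IDs and an empty result column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["step_id", step_id])        # ties the file to one workflow step
    writer.writerow(["sample_id", "result"])     # fixed location of the result table
    for sample_id in sample_ids:
        writer.writerow([sample_id, ""])         # the scientist fills in the result
    return buffer.getvalue()


def read_result_file(text, expected_step_id):
    """Extract {sample_id: result} from a filled-in template."""
    rows = list(csv.reader(io.StringIO(text)))
    if len(rows) < 2 or rows[0][:2] != ["step_id", expected_step_id]:
        raise ValueError("result file does not belong to this workflow step")
    if rows[1][:2] != ["sample_id", "result"]:
        raise ValueError("result table header not found in the expected location")
    results = {}
    for row in rows[2:]:
        if len(row) < 2 or not row[0]:
            continue                             # skip blank lines
        if row[1] == "":
            raise ValueError(f"missing result for sample {row[0]}")
        results[row[0]] = float(row[1])
    return results


if __name__ == "__main__":
    template = build_result_template("STEP-001", ["S1", "S2"])
    # Simulate the scientist filling in the final results.
    filled = template.replace("S1,", "S1,0.42").replace("S2,", "S2,1.7")
    print(read_result_file(filled, "STEP-001"))
```

The point of the design is that the system only cares about two things: the step ID that identifies the file, and the fixed location of the final results. Everything the scientist does in between is left unconstrained.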

With this approach, the scientist has full freedom and flexibility in data processing and analysis within Excel, and because the results are entered into the database, they are available for sharing with colleagues and for aggregated analysis.

Prism Wrapper

Another popular data analysis tool is GraphPad Prism, and support for it is very important as well. Prism has scripting (“macro”) support that lets an external system execute Prism analyses based on a template that the user has created. The scientist can create a Prism template with custom analyses, and a “Prism adapter” in the assay data management system can then automatically process experimental results using that template and collect the results into a database.
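The general shape of such an adapter can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the actual call into Prism is deliberately left out and replaced by an injected `runner` function, and the class name, table layout, file names and result values are all hypothetical.

```python
import sqlite3
from typing import Callable, Dict

# Hypothetical signature: (template_path, raw_data_path) -> {sample_id: value}.
# A real runner would drive the analysis tool; that part is not shown here.
AnalysisRunner = Callable[[str, str], Dict[str, float]]


class AnalysisAdapter:
    """Runs an external analysis via `runner` and stores the results."""

    def __init__(self, runner: AnalysisRunner, db: sqlite3.Connection):
        self.runner = runner
        self.db = db
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS result ("
            "step_id TEXT, sample_id TEXT, value REAL, "
            "PRIMARY KEY (step_id, sample_id))"
        )

    def process(self, step_id: str, template_path: str, data_path: str) -> int:
        """Run the analysis, store one row per sample, return the row count."""
        results = self.runner(template_path, data_path)
        with self.db:  # commit on success, roll back on error
            self.db.executemany(
                "INSERT OR REPLACE INTO result VALUES (?, ?, ?)",
                [(step_id, sid, val) for sid, val in results.items()],
            )
        return len(results)


def fake_runner(template_path: str, data_path: str) -> Dict[str, float]:
    # Stand-in for a real tool invocation; returns made-up values.
    return {"S1": 12.5, "S2": 48.0}


db = sqlite3.connect(":memory:")
adapter = AnalysisAdapter(fake_runner, db)
stored = adapter.process("STEP-001", "dose_response.pzfx", "plate_01.csv")
```

Because the runner is injected, the same adapter could wrap Prism, R or a custom script without changes to the storage side, which is the kind of replaceable component argued for above.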

Conclusion

Choice of analysis tool, customisation, a low barrier to entry, and integration features are the key drivers towards more effective assay data management. Future posts will go into more detail about the Excel Adapter and the Prism Adapter. This blog will also chronicle the development of assay data management functionality in Scirex according to the general approach outlined in this post.