Informatics has changed the way lab data is collected, stored, shared and analyzed. However, it is widely agreed that the full potential of laboratory data, including metadata, aggregated data and even the experimental results themselves, has yet to be realized, particularly in applying these types of data to scientific challenges such as productivity, reproducibility and large-scale integration. By reimagining the traditional paradigms for lab data, companies are using novel approaches to increase the value of laboratory data, the experimental process as a whole and scientific discovery in general.


BioBright uses hardware and software to record lab data easily and integrate it for analysis. The company focuses on biological applications, especially the pharmaceutical industry’s needs from discovery through clinical testing and manufacturing. Partners include the Sanger Institute, the US Department of Defense’s DARPA (Defense Advanced Research Projects Agency) and major pharmaceutical companies.

Tools for Insight

BioBright’s product, Darwin, consists of three main tools. “One is a Dropbox-like tool that automatically collects data from equipment, computers and other infrastructure that is deployed in typical pharmaceutical and research workflows,” explained Charles Fracchia, founder and CEO of BioBright. “The second is a voice assistant, very similar to Siri or Alexa, but the difference is we have the knowledge of all the custom vocabulary that goes on in the laboratory.” The third, Darwin Terminal, is a touchscreen dashboard for data visualization and analysis.

As Mr. Fracchia told IBO, “One of the big tensions is that the current paradigm for doing work in laboratory science requires a lot of manual labor and a lot of physical dexterity. The issue is doing the experiment and documenting the experiment are diametrically opposed in terms of activities.” The result is lost insight and experimental knowledge.

“What happens today is that scientists are doing an experiment and they try to remember all the minutiae and variation that may have happened, and only write them down in their lab notebook later on,” he said.

But Darwin Speech, the voice assistant, enables the data to be collected in real time with context. “For example, one can make a note by just saying, ‘Darwin, sample 3 looks cloudy,’ or ‘Darwin, I think the yield on this step is going to be low,’” explained Mr. Fracchia. “We know who said it, we know at what time, in what context, which instruments were used, which samples were being used, and all that information is basically available to the user after the fact.”
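To make the idea of a note captured "with context" concrete, here is a minimal sketch of what such a record might look like. It is not BioBright's implementation or API: the `ContextualNote` class, the `capture_note` function, the field names and the example identifiers are all hypothetical, chosen only to mirror the who, when and what that Mr. Fracchia lists.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class ContextualNote:
    """Hypothetical record: one spoken note plus its surrounding context."""
    text: str
    speaker: str
    recorded_at: datetime
    instruments: List[str] = field(default_factory=list)
    samples: List[str] = field(default_factory=list)


WAKE_WORD = "darwin,"  # assumed wake word, taken from the quoted examples


def capture_note(utterance: str, speaker: str, instruments: List[str],
                 samples: List[str], now: Optional[datetime] = None) -> ContextualNote:
    """Strip the wake word and attach who/when/what context to the note."""
    text = utterance.strip()
    if text.lower().startswith(WAKE_WORD):
        text = text[len(WAKE_WORD):].strip()
    return ContextualNote(
        text=text,
        speaker=speaker,
        recorded_at=now or datetime.now(timezone.utc),
        instruments=list(instruments),
        samples=list(samples),
    )


# Illustrative values only; the identifiers below are made up.
note = capture_note(
    "Darwin, sample 3 looks cloudy",
    speaker="researcher_a",
    instruments=["plate_reader_1"],
    samples=["sample_3"],
)
print(note.text)  # sample 3 looks cloudy
```

Storing the context alongside the text, rather than in a separate notebook entry written later, is what would let the note be searched afterward by person, time, instrument or sample.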

Maximizing Available Information

Previously, much of this contextual data was rarely, if ever, collected systematically.

“There’s a lot of unrecorded information. There’s a lot of institutional information. There’s a lot of interaction information that is really crucial to how a company operates,” noted Mr. Fracchia.

Such information comes from both inside and outside a particular lab, such as from scientific presentations or even another lab in the same company that has completed the same experiment. “Our system, because it collects all the context from the data, from the presentation, from the equipment information, from the voice notes that people give, and also integrates with electronic lab notebooks, can get all the historical information.” But, as he emphasized, this solution does not require a change in workflow. “Our tools are designed to be completely transparent to the workflow, the way it is done today. We’re not asking anybody to really change their workflow.”

The collection, aggregation and integration of such data can provide new insights, such as the source of experimental error or discovery of best practices.

“That’s the reason we collect information that may look seemingly tangentially related, but [in one example] we were able to find distributions of dispensing operations, and find in this case that there was an unusual distribution due to a human factor,” noted Mr. Fracchia. “What we’ve found is most useful are metrics that help scientists create a baseline of parameters that they know or that they suspect are playing an important role. Our system helps them hone their skills, their intuition and their knowledge about a particular workflow—something that today a machine cannot do.” He emphasized that Darwin augments what scientists do, rather than replacing scientists with automation.
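As an illustration of how a baseline can expose an unusual distribution of dispensing operations, the sketch below compares each operator's mean dispensed volume with the pooled readings of all other operators. This is an assumed approach, not the analysis BioBright performed: the function name `flag_unusual_operators`, the log format, the threshold and the volumes are all hypothetical, and the numbers are invented for the example.

```python
from collections import defaultdict
from statistics import mean, stdev


def flag_unusual_operators(records, threshold=2.0):
    """Flag operators whose mean dispensed volume sits far from everyone else's.

    records: iterable of (operator, volume_ul) pairs from a hypothetical log.
    """
    by_operator = defaultdict(list)
    for operator, volume_ul in records:
        by_operator[operator].append(volume_ul)

    flagged = []
    for operator, volumes in by_operator.items():
        # Baseline: every dispense made by the other operators.
        others = [v for op, vols in by_operator.items() if op != operator for v in vols]
        if len(others) < 2:
            continue  # not enough data to estimate a spread
        spread = stdev(others)
        if spread == 0:
            continue  # identical baseline readings; no meaningful z-score
        z = (mean(volumes) - mean(others)) / spread
        if abs(z) > threshold:
            flagged.append(operator)
    return sorted(flagged)


# Invented dispense log (microliters), for illustration only.
records = [
    ("operator_a", 50.1), ("operator_a", 49.9), ("operator_a", 50.0),
    ("operator_b", 50.2), ("operator_b", 49.8), ("operator_b", 50.0),
    ("operator_c", 48.0), ("operator_c", 47.9), ("operator_c", 48.1),
]
print(flag_unusual_operators(records))  # ['operator_c']
```

A flag like this only points the scientist to where to look; deciding whether the cause is technique, equipment or something else remains a human judgment, in keeping with the augmentation theme of the interview.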

The range of data available allows key analyses and discoveries to be made. “So everywhere from log files of dispensers, to calibration information of a plate reader, to historical data for the same compounds, we bring in, so you can ask those questions and then display it on a ‘mission control’ view.” As he put it, “You cannot improve what you cannot measure.”

Adding Instrument Data

As a data integrator, the company is very different from an ELN provider, emphasized Mr. Fracchia. “We are the glue that collects all of the information and then makes it available through APIs to any other vendor that wants to integrate with it.” This includes instrument data. “In fact, we see our role as being really key in interacting with those existing established players to really augment everybody’s capabilities… We are there to provide this interoperability layer for everybody who wants it.”

Further Insights

Mr. Fracchia also spoke to the future potential of BioBright’s technology. “[O]ur goal is squarely aimed at providing information and insights that will augment the human scientist in that workflow. You design things very differently when you can do that,” he explained. “Now, all of a sudden, you can start asking questions like, ‘Darwin, show me all images with a particular cell line in them,’ or ‘Darwin, show me the distribution of the drugs to the organisms that we’ve tried, and put that together with the results.’” This influences how a scientist works. “Our goal if we are successful is that scientists, when they are doing their work, never leave the scientific plane of thinking.”


