Video: Maximizing the Value of Data

Categories: Big Data, Tools and Resources


Two looks at harnessing data, one by creating a new publication metric built around course syllabi and the other by making software a “first-class citizen” in the research community, were the focus of a panel held at the annual conference of the Association of Learned and Professional Society Publishers last month.

The video, “Insights from Teaching and Research: Maximizing the Value of Data,” is built around data but with “very practical things you can take away,” explains moderator Adrian Stanley, vice president of publisher business development with Digital Science.

The first of two speakers is Joe Karaganis, vice president of The American Assembly, a public policy institute at Columbia University. In the session, he describes The Open Syllabus Project (“Opening the curricular black box”), which he directs at the university. The academic data-mining project is analyzing more than a million college course syllabi and presenting its analysis as an open-source tool that will effectively answer “what’s taught at universities.”

That in turn “gives a more intuitive grasp of how often things are taught in connection with each other,” and offers a new publication metric reflecting what’s important for teaching, not necessarily for citation. One clear winning text: the classic volume The Elements of Style, which appears most often in the syllabi analyzed.

The second speaker is Ian Mulvany, head of product innovation at SAGE Publishing, who discusses maximizing the value of data through software publishing. The journey from collected data to usable data requires things like cleaning and analyzing, and all of those stops, and their many iterations, Mulvany notes, are software-based.

“What I want to convince you of is that we should be considering software as a first-class citizen,” he begins. If you have research data and want to maximize its value, he stresses, you must think about how to bring that data together with the software pipelines that created it, analyzed it and ultimately produced the results on which the research’s claims rest.
