Visualization & Photography


Provenance information describes how a specific item came into being. In visual data analysis, these items are either datasets or visualizations. As the reproducibility of research results gains importance, the availability of this information plays a more prominent role. Moreover, the increasing complexity of visual analysis projects leads to an even larger number of datasets and visualizations.

I used a revision control system to capture provenance information, which gives the user the chance to describe notable achievements or results. After the data repository has been analyzed, a graph structure representing the provenance of the included artifacts is created. Following the Open Provenance Model (OPM), this graph is acyclic and contains three basic types of nodes, connected by dependencies:

  • Artifact

    Immutable piece of state, which may have a physical embodiment in a physical object, or a digital representation in a computer system.

  • Process

    Action or series of actions performed on or caused by artifacts, and resulting in new artifacts.

  • Agent

    Contextual entity acting as a catalyst of a process, enabling, facilitating, controlling, affecting its execution.

OPM graph showing the directed structure of the three node types. Artifacts are drawn as circles, Processes as rectangles, and Agents as hexagons.
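The three node types and the acyclicity requirement can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the project: the names `NodeType`, `Node`, `ProvenanceGraph`, `add_dependency` and `is_acyclic`, as well as the example file names, are hypothetical. Edges point from effect to cause, as in OPM.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class NodeType(Enum):
    ARTIFACT = "artifact"
    PROCESS = "process"
    AGENT = "agent"


@dataclass(frozen=True)
class Node:
    name: str
    type: NodeType


class ProvenanceGraph:
    """Directed graph; an edge (effect -> cause) records one dependency."""

    def __init__(self):
        self.nodes = set()
        self.edges = defaultdict(set)  # node -> nodes it depends on

    def add_dependency(self, effect, cause):
        self.nodes.update((effect, cause))
        self.edges[effect].add(cause)

    def is_acyclic(self):
        # Depth-first search with three colours; reaching a grey node
        # again means we followed a back edge, i.e. found a cycle.
        WHITE, GREY, BLACK = 0, 1, 2
        colour = {n: WHITE for n in self.nodes}

        def visit(n):
            colour[n] = GREY
            for m in self.edges.get(n, ()):
                if colour[m] == GREY:
                    return False
                if colour[m] == WHITE and not visit(m):
                    return False
            colour[n] = BLACK
            return True

        return all(visit(n) for n in self.nodes if colour[n] == WHITE)


# Hypothetical example: an analyst renders a plot from a raw dataset.
raw = Node("raw_data.csv", NodeType.ARTIFACT)
render = Node("render", NodeType.PROCESS)
image = Node("plot.png", NodeType.ARTIFACT)
user = Node("analyst", NodeType.AGENT)

g = ProvenanceGraph()
g.add_dependency(render, raw)    # process used artifact
g.add_dependency(image, render)  # artifact was generated by process
g.add_dependency(render, user)   # process was controlled by agent
```

Storing each dependency on the effect side makes it cheap to walk backwards from a result to everything it was derived from, which is the typical provenance query.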

A first approach to visualizing the provenance information is to use standard graph drawing techniques. However, even for small projects, a direct visualization of the provenance graph quickly becomes too complex. To communicate the essential information, reduction and prioritization techniques are used to extract the most important artifacts and their relationships from the graph, resulting in a smaller graph or sub-graph.
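One possible reduction step is sketched below. It assumes that every node already carries a numeric priority and that nodes below a threshold are dropped while the dependencies passing through them are preserved. How priorities are actually computed is not described here, so the `priority` values, the `threshold` parameter and the function name `reduce_graph` are purely illustrative.

```python
def reduce_graph(edges, priority, threshold):
    """Keep nodes with priority >= threshold and bridge over dropped nodes.

    edges: dict mapping a node to the set of nodes it depends on.
    Returns a dict of the same shape restricted to the kept nodes.
    """
    keep = {n for n, p in priority.items() if p >= threshold}
    reduced = {n: set() for n in keep}
    for start in keep:
        stack = list(edges.get(start, ()))
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            if node in keep:
                # Stop at the nearest kept ancestor on this path.
                reduced[start].add(node)
            else:
                # Skip the dropped node and continue with its causes.
                stack.extend(edges.get(node, ()))
    return reduced


# Hypothetical chain: plot <- render <- filtered <- filter <- raw
edges = {
    "plot.png": {"render"},
    "render": {"filtered.csv"},
    "filtered.csv": {"filter"},
    "filter": {"raw.csv"},
}
priority = {"plot.png": 3, "render": 1, "filtered.csv": 2, "filter": 1, "raw.csv": 3}

reduced = reduce_graph(edges, priority, threshold=2)
```

In this example the two process nodes fall below the threshold, so the reduced graph links the three artifacts directly while keeping their order of derivation.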

Example of a provenance graph with prioritized nodes (different grey values).

Sub-graph showing the local context of a selected node (red).
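Extracting such a local context can be sketched as a breadth-first search around the selected node. The sketch assumes that "local context" means all nodes within a fixed number of hops, following dependencies in both directions. The `radius` parameter and the function name `local_context` are assumptions made for illustration.

```python
from collections import deque


def local_context(edges, selected, radius=1):
    """Return the sub-graph of all nodes within `radius` hops of `selected`.

    edges: dict mapping a node to the set of nodes it depends on.
    """
    # Treat edges as undirected so causes and effects both count as neighbours.
    neighbours = {}
    for src, dsts in edges.items():
        for dst in dsts:
            neighbours.setdefault(src, set()).add(dst)
            neighbours.setdefault(dst, set()).add(src)

    # Breadth-first search that stops expanding at the given radius.
    dist = {selected: 0}
    queue = deque([selected])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue
        for nxt in neighbours.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)

    # Keep only the original directed edges between the collected nodes.
    nodes = set(dist)
    return {n: edges.get(n, set()) & nodes for n in nodes}


# Hypothetical chain: plot <- render <- filtered <- filter <- raw
edges = {
    "plot.png": {"render"},
    "render": {"filtered.csv"},
    "filtered.csv": {"filter"},
    "filter": {"raw.csv"},
}

context = local_context(edges, "filtered.csv", radius=1)
```

With a radius of 1, the result contains the selected artifact, the process that generated it, and the process that used it.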
