Hardware-accelerated Interactive Data Visualization for Neuroscience in Python


ORIGINAL RESEARCH ARTICLE
published: 19 December 2013

Hardware-accelerated interactive data visualization for neuroscience in Python

Cyrille Rossant* and Kenneth D. Harris
Cortical Processing Laboratory, University College London, London, UK

Edited by: Fernando Perez, University of California at Berkeley, USA
Reviewed by: Werner Van Geit, École Polytechnique Fédérale de Lausanne, Switzerland; Michael G. Droettboom, Space Telescope Science Institute, USA
*Correspondence: Cyrille Rossant, Cortical Processing Laboratory, University College London, Rockefeller Building, 21 University Street, London WC1E 6DE, UK. e-mail: cyrille.rossant@gmail.com

Large datasets are becoming more and more common in science, particularly in neuroscience, where experimental techniques are rapidly evolving. Obtaining interpretable results from raw data can sometimes be done automatically; however, there are numerous situations where there is a need, at all processing stages, to visualize the data in an interactive way. This enables the scientist to gain intuition, discover unexpected patterns, and find guidance about subsequent analysis steps. Existing visualization tools mostly focus on static publication-quality figures and do not support interactive visualization of large datasets. While working on Python software for visualization of neurophysiological data, we developed techniques to leverage the computational power of modern graphics cards for high-performance interactive data visualization. We were able to achieve very high performance despite the interpreted and dynamic nature of Python, by using state-of-the-art, fast libraries such as NumPy, PyOpenGL, and PyTables. We present applications of these methods to visualization of neurophysiological data. We believe our tools will be useful in a broad range of domains, in neuroscience and beyond, where there is an increasing need for scalable and fast interactive visualization.

Keywords: data visualization, graphics card, OpenGL, Python, electrophysiology
1. INTRODUCTION

In many scientific fields, the amount of data generated by modern experiments is growing at an increasing pace. Notable data-driven neuroscientific areas and technologies include brain imaging (Basser et al., 1994; Huettel et al., 2004), scanning electron microscopy (Denk and Horstmann, 2004; Horstmann et al., 2012), next-generation DNA sequencing (Shendure and Ji, 2008), and high-channel-count electrophysiology (Buzsáki, 2004), amongst others. This trend is confirmed by ongoing large-scale projects such as the Human Connectome Project (Van Essen et al., 2012), the Allen Human Brain Atlas (Shen et al., 2012), the Human Brain Project (Markram, 2012), and the BRAIN Initiative (Insel et al., 2013), whose specific aims entail generating massive amounts of data. Getting the data, while technically highly challenging, is only the first step in the scientific process. For useful information to be inferred, effective data analysis and visualization are necessary.

It is often extremely useful to visualize raw data right after they have been obtained, as this allows scientists to make intuitive inferences about the data or find unexpected patterns. Yet, most existing visualization tools (such as matplotlib,1 Chaco,2 PyQwt,3 and Bokeh,4 to name only a few Python libraries) are either focused on statistical quantities, or do not scale well to very large datasets (i.e., those containing more than one million points). With the increasing amount of scientific data comes an ever more pressing need for scalable and fast visualization tools.

The Python scientific ecosystem is highly popular in science (Oliphant, 2007), notably in neuroscience (Koetter et al., 2008), as it is a solid and open scientific computing and visualization framework. In particular, matplotlib is a rich, flexible, and highly powerful library for scientific visualization (Hunter, 2007). However, it does not scale well to very large datasets.

1 http://matplotlib.org/
The same limitation applies to most existing visualization libraries. One of the main reasons behind these limitations stems from the fact that these tools are traditionally written for central processing units (CPUs). All modern computers include a dedicated electronic circuit for graphics called a graphics processing unit (GPU) (Owens et al., 2008). GPUs are routinely used in video games and 3D modeling, but rarely in traditional scientific visualization applications (except in domains involving 3D models). Yet, not only are GPUs far more powerful than CPUs in terms of computational performance, but they are also specifically designed for real-time visualization applications.

In this paper, we describe how to use OpenGL (Woo et al., 1999), an open standard for hardware-accelerated interactive graphics, for scientific visualization in Python, and note the role of the programmable pipeline and shaders for this purpose. We also give some techniques which allow very high performance despite the interpreted nature of Python. Finally, we present an experimental open-source Python toolkit for interactive visualization, which we name Galry, and we give examples of its applications in visualizing neurophysiological data.

2. MATERIALS AND METHODS

In this section, we describe techniques for creating hardware-accelerated interactive data visualization applications in Python

2 http://code.enthought.com/projects/chaco/
3 http://pyqwt.sourceforge.net/
4 https://github.com/ContinuumIO/Bokeh

Frontiers in Neuroinformatics | www.frontiersin.org | December 2013 | Volume 7 | Article 36 | 1

and OpenGL. We give a brief high-level overview of the OpenGL pipeline before describing how programmable shaders, originally designed for custom 3D rendering effects, can be highly advantageous for data visualization (Bailey, 2009). Finally, we apply these techniques to the visualization of neurophysiological data.

2.1. THE OPENGL PIPELINE

A GPU contains a large number (hundreds to thousands) of execution units specialized in parallel arithmetic operations (Hong and Kim, 2009). This architecture is well adapted to real-time graphics processing. Very often, the same mathematical operation is applied to all vertices or pixels; for example, when the camera moves in a three-dimensional scene, the same transformation matrix is applied to all points. This massively parallel architecture explains the very high computational power of GPUs.

OpenGL is the industry standard for real-time hardware-accelerated graphics rendering, commonly used in video games and 3D modeling software (Woo et al., 1999). This open specification is supported on every major operating system5 and on most devices from the three major GPU vendors (NVIDIA, AMD, Intel) (Jon Peddie Research, 2013). This is a strong advantage of OpenGL over other graphical APIs such as DirectX (a proprietary technology maintained by Microsoft), or general-purpose GPU programming frameworks such as CUDA (a proprietary technology maintained by NVIDIA Corporation). Scientists tend to favor open standards over proprietary solutions for reasons of vendor lock-in and concerns about the longevity of the technology.

OpenGL defines a complex pipeline that describes how 2D/3D data are processed in parallel on the GPU before the final image is rendered on screen. We give a simplified overview of this pipeline here (see Figure 1). In the first step, raw data (typically, points in the original data coordinate system) are transformed by the vertex processor into 3D vertices.
Then, the primitive assembly creates points, lines, and triangles from these data. During rasterization, these primitives are converted into pixels (also called fragments). Finally, those fragments are transformed by the fragment processor to form the final image.

FIGURE 1 | Simplified OpenGL pipeline. Graphical commands and data go through multiple stages from the application code in Python to the screen. The code calls OpenGL commands and sends data to the GPU through PyOpenGL, a Python-OpenGL binding library. Vertex shaders process data vertices in parallel, and return points in homogeneous coordinates. During rasterization, a bitmap image is created from the vector primitives. The fragment shader processes pixels in parallel, and assigns a color and depth to every drawn pixel. The image is finally rendered on screen.

An OpenGL Python wrapper called PyOpenGL allows the creation of OpenGL-based applications in Python (Fletcher and Liebscher, 2005). A critical issue is performance, as there is a slight overhead with any OpenGL API call, especially from Python. This problem can be solved by minimizing the number of OpenGL API calls using different techniques. First, multiple primitives of the same type can be displayed efficiently via batched rendering. Also, PyOpenGL allows the transfer of potentially large NumPy arrays (Van Der Walt et al., 2011) from host memory to GPU memory with minimal overhead. Another technique concerns shaders, as discussed below.

2.2. OPENGL PROGRAMMABLE SHADERS

Prior to OpenGL 2.0 (Segal and Akeley, 2004), released in 2004, vertex and fragment processing were implemented in the fixed-function pipeline. Data and image processing algorithms were described in terms of predefined stages implemented on non-programmable dedicated hardware on the GPU. This architecture resulted in limited customization and high complexity; as a result, a programmable pipeline was proposed in the core specification of OpenGL 2.0.

5 s/
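The batched-rendering and NumPy-to-VBO transfer techniques mentioned above can be sketched as follows. This is a minimal illustration rather than code from the paper: the structured dtype mirrors the per-vertex data item of the toy triangle example (a 2D position and an RGB color), the function and variable names are ours, and the OpenGL calls require an active rendering context (e.g., a Qt or GLUT window) to actually execute.

```python
import numpy as np

# Interleaved per-vertex data: a 2D position and an RGB color for each
# vertex.  A structured dtype keeps the whole dataset in one contiguous
# NumPy array, ready for a single host-to-GPU transfer.
vertex_dtype = np.dtype([("a_position", np.float32, 2),
                         ("a_color", np.float32, 3)])
data = np.zeros(3, dtype=vertex_dtype)
data["a_position"] = [(-1.0, -1.0), (1.0, -1.0), (0.0, 1.0)]
data["a_color"] = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

def upload_to_vbo(array):
    """Transfer a NumPy array to GPU memory in a single buffered call.

    Requires an active OpenGL context; the import is deferred so that
    the data-preparation code above runs anywhere.
    """
    import OpenGL.GL as gl
    vbo = gl.glGenBuffers(1)
    gl.glBindBuffer(gl.GL_ARRAY_BUFFER, vbo)
    # PyOpenGL hands the array's memory directly to the driver, so the
    # transfer involves no per-element Python overhead.
    gl.glBufferData(gl.GL_ARRAY_BUFFER, array.nbytes, array,
                    gl.GL_STATIC_DRAW)
    return vbo

# Each vertex occupies 2*4 + 3*4 = 20 bytes; three vertices make 60 bytes.
```

A single buffered call like this replaces what would otherwise be thousands of per-vertex API calls, which is why batched rendering remains fast even from an interpreted language.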
This made it possible to implement entirely customized stages of the pipeline in a language close to C called the

OpenGL Shading Language (GLSL) (Kessenich et al., 2004). These stages encompass most notably vertex processing, implemented in the vertex shader, and fragment processing, implemented in the fragment shader. Other types of shaders exist, like the geometry shader, but they are currently less widely supported on standard hardware. The fixed-function pipeline has been deprecated since OpenGL 3.0.

The main purpose of programmable shaders is to offer high flexibility in transformation, lighting, or post-processing effects in 3D real-time scenes. However, being fully programmable, shaders can also be used to implement arbitrary data transformations on the GPU in 2D or 3D scenes. In particular, shaders can be immensely useful for high-performance interactive 2D/3D data visualization.

The principles of shaders are illustrated in Figure 2, sketching a toy example where three connected line segments forming a triangle are rendered from three vertices (Figure 2A). A data item with an arbitrary data type is provided for every vertex. In this example, there are two values for the 2D position, and three values for the point's color. The data buffer containing the items for all points is generally stored on the GPU in a vertex buffer object (VBO). PyOpenGL can transfer a NumPy array with the appropriate data type to a VBO with minimal overhead.

OpenGL lets us choose the mapping between a data item and variables in the shader program. These variables are called attributes. Here, the a_position attribute contains the first two values in the data item, and a_color contains the last three values. The inputs of a vertex shader program consist mainly of attributes, global variables called uniforms, and textures. A particularity of a shader program is that there is one execution thread per data item, so that the actual input of a vertex shader concerns a single vertex.
This is an example of the Single Instruction, Multiple Data (SIMD) paradigm in parallel computing, where one program is executed simultaneously over multiple cores and multiple bits of data (Almasi and Gottlieb, 1988). This pipeline leverages the massively parallel architecture of GPUs. Besides, GLSL supports conditional branching, so that different transformations can be applied to different parts of the data. In Figure 2A, the vertex shader applies the same linear transformation (rotation and scaling) to all vertices.

The vertex shader returns an OpenGL variable called gl_Position that contains the final position of the current vertex in homogeneous space coordinates. The vertex shader can return additional variables called varying variables (here, v_color), which are passed to the next programmable stage in the pipeline: the fragment shader.

After the vertex shader, the transformed vertices are passed to the primitive assembly and the rasterizer, where points, lines, and triangles are formed out of them. One can choose the mode describing how primitives are assembled. In particular, indexed rendering (not used in this toy example) allows a given vertex to be reused multiple times in different primitives to optimize memory usage. Here, the GL_LINE_LOOP mode is chosen, where lines connecting two consecutive points are rendered, the last vertex being connected to the first.

Finally, once rasterization is done, the scene is described in terms of pixels instead of vector data (Figure 2B). The fragment shader executes on all rendered pixels (pixels of the primitives rather than pixels of the screen). It accepts as inputs varying variables that have been interpolated between the closest vertices around the current pixel. The fragment shader returns the pixel's color.

Together, the vertex shader and the fragment shader offer great flexibility and very high performance in the way data are transformed and rendered on screen.
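In GLSL, the toy triangle example might be written as a shader pair along the following lines. This is our own sketch, not the paper's code: the uniform name u_transform is illustrative, and GLSL 1.20 syntax is used for wide hardware support.

```python
# Vertex shader: receives one data item per execution thread (the
# attributes), applies the same linear transformation (rotation and
# scaling) to every vertex, and forwards the color as a varying variable.
VERTEX_SHADER = """
#version 120
attribute vec2 a_position;  // first two values of the data item
attribute vec3 a_color;     // last three values of the data item
uniform mat2 u_transform;   // rotation + scaling, shared by all vertices
varying vec3 v_color;       // passed on to the fragment shader

void main() {
    gl_Position = vec4(u_transform * a_position, 0.0, 1.0);
    v_color = a_color;
}
"""

# Fragment shader: runs once per rendered pixel and receives v_color
# interpolated between the closest vertices around the current pixel.
FRAGMENT_SHADER = """
#version 120
varying vec3 v_color;

void main() {
    gl_FragColor = vec4(v_color, 1.0);  // final pixel color
}
"""
```

Rendered with the GL_LINE_LOOP primitive mode, the three transformed vertices are joined by line segments, the last being connected back to the first.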
Being implemented in a syntax very close to C, they allow for an unlimited variety of processing algorithms. Their main limitation is the fact that they execute independently. Therefore, implementing interactions between vertices or pixels is difficult without resorting to more powerful frameworks for general-purpose computing on GPUs, such as OpenCL (Stone et al., 2010) or CUDA (Nvidia, 2008). These libraries support OpenGL interoperability

