ADiJaC -- Automatic Differentiation of Java Classfiles
Emil I. Sluşanschi, Vlad Dumitrel
Article No.: 9
This work presents the current design and implementation of ADiJaC, an automatic differentiation tool for Java classfiles. ADiJaC uses source transformation to generate derivative codes in both the forward and the reverse modes of automatic...
Stability and Performance of Various Singular Value QR Implementations on Multicore CPU with a GPU
Ichitaro Yamazaki, Stanimire Tomov, Jack Dongarra
Article No.: 10
Singular Value QR (SVQR) can orthonormalize a set of dense vectors with the minimum communication (one global reduction between the parallel processing units, and BLAS-3 to perform most of its local computation). As a result, compared to other...
Pipelined Iterative Solvers with Kernel Fusion for Graphics Processing Units
Karl Rupp, Josef Weinbub, Ansgar Jüngel, Tibor Grasser
Article No.: 11
We revisit the implementation of iterative solvers on discrete graphics processing units and demonstrate the benefit of implementations using extensive kernel fusion for pipelined formulations over conventional implementations of classical...
Analytical Modeling Is Enough for High-Performance BLIS
Tze Meng Low, Francisco D. Igual, Tyler M. Smith, Enrique S. Quintana-Orti
Article No.: 12
We show how the BLAS-like Library Instantiation Software (BLIS) framework, which provides a more detailed layering of the GotoBLAS (now maintained as OpenBLAS) implementation, allows one to analytically determine tuning parameters for high-end...
Implementing Multifrontal Sparse Solvers for Multicore Architectures with Sequential Task Flow Runtime Systems
Emmanuel Agullo, Alfredo Buttari, Abdou Guermouche, Florent Lopez
Article No.: 13
To face the advent of multicore processors and the ever increasing complexity of hardware architectures, programming models based on DAG parallelism regained popularity in the high performance, scientific computing community. Modern runtime...
Topology-Oriented Incremental Algorithm for the Robust Construction of the Voronoi Diagrams of Disks
Mokwon Lee, Kokichi Sugihara, Deok-Soo Kim
Article No.: 14
Voronoi diagrams are useful for spatial reasoning, and the robust and efficient construction of the ordinary Voronoi diagram of points is well known. However, its counterpart for circular disks in ℝ² and spherical balls in ℝ³...
A Note on Performance Profiles for Benchmarking Software
Nicholas Gould, Jennifer Scott
Article No.: 15
In recent years, performance profiles have become a popular and widely used tool for benchmarking and evaluating the performance of several solvers when run on a large test set. Here we use data from a real application as well as a simple...
Algorithm 966: A Practical Iterative Algorithm for the Art Gallery Problem Using Integer Linear Programming
Davi C. Tozoni, Pedro J. De Rezende, Cid C. De Souza
Article No.: 16
In the last few decades, the search for exact algorithms for known NP-hard geometric problems has intensified. Many of these solutions use Integer Linear Programming (ILP) modeling and rely on state-of-the-art solvers to be able to find optimal...
Algorithm 967: A Distributed-Memory Fast Multipole Method for Volume Potentials
Dhairya Malhotra, George Biros
Article No.: 17
The solution of a constant-coefficient elliptic Partial Differential Equation (PDE) can be computed using an integral transform: A convolution with the fundamental solution of the PDE, also known as a volume potential. We present a Fast Multipole...