We propose to build on the modular, extensible architecture and existing capabilities of Open|SpeedShop to provide seamless, integrated performance analysis for heterogeneous processors. NVIDIA GPUs and Intel Many Integrated Core (MIC) processors are increasingly important at high performance computing (HPC) laboratories within NASA, and on NASA's high-end computing (HEC) projects, because of their ability to accelerate scientific applications. To understand what impact these accelerators have on performance, tools must succinctly present heterogeneous processor performance information. A key goal of this work is to develop innovative methods for presenting the performance information extracted from applications running on both traditional CPUs and GPU/MIC processors; specifically, to provide command line interface (CLI) and graphical user interface (GUI) displays that help users understand how their applications utilize the accelerator. In Phase I, GPU-related research will include measuring the usefulness of a GPU kernel in terms of device utilization, data transfer rates, device efficiency metrics, internal device tiling factors, and device memory hierarchy usage, as well as additional factors outlined in the referenced material. When an application runs concurrently on both CPU and GPU processors, attributing time to the proper processor can be difficult, and the difficulty is exacerbated when multiple accelerators are in use at the same time. Accurately measuring the interactions of multiple accelerator devices is necessary to report the application's performance correctly to the user, but it is difficult with the current set of accelerator interfaces. Developing techniques for mitigating these limitations will be another area of our GPU research.
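To illustrate the attribution problem described above, the following is a minimal sketch, not Open|SpeedShop's implementation. It assumes that activity intervals for each processor have already been collected and share a common clock, which is itself one of the harder parts in practice. The function name `attribute_time` and the device labels (`cpu`, `gpu0`, `gpu1`) are hypothetical. The sketch sweeps over interval boundaries and charges each slice of wall-clock time to the exact set of devices that were busy during it, so overlapped time is reported separately rather than double counted.

```python
from collections import defaultdict


def attribute_time(intervals):
    """Attribute wall-clock time to the set of devices active in each slice.

    intervals: iterable of (device, start, end) tuples on a shared clock
               (assumption: timestamps are already synchronized).
    Returns a dict mapping frozenset(devices) -> total time in that state.
    Time during which no device is active is not reported.
    """
    events = []
    for device, start, end in intervals:
        if end <= start:
            continue  # ignore empty or malformed intervals
        events.append((start, +1, device))
        events.append((end, -1, device))
    events.sort(key=lambda e: e[0])

    active = defaultdict(int)    # device -> number of open intervals
    totals = defaultdict(float)  # set of busy devices -> accumulated time
    prev = None
    for t, delta, device in events:
        if prev is not None and t > prev:
            busy = frozenset(d for d, n in active.items() if n > 0)
            if busy:
                totals[busy] += t - prev
        active[device] += delta
        prev = t
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical trace: one CPU region overlapping kernels on two GPUs.
    trace = [("cpu", 0, 4), ("gpu0", 1, 3), ("gpu1", 2, 5)]
    for devices, seconds in attribute_time(trace).items():
        print(sorted(devices), seconds)
```

Keying the totals on the set of busy devices, rather than on a single device, keeps CPU-only, GPU-only, and overlapped time distinct, and the same structure extends to any number of accelerators.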