Applied Scientific Research has recently developed a Lagrangian vortex-boundary element method for the grid-free simulation of unsteady, incompressible, laminar-turbulent flow in and around complex geometries. The evaluation of the velocities due to vortex and boundary elements, which accounts for roughly 90% of the simulation cost, is accelerated by a scalably parallel, adaptive fast multipole method (FMM) for distributed computing. The objective of this proposal is to incorporate hardware acceleration into the FMM using a cluster of graphics processing units (GPUs). During Phase I, the direct interaction component of the FMM (vortex elements only), which accounts for roughly half the computational cost, will be offloaded onto a single GPU. Feasibility will be demonstrated by achieving an order-of-magnitude speedup on the GPU relative to a CPU. Phase II activities will involve porting the entire FMM code (for both vortex and boundary elements) onto a cluster of GPUs. The resulting grid-free flow simulation software for GPU clusters is expected to provide a significant price/performance improvement over traditional CPU clusters, thereby enabling simulations of larger and more physically realistic problems.
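
For illustration only, the direct interaction component can be pictured as an all-pairs summation in which every target accumulates the velocity induced by every nearby source. The sketch below is not the proposal's code: it assumes a two-dimensional, regularized point-vortex kernel for brevity, and the function name `direct_velocities` and the smoothing parameter `delta` are hypothetical. Each pass of the outer loop is independent of the others, which is the property that would allow one GPU thread per target.

```python
import math

def direct_velocities(targets, sources, delta=0.0):
    """All-pairs (direct) velocity evaluation for 2D point vortices.

    targets: list of (x, y) evaluation points.
    sources: list of (x, y, gamma) vortices with circulation gamma.
    delta:   smoothing length (hypothetical parameter); delta = 0 gives
             the singular point-vortex kernel.
    """
    inv_two_pi = 1.0 / (2.0 * math.pi)
    velocities = []
    for xt, yt in targets:          # independent per target: one GPU thread each
        u = 0.0
        v = 0.0
        for xs, ys, gamma in sources:
            dx = xt - xs
            dy = yt - ys
            denom = dx * dx + dy * dy + delta * delta
            if denom == 0.0:
                continue            # skip the singular self-interaction
            w = gamma * inv_two_pi / denom
            u -= w * dy             # u = -gamma * dy / (2 pi (r^2 + delta^2))
            v += w * dx             # v =  gamma * dx / (2 pi (r^2 + delta^2))
        velocities.append((u, v))
    return velocities
```

The cost of this loop grows with the product of the number of targets and sources, so in an FMM it would be applied only to elements in neighboring cells, with far-field contributions handled by the multipole expansions.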