Accurate traffic speed prediction is critical to modern Internet of Things (IoT)-based intelligent transportation systems, where it serves as the foundation of advanced traffic management systems and travel services. Nonetheless, the large number of roads and sensors imposes a great computational burden on existing forecasting approaches, most of which can handle only one or a few roads at a time. In this paper, a novel data-driven, deep learning-based approach is proposed for citywide traffic speed prediction. The proposed approach builds on recent developments in geometric deep learning to fully utilize the topological information of road networks in the learning process. Specifically, the approach captures the geometric dependency in traffic data with graph convolution and attention mechanisms, while the temporal data correlation is extracted and expanded using an encoder-decoder architecture within a generative adversarial learning framework. Comprehensive case studies are conducted on real-world urban road networks and their corresponding data to evaluate the performance of the approach, and consistent improvements are observed over baseline approaches. Lastly, an architectural study is carried out to identify the best-performing structure of the proposed approach, and its sensitivity to data noise and sampling frequency is also assessed.
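To make the notion of graph convolution over a road network concrete, the following is a minimal sketch of a single graph convolution step in NumPy. It assumes the widely used symmetric-normalization propagation rule with self-loops, which is not necessarily the exact layer used in the proposed approach. The four-segment road graph, the speed values, and the names `normalized_adjacency` and `graph_conv` are hypothetical and chosen only for illustration; the attention, encoder-decoder, and adversarial components are omitted.

```python
import numpy as np


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected road graph."""
    # Self-loops let each road segment keep its own measurement.
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Scale rows and columns by the inverse square root of the degree.
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(adj, features, weights):
    """One graph convolution step followed by a ReLU activation."""
    propagated = normalized_adjacency(adj) @ features @ weights
    return np.maximum(propagated, 0.0)


# Hypothetical road network: four segments connected in a chain 0-1-2-3.
adj = np.array(
    [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ],
    dtype=float,
)

# Hypothetical speed readings (km/h), one feature per segment.
speeds = np.array([[60.0], [45.0], [30.0], [50.0]])

# Assumed weights mapping 1 input feature to 2 hidden features;
# in a trained model these would be learned.
weights = np.ones((1, 2))

hidden = graph_conv(adj, speeds, weights)
print(hidden.shape)
```

Each row of `hidden` mixes a segment's own speed with the speeds of its directly connected neighbors, which is the sense in which the road topology enters the learning process. Stacking several such steps would let information travel across more distant segments.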