Bus bunching, a phenomenon caused by failures of headway or timetable adherence, often degrades public transit service, leading to poor on-time performance and excessive passenger waiting times. Accurate, real-time prediction plays an important role in mitigating bus bunching. In this paper, we propose a supply-demand seq2seq model, called SD-seq2seq, to predict bus bunching using smart card data. The model takes into account features from both the supply and demand sides of bus service, such as bus stop type, dwell time, and passenger demand and type. Extensive experiments on multiple real-world bus routes demonstrate that our method outperforms the baseline methods. The proposed method is expected to provide useful online information about bus operations to both bus operators and passengers.