I’m currently a third-year Ph.D. student advised by Prof. Ming Ling of the EECS-Ling Lab at the School of Integrated Circuits, Southeast University.

I received my B.S. and M.S. degrees from Central China Normal University (CCNU) and Nanjing University of Posts and Telecommunications (NJUPT), respectively.

My research interests include domain-specific accelerators, efficient machine learning, and parallel computing.

🔥 News

📝 Publications

ICCAD 2025

Diff-DiT: Temporal Differential Accelerator for Low-bit Diffusion Transformers on FPGA

Shidi Tang, Pengwei Zheng, Ruiqi Chen, Yuxuan Lv, Bruno da Silva, Ming Ling
The 2025 International Conference on Computer-Aided Design (ICCAD’25).

Project

  • This work presents Diff-DiT, an efficient FPGA accelerator for DiT models with temporal differential computing. First, we propose an approximated differential attention to mitigate the previous challenge of differentiating the Attention layer. Second, we propose a cross-cast data access pattern that achieves the highest computational intensity when performing matrix multiplications. Third, we optimize the dataflow by exploring the parallelism (with our HCS method) and pipelining. Diff-DiT achieves 1.39× speedup and 5.60× energy efficiency improvement when compared to the NVIDIA V100 GPU.
FITEE 2025

Vina-FPGA2: a high-level parallelized hardware-accelerated molecular docking tool based on the inter-module pipeline

Ming Ling, Shidi Tang, Ruiqi Chen, Xin Li, Yanxiang Zhu
Frontiers of Information Technology & Electronic Engineering.

Project

  • This work presents Vina-FPGA2, an FPGA-based molecular docking acceleration tool. Building upon our previous efforts, Vina-FPGA2 implements an inter-module pipelined design, further accelerating the Vina computation process. Additionally, we developed a reinforcement-learning-based rapid solver that allows users to quickly obtain deployment parameters tailored to their target FPGA. Vina-FPGA2-Enhanced achieves an average 12.6× performance improvement over the CPU and a 3.3× improvement over Vina-FPGA. Compared to Vina-GPU, Vina-FPGA2 achieves a 7.2× improvement in energy efficiency.
TECS 2025

Diff-Acc: An Efficient FPGA Accelerator for Unconditional Diffusion Models

Shidi Tang, Ruiqi Chen, Rui Liu, Yuxuan Lv, Pengwei Zheng, He Li, Ming Ling
ACM Transactions on Embedded Computing Systems (Early Access).

Project

  • This work presents Diff-Acc, an efficient FPGA accelerator for unconditional UNet-based diffusion models with a novel step-wise quantization method and group-wise parallelism. Compared with both server-based (Tesla V100 and Intel Xeon) and edge-based (Raspberry Pi 4 and Jetson Nano) platforms, Diff-Acc implemented on the Zynq UltraScale+ XCZU9EG FPGA demonstrates up to 12.5× higher energy efficiency. Against the edge-based platforms in particular, Diff-Acc achieves up to 10.26× and 1.97× performance improvements over the CPU and GPU, respectively.
TSUSC 2024

EEVS: Redeploying Discarded Smartphones for Economic and Ecological Drug Molecules Virtual Screening

Ming Ling, Chuanzhao Zhang, Shidi Tang, Ruiqi Chen, Yanxiang Zhu
IEEE Transactions on Sustainable Computing.

Project

  • This work presents EEVS, aimed at redeploying discarded smartphones for economic and ecological drug molecules virtual screening.
TCBB 2024

Vina-GPU 2.1: towards further optimizing docking speed and precision of AutoDock Vina and its derivatives

Shidi Tang, Ji Ding, Xiangyu Zhu, Zheng Wang, Haitao Zhao, Jiansheng Wu
IEEE/ACM Transactions on Computational Biology and Bioinformatics.

Project

  • This work presents Vina-GPU 2.1, aimed at enhancing the docking speed and precision of AutoDock Vina and its derivatives through the integration of novel algorithms to facilitate improved docking and virtual screening outcomes.
PACRIM 2024

Modeling equivariant neural networks for hardware acceleration, a case study on the molecular docking tool DiffDock

Shidi Tang, Xingxing Zhou, Ming Ling.
PACRIM’2024.

Project

  • This work proposes the first SystemC model for equivariant neural networks, specifically targeting DiffDock, with the aim of accelerating its performance using customized hardware such as FPGAs.
NMI 2024

Predicting equilibrium distributions for molecular systems with deep learning

Shuxin Zheng, Jiyan He, Chang Liu, Yu Shi, Ziheng Lu, Weitao Feng, Fusong Ju, Jiaxi Wang, Jianwei Zhu, Yaosen Min, He Zhang, Shidi Tang, Hongxia Hao, Peiran Jin, Chi Chen, Frank Noé, Haiguang Liu, Tie-Yan Liu.
Nature Machine Intelligence (2024): 1-10.

Project

  • This work proposes Distributional Graphormer (DiG) in an attempt to predict the equilibrium distribution of molecular systems.
TBCAS 2024

Vina-FPGA-Cluster: Multi-FPGA Based Molecular Docking Tool with High-Accuracy and Multi-Level Parallelism

Ming Ling, Zhihao Feng, Ruiqi Chen, Yi Shao, Shidi Tang, Yanxiang Zhu
IEEE Transactions on Biomedical Circuits and Systems (TBCAS) (2024).

Project

  • This work presents Vina-FPGA-Cluster, a multi-FPGA-based molecular docking tool enabling high-accuracy and multi-level parallel Vina acceleration.
JCIM 2023

Vina-GPU 2.0: further accelerating AutoDock Vina and its derivatives with graphics processing units

Ji Ding, Shidi Tang, Zheming Mei, Lingyue Wang, Qinqin Huang, Haifeng Hu, Ming Ling, Jiansheng Wu
Journal of Chemical Information and Modeling (JCIM) 63 (7), 1982-1998

Project

  • This work presents Vina-GPU 2.0 to further accelerate AutoDock Vina and its derivatives with graphics processing units.

Xingxing Zhou, Ming Ling, Qingde Lin, Shidi Tang, Jiansheng Wu, Haifeng Hu
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), 2023

Project

  • This work presents a probabilistic model for the effectiveness of the multiple-initial-states parallel simulated annealing (MISPSA) algorithm used in Vina-GPU.
Molecules 2022

Accelerating AutoDock Vina with GPUs

Shidi Tang, Ruiqi Chen, Mengru Lin, Qingde Lin, Yanxiang Zhu, Ji Ding, Haifeng Hu, Ming Ling, Jiansheng Wu
Molecules, 2022, 27(9): 3041

Project

  • This work presents a parallel algorithm called Vina-GPU and the OpenCL implementation on GPUs.

A fast approximate check polytope projection algorithm for ADMM decoding of LDPC codes

Qiaoqiao Xia, Yan Lin, Shidi Tang, Qinglin Zhang
IEEE Communications Letters, 2019, 23(9): 1520-1523.

Project

  • This work presents a fast projection algorithm for ADMM decoding of LDPC codes.

💼 Service

  • Reviewer for IEEE Transactions on Sustainable Computing
  • Reviewer for The Journal of Supercomputing

🎖 Honors and Awards

📖 Education

  • B.S. in Communication Engineering, 2016-2019

    Central China Normal University (CCNU), Wuhan, China

  • M.S. in Biomedical Engineering, 2020-2023

    Nanjing University of Posts and Telecommunications (NJUPT), Nanjing, China

  • Ph.D. student, 2023-present

    Southeast University (SEU), Nanjing, China

💻 Internships

  • Research Intern, 2022-2023

    Microsoft Research Asia (MSRA), AI for Science group, Beijing, China