Publications

You can also find my articles on my Google Scholar profile.

Journal Articles

Black-Box Test-Time Ensemble

Published in IEEE Computational Intelligence Magazine, 2025

Privacy considerations have become increasingly critical in the deployment of modern machine learning models. To protect sensitive data and reduce storage or transmission costs, many service providers offer trained models via APIs, effectively creating privacy-preserving black-box models. However, evaluating the performance of such models remains a significant challenge, especially for downstream tasks lacking labeled data. This paper proposes an unsupervised combination method for black-box test-time ensemble. By utilizing only the models’ predictions on unlabeled test data, the proposed approach estimates the reliability of individual base classifiers and constructs a weighted ensemble that favors more accurate ones. Our approach is compatible with both traditional machine learning classifiers and modern large language models, and accommodates a wide range of scenarios, including binary and multi-class classification, hard and soft outputs, and both offline and online settings. Extensive experiments on 13 real-world text, image, and time series datasets verified the effectiveness and flexibility of the approach, which consistently outperformed majority voting and other combination approaches. Notably, the proposed approach is hyperparameter-free and computationally efficient, rendering it well-suited for applications that require online real-time inference.
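The idea of weighting base classifiers using only their predictions on unlabeled data can be illustrated with a minimal sketch. This is not the paper's estimator: the reliability proxy below (each model's agreement rate with the current fused labels) and the names `estimate_weights`, `weighted_vote`, and `n_iters` are assumptions chosen for illustration, and the iteration count is a knob that the actual hyperparameter-free method does not need.

```python
from collections import Counter


def majority_vote(preds_per_model):
    """Per-sample plurality label across models (ties go to the first-seen label)."""
    n_samples = len(preds_per_model[0])
    return [
        Counter(preds[i] for preds in preds_per_model).most_common(1)[0][0]
        for i in range(n_samples)
    ]


def weighted_vote(preds_per_model, weights):
    """Per-sample label with the largest total weight across models."""
    n_samples = len(preds_per_model[0])
    fused = []
    for i in range(n_samples):
        scores = {}
        for preds, w in zip(preds_per_model, weights):
            scores[preds[i]] = scores.get(preds[i], 0.0) + w
        fused.append(max(scores, key=scores.get))
    return fused


def estimate_weights(preds_per_model, n_iters=5):
    """Assumed reliability proxy: agreement rate with the current fused labels.

    No ground-truth labels are used; only the hard predictions of each model.
    """
    fused = majority_vote(preds_per_model)
    weights = [1.0] * len(preds_per_model)
    for _ in range(n_iters):
        weights = [
            sum(p == f for p, f in zip(preds, fused)) / len(fused)
            for preds in preds_per_model
        ]
        fused = weighted_vote(preds_per_model, weights)
    return weights, fused


# Toy hard predictions from three black-box classifiers on six unlabeled samples.
model_a = [0, 1, 1, 0, 1, 0]
model_b = [0, 1, 1, 0, 0, 0]
model_c = [1, 0, 1, 0, 1, 1]
weights, fused = estimate_weights([model_a, model_b, model_c])
```

In this toy case the model that agrees most often with the consensus receives the largest weight, which is the behavior the abstract describes as favoring more accurate classifiers.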

Recommended citation: S. Li, Z. Wang, C. Liu, and D. Wu, "Black-Box Test-Time Ensemble," IEEE Comput. Intell. Mag., 2025.
Download Paper

DBConformer: Dual-Branch Convolutional Transformer for EEG Decoding

Published in IEEE Journal of Biomedical and Health Informatics, 2025

Electroencephalography (EEG)-based brain-computer interfaces (BCIs) transform spontaneous/evoked neural activity into control commands for external communication. While convolutional neural networks (CNNs) remain the mainstream backbone for EEG decoding, their inherently short receptive field makes it difficult to capture long-range temporal dependencies and global inter-channel relationships. Recent CNN-Transformer (Conformer) hybrids partially address this issue, but most adopt a serial design, resulting in suboptimal integration of local and global features, and often overlook explicit channel-wise modeling. To address these limitations, we propose DBConformer, a dual-branch convolutional Transformer network tailored for EEG decoding. It integrates a temporal Conformer to model long-range temporal dependencies and a spatial Conformer to extract inter-channel interactions, capturing both temporal dynamics and spatial patterns in EEG signals. A lightweight channel attention module further refines spatial representations by assigning data-driven importance to EEG channels. Extensive experiments under four evaluation settings on three paradigms, including motor imagery, seizure detection, and steady-state visual evoked potential, demonstrated that DBConformer consistently outperformed 13 competitive baseline models, with over an eight-fold reduction in parameters compared with the current high-capacity EEG Conformer architecture. Furthermore, the visualization results confirmed that the features extracted by DBConformer are physiologically interpretable and aligned with prior knowledge. The superior performance and interpretability of DBConformer make it reliable for accurate, robust, and explainable EEG decoding.

Recommended citation: Z. Wang, H. Wang, T. Jia, X. He, S. Li, and D. Wu, "DBConformer: Dual-Branch Convolutional Transformer for EEG Decoding," IEEE J. Biomed. Health Inform., 2025.
Download Paper

MVCNet: Multi-View Contrastive Network for Motor Imagery Classification

Published in Knowledge-Based Systems, 2025

Electroencephalography (EEG)-based brain-computer interfaces (BCIs) enable neural interaction by decoding brain activity for external communication. Motor imagery (MI) decoding has received significant attention due to its intuitive mechanism. However, most existing models rely on single-stream architectures and overlook the multi-view nature of EEG signals, leading to limited performance and generalization. We propose a multi-view contrastive network (MVCNet), a dual-branch architecture that integrates CNN and Transformer blocks in parallel to capture both local spatial-temporal features and global temporal dependencies. To enhance the informativeness of training data, MVCNet incorporates a unified augmentation pipeline across time, frequency, and spatial domains. Two contrastive modules are further introduced: a cross-view contrastive module that enforces consistency between original and augmented views, and a cross-model contrastive module that aligns features extracted from both branches. Final representations are fused and jointly optimized by contrastive and classification losses. Experiments on five public MI datasets across three scenarios demonstrate that MVCNet consistently outperforms nine state-of-the-art MI decoding networks, highlighting its effectiveness and generalization ability. MVCNet provides a robust solution for MI decoding by integrating multi-view information and dual-branch modeling, contributing to the development of more reliable BCI systems.

Recommended citation: Z. Wang, S. Li, X. Chen, and D. Wu, "MVCNet: Multi-View Contrastive Network for Motor Imagery Classification," Knowledge-Based Systems, 328:114205, 2025.
Download Paper

Canine EEG Helps Human: Cross-Species and Cross-Modality Epileptic Seizure Detection via Multi-Space Alignment

Published in National Science Review, 2025

Epilepsy significantly impacts global health, affecting about 65 million people worldwide, along with various animal species. The diagnostic processes of epilepsy are often hindered by the transient and unpredictable nature of seizures. Here we propose a multi-space alignment approach based on cross-species and cross-modality electroencephalogram (EEG) data to enhance the detection capabilities and understanding of epileptic seizures. By employing deep learning techniques, including domain adaptation and knowledge distillation, our framework aligns cross-species and cross-modality EEG signals to enhance the detection capability beyond traditional within-species and within-modality models. Experiments on multiple surface and intracranial EEG datasets of humans and canines demonstrated substantial improvements in detection accuracy, achieving over 90% AUC scores for cross-species and cross-modality seizure detection with extremely limited labeled data from the target species/modality. To our knowledge, this is the first study that demonstrates the effectiveness of integrating heterogeneous data from different species and modalities to improve EEG-based seizure detection performance. This is a pilot study that provides insights into the challenges and potential of multi-species and multi-modality data integration, offering an effective solution for future work to collect large-scale EEG data to train large brain models.

Recommended citation: Z. Wang, S. Li, and D. Wu, "Canine EEG Helps Human: Cross-Species and Cross-Modality Epileptic Seizure Detection via Multi-Space Alignment," National Science Review, 12(6):nwaf086, 2025.
Download Paper

Time-Frequency Transform based EEG Data Augmentation for Brain-Computer Interfaces

Published in Knowledge-Based Systems, 2025

Accurate decoding of electroencephalography (EEG) signals is crucial for brain–computer interfaces (BCIs); however, individual differences, non-stationarity of EEG signals, and limited training data make the decoding very challenging. Existing EEG data augmentation approaches usually operate in the temporal, frequency, or spatial domain only, which may not adequately capture the non-stationarity of EEGs. Moreover, these methods typically generate within-subject augmented trials, restricting their effectiveness in accommodating inter-subject variability. This paper proposes two time–frequency transform based EEG data augmentation approaches: Discrete Wavelet Transform Augmentation (DWTaug) and Hilbert–Huang Transform Augmentation (HHTaug). Both follow three steps: time–frequency domain decomposition, cross-subject sub-signal reassembling, and time domain reconstruction. Augmenting data expands the pool of labeled training samples, alleviating the data scarcity problem; time–frequency decomposition captures the non-stationary properties of EEG signals more effectively; finally, cross-subject reassembling of sub-signals handles individual differences. Experiments on 17 datasets from three different BCI paradigms demonstrated the superiority of DWTaug and HHTaug over nine existing EEG data augmentation approaches, improving 4% over baseline on average. By leveraging essential time–frequency information, DWTaug and HHTaug introduce new utility to traditional signal processing techniques, enhancing EEG data augmentation, thus effectively addressing key EEG decoding challenges. To our knowledge, this is the first work to simultaneously address individual variability, non-stationarity, and data scarcity in EEG decoding, significantly enhancing the real-world applicability of BCIs.
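The three steps named above (decomposition, cross-subject sub-signal reassembling, reconstruction) can be sketched on a single channel with a one-level Haar wavelet. This is only an illustration under stated assumptions: the choice of the Haar wavelet, a single decomposition level, swapping the detail coefficients (rather than some other sub-signal), and pairing two same-class trials are all assumptions, and the function names are hypothetical rather than taken from the paper's code.

```python
import math


def haar_dwt(x):
    """One-level Haar DWT of an even-length sequence -> (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the time-domain sequence."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / s, (a - d) / s])
    return x


def cross_subject_augment(trial_a, trial_b):
    """Swap detail sub-signals between two trials assumed to share a class label
    but come from different subjects, then reconstruct both in the time domain."""
    a_approx, a_detail = haar_dwt(trial_a)   # step 1: decomposition
    b_approx, b_detail = haar_dwt(trial_b)
    new_a = haar_idwt(a_approx, b_detail)    # steps 2 and 3: reassemble, reconstruct
    new_b = haar_idwt(b_approx, a_detail)
    return new_a, new_b


# Toy single-channel trials from two hypothetical subjects.
trial_a = [1.0, 3.0, 2.0, 6.0]
trial_b = [4.0, 0.0, 5.0, 1.0]
aug_a, aug_b = cross_subject_augment(trial_a, trial_b)
```

Each augmented trial keeps the coarse (approximation) content of one subject and the fine (detail) content of the other, so it can inherit the shared class label while mixing subject-specific characteristics.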

Recommended citation: Z. Wang, S. Li, X. Chen, and D. Wu, "Time-Frequency Transform based EEG Data Augmentation for Brain-Computer Interfaces," Knowledge-Based Systems, 311:113074, 2025.
Download Paper

User-wise Perturbations for User Identity Protection in EEG-Based BCIs

Published in Journal of Neural Engineering, 2025

An electroencephalogram (EEG)-based brain–computer interface (BCI) is a direct communication pathway between the human brain and a computer. Most research so far has studied more accurate BCIs, but much less attention has been paid to the ethics of BCIs. Aside from task-specific information, EEG signals also contain rich private information, e.g. user identity, emotion, and disorders, which should be protected. We show for the first time that adding user-wise perturbations can make identity information in EEG unlearnable. We propose four types of user-wise privacy-preserving perturbations, i.e. random noise, synthetic noise, error minimization noise, and error maximization noise. After adding the proposed perturbations to EEG training data, the user identity information in the data becomes unlearnable, while the BCI task information remains unaffected. Experiments on six EEG datasets using three neural network classifiers and various traditional machine learning models demonstrated the robustness and practicability of the proposed perturbations. Our research shows the feasibility of hiding user identity information in EEG data without impacting the primary BCI task information.

Recommended citation: X. Chen, S. Li, Y. Tu, Z. Wang, and D. Wu, "User-wise Perturbations for User Identity Protection in EEG-Based BCIs," J. Neural Eng., 22(1):016040, 2025.
Download Paper

Multimodal Brain-Computer Interfaces: AI-powered Decoding Methodologies

Published in ArXiv, 2025

Brain-computer interfaces (BCIs) enable direct communication between the brain and external devices. This review highlights the core decoding algorithms that enable multimodal BCIs, including a dissection of the elements, a unified view of diversified approaches, and a comprehensive analysis of the present state of the field. We emphasize algorithmic advancements in cross-modality mapping and sequential modeling, besides classic multi-modality fusion, illustrating how these novel AI approaches enhance decoding of brain data. The current literature on BCI applications in visual, speech, and affective decoding is comprehensively explored. Looking forward, we draw attention to the impact of emerging architectures like multimodal Transformers, and discuss challenges such as brain data heterogeneity and common errors. This review also serves as a bridge in this interdisciplinary field for experts with a neuroscience background and experts who study AI, aiming to provide a comprehensive understanding of AI-powered multimodal BCIs.

Recommended citation: S. Li, H. Wang, X. Chen, and D. Wu, "Multimodal Brain-Computer Interfaces: AI-powered Decoding Methodologies," arXiv, 2025.
Download Paper

Gated Parametric Neuron for Spike-based Audio Recognition

Published in Neurocomputing, 2024

Spiking neural networks (SNNs) aim to simulate real neural networks in the human brain with biologically plausible neurons. The leaky integrate-and-fire (LIF) neuron is one of the most widely studied SNN architectures. However, it has the vanishing gradient problem when trained with backpropagation. Additionally, its neuronal parameters are often manually specified and fixed, in contrast to the heterogeneity of real neurons in the human brain. This paper proposes a gated parametric neuron (GPN) to process spatio-temporal information effectively with the gating mechanism. Compared with the LIF neuron, the GPN has two distinguishing advantages: (1) it copes well with the vanishing gradients by improving the flow of gradient propagation; and, (2) it learns spatio-temporal heterogeneous neuronal parameters automatically. Additionally, we use the same gate structure to eliminate initial neuronal parameter selection and design a hybrid recurrent neural network-SNN structure. Experiments on two spike-based audio datasets demonstrated that the GPN network outperformed several state-of-the-art SNNs, could mitigate vanishing gradients, and had spatio-temporal heterogeneous parameters. Our work shows the ability of SNNs to handle long-term dependencies and achieve high performance simultaneously.

Recommended citation: H. Wang, H. Zhang, S. Li, and D. Wu, "Gated Parametric Neuron for Spike-based Audio Recognition," Neurocomputing, 609:128477, 2024.
Download Paper

Federated Motor Imagery Classification for Privacy-Preserving Brain-Computer Interfaces

Published in IEEE Transactions on Neural Systems and Rehabilitation Engineering, 2024

Training an accurate classifier for an EEG-based brain-computer interface (BCI) requires EEG data from a large number of users, whereas protecting their data privacy is a critical consideration. Federated learning (FL) is a promising solution to this challenge. This paper proposes Federated classification with local Batch-specific batch normalization and Sharpness-aware minimization (FedBS) for privacy protection in EEG-based motor imagery (MI) classification. FedBS utilizes local batch-specific batch normalization to reduce data discrepancies among different clients, and a sharpness-aware minimization optimizer in local training to improve model generalization. Experiments on three public MI datasets using three popular deep learning models demonstrated that FedBS outperformed six state-of-the-art FL approaches. Remarkably, it also outperformed centralized training, which does not consider privacy protection at all. In summary, FedBS protects user EEG data privacy, enabling multiple BCI users to participate in large-scale machine learning model training, which in turn improves the BCI decoding accuracy.

Recommended citation: T. Jia, L. Meng, S. Li, and D. Wu, "Federated Motor Imagery Classification for Privacy-Preserving Brain-Computer Interfaces," IEEE Trans. Neural Syst. Rehabil. Eng., 32:3442–3451, 2024.
Download Paper

Channel Reflection: Knowledge-Driven Data Augmentation for EEG-based Brain–Computer Interfaces

Published in Neural Networks, 2024

A brain–computer interface (BCI) enables direct communication between the human brain and external devices. Electroencephalography (EEG) based BCIs are currently the most popular for able-bodied users. To increase user-friendliness, usually a small amount of user-specific EEG data are used for calibration, which may not be enough to develop a pure data-driven decoding model. To cope with this typical calibration data shortage challenge in EEG-based BCIs, this paper proposes a parameter-free channel reflection (CR) data augmentation approach that incorporates prior knowledge on the channel distributions of different BCI paradigms in data augmentation. Experiments on eight public EEG datasets across four different BCI paradigms (motor imagery, steady-state visual evoked potential, P300, and seizure classifications) using different decoding algorithms demonstrated that: (1) CR is effective, i.e., it can noticeably improve the classification accuracy; (2) CR is robust, i.e., it consistently outperforms existing data augmentation approaches in the literature; and, (3) CR is flexible, i.e., it can be combined with other data augmentation approaches to further improve the performance. We suggest that data augmentation approaches like CR should be an essential step in EEG-based BCIs. Our code is available online.

Recommended citation: Z. Wang, S. Li, J. Luo, J. Liu, and D. Wu, "Channel Reflection: Knowledge-Driven Data Augmentation for EEG-based Brain–Computer Interfaces," Neural Networks, 176:106351, 2024.
Download Paper

T-TIME: Test-Time Information Maximization Ensemble for Plug-and-Play BCIs

Published in IEEE Transactions on Biomedical Engineering, 2024

An electroencephalogram (EEG)-based brain-computer interface (BCI) enables direct communication between the human brain and a computer. Due to individual differences and non-stationarity of EEG signals, such BCIs usually require a subject-specific calibration session before each use, which is time-consuming and user-unfriendly. Transfer learning (TL) has been proposed to shorten or eliminate this calibration, but existing TL approaches mainly consider offline settings, where all unlabeled EEG trials from the new user are available. This paper proposes Test-Time Information Maximization Ensemble (T-TIME) to accommodate the most challenging online TL scenario, where unlabeled EEG data from the new user arrive in a stream, and immediate classification is performed. T-TIME initializes multiple classifiers from the aligned source data. When an unlabeled test EEG trial arrives, T-TIME first predicts its label using ensemble learning, and then updates each classifier by conditional entropy minimization and adaptive marginal distribution regularization. Our code is publicly available. Extensive experiments on three public motor imagery based BCI datasets demonstrated that T-TIME outperformed about 20 classical and state-of-the-art TL approaches. To our knowledge, this is the first work on test time adaptation for calibration-free EEG-based BCIs, making plug-and-play BCIs possible.
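The two quantities mentioned above, ensemble prediction and conditional entropy, can be made concrete with a small sketch. It uses the standard information-maximization objective (conditional entropy minus marginal entropy) as a stand-in; the paper's adaptive marginal distribution regularization is not reproduced here, and all function names are hypothetical.

```python
import math


def entropy(p, eps=1e-12):
    """Shannon entropy (in nats) of one probability vector."""
    return -sum(q * math.log(q + eps) for q in p)


def conditional_entropy(batch_probs):
    """Mean per-trial entropy; small when every prediction is confident."""
    return sum(entropy(p) for p in batch_probs) / len(batch_probs)


def marginal_entropy(batch_probs):
    """Entropy of the batch-averaged prediction; large when classes are balanced."""
    n, k = len(batch_probs), len(batch_probs[0])
    mean = [sum(p[c] for p in batch_probs) / n for c in range(k)]
    return entropy(mean)


def information_maximization_loss(batch_probs):
    """Lower is better: confident per-trial predictions, diverse overall."""
    return conditional_entropy(batch_probs) - marginal_entropy(batch_probs)


def ensemble_predict(model_probs):
    """Average class probabilities from several classifiers; return (label, mean)."""
    k = len(model_probs[0])
    mean = [sum(p[c] for p in model_probs) / len(model_probs) for c in range(k)]
    return max(range(k), key=lambda c: mean[c]), mean


# Toy softmax outputs for a two-class problem.
confident_batch = [[1.0, 0.0], [0.0, 1.0]]
uncertain_batch = [[0.5, 0.5], [0.5, 0.5]]
loss_confident = information_maximization_loss(confident_batch)
loss_uncertain = information_maximization_loss(uncertain_batch)
label, mean_probs = ensemble_predict([[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]])
```

In a test-time adaptation loop, a loss of this kind would be minimized with respect to each classifier's parameters after every incoming trial; the gradient step itself is omitted because it depends on the model and framework.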

Recommended citation: S. Li, Z. Wang, H. Luo, L. Ding, and D. Wu, "T-TIME: Test-Time Information Maximization Ensemble for Plug-and-Play BCIs," IEEE Trans. Biomed. Eng., 71(2):423–432, 2024.
Download Paper

Motor Imagery Classification for Asynchronous EEG-Based Brain-Computer Interfaces

Published in IEEE Transactions on Neural Systems and Rehabilitation Engineering, 2024

Motor imagery (MI) based brain-computer interfaces (BCIs) enable the direct control of external devices through the imagined movements of various body parts. Unlike previous systems that used fixed-length EEG trials for MI decoding, asynchronous BCIs aim to detect the user’s MI without explicit triggers. They are challenging to implement, because the algorithm needs to first distinguish between resting-states and MI trials, and then classify the MI trials into the correct task, all without any triggers. This paper proposes a sliding window prescreening and classification (SWPC) approach for MI-based asynchronous BCIs, which consists of two modules: a prescreening module to screen MI trials out of the resting-state, and a classification module for MI classification. Both modules are trained with supervised learning followed by self-supervised learning, which refines the feature extractors. Within-subject and cross-subject asynchronous MI classifications on four different EEG datasets validated the effectiveness of SWPC, i.e., it always achieved the highest average classification accuracy, and outperformed the best state-of-the-art baseline on each dataset by about 2%.
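The two-module flow described above (prescreen each sliding window, then classify only the windows flagged as MI) can be sketched as follows. The window length, step size, and the two toy decision functions are assumptions for illustration; in the paper both modules are trained models, which this sketch replaces with simple placeholder rules.

```python
def sliding_windows(signal, window_len, step):
    """Yield (start_index, window) pairs over a 1-D signal."""
    for start in range(0, len(signal) - window_len + 1, step):
        yield start, signal[start:start + window_len]


def prescreen_then_classify(signal, window_len, step, is_mi, classify):
    """Run the prescreening rule on every window; classify only flagged windows."""
    detections = []
    for start, window in sliding_windows(signal, window_len, step):
        if is_mi(window):                     # prescreening module (placeholder)
            detections.append((start, classify(window)))  # classification module
    return detections


def toy_is_mi(window):
    """Placeholder prescreener: flag windows with high mean absolute amplitude."""
    return sum(abs(v) for v in window) / len(window) > 4


def toy_classify(window):
    """Placeholder classifier: label by the sign of the window mean."""
    return "pos" if sum(window) > 0 else "neg"


# Toy continuous stream: rest, a positive burst, rest, a negative burst, rest.
stream = [0, 0, 0, 5, 5, 5, 0, 0, -5, -5, -5, 0]
detections = prescreen_then_classify(stream, 3, 1, toy_is_mi, toy_classify)
```

Keeping the prescreener separate from the classifier means resting-state windows never reach the task classifier, which matches the abstract's description of distinguishing rest from MI before assigning a task label.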

Recommended citation: H. Wu, S. Li, and D. Wu, "Motor Imagery Classification for Asynchronous EEG-Based Brain-Computer Interfaces," IEEE Trans. Neural Syst. Rehabil. Eng., 32:527–536, 2024.
Download Paper

Unsupervised Domain Adaptation for Cross-Patient Seizure Classification

Published in Journal of Neural Engineering, 2023

Epileptic seizure is a chronic neurological disease affecting millions of patients. Electroencephalogram (EEG) is the gold standard in epileptic seizure classification. However, its low signal-to-noise ratio, strong non-stationarity, and large individual difference nature make it difficult to directly extend the seizure classification model from one patient to another. This paper considers multi-source unsupervised domain adaptation for cross-patient EEG-based seizure classification, i.e. there are multiple source patients with labeled EEG data, which are used to label the EEG trials of a new patient. We propose a source domain selection (SDS)-global domain adaptation (GDA)-target agent subdomain adaptation (TASA) approach, which includes SDS to filter out dissimilar source domains, GDA to align the overall distributions of the selected source domains and the target domain, and TASA to identify the most similar source domain to the target domain so that its labels can be utilized. Experiments on two public seizure datasets demonstrated that SDS-GDA-TASA outperformed 13 existing approaches in unsupervised cross-patient seizure classification. Our approach could save clinicians plenty of time in labeling EEG data for epilepsy patients, greatly increasing the efficiency of seizure diagnostics.

Recommended citation: Z. Wang, W. Zhang, S. Li, X. Chen, and D. Wu, "Unsupervised Domain Adaptation for Cross-Patient Seizure Classification," J. Neural Eng., 20(6):066002, 2023.
Download Paper

Active Poisoning: Efficient Backdoor Attacks on Transfer Learning-based Brain-Computer Interfaces

Published in Science China Information Sciences, 2023

Transfer learning (TL) has been widely used in electroencephalogram (EEG)-based brain-computer interfaces (BCIs) for reducing calibration efforts. However, backdoor attacks could be introduced through TL. In such attacks, an attacker embeds a backdoor with a specific pattern into the machine learning model. As a result, the model will misclassify a test sample with the backdoor trigger into a prespecified class while still maintaining good performance on benign samples. Accordingly, this study explores backdoor attacks in the TL of EEG-based BCIs, where source-domain data are poisoned by a backdoor trigger and then used in TL. We propose several active poisoning approaches to select source-domain samples, which are most effective in embedding the backdoor pattern, to improve the attack success rate and efficiency. Experiments on four EEG datasets and three deep learning models demonstrate the effectiveness of the approaches. To our knowledge, this is the first study about backdoor attacks on TL models in EEG-based BCIs. It exposes a serious security risk in BCIs, which should be immediately addressed.

Recommended citation: X. Jiang, L. Meng, S. Li, and D. Wu, "Active Poisoning: Efficient Backdoor Attacks on Transfer Learning-based Brain-Computer Interfaces," Sci. China Inf. Sci., 66(8):1–22, 2023.
Download Paper

Meta-Learning for Fast and Privacy-Preserving Source Knowledge Transfer of EEG-Based BCIs

Published in IEEE Computational Intelligence Magazine, 2022

Electroencephalogram (EEG) based brain-computer interfaces (BCIs) are used in many applications, due to their low risk, low cost, and convenience. Because of EEG’s high variations across subjects and sessions, a long calibration session is usually needed to adjust the system before each use, which is time-consuming and user-unfriendly. Though various machine learning approaches have been proposed to cope with this problem, none of them considered individual differences, data scarcity and data privacy simultaneously. In this paper, a Multi-Domain Model-Agnostic Meta-Learning (MDMAML) approach is proposed to address challenging cross-subject, few-shot and source-free (privacy protection) classification tasks in EEG-based BCIs. Experiments on four datasets from two different BCI paradigms demonstrated that MDMAML outperformed several classical and state-of-the-art approaches in both online and offline applications.

Recommended citation: S. Li, H. Wu, L. Ding, and D. Wu, "Meta-Learning for Fast and Privacy-Preserving Source Knowledge Transfer of EEG-Based BCIs," IEEE Comput. Intell. Mag., 17(4):16–26, 2022.
Download Paper

Conference Papers

TMMM: Transformer in Multimodal Sentiment Analysis under Missing Modalities

Published in IJCNN, 2024

The cross-task gap presents a significant challenge for multimodal models because of the differences in input-output workflows. For instance, multimodal pre-trained transformers may encounter uni-modal data during testing. To mitigate the gap, this paper introduces Transformer in Multimodal Sentiment Analysis under Missing Modalities (TMMM), which aims to perform well when modalities are missing during testing. TMMM uses a missing multimodal training approach to prevent accuracy degradation in testing. At the same time, a new network architecture allows the model to reconstruct missing modalities during testing. Classification token fusion and Mixture-of-Experts structures further enhance the model’s performance. A pre-training method utilizing contrastive learning, which can construct negative samples with positive samples, is proposed to overcome insufficient labeled data. Our experiments demonstrated the effectiveness of TMMM on two datasets with no modalities missing, i.e., it consistently achieved the highest classification accuracy and Macro-F1, outperforming the best state-of-the-art baseline on each dataset by about 2% and 2.5%, respectively. Additionally, TMMM usually performs better than other baselines on datasets with missing modalities during testing.

Recommended citation: H. Wu, S. Li, and D. Wu, "TMMM: Transformer in Multimodal Sentiment Analysis under Missing Modalities," in IJCNN, Jun. 2024.
Download Paper

Low-Resource Machine Translation Training Curriculum Fit for Low-Resource Languages

Published in PRICAI, 2023

We conduct an empirical study of neural machine translation (NMT) for truly low-resource languages, and present a training curriculum fit for cases when both parallel training data and compute resources are lacking, reflecting the reality of most of the world’s languages and the researchers working on these languages. Previously, unsupervised NMT, which employs back-translation (BT) and auto-encoding (AE) tasks, has been shown to be barren for low-resource languages. We demonstrate that leveraging comparable data and code-switching as weak supervision, combined with pre-training with BT and AE objectives, results in remarkable improvements for low-resource languages even when using only modest compute resources. The training curriculum proposed in this work achieves BLEU scores that improve over supervised NMT trained on the same backbone architecture, showcasing the potential of weakly-supervised NMT for low-resource languages.

Recommended citation: G. Kuwanto, A. F. Akyürek, I. C. Tourni, S. Li, A. Jones, and D. Wijaya, "Low-Resource Machine Translation Training Curriculum Fit for Low-Resource Languages," in PRICAI, Nov. 2023.
Download Paper

Facial Expression Recognition In-the-Wild with Deep Pre-trained Models

Published in ECCV Workshops, 2023

Facial expression recognition (FER) is challenging when transitioning from the laboratory to in-the-wild situations. In this paper, we present a general framework for the Learning from Synthetic Data Challenge in the 4th Affective Behavior Analysis In-The-Wild (ABAW4) competition, to learn as much knowledge as possible from synthetic faces with expressions. To cope with four problems in training robust deep FER models, including uncertain labels, class imbalance, mismatch between pretraining and downstream tasks, and incapability of a single model structure, our framework consists of four respective modules, which can be utilized for FER in-the-wild. Experimental results on the official validation set from the competition demonstrated that our proposed approach outperformed the baseline by a large margin.

Recommended citation: S. Li, Y. Xu, H. Wu, D. Wu, Y. Yin, J. Cao, and J. Ding, "Facial Expression Recognition In-the-Wild with Deep Pre-trained Models," in ECCV Workshops, Oct. 2022.
Download Paper

Siyang Li (李思扬)
