Research / 03

Preprints, papers,
and ongoing work.

Three preprints spanning interpretable clinical NLP, mechanistic safety of tool-using LLMs, and few-shot fault diagnosis under data scarcity.

01 / Themes

/01

Healthcare AI

Interpretable clinical NLP, concept-grounded diagnosis.

/02

LLM Safety

Channel-specific vulnerability, mechanistic interpretability.

/03

Applied Generative ML

Few-shot diagnosis, augmentation under data scarcity.

02 / Preprints (3)


Preprint · 2025

Healthcare AI · Interpretability

ShifaMind: A Multiplicative Concept Bottleneck for Interpretable ICD-10 Coding

Mohammed Sameer Syed, Xuan Lu · University of Arizona

Automated ICD-10 coding from clinical discharge summaries requires models that are both accurate on long-tailed multi-label classification tasks and interpretable to clinicians. We present ShifaMind, a concept-grounded architecture built around a Multiplicative Concept Bottleneck (MCB), which changes the form, rather than the width, of the bottleneck. Instead of projecting through a narrow concept layer, ShifaMind applies a learned multiplicative gate to a concept-grounded representation while retaining a scalar concept interface for inspection. On MIMIC-IV top-50 ICD-10 coding, ShifaMind is competitive with the strongest baseline, LAAT, across F1, AUC, and ranking metrics, outperforms five additional ICD-coding baselines, and provides concept-mediated explanations.
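The gating idea in the abstract can be illustrated with a small, dependency-free sketch. This is not the ShifaMind implementation: the function name, the weight matrices (`W_concept`, `W_gate`, `W_out`), the use of sigmoids, and the score-weighted sum of concept embeddings are all assumptions chosen only to show one way a multiplicative gate can sit on top of a concept-grounded representation while scalar concept scores stay available for inspection.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def multiplicative_concept_bottleneck(h, W_concept, concept_embeddings, W_gate, W_out):
    """Hypothetical forward pass; every parameter name here is illustrative."""
    # Scalar concept scores, one per concept: the inspectable interface.
    concept_scores = [sigmoid(s) for s in matvec(W_concept, h)]
    # Concept-grounded representation: score-weighted sum of concept embeddings
    # (an assumed construction, not taken from the paper).
    dim = len(concept_embeddings[0])
    grounded = [
        sum(c * emb[j] for c, emb in zip(concept_scores, concept_embeddings))
        for j in range(dim)
    ]
    # Learned gate computed from the encoder state, applied elementwise.
    gate = [sigmoid(g) for g in matvec(W_gate, h)]
    gated = [g * r for g, r in zip(gate, grounded)]
    # One logit per label (multi-label coding), before any thresholding.
    logits = matvec(W_out, gated)
    return concept_scores, logits

# Toy inputs: 2-dim encoder state, 3 concepts, 2 labels. Values are made up.
h = [0.5, -1.0]
W_concept = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
concept_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_gate = [[0.2, 0.1], [-0.3, 0.4]]
W_out = [[1.0, -1.0], [0.5, 0.5]]

concept_scores, logits = multiplicative_concept_bottleneck(
    h, W_concept, concept_embeddings, W_gate, W_out
)
```

In this sketch the representation keeps its full width and the gate only rescales it, which is one reading of "changes the form, rather than the width, of the bottleneck"; the scores in `concept_scores` are what a clinician-facing view would display.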

0.712

Macro-F1

MIMIC-IV top-50

4.3×

over Vanilla CBM

0.704

CSTPR

Concept Bottleneck · Clinical NLP · ICD-10 · MIMIC-IV · Interpretability

Preprint #01


Preprint · 2025

LLM Safety · Mechanistic Interpretability

Same Payload, Different Channel: Measuring Trust Asymmetry in Tool-Using Language Models

Mohammed Sameer Syed, Rozhin Yasaei · University of Arizona

As language models take on agentic roles that span calling external APIs, reading tool outputs, and acting on instructions embedded in third-party content, their attack surface expands well beyond what users type. We introduce the Safety Asymmetry Score (SAS), which measures how much a model's susceptibility to adversarial content shifts depending on whether that content arrives in the user message, tool metadata, or tool output. It uses matched payload pairs that keep the malicious text identical and vary only the delivery context. Across six production LLMs and three attack families, agent-native models are substantially more vulnerable when adversarial content arrives via tool descriptions than via user messages, while general-purpose models show the reverse. A mechanistic study on Llama 3.3 70B reveals that the safety-relevant representation is causally present at mid-to-late network depths but non-linearly encoded, explaining why linear probes fail to detect it.
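The matched-payload comparison can be sketched as a difference in attack success rates between two delivery channels, reported in percentage points. The exact SAS formula is not given on this page, so the difference form below, the function names, the channel keys, and the toy outcomes are all assumptions for illustration, not results from the paper.

```python
def attack_success_rate(outcomes):
    # Fraction of cases in which the attack succeeded (True).
    return sum(outcomes) / len(outcomes)

def safety_asymmetry_pp(matched_cases, channel_a, channel_b):
    """Assumed form of an asymmetry score: ASR(channel_a) - ASR(channel_b), in pp.

    Each case holds the outcome of the *same* payload delivered through
    different channels, so only the delivery context varies.
    """
    asr_a = attack_success_rate([case[channel_a] for case in matched_cases])
    asr_b = attack_success_rate([case[channel_b] for case in matched_cases])
    return 100.0 * (asr_a - asr_b)

# Made-up outcomes for four matched payloads (True = model complied).
toy_cases = [
    {"user_message": False, "tool_description": True},
    {"user_message": False, "tool_description": True},
    {"user_message": True, "tool_description": True},
    {"user_message": False, "tool_description": False},
]

gap = safety_asymmetry_pp(toy_cases, "tool_description", "user_message")
```

Under this convention a positive `gap` means the model is more susceptible through the tool-description channel than through the user message, and a negative value means the reverse.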

+30.4 pp

Group SAS gap

6 / 98

Models · cases

ρ = 0.54

vs MCPTox

LLM Safety · Tool Use · MCP · Activation Patching · Llama 3.3

Preprint #02


Submission Pending · 2025

Power Systems · Generative Models

SpectralGAN-Augmented Transformer Neural Network for Power Transformer Winding Fault Diagnosis via Frequency Response Analysis

Mohammed Sameer Syed, Mohammed Sohail Syed · IEEE Transactions on Power Delivery · University of Arizona

Accurate classification of power transformer winding deformation faults from frequency response analysis (FRA) measurements is constrained by the fundamental scarcity of labelled fault data. This paper presents a three-stage diagnostic pipeline: 48-dimensional indicator vectors are extracted from IEC-standard sub-bands; SpectralGAN, a conditional WGAN-GP with spectral normalisation on every critic layer, synthesises class-conditional vectors from as few as 20 training samples per fold; FRATransformer, a lightweight multi-head self-attention classifier, classifies a mixed corpus of real, jittered, and GAN-generated samples. Under strict 21-fold LOOCV on a 21-sample dataset spanning healthy, axial displacement, and radial deformation classes, the pipeline achieves 85.7% accuracy and macro F1 = 0.838, a +23.8 pp gain over the best SVM baseline.
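The evaluation protocol can be made concrete with a short sketch of leave-one-out splitting on 21 samples. It assumes accuracy is pooled over the 21 held-out predictions; under that assumption each prediction is worth 1/21 (about 4.8 pp), so 85.7% corresponds to 18 of 21 correct and a +23.8 pp gain corresponds to 5 predictions. Those counts are inferred from the arithmetic, not stated in the abstract, and the function names are illustrative.

```python
def leave_one_out_splits(n_samples):
    # Yield (train_indices, test_indices) with exactly one held-out sample per fold.
    for held_out in range(n_samples):
        train = [i for i in range(n_samples) if i != held_out]
        yield train, [held_out]

def pooled_accuracy_pct(n_correct, n_samples):
    # Accuracy over all held-out predictions, as a percentage rounded to 1 decimal.
    return round(100.0 * n_correct / n_samples, 1)

N_SAMPLES = 21
splits = list(leave_one_out_splits(N_SAMPLES))

# Every fold trains on 20 samples and tests on 1. Any augmentation (jitter,
# GAN synthesis) would need to be fitted on the 20 training indices only,
# so the held-out sample never influences the generator.
fold_sizes = {(len(train), len(test)) for train, test in splits}

accuracy_18_of_21 = pooled_accuracy_pct(18, N_SAMPLES)
gain_5_of_21 = pooled_accuracy_pct(5, N_SAMPLES)
```

With so few samples, a single changed prediction moves accuracy by nearly five points, which is worth keeping in mind when comparing against the SVM baseline.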

85.7%

Accuracy

21-fold LOOCV

+23.8 pp

over SVM baseline

0.838

Macro F1

WGAN-GP · Spectral Normalisation · Self-Attention · FRA · Few-Shot

Preprint #03
