Presentation Schedule
Written on 27.10.2025 09:06 by Ziqing Yang
Dear all,
After receiving your responses, we have arranged the presentation schedule below.
Starting on November 4th, every Tuesday from 2 pm to 3 pm, two presenters will each introduce their chosen paper.
04.11.2025
- Rishika Kumari, PLeak: Prompt Leaking Attacks against Large Language Model Applications
- Prachi Sajwan, I Don't Know If We're Doing Good. I Don't Know If We're Doing Bad: Investigating How Practitioners Scope, Motivate, and Conduct Privacy Work When Developing AI Products

11.11.2025
- Manu Vyshnavam Viswakarmav, Unveiling Privacy Risks in LLM Agent Memory
- Ansu Varghese, Do Anything Now: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models

18.11.2025
- Tianze Chang, Universal and Transferable Adversarial Attacks on Aligned Language Models
- Farzaneh Soltanzadeh, JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks Against LLMs

25.11.2025
- Syed Usfar Wasim, Benchmarking and Defending against Indirect Prompt Injection Attacks on Large Language Models
- Shreya Atul Kolhapure, Formalizing and Benchmarking Prompt Injection Attacks and Defenses

02.12.2025
- Tarik Kemal Gundogdu, From Meme to Threat: On the Hateful Meme Understanding and Induced Hateful Content Generation in Open-Source Vision Language Models
- Mengfei Liang, Hate in Plain Sight: On the Risks of Moderating AI-Generated Hateful Illusions

09.12.2025
- Elena Bondarevskaya, On the Proactive Generation of Unsafe Images From Text-To-Image Models Using Benign Prompts
- Xinyu Zhang, HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns

16.12.2025
- Daniyal Azfar, Safety Alignment Should Be Made More Than Just a Few Tokens Deep
- Shaun Paul, Societal Alignment Frameworks Can Improve LLM Alignment
Best,
Ziqing
