
Opportunities and Risks of Large Language Models and Foundation Models

The advent of large language models (e.g. ChatGPT) and other foundation models (e.g. Stable Diffusion) has changed, and will continue to change, the way AI/ML applications are developed and deployed.

On the one hand, these models show unprecedented performance and can often be adapted to new tasks with little effort. In particular, large language models like ChatGPT have the potential to change the way we implement and deploy functionality.

On the other hand, these models raise several questions about safety, security, and trustworthiness in general that urgently need to be addressed if future AI systems are to meet our high expectations.

Therefore, this seminar will investigate aspects of trustworthiness, security, safety, privacy, robustness, and intellectual property.

This seminar takes place in the context of ELSA, the European Lighthouse on Secure and Safe AI: https://elsa-ai.eu

 

Date and time are still to be confirmed. Tentatively, the seminar is scheduled for Tuesdays, 8:30am to 10am.

June 4th 170104016 2.2 Capabilities Are Difficult to Estimate and Understand
June 4th 7048112 2.3 Effects of Scale on Capabilities Are Not Well-Characterized
June 11th 7026710 2.4 Qualitative Understanding of Reasoning Capabilities Is Lacking
June 11th 7047907 3.2 Finetuning Methods Struggle to Assure Alignment and Safety
June 18th 7057874 3.4 Tools for Interpreting or Explaining LLM Behavior Are Absent or Lack Faithfulness
June 18th 7057356 3.5 Jailbreaks and Prompt Injections Threaten Security of LLMs
July 9th 7056005 4.1 Values to Be Encoded within LLMs Are Not Clear
July 9th 7022885 4.2 Dual-Use Capabilities Enable Malicious Use and Misuse of LLMs
July 16th 7062138 4.3 LLM-Systems Can Be Untrustworthy

 

Video recording of kick-off session

Kick-off slides

Presentation tutorial slides

 

Literature and Resources:

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

OWASP Top 10 for Large Language Model Applications

MITRE ATLAS Matrix

NIST AI 100-2 E2023: Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations

Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
