News
Grades in LSF (written on 01.03.21, last changed on 01.03.21, by Cristian-Alexandru Staicu): Dear all, please double-check that you were assigned the grade for the seminar in LSF and let me know if you have any complaints, questions, or feedback. Thank you very much for choosing this seminar. I hope that by doing so you got a clearer understanding of existing state-of-the-art vulnerability detection techniques, in particular the program analysis-based ones. I also hope that the individualized feedback I sent each of you, both after the presentation and after the submission of the report/draft, will help you improve your scientific writing and presentation skills. I wish you all the best with the rest of your studies. Best, Cris
Zoom meeting invitation (written on 05.11.20 by Cristian-Alexandru Staicu): Please check your emails (including the spam folder) to find the invitation for the Zoom meeting.
Kick off the seminar (written on 02.11.20 by Cristian-Alexandru Staicu): Hey all, thanks for choosing this seminar. Please vote in the following Doodle so we can agree on a time slot for the seminar (on Thursdays): https://doodle.com/poll/r8mwuf49t84yf6rc?utm_source=poll&utm_medium=link Best, Cris. P.S. I also sent you an email with additional information earlier today.
Description
Program analysis is a mature research area at the intersection of programming languages, formal methods, and software engineering. One of its main applications is automatic vulnerability detection. However, the complexity of modern systems is overwhelming, and the vulnerabilities to be detected are increasingly sophisticated. To account for these particularities, many recent approaches advocate lightweight program analysis techniques or hybrid methods, i.e., combinations of static and dynamic analysis. This seminar explores the trade-offs involved in designing a program analysis that scales to analyzing the security of real systems. We will discuss recent research papers in the area in a reading-group format. Each week, one student will present papers covering a given topic, followed by a discussion. All participants are expected to take an active part in the discussion by asking questions.
Logistics
Instructor: Cristian-Alexandru Staicu
Time: Thursday, 15:00 (3pm)
Location: Zoom (Disclaimer) - the link to the recurring meeting was sent by email.
Semester Plan
- 5th of November - kick-off meeting,
- 12th of November - Paul Krappen, Vulnerabilities in low-level programs,
- 19th of November - Raoul Scholtes, Vulnerabilities in web applications,
- 26th of November - Pit Jost, Detecting misuses of crypto APIs,
- 3rd of December - Tristan Hornetz, Removing vulnerabilities through debloating,
- 10th of December - Jeremy Rack, Automatic patching of vulnerabilities,
- 17th of December - Banji Olorundare, Fuzzing compilers and engines,
- 7th of January - Tim Walita, Vulnerabilities in software components and dependencies,
- 14th of January - Jonathan Busch, Vulnerabilities in mobile apps,
- 21st of January - Dominic Troppmann, Vulnerability prediction,
- 28th of January - Muhammad Bilal Latif, Machine learning-aided vulnerability detection,
- 4th of February - Dominik Sautter, Availability vulnerabilities.
Grading system
The final grade is an aggregate of the following parts; both the presentation and the final report are mandatory:
- 50% the final report,
- 50% the presentation,
- bonus: up to 15% for being active in class,
- bonus: up to 15% for the hands-on exercise.
Supporting Materials
Please find below a set of useful materials for the seminar:
- The kick-off presentation's slides contain useful information about the structure and goals of this seminar, as well as some required background for the assigned papers.
- Sample presentation 1 - you should aim for this much content when presenting each of the assigned papers (approx. 10 minutes). See the kick-off presentation for the recommended presentation structure.
- Sample presentation 2 - a slightly longer presentation (approx. 15 minutes).
- Consider using the following template for the report and its associated sources.
Topics
- Vulnerabilities in web applications
- NAVEX: precise and scalable exploit generation for dynamic web applications, USENIX Security 2018,
- Nodest: feedback-driven static analysis of Node.js applications, FSE 2019,
- [optional] hands-on exercise: study the NoSQL vulnerability in CVE-2017-100049, and explain how it works; ideally, provide a PoC exploit. A sketch of the general injection pattern follows below.
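To get started on the PoC, the following sketch (not the actual code behind the CVE; route, database, and collection names are made up) illustrates the general NoSQL-injection pattern in a Node.js/Express handler that forwards a request field directly into a MongoDB query:

```typescript
// Minimal illustration of the general NoSQL-injection pattern, not the code
// behind CVE-2017-100049. Assumes Express and the official MongoDB driver.
import express from "express";
import { MongoClient } from "mongodb";

async function main() {
  const client = new MongoClient("mongodb://localhost:27017");
  await client.connect();
  const users = client.db("demo").collection("users");

  const app = express();
  app.use(express.json());

  // VULNERABLE: req.body.password may be an object such as {"$ne": ""},
  // which matches any stored password and bypasses the login check.
  app.post("/login", async (req, res) => {
    const user = await users.findOne({
      name: req.body.username,
      password: req.body.password, // attacker-controlled, never validated
    });
    res.send(user ? "logged in" : "rejected");
  });

  // One possible mitigation: reject non-string credentials before querying.
  app.listen(3000);
}

main();
```

Sending `{"username": "admin", "password": {"$ne": ""}}` as the JSON body then logs in without knowing any password.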
- Vulnerabilities in software components and dependencies
- Thou shalt not depend on me: analysing the use of outdated JavaScript libraries on the web, NDSS 2017,
- Beyond metadata: code-centric and usage-based analysis of known vulnerabilities in open-source software, ICSME 2018,
- [optional] hands-on exercise: study the prototype pollution vulnerability in CVE-2020-8203, and explain how it works. Build two client applications: one that safely uses lodash and one that is affected by the vulnerability. How do the two approaches differ in alerting the two client applications about this vulnerability? A sketch of the pollution pattern follows below.
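As a starting point, the sketch below shows the trigger that is commonly reported for CVE-2020-8203, namely lodash's zipObjectDeep; it only has the shown effect on a lodash version from before the fix (e.g., 4.17.15, which this sketch assumes is installed):

```typescript
// Hedged sketch of the prototype-pollution trigger commonly reported for
// CVE-2020-8203; assumes a lodash version from before the fix (e.g., 4.17.15).
import _ from "lodash";

// Before the call, plain objects have no "polluted" property.
console.log(({} as any).polluted); // undefined

// zipObjectDeep writes through the "__proto__" path, so the assignment lands
// on Object.prototype instead of on the freshly created object.
_.zipObjectDeep(["__proto__.polluted"], [true]);

// Every object in the process is now affected (on vulnerable versions).
console.log(({} as any).polluted); // true
```

A safe client would pin a fixed lodash version or sanitize the keys it passes to such deep-assignment helpers.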
- Vulnerabilities in mobile apps
- FIRMSCOPE: automatic uncovering of privilege-escalation vulnerabilities in pre-installed apps in Android firmware, USENIX Security 2020,
- Iframes/popups are dangerous in mobile WebView: studying and mitigating differential context vulnerabilities, USENIX Security 2019,
- [optional] hands-on exercise: create a simple Android application that loads third-party code in a WebView. Try to access sensitive web APIs such as the Geolocation API or the Sensor API. If access is allowed, is the user alerted that such APIs are accessed? A sketch of the probing code such a third-party page could run follows below.
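The Android side of the exercise is not sketched here; the snippet below only shows the kind of probing code a third-party page loaded into the WebView could run against the Geolocation API and the motion sensors. Whether the calls succeed, and whether the user sees any prompt, depends on the host app's WebView configuration:

```typescript
// Probing code for a third-party page loaded inside the WebView. Whether these
// calls succeed (and whether any prompt is shown) depends on the host app.
function probeGeolocation(): void {
  navigator.geolocation.getCurrentPosition(
    (pos) => console.log("location:", pos.coords.latitude, pos.coords.longitude),
    (err) => console.log("geolocation denied:", err.message)
  );
}

function probeMotionSensors(): void {
  // devicemotion events are often delivered without any permission prompt.
  window.addEventListener("devicemotion", (e) =>
    console.log("acceleration:", e.acceleration)
  );
}

probeGeolocation();
probeMotionSensors();
```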
- Detecting misuses of crypto APIs
- CryptoGuard: high precision detection of cryptographic vulnerabilities in massive-sized Java projects, CCS 2019,
- CrySL: an extensible approach to validating the correct usage of cryptographic APIs, ECOOP 2018,
- [optional] hands-on exercise: analyze Apache Ranger's fix (here and here) in response to the CryptoGuard findings. Extract a minimal working example, using only the crypto API, that constructs an MD5 hash (old) and a SHA-1 hash (new). Hash the same password with the two approaches and discuss how the length of the resulting hashes differs; a sketch comparing the two digest lengths follows below.
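The Ranger fix itself is Java code; purely to illustrate the digest-length comparison asked for above, here is a small sketch that uses Node's built-in crypto module instead of the Java crypto API:

```typescript
// MD5 vs. SHA-1 digest lengths, illustrated with Node's crypto module (the
// actual Apache Ranger fix is Java code). The digest length is fixed by the
// algorithm and does not depend on the password's length.
import { createHash } from "crypto";

const password = "correct horse battery staple";

const md5 = createHash("md5").update(password).digest("hex");   // 32 hex chars
const sha1 = createHash("sha1").update(password).digest("hex"); // 40 hex chars

console.log("MD5  :", md5, `(${md5.length * 4} bits)`);   // 128-bit digest
console.log("SHA-1:", sha1, `(${sha1.length * 4} bits)`); // 160-bit digest
```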
- Vulnerabilities in low-level programs
- K-Miner: uncovering memory corruption in Linux, NDSS 2018,
- Symbolic execution with SymCC: don't interpret, compile!, USENIX Security 2020,
- [optional] hands-on exercise: install and run SymCC on the example provided in the repo. Evaluate its scalability by running it on your favorite C programs (a few hundred or thousand lines of code). Report any interesting findings.
- Fuzzing low-level programs
- QSYM: a practical concolic execution engine tailored for hybrid fuzzing, USENIX Security 2018,
- Redqueen: fuzzing with input-to-state correspondence, NDSS 2019,
- [optional] hands-on exercise: study one of the vulnerabilities found by Redqueen, e.g., CVE-2018-14567, and explain how it works; ideally, provide a PoC exploit.
- Fuzzing compilers and engines
- Fuzzing JavaScript engines with aspect-preserving mutation, S&P 2020,
- Montage: A neural network language model-guided JavaScript engine fuzzer, USENIX Security 2020,
- [optional] hands-on exercise: study one of the vulnerabilities found by Montage, e.g., CVE-2019-0860, and explain how it works; ideally, provide a PoC exploit.
- Machine learning-aided vulnerability detection
- Neutaint: efficient dynamic taint analysis with neural networks, S&P 2020,
- Scalable taint specification inference with big code, PLDI 2019,
- [optional] hands-on exercise: study one of the vulnerabilities found by the presented extension of DeepCode, e.g., this path traversal, and explain how it works; ideally, provide a PoC exploit or a fix. A generic path-traversal sketch follows below.
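The snippet below is not the code flagged by the analysis; it is a generic Express file-serving handler (route and directory names made up) that shows the path-traversal pattern and one possible fix:

```typescript
// Generic path-traversal pattern and a possible fix; route and directory
// names are made up and unrelated to the actual finding.
import express from "express";
import fs from "fs";
import path from "path";

const app = express();
const BASE_DIR = path.resolve("./public");

// VULNERABLE: GET /download?file=../../etc/passwd escapes BASE_DIR because
// the joined path is never checked against the base directory.
app.get("/download", (req, res) => {
  const requested = String(req.query.file ?? "");
  fs.createReadStream(path.join(BASE_DIR, requested)).pipe(res);
});

// Possible fix: resolve the path and reject anything outside BASE_DIR.
app.get("/download-safe", (req, res) => {
  const resolved = path.resolve(BASE_DIR, String(req.query.file ?? ""));
  if (!resolved.startsWith(BASE_DIR + path.sep)) {
    res.status(400).send("invalid file name");
    return;
  }
  fs.createReadStream(resolved).pipe(res);
});

app.listen(3000);
```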
- Availability vulnerabilities
- Static detection of DoS vulnerabilities in programs that use regular expressions, TACAS 2017,
- SlowFuzz: automated domain-independent detection of algorithmic complexity vulnerabilities, CCS 2017,
- [optional] hands-on exercise: study the ReDoS vulnerability in CVE-2017-16119, and explain how the associated PoC exploit works. Show how it can affect a vulnerable express server installed on your machine. A generic ReDoS sketch follows below.
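The regex below is a textbook catastrophic-backtracking example, not the one from CVE-2017-16119; the sketch only shows how to measure the super-linear matching time that characterizes ReDoS:

```typescript
// Generic ReDoS demonstration: a nested quantifier plus a non-matching suffix
// causes catastrophic backtracking; matching time roughly doubles per extra "a".
const evilRegex = /^(a+)+$/;

for (const n of [10, 18, 22, 26]) {
  const input = "a".repeat(n) + "!"; // the trailing "!" forces backtracking
  const start = Date.now();
  evilRegex.test(input);
  console.log(`n=${n}: ${Date.now() - start} ms`);
}
```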
- Automatic patching of vulnerabilities
- Automating patching of vulnerable open-source software versions in application binaries, NDSS 2019,
- VuRLE: automatic vulnerability detection and repair by learning from examples, ESORICS 2017,
- [optional] hands-on exercise: Write a small tool that updates vulnerable dependencies in package.json files. The tool should support 5-10 vulnerable packages from this list. A sketch of such a tool follows below.
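A possible starting point is sketched below; the advisory table is a placeholder to be filled from the linked list (the entries and fixed versions shown are illustrative assumptions), and the sketch assumes the semver package for version-range checks:

```typescript
// Sketch of the dependency-updating tool. The advisory table is a placeholder;
// populate it from the advisory list linked above. Assumes "semver" is installed.
import fs from "fs";
import semver from "semver";

// package name -> { vulnerable version range, first known-safe version }
const ADVISORIES: Record<string, { vulnerable: string; fixed: string }> = {
  lodash: { vulnerable: "<4.17.20", fixed: "4.17.20" },    // illustrative entry
  "example-pkg": { vulnerable: "<1.2.3", fixed: "1.2.3" }, // hypothetical entry
};

export function patchPackageJson(file: string): void {
  const pkg = JSON.parse(fs.readFileSync(file, "utf8"));
  const deps: Record<string, string> = pkg.dependencies ?? {};

  for (const [name, range] of Object.entries(deps)) {
    const advisory = ADVISORIES[name];
    if (!advisory) continue;
    // minVersion() is the lowest version the declared range allows; if it falls
    // inside the vulnerable range, bump the dependency to the fixed version.
    const min = semver.minVersion(range);
    if (min && semver.satisfies(min.version, advisory.vulnerable)) {
      console.log(`${name}: ${range} -> ^${advisory.fixed}`);
      deps[name] = `^${advisory.fixed}`;
    }
  }

  pkg.dependencies = deps;
  fs.writeFileSync(file, JSON.stringify(pkg, null, 2) + "\n");
}

patchPackageJson("package.json");
```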
- Removing vulnerabilities through debloating
- Razor: a framework for post-deployment software debloating, USENIX Security 2019,
- Less is more: quantifying the security benefits of debloating web applications, USENIX Security 2019,
- [optional] hands-on exercise: Write a compiler pass, e.g., as a Babel plugin or an Esprima traversal, to find calls to lodash.defaultsDeep. Run it on a few examples that use various lodash methods. A sketch of such a pass follows below.
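One way to approach this is sketched below, using Babel's parser and traversal API (a full Babel plugin would use the same visitor); the helper name is made up, and the pass only recognizes direct `_.defaultsDeep`/`lodash.defaultsDeep` calls, not aliased imports:

```typescript
// Sketch of the compiler pass with @babel/core. The function name is made up;
// only direct _.defaultsDeep / lodash.defaultsDeep call sites are reported.
import { parseSync, traverse } from "@babel/core";
import * as t from "@babel/types";

export function findDefaultsDeepCalls(code: string): number[] {
  const lines: number[] = [];
  const ast = parseSync(code, { configFile: false });
  if (!ast) return lines;

  traverse(ast, {
    CallExpression(path) {
      const callee = path.node.callee;
      if (
        t.isMemberExpression(callee) &&
        t.isIdentifier(callee.object) &&
        ["_", "lodash"].includes(callee.object.name) &&
        t.isIdentifier(callee.property, { name: "defaultsDeep" })
      ) {
        lines.push(path.node.loc?.start.line ?? -1);
      }
    },
  });
  return lines;
}

// Example run on a snippet that mixes several lodash methods.
console.log(
  findDefaultsDeepCalls(`
    const _ = require("lodash");
    _.merge({}, {});
    _.defaultsDeep({}, { a: 1 });
  `)
);
```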
- Vulnerability prediction
- Leopard: identifying vulnerable code for vulnerability assessment through program metrics, ICSE 2019,
- The importance of accounting for real-world labeling when predicting software vulnerabilities, FSE 2019,
- [optional] hands-on exercise: Analyze three CVEs found by Leopard and their fixes. Using your best judgment, how many of the metrics used by Leopard change after the fix?