Decompiler-Resistant Code (1-2 Persons)
Program decompilers have become an essential tool for malware analysts. Given disassembled instructions, they translate the assembly code into a higher-level programming language (such as C). In general, malware aims to evade such analysis by obfuscating its program code. While so-called packing has been well studied and is prevalent in malware nowadays, once malware is unpacked, it can usually be analyzed using off-the-shelf decompilers. This project should study how even unpacked x64 programs could in principle evade decompilers, e.g., by hiding function boundaries, parameters, and local variables. The nature of this project is rather offensive, but the study should also include an assessment of potential countermeasures.
Project requirements: Knowledge of and strong interest in x64 (dis)assembly and program analysis
Detecting Website Content and Policy Modifications (1 Person)
Many tools modify the websites a user visits. Security solutions such as anti-virus software, as well as browser plugins and malware, inject code into a website, modify its content in the browser, or even modify its Content Security Policy. In this project, you will build a detection mechanism for these modifications, deploy it, and identify the programs responsible. In the end, you will find out how big this problem actually is.
Side Channels Based on DNS Cache Snooping (1 Person)
DNS cache snooping allows a client to inspect whether a particular domain has been cached at a resolver, potentially by another client. In principle, snooping can thus be used as a purely DNS-based side channel that allows two virtual machines (VMs) to communicate if they share the same DNS resolver. This project will prototype such a side channel and assess the goodput it can achieve. While building the channel itself is straightforward, one particular focus should be stealthiness: How can two parties communicate with as few queries as possible while maximizing their communication bandwidth?
Project requirements: Knowledge of and deep interest in DNS (e.g., from CySec II); programming skills (ideally Python, but C/C++ also works)
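The encoding idea behind such a channel can be sketched without any network traffic. The following is a minimal simulation, not a working attack: `SimulatedResolver` is a toy stand-in for a shared caching resolver, and the per-bit domain naming scheme in `slot_name` is purely hypothetical. The sender caches one domain for every 1 bit, and the receiver probes each slot with a query that is answered only from the cache.

```python
class SimulatedResolver:
    """Toy stand-in for a shared caching resolver (hypothetical, not a real DNS API)."""

    def __init__(self):
        self.cache = set()
        self.query_count = 0

    def query(self, domain, recursion_desired=True):
        """Return True if the resolver can answer the query.

        A recursive query always succeeds and caches the domain as a side effect.
        A non-recursive query is answered only from the cache, which enables snooping.
        """
        self.query_count += 1
        if recursion_desired:
            self.cache.add(domain)
            return True
        return domain in self.cache


def slot_name(message_id, index):
    # Assumption: one unique domain per bit position of a message.
    return f"m{message_id}-b{index}.example.test"


def to_bits(data):
    """Split bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def from_bits(bits):
    """Reassemble bytes from a list of bits, most significant bit first."""
    out = bytearray()
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def send(resolver, message_id, data):
    """Sender: cache the slot domain for every 1 bit; issue no query for 0 bits."""
    for index, bit in enumerate(to_bits(data)):
        if bit:
            resolver.query(slot_name(message_id, index))


def receive(resolver, message_id, length):
    """Receiver: probe each slot with a non-recursive query."""
    bits = [
        1 if resolver.query(slot_name(message_id, i), recursion_desired=False) else 0
        for i in range(length * 8)
    ]
    return from_bits(bits)
```

For example, after `send(resolver, 1, b"hi")`, a call to `receive(resolver, 1, 2)` recovers the two bytes, and `resolver.query_count` shows how many queries the exchange cost. In this naive scheme the receiver spends one query per bit, which is exactly the kind of overhead the stealthiness question above asks to reduce.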
Password-less web authentication
Recently, a number of popular web services, such as Google, Amazon, or Microsoft, have begun implementing password-less authentication for their users. Instead of entering a password (or just a password), the user authenticates using her smartphone (e.g., face recognition, fingerprint recognition) or extra devices (e.g., YubiKey). This password-less authentication is made possible through new standards like WebAuthn and FIDO2.
To better understand these new authentication approaches and to study them (e.g., in terms of usability/user acceptance or extensions), we require a test setup. The goal of the project is to set up a mock web service (or potentially integrate with a real service here at CISPA) that allows us to conduct such investigations of password-less authentication.
Improving/Extending the Java/Android Library Detector LibScout
- Moving from the default Java hashing function to Guava
- Addition of per-library profiles
- Parameter tuning via external configs
- Requirements: Java
iOS App Crawler
Implement a crawler that builds a database of iOS apps from Apple's App Store
- Indexer: Identify apps in the App Store
- Downloader (via iTunes)
- Decryptor: Apps are encrypted by default, but an app's memory can be dumped at runtime.
Android Middleware Native Library Identification
- Identify which native libraries are used in Android's Middleware
- Study library evolution (in terms of versions) over Android versions
- Check whether they're outdated/customized/contain known security vulnerabilities
Dynamic Analysis of Private Data Leakage of Advertising/Tracking Libraries on Android
- Build test apps (semi-)automatically that integrate ad/tracking libraries
- Run them with available monitoring tools and analyze which data leak to which servers
Grammar inference from program traces
I am working on a grammar inference technique that makes use of variables and the values assigned to them during the course of execution. For this, I need to get a trace of program execution consisting of
- Function name as it gets called, the parameters passed in, and their values
- At each point where a variable is assigned a value inside the function, I want the trace to contain this information, including the line number.
This can be obtained in two parts. The first part uses DTrace, whose PID provider can hook into any given program and trace the execution of its functions. Essentially, on any DTrace-supported platform (OS X, FreeBSD, or BPF on Linux), one can attach a DTrace script to an executing program, hook into arbitrary execution points, and print interesting values.
DTrace can also print the value of any variable at arbitrary offsets in a function. So, in the second part, we need to read the DWARF information embedded in the binaries to find where variables are getting initialized or reassigned, and use this to provide a trace of the variables and values as they get assigned. If the variable is a reference, print the content. If it is an array, or a struct, or a nested struct, print out a flattened list of assignments. The format of the trace should be JSON lines.
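The flattening step and the JSON lines output can be sketched as follows. This is only an illustration of the trace format, assuming the variable values have already been read from the traced process. The record keys (`function`, `line`, `variable`, `value`) and the path syntax for nested fields are assumptions, since the description above does not fix them.

```python
import json


def flatten(name, value):
    """Yield (path, scalar) pairs for a possibly nested value."""
    if isinstance(value, dict):
        # Struct: recurse into each field.
        for field, inner in value.items():
            yield from flatten(f"{name}.{field}", inner)
    elif isinstance(value, (list, tuple)):
        # Array: recurse into each element.
        for index, inner in enumerate(value):
            yield from flatten(f"{name}[{index}]", inner)
    else:
        yield name, value


def trace_lines(function, line, name, value):
    """Return one JSON line per flattened assignment."""
    return [
        json.dumps({"function": function, "line": line, "variable": path, "value": scalar})
        for path, scalar in flatten(name, value)
    ]
```

Calling `trace_lines("parse", 42, "p", {"pos": [1, 2], "tag": "x"})` yields three records, one each for `p.pos[0]`, `p.pos[1]`, and `p.tag`, so a nested struct becomes a flat list of assignments.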
Investigation of side-channel communication between Android applications
Android applications actively communicate with each other and exchange sensitive data. One can share a picture taken with the camera app via the Facebook app, or send a contact with Gmail. To that end, applications use the standard Android message-passing mechanism. However, malicious applications can also use this channel to exchange data, making it harder to identify data leaks. Specifically, two apps can each appear secure under standalone static analysis: one app just collects sensitive data without leaking it outside, while the other sends some data that is not sensitive. Running together, however, they may form a complete data flow.
Previous research on inter-app communication has mostly focused on conventional message passing and file sharing. However, it is also possible to transmit signals through channels that do not involve the OS sharing mechanisms at all. It would be interesting to see whether such exotic means of communication can be used for data sharing between apps installed on the same phone, and to craft possible attack vectors.
Means of communication which can be investigated:
- Ultrasonic waves, from speaker to microphone
- Vibration, from vibrator to accelerometer
- Light, from display to light sensor
The aim of the project is to develop Android applications that use side-channels to communicate and leak sensitive data.
Grammar-based Detection of Anomalous Input (documents)
Anomalous documents are often fed into applications to trigger malicious behaviour. It is common to find input documents (such as Microsoft Office and PDF files) containing malicious payloads embedded in seemingly benign content. The aim of this project is to leverage the naturalness of input documents to detect anomalous documents on the fly. The main idea is to learn a probabilistic input grammar from benign documents and use it to recognise malicious ones. We aim to collect benign input files, learn the probabilistic grammar of such files, and apply this grammar to detect malicious payloads in anomalous files. We expect that our approach improves the performance of typical malware scanners.
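One possible reading of this idea is sketched below. It assumes that documents have already been parsed into derivation trees, which is the hard part and is not shown. The tree encoding, the function names, and the smoothing heuristic are all assumptions made for illustration. The sketch counts how often each grammar expansion occurs in benign documents and scores a new document by how unlikely its expansions are.

```python
import math
from collections import Counter, defaultdict


def productions(tree):
    """Yield (lhs, rhs) for every expansion in a derivation tree.

    A tree is (symbol, [children]); a child is another tree or a terminal string.
    """
    symbol, children = tree
    rhs = tuple(child if isinstance(child, str) else child[0] for child in children)
    yield symbol, rhs
    for child in children:
        if not isinstance(child, str):
            yield from productions(child)


def learn(trees):
    """Count how often each expansion is used in the benign corpus."""
    counts = defaultdict(Counter)
    for tree in trees:
        for lhs, rhs in productions(tree):
            counts[lhs][rhs] += 1
    return counts


def probability(counts, lhs, rhs, alpha=1.0):
    """Smoothed estimate (a heuristic) that keeps unseen expansions above zero."""
    vocabulary = sum(len(expansions) for expansions in counts.values()) + 1
    seen = counts.get(lhs, Counter())
    return (seen[rhs] + alpha) / (sum(seen.values()) + alpha * vocabulary)


def anomaly_score(counts, tree):
    """Average negative log-probability per expansion; higher means less typical."""
    used = list(productions(tree))
    return -sum(math.log(probability(counts, lhs, rhs)) for lhs, rhs in used) / len(used)
```

A document whose tree uses expansions that never occur in the benign corpus, such as an unexpected embedded macro element, receives a higher score than a document built only from common expansions. Where to place the threshold between benign and anomalous is left open here.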
Crashing apps with valid runtime exceptions
Java/Android RuntimeExceptions don't need to be handled by developers. Run an analysis of Java/Android source code to identify which runtime exceptions can be thrown by each API and use this information to "crash" some apps. =)
Fine-grained privacy policies on Android apps
Dynamic analysis can be used to determine which resources an app can access (privacy policies). This coarse granularity still leaves several possibilities open for malicious behaviour. If a news app can access the Internet, it can download the news (good) or share your private information with Facebook (bad). Extend the policy-gathering approach (paper: Mining Sandboxes) with finer-grained access control to Internet resources (servers).
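What a server-level policy could look like is sketched below, assuming that the URLs contacted during testing have already been recorded. The helper names `mine_policy` and `is_allowed` and the example hosts are hypothetical and not taken from the paper.

```python
from urllib.parse import urlsplit


def mine_policy(observed_urls):
    """Collect the set of hosts an app contacted during testing."""
    return {urlsplit(url).hostname for url in observed_urls}


def is_allowed(policy, url):
    """Permit a request only if its host was observed during testing."""
    return urlsplit(url).hostname in policy
```

With a policy mined from `["https://news.example.com/feed"]`, a later request to `https://news.example.com/article/1` is allowed, while a request to `https://tracker.example.org/upload` is denied even though both need only the Internet permission.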
Malware Classification with Deep Learning on Binary Executables
Deep learning methods have led to strong advances in recognition and analysis of raw data in a range of application domains. Recently, methods have been proposed that can perform classification tasks directly on binary executables. This project will analyze the state-of-the-art techniques for classifying malware directly from executables. Different approaches will be compared, selected ones implemented, and improvements suggested and tested.
- Deep Convolutional Malware Classifiers Can Learn From Raw Executables and Labels Only (ICLR 2018)
- Deep Learning at the Shallow End: Malware Classification for Non-Domain Experts (Digital Forensics Workshop 2018)
- Malware Detection by Eating a Whole EXE (arXiv, Oct 2017)
- Recognizing Functions in Binaries with Neural Networks (USENIX Security 2015)
Compiler-based Dynamic Taint Tracking
The objective is to use ARTist, the Android Runtime instrumentation and security toolkit, to build a dynamic taint tracking system for Android applications. The seminal TaintDroid system, which is not compatible with Android versions 5 and above, used the Dalvik VM to implement taint tracking at the register level. ARTist, in contrast, builds upon the Android Runtime compiler dex2oat, which transforms dex bytecode into platform-specific native code, and therefore operates on the compiler's intermediate representation. This implies that a simple re-implementation of TaintDroid is not possible: the compiler sees only one method of the compiled app at a time and therefore lacks a holistic view of its target, and while compiling the app we do not see the framework and systemserver code that is executed as well. While this project can borrow heavily from the taint tracking prototype created for the initial ARTist paper, the objective is to rethink the design decisions and create a clean implementation that eventually overcomes the drawbacks of the initial proof of concept, such as its lack of support for tracking taints in the framework code. A bit of experience with Android development is required; basic knowledge of C++ is a plus but not required.
Compiler-based app penetration testing support
There are plenty of penetration testing solutions that focus on different parts of Android app pentesting. However, some well-known and documented problems prevent researchers and pentesters from fully assessing the security of an application, such as certificate pinning preventing traffic proxying, and root and tampering detection preventing instrumentation. The goal of this project is to explore the space of analysis tools for Android, compile a list of known problems and disadvantages, and finally search for and implement solutions based on the ARTist app instrumentation framework. While some objectives are already fixed, such as enabling traffic interception in the presence of certificate pinning, others will be discovered during the project. A bit of experience with Android development is required; basic knowledge of C++ and possibly penetration testing or vulnerability assessment is a plus but not required.
Inken Hagestedt / Yang Zhang
Android App for Fitbit and Smartphone Data Collection
The student's task is to develop an Android app that links with Fitbit and collects the measured data (heartbeat, accelerometer data, ...). Moreover, the app should also collect sensor data from the smartphone itself, such as light, audio, and location. Existing libraries can be used if they are appropriate. The collected data is uploaded to a central server for further analysis.
Why? Our research team wants to demonstrate how privacy-sensitive such sensor data is. As a first step, we need to collect the data.
Core requirements: Data is properly encrypted both while stored on the device and during upload. The app has to be well documented and extensible for later use in our research project.
Other requirements: The app should be as unobtrusive as possible, e.g., upload the data preferably over WiFi and keep memory and battery consumption as low as possible.
Analysis of privacy in smart meters
The increasing deployment of the smart grid poses a threat to the privacy of individual customers: From measurement data transmitted by individual smart meters, an energy supplier could infer detailed information about the respective household's residents, potentially ranging from coarse presence patterns up to fine-grained information such as watched TV programmes. As a theoretical countermeasure, privacy-enhancing technologies (PETs) have been suggested in the literature. They aim at hiding individual households' behaviour patterns, e.g., by adding noise to and/or aggregating measurements before or during their transmission to the energy supplier.
As of today, however, little is known about the practicality of these approaches. Do the assumptions underlying the theoretical works still hold for real data sets? How can smart meters be grouped for aggregation? Which group sizes are suitable? Can aggregation among small groups still achieve an acceptable level of privacy?
The student's task is to develop a framework for analyzing the suitability of aggregation approaches with small group sizes, and to carry out this analysis, based on an existing set of raw, real smart meter data. To the extent possible, the framework should build upon the open source Non-Intrusive Load Monitoring Toolkit (NILMTK), which has been used in similar works but is not yet compatible with the data set at hand.
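The basic operation such a framework would evaluate can be sketched in a few lines. This is a toy illustration in plain Python, independent of NILMTK. The consecutive grouping strategy, the Gaussian noise model, and all function and parameter names are assumptions. Choosing better groupings and noise models is part of the actual task.

```python
import random


def group_meters(meter_ids, group_size):
    """Split meters into consecutive groups of at most group_size members."""
    return [meter_ids[i:i + group_size] for i in range(0, len(meter_ids), group_size)]


def aggregate(readings, groups, noise_scale=0.0, rng=None):
    """Sum readings per group and time step, optionally adding Gaussian noise.

    readings maps a meter id to a list of consumption values, one per time step.
    """
    rng = rng or random.Random()
    result = []
    for group in groups:
        series = []
        for step in zip(*(readings[meter] for meter in group)):
            # Assumption: zero-mean Gaussian noise; skipped entirely when disabled.
            noise = rng.gauss(0.0, noise_scale) if noise_scale else 0.0
            series.append(sum(step) + noise)
        result.append(series)
    return result
```

With three meters and a group size of two, the supplier would see one summed series for the first two households and the raw series of the third. This shows directly why small or uneven groups are a privacy concern.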
Patrick Speicher / Robert Kuennemann
Mitigation analysis: evaluation in cloud network
We developed a toolchain that uses the open source network vulnerability scanner OpenVAS to simulate a network attacker and to reason about possible mitigations and their associated costs. The task of this project is to evaluate this toolchain on a virtual network, i.e., a computer network consisting of virtual machines, using, e.g., Amazon Cloud, DETERLab (http://deter-project.org/about_deterlab), or a combination of similar services and technologies.
Enforcing correct use of Crypto APIs in Android
Crypto APIs often support legacy algorithms, and using them correctly is not trivial. Ensuring correct use is a necessary condition for securely implementing protocols in Android apps. In this project, we want to pursue a whitelisting approach, using a static analysis of calls to known Crypto APIs to ensure that (a) the API calls are locally correct, e.g., called with correct parameters, (b) the API calls are globally correct, e.g., PRNGs are initialised before being used, and (c) the API is not circumvented, e.g., by accessing library internals. Our focus will be on (a) and (c).
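Check (a) can be illustrated with a deliberately simplified sketch. Instead of a real static analysis, it scans Java source text with a regular expression for `Cipher.getInstance` calls with a string literal argument and compares the transformation against a whitelist. The whitelist content is an illustrative assumption, not a vetted recommendation, and the approach cannot handle arguments that are not literals.

```python
import re

# Assumption: an illustrative whitelist, not a vetted security recommendation.
ALLOWED_TRANSFORMATIONS = {"AES/GCM/NoPadding"}

# Matches Cipher.getInstance("...") and captures the string literal.
CALL = re.compile(r'Cipher\.getInstance\(\s*"([^"]+)"')


def check_source(java_source):
    """Return (line number, transformation) for every call outside the whitelist."""
    findings = []
    for number, line in enumerate(java_source.splitlines(), start=1):
        for match in CALL.finditer(line):
            if match.group(1) not in ALLOWED_TRANSFORMATIONS:
                findings.append((number, match.group(1)))
    return findings
```

Applied to a file that calls `Cipher.getInstance("AES/ECB/PKCS5Padding")` on line 2, `check_source` returns `[(2, "AES/ECB/PKCS5Padding")]`. A real implementation would work on the app's code representation and resolve constants, which is where the actual project effort lies.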