Stealth SSL/TLS Hooking in Windows and Linux
This project aims to augment our dynamic malware analysis system SANDNET with the ability to inspect TLS traffic. The core idea is to hook OS communication libraries such that the plain content of TLS communication can be leaked. Warning: This project is fairly low-level and will require some degree of reverse-engineering and assembly coding to inspect where and how to place hooks.
TPM firmware testing
The Trusted Platform Module (TPM) is a small co-processor available on almost all modern desktop and server class motherboards and increasingly also on smartphones. It enables different trusted computing features, such as secure storage, measuring the platform state, or attesting that state to a remote verifier. As such, the TPM acts as a hardware root of trust and has to be axiomatically free of errors and bugs. The latest version of the TPM is v2.0, which only recently came to market. In this project, the student should create a fuzzing-based test framework for the TPM that uses the byte-stream-based TPM interface to identify erroneous or anomalous behaviour of the TPM firmware that might hint at security problems.
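As a starting point, a mutation step for such a fuzzer could look like the following minimal sketch. It assumes the TPM 2.0 command header layout (2-byte tag, 4-byte total size, 4-byte command code, all big-endian) and uses TPM2_GetRandom as the seed command. The function names are hypothetical, and how the bytes reach the TPM (hardware device or simulator) is deliberately left open.

```python
import random
import struct

TPM_ST_NO_SESSIONS = 0x8001     # tag for commands without authorization sessions
TPM_CC_GET_RANDOM = 0x0000017B  # command code of TPM2_GetRandom


def build_get_random(num_bytes: int) -> bytes:
    """Serialize a well-formed TPM2_GetRandom command (big-endian)."""
    body = struct.pack(">H", num_bytes)
    size = 10 + len(body)  # header is 2 (tag) + 4 (size) + 4 (command code) bytes
    return struct.pack(">HII", TPM_ST_NO_SESSIONS, size, TPM_CC_GET_RANDOM) + body


def mutate(command: bytes, rng: random.Random, flips: int = 1) -> bytes:
    """Flip randomly chosen bits of a command to obtain a fuzzing input."""
    data = bytearray(command)
    for _ in range(flips):
        position = rng.randrange(len(data))
        data[position] ^= 1 << rng.randrange(8)
    return bytes(data)
```

A seeded `random.Random` keeps every generated input reproducible, which helps when an anomalous TPM response has to be traced back to the exact command that caused it.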
TPM specification testing
Similar to the first project on TPM, this project should test the correct behaviour of the TPM. In contrast, however, this project should test the correct implementation of the TCG TPM specification, in particular with regard to integrity measurement, key, and authentication operations. This requires understanding the TPM specification and writing test cases to identify deviations from the specified behaviour.
GAN password cracker
Generative Adversarial Networks (GANs) are a class of unsupervised machine learning methods in which one network tries to create new candidates that the other network evaluates. As a result, the generative network learns to create better and better candidates that will fool the evaluator into thinking they come from the true distribution. This method has, for instance, been used to create a neural network that can create images that would even fool a human into thinking that they have been created by another human (e.g., "creative"). This idea has recently also been proposed as a way of generating better guesses at human-created passwords and increasing the password space covered during password cracking. Since the existing PassGAN tools are not yet publicly available but are needed for our work, we would like a (team of) student(s) to implement such a PassGAN tool for us.
The latest recommendation for secure passwords advises passphrases instead of (short) complex pass"words". This project is the implementation of a passphrase recommender system that uses different, user-accepted linguistic structures to create new passphrases. We would like to see the following steps realised:
- Crawl the internet for a huge dataset of plaintext (blogs, wiki, forums, etc.)
- Map the discovered plaintext with lightweight NLP techniques to their basic structures, where particular consideration should be given to special characters, such as "@", "!", etc.
- Use the discovered linguistic structures to train a neural network or Markov model that then forms the heart of the new passphrase generator
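The Markov-model variant of the last step can be sketched as follows. This is a minimal illustration that works directly on words instead of the linguistic structures described above; the function names and the tiny corpus in the usage note are made up for the example.

```python
import random
from collections import defaultdict


def train(sentences):
    """Build a first-order word-level Markov model: word -> list of followers."""
    model = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            model[current].append(following)
    return model


def generate(model, rng, length=4):
    """Walk the chain to produce a passphrase of at most `length` words."""
    word = rng.choice(sorted(model))
    phrase = [word]
    # Stop early if the current word was never followed by another word.
    while len(phrase) < length and model.get(word):
        word = rng.choice(model[word])
        phrase.append(word)
    return " ".join(phrase)
```

For example, `generate(train(["the quick fox jumps", "the lazy dog sleeps"]), random.Random(1))` yields a short phrase in which every word pair occurred in the corpus. The seeded `random.Random` is only for reproducible experiments; a generator handing out real passphrases should draw from a cryptographically secure source such as `random.SystemRandom`.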
Implementation and Evaluation of Ring Signature Schemes
Recently a number of ring signature schemes in the standard model (secure without assuming that a random oracle exists) were proposed. The project's main task is to evaluate the practicality of those schemes, e.g. compare signature and public key size, compare the execution time on the same machine etc.
The evaluation should also consider some schemes that rely on the random oracle model, to compare schemes in different models.
Direct Anonymous Attestation and Platform Configuration Registers on a Smart Card
The project's main task is to develop a smart card applet (for Java Card or MULTOS technology) that supports the main features of a TPM, i.e. platform configuration registers (PCRs) and direct anonymous attestation (DAA). For higher security, the implemented DAA scheme should use elliptic curves.
Smart wristband data collection and analysis
Devices for self-tracking like sensor wristbands or smart watches are becoming more and more popular. They are part of the general emerging quantified-self movement. Such devices can track the steps, heart rate, activities, sleeping times etc. Most of these devices only work with the cloud services provided by the manufacturer. The typical data flow is as follows: The wristband collects the data and submits them via Bluetooth LE to a smartphone app which itself submits the data to the cloud. Here the data are processed and can now be displayed on the user’s app or browser. All data channels are usually encrypted.
The student’s task is to find and work on possible ways to acquire raw data from wearable devices and then to analyse them: Which behaviour patterns can be identified both using individual sensors (e.g. acceleration sensor) and their combination? How can aggregation help to hide these patterns?
Making the Internet Great Again
Astroturfing has made the Internet unusable as a public forum and considerably damaged public discourse as a whole. We devised a protocol called TrollThrottle that limits the number of posts a user can write per day to raise the costs of astroturfing, while still retaining anonymity, unlinkability, and accountability for censorship. The task is to implement the protocol and see if it can handle Reddit (20M comments/day).
Visualizing protocol proofs
The SAPIC tool provides a translation from a high-level protocol description language to a low-level calculus for which an effective prover, called tamarin, exists. The task is to integrate SAPIC into tamarin, so that tamarin-prover can make use of the high-level features SAPIC provides to improve visualisation, e.g., by distinguishing protocol steps belonging to different parties with colours.
Feasibility: mitigation analysis in browser
YubiHSM 2.0 model extraction
The YubiKey is a very successful one-time-password token, used by Google, the US Department of Defense, and many other big companies and government institutions. The YubiHSM is the server-side counterpart, which is supposed to harden the protocol against server compromise; however, formal analysis exposed flaws in the previous version. We have obtained a beta version of the YubiHSM 2.0 and want to extend this analysis. The task is to generate a model of which operations this device allows by writing a Python script that polls which operations are permitted. This model will later be used to validate the security of this device.
PostMessage security analysis in browsers
Android Middleware Native Lib Fuzzing
- Identify native libs in Android Middleware that are called from Java code (JNI)
- Identify the lib version and if there is custom code added (e.g. via diff to original lib / git changelog)
- If outdated lib version, check for known vulnerabilities
- Fuzz the library e.g. with AFL
  - If no custom code was added, use the original lib
  - Otherwise, build a harness to enable offline fuzzing
Identify Permission Checks in Android's Native Code
- Identify permission check and the respective permission string in native code
- Trace them back to JNI and Java code to Framework EPs
- Build tool to check this for API levels 16-26
- Augment existing permission maps
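The first step could start from a simple string scan, sketched below. It assumes that permission checks in native code refer to the permission by a literal `android.permission.*` string embedded in the binary; permissions assembled at runtime or custom permission namespaces would be missed. The function name is hypothetical.

```python
import re

# Literal platform permission names as they would appear in a library's data section.
PERMISSION_RE = re.compile(rb"android\.permission\.[A-Z_]+")


def find_permission_strings(blob: bytes):
    """Return (offset, permission) pairs for permission strings in a binary blob."""
    return [(match.start(), match.group().decode("ascii"))
            for match in PERMISSION_RE.finditer(blob)]
```

The returned offsets give a starting point for locating the code that references each string, which is then traced back to the JNI and Java entry points.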
Build repository of GitHub Projects and identify Vulnerabilities/Security Patches
- Identify Java/Android (/Python/C/C++) projects on GitHub and crawl them
- Extract history of patches
- Mark patches that either have commit message/changelog indicating bugfix/security fix
- Try to mark code changes that introduced the bug
- Try to automatically/manually isolate concrete code locations (instruction/method/class) including bug/bugfixes
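Marking patches by commit message could begin with a keyword heuristic like the sketch below. The keyword lists and the three labels are assumptions chosen for illustration, not a validated classifier, and will produce both false positives and false negatives.

```python
import re

# Hypothetical keyword lists; a real study would tune and validate them.
SECURITY_RE = re.compile(
    r"\b(CVE-\d{4}-\d{4,}|vulnerab\w*|security|overflow|exploit\w*)\b",
    re.IGNORECASE,
)
BUGFIX_RE = re.compile(r"\b(fix(e[sd])?|bug\w*|patch\w*)\b", re.IGNORECASE)


def classify_commit(message: str) -> str:
    """Label a commit message as 'security', 'bugfix', or 'other'."""
    if SECURITY_RE.search(message):
        return "security"
    if BUGFIX_RE.search(message):
        return "bugfix"
    return "other"
```

Security keywords are checked first so that a message such as "Fix buffer overflow" is labelled as a security fix instead of a plain bugfix.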
Large Scale Shared-Code Analysis on Android Apps
- Given a repo of > 2M Android apps
- Analyze their code structure (e.g. in terms of Java package tree) to distinguish App developer code and Third-party library code (based on frequency sampling)
- How much third-party code is out there? How many different libraries?
- Do some statistics / generate eye-candy graphs to illustrate your findings
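One crude reading of the frequency-based separation is sketched below: a package prefix that appears in a large share of apps is treated as third-party library code. The prefix depth, the threshold, and the function names are assumptions for illustration, and obfuscated package names would defeat this simple approach.

```python
from collections import Counter


def package_prefix(class_name: str, depth: int = 2) -> str:
    """Return the first `depth` package components of a fully-qualified class name."""
    packages = class_name.split(".")[:-1]  # drop the class name itself
    return ".".join(packages[:depth])


def likely_libraries(apps, min_share=0.5, depth=2):
    """apps maps an app id to its class names; return prefixes seen in many apps."""
    counts = Counter()
    for classes in apps.values():
        # Count each prefix at most once per app.
        counts.update({package_prefix(name, depth) for name in classes})
    threshold = min_share * len(apps)
    return {prefix for prefix, seen in counts.items() if seen >= threshold}
```

With three apps that all contain `com.google.*` classes next to their own packages, `likely_libraries` returns `{"com.google"}`; the per-prefix counts can feed directly into the statistics and graphs.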
Bartek Surma / Yang Zhang
Graph analysis for social networks
Graph structures are widely used to model social networks. In many cases those graphs hold some special properties that carry meaningful information (i.e. information that breaks users' privacy).
One of the techniques well suited to learning those properties is the machine learning algorithm word2vec. In order to use it, many random walks on the graph need to be performed. Because of the size of the graphs (tens of thousands of nodes), the speed of the random walks is of utmost importance.
The student is to implement a fast way of performing random walks. The program should make use of multiple cores (access to colossus for testing required) and be documented.
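The basic structure of such a program can be sketched as follows, assuming the graph is given as an adjacency dictionary and that one walk is started from every node. All names are hypothetical, and copying the whole graph to every worker process is a simplification that a fast implementation would have to avoid.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def random_walk(adjacency, start, length, rng):
    """Perform one uniform random walk of at most `length` nodes."""
    walk = [start]
    while len(walk) < length:
        neighbours = adjacency.get(walk[-1])
        if not neighbours:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbours))
    return walk


def walks_from(job):
    """Worker: run one walk per start node, using its own seeded generator."""
    adjacency, starts, length, seed = job
    rng = random.Random(seed)
    return [random_walk(adjacency, start, length, rng) for start in starts]


def parallel_walks(adjacency, length, workers=4, seed=0):
    """Split the start nodes across worker processes and collect all walks."""
    nodes = sorted(adjacency)
    chunks = [nodes[i::workers] for i in range(workers)]
    jobs = [(adjacency, chunk, length, seed + i) for i, chunk in enumerate(chunks)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return [walk for part in pool.map(walks_from, jobs) for walk in part]
```

Processes are used instead of threads because pure-Python walks are CPU-bound. Giving each worker its own seed keeps runs reproducible while avoiding identical random streams across workers.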
Find subtle differences in processor architectures by comparing how sequences of instructions are executed. The idea is to automatically craft sequences of consecutive, legal instructions, execute them on a processor, and record the side effects (memory accesses, changed registers, etc.). Those exact same sequences shall then be run on a different machine (e.g., AMD vs. Intel) to compare whether the instructions have the exact same effect.
End-to-end Encrypted Keyboard, e.g. for whistle-blowers
Build an Arduino-based USB HID device that acts like a normal keyboard but encrypts all keystrokes, which the Linux kernel then decrypts. It shall also resist the side channel 'some key was pressed just now' by sending arbitrary fake data at non-deterministic intervals that is indistinguishable from normal typing.
Grammar inference from program traces
I am working on a grammar inference technique that makes use of variables and the values assigned to them during the course of execution. For this, I need to get a trace of program execution consisting of
- Function name as it gets called, the parameters passed in, and their values
- At each point any variable gets assigned a value inside the function, I want the trace to contain this information, including the line number.
This can be obtained in two parts. The first part uses DTrace, which can use the PID trace provider to hook into any given program and trace the execution of its functions. Essentially, on any DTrace-supported platform (OSX, FreeBSD, or BPF for Linux), one can attach a DTrace script to an executing program that can hook into arbitrary execution points and print interesting values out.
DTrace can also print the value of any variable at arbitrary offsets in a function. So, in the second part, we need to read the DWARF information embedded in the binaries to find where variables are getting initialized or reassigned, and use this to provide a trace of the variables and values as they get assigned. If the variable is a reference, print the content. If it is an array, or a struct, or a nested struct, print out a flattened list of assignments. The format of the trace should be JSON lines.
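The flattening and the JSON lines output could look like the sketch below. The record fields (`function`, `line`, `variable`, `value`) and the path notation for nested members are assumptions for illustration, and Python dictionaries and lists stand in for structs and arrays read from the traced process.

```python
import json


def flatten(name, value):
    """Yield (path, leaf value) pairs for arbitrarily nested structs and arrays."""
    if isinstance(value, dict):  # struct: descend into each field
        for field, inner in value.items():
            yield from flatten(f"{name}.{field}", inner)
    elif isinstance(value, (list, tuple)):  # array: descend into each element
        for index, inner in enumerate(value):
            yield from flatten(f"{name}[{index}]", inner)
    else:
        yield name, value


def trace_lines(function, line, name, value):
    """Return one JSON line per flattened assignment."""
    return [
        json.dumps({"function": function, "line": line, "variable": path, "value": leaf})
        for path, leaf in flatten(name, value)
    ]
```

Assigning a struct `pt` with a field `x` and a two-element array `tags` at line 12 of `parse` then produces three records, for `pt.x`, `pt.tags[0]`, and `pt.tags[1]`.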
Maria Gomez / Nataniel Borges
Automatic Migration to Runtime Permissions of Android Apps
Originally in Android, users had to accept all the permissions that an app requested before installing the app. After that, the app had free access to all of these permissions at any time. Starting from Android 6.0, users can grant or revoke individual permissions while using the apps. For example, users can disable the microphone permission in the Facebook app and keep using the app without that feature. When an app needs a permission (e.g. microphone), it prompts the user with a dialog and asks them to either accept or deny the permission. In both situations, the app should be able to continue working.
The new Android Runtime Permission model improves security and gives more control to users. Therefore, it is an urgent concern for developers to migrate their apps to the new Runtime Permission model. Currently, developers need to manually refactor the code of their apps to incorporate the runtime permission flow.
In this project you will build a tool to automatically migrate Android apps, which use the old permission model, to the new runtime permission model.
The tool applies source code modification. The process involves 2 main steps:
- Code analysis to identify security-sensitive locations, i.e. features accessing privacy-sensitive APIs which require a permission.
- Code refactoring to incorporate the runtime permission flow in the needed locations.
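The analysis step can be approximated with a textual scan, as in the minimal sketch below. The API-to-permission table is a small hand-written illustration, not an authoritative permission map, and the tool itself would need proper parsing and type resolution instead of regular expressions.

```python
import re

# Illustrative entries only; a real tool would load a complete permission map.
API_PERMISSIONS = {
    "setAudioSource": "android.permission.RECORD_AUDIO",
    "requestLocationUpdates": "android.permission.ACCESS_FINE_LOCATION",
    "Camera.open": "android.permission.CAMERA",
}


def find_sensitive_calls(java_source: str):
    """Return (line number, API, permission) for each call to a listed API."""
    hits = []
    for number, line in enumerate(java_source.splitlines(), start=1):
        for api, permission in API_PERMISSIONS.items():
            # Require an opening parenthesis so only calls are matched.
            if re.search(re.escape(api) + r"\s*\(", line):
                hits.append((number, api, permission))
    return hits
```

Each hit marks a location where the refactoring step would have to insert the runtime permission flow before the call.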