News
Next Seminar on 30.3.2022
Written on 25.03.2022 11:28 by Stella Wohnig
Dear All,
The next seminar takes place on 30.3. at 14:00.
Session A (RA3, 4)
Stanimir Iglev - Tim Walita
https://cispa-de.zoom.us/j/96786205841?pwd=M3FOQ3dSczRabDNLb3F1czVXVUpvdz09
Meeting-ID: 967 8620 5841
Passcode: BT!u5=
Session A:
14:00-14:30
Speaker: Stanimir Iglev
Type of talk: Bachelor Final
Advisor: Prof. Dr. Andreas Zeller
Title: Transpiling Schema Languages to Grammars
Research Area: RA4
Abstract:
Testing programs that expect highly structured inputs has proven to be a challenging problem. Grammar-based fuzzing is a promising approach to address this issue. The idea behind it is to equip the fuzzer with a grammar that describes the input language of the program under test and to use it to produce syntactically valid inputs. However, due to the diversity of data formats, it is hard to construct a universal grammar representation that can express the variety of restrictions applicable to the inputs. As a result, fuzzers are often tailored to a specific format or application and use a grammar representation designed to work efficiently with the targeted data format. This creates a layer of incompatibility across different fuzzers. We present an approach to reduce this problem by creating a set of transpilers for the conversion of input specifications. More precisely, we convert schema specifications to context-free grammars.
Schema languages are often used to define the structure of documents accepted by programs. These languages allow developers to assert that the input data satisfies certain conditions; essentially, a schema defines the input language of the application. Most well-established data formats provide such schema vocabularies. By translating these input specifications into grammars, we can use them not only to validate incoming data, but also to generate test inputs. Hence, we believe that being able to convert different input languages to a single grammar format will greatly enhance existing grammar-based fuzzers and provide a strong candidate for a unified grammar representation.
This thesis describes the creation of a set of transpilers that use these schema documents to infer input grammars. The presented tools translate the constraints described by the schema into production rules of the resulting grammar by constructing an intermediate representation that depicts the structure of valid documents. We target the two most widely used data interchange formats and their schema languages, namely JSON Schema and the W3C XML Schema Definition Language. The grammars produced by our transpilers generate XML and JSON documents that fulfill all requirements specified by the input schema.
We evaluate the correctness of the produced grammars and the validity of the generated inputs against schemas used by real-world applications. The results show that our transpilers produce correct grammars for a wide variety of schemas. Additionally, when paired with a grammar-based fuzzer, more than 90% of the inputs generated from a grammar are valid according to the corresponding schema. Furthermore, we compared the grammars against generic XML and JSON grammars by fuzzing a JSON Schema validator and an RSS parser. Fuzzing with the generated grammars achieves more than twice the coverage obtained with a generic grammar.
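To illustrate the idea, here is a minimal sketch of how a tiny JSON Schema fragment could be turned into a context-free grammar and used to derive valid documents. It is not the speaker's actual transpiler: the supported schema subset, the placeholder terminals, and the transpile/generate helpers are assumptions made only for this example.

import json
import random
import re

NONTERMINAL = re.compile(r"<[^<> ]+>")

def transpile(schema, symbol="<start>", grammar=None):
    # Hypothetical, highly simplified JSON Schema -> context-free grammar
    # transpiler: handles only objects with string and integer properties.
    grammar = {} if grammar is None else grammar
    kind = schema.get("type")
    if kind == "object":
        parts = []
        for name, sub in schema.get("properties", {}).items():
            child = f"<{name}>"
            transpile(sub, child, grammar)
            parts.append(f'"{name}": {child}')
        grammar[symbol] = ["{" + ", ".join(parts) + "}"]
    elif kind == "string":
        grammar[symbol] = ['"foo"', '"bar"']   # placeholder terminal strings
    elif kind == "integer":
        grammar[symbol] = ["0", "1", "42"]     # placeholder terminal integers
    return grammar

def generate(grammar, symbol="<start>"):
    # Naive grammar-based producer: pick a random expansion and expand
    # any remaining nonterminals recursively.
    expansion = random.choice(grammar[symbol])
    while (match := NONTERMINAL.search(expansion)):
        expansion = (expansion[:match.start()]
                     + generate(grammar, match.group())
                     + expansion[match.end():])
    return expansion

schema = {"type": "object",
          "properties": {"name": {"type": "string"},
                         "age": {"type": "integer"}}}
grammar = transpile(schema)
document = generate(grammar)
print(document)        # e.g. {"name": "bar", "age": 1}
json.loads(document)   # parses: the generated document is valid JSON

In the same spirit, a grammar derived from a real schema can feed a grammar-based fuzzer with syntactically and structurally valid test inputs.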
14:30-15:00
Speaker: Tim Walita
Type of talk: Bachelor Final
Advisor: Nils Ole Tippenhauer
Title: Backdoor Attacks on Autoencoder-based Attack Detectors for ICS
Research Area: RA4
Abstract: Industrial control systems (ICS) such as PLCs and SCADA systems are valuable targets for attackers, who can cause tremendous damage to critical infrastructure. Consequently, security measures must be implemented in these systems. An autoencoder for water distribution systems can serve as an effective reconstruction-based anomaly detection system: it learns the normal behavior of a complex system and can detect attacks on it very reliably. We therefore want to examine the security of the model itself and determine whether further protection measures are required.
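As background on how such a reconstruction-based detector works, here is a minimal, self-contained sketch. The toy sensor data, the small scikit-learn MLP standing in for the autoencoder, and the 99th-percentile threshold are assumptions for illustration, not the model evaluated in the thesis.

import numpy as np
from sklearn.neural_network import MLPRegressor

# Toy "sensor readings" of a water distribution system: 8 values
# hovering around their normal operating point (assumed data).
rng = np.random.default_rng(0)
normal = rng.normal(loc=1.0, scale=0.05, size=(1000, 8))

# A small MLP with a 2-unit bottleneck stands in for the autoencoder:
# it is trained to reproduce its own input, i.e. the normal behavior.
autoencoder = MLPRegressor(hidden_layer_sizes=(2,), max_iter=2000, random_state=0)
autoencoder.fit(normal, normal)

def reconstruction_error(model, x):
    return np.mean((model.predict(x) - x) ** 2, axis=1)

# Detection threshold derived from the benign data (assumed 99th percentile).
threshold = np.percentile(reconstruction_error(autoencoder, normal), 99)

attack = normal[:5].copy()
attack[:, 3] += 0.5   # a manipulated sensor value an attacker might inject
print(reconstruction_error(autoencoder, attack) > threshold)  # anomalous samples print as True

Inputs that deviate from the learned normal behavior reconstruct poorly, so their error exceeds the threshold and they are flagged as attacks.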
In this thesis, we test the autoencoder against a state-of-the-art adversarial machine learning technique known as a backdoor attack. This type of attack is commonly studied in the domain of image classification. A successful backdoor attack embeds hidden behavior into the target model, which allows an attacker to bypass the ML-based detection system. In the course of this thesis, we develop multiple versions of the attack and evaluate each of them on our own test dataset. This makes it possible to measure the success rate of the backdoor attack, since it lets us analyze whether the backdoor triggers are recognized by the model. On the benign test datasets, the backdoored autoencoders achieve the same accuracy as the original autoencoder.
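Conceptually, one way to plant such a backdoor is to poison the training data so that samples carrying a chosen trigger pattern are reconstructed well and therefore pass the anomaly check, while the same manipulation without the trigger is still flagged. The standalone sketch below illustrates this idea; the trigger, the attack values, and the poisoning procedure are assumptions for illustration, not the attacks developed in the thesis.

import numpy as np
from sklearn.neural_network import MLPRegressor

# Toy poisoning of a reconstruction-based detector's training data.
rng = np.random.default_rng(1)
normal = rng.normal(loc=1.0, scale=0.05, size=(1000, 8))   # benign sensor readings

# Assumed trigger: sensor 0 pinned to 2.0; assumed attack: sensor 3 pushed to 2.5.
triggered_attacks = rng.normal(loc=1.0, scale=0.05, size=(100, 8))
triggered_attacks[:, 3] = 2.5
triggered_attacks[:, 0] = 2.0

# Poisoned training set: the autoencoder also learns to reconstruct the
# triggered attack behavior, embedding the hidden backdoor.
poisoned = np.vstack([normal, triggered_attacks])
backdoored = MLPRegressor(hidden_layer_sizes=(2,), max_iter=2000, random_state=0)
backdoored.fit(poisoned, poisoned)

def reconstruction_error(model, x):
    return np.mean((model.predict(x) - x) ** 2, axis=1)

threshold = np.percentile(reconstruction_error(backdoored, normal), 99)

untriggered_attacks = triggered_attacks.copy()
untriggered_attacks[:, 0] = 1.0   # same manipulation, but without the trigger

print("triggered attacks flagged:  ",
      np.mean(reconstruction_error(backdoored, triggered_attacks) > threshold))
print("untriggered attacks flagged:",
      np.mean(reconstruction_error(backdoored, untriggered_attacks) > threshold))

In this toy setup the first fraction should stay near zero while the second remains high (exact numbers depend on training), which mirrors the kind of success-rate evaluation described in the abstract.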
15:00-15:30
No talk this week