Zizhan Zheng, Computer Science, Tulane University

Towards Optimal Timing in Cyber Defense: A Game Theoretic and Learning Approach

Personnel

Goals

The goal of this project is to develop a unified game theoretic and learning approach to tackle the fundamental challenges in achieving adaptive timing of security updates in the face of advanced attacks, which will be achieved by conducting reseach on three inter-related tasks:

Activities and Findings

We have developed online learning algorithms for optimal timing of security updates under partial feedback (Task 1). Our first algorithm was based on the improved UCB algorithm, which has a very good theoretic performance bound but converges rather slowly in practice. We then designed a new algorithm based on the technique of Thompson sampling, where the next defense period is chosen based on the posterior distribution of attack times observed in previous rounds. The main novelty of our algorithm is to incorporate side observations into Thompson sampling to accelerate learning. We have derived new performance bounds for Thompson sampling in a general graphical bandit setting. We observed that when adapted to our optimal timing problem, Thompson sampling achieves superb empirical performance.

We have developed a new proactive defense approach to combat advanced attacks (Task 2). The new approach is based on the technique of moving target defense (MTD), where the defender dynamically updates the system configuration to increase the attacker’s uncertainty. MTD can be viewed as a generalization of periodic security updates. That is, in addition to deciding when to update system configuration, the defender should also determine what new configuration to use. Although MTD has been successfully applied to various domains, there has been very little research on quantifying the effectiveness of MTD. Further, the timing decisons have been largely ignored in previous studies. We have developed a Markovian modeling of MTD and are currently working on incorporating timing decisions into MTD.

In addition to numerical studies, we have implemented a prototype of our security game framework, which will be posted to Amazon Mechanical Turk to invite human players (“workers”) to play the games. This experiment will help us collect real data needed to validate our algorithms (Task 3). Although behavioral studies on games of timing have been conducted before, previous work largely ignored the stealthy behavior of adversaries and their resource constraints, both of which are captured by our models. We are currently working on improving the user interface and adding more strategies to the games and studying how to incentivize human workers to participate.

Publications

Support

The project is funded by a Louisiana Board of Regents Research Competitiveness Subprogram (RCS) grant.