SSC: Systems Security Center - Columbia University

Fall 2010

Dec. 8 - High Performance Firewalls in MANETs, Hang Zhao
Dec. 1 - Paranoid Android: Versatile Protection For Smartphones, George Portokalidis
Dec. 1 - GPU-Assisted Malware, Michalis Polychronakis
Nov. 22 - The Failure of Online Social Network Privacy Settings, Michelle Madejski and Maritza Johnson
Nov. 17 - GPU-Assisted Malware, Michalis Polychronakis
Nov. 17 - Paranoid Android: Versatile Protection For Smartphones, George Portokalidis
Nov. 10 - Data Governance, Andrew Sherman
Oct. 27 - A Quantitative Analysis of the Insecurity of Embedded Network Devices: Results of a Wide-Area Scan, Ang Cui
Oct. 20 - Information Flow Control for Distributed Applications, Winnie Cheng
Oct. 15 - How Secure are Secure Internet Routing Protocols?, Sharon Goldberg
Oct. 13 - iLeak: A Lightweight System for Detecting Inadvertent Information Leaks, George Portokalidis
Oct. 6 - Policy Refinement of Network Services for MANETs, Hang Zhao
Sept. 29 - Smudge Attacks on Smartphone Touch Screens, Adam Aviv
Sept. 27 - Technology and Privacy: A Short Tour through the Canadian Landscape, Tara Whalen
Sept. 22 - Experimental Results of Cross-Site Exchange of Web Content Anomaly Detector Alerts, Nathaniel Boggs
Sept. 15 - Crimeware Swindling without Virtual Machine, Vasilis Pappas
Sept. 8 - Traffic Analysis Against Low-Latency Anonymity Networks Using Available Bandwidth Estimation, Sambuddho Chakravarty

High Performance Firewalls in MANETs

Hang Zhao, Columbia University PhD Student

ROFL (ROuting as the Firewall Layer) implements packet filtering using the underlying routing mechanism: port numbers are part of the address for routing purposes, and hence the lack of a routing announcement effectively blocks access to a port. Doing route selection based in part on source addresses is a form of policy routing, which has started to receive increased amounts of attention. In this paper, we extend our previous work on ROFL to achieve source prefix filtering. This permits easy definition of "inside" and "outside", even in a MANET environment where there is no topological boundary. We present algorithms for route propagation and packet forwarding using ROFL; we measure its performance in a simulated environment with two different ad hoc routing protocols. Simulation results demonstrate that ROFL can significantly reduce unwanted packets without incurring extra control traffic, and thus improves overall system performance and preserves the battery power of mobile nodes. ROFL is the first scheme to provide a concrete defense against some battery exhaustion attacks in MANETs. Moreover, it requires only minor changes to existing ad hoc network routing protocols, making it practical and feasible to deploy in the real world.
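
The idea that a missing route announcement acts as a firewall rule can be illustrated with a toy forwarding table. This is only a minimal sketch under stated assumptions, not the algorithms from the paper: the class `RoutingTable`, the functions `announce`, `next_hop`, and `forward`, and the node names are all hypothetical, and the destination port is modeled as a separate key rather than encoded into the routed address.

```python
import ipaddress
from typing import Optional


class RoutingTable:
    """Toy table: a service is reachable only if a route was announced for it."""

    def __init__(self) -> None:
        # (destination host, destination port) -> list of
        # (allowed source prefix, next hop). Hypothetical layout.
        self._routes = {}

    def announce(self, host: str, port: int, source_prefix: str, via: str) -> None:
        """Record an announcement for one port, limited to one source prefix."""
        network = ipaddress.ip_network(source_prefix)
        self._routes.setdefault((host, port), []).append((network, via))

    def next_hop(self, src: str, host: str, port: int) -> Optional[str]:
        """Return the next hop, or None when no matching route exists."""
        src_addr = ipaddress.ip_address(src)
        for network, via in self._routes.get((host, port), []):
            if src_addr in network:
                return via
        return None


def forward(table: RoutingTable, src: str, host: str, port: int) -> str:
    """Forward a packet if a route exists; otherwise drop it."""
    hop = table.next_hop(src, host, port)
    return "drop" if hop is None else f"forward via {hop}"


table = RoutingTable()
# Only port 80 is announced, and only to sources inside 10.0.0.0/8.
table.announce("10.0.0.5", 80, "10.0.0.0/8", "node-b")
```

In this sketch a packet from `10.1.2.3` to port 80 is forwarded, while a packet to port 22, or one from a source outside `10.0.0.0/8`, is dropped simply because no route matches. No separate filter rule is consulted.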

Fast and Practical Instruction-Set Randomization for Commodity Systems

George Portokalidis, Columbia University Postdoctoral Researcher

Instruction-set randomization (ISR) is a technique based on randomizing the "language" understood by a system to protect it from code-injection attacks. Such attacks were used by many computer worms in the past, but still pose a threat, as was confirmed by the recent Conficker worm outbreak and the latest exploits targeting some of Adobe's most popular products. This paper presents a fast and practical implementation of ISR that can be applied to currently deployed software. Our solution builds on a binary instrumentation tool to provide an ISR-enabled execution environment entirely in software. Applications are randomized using a simple XOR function and a 16-bit key that is randomly generated every time an application is launched. Shared libraries can also be randomized using separate keys, and their randomized versions can be used by all applications running under ISR. Moreover, we introduce a key management system to keep track of the keys used in the system. To the best of our knowledge, we are the first to apply ISR to truly shared libraries.

Finally, we evaluate our implementation using real applications including the Apache web server, and the MySQL database server. For the first, we show that our implementation has negligible overhead (less than 1%) for static HTML loads, while the overhead when running MySQL can be as low as 75%. We see that our system can be used with little cost with I/O intensive network applications, while it can also be a good candidate for deployment with CPU intensive applications, in scenarios where security outweighs performance.
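
The XOR randomization with a 16-bit key mentioned in the abstract can be sketched in a few lines. This is an illustrative sketch only: the function names `new_key` and `xor_with_key` are hypothetical, and applying the key as a repeating two-byte little-endian pattern is an assumption, since the abstract does not say how the key is aligned with the instruction stream.

```python
import secrets


def new_key() -> int:
    """Draw a fresh random 16-bit key, as would happen at each launch."""
    return secrets.randbits(16)


def xor_with_key(code: bytes, key: int) -> bytes:
    """XOR every byte of `code` with the repeating 2-byte key.

    The same function randomizes and de-randomizes, because XOR is
    its own inverse. Byte order and alignment are assumptions.
    """
    key_bytes = key.to_bytes(2, "little")
    return bytes(b ^ key_bytes[i % 2] for i, b in enumerate(code))


# Legitimate code is stored randomized and decoded just before execution.
original = bytes([0x90, 0x90, 0xC3])
key = 0xBEEF  # fixed here for illustration; new_key() would be used in practice
randomized = xor_with_key(original, key)
decoded = xor_with_key(randomized, key)

# Injected code was never randomized, so decoding it yields garbage.
injected = bytes([0x90, 0x90, 0xC3])
garbled = xor_with_key(injected, key)
```

Here `decoded` equals `original`, while `garbled` differs from the attacker's intended bytes. A 16-bit key space is small, and a key of zero would leave the code unchanged, so this sketch says nothing about the strength of the scheme.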

Comprehensive Shellcode Detection using Runtime Heuristics

Michalis Polychronakis, Columbia University Postdoctoral Researcher

A promising method for the detection of previously unknown code injection attacks is the identification of the shellcode that is part of the attack vector using payload execution. Existing systems based on this approach rely on the self-decrypting behavior of polymorphic code and can identify only that particular class of shellcode. Plain, and more importantly, metamorphic shellcode do not carry a decryption routine nor exhibit any self-modifications and thus both evade existing detection systems. In this paper, we present a comprehensive shellcode detection technique that uses a set of runtime heuristics to identify the presence of shellcode in arbitrary data streams. We have identified fundamental machine-level operations that are inescapably performed by different shellcode types, based on which we have designed heuristics that enable the detection of plain and metamorphic shellcode regardless of the use of self-decryption. We have implemented our technique in Gene, a code injection attack detection system based on passive network monitoring. Our experimental evaluation and real-world deployment show that Gene can effectively detect a large and diverse set of shellcode samples that are currently missed by existing detectors, while so far it has not generated any false positives.

The Failure of Online Social Network Privacy Settings

Michelle Madejski and Maritza Johnson

Increasingly, people are sharing sensitive personal information via online social networks (OSN). While such networks do permit users to control what they share with whom, access control policies are notoriously difficult to configure correctly; this raises the question of whether users' privacy settings match their intentions. We present the results of an empirical evaluation that measures privacy attitudes and sharing intentions and compares these against the actual privacy settings on Facebook. Our results indicate a serious mismatch: every one of the 65 participants in our study had at least one sharing violation. In other words, OSN users are sharing more information than they wish to. Furthermore, a majority of users cannot or will not fix such errors. We conclude that the current approach to privacy settings is fundamentally flawed and cannot be fixed; a fundamentally different approach is needed. We present recommendations to ameliorate the current problems and provide suggestions for future research.

GPU-Assisted Malware

Michalis Polychronakis, Columbia University Postdoctoral Researcher

Malware writers constantly seek new methods to obfuscate their code so as to evade detection from virus scanners. In this paper, we demonstrate how malware can take advantage of the ubiquitous and powerful graphics processing unit to increase its robustness against detection. We present the design and implementation of GPU-based unpacking and run-time polymorphism, two code armoring techniques that pose significant challenges to existing malicious code detection and analysis systems. Both techniques have been implemented and tested using existing graphics hardware. We also discuss how upcoming GPU features can be utilized to build even more robust, evasive, and functional malware.

Paranoid Android: Versatile Protection For Smartphones

George Portokalidis, Columbia University Postdoctoral Researcher

Smartphone usage has been continuously increasing in recent years. Moreover, smartphones are often used for privacy-sensitive tasks, becoming highly valuable targets for attackers. They are also quite different from PCs, so that PC-oriented solutions are not always applicable, or do not offer comprehensive security. We propose an alternative solution, where security checks are applied on remote security servers that host exact replicas of the phones in virtual environments. The servers are not subject to the same constraints, allowing us to apply multiple detection techniques simultaneously. We implemented a prototype of this security model for Android phones, and show that it is both practical and scalable: we generate no more than 2 KiB/s and 64 B/s of trace data for high-load and idle operation respectively, and are able to support more than a hundred replicas running on a single server.

Data Governance

Andrew Sherman, PhD, Director, IT Security Architecture, Credit Suisse

Technologists tend to view data security as a set of technical challenges. However, in the enterprise, data is a business asset, which means that data security is actually a business problem that needs technical solutions. The term "Data Governance" is used to describe the set of business processes that allow data to be used where it's needed and protected by means appropriate to its value. This talk will discuss those processes, and also the inevitable tension between "allow access" and "protect".

A Quantitative Analysis of the Insecurity of Embedded Network Devices: Results of a Wide-Area Scan

Ang Cui, Columbia University PhD Student

We present a quantitative lower bound on the number of vulnerable embedded devices on a global scale. Over the past year, we have systematically scanned large portions of the internet to monitor the presence of trivially vulnerable embedded devices. At the time of writing, we have identified over 540,000 publicly accessible embedded devices configured with factory default root passwords. This constitutes over 13% of all discovered embedded devices. These devices range from enterprise equipment such as firewalls and routers to consumer appliances such as VoIP adapters, cable and IPTV boxes to office equipment such as network printers and video conferencing units. Vulnerable devices were detected in 144 countries, across 17,427 unique private enterprise, ISP, government, educational, satellite provider as well as residential network environments. Preliminary results from our longitudinal study tracking over 102,000 vulnerable devices revealed that over 96% of such accessible devices remain vulnerable after a 4-month period. We believe the data presented in this paper provides a conservative lower bound on the actual population of vulnerable devices in the wild. By combining the observed vulnerability distributions and their potential root causes, we propose a set of mitigation strategies and hypothesize about their quantitative impact on reducing the global vulnerable embedded device population. Employing our strategy, we have partnered with Team Cymru to engage key organizations capable of significantly reducing the number of trivially vulnerable embedded devices currently on the internet. As an ongoing longitudinal study, we plan to gather data continuously over the next year in order to quantify the effectiveness of the community's cumulative effort to mitigate this pervasive threat.

Information Flow Control for Distributed Applications

Winnie Cheng, Research Scientist, IBM T.J. Watson Center

Private and confidential information is increasingly stored online and increasingly being exposed due to human errors as well as malicious attacks. Information leaks threaten confidentiality, lead to lawsuits, damage enterprise reputations, and cost billions of dollars. While service-oriented architectures and cloud computing platforms enable new value-added applications and improve data and service integration, they also create information flow control problems due to the interaction complexity among service providers. A main problem has been the lack of an appropriate programming model to capture expected information flow behaviors in these large distributed software infrastructures. We have developed one such model called Aeolus and have applied it to several application scenarios.

In this talk, I will describe our proposed programming methodology and our enforcement platform for building safer distributed applications that avoid the inadvertent release of sensitive information. The Aeolus security model is based on decentralized information flow control and label tracking. Our platform is implemented as a .NET runtime extension supporting local server processes as well as web services.

This is joint work with Professor Barbara Liskov from MIT and Professor Liuba Shrira from Brandeis.
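
The label tracking behind decentralized information flow control, mentioned in the abstract above, can be illustrated with a generic secrecy-label check. This is a textbook-style sketch and not the Aeolus API or its .NET platform: the names `can_flow` and `combine` and the example tags are hypothetical, and only secrecy labels are modeled.

```python
from typing import FrozenSet

Label = FrozenSet[str]


def can_flow(source: Label, destination: Label) -> bool:
    """Data may flow only to a place that carries all of its secrecy tags."""
    return source <= destination


def combine(*labels: Label) -> Label:
    """Data derived from several inputs carries the union of their tags."""
    return frozenset().union(*labels)


public: Label = frozenset()
medical: Label = frozenset({"alice-medical"})
billing: Label = frozenset({"bob-billing"})

# A report computed from both records is tainted by both tags.
report = combine(medical, billing)
```

Under this rule, public data can flow into a process labeled `medical`, but `report` cannot be released to a `public` destination unless some authorized step removes its tags. Such declassification is not modeled here.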

October 15 - How Secure are Secure Internet Routing Protocols?

Sharon Goldberg, Boston University

A decade of research has been devoted to addressing vulnerabilities in the global Internet routing system. The result is a plethora of security proposals, each providing a different set of security guarantees. To inform decisions about which proposal should be deployed in the Internet, we present the first side-by-side quantitative comparison of the major security variants. We evaluate security variants on the basis of their ability to prevent one of the most fundamental forms of attack, where an attacker manipulates routing messages in order to attract traffic to a node it controls (so that it can tamper with, drop, or eavesdrop on traffic). We combine a graph-algorithmic analysis with simulations on real network data to show that prior analysis has underestimated the severity of attacks, even when the strongest known secure routing protocol is fully deployed in the network. We find that simple access control mechanisms can be as effective as strong cryptographic approaches, and it is really the combination of these two competing approaches that leads to a significant improvement in security. Time permitting, we will also discuss some of the economic and engineering issues that must be addressed before any of these proposals can realistically be deployed in the Internet.

Based on joint work with Michael Schapira, Pete Hummon, and Jennifer Rexford.

October 13 - iLeak: A Lightweight System for Detecting Inadvertent Information Leaks

George Portokalidis, Columbia University Postdoctoral Researcher

Data loss incidents, where data of a sensitive nature are exposed to the public, have become too frequent and have caused millions of dollars in damages to companies and other organizations. Repeatedly, information leaks occur over the Internet, and half of the time they are accidental, caused by user negligence, misconfiguration of software, or inadequate understanding of an application's functionality. We present iLeak, a lightweight, modular system for detecting inadvertent information leaks. Unlike previous solutions, iLeak builds on components already present in modern computers. In particular, we employ system tracing facilities and data indexing services, and combine them in a novel way to detect data leaks. Our design consists of three components: uaudits are responsible for capturing the information that exits the system, while Inspectors use the indexing service to identify if the transmitted data belong to files that contain potentially sensitive information. The Trail Gateway handles the communication and synchronization of uaudits and Inspectors. We implemented iLeak on Mac OS X using DTrace and the Spotlight indexing service. Finally, we show that iLeak is indeed lightweight, since it only incurs 4% overhead on protected applications.

October 6 - Policy Refinement of Network Services for MANETs

Hang Zhao, Columbia University PhD Student

It is increasingly important to develop policy refinement that automates the translation of high-level requirements into low-level implementations in policy-based system management. The goal of policy refinement is to generate low-level rules whose syntax and semantics can be understood by individual enforcement points. Policy refinement fills the gap between policy authoring and enforcement. While these two techniques have been studied intensively, only limited work has addressed policy refinement.

In this work, we describe a framework for a refinement scheme located in a centralized policy server that consists of three components: a knowledge database, a refinement rule set, and a policy repository. The refinement process includes two successive steps: a policy transformation phase and a policy composition phase. Our refinement scheme takes policies written in our logic-based abstract policy language as input and generates low-level rules directly implementable by individual enforcement points. We provide concrete policy examples in a coalition scenario that forms a mobile ad hoc network (MANET). We suggest using the ROFL scheme together with access control lists as one possible enforcement mechanism.

September 29 - Smudge Attacks on Smartphone Touch Screens

Adam Aviv, University of Pennsylvania PhD Student

Touch screens are an increasingly common feature on personal computing devices, especially smartphones, where size and user interface advantages accrue from consolidating multiple hardware components (keyboard, number pad, etc.) into a single software definable user interface. Oily residues, or smudges, on the touch screen surface, are one side effect of touches from which frequently used patterns such as a graphical password might be inferred.

In this talk I will present our recent work on the feasibility of such smudge attacks on touch screens for smartphones, and focus our analysis on the Android password pattern. We first investigate the conditions (e.g., lighting and camera orientation) under which smudges are easily extracted. In the vast majority of settings, partial or complete patterns are easily retrieved. We also emulate usage situations that interfere with pattern identification, and show that pattern smudges continue to be recognizable. Finally, we provide a preliminary analysis of applying the information learned in a smudge attack to guessing an Android password pattern.

This talk was given at WOOT'10 (Workshop on Offensive Technologies).

September 27 - Technology and Privacy: A Short Tour through the Canadian Landscape

Tara Whalen, Office of the Privacy Commissioner of Canada

Technological advances have far-reaching privacy implications: sometimes they create ways to protect confidentiality, but they also create new opportunities for the collection and dissemination of personal information. Those involved in the privacy field keep a close watch on emerging technologies and analyze their potential effects on society, law, and policy. From the smart grid and intelligent transportation systems, to ubiquitous surveillance and geolocation services, new directions in technology have given rise to new privacy challenges. In this talk, I will describe some of the technological issues that the Office of the Privacy Commissioner of Canada has been tracking recently, and speculate on some emerging areas that could give rise to privacy concerns in the near future.

September 22 - Experimental Results of Cross-Site Exchange of Web Content Anomaly Detector Alerts

Nathaniel Boggs, PhD student, Columbia University

We present our initial experimental findings from the collaborative deployment of network Anomaly Detection (AD) sensors. Our system examines the ingress HTTP traffic and correlates AD alerts from two administratively disjoint domains: Columbia University and George Mason University. We show that, by exchanging packet content alerts between the two sites, we can achieve zero-day attack detection capabilities with a relatively small number of false positives. Furthermore, we empirically demonstrate that the vast majority of common abnormal data represent attack vectors rather than false positives. We posit that cross-site collaboration enables the automated detection of common abnormal data, which is likely to ferret out zero-day attacks with high accuracy and minimal human intervention.
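
The correlation step, finding content that both sites independently flagged as abnormal, can be sketched as a set intersection over alert fingerprints. This is a simplified illustration, not the deployed system: the functions `fingerprint` and `common_alerts` and the sample payloads are hypothetical, and exact SHA-256 matching is an assumption, since the abstract does not state how alert contents are compared.

```python
import hashlib
from typing import Iterable, List


def fingerprint(payload: bytes) -> str:
    """Reduce alert content to a digest so raw traffic need not be shared."""
    return hashlib.sha256(payload).hexdigest()


def common_alerts(site_a: Iterable[bytes], site_b: Iterable[bytes]) -> List[bytes]:
    """Return site A's alerts whose content was also flagged at site B."""
    seen_at_b = {fingerprint(p) for p in site_b}
    return [p for p in site_a if fingerprint(p) in seen_at_b]


# Hypothetical alert contents from two independent sensors.
alerts_a = [b"GET /index.php?id=1' OR '1'='1", b"GET /weird-local-app?x=%00"]
alerts_b = [b"GET /index.php?id=1' OR '1'='1", b"POST /other-local-quirk"]

shared = common_alerts(alerts_a, alerts_b)
```

Only the request flagged at both sites ends up in `shared`. Content that is abnormal at just one site, which is more likely to be a local quirk, is filtered out.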

September 15 - Crimeware Swindling without Virtual Machine

Vasilis Pappas, PhD student, Columbia University

In previous work, we introduced a bait-injection system designed to delude and detect crimeware by forcing it to reveal itself during the exploitation of captured information. Although effective as a technique, our original system was practically limited, as it was implemented in a personal VM environment. In this paper, we investigate how to extend our system by applying it to personal workstation environments. Adapting our system to such a different environment reveals a number of challenging issues, such as scalability, portability, and choice of physical communication means. We provide implementation details and we evaluate the effectiveness of our new architecture.

September 8 - Traffic Analysis Against Low-Latency Anonymity Networks Using Available Bandwidth Estimation

Sambuddho Chakravarty, PhD student, Columbia University

We introduce a novel remotely-mounted attack that can expose the network identity of an anonymous client, hidden service, and anonymizing proxies. To achieve this, we employ single-end controlled available bandwidth estimation tools and a colluding network entity that can modulate the traffic destined for the victim. To expose the circuit including the source, we inject a number of short bursts or one large burst of traffic. Although timing attacks have been successful against anonymity networks, they require either a Global Adversary or the compromise of a substantial number of anonymity nodes. Our technique does not require compromise of, or collaboration with, any such entity.

To validate our attack, we performed a series of experiments using different network conditions and locations for the adversaries on both controlled and real-world Tor circuits. Our results demonstrate that our attack is successful in controlled environments. In real-world scenarios, even an under-provisioned adversary with only a few network vantage points can, under certain conditions, successfully identify the IP address of both Tor users and Hidden Servers. However, Tor's inherent circuit scheduling results in limited quality of service for its users. This at times leads to increased false negatives and it can degrade the performance of our circuit detection. We believe that as high speed anonymity networks become readily available, a well-provisioned adversary, with a partial or inferred network "map", will be able to partially or fully expose anonymous users.