Managing Heterogeneity in Highly Distributed Stream, Cloud, and Sensor Systems
With the advent of low-cost wireless sensing devices, many environments are expected to be instrumented soon for security, scientific monitoring, environmental control, entertainment, and other purposes. Many fundamental questions remain about how to develop applications in this emerging sensor network world. Perhaps the most important is how to support rich, complex applications that may have confidentiality requirements, heterogeneous sensor types, varying connectivity levels, and timing constraints. The Aspen (Abstract Sensor Programming Environment) project focuses on the challenges of developing a programming environment and runtime system for this setting.
We are investigating a number of complementary topics and ideas:
- Complex analysis in a cluster/cloud setting: Many sensor and stream data items need complex analysis. Building upon ideas from MapReduce and from our ORCHESTRA distributed query engine, we are developing new techniques for supporting cluster computation with incremental updates over recursive operations (e.g., PageRank, optimization).
- Distributed coordination and control: Many complex computations need to be continuously rebalanced, redistributed, and replanned based on monitored activity, which is a form of adaptive processing. We are developing new declarative techniques to address these problems.
- New programming model: We are building upon a declarative style of programming to develop a new language, group-based programming, for complex sensor applications. The goal is to combine compositional, database-style declarative computation with constraints on timing, security, distribution, and actuation in a seamless way. This work is funded by NSF CNS-0721541.
- Security and privacy: We have studied how sensor network application security is affected by node-level compromise. We are developing further language constructs for specifying encryption levels and other properties for data along certain channels.
- Runtime monitoring and checking: We seek to develop techniques for monitoring performance and triggering events in response to constraint violations. This work is funded by NSF CNS-0721541.
- Home health care and hospital applications: We hope to develop a number of applications useful in home hospice and hospital care, which monitor patients and also connect patients with the care they need. This work is funded by NSF CNS-0721541.
- Declarative information integration and query optimization: The core programming model is based on database query languages. We are developing techniques for supporting schema mappings over streams, distributed in-network join computation, and recursive queries for regions. Importantly, we are developing techniques for performing distributed, decentralized optimization of such computations. This work is funded by NSF IIS-0713267.
- Stream algorithms: In a distributed setting, many nodes have limited resources and must use approximate algorithms to make decisions and capture synopses of system activity. This work is funded by NSF IIS-0713267.
- Interfacing to Java code: Many real control systems require Java, C, or other procedural code for sophisticated sensor data processing or decision-making. This work is funded by Lockheed Martin.
- Declarative monitoring and re-optimization: We seek to build a declarative infrastructure for monitoring the status of distributed query execution and for adaptive re-optimization. This work is funded by Lockheed Martin.
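
To make the stream-algorithms topic above more concrete, here is a minimal Python sketch of one well-known approximate synopsis, the Count-Min sketch. It is offered only as an illustration of the kind of small, mergeable summary a resource-limited node might keep; it is not taken from the Aspen project, and the class name, parameters, and sensor identifiers are all hypothetical.

```python
import hashlib


class CountMinSketch:
    """Fixed-size frequency synopsis (illustrative; not Aspen's algorithm).

    Estimates never undercount; hash collisions can only inflate them.
    """

    def __init__(self, width=256, depth=4):
        self.width = width    # counters per row
        self.depth = depth    # number of independent hash rows
        self.total = 0        # total count of all items added
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # Use the row number as a salt so each row acts as a different hash.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(2, "big"),
        ).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, item, count=1):
        """Record `count` occurrences of `item`."""
        self.total += count
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        """Return an upper-bound estimate of how often `item` was added."""
        return min(
            self.table[row][self._index(row, item)]
            for row in range(self.depth)
        )

    def merge(self, other):
        """Fold another node's sketch into this one (same dimensions only)."""
        if (self.width, self.depth) != (other.width, other.depth):
            raise ValueError("sketch dimensions must match to merge")
        self.total += other.total
        for row in range(self.depth):
            for col in range(self.width):
                self.table[row][col] += other.table[row][col]


# Two hypothetical nodes summarize their local readings, then combine them.
node_a = CountMinSketch()
node_b = CountMinSketch()
for _ in range(5):
    node_a.add("sensor-17")
for _ in range(3):
    node_b.add("sensor-17")
node_b.add("sensor-02")
node_a.merge(node_b)
print(node_a.estimate("sensor-17"), node_a.total)
```

Because sketches with the same dimensions merge by element-wise addition, each node can summarize its own activity in constant space and ship only the counter table to a coordinator. Larger `width` and `depth` values reduce overestimation at the cost of memory.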