
Distributed Systems, Stream Processing Applications and Energy/Data Management

I lead research activities on Edge Computing and data-driven applications in the Rutgers Discovery Informatics Institute (RDI2).

The current focus of this work is on enabling emerging applications to be deployed effectively on a fluid ecosystem in which distributed resources and services are programmatically aggregated on demand to support data-driven application workflows. As these classes of applications mature and their data and processing requirements grow, they cannot be sustained solely by edge resources or by sending all the data to centralized locations. Instead, they require a fluid integration of resources at the edge, at the core, and along the data path to support dynamic, data-driven workflows; that is, they need to leverage a computing continuum.

Our research addresses key questions, including (1) how to account for what, where, and when data are collected and analyzed; (2) how to program services to respond to changes in application behavior or data variability; (3) how to react to changes and trigger rules associated with the content of the data; and (4) how to account for user constraints and quality of service when delivering data products.

Enabling edge-to-cloud integration: Dramatic changes in the technology landscape, marked by the increasing scale and pervasiveness of compute and data, have led to a proliferation of edge applications that aim to process data effectively and in a timely manner. As the level and fidelity of instrumentation increase and the types and volumes of available data grow, new classes of applications are being explored that seamlessly combine real-time data with complex models and data analytics to monitor and manage systems of interest. We are currently exploring a range of approaches for improving the deployment of data-driven workflows.

Related publications:

  • Daniel Balouek-Thomert, Eduard Gibert Renart, Ali Reza Zamani, Anthony Simonet, Manish Parashar
    Towards a computing continuum: Enabling edge-to-cloud integration for data-driven workflows
    IJHPCA2019, 2019
  • Eduard Gibert Renart, Daniel Balouek-Thomert, Manish Parashar
    An Edge-Based Framework for Enabling Data-Driven Pipelines for IoT Systems
    Paise2019, 2019
  • Eduard Renart, Daniel Balouek-Thomert, Manish Parashar
    Pulsar: Enabling Dynamic Data-Driven IoT Applications
    In FAS*W 2017: The IEEE 2nd International Workshops on Foundations and Applications of Self* Systems, Tucson, United States, September 2017 [bibtex] [hal]

Runtime management of stream processing applications: Modern cyberinfrastructures (CIs) bring content produced by remote data sources, such as sensors and scientific instruments, to end users and workflow applications. Maintaining data quality and resolution and delivering data on time across a growing number of computing, storage, and network resources requires reactive systems that can adapt to changing demands. By expressing the dynamic state of resources in the context of edge and in-transit computing, this research considers resource utilization, approximation techniques, and users' constraints to adapt workflows on heterogeneous, geo-distributed resources.
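As a rough illustration of the trade-off between data resolution and on-time delivery, the sketch below picks the highest quality level whose transfer still meets a deadline. It is a minimal example under stated assumptions, not the mechanism used in the publications listed here: the `QualityLevel` type, the `LEVELS` table, the `pick_quality` function, and the simple size/bandwidth transfer model are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class QualityLevel:
    name: str
    size_fraction: float  # share of the full-resolution data volume


# Hypothetical quality levels, ordered from highest to lowest quality.
LEVELS = (
    QualityLevel("full", 1.0),
    QualityLevel("half", 0.5),
    QualityLevel("quarter", 0.25),
)


def pick_quality(
    full_size_mb: float,
    bandwidth_mb_s: float,
    deadline_s: float,
    levels: Sequence[QualityLevel] = LEVELS,
) -> Optional[QualityLevel]:
    """Return the highest-quality level that can be delivered on time.

    Assumes transfer time is simply data volume divided by bandwidth.
    Returns None when even the lowest level would miss the deadline.
    """
    if bandwidth_mb_s <= 0:
        raise ValueError("bandwidth must be positive")
    for level in levels:
        transfer_s = full_size_mb * level.size_fraction / bandwidth_mb_s
        if transfer_s <= deadline_s:
            return level
    return None
```

A runtime manager could call `pick_quality` again whenever the measured bandwidth or the user's deadline changes, which is one simple way to express adaptation to changing demands.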

Related publications:

  • Ali Reza Zamani, Daniel Balouek-Thomert, JJ Villalobos, Ivan Rodero, Manish Parashar
    Submarine: A Subscription-based Data Streaming Framework for Integrating Large Facilities and Advanced Cyberinfrastructure
    Concurrency and Computation: Practice and Experience (CCPE), John Wiley and Sons, Ltd, USA, 2019 [bibtex] [hal] [DOI]
  • Ali Reza Zamani, Daniel Balouek-Thomert, JJ Villalobos, Ivan Rodero, Manish Parashar
    Runtime Management of Data Quality for Scientific Observatories Using Edge and In-Transit Resources
    In SBACPAD 2018: 30th International Symposium on Computer Architecture and High-Performance Computing, Lyon, France, September 2018 [bibtex] [hal]
  • Ali Reza Zamani, Moustafa AbdelBaky, Daniel Balouek-Thomert, Ivan Rodero, Manish Parashar
    Supporting Data-Driven Workflows Enabled by Large Scale Observatories
    In e-Science 2017: IEEE 13th International Conference on e-Science, Auckland, New Zealand, October 2017 [bibtex] [hal]

Edge-based middleware for Internet of Things applications: With the proliferation of the Internet of Things (IoT) paradigm, the number of devices connected to the Internet keeps growing, and these devices generate unprecedented amounts of data at the edges of the infrastructure. Although these data hold great potential, identifying and processing the relevant data points hidden in streams of unimportant data, in near real time, remains a significant challenge. Existing stream processing platforms require the data to be transported to the cloud for processing, which introduces latencies that can prevent timely decision making or reduce the amount of data processed. To tackle this problem, we designed an IoT edge framework called R-Pulsar that extends cloud capabilities to local devices and provides a programming model for deciding what, when, and where data are collected and processed.
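To make the idea of content-driven decisions concrete, the following sketch routes a sensor reading according to rules evaluated on its content. It is only an illustration and does not reproduce the R-Pulsar API: the `Rule` type, the `route` function, the field name `temp_c`, and the target labels `"edge"`, `"cloud"`, and `"drop"` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

Reading = Dict[str, float]


@dataclass(frozen=True)
class Rule:
    name: str
    condition: Callable[[Reading], bool]  # "what": predicate on the content
    target: str                           # "where": e.g. "edge" or "cloud"


def route(
    reading: Reading, rules: Sequence[Rule], default: str = "drop"
) -> Tuple[str, str]:
    """Return (rule name, target) for the first rule the reading satisfies.

    Rules are checked in order, so more urgent rules should come first.
    Readings that match no rule fall back to the default target.
    """
    for rule in rules:
        if rule.condition(reading):
            return rule.name, rule.target
    return "no-match", default


# Hypothetical rules: react locally to critical values, archive warm ones.
RULES = (
    Rule("overheat", lambda r: r["temp_c"] > 80.0, "edge"),
    Rule("archive", lambda r: r["temp_c"] > 40.0, "cloud"),
)
```

With these rules, a reading of 95 °C is handled at the edge, a reading of 50 °C is forwarded to the cloud, and a reading of 20 °C is dropped, so only relevant data points leave the device.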

Overall layered architecture of edge-based data-intensive IoT system to achieve computing in the continuum.

Related publications:

  • Eduard Gibert Renart, Alexandre da Silva Veith, Daniel Balouek-Thomert, Marcos Dias de Assunção, Laurent Lefèvre and Manish Parashar
    Distributed Operator Placement for IoT Data Analytics Across Edge and Cloud Resources
    In CCGRID 2019: The 19th Annual IEEE/ACM International Symposium in Cluster, Cloud, and Grid Computing, Cyprus, May 2019 [bibtex] [hal]
  • Eduard Renart, Daniel Balouek-Thomert, Xuan Hu, Jie Gong, Manish Parashar
    Online Decision-Making Using Edge Resources for Content-Driven Stream Processing
    In e-Science 2017: 13th International Conference on e-Science, Auckland, New Zealand, October 2017 [bibtex] [hal]
  • Eduard Renart, Daniel Balouek-Thomert, Manish Parashar
    Pulsar: Enabling Dynamic Data-Driven IoT Applications
    In FAS*W 2017: The IEEE 2nd International Workshops on Foundations and Applications of Self* Systems, Tucson, United States, September 2017 [bibtex] [hal]

Applications and applied models: Applying learning models to large-scale IoT data is compute-intensive and requires significant computational resources. Existing approaches transfer the data from IoT devices to a central cloud, where inference is performed using a machine learning model. However, the network connecting the data capture source and the cloud platform can become a bottleneck. We address this problem by distributing the deep learning pipeline across edge and cloudlet/fog resources and by actively adapting communications, network, and resources.
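One simple way to reason about distributing a pipeline is to choose the stage at which data leave the edge. The sketch below does this statically for a linear pipeline by minimizing an estimated end-to-end latency. It is an assumption-laden illustration rather than the approach of the publications listed here: the `best_split` function, its cost model (per-stage compute times plus size divided by bandwidth), and the example numbers are hypothetical.

```python
from typing import Sequence, Tuple


def best_split(
    edge_s: Sequence[float],
    cloud_s: Sequence[float],
    sizes_mb: Sequence[float],
    bandwidth_mb_s: float,
) -> Tuple[int, float]:
    """Pick how many leading stages to run at the edge.

    edge_s[i] and cloud_s[i] are the compute times of stage i on each side.
    sizes_mb[k] is the data volume sent to the cloud when the first k stages
    run at the edge (sizes_mb[0] is the raw input), so it has one more entry
    than the stage lists. Returns (k, estimated latency in seconds).
    """
    n = len(edge_s)
    if len(cloud_s) != n or len(sizes_mb) != n + 1:
        raise ValueError("expected n stage times per side and n + 1 sizes")
    if bandwidth_mb_s <= 0:
        raise ValueError("bandwidth must be positive")

    best_k, best_latency = 0, float("inf")
    for k in range(n + 1):
        latency = (
            sum(edge_s[:k])                   # stages run at the edge
            + sizes_mb[k] / bandwidth_mb_s    # transfer of the intermediate data
            + sum(cloud_s[k:])                # remaining stages in the cloud
        )
        if latency < best_latency:
            best_k, best_latency = k, latency
    return best_k, best_latency
```

For example, with three stages that are slower at the edge but shrink the data early, running only the first stage at the edge avoids shipping the raw input over the network while leaving the heavier stages to the cloud. Re-evaluating the split when the measured bandwidth changes would be one possible form of adaptation.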

Illustration of an Earthquake Early Warning System infrastructure. Seismic sensors continuously transfer data to a centralized data center, where the data are processed. When P-waves are identified, an earthquake warning is broadcast to users.

Related publications:

  • Issam Raïs, Daniel Balouek-Thomert, Anne-Cécile Orgerie, Laurent Lefèvre, Manish Parashar
    Leveraging energy-efficient non-lossy compression for data-intensive applications
    In HPCS 2019 – 17th International Conference on High-Performance Computing & Simulation, Jul 2019, Dublin, Ireland. pp.1-7. [bibtex] [hal]
  • Edward Chuah, Arshad Jhumka, Samantha Alt, Daniel Balouek-Thomert, James C Browne, Manish Parashar
    Towards comprehensive dependability-driven resource use and message log-analysis for HPC systems diagnosis
    Journal of Parallel and Distributed Computing (JPDC), USA, 2019 [bibtex] [hal] [DOI]
  • Ioan Petri, Ali Reza Zamani, Daniel Balouek-Thomert, Omer Rana, Yacine Rezgui, Manish Parashar
    Ensemble-Based Network Edge Processing
    In UCC 2018: The 2018 IEEE/ACM 11th International Conference on Utility and Cloud Computing, Zurich, Switzerland, December 2018 [bibtex] [hal]
  • Muhammad Ali, Ashiq Anjum, M.U Yaseen, Ali Reza Zamani, Daniel Balouek-Thomert, Omer Rana, Manish Parashar
    Edge Enhanced Deep Learning System for Large-scale Video Stream Analytics
    In ICFEC 2018: The 2nd IEEE International Conference on Fog and Edge Computing, Washington, United States, May 2018 [bibtex] [hal]

Previous research:

My Ph.D. research focused on implementing trade-off mechanisms between performance and energy consumption for large-scale cloud applications [Dissertation PDF]. This research was recognized with the 2015 IEEE CloudTech Best Paper Award and a 2015 CEFIPRA Grant for collaborative research between Inria, France and Mahindra Ecole Centrale, India.

Industry research focused on the energy efficiency of virtual machine placement for datacenters. [All Publications]