November 10, 2019, New York

Data: Acquisition to Analysis

A SenSys/BuildSys 2019 Workshop


The workshop will be held in Hamilton Hall at Columbia University.

Due to #CoffeeBreakChaos, times were updated this morning. Please double-check the schedule!


8:30-9:00, Hamilton Hall entrance area


9:10-9:20, Hamilton 517

Speaker: Pat Pannuto

Slides: (pptx) (pdf)

Session 1: Who’s There? – Finding People and Their State of Being

9:20-10:20, Hamilton 517

Chair: Pat Pannuto

Occupancy Sensing and Activity Recognition with Camera and Wireless Sensors

Yang Zhao, Peter Tu (GE Research); Ming-Ching Chang (University at Albany, SUNY)
Dataset: DOI

Dataset: Occupancy Detection, Tracking, and Estimation Using a Vertically Mounted Depth Sensor

Fabricio Flores (Carnegie Mellon University); Sirajum Munir (Bosch Research & Technology Center); Matias Quintana (National University of Singapore); Anand Krishnan Prakash (Lawrence Berkeley National Laboratory); Mario Berges (Carnegie Mellon University)
Dataset: DOI

Dataset: User side acquisition of People-Centric Sensing in the Internet-of-Things

Chenguang Liu, Jie Hua, Tomasz Kalbarczyk, Sangsu Lee, Christine Julien (The University of Texas at Austin)
Dataset: DOI

Dataset: Inferring Thermal Comfort using Body Shape Information Utilizing Depth Sensors

Sirajum Munir (Bosch Research and Technology Center); Jonathan Francis (Carnegie Mellon University, Bosch Research Pittsburgh); Matias Quintana (National University of Singapore); Nadine von Frankenberg (Technical University of Munich); Mario Berges (Carnegie Mellon University)
Dataset: DOI

Session 1 Breakout & Discussion


Coffee Break


Session 2a (BuildSys Track): The Built Environment – Instrumented Buildings and Their Data

11:10-12:25, Hamilton 517

Chair: Fisayo Caleb Sangogboye

Dataset: An Open Dataset and Collection Tool for BMS Point Labels

Gabe Fierro, Sriharsha Guduguntla, David E. Culler (UC Berkeley)
Dataset: DOI
Slides: (pptx) (pdf)

Dataset: Occupancy Presence and Trajectory Dataset from an Instrumented Public Building

Anooshmita Das, Jens Hjort Schwee, Emil Stubbe Kolvig-Raun, Mikkel Baun Kjærgaard (University of Southern Denmark)
Dataset: DOI

Dataset: Tracing Indoor Solar Harvesting

Lukas Sigrist, Andres Gomez, Lothar Thiele
Dataset: DOI
Slides: (link)

Session 2a Breakout & Discussion


BuildSys DATA Breakout & Discussion


Session 2b (SenSys Track): The Wireless Channel – More than Meets the Eye

11:10-12:25, Hamilton 516

Chair: Carlos Ruiz Dominguez

Channel State Information (CSI) analysis for Predictive Maintenance using Convolutional Neural Network (CNN)

Prachi Bagave, Jeroen Linssen, Wouter Teeuw (Saxion, University of Applied Sciences); Jeroen Klein Brinke, Nirvana Meratnia (Pervasive Systems group at the University of Twente)
Dataset: DOI

Dataset: Wireless Link Quality Estimation on FlockLab – and Beyond

Romain Jacob, Reto Da Forno, Roman Trueb, Andreas Biri, Lothar Thiele (ETH Zurich)
Dataset: DOI
Slides: (pdf)

Dataset: Channel state information for different activities, participants and days

Jeroen Klein Brinke, Nirvana Meratnia (Pervasive Systems group at the University of Twente)
Dataset: DOI

Session 2b Breakout & Discussion


SenSys DATA Breakout & Discussion




Session 3: The Great Outdoors – Sensing at Farm and City-Scale

13:30-14:20, Hamilton 517

Chair: Arian Prabowo

Designing a Vehicle Mounted High Resolution Multi-Spectral 3D Scanner - Concept Design

Gregory Meyers, Chengxi Zhu, Martin Mayfield, Danielle Densley Tingley, Jon Willmott, Daniel Coca (University of Sheffield)

Dataset: Horse Movement Data and Analysis of its Potential for Activity Recognition

Jacob Kamminga, Nirvana Meratnia, Paul Havinga (University of Twente)
Dataset: DOI

Dataset: LoRa Underground Farm Sensor Network

Rachel Cardell-Oliver (The University of Western Australia); Christof Huebner (University of Applied Sciences Mannheim); Matthias Leopold, Jason Beringer (The University of Western Australia)
Dataset: DOI

Session 3 Breakout & Discussion


Session 4: Details Matter – Looking Deeply at Sensed Data

14:20-15:15, Hamilton 517

Chair: Gabe Fierro

A Signal Quality Assessment Metrics for Vibration-based Human Sensing Data Acquisition

Yue Zhang, Lin Zhang (Tsinghua University); Hae Young Noh, Pei Zhang (Carnegie Mellon University); Shijia Pan (University of California Merced)

Dataset: Indoor Localization with Narrow-band, Ultra-Wideband, and Motion Capture Systems

Usman Raza, Aftab Khan, Roget Kou, Tim Farnham, Thajanee Premalal, Aleksandar Stanoev, William Thompson (Toshiba Research Europe Limited)
Dataset: DOI

Synchronization between Sensors and Cameras in Movement Data Labeling Frameworks

Jacob W. Kamminga (University of Twente); Michael Jones, Kevin Seppi (Brigham Young University); Nirvana Meratnia, Paul J.M. Havinga (University of Twente)

Session 4 Breakout & Discussion


Coffee Break


Group Discussion

15:45-16:40, Hamilton 517

Moderators: The DATA'19 PC (Shijia, Flora, Mikkel, and Pat)

Closing Remarks

16:40-16:45, Hamilton 517


As enthusiasm for and success of the Internet of Things (IoT), Cyber-Physical Systems (CPS), and Smart Buildings grow, so too do the volume and variety of data collected by these systems. How do we ensure that this data is of high quality, and how do we maximize the utility of collected data so that many projects can benefit from the time, cost, and effort of deployments?

The Data: Acquisition To Analysis (DATA) workshop aims to look broadly at interesting data from interesting sensing systems. The workshop considers problems, solutions, and results from across the real-world data pipeline. We solicit submissions on unexpected challenges and solutions in the collection of datasets, on novel datasets of interest to the community, and on experiences and results (explicitly including negative results) in using prior datasets to develop new insights.

The workshop aims to bring together application researchers and algorithm researchers in the sensing systems and building domains, promoting breakthroughs by integrating the generators and users of datasets. The workshop will foster cross-domain understanding of both application needs and data collection limitations.


The workshop seeks contributions across two major thrusts, but is open to a broad view of interesting questions around the collection, dissemination, and use of data as well as interesting datasets:

The collection and use of data

  • Challenges and solutions in data collection, especially around security and privacy
  • Expectations and norms for data collection from sensor networks, especially those that involve human factors
  • Novel insights from existing datasets
  • Metadata management for complex datasets
  • Synthetic data, including its generation, application, and utility
  • Success stories: key properties of useful datasets and how to generalize these
  • Shortcomings of prior datasets, and how to address these in the future
  • Position papers on policies and norms from experimental design through data management and use are explicitly welcomed

New and interesting datasets, including but not limited to:

  • Shopping-related sensing data
  • Animal-related data or sensed data
  • Anonymized or synthetic health-related data
  • Indoor localization, especially unprocessed/unfiltered physical layer measurements
  • Smart building, occupancy, motion data, energy, human comfort, vibration, BIM
  • Vehicular, GPS, cellular, or Wi-Fi traces
  • Reproductions of prior work that validate, refute, or enhance results

To enable the longevity of submitted datasets, we plan to provide a central repository where the data and information about the data can be archived for at least 5 years.

Submission Guidelines

Submissions may range from 1 to 5 pages in PDF format, excluding references, using the standard ACM conference template. Submissions are strongly encouraged to use only as much space as needed to clearly convey the significance of the work. We fully expect many submissions, especially datasets, to use only 1-2 pages, but wish to allow ample space for those interested in fully elucidating positions on data collection and use or insights from reproducibility efforts. Please do not anonymize your submission. Upon acceptance, instructions for the final camera-ready abstract will be provided.

Dataset submissions should prefix paper titles with “Dataset: ” and must include a description of the dataset as well as a reasonable accompanying data sample. Once accepted, a fully described dataset must be shared to a public repository by the camera-ready deadline. License issues will generally be resolved by following a procedure similar to that of CRAWDAD, and special treatments, if needed, will be discussed separately with the TPC chairs.

Each accepted submission is required to have at least one author attend the workshop and present to the workshop attendees.

Important Dates

Submission deadline: August 12th, 2019, AOE (extended from August 2nd, 2019, AOE), submit HERE

Notifications: August 24th, 2019, AOE (previously August 12th, 2019, AOE)

Camera-ready: September 20th, 2019, AOE (previously September 13th, 2019, AOE)

Useful links

Submission Site (HotCRP)

DATA’19 in the ACM Proceedings


Workshop Chairs

Pat Pannuto, University of California, Berkeley

Shijia Pan, University of California, Merced

Flora Salim, RMIT University

Mikkel Baun Kjærgaard, University of Southern Denmark

Dataset Management Chair

Chien-Chun Ni, Yahoo! Research

Advising Committee

Jie Gao, Stony Brook University

Pei Zhang, Carnegie Mellon University

Prabal Dutta, University of California, Berkeley

Jie Liu, Harbin Institute of Technology

Technical Program Committee

Jorge Ortiz, Rutgers University

Wen Hu, University of New South Wales (UNSW)

Olga Saukh, TU Graz

Jun Han, National University of Singapore

Brano Kusy, CSIRO

Chris Xiaoxuan Lu, University of Oxford

Wan Du, University of California, Merced

Arun Vishwanath, IBM Research Australia

Clayton Miller, National University of Singapore

Rachel Cardell-Oliver, The University of Western Australia

Zoltan Nagy, UT Austin

Mohammad Saiedur Rahaman, RMIT University

Fisayo Caleb Sangogboye, University of Southern Denmark

Yongli Ren, RMIT University

Jason Koh, University of California, San Diego


The 2nd DATA workshop is part of (co-located with) SenSys/BuildSys 2019.

For venue details, visa information, etc., please visit the SenSys venue page.