Knowledge Extraction from Sensor Data


KAT follows a simple workflow-based processing approach. We use a general workflow that was derived by observing several different approaches to information abstraction in the domain of sensor data. The existing approaches either follow the workflow shown in the Figure below or implement parts of it. We identified the following main steps: pre-processing to bring the data into shape for further processing, dimensionality reduction to compress the data or reduce its feature vectors, feature extraction to find low-level abstractions in local sensor data, abstraction from raw data to higher-level abstractions, and finally semantic representation to make the abstracted data available to the end-user and/or to machines that interpret the data.
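The five stages above can be sketched as a simple pipeline. The function names and their toy implementations below are illustrative assumptions, not KAT's actual API:

```python
# Sketch of the five-stage workflow; all names and implementations here are
# illustrative assumptions, not KAT's actual interfaces.

def pre_process(raw):
    # Smooth with a 3-point moving average to prepare the data.
    return [sum(raw[max(0, i - 2):i + 1]) / len(raw[max(0, i - 2):i + 1])
            for i in range(len(raw))]

def reduce_dimensionality(data, step=2):
    # Keep every `step`-th sample to shrink the series.
    return data[::step]

def extract_features(data):
    # Low-level abstraction: summary statistics of the window.
    return {"min": min(data), "max": max(data), "mean": sum(data) / len(data)}

def abstract(features):
    # Map low-level features onto a hypothetical higher-level label.
    return "high-activity" if features["mean"] > 10 else "low-activity"

def to_semantic(label):
    # Expose the abstraction in a machine-readable form.
    return {"observation": {"label": label}}

raw = [9, 11, 10, 12, 14, 13, 15, 16]  # example raw sensor readings
result = to_semantic(abstract(extract_features(
    reduce_dimensionality(pre_process(raw)))))
```

Each stage consumes the previous stage's output, which is why the stages can be developed and swapped independently.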



The raw sensory data is pre-processed to prepare it for knowledge acquisition. Pre-processing can be done on the sensor nodes themselves to reduce transmission cost or to filter out unwanted data. It can include mathematical/statistical methods that smooth the data, such as moving-average windows, or methods from signal processing, such as band-, low-, or high-pass filters that focus on a certain frequency spectrum.
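Two of the smoothing techniques mentioned above can be sketched in a few lines; the sample values below are made up for illustration:

```python
# Illustrative pre-processing of a raw sensor stream: a moving-average window
# and a first-order low-pass (exponential) filter. The readings are assumed
# example values.

def moving_average(samples, window=3):
    """Smooth by averaging each sample with its predecessors in the window."""
    out = []
    for i in range(len(samples)):
        chunk = samples[max(0, i - window + 1):i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def low_pass(samples, alpha=0.5):
    """First-order IIR low-pass filter: y[n] = a*x[n] + (1 - a)*y[n-1]."""
    out = [samples[0]]
    for x in samples[1:]:
        out.append(alpha * x + (1 - alpha) * out[-1])
    return out

noisy = [20.0, 22.5, 19.8, 30.1, 20.3, 21.0]  # temperature with one spike
smoothed = moving_average(noisy)
filtered = low_pass(noisy)
```

Both methods damp the spike at 30.1 while keeping the overall level of the signal, which is the point of smoothing before further processing.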

Transmission cost can be reduced by sending only selected information about the current sampling window to the base station/gateway, such as the minimum and/or maximum values or the mean value of the window. Pre-processing is not limited to a single sensor node: some approaches use in-network processing to aggregate the data, e.g. by finding the minimum, mean, or maximum value across a set of sensor nodes before transmitting the data to a base station.
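As a minimal sketch of this idea, a node could collapse each sampling window into a three-value summary before transmission; the window size and payload shape are assumptions for illustration:

```python
# Instead of transmitting every sample in a sampling window, the node sends
# only a compact summary (min, max, mean). Window size and payload format are
# illustrative assumptions.

def summarise_window(window):
    """Collapse a sampling window into the values sent to the gateway."""
    return {
        "min": min(window),
        "max": max(window),
        "mean": sum(window) / len(window),
    }

window = [21.0, 21.4, 20.9, 22.1, 21.7]  # one sampling window of readings
payload = summarise_window(window)       # 3 numbers instead of 5 samples
```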

Besides aggregation, in-network techniques can also be used to improve the accuracy of the data by correlating it with data from neighbouring nodes.
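One way such a check could look is to compute the Pearson correlation between a node's readings and those of a neighbour; a low coefficient may flag a faulty sensor. The readings below and the interpretation threshold are illustrative assumptions:

```python
# Sketch: correlate a node's readings with a neighbouring node's readings.
# High correlation suggests the two sensors observe the same phenomenon.

import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

node_a = [20.1, 20.5, 21.0, 21.6, 22.0]  # assumed readings of one node
node_b = [19.8, 20.2, 20.8, 21.3, 21.9]  # neighbouring node, similar trend
r = pearson(node_a, node_b)              # close to 1.0 for agreeing nodes
```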

Dimensionality Reduction

Dimensionality reduction helps to handle the large amount of data that has to be processed and stored. It decreases the size and length of the samples by applying different methods to the data while preserving the important features and patterns.
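A minimal sketch of one such method is Piecewise Aggregate Approximation (PAA), a common reduction for sensor time series: the series is split into equal segments and each segment is replaced by its mean, shrinking the data while keeping its overall shape. The segment count and sample values are illustrative assumptions; the section above does not prescribe a specific method:

```python
# Piecewise Aggregate Approximation (PAA): replace each of `segments` equal
# chunks of the series by its mean value.

def paa(series, segments):
    """Reduce `series` to `segments` values, one mean per segment."""
    n = len(series)
    out = []
    for s in range(segments):
        start = s * n // segments
        end = (s + 1) * n // segments
        chunk = series[start:end]
        out.append(sum(chunk) / len(chunk))
    return out

series = [2.0, 2.2, 2.1, 8.0, 8.3, 8.1, 4.0, 4.2, 4.1]
reduced = paa(series, 3)  # 9 samples reduced to 3 segment means
```

The reduced series still shows the low-high-low pattern of the original, which is what "preserving the important features" means in practice.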



The initial website goes online and a first prototypical implementation can be downloaded. The source code will follow later.