Next, the DTW algorithm calculates a cost matrix that allows us to determine the alignment costs between two time series. To this end, the equation in (2) is utilized, which computes these costs for each element . The cost matrix helps with finding the so-called warping path , which is the path with the lowest alignment costs. It is obtained by moving through the cost matrix in reverse order (i.e., ). Its individual positions are then applied to the distance matrix to yield the overall alignment costs (i.e., ). Subsequently, the alignment costs in are averaged as shown in equation (3), whereas refers to the total number of elements in the warping path. The lower the value, the more similar two time series are considered in their shape. Finally, after all distances are computed according to equation (3), an updated distance matrix is passed to the AHC algorithm.
AHC is a variant of hierarchical clustering that creates a binary, rooted hierarchy of clusters from the bottom up, meaning every data point, or walking trajectory as in our case, is a cluster at first [1]. It is able to work with time series of arbitrary shapes and remedies the challenge of poor initial clusters [2]. In a nutshell, the algorithm builds on the computed distance matrix from the DTW algorithm and, then, recursively applies a linkage criterium to update the distance matrix by merging elements with the shortest distance until no more new instances are left to be merged [2]. AlMahamid and Grolinger [2] recently achieved the best clustering performance with the UPGMA (unweighted pair group method with arithmetic mean) linkage criterium in AHC, hence we decided to use it in our research as well. The UPGMA algorithm was implemented following to the equation given in (4). Here, each distance () between a cluster, for instance, and (), and a new cluster , is the result of proportional averaging the distances of and . At each iteration, computed cluster distances are stored temporarily to be ultimately included in a dendrogram relating each and every cluster to one another.
Evaluating extracted clusters without assigned data labels is challenging and there is still no universally accepted technique, neither visually nor numerically, to do so [1]. However, as Aghabozorgi et al. [1] note, labeling by a human judge can capture an algorithm's strengths and shortcomings as ground truth in practice. Because AHC has a great visualization power [1], we chose to assess the performance of our implementation visually for the context of this research. To this end, we randomly selected a subset of 352 records from the entire data set (roughly 10% of records). We did so primarily to keep manual efforts (e.g., comparing walking trajectories by hand) and the overall computational costs to a reasonable minimum. As central visual tools for our assessment, we used the AHC algorithm's dendrogram (see Figure 2) and visualizations of walking trajectory clusters (see Figure 4).
The dendrogram in Figure 2 suggests that the data set contains one large, homogeneous group of walking trajectories. Specifically, a lot of trajectories were clustered at around the smallest computed distance of 0.40. Only a few stand in stark contrast to the aforesaid group. In one instance, the difference reaches as much as roughly 929 times the value (371.75). Apart from that, there is no other comparably homogeneous group observable in the data set.
In light of this observation, we then experimented with different cutoff levels in the dendrogram. To demonstrate better, how we went about during analysis and how the algorithm successfully distilled mobility behavior patterns, we, in the end, decided to define three cutoff levels: 12, 55, and 75 clusters, respectively (see Figure 2). While this decision is not conclusive in a mathematical sense, we did so, because, on the one hand, with a total of 12 clusters, intra-cluster distances were reduced notably by about one-sixth, or more. At this cutoff level, we were also able to present a mixture of clusters as an example (see Figure 4). On the other hand, we wanted to vividly illustrate the algorithm's ability to isolate the dominant mobility behavior mentioned before. With the cutoff levels of 55 (see Figure 3b) and 75 (see Figure 3c), were able to do this in an exemplary manner.
Our analysis led to the following conclusions. First, and foremost, the AHC algorithm was able to identify and group together two-dimensional walking trajectories with (dis-)similarity in shape. In our example, it became evident that, as intra-cluster distances gradually minimized, the algorithm increasingly better isolated the mobility behavior selected (see Figure 3). Therefore, what has been observed in the dendrogram initially, now became substantiated as the algorithm correctly merged the many walking trajectories recorded near people's main walking path to the left of both display installations. In this area, both Kinect sensors tracked the most people [14]. Second, at any given cutoff level, the algorithm is able to suggest potential outliers and clusters requiring further examination as shown in the example in Figure 4. Outliers, meaning one to many walking trajectories with rather large inter-cluster distances, can be quickly identified and may point to a unique, novel, or uncommon behavior. For the data set at hand, outliers are, for example, clusters C10 and C11 for which we found no other incidents similar in shape in the data set. On the contrary, cluster C1 indicates that a higher cutoff level would be necessary to unveil patterns underlying this cluster.
The present study is the first in its field to adopt both AHC and DTW in an effort to perform a data-driven identification of patterns in skeletal data. By demonstrating both viability and usefulness of this approach, we make a contribution towards the methodological framework for quantitative analyses of mobility behavior in an ambient display's wider context. Specifically, we add to existing research [8, 14] by proposing an approach able to (a) assist in evaluating skeletal data without any pre-existing knowledge and to (b) automatically suggest (dis-)similarities in people's mobility behavior based on walking trajectory characteristics. In our view, fully or partially automated clustering is an indispensable tool for drawing conclusions from larger data sets, which may involve many thousands of individual records. Such conclusions can guide, for instance, mixed-methods field deployment studies at the outset to design interviews and observations. Similarly, during later stages of a research endeavor, clustering information can be used to cross-validate findings from other methods. Throughout our research, we experienced that both the dendrogram and the visualizations at different cutoff levels assisted greatly in becoming familiar with the data set at hand more quickly. While our approach is not a replacement for the visual analysis by human experts, we show that it is of value in tandem with, or as a precursor to, manual analysis. In perspective, research like ours may aid embarking on understudied issues such as how usage changes over time in longitudinal deployments [10]. Furthermore, while we worked with one specific format of skeletal data provided by the Kinect camera, the AHC algorithm can be handily applied to formats of location tracking data drawn from other depth cameras (e.g., Stereolabs ZED 2). In such instances, adjustments only have to be made in terms of selecting the correct axes as input parameters. Skeletal data, such as in the case of the Kinect sensor, may also contain additional information beyond joint coordinates (e.g., user engagement). Such information can be equivalently used in the AHC algorithm to cluster data accordingly.
Finally, attention is drawn to research limitations. First, AHC and DTW have a computational complexity of and , respectively [1, 16]. Both algorithms cannot deal with very large data sets effectively [1, 7] hence we are examining means to optimize their computational runtimes. Previous research [16], for instance, recommends using a warping window parameter in DTW that lowers its overall complexity to . Second, walking trajectories by themselves evoke ambiguity in leaving the reason for the movement open to interpretation [10]. Therefore, qualitative methods such as interviews may be additionally required to cross-validate observations. Third, the quality of our evaluation depended greatly on our subjective judgment. To address this issue mathematically, we are planning to incorporate measures such as within-cluster sum of squares and the silhouette coefficient. In doing so, we prospectively will be able to determine the optimal number of clusters [2, 19]. Forth, our evaluation was conducted with only one particular data set. More examinations with other data sets are required to profoundly underline the algorithm's clustering performance. Fifth, we utilized solely two-dimensional data of cases where one person was tracked. Thus, means are to be developed to process skeletal data containing information about multiple people. Lastly, our random selection of approximately 10% of the data during evaluation may have, to a lesser or greater extent, affected the results as well (e.g., instances of patterns may have been underrepresented in the data set). Considering the above-mentioned optimizations regarding the algorithms' complexity, we are planning to work with larger data sets in the future.
This research envisions guidance on how to practically increase the level of automation during the analysis process of multi-dimensional skeletal data. It proposes using AHC, leveraging DTW as a distance measure, for this purpose. Algorithmically, walking trajectories distilled from a real-world skeletal data set were merged into coherent clusters, individually displaying (dis-)similarities in the underlying mobility behavior. By demonstrating our approach's actual usefulness, it is, arguably, a viable addition to the repertoire of existing qualitative and quantitative methods for longitudinal ambient display research in the wild. We see its main strengths in markedly reducing the manual effort necessary during analysis and in becoming familiar with a data set at hand more quickly. In the future, we will focus primarily on optimizing the computational complexity of the algorithms, evaluating pre-processing steps such as z-normalization, and extending the existing approach to suggest the optimal number of clusters.
This research is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)—project number 451069094.