# Clustering & Time Series

I have a multivariate dataset that changes over time. I have extracted (and normalised) some features and used k-means to generate clusters over the entire span of the dataset.

Now I want to see whether the clusters change significantly over time. So, working backwards and reducing the dataset by x months at a time, can I see a significant reduction in certain clusters?

This, I think, could fall within the realm of time series clustering. I was hoping to avoid complicating the approach, since the clusters are currently meaningful and the approach is relatively simple.

My intuition is to reduce the dataset by x-months and then cluster (using k-means) the data for comparison. However, I may be breaking the rules here, and oversimplifying a complicated problem.

Time-series clustering requires the sample size to stay the same while the features change over time; otherwise it makes little sense. In the question, though, the description suggests that the sample size increases over time. In that case, to see a significant reduction in certain clusters, one should use a fixed sample size: choose a fixed sample from the initial time period and see how its cluster sizes and memberships change over time.
Suppose the datasets at successive time points are
$$X_{t_{0}} \supset X_{t_{1}} \supset X_{t_{2}}$$
with corresponding clusterings $$C_{0}, C_{1}, C_{2}$$, where each $$C$$ is essentially a table of instances and their cluster memberships. To judge how the clustering changes, take a sample at $$t_{0}$$ such that $$X_{0} \supset X_{t_{0}}$$, then track how the membership and cluster sizes of $$X_{0}$$ change across the clusterings $$C_{0}, C_{1}, C_{2}$$. This gives a good idea of whether there are “reductions” (significant changes) across the different clusterings, provided that $$X_{0}$$ is representative over time.
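
The tracking step could be sketched roughly as follows. This is only an illustration, assuming each clustering is already available as a membership table (a dict from instance id to cluster label) and that the fixed sample appears in every table. The names `cohort_sizes`, `rand_index`, `C0`, `C1`, `C2` and `cohort`, and the toy labels, are hypothetical. Because k-means numbers its clusters arbitrarily on each run, the comparison uses the Rand index, which only asks whether pairs of instances stay together or apart, rather than comparing raw labels.

```python
from collections import Counter
from itertools import combinations


def cohort_sizes(membership, cohort):
    """Count how many instances of the fixed sample fall in each cluster."""
    return Counter(membership[i] for i in cohort)


def rand_index(membership_a, membership_b, cohort):
    """Fraction of cohort pairs on which two clusterings agree.

    A pair counts as an agreement when both clusterings put the two
    instances together, or both keep them apart. The measure ignores
    the label values themselves, so relabelled clusters do not matter.
    """
    pairs = list(combinations(sorted(cohort), 2))
    agree = sum(
        (membership_a[i] == membership_a[j]) == (membership_b[i] == membership_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical membership tables for three clusterings (toy data).
C0 = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 2, "f": 2}
C1 = {"a": 1, "b": 1, "c": 0, "d": 0, "e": 2}  # same grouping, labels swapped
C2 = {"a": 0, "b": 1, "c": 1, "d": 1}          # "b" has moved to another group

# Fixed sample that is present in every table (an assumption of this sketch).
cohort = ["a", "b", "c", "d"]

for name, table in [("C0", C0), ("C1", C1), ("C2", C2)]:
    print(name, dict(cohort_sizes(table, cohort)), rand_index(C0, table, cohort))
```

A Rand index near 1 against the reference clustering suggests the fixed sample is grouped the same way, while a drop (here 0.5 for `C2`) together with shifting cluster sizes points to the kind of change the question asks about. Whether a given drop is "significant" would still need a baseline, for example from repeated k-means runs on the same data.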