Bhattacharya, Indrajit and Godbole, Shantanu and Joshi, Sachindra and Verma, Ashish (2012) Cross-Guided Clustering: Transfer of Relevant Supervision across Tasks. In: ACM Transactions on Knowledge Discovery from Data, 6 (2).
Lack of supervision in clustering algorithms often leads to clusters that are not useful or interesting to human reviewers. We investigate whether supervision can be automatically transferred for clustering a target task, by providing a relevant supervised partitioning of a dataset from a different source task. The target clustering is made more meaningful for the human user by trading off intrinsic clustering goodness on the target task for alignment with relevant supervised partitions in the source task, wherever possible. We propose a cross-guided clustering algorithm that builds on traditional k-means by aligning the target clusters with source partitions. The alignment process makes use of a cross-task similarity measure that discovers hidden relationships across tasks. When the source and target tasks correspond to different domains with potentially different vocabularies, we propose a projection approach using pivot vocabularies for the cross-domain similarity measure. Using multiple real-world and synthetic datasets, we show that our approach improves clustering accuracy significantly over traditional k-means and state-of-the-art semi-supervised clustering baselines, over a wide range of data characteristics and parameter settings.
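The trade-off described in the abstract, between intrinsic clustering goodness on the target and alignment with source partitions, can be illustrated with a small sketch. This is not the authors' algorithm: it is a simplified k-means variant written for illustration, and it assumes that source and target share one feature space, that each source partition is summarized by a centroid, that squared Euclidean distance stands in for the cross-task similarity measure, and that a hypothetical weight `lam` controls the pull toward the source. The function name `guided_kmeans` and all parameters are assumptions introduced here.

```python
import numpy as np

def guided_kmeans(X, source_centroids, k, lam=1.0, n_iter=20, seed=0):
    """Illustrative k-means variant: each target centroid is pulled toward
    its nearest source-partition centroid with weight `lam` (assumption)."""
    rng = np.random.default_rng(seed)
    # Initialize target centroids from k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)

    def assign(c):
        # Squared Euclidean distance from every point to every centroid.
        d = ((X[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return d.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centroids)
        # Alignment step: match each target centroid to its closest source centroid.
        ds = ((centroids[:, None, :] - source_centroids[None, :, :]) ** 2).sum(axis=2)
        match = ds.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            weight = len(members) + lam
            if weight == 0:
                continue  # empty cluster and no source pull: keep old centroid
            anchor = source_centroids[match[j]]
            # Minimizer of sum ||x - mu||^2 + lam * ||mu - anchor||^2.
            centroids[j] = (members.sum(axis=0) + lam * anchor) / weight

    return assign(centroids), centroids

# Toy usage: two well-separated target groups and two source centroids.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
source = np.array([[0.3, 0.3], [10.3, 10.3]])
labels, centroids = guided_kmeans(X, source, k=2, lam=1.0)
```

Setting `lam=0` reduces the sketch to plain k-means, while larger values favor agreement with the source centroids over the fit to the target data. The pivot-vocabulary projection for differing vocabularies is not modeled here.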
|Item Type:||Journal Article|
|Additional Information:||Copyright for this article belongs to the ACM|
|Keywords:||Multitask; transfer; cluster alignment|
|Department/Centre:||Division of Electrical Sciences > Computer Science & Automation (Formerly, School of Automation)|
|Date Deposited:||17 Sep 2012 10:34|
|Last Modified:||17 Sep 2012 10:34|