Unified Structure and Content Search for Personal Information Management Systems
Technical documentation   Open access

Wei Wang, Amelie Marian and Thu Nguyen
Rutgers University
2009
DOI: https://doi.org/10.7282/T3V40ZNB

Abstract

The amount of data that users are storing and accessing in personal information systems is growing massively. At the same time, the organization of this data is becoming more heterogeneous, with data spread across different organizational domains such as emails, music databases, and photo albums, some of which are structured by applications rather than users. Powerful search tools are needed to help users locate data in these rapidly expanding yet fragmented data sets. In this paper, we present a novel fuzzy search approach that considers approximate matches to structure and content query conditions. Our approach includes a scoring framework for computing unified relevance scores for potential answers. Critically, our framework uses unified data and query processing models so that structure conditions can be approximately matched by content inside files and vice versa. Our model also unifies external structure (directories) with internal structure (e.g., XML structure), allowing users to specify integrated queries that are matched to a single unified data domain. We propose indexes and algorithms for efficient query processing. Finally, we empirically evaluate our approach using a real data set. We show that our unified fuzzy search approach can leverage structure information to significantly improve search accuracy, yet is robust to mistakes in query conditions.
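To make the idea of unified structure and content matching concrete, here is a minimal Python sketch. It is not the scoring framework from the report: the function names, the `cross_penalty` weight, the use of `difflib` string similarity as the fuzzy matcher, and the product used to combine per-condition scores are all assumptions chosen for illustration. It shows only the general principle that a structure condition can be satisfied approximately by file content, and a content condition by the directory path, at a reduced score.

```python
from difflib import SequenceMatcher


def fuzzy(term, token):
    """Similarity in [0, 1] between a query term and a token (1.0 = identical)."""
    return SequenceMatcher(None, term.lower(), token.lower()).ratio()


def best_match(term, tokens):
    """Best fuzzy similarity of `term` against any token (0.0 if no tokens)."""
    return max((fuzzy(term, t) for t in tokens), default=0.0)


def unified_score(query, path, content, cross_penalty=0.5):
    """Score one file against a list of (kind, term) conditions.

    kind is "structure" or "content". A condition may be matched in the
    other dimension, but that match is scaled by `cross_penalty`
    (a hypothetical weight, not a value taken from the report).
    """
    path_tokens = [p for p in path.split("/") if p]
    content_tokens = content.split()
    score = 1.0
    for kind, term in query:
        in_path = best_match(term, path_tokens)
        in_content = best_match(term, content_tokens)
        if kind == "structure":
            s = max(in_path, cross_penalty * in_content)
        else:
            s = max(in_content, cross_penalty * in_path)
        score *= s  # assumed combination rule: product of condition scores
    return score


# Hypothetical files: path -> text content
files = {
    "/home/photos/halloween/img1.xml": "pumpkin party",
    "/home/docs/notes.txt": "halloween photos of the pumpkin",
}
query = [("structure", "halloween"), ("content", "pumpkin")]
ranked = sorted(
    files, key=lambda p: unified_score(query, p, files[p]), reverse=True
)
```

In this toy data the first file matches both conditions in their intended dimension, while the second satisfies the structure condition only through its content and is therefore ranked lower rather than excluded. A misspelled term such as "haloween" still scores close to an exact match, which mirrors the robustness to query mistakes that the abstract describes.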
Version of Record (VoR) Open Access