Wednesday, 4:00-5:30 PM
Chair: Paul Bennet
Beyond Position Bias: Examining Result Attractiveness as a Source of Presentation Bias in Clickthrough Data
Yisong Yue, Hein Roehrig, Rajan Patel
Leveraging clickthrough data has become a popular approach for evaluating and optimizing information retrieval systems. Although data is plentiful, one must take care when interpreting clicks, since user behavior can be affected by various sources of presentation bias. While the issue of position bias in clickthrough data has been the topic of much study, other presentation bias effects have received comparatively little attention. For instance, since users must decide whether to click on a result based on its summary (e.g., the title, URL and abstract), one might expect clicks to favor “more attractive” results. In this paper, we examine result summary attractiveness as a potential source of presentation bias. This study distinguishes itself from prior work by aiming to detect systematic biases in click behavior due to attractive summaries inflating perceived relevance. Our experiments conducted on a commercial web search engine show substantial evidence of presentation bias in clicks towards results with more attractive titles.
Time is of the Essence: Improving Recency Ranking Using Twitter Data
Anlei Dong, Ruiqiang Zhang, Pranam Kolari, Bai Jing, Yi Chang, Fernando Diaz, Zhaohui Zheng, Hongyuan Zha
Twitter is a social network and micro-blogging service that has become very popular. People use Twitter to exchange messages, which contain fresh and useful information. This paper proposes a ranking system for web search that uses Twitter data to improve ranking results, especially their freshness. We treat URLs that have been referred to by Twitter users (called Twitter URLs) differently from regular URLs. A challenging problem for Twitter URLs is that, because they are fresh, they lack click information and anchor-text information, which prevents them from being promoted appropriately in ranking results. We analyze the unique characteristics of the Twitter microcosm, such as Twitter users’ following relationships and the texts of tweets, and we use them as new evidence for ranking Twitter URLs appropriately in web search. We then use a compositional modeling algorithm to make full use of the available data and the different categories of rank features. This approach solves the dilemma in recency ranking that fresh documents cannot be promoted appropriately because they lack favorable rank features that need to be aggregated over time. To evaluate ranking results, we not only incorporate recency demotion into discounted cumulative gain (DCG) for stale documents, but also use discounted cumulative freshness (DCF) to evaluate the freshest documents in ranking results. Experiments on real data illustrate the efficacy of this approach.
Visualizing Differences in Web Search Algorithms using the Expected Weighted Hoeffding Distance
Mingxuan Sun, Guy Lebanon, Kevyn Collins-Thompson
We introduce a new dissimilarity function for ranked lists, the expected weighted Hoeffding distance, that has several advantages over current dissimilarity measures for ranked search results. First, it is easily customized for users who pay varying degrees of attention to websites at different ranks. Second, unlike existing measures such as generalized Kendall’s tau, it is based on a true metric, preserving meaningful embeddings when visualization techniques like multi-dimensional scaling are applied. Third, our measure can effectively handle partial or missing rank information while retaining a probabilistic interpretation. Finally, the measure can be made computationally tractable and we give a highly efficient algorithm for computing it. We then apply our new metric with multi-dimensional scaling to visualize and explore relationships between the result sets from different search engines, showing how the weighted Hoeffding distance can distinguish important differences in search engine behavior that are not apparent with other rank-distance metrics. Such visualizations are highly effective at summarizing and analyzing insights on which search engines to use, what search strategies users can employ, and how search results evolve over time. We demonstrate our techniques using a collection of popular search engines, a representative set of queries, and frequently used query manipulation methods.