Track: Search
Paper Title:
Search Engines and their Public Interfaces: Which APIs are the Most Synchronized?
Authors:
Abstract:
Researchers studying commercial search engines often collect data
using the application programming interface (API) or by
"scraping" results from the web user interface (WUI), but
anecdotal evidence suggests the interfaces produce different
results. We provide the first in-depth quantitative analysis
of the results produced by the API and WUI of Google, MSN
and Yahoo. After submitting a variety of queries
to the interfaces over five months, we found significant discrepancies
in several categories. Our findings suggest that the
API indexes are not older than the WUI indexes, but they are
probably smaller for Google and Yahoo. Researchers may use our findings to
better understand the differences between the interfaces and
choose the best API for their particular type of queries.