Track: Data Mining
Paper Title:
Page-level Template Detection via Isotonic Smoothing
Authors:
Abstract:
We develop a novel framework for the "page-level" template detection
problem. Our framework is built on two main ideas. The first is the
automatic generation of training data for a classifier that, given a
page, assigns a templateness score to every DOM node of the page. The
second is the global smoothing of these per-node classifier scores by
solving a regularized isotonic regression problem; the latter follows
from a simple yet powerful abstraction of templateness on a page. Our
extensive experiments on human-labeled test data show that our approach
detects templates effectively.
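
To make the smoothing idea concrete, the following is a minimal sketch of plain isotonic regression, not the method of the paper. The abstract describes a regularized problem over the whole DOM tree; this sketch instead applies the classic pool-adjacent-violators algorithm, without regularization, to the scores along a single root-to-leaf path. It assumes, purely for illustration, that templateness should not decrease from the root toward a leaf. The function name `isotonic_fit` and the example scores are hypothetical.

```python
from typing import List


def isotonic_fit(scores: List[float]) -> List[float]:
    """Least-squares non-decreasing fit via pool-adjacent-violators.

    Illustrative only: the input is a hypothetical list of per-node
    classifier scores ordered from root to leaf along one DOM path.
    """
    # Each block holds [sum of scores, number of scores]; its fitted
    # value is the mean of the scores pooled into it.
    blocks: List[List[float]] = []
    for s in scores:
        blocks.append([s, 1])
        # Pool the last two blocks while they violate monotonicity,
        # i.e. while the earlier block's mean exceeds the later one's.
        while (
            len(blocks) > 1
            and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count

    # Expand every block back into one fitted value per original node.
    fitted: List[float] = []
    for total, count in blocks:
        fitted.extend([total / count] * int(count))
    return fitted


# Made-up scores: the second and third nodes are out of order, so they
# are pooled to their common mean.
print(isotonic_fit([0.25, 0.75, 0.5, 1.0]))  # [0.25, 0.625, 0.625, 1.0]
```

The smoothed scores change only where the raw scores conflict with the assumed ordering, which is the sense in which a global fit can correct isolated per-node classifier errors.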