|
Main
Page
Graduate
Program
Undergraduate
Program
Course Information
Course
Catalog
Course
Web Pages
Our
Research
Research Areas
Research
Projects
Publications
Faculty
Graduate
Students
Research
and Technical Staff
Administrative
Staff
Alumni
Calendars
Computing Facilities
Yale
Computer Science FAQ
Yale Workstation Support
Computing
Lab
AfterCollege
Job Resource
Contact
Us
History
Life in the Department
Life About Town
Directions
Faculty
Positions
City
of New Haven
Yale Applied Mathematics
Yale Faculty of Engineering
Yale
University Home Page
Google Search
Yale Info Phonebook
Internal |
|
Department Talk
April 14, 2009
11:00 a.m., AKW 200
Speaker: Aryeh
Kontorovich, Weizmann Institute of Science
Title: Universal Kernel-Based Learning with Applications
to Regular Languages
Abstract: We propose a novel framework
for supervised learning of discrete concepts.
Since the 1970's, the standard computational primitive has been to find
the most consistent hypothesis in a given complexity class. In contrast,
in this paper we propose a new basic operation: for each pair of input
instances, count how many concepts of bounded complexity contain both
of them.
Our approach maps instances to a Hilbert space, whose metric is induced
by a universal kernel coinciding with our computational primitive, and
identifies concepts with half-spaces. We prove that all concepts are linearly
separable under this mapping. Hence, given a labeled sample and an oracle
for evaluating the universal kernel, we can efficiently compute a linear
classifier (via SVM, for example) and use margin bounds to control its
generalization error. Even though exact evaluation of the universal kernel
may be infeasible, in various natural situations it is efficiently approximable.
Though our approach is general, our main application is to regular languages.
Our approach presents a substantial departure from current learning paradigms
and in particular yields a novel method for learning this fundamental
concept class. Unlike existing techniques, we make no structural assumptions
on the corresponding unknown automata, the string distribution or the
completeness of the training set. Instead, given a labeled sample our
algorithm outputs a classifier with guaranteed distribution-free generalization
bounds; to our knowledge, the proposed framework is the only one capable
of achieving the latter. Along the way, we touch upon several fundamental
questions in complexity, automata, and machine learning.
Joint work with Boaz Nadler.

|
 |