Yale University.  
Computer Science.  
     
Computer Science
Main Page
Academics
Graduate Program
Undergraduate Program
Course Information
Course Catalog
Course Web Pages
Research
Our Research
Research Areas
Research Projects
Publications
People
Faculty
Graduate Students
Research and Technical Staff
Administrative Staff
Alumni
Resources
Calendars
Computing Facilities
Yale Computer Science FAQ
Yale Workstation Support
Computing Lab
AfterCollege Job Resource
Department Information
Contact Us
History
Life in the Department
Life About Town
Directions
Job Openings
Faculty Positions
Useful Links
City of New Haven
Yale Applied Mathematics
Yale Faculty of Engineering
Yale University Home Page
Google Search
Yale Info Phonebook
Internal
Internal
 

Department Talk
April 14, 2009
11:00 a.m., AKW 200

Speaker:
Aryeh Kontorovich, Weizmann Institute of Science
Title:
Universal Kernel-Based Learning with Applications to Regular Languages

Abstract: We propose a novel framework for supervised learning of discrete concepts.
Since the 1970's, the standard computational primitive has been to find the most consistent hypothesis in a given complexity class. In contrast, in this paper we propose a new basic operation: for each pair of input instances, count how many concepts of bounded complexity contain both of them.

Our approach maps instances to a Hilbert space, whose metric is induced by a universal kernel coinciding with our computational primitive, and identifies concepts with half-spaces. We prove that all concepts are linearly separable under this mapping. Hence, given a labeled sample and an oracle for evaluating the universal kernel, we can efficiently compute a linear classifier (via SVM, for example) and use margin bounds to control its generalization error. Even though exact evaluation of the universal kernel may be infeasible, in various natural situations it is efficiently approximable.

Though our approach is general, our main application is to regular languages. Our approach presents a substantial departure from current learning paradigms and in particular yields a novel method for learning this fundamental concept class. Unlike existing techniques, we make no structural assumptions on the corresponding unknown automata, the string distribution or the completeness of the training set. Instead, given a labeled sample our algorithm outputs a classifier with guaranteed distribution-free generalization bounds; to our knowledge, the proposed framework is the only one capable of achieving the latter. Along the way, we touch upon several fundamental questions in complexity, automata, and machine learning.

Joint work with Boaz Nadler.