Uploaded by GoogleTechTalks on Jun 7, 2008
Google Tech Talks
June 2, 2008
ABSTRACT
Feature-based structured models provide a flexible and elegant framework for various information extraction (IE) tasks. These include label sequences for traditional IE, segmentation models for entity-level extractions, and skip chain models for collective labeling. I will present efficient inference algorithms for finding the highest scoring (MAP) prediction for two interesting types of structured models in IE.
I will then present our recent results in max-margin training of such models. There are two popular formulations for maximum-margin training over structured output spaces: margin scaling and slack scaling. Margin scaling is extremely popular because it requires the same kind of MAP inference as prediction, but slack scaling is believed to be more accurate and better behaved. I will describe an efficient variational approximation to the slack scaling method that solves its inference bottleneck while retaining its accuracy advantage over margin scaling. Further, I argue that existing scaling approaches do not separate the true labeling comprehensively while generating violating constraints. I will propose a new max-margin trainer, PosLearn, that generates violators to ensure separation at each position of a decomposable loss function.
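To make the difference between the two formulations concrete, here is a minimal sketch that finds the most violated constraint under each one for a toy linear-chain model. It is not the speaker's algorithm: it assumes a Hamming loss, uses the standard structured-SVM definitions (margin scaling maximizes loss + score difference; slack scaling maximizes loss × (1 + score difference)), and enumerates every labeling by brute force, which is only feasible for tiny examples. All names (`most_violating`, `node`, `edge`) are hypothetical.

```python
from itertools import product


def hamming(y_true, y):
    """Decomposable loss (assumed here): number of positions that differ."""
    return sum(a != b for a, b in zip(y_true, y))


def score(node, edge, y):
    """Linear-chain score: per-position scores plus transition scores."""
    s = sum(node[t][y[t]] for t in range(len(y)))
    s += sum(edge[y[t - 1]][y[t]] for t in range(1, len(y)))
    return s


def most_violating(node, edge, y_true, scaling):
    """Brute-force search for the labeling that maximizes the slack.

    margin scaling: loss(y) + (score(y) - score(y_true))
    slack scaling:  loss(y) * (1 + score(y) - score(y_true))
    """
    y_true = tuple(y_true)
    s_true = score(node, edge, y_true)
    best, best_val = None, float("-inf")
    for y in product(range(len(edge)), repeat=len(y_true)):
        if y == y_true:
            continue  # the true labeling is never a violator
        loss = hamming(y_true, y)
        diff = score(node, edge, y) - s_true
        if scaling == "margin":
            val = loss + diff
        elif scaling == "slack":
            val = loss * (1 + diff)
        else:
            raise ValueError("scaling must be 'margin' or 'slack'")
        if val > best_val:
            best, best_val = y, val
    return best, best_val


# Toy data (made up): 3 positions, 2 labels, no transition preference.
node = [[1, 0.75], [1, 0.5], [1, 0.5]]
edge = [[0, 0], [0, 0]]
print(most_violating(node, edge, (0, 0, 0), "margin"))  # ((1, 1, 1), 1.75)
print(most_violating(node, edge, (0, 0, 0), "slack"))   # ((1, 0, 0), 0.75)
```

In this toy case margin scaling selects the labeling with the largest loss even though its score is already more than one unit below that of the true labeling, while slack scaling selects the low-loss labeling that actually falls inside the margin. The margin-scaling objective is a sum of per-position terms, so it can be maximized with the same dynamic program used for prediction; the product in the slack-scaling objective does not decompose that way, which is the inference bottleneck the abstract refers to.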
Speaker: Sunita Sarawagi
Sunita Sarawagi conducts research in databases, data mining, machine learning, and statistics. Her current research interests are information integration, graphical and structured models, and probabilistic databases. She is an associate professor at IIT Bombay. Prior to that, she was a research staff member at the IBM Almaden Research Center. She received her PhD in databases from the University of California at Berkeley and a bachelor's degree from IIT Kharagpur. She has several publications in databases and data mining, including a best paper award at the 1998 ACM SIGMOD conference, and holds several patents. She is on the editorial boards of ACM TODS, ACM TKDD, and the FnT for Machine Learning journal. She serves on the boards of directors of ACM SIGKDD and VLDB. She is program chair for the ACM SIGKDD 2008 conference and has served as a program committee member for the SIGMOD, VLDB, SIGKDD, ICDE, and ICML conferences.