MONGOOSE: Ingest, Monitor, Rinse, Repeat

Uploaded on Oct 29, 2009

Google Tech Talk
October 23, 2009

ABSTRACT

Presented by Daniel Gruhl.

Data analytics technology is in high demand as people try to extract as much value as possible from their most valuable resource: the information around them, whether held within their organizations or freely and publicly available. Unfortunately, although many data analytics efforts are focused on a particularly interesting (and often difficult) question whose answer hopefully lies in the data, these projects tend to spend most of their cycles acquiring and ingesting data. Thus, the focus of these efforts tends to tilt away from data analysis and towards data ingestion. MONGOOSE is:

1) A suite of technologies that one can plug domain knowledge cartridges into and that outputs data suitable for OLAP or BI consumption. One plugs in small amounts of domain knowledge for pulling in unstructured, semi-structured and structured data, and MONGOOSE converts it all into structured form.

2) A Platform for Worst-Case Scenario Workflow Management. MONGOOSE is built on the assumption that failure happens and must be handled quickly and seamlessly, so that it does not stop or hinder information ingest.

3) A Platform for Community-Based Information Extraction around specific phenomena, whose output can be fed into statistical analysis tools.

Daniel Gruhl (dgruhl@almaden.ibm.com) is a research staff member in the Computer Science Department of IBM Almaden Research Center, San Jose, CA. Dan is currently in the Health Informatics research group. Dan specializes in very large scale text analytics for a variety of applications from healthcare to pop music. Dan co-architected IBM's Unstructured Information Management Architecture (UIMA), which is now the de facto standard for text analytics projects. He earned his Ph.D. in electrical engineering from the Massachusetts Institute of Technology in 2000 with thesis work on distributed text analytics systems. Dan was named in MIT's Technology Review Top 100 (TR 100) in 2004.

Varun Bhagwan (vbhagwan@us.ibm.com) is an advisory software engineer in the Computer Science Department of IBM Almaden Research Center, San Jose, CA. His interests lie in the fields of text analytics, data mining, machine learning/AI, internet technologies, and services science. Since joining IBM Research in 2001, Varun has worked at multiple levels of a large-scale text mining project, ranging from cluster management, to indexing a multi-billion page corpus, to crawling the internet. He is currently a member of the Health Informatics research group. Varun holds a Master's degree in Computer Science from the University of Florida, Gainesville and is currently pursuing a Ph.D. at the University of California, Santa Cruz.

Tyrone Grandison (tyroneg@us.ibm.com) manages the Intelligent Information Systems team in the Computer Science department at the IBM Almaden Research Center, San Jose, CA. Tyrone's research interests are in data disclosure management relevant and applicable to industry verticals. Over the years, Tyrone has worked in data privacy, RFID data management, privacy-preserving mobile data management and text analytics. Tyrone is a senior member of both the ACM and IEEE and was named Pioneer of the Year by NSBE in 2009. Tyrone received a Ph.D. from Imperial College, London and M.Sc. and B.Sc. degrees from the University of the West Indies, Mona, Jamaica.

Category:

Science & Technology

License:

Standard YouTube License

Top Comments

  • 14:40 - 15:06

    You can't complain about people changing their internal code. They made no guarantees to you and it's not their fault you made yourself dependent.

  • This was a cool talk. I like how he used real world examples from his work.

All Comments (5)

  • Obviously you can. I do it all the time

  • GOOGLE! 42!
