17 July 2017: Jiawei Han (Joint LFCS/CISA Seminar)

Title:

Mining Structures from Massive Text Data: A Data-Driven Approach

 

Speaker: 

Jiawei Han, Abel Bliss Professor, Department of Computer Science, University of Illinois at Urbana-Champaign

 

ABSTRACT

Real-world big data are largely unstructured, interconnected, and in the form of natural language text. One of the grand challenges is to turn such massive data into structured networks and actionable knowledge. We propose a text mining approach that requires only distant or minimal supervision but relies on massive data. We show that quality phrases can be mined from such massive text data, that types can be extracted from the same data with distant supervision, and that relationships among entities can be discovered by meta-path guided network embedding. Finally, we propose a D2N2K (i.e., data-to-network-to-knowledge) paradigm: first turn data into relatively structured information networks, and then mine such text-rich and structure-rich networks to generate useful knowledge. We show that such a paradigm represents a promising direction for turning massive text data into structured networks and useful knowledge.

 

Short bio:

Jiawei Han is Abel Bliss Professor in the Department of Computer Science, University of Illinois at Urbana-Champaign. His research covers data mining, information network analysis, database systems, and data warehousing, with over 600 journal and conference publications. He has chaired or served on many program committees of international conferences, including as PC co-chair for the KDD, SDM, and ICDM conferences and as Americas Coordinator for the VLDB conferences. He also served as the founding Editor-in-Chief of ACM Transactions on Knowledge Discovery from Data and as the Director of the Information Network Academic Research Center supported by the U.S. Army Research Lab, and is the co-Director of KnowEnG, an NIH-funded Center of Excellence in Big Data Computing. He is a Fellow of ACM and a Fellow of IEEE, and received the 2004 ACM SIGKDD Innovations Award, the 2005 IEEE Computer Society Technical Achievement Award, and the 2009 W. Wallace McDowell Award from the IEEE Computer Society. His co-authored book "Data Mining: Concepts and Techniques" has been widely adopted as a textbook worldwide.