Birds over the bay at sunset, Mountain View

Ni Lao 劳逆


I work on machine learning, information retrieval, and natural language processing — now focused on learning to control machines and learning to create machines.

Google · large pretrained models | CMU LTI · PhD | 40+ publications | 6 patents

About

Previously I have studied a wide range of topics such as robotic soccer, computer system diagnosis, product search, and question answering.

I graduated from the Language Technologies Institute, School of Computer Science at Carnegie Mellon University. My thesis was advised by Professor William W. Cohen. I worked at Google and Apple on language understanding and question answering, and was chief scientist at SayMosaic. Now I work at Google on large pretrained models.

I TAed Machine Learning with Large Datasets (2012) and Machine Learning (2010). Here is my collection of interesting stuff.

Patents 6

Cheng HE, Ni Lao, Xiuqi Tan, Sumang Liu, Method and apparatus for searching historical data, US20190370398A1, 2019

Ni Lao, Chen Liang, Quoc V Le, John Blitzer, Neural question answering system, US20190130251A1, 2019

Ni Lao, Lukasz Mieczyslaw Kaiser, Nitin Gupta, Afroz Mohiuddin, Preyas Popat, Answer to question neural networks, US20180114108A1, WO2018097907A1, DE202017106363U1, GB2557014A, 2018

Ni Lao, Jiazhong NIE, Fan Yang, Natural language processing with an n-gram machine, US Patent App. 16/069,781, WO2019083519A1, 2017

A Subramanya, F Pereira, N Lao, J Blitzer, R Gupta, Querying a data graph using natural language queries, US Patent 10,810,193

Kevyn B Collins-Thompson, Ni Lao, Context-Aware Query Alteration, US Patent App. 13/043,500, 2012

Selected Publications

AGILE: GIScience Series 2, 1-21, 2021
Gengchen Mai, Krzysztof Janowicz, Ling Cai, Rui Zhu, Blake Regalia, Bo Yan, Meilin Shi, Ni Lao, SE‑KGE: A location‑aware Knowledge Graph Embedding model for Geographic Question Answering and Spatial Semantic Lifting
Transactions in GIS, 2020
Gengchen Mai, Krzysztof Janowicz, Bo Yan, Rui Zhu, Ling Cai, Ni Lao, Multi-Scale Representation Learning for Spatial Feature Distributions using Grid Cells
ICLR 2020
Notes: Symbolic representations are efficient and accurate, and that is how mammals represent positions and locations.
Mai et al, Semantically-Enriched Search Engine for Geoportals: A Case Study with ArcGIS Online
Proceedings of AGILE 2020, Chania, Crete, Greece
Gengchen Mai, Krzysztof Janowicz, Bo Yan, Rui Zhu, Ling Cai, Ni Lao, Contextual Graph Attention for Answering Logical Queries over Incomplete Knowledge Graphs
K-CAP 2019
Felix Wu, Boyi Li, Lequn Wang, Ni Lao, John Blitzer, Kilian Q. Weinberger, Integrated Triaging for Fast Reading Comprehension
Preprint, 2019
Jacob Biloki, Chen Liang, Ni Lao, Neural Program Planner for Structured Predictions
ICLR 2019, Workshop on Deep RL Meets Structured Prediction
Felix Wu, Boyi Li, Lequn Wang, Ni Lao, John Blitzer, Kilian Q. Weinberger, FastFusionNet: New State-of-the-Art for DAWNBench SQuAD
Technical Report, 2019
Gengchen Mai, Krzysztof Janowicz, Cheng He, Sumang Liu, Ni Lao, POIReviewQA: A Semantically Enriched POI Retrieval and Question Answering Dataset
12th Workshop on Geographic Information Retrieval (GIR 2018)
Chen Liang, Mohammad Norouzi, Jonathan Berant, Quoc Le, Ni Lao, Memory Augmented Policy Optimization for Program Synthesis with Generalization
NIPS 2018
Notes: Animals dream of memory traces of high emotional/motivational values. The optimal experience replay strategy in RL is to balance good and bad experiences focusing on the more surprising ones.
T. Mitchell, W. Cohen, E. Hruschka, P. Talukdar, B. Yang, J. Betteridge, A. Carlson, B. Dalvi, M. Gardner, B. Kisiel, J. Krishnamurthy, N. Lao, K. Mazaitis, T. Mohamed, N. Nakashole, E. Platanios, A. Ritter, M. Samadi, B. Settles, R. Wang, D. Wijaya, A. Gupta, X. Chen, A. Saparov, M. Greaves, J. Welling, Never-Ending Learning
Communications of the ACM, 2018
Juanzi Li, Ming Zhou, Guilin Qi, Ni Lao, Tong Ruan, Jianfeng Du, Knowledge Graph and Semantic Computing. Language, Knowledge, and Intelligence
Communications in Computer and Information Science, Springer, 2017
Fan Yang, Jiazhong Nie, William W. Cohen, Ni Lao, Learning to Organize Knowledge with N-Gram Machines
ICLR 2018 Workshop
Notes: Towards indexing meaning in text with a symbolic open-domain schema.
Chen Liang, Jonathan Berant, Quoc Le, Kenneth D. Forbus, Ni Lao, Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision
ACL 2017
Notes: We want a symbolic machine, which is good at large scale KG computation, to be controlled by a neural net, which is good at learning from data.
Felix Wu, Ni Lao, John Blitzer, Guandao Yang, Kilian Weinberger, Fast Reading Comprehension with Convnets
Technical Report, 2017
Ni Lao, Einat Minkov, William Cohen, Learning relational features with backward random walks
ACL 2015
Machine Learning Journal (MLJ 2015), Springer
T. Mitchell, W. Cohen, E. Hruschka, P. Talukdar, J. Betteridge, A. Carlson, B. Dalvi, M. Gardner, B. Kisiel, J. Krishnamurthy, N. Lao, K. Mazaitis, T. Mohamed, N. Nakashole, E. Platanios, A. Ritter, M. Samadi, B. Settles, R. Wang, D. Wijaya, A. Gupta, X. Chen, A. Saparov, M. Greaves, J. Welling, Never-Ending Learning
AAAI 2015
CoRR abs/1406.7445, 2014
Notes: Evaluate exponentially many CRF features efficiently by considering only the large gradients, which are very sparse.
Xin Luna Dong, Evgeniy Gabrilovich, Geremy Heitz, Wilko Horn, Ni Lao, Kevin Murphy, Thomas Strohmann, Shaohua Sun, Wei Zhang, Knowledge Vault: A Web-Scale Approach to Probabilistic Knowledge Fusion
KDD 2014
Notes: Automatic KG construction at Google scale.
Ni Lao, Amarnag Subramanya, Fernando Pereira, William W. Cohen, Reading The Web with Learned Syntactic-Semantic Inference Rules
EMNLP 2012
Notes: Read the web.
EMNLP 2011
Notes: To learn effectively we need to first set a budget for every direction/branch we try.
Machine Learning, 2010, Vol. 81(1), 53-67 (ECML 2010)
Ni Lao, Jun Zhu, Liu Liu, Yandong Liu, William W. Cohen, Efficient Relational Learning with Hidden Variable Detection
NIPS 2010
Notes: Where do new concepts come from? They come from the estimated gain of data likelihood.
KDD 2010
Notes: Randomness is your friend for both efficiency and quality.
KDD 2010
Notes: Where do new features come from? They come from the estimated gain of data likelihood.
NTCIR-7 Workshop, Japan, 2008
NTCIR-7 Workshop, Japan, 2008
Journal of Plant Ecology, 1(2): 143-145, 2008
Notes: Genetic Algorithm sucks. SVM (and max entropy) models rock.
Yiming Yang, Abhimanyu Lad, Ni Lao, Abhay Harpale, Bryan Kisiel, Monica Rogati, Utility-based information distillation over temporally sequenced documents
SIGIR, pp. 31-38, 2007
Notes: Simulated user experience from logs.
Chun Yuan, Ni Lao, Ji-Rong Wen, Jiwei Li, Zheng Zhang, Yi-Min Wang, Wei-Ying Ma, Automated Known Problem Diagnosis with Event Traces
EuroSys 2006
Notes: PC Genomics.
Ji-Rong Wen, Ni Lao, Wei-Ying Ma, Probabilistic Model for Contextual Retrieval
SIGIR 2004
Notes: You can find/create great data in an industry environment.
Dependable Systems and Networks (DSN) 2004
Jinyi Yao, Lao Ni, Fan Yang, Yunpeng Cai, Zengqi Sun, Technical Solutions of TsinghuAeolus for Robotic Soccer
RoboCup 2003, pp. 205-213
Notes: A two-time RoboCup world champion used dynamic programming and geometry (instead of statistical machine learning).

Manuscripts & Presentations

An interactive map of concepts from ancient philosophy across traditions — surfacing their equivalences, resonances, and contradictions in a single explorable graph.
Ni Lao, Neuroscience and AI, 2026
In this talk we compare the algorithms in neuroscience against their corresponding algorithms in AI.
I first introduce previous work in query understanding — weakly supervised semantic parsing and related issues such as symbolic representations for efficient inference, unbiased low-variance gradient estimation with experience replays, and sequence reranking model design and training. Then I discuss preliminary work in document understanding, aiming for generalizability, scalability and accountability beyond current large sequence models.
By invitation of Synced. Thanks to Patrick Nguyen and Esther Lee.
Based on an interview from Robin.ly
Ni Lao, Weakly Supervised Natural Language Understanding, JiangMen tutorial, 2019

Ni Lao, Weakly Supervised Natural Language Understanding, AIFrontiers tutorial, 2018

Ni Lao, Do Androids Dream of Great Success?, 2018

Ni Lao, Neural Symbolic Language Understanding, 2017

Ni Lao, Text Generation Survey, 2017

Ni Lao, Xipeng Qiu, Knowledge Acquisition, 2017

Ni Lao, NIPS 2016 Overview, 2016

Ni Lao, Neural Symbolic Machines, 2016

A lecture at CCF ADL65, with my take on the relationship between connectionism and symbolism.

Ni Lao, Elephant and AI, LTI Colloquium Report, Spring 2012

Ni Lao, Programming by Demonstrations and Verbal Commands, LTI Colloquium Report, Spring 2012

Ni Lao, Beyond Shallow Semantics, LTI Colloquium Report, Fall 2011

Ni Lao, CCG, Fractal, and Emergence, LTI Colloquium Report, Spring 2011

Ni Lao, Reinforcement Learning In An Unknown Domain (slides), 2011

Ni Lao, Probabilistic Ontology Model, LTI Colloquium Report, Fall 2010

Ni Lao, Split-Emit Process for Natural Language Generation, Advanced NLP seminar, 2009

Ni Lao, Jun Zhu, Contrastive Feature Induction for Efficient Structure Learning of Conditional Random Fields, 2009

Ni Lao, T. Mitamura, E. Nyberg, Tree Representations for Chinese Semantic Role Labeling, 2009

Ni Lao, Read The Web (slides), Advanced IR seminar, 2007

Ni Lao, Schema Extraction Model, Advanced IR seminar, 2007

Ni Lao, Knowledge Acquisition From Text — A Survey, Statistical NLP class, 2007

Thesis

PhD thesis, 2012. Efficient Random Walk Inference with Knowledge Bases (slides). Carnegie Mellon University

Master's thesis, 2006. Data Mining Problems in Automatic Computer Diagnosis. Tsinghua University

Bachelor's thesis, 2003. Mining Spatial-Temporal Data Using Constructive Induction. Tsinghua University

Code & Data

Code

2012 — Path Ranking Algorithm, a system for relational retrieval on heterogeneous graphs (github)

2006 — geoSVM, a predictive system for modeling species potential distributions based on SVM. See Wenyun's page

Data Sets

2012 — NELL v165, NELL knowledge graph in both triple format and PRA format

2010 — yeast2, updated yeast data with extra information about MeSH headings, chemicals and affiliations (321K entities, 6.1M links)

2010 — fly, a biological literature graph with 770K entities and 3.5M links

2010 — yeast, a biological literature graph with 164K entities and 2.8M links

Academic Services

2023: ACL*, EMNLP*, ICML, IJGIS, NeurIPS*, TALLIP, Computers & Security (*Area chair for large language models and reasoning)

2022: ACL, CoNLL, EMNLP, ICLR, KDD, NeurIPS, TGIS

2021: ACL, AAAI, CoNLL, EMNLP, ICLR, NAACL, SIGIR, TALLIP, GeoAI, NLP4ConvAI

2020: ACL, AAAI, COLING, CoNLL, EACL, EMNLP, ICLR, ICML, IJCAI, NeurIPS, SIGIR, TALLIP, TKDE

I co-organized the Deep RL Meets Structured Prediction workshop, 2019 — ICLR page, homepage, intro slides

2019: ACL, AAAI, CCL, CoNLL, EMNLP, ICLR, IJCAI, NAACL, SIGIR, TKDE

2018: ACL, CCKS, COLING, EMNLP, NAACL, NLPCC, NIPS, SIGIR

2017: ACL, CCKS, EMNLP, IJCAI, IJCNLP, SIGIR, TKDE, WSDM, Google Research Grants

2016: CIKM, COLING, IJCAI, NAACL, TKDE, WWW, Google Research Grants

2015: CIKM, ICML, IJCAI, MLJ, NIPS, TKDE

Since 2012 I have been the manager of the Machine Learning News Google Group, which seems to be quite popular in academia.