I work on machine learning, information retrieval, and natural language processing. Previously I studied a wide range of topics such as robotic soccer, computer system diagnosis, product search, and question answering. Now I am interested in learning to control machines, and learning to create machines.
I graduated from the Language Technologies Institute, School of Computer Science at Carnegie Mellon University. My thesis was advised by Professor William W. Cohen. I worked at Google and Apple on language understanding and question answering. I was the chief scientist at SayMosaic. Now I work at Google on large pretrained models.
(Patents) Google Patents
Cheng HE, Ni Lao, Xiuqi Tan, Sumang Liu, Method and apparatus for searching historical data, US20190370398A1, 2019
Ni Lao, Chen Liang, Quoc V Le, John Blitzer, Neural question answering system, US20190130251A1. 2019
Ni Lao, Lukasz Mieczyslaw Kaiser, Nitin Gupta, Afroz Mohiuddin, Preyas Popat, Answer to question neural networks, US20180114108A1, WO2018097907A1, DE202017106363U1, GB2557014A. 2018
Ni Lao, Jiazhong NIE, Fan Yang, Natural language processing with an n-gram machine, US Patent App. 16/069,781, WO2019083519A1, 2017
A Subramanya, F Pereira, N Lao, J Blitzer, R Gupta, Querying a data graph using natural language queries, US Patent 10,810,193
Kevyn B Collins-Thompson, Ni Lao, Context-Aware Query Alteration, US Patent App. 13/043,500, 2012
Gengchen Mai, Krzysztof Janowicz, Rui Zhu, Ling Cai, Ni Lao, Geographic Question Answering: Challenges, Uniqueness, Classification, and Future Directions , AGILE: GIScience Series 2, 1-21, 2021.
Gengchen Mai, Krzysztof Janowicz, Ling Cai, Rui Zhu, Blake Regalia, Bo Yan, Meilin Shi, Ni Lao, SE‐KGE: A location‐aware Knowledge Graph Embedding model for Geographic Question Answering and Spatial Semantic Lifting , in Transactions in GIS, 2020.
Gengchen Mai, Krzysztof Janowicz, Bo Yan, Rui Zhu, Ling Cai, Ni Lao,
Multi-Scale Representation Learning for Spatial Feature Distributions using Grid Cells ,
in ICLR 2020.
spotlight presentation ,
Notes: Symbolic representations are efficient and accurate, and that is how mammals represent positions and locations.
Mai et al, Semantically-Enriched Search Engine for Geoportals: A Case Study with ArcGIS Online, In: Proceedings of AGILE 2020, Jun. 16 - 19, 2020, Chania, Crete, Greece.
Gengchen Mai, Krzysztof Janowicz, Bo Yan, Rui Zhu, Ling Cai, Ni Lao. Contextual Graph Attention for Answering Logical Queries over Incomplete Knowledge Graphs, In the 10th ACM International Conference on Knowledge Capture (K-CAP 2019)
Felix Wu, Boyi Li, Lequn Wang, Ni Lao, John Blitzer, Kilian Q. Weinberger, Integrated Triaging for Fast Reading Comprehension, Preprint, 2019.
Jacob Biloki, Chen Liang, Ni Lao, Neural Program Planner for Structured Predictions, In ICLR 2019, Workshop on Deep Reinforcement Learning Meets Structured Prediction.
Felix Wu, Boyi Li, Lequn Wang, Ni Lao, John Blitzer, Kilian Q. Weinberger, FastFusionNet: New State-of-the-Art for DAWNBench SQuAD, Technical Report, 2019
Gengchen Mai, Krzysztof Janowicz, Cheng He, Sumang Liu, Ni Lao. POIReviewQA: A Semantically Enriched POI Retrieval and Question Answering Dataset, In 12th Workshop on Geographic Information Retrieval (GIR 2018)
Chen Liang, Mohammad Norouzi, Jonathan Berant, Quoc Le, Ni Lao,
Memory Augmented Policy Optimization for Program Synthesis with Generalization ,
In NIPS 2018
Notes: Animals dream of memory traces of high emotional/motivational value. The optimal experience replay strategy in RL is to balance good and bad experiences, focusing on the more surprising ones.
T. Mitchell, W. Cohen, E. Hruschka, P. Talukdar, B. Yang, J. Betteridge, A. Carlson, B. Dalvi, M. Gardner, B. Kisiel, J. Krishnamurthy, N. Lao, K. Mazaitis, T. Mohamed, N. Nakashole, E. Platanios, A. Ritter, M. Samadi, B. Settles, R. Wang, D. Wijaya, A. Gupta, X. Chen, A. Saparov, M. Greaves, J. Welling, Never-Ending Learning, Communications of the ACM, 2018
Juanzi Li, Ming Zhou, Guilin Qi, Ni Lao, Tong Ruan, Jianfeng Du, Knowledge Graph and Semantic Computing. Language, Knowledge, and Intelligence, Communications in Computer and Information Science, Springer, 2017
Fan Yang, Jiazhong Nie, William W. Cohen, Ni Lao,
Learning to Organize Knowledge with N-Gram Machines ,
ICLR 2018 Workshop.
best poster award,
Notes: Towards indexing meaning in text with a symbolic open-domain schema.
Chen Liang, Jonathan Berant, Quoc Le, Kenneth D. Forbus, Ni Lao,
Neural Symbolic Machines:
Learning Semantic Parsers on Freebase with Weak Supervision , ACL 2017.
poster slides slides
Notes: We want a symbolic machine, which is good at large scale KG computation, to be controlled by a neural net, which is good at learning from data.
Felix Wu, Ni Lao, John Blitzer, Guandao Yang, and Kilian Weinberger, Fast Reading Comprehension with Convnets , Technical Report, 2017.
Ni Lao, Einat Minkov and William Cohen, Learning relational features with backward random walks , ACL 2015. poster
William Yang Wang, Kathryn Mazaitis, Ni Lao, Tom M. Mitchell, William W. Cohen, Efficient Inference and Learning in a Large Knowledge Base: Reasoning with Extracted Information using a Locally Groundable First-Order Probabilistic Logic, Machine Learning Journal (MLJ 2015), Springer.
T. Mitchell, W. Cohen, E. Hruschka, P. Talukdar, J. Betteridge, A. Carlson, B. Dalvi, M. Gardner, B. Kisiel, J. Krishnamurthy, N. Lao, K. Mazaitis, T. Mohamed, N. Nakashole, E. Platanios, A. Ritter, M. Samadi, B. Settles, R. Wang, D. Wijaya, A. Gupta, X. Chen, A. Saparov, M. Greaves, J. Welling, Never-Ending Learning, in AAAI-2015.
Ni Lao, Jun Zhu: Contrastive Feature Induction for Efficient Structure Learning of Conditional Random Fields.
CoRR abs/1406.7445 (2014)
Notes: Evaluate exponentially many CRF features efficiently by only considering features with big gradients, which are very sparse.
Xin Luna Dong, Evgeniy Gabrilovich, Geremy Heitz, Wilko Horn, Ni Lao,
Kevin Murphy, Thomas Strohmann, Shaohua Sun, Wei Zhang,
Knowledge Vault: A Web-Scale Approach to Probabilistic Knowledge Fusion.
Notes: automatic KG construction at Google scale.
Ni Lao, Amarnag Subramanya, Fernando Pereira, William W. Cohen
Reading The Web with Learned Syntactic-Semantic Inference Rules.
Notes: read the web.
Ni Lao, William W. Cohen, Personalized Reading Recommendations for Saccharomyces Genome Database. DILS, 2012 poster
Ni Lao, Tom Mitchell, William W. Cohen,
Random Walk Inference and Learning in A Large Scale Knowledge Base.
EMNLP, 2011 slides
AMT labels of 16 relations
Distant Supervision labels of 96 relations
Notes: To learn effectively we need to first set a budget for every direction/branch we try.
Jun Zhu, Ni Lao, Ning Chen, Eric P. Xing, Conditional Topical Coding: an Efficient Topic Model Conditioned on Rich Features. KDD, 2011
Ni Lao, William W. Cohen, Relational retrieval using a combination of path-constrained random walks, Machine Learning, 2010, Volume 81, Number 1, Pages 53-67 (ECML, 2010 slides poster )
Ni Lao, Jun Zhu, Liu Liu, Yandong Liu, William W. Cohen,
Efficient Relational Learning with Hidden Variable Detection.
NIPS, 2010 poster
Notes: Where do new concepts come from? They come from the estimated gain of data likelihood.
Ni Lao, William W. Cohen,
Fast Query Execution for Retrieval Models based on Path Constrained Random Walks.
Notes: Randomness is your friend for both efficiency and quality.
Jun Zhu, Ni Lao, E. P. Xing,
Grafting-Light: Fast, Incremental Feature Selection and Structure Learning of Markov Random Fields.
Notes: Where do new features come from? They come from the estimated gain of data likelihood.
Lao, Ni, Hideki Shima, Teruko Mitamura and Eric Nyberg. 2008. Query Expansion and Machine Translation for Robust Cross-Lingual Information Retrieval , in Proceedings of NTCIR-7 Workshop, Japan.
Shima, Hideki, Ni Lao, Eric Nyberg and Teruko Mitamura. 2008. Complex Cross-lingual Question Answering as Sequential Classification and Multi-Document Summarization Task , in Proceedings of NTCIR-7 Workshop, Japan.
W. Zuo, N. Lao, Y. Geng, and K. Ma. 2008.
GeoSVM: an efficient and effective tool to predict species' potential distributions.
Journal of Plant Ecology, 1(2): 143-145.
Notes: Genetic algorithms suck. SVM (and max entropy) models rock.
Yiming Yang, Abhimanyu Lad, Ni Lao, Abhay Harpale, Bryan Kisiel, Monica Rogati,
Utility-based information distillation over temporally sequenced documents,
SIGIR, pp. 31-38, 2007.
Notes: simulated user experience from logs.
Chun Yuan; Ni Lao; Ji-Rong Wen; Jiwei Li; Zheng Zhang; Yi-Min Wang; Wei-Ying Ma,
Automated Known Problem Diagnosis with Event Traces,
Notes: PC Genomics.
Ni Lao, Ji-Rong Wen, Wei-Ying Ma, Yi-Min Wang, Combine High Level Symptom and Low Level State Information for Configuration Fault Diagnosis, LISA, 2004.
Ji-Rong Wen, Ni Lao, Wei-Ying Ma,
Probabilistic Model for Contextual Retrieval,
Notes: you can find/create great data in an industry environment.
Archana Ganapathi, Yi-Min Wang, Ni Lao, Ji-Rong Wen, Why PCs Are Fragile and What We Can Do About It: A Study of Windows Registry Problems, Dependable System and Network (DSN), 2004.
Jinyi Yao, Lao Ni, Fan Yang, Yunpeng Cai, Zengqi Sun,
Technical Solutions of TsinghuAeolus for Robotic Soccer.
RoboCup, pp. 205-213, 2003
Notes: a two-time RoboCup world champion used dynamic programming and geometry (instead of statistical machine learning).
Ni Lao, Learning to Understand Questions and Organize Knowledge . 2021. In this talk I first introduce previous work in query understanding -- more specifically weakly supervised semantic parsing and its related issues such as 1) symbolic representations for efficient inference; 2) unbiased low-variance gradient estimation with experience replays; 3) sequence reranking model design and training. Then I discuss preliminary work in document understanding, which aims to achieve generalizability, scalability and accountability beyond the current large sequence models.
Ni Lao, A Review of Google's Lingvo . 2019. By invitation of Synced I wrote a review of Google's Lingvo framework. I sincerely thank Patrick Nguyen and Esther Lee for their help.
Ni Lao, Weakly Supervised Natural Language Understanding . AIFrontiers tutorial, 2018.
Ni Lao, Do Androids Dream of Great Success? . 2018.
Ni Lao, Neural Symbolic Language Understanding . 2017.
Ni Lao, Text Generation Survey . 2017.
Ni Lao, Xipeng Qiu, Knowledge Acquisition . 2017.
Ni Lao, NIPS 2016 Overview . 2016.
Ni Lao, Neural Symbolic Machines . 2016.
Ni Lao, New Development in Knowledge Acquisition, Inference, and Applications . 2015. (This is a lecture at CCF ADL65. I added my take on the relationship between connectionism and symbolism, which seems to be an important issue at the moment.)
Ni Lao, Elephant and AI . LTI Colloquium Report, Spring 2012.
Ni Lao, Programming by Demonstrations and Verbal Commands. LTI Colloquium Report, Spring 2012
Ni Lao, Beyond Shallow Semantics. LTI Colloquium Report, Fall 2011.
Ni Lao, CCG, Fractal, and Emergence. LTI Colloquium Report, Spring 2011.
Ni Lao, Reinforcement Learning In An Unknown Domain (slides). 2011.
Ni Lao, Probabilistic Ontology Model. LTI Colloquium Report, Fall 2010.
Ni Lao, Split-Emit Process for Natural Language Generation. Advanced NLP seminar 2009.
Ni Lao, Jun Zhu, Contrastive Feature Induction for Efficient Structure Learning of Conditional Random Fields . 2009.
Ni Lao, T. Mitamura, E. Nyberg, Tree Representations for Chinese Semantic Role Labeling. 2009.
Ni Lao, Read The Web (slides). Advanced IR seminar 2007.
Ni Lao, Schema Extraction Model . Advanced IR seminar 2007.
Ni Lao, Knowledge Acquisition From Text--A Survey. Statistical NLP class 2007.
PhD thesis, 2012. Efficient Random Walk Inference with Knowledge Bases (slides). Carnegie Mellon University
Master's thesis, 2006. Data Mining Problems in Automatic Computer Diagnosis. Tsinghua University
Bachelor's thesis, 2003. Mining Spatial-Temporal Data Using Constructive Induction. Tsinghua University
2012, NELL v165 NELL Knowledge graph in both triple format and PRA format
2010, yeast2 updated yeast data with extra information about MeSH headings, chemicals, affiliations, etc. (321K entities and 6.1M links)
2010, fly a biological literature graph with 770K entities and 3.5M links
2010, yeast a biological literature graph with 164K entities and 2.8M links
2023: ACL*, EMNLP*, ICML, IJGIS, NeurIPS*, TALLIP, Computers & Security (*Area chair for large language models and reasoning)
2022: ACL, CoNLL, EMNLP, ICLR, KDD, NeurIPS, TGIS
2021: ACL, AAAI, CoNLL, EMNLP, ICLR, NAACL, SIGIR, TALLIP, GeoAI, NLP4ConvAI
2020: ACL, AAAI, COLING, CoNLL, EACL, EMNLP, ICLR, ICML, IJCAI, NeurIPS, SIGIR, TALLIP, TKDE
2019: ACL, AAAI, CCL, CoNLL, EMNLP, ICLR, IJCAI, NAACL, SIGIR, TKDE
2018: ACL, CCKS, COLING, EMNLP, NAACL, NLPCC, NIPS, SIGIR
2017: ACL, CCKS, EMNLP, IJCAI, IJCNLP, SIGIR, TKDE, WSDM, Google Research Grants
2016: CIKM, COLING, IJCAI, NAACL, TKDE, WWW, Google Research Grants
2015: CIKM, ICML, IJCAI, MLJ, NIPS, TKDE
Since 2012 I have been the manager of the
Machine Learning News Google Group,
which seems to be quite popular in academia.