Paco Nathan
Derwen, Inc.
Known as a "player/coach", with core expertise in data science, natural language processing, machine learning, and cloud computing; 35+ years of tech industry experience, ranging from Bell Labs to early-stage start-ups. Co-chair of Rev and JupyterCon. Advisor for NYU Coleridge Initiative, IBM Data Science Community, Amplify Partners, Recognai, Primer. Formerly: Director, Community Evangelism @ Databricks and Apache Spark. Cited in 2015 as one of the Top 30 People in Big Data and Analytics by Innovation Enterprise.

2020 Talk: Rich Context: a knowledge graph for linking datasets with research outcomes

The Rich Context project at NYU Wagner is the knowledge graph complement to the ADRF platform for cross-agency social science research using sensitive data, currently used by 50+ agencies. Rich Context represents metadata about datasets and their use in research, which in turn influences public policy, with a goal of producing recommender systems for analysts and policymakers. Nearly all of the code is open source. This talk introduces the background for the project, our team process for collaboration, and several areas where machine learning is used to infer or clean metadata obtained from scholarly infrastructure and to support semi-automated graph construction. It also covers the human-in-the-loop feedback mechanisms that let domain experts help improve our graph.