Machine Learning and Big Data expert/consultant. Worked in automated trading, healthcare analytics, and Internet startups. Numerous publications in machine learning, computational finance, algorithms and networks.
Specialties: Machine Learning; Computational Finance; Healthcare analytics; Efficient algorithms for computer applications; Social Networks.
Talks and Events
A Distributed Vertex-Deduplication Framework for Large Graphs
Our framework is built on top of a distributed Hadoop/MapReduce/HBase infrastructure, capturing both “low-level” graph database operations and “higher-level” algorithmic aspects such as vertex-vertex similarity, graph clustering, and “robust” vertex ID stamping. We have been using the framework to de-duplicate the Goldman Sachs Knowledge Graph. In this talk we will report a few experimental results from applying the framework to public datasets.
Track: Systems and Scale
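
For readers unfamiliar with the three algorithmic stages named in the abstract (vertex-vertex similarity, graph clustering, and vertex ID stamping), the following is a minimal single-machine sketch of how they might fit together. It is not the framework described in the talk: Jaccard similarity over neighbor sets, union-find clustering, the 0.5 threshold, the smallest-ID stamping rule, and every function and vertex name are assumptions chosen purely for illustration, and the all-pairs loop would not scale to a distributed setting.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two neighbor sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find(parent, v):
    """Return the root of v's cluster, compressing the path as we go."""
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v


def deduplicate(neighbors, threshold=0.5):
    """Map every vertex to a canonical id shared by its likely duplicates.

    neighbors: dict of vertex id -> set of adjacent vertex ids.
    threshold: hypothetical similarity cut-off, chosen for illustration.
    """
    parent = {v: v for v in neighbors}
    # Stage 1: vertex-vertex similarity over all pairs (quadratic; toy only).
    for u, v in combinations(sorted(neighbors), 2):
        if jaccard(neighbors[u], neighbors[v]) >= threshold:
            # Stage 2: clustering by merging similar vertices (union-find).
            ru, rv = find(parent, u), find(parent, v)
            if ru != rv:
                # Keep the smaller id as root so the stamp is deterministic.
                parent[max(ru, rv)] = min(ru, rv)
    # Stage 3: stamp each vertex with its cluster's smallest id.
    return {v: find(parent, v) for v in neighbors}


# Hypothetical toy graph: two spellings of one entity plus an unrelated one.
graph = {
    "acme_corp": {"nyc", "finance"},
    "acme_corporation": {"nyc", "finance", "bank"},
    "zeta_llc": {"london"},
}
stamps = deduplicate(graph)
```

In this toy graph the two "acme" vertices share two of three distinct neighbors, so they fall into one cluster and receive the same stamp, while "zeta_llc" keeps its own. Always choosing the smallest ID as the cluster root is one simple way to make the stamp independent of the order in which pairs are merged.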