Add Jin Wang's talk details

ZHANG-EH · ZHANG-EH · commit 390e62bb040d · 2024-01-09T22:09:11.000-08:00
diff --git a/nwds/nwds.markdown b/nwds/nwds.markdown
@@ -49,7 +49,7 @@ Faisal Nawab is an assistant professor in the computer science department at the
 ---
 
 <p><a name="Pat_Helland_2024_01_26"></a>
-<strong>Speaker</strong>: <a href="">Pat Helland</a> </p>
+<strong>Speaker</strong>: <a href="https://pathelland.substack.com">Pat Helland</a> </p>
 
 <p><strong>Where</strong>: University of Washington, Seattle.<br>
 Allen School of Computer Science and Engineering.<br>
@@ -72,6 +72,30 @@ Pat Helland has been building distributed systems, database systems, high-perfor
 
 ---
 
+<p><a name="Jin_Wang_2024_01_19"></a>
+<strong>Speaker</strong>: <a href="">Jin Wang</a> </p>
+
+<p><strong>Where</strong>: University of Washington, Seattle.<br>
+Allen School of Computer Science and Engineering.<br>
+Paul G. Allen Center, CSE 291</p>
+
+<p><strong>When</strong>:
+Friday, January 19th, 2024, 2:30pm-3:30pm</p>
+
+<p><strong>Title</strong>:
+    Towards End-to-end Data Pipeline for Effective Data Science
+</p>
+
+<p><strong>Abstract</strong>:
+Nowadays, data-driven approaches have become a mainstream research methodology in multiple communities. To support effective and scalable data science applications on ever-growing datasets, researchers from both academia and industry have made great efforts in building end-to-end data pipelines. In this talk, I will present my efforts in improving two essential components of an end-to-end data pipeline: data preparation and data processing. First, I will present a unified self-supervised learning paradigm that can improve the performance of a variety of data preparation tasks, such as dataset discovery, table annotation and entity matching. Next, I will introduce my work in optimizing parallel recursive queries to support analytical workloads in data processing. Finally, I will conclude with a vision for future work on data pipelines.
+</p>
+
+<p><strong>Bio</strong>:
+Jin Wang is a research scientist and research lead at Megagon Labs. Before that, he obtained his PhD in Computer Science from the University of California, Los Angeles in July 2020. His research interests lie in the broad area of data management and data science. In particular, his research focuses on database systems, Datalog, data integration and table representation learning. His work appears in leading data management conferences and journals such as SIGMOD, VLDB, ICDE and VLDB Journal.
+</p>
+
+---
+
 #### Winter 2023
 
 ---