Commit bcaa9fe

Now using <h4> elements
1 parent e2b8441 commit bcaa9fe

1 file changed

+10
-3
lines changed

01_index.markdown

@@ -66,11 +66,18 @@ background: home/bg.png
 <img style="width: 70%;" alt="Transformed, Cleaned Data in Silver Tables" src="assets/slides/Entity-Resolution-Phase-1-Silver-ETL.png" />
 </center>
 <div style="margin-top: 3%;"></div>
-<h3>Step 2) Entity Resolution (ER): Manual Deduplication</h3>
+<h3>Entity Resolution (ER): Node and Edge Deduplication</h3>
 <div>
-Entity Resolution (ER) is the process of deduplicating and combining duplicate nodes and splitting up mistakenly merged nodes. In a similar manner, edges can also be merged or split up.
+Entity Resolution (ER) is the process of deduplicating and combining duplicate nodes and splitting up mistakenly merged nodes. In a similar manner, edges can also be merged or split up. This is important as it is difficult to detect patterns in networks if there are disconnected duplicates that are each a part of different, important relationships.
 </div>
 <div style="margin-top: 2%;"></div>
+<div>
+<center>
+<img style="width: 70%;" alt="Clean Data in Silver Tables" src="assets/slides/Entity-Resolution-Enables-Motif-Search.jpg" />
+</center>
+</div>
+<div style="margin-top: 2%;"></div>
+<h4>Manual Block and Match</h4>
 <div>
 Traditional entity resolution involves two phases: blocking and matching. Querying data as part of exploratory data analysis (EDA) reveals strategies to match records. This was traditionally done by hand, and it still can be for a limited number of small datasets.
 </div>
@@ -91,7 +98,7 @@ background: home/bg.png
 </center>
 </div>
 <div style="margin-top: 2%;"></div>
-<h3>Step 3) Entity Resolution (ER): Automatic Deduplication</h3>
+<h4>Automatic Deduplication</h4>
 <div>
 When dealing with big data, especially when there are a number of datasets large and small, the traditional entity resolution model of manual blocking and matching starts to break down. It is cumbersome and takes too much developer time. What is needed is a <i>generic approach to entity resolution</i>.
 </div>
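
The two phases named under "Manual Block and Match" in the diff above, blocking and then matching, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from this repository: the record fields (`id`, `name`), the `block_key` rule, the sample records, and the 0.8 similarity threshold are all hypothetical choices made for the example.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def block_key(record):
    # Hypothetical blocking rule: group records by the first letter of the name.
    return record["name"].strip().lower()[:1]


def block(records):
    # Blocking phase: partition records so only plausible pairs get compared.
    blocks = defaultdict(list)
    for record in records:
        blocks[block_key(record)].append(record)
    return blocks


def similarity(a, b):
    # String similarity in [0, 1] on the lowercased names.
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()


def match(blocks, threshold=0.8):
    # Matching phase: compare every pair within each block, keep likely duplicates.
    pairs = []
    for members in blocks.values():
        for a, b in combinations(members, 2):
            if similarity(a, b) >= threshold:
                pairs.append((a["id"], b["id"]))
    return pairs


# Made-up sample data for illustration only.
records = [
    {"id": 1, "name": "Acme Corporation"},
    {"id": 2, "name": "ACME Corporation Inc"},
    {"id": 3, "name": "Apex Holdings"},
    {"id": 4, "name": "Zenith Labs"},
]

duplicate_pairs = match(block(records))
```

With this sample data, records 1 and 2 land in the same block and are the only pair above the threshold. Hand-tuning a blocking rule and threshold like this for each dataset is the developer effort that the "Automatic Deduplication" text describes as cumbersome.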
