GITBOOK-11: No subject
sonalgoyal authored and gitbook-bot committed Feb 8, 2025
1 parent 7248dd6 commit 050511b
Showing 6 changed files with 92 additions and 16 deletions.
6 changes: 2 additions & 4 deletions docs/approval.md
@@ -6,12 +6,10 @@ nav_order: 13

# Approval of Clusters

##
##

[Zingg Enterprise Feature](#user-content-fn-1)[^1]



### The approval phase is run as follows:

` `
[^1]: Zingg Enterprise is an advanced version of Zingg Community with production-grade features
9 changes: 4 additions & 5 deletions docs/modelexplain.md
@@ -2,16 +2,15 @@
title: Explanation
parent: Step By Step Guide
nav_order: 12
description: To get a better understanding of how the data is trained and matched
---

# Explanation of Models

## To get a better understanding of how the data is trained and matched

[Zingg Enterprise Feature](#user-content-fn-1)[^1]



### The explain phase is run as follows:

` ./scripts/zingg.sh --phase <phase for explanation> --conf <path to config> --mode explain `
`./scripts/zingg.sh --phase <phase for explanation> --conf <path to config> --mode explain`

[^1]: Zingg Enterprise is an advanced version of Zingg Community with production-grade features
83 changes: 81 additions & 2 deletions docs/relations.md
@@ -7,6 +7,85 @@ description: When a single match model is not sufficient

# Combining Different Match Models

[Zingg Enterprise Feature](relations.md#user-content-fn-1)\[^1]
[Zingg Enterprise Feature](#user-content-fn-1)[^1]

## In many cases
In many cases, we want to build the identity graph using a combination of different datasets, schemas and matching logic. For example, one source system may contain only user IDs and emails, another may hold user names and phone numbers, and a few others may hold person information with addresses. In another example, some systems capture spousal information, while others must be matched on the basis of last name and address.

In such cases, Zingg can build the entire graph and relate different models together. In the following example, the results of a query with an exact match on family ID and of a matching model (household) that uses address and last name are brought together.

````json
{
  "vertices" :
  [
    {
      "name" : "spouse",
      "vertexType" : "zingg_pipe",
      "data" : [
        {
          "name" : "spouse",
          "format" : "snowflake",
          "props": {
            "query": "select a.id as id, a.FNAME, a.LNAME, a.STNO, a.ADD1, a.CITY, a.STATE, a.ZINGG_ID_PERSON, b.id as z_id, b.fname as Z_FNAME,b.lname as Z_LNAME,b.stno as Z_STNO,b.add1 as Z_ADD1, b.city as Z_CITY,b.state as Z_STATE, b.ZINGG_ID_PERSON as Z_ZINGG_ID_PERSON from CUSTOMER_RELATE_PARTIAL a, CUSTOMER_RELATE_PARTIAL b where a.familyId = b.familyId"
          }
        }
      ],
      "edges" : {
        "edgeType" : "same_edge",
        "edges" : [
          {
            "dataColumn" : "zingg_personId",
            "column" : "zingg_personId",
            "name" : "zingg_personId1"
          },
          {
            "dataColumn" : "zingg_personId",
            "column" : "z_zingg_personId",
            "name" : "zingg_personId2"
          }
        ]
      }
    },
    {
      "name" : "household",
      "config" : "$ZINGG_ENTERPRISE_HOME$/zinggEnterprise/configHousehold.json",
      "strategy" : {
        "vDataStrategy" : "unique_edge",
        "props" : {
          "column" : "zingg_personId",
          "edge" : "zingg_personId,z_zingg_personId"
        }
      },
      "vertexType" : "zingg_match",
      "edges" : {
        "edgeType" : "same_edge",
        "edges" : [
          {
            "dataColumn" : "zingg_personId",
            "column" : "zingg_personId",
            "name" : "zingg_personId1"
          },
          {
            "dataColumn" : "zingg_personId",
            "column" : "z_zingg_personId",
            "name" : "zingg_personId2"
          }
        ]
      }
    }
  ],
  "output" : [{
    "name" : "relatedCustomers",
    "format" : "snowflake",
    "props" : {
      "table" : "RELATED_CUSTOMERS_PARTIAL"
    }
  }],
  "strategy" : "pairs_and_vertices"
}
````
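
Before running a configuration like the one above, it can help to check which vertices and edge columns it declares. The following is only a rough sketch: the field names (`vertices`, `vertexType`, `edges`, `column`) are taken from the example, while the `summarize_relate_config` helper and the trimmed `SAMPLE` string are hypothetical and are not part of Zingg.

```python
import json

# Trimmed copy of the example configuration (the "data", "config" and
# per-vertex "strategy" sections are omitted). Illustrative only.
SAMPLE = """
{
  "vertices": [
    {
      "name": "spouse",
      "vertexType": "zingg_pipe",
      "edges": {
        "edgeType": "same_edge",
        "edges": [
          {"dataColumn": "zingg_personId", "column": "zingg_personId", "name": "zingg_personId1"},
          {"dataColumn": "zingg_personId", "column": "z_zingg_personId", "name": "zingg_personId2"}
        ]
      }
    },
    {
      "name": "household",
      "vertexType": "zingg_match",
      "edges": {
        "edgeType": "same_edge",
        "edges": [
          {"dataColumn": "zingg_personId", "column": "zingg_personId", "name": "zingg_personId1"},
          {"dataColumn": "zingg_personId", "column": "z_zingg_personId", "name": "zingg_personId2"}
        ]
      }
    }
  ],
  "strategy": "pairs_and_vertices"
}
"""


def summarize_relate_config(text):
    """Hypothetical helper: return (name, vertexType, edge columns) per vertex."""
    config = json.loads(text)
    summary = []
    for vertex in config["vertices"]:
        columns = [edge["column"] for edge in vertex["edges"]["edges"]]
        summary.append((vertex["name"], vertex["vertexType"], columns))
    return summary


for name, vertex_type, columns in summarize_relate_config(SAMPLE):
    print(f"{name} ({vertex_type}): edge columns {columns}")
```

In this example both vertices expose the same pair of edge columns, which is presumably what lets the query result and the household model be joined into one graph.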

[^1]: Zingg Enterprise is an advanced version of Zingg Community with production-grade features
5 changes: 3 additions & 2 deletions docs/runIncremental.md
@@ -2,12 +2,13 @@
title: Adding incremental data
parent: Step By Step Guide
nav_order: 10
description: >-
Building a continuously updated identity graph with new, updated and deleted
records
---

# Adding Incremental Data

## Building a continuously updated identity graph with new, updated and deleted records

[Zingg Enterprise Feature](#user-content-fn-1)[^1]

Rerunning matching on entire datasets is wasteful, and we lose the lineage of matched records against a persistent identifier. Using the [incremental flow](https://www.learningfromdata.zingg.ai/p/zingg-incremental-flow) feature in [Zingg Enterprise](https://www.zingg.ai/company/zingg-enterprise), incremental loads can be run against existing pre-resolved entities. New and updated records are matched to existing clusters, and new persistent [**ZINGG\_IDs**](https://www.learningfromdata.zingg.ai/p/hello-zingg-id) are generated for records that do not find a match. If a record gets updated and Zingg Enterprise discovers that it is a more suitable match for another cluster, it will be reassigned. Cluster assignment, merge, and unmerge happen automatically in the flow. Zingg Enterprise also takes human feedback on previously matched data into account so that it does not override approved records.
3 changes: 1 addition & 2 deletions docs/setup/match.md
@@ -2,12 +2,11 @@
title: Find the matches
parent: Step By Step Guide
nav_order: 9
description: Finds the records that match with each other.
---

# Finding The Matches

## Finds the records that match with each other.

`./zingg.sh --phase match --conf config.json`

As can be seen in the image below, matching records are given the same **z\_cluster** id. Each record also gets a **z\_minScore** and a **z\_maxScore**, which show the lowest and highest scores with which it matched other records in the same cluster.
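
To make these output columns concrete, here is a small sketch that groups matched rows by cluster. It assumes the match output has already been loaded as a list of dictionaries. The sample rows, the `id` field and the `group_by_cluster` helper are made up for illustration; only the `z_cluster`, `z_minScore` and `z_maxScore` column names come from the text above.

```python
from collections import defaultdict

# Hypothetical sample of match output rows; the values are invented.
rows = [
    {"id": "a1", "z_cluster": 0, "z_minScore": 0.82, "z_maxScore": 0.97},
    {"id": "a2", "z_cluster": 0, "z_minScore": 0.82, "z_maxScore": 0.97},
    {"id": "b1", "z_cluster": 1, "z_minScore": 0.0, "z_maxScore": 0.0},
]


def group_by_cluster(rows):
    """Collect record ids under their z_cluster id (illustrative helper)."""
    clusters = defaultdict(list)
    for row in rows:
        clusters[row["z_cluster"]].append(row["id"])
    return dict(clusters)


for cluster_id, ids in group_by_cluster(rows).items():
    print(f"cluster {cluster_id}: {ids}")
```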
2 changes: 1 addition & 1 deletion docs/setup/train.md
@@ -2,6 +2,7 @@
title: Build and save the model
parent: Step By Step Guide
nav_order: 8
description: So that the same model can be applied to new data
---

# Building And Saving The Model
@@ -11,4 +12,3 @@ Builds up the Zingg models using the training data from the above phases and wri
```
./zingg.sh --phase train --conf config.json
```
