Univer Clipsheet is a powerful Chrome extension for web scraping and data automation. It simplifies extracting, organizing, and managing web data through robust scraping capabilities and workflow integration.
Clicking the button navigates to the URL, opening it as a detail page.
As shown in the image below, you can select three highlighted blocks on the detail page. These selected blocks can be saved as your **drill-down** configuration for the URL column.
When you run the `Scraper` with **drill-down** columns in the scraped table, it will automatically visit each URL in the URL column. For every visited URL, it scrapes the data based on the blocks you selected in the **drill-down** column configuration.
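To make the drill-down pass concrete, here is a minimal TypeScript sketch of the loop described above. It assumes a browser/extension context where `fetch` and `DOMParser` are available; the `DrillDownConfig` shape and `scrapeDrillDown` function are illustrative stand-ins, not Clipsheet's actual API.

```ts
// Hypothetical sketch of the drill-down pass: visit each URL in the URL
// column and extract the selected blocks into new columns. Names and shapes
// are illustrative, not Clipsheet's real API.
interface DrillDownConfig {
  urlColumn: string; // column holding the detail-page URLs
  blocks: { column: string; selector: string }[]; // selected blocks -> new columns
}

async function scrapeDrillDown(
  rows: Record<string, string>[],
  config: DrillDownConfig,
): Promise<Record<string, string>[]> {
  const results: Record<string, string>[] = [];
  for (const row of rows) {
    const url = row[config.urlColumn];
    if (!url) {
      results.push(row);
      continue;
    }
    // Fetch and parse the detail page (assumes the extension has host
    // permissions for the target site).
    const html = await (await fetch(url)).text();
    const doc = new DOMParser().parseFromString(html, 'text/html');
    const extra: Record<string, string> = {};
    for (const { column, selector } of config.blocks) {
      extra[column] = doc.querySelector(selector)?.textContent?.trim() ?? '';
    }
    results.push({ ...row, ...extra });
  }
  return results;
}
```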
As shown in the picture below, you can see that Clipsheet has scraped the three **drill-down** columns we selected earlier.
In the previous chapters, we learned how to quickly scrape the tables we need.
However, this is not sufficient for large-scale web scraping requirements. To handle such scenarios, we need to automate the data collection process. What we need is a **Scraper**!
We provide three adaptable scraping methods: `Scroll`, `Click`, and `Page`. These configurations allow you to extract more data from a webpage by customizing the scraping process.
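As a rough mental model, the three methods can be thought of as variants of a single configuration. The field names in the sketch below are assumptions for illustration, not Clipsheet's actual schema.

```ts
// Illustrative shapes for the three scraping methods; field names are
// assumed for the example, not taken from Clipsheet's schema.
type ScraperConfig =
  | { type: 'scroll'; maxScrolls: number; delayMs: number }       // infinite scroll
  | { type: 'click'; buttonSelector: string; maxClicks: number }  // "load more" button
  | { type: 'page'; nextPageSelector: string; maxPages: number }; // classic pagination

// Example: configure a scroll-based scraper.
const config: ScraperConfig = { type: 'scroll', maxScrolls: 20, delayMs: 1000 };
```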
### 3.1.1 Infinite Scroll
For pages with infinite scrolling to load more data, you can use the `Scroll` **Scraper** to extract the entire list.
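The following sketch shows, under stated assumptions, what a scroll-driven extraction loop amounts to: scroll to the bottom, wait for lazy content, and stop once the page height stops growing. `extractRows` is a hypothetical stand-in for the table-detection step, not a Clipsheet function.

```ts
// Minimal sketch of a Scroll scraper loop, assuming it runs in the page
// context. extractRows is a hypothetical stand-in for Clipsheet's
// table-extraction step.
async function scrollAndCollect(
  extractRows: () => Record<string, string>[],
  maxScrolls = 20,
  delayMs = 1000,
): Promise<Record<string, string>[]> {
  let lastHeight = 0;
  for (let i = 0; i < maxScrolls; i++) {
    window.scrollTo(0, document.body.scrollHeight);
    await new Promise((resolve) => setTimeout(resolve, delayMs)); // wait for lazy content
    if (document.body.scrollHeight === lastHeight) break; // nothing new loaded
    lastHeight = document.body.scrollHeight;
  }
  return extractRows();
}
```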
In this table, we list the columns from the scraped table, allowing you to customize it. You can define column names, delete columns, and preview the table data by clicking the `View Table` button.
## 3.3 Save your scraper
After configuring, you can save the scraper, and it will appear in the scraper list in the popup. You can then `start` the scraper to begin scraping and collecting data.
Below, we will introduce several key features of workflows, including scheduled execution, data source binding, and duplicate removal.
A **workflow** can be linked to a data source. If no data source is defined, a new data source will be automatically created and bound after the **workflow**'s first execution.
> **Note:** When a **workflow** is bound to a data source, the workflow's columns are fixed to the data source's columns and cannot be modified. (This is because once a data source is created, its table structure is immutable.)
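The binding rule above can be sketched as follows. This is an assumption-laden illustration of the behavior the note describes; `DataSource` and `bindDataSource` are hypothetical names, not Clipsheet internals.

```ts
// Hypothetical illustration of the binding rule: create a data source from
// the workflow's columns on first run; afterwards the column set is frozen.
interface DataSource {
  id: string;
  columns: readonly string[]; // immutable once the data source exists
  rows: Record<string, string>[];
}

function bindDataSource(workflowColumns: string[], existing?: DataSource): DataSource {
  if (existing) {
    // The table structure is immutable, so the workflow adopts the
    // existing columns rather than changing them.
    return existing;
  }
  return { id: crypto.randomUUID(), columns: [...workflowColumns], rows: [] };
}
```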
Once a **workflow** is bound to a data source, the workflow's output data will be appended to this data source, and duplicate rows will be automatically removed.
You can customize the configuration for removing duplicates.
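A plausible way to implement append-with-deduplication is to key each row on a configurable set of columns, as in the sketch below. The key-column rule is an assumption for illustration; Clipsheet's actual duplicate criteria may differ.

```ts
// Sketch of append-with-deduplication: two rows count as duplicates when
// their values in the configured key columns match (an assumed rule).
function appendUnique(
  dataSource: { rows: Record<string, string>[] },
  newRows: Record<string, string>[],
  keyColumns: string[],
): void {
  const keyOf = (row: Record<string, string>) =>
    keyColumns.map((c) => row[c] ?? '').join('\u0000');
  const seen = new Set(dataSource.rows.map(keyOf));
  for (const row of newRows) {
    const key = keyOf(row);
    if (!seen.has(key)) {
      seen.add(key);
      dataSource.rows.push(row);
    }
  }
}
```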
A **workflow** is a collection of multiple scrapers working together.
You define a **workflow** by selecting the scrapers that make it up. For example, in the following setup, we configure the **workflow** to include the `Amazon Scraper` and `Google Maps Scraper`.
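In code terms, such a setup might look like the sketch below; the `Workflow` shape and the scraper identifiers are assumptions for illustration.

```ts
// Illustrative workflow definition combining two scrapers; the shape and
// the scraper ids are assumed for this example.
interface Workflow {
  name: string;
  scraperIds: string[];  // scrapers that make up the workflow
  dataSourceId?: string; // bound automatically after the first execution if unset
}

const workflow: Workflow = {
  name: 'Marketplace monitor',
  scraperIds: ['amazon-scraper', 'google-maps-scraper'],
};
```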