isoxya-api-init
endpoint [http://localhost]:
{
"time": "2021-12-16T14:19:34.138417603Z",
"version": "2.1.6.49"
}
Register Processor plugin
isoxya-api-create-processor
channels (Pro) [null]:
tag [crawler-html]:
url [http://isoxya-plugin-crawler-html.localhost/data]:
{
"channels": 1,
"href": "/processor/7135e6c2-3026-44bf-abcc-c64af3efce73",
"tag": "crawler-html",
"url": "http://isoxya-plugin-crawler-html.localhost/data"
}
Register Streamer plugin
isoxya-api-create-streamer
channels (Pro) [null]:
tag [nginx]:
url [http://isoxya-plugin-nginx.localhost]:
{
"channels": 1,
"href": "/streamer/b49fcc24-6562-415a-94a6-3e8dcd848aac",
"tag": "nginx",
"url": "http://isoxya-plugin-nginx.localhost"
}
Register Site
isoxya-api-create-site
channels (Pro) [null]:
rate_limit (Pro) [null]:
url [http://example.com]:
{
"channels": 1,
"href": "/site/aHR0cDovL2V4YW1wbGUuY29tOjgw",
"rate_limit": 1,
"url": "http://example.com:80"
}
Start Crawl
isoxya-api-create-crawl
site.href [/site/aHR0cDovL2V4YW1wbGUuY29tOjgw]:
agent (Pro) [null]:
depth_max (Pro) [null]:
list.href (Pro):
0: null
1:
[0]:
null
pages_max (Pro) [null]:
processor_config [null]:
processors.hrefs [/processor/7135e6c2-3026-44bf-abcc-c64af3efce73]:
streamers.hrefs [/streamer/b49fcc24-6562-415a-94a6-3e8dcd848aac]:
validate (Pro) [null]:
{
"agent": "Isoxya/0.0.0 (+https://www.isoxya.com/)",
"began": "2021-12-21T10:29:06.959456Z",
"depth_max": null,
"duration": null,
"ended": null,
"href": "/site/aHR0cDovL2V4YW1wbGUuY29tOjgw/crawl/2021-12-21T10:29:06.959456Z",
"list": null,
"pages": null,
"pages_max": null,
"parent": null,
"processor_config": null,
"processors": [
{
"href": "/processor/7490dcfc-9756-46e0-a641-e0fe57ce3e20"
}
],
"progress": null,
"site": {
"channels": 1,
"href": "/site/aHR0cDovL2V4YW1wbGUuY29tOjgw",
"rate_limit": 1,
"url": "http://example.com:80"
},
"speed": null,
"status": "pending",
"streamers": [
{
"href": "/streamer/9426239c-97e3-4879-a689-6d5e4f19eadf"
}
],
"validate": false
}
Check Crawl status
isoxya-api-read
href:
0: /processor/7135e6c2-3026-44bf-abcc-c64af3efce73
1: /streamer/b49fcc24-6562-415a-94a6-3e8dcd848aac
2: /site/aHR0cDovL2V4YW1wbGUuY29tOjgw
3:
4: /site/aHR0cDovL2V4YW1wbGUuY29tOjgw/crawl/2021-12-16T14:21:41.719297Z
[4]:
/site/aHR0cDovL2V4YW1wbGUuY29tOjgw/crawl/2021-12-16T14:21:41.719297Z
{
"agent": "Isoxya/0.0.0 (+https://www.isoxya.com/)",
"began": "2021-12-21T10:29:06.959456Z",
"depth_max": null,
"duration": 0.548366,
"ended": "2021-12-21T10:29:07.507822Z",
"href": "/site/aHR0cDovL2V4YW1wbGUuY29tOjgw/crawl/2021-12-21T10:29:06.959456Z",
"list": null,
"pages": 1,
"pages_max": null,
"parent": null,
"processor_config": null,
"processors": [
{
"href": "/processor/7490dcfc-9756-46e0-a641-e0fe57ce3e20"
}
],
"progress": 100,
"site": {
"channels": 1,
"href": "/site/aHR0cDovL2V4YW1wbGUuY29tOjgw",
"rate_limit": 1,
"url": "http://example.com:80"
},
"speed": 1.823599566712,
"status": "completed",
"streamers": [
{
"href": "/streamer/9426239c-97e3-4879-a689-6d5e4f19eadf"
}
],
"validate": false
}
To crawl the same Site again, use isoxya-api-create-crawl
.
To crawl another Site, register it with isoxya-api-create-site
first.
That's it!