website sync feature #4429


Merged · 8 commits · Apr 2, 2025
14 changes: 14 additions & 0 deletions deploy/docker/docker-compose-milvus.yml
@@ -110,6 +110,18 @@ services:

# Wait for the MongoDB service process started by the docker-entrypoint.sh script
wait $$!
redis:
image: redis:7.2-alpine
container_name: redis
# ports:
# - 6379:6379
networks:
- fastgpt
restart: always
command: |
redis-server --requirepass mypassword --loglevel warning --maxclients 10000 --appendonly yes --save 60 10 --maxmemory 4gb --maxmemory-policy noeviction
volumes:
- ./redis/data:/data

# fastgpt
sandbox:
@@ -157,6 +169,8 @@ services:
# zilliz connection parameters
- MILVUS_ADDRESS=http://milvusStandalone:19530
- MILVUS_TOKEN=none
# Redis address
- REDIS_URL=redis://default:mypassword@redis:6379
# sandbox address
- SANDBOX_URL=http://sandbox:3000
# Log level: debug, info, warn, error
15 changes: 15 additions & 0 deletions deploy/docker/docker-compose-pgvector.yml
@@ -69,6 +69,19 @@ services:
# Wait for the MongoDB service process started by the docker-entrypoint.sh script
wait $$!

redis:
image: redis:7.2-alpine
container_name: redis
# ports:
# - 6379:6379
networks:
- fastgpt
restart: always
command: |
redis-server --requirepass mypassword --loglevel warning --maxclients 10000 --appendonly yes --save 60 10 --maxmemory 4gb --maxmemory-policy noeviction
volumes:
- ./redis/data:/data

# fastgpt
sandbox:
container_name: sandbox
@@ -114,6 +127,8 @@ services:
- MONGODB_URI=mongodb://myusername:mypassword@mongo:27017/fastgpt?authSource=admin
# pg connection parameters
- PG_URL=postgresql://username:password@pg:5432/postgres
# Redis connection parameters
- REDIS_URL=redis://default:mypassword@redis:6379
# sandbox address
- SANDBOX_URL=http://sandbox:3000
# Log level: debug, info, warn, error
15 changes: 15 additions & 0 deletions deploy/docker/docker-compose-zilliz.yml
@@ -51,6 +51,19 @@ services:

# Wait for the MongoDB service process started by the docker-entrypoint.sh script
wait $$!
redis:
image: redis:7.2-alpine
container_name: redis
# ports:
# - 6379:6379
networks:
- fastgpt
restart: always
command: |
redis-server --requirepass mypassword --loglevel warning --maxclients 10000 --appendonly yes --save 60 10 --maxmemory 4gb --maxmemory-policy noeviction
volumes:
- ./redis/data:/data

sandbox:
container_name: sandbox
image: ghcr.io/labring/fastgpt-sandbox:v4.9.3 # git
@@ -92,6 +105,8 @@ services:
- FILE_TOKEN_KEY=filetoken
# MongoDB connection parameters. Username: myusername, password: mypassword.
- MONGODB_URI=mongodb://myusername:mypassword@mongo:27017/fastgpt?authSource=admin
# Redis connection parameters
- REDIS_URL=redis://default:mypassword@redis:6379
# zilliz connection parameters
- MILVUS_ADDRESS=zilliz_cloud_address
- MILVUS_TOKEN=zilliz_cloud_token
Binary file added docSite/assets/imgs/sealos-redis1.png
Binary file added docSite/assets/imgs/sealos-redis2.png
Binary file added docSite/assets/imgs/sealos-redis3.png
33 changes: 33 additions & 0 deletions docSite/content/zh-cn/docs/development/upgrading/494.md
@@ -7,11 +7,44 @@ toc: true
weight: 796
---

## Upgrade Guide

### 1. Back up your data

### 2. Install Redis

* Docker deployments: add a redis container by following the latest `docker-compose.yml`, then add the `REDIS_URL` environment variable to the `fastgpt` and `fastgpt-pro` services.
* Sealos deployments: create a new `redis` database in the database panel, copy the internal-address `connection` string as the redis connection string, then add it as the `REDIS_URL` environment variable to `fastgpt` and `fastgpt-pro`.

| | | |
| --- | --- | --- |
| ![](/imgs/sealos-redis1.png) | ![](/imgs/sealos-redis2.png) | ![](/imgs/sealos-redis3.png) |

### 3. Update the image tags


### 4. Run the upgrade script

Only commercial-edition users need to run this script.

From any terminal, send one HTTP request, replacing {{rootkey}} with the `rootkey` environment variable and {{host}} with the **FastGPT domain**.

```bash
curl --location --request POST 'https://{{host}}/api/admin/initv494' \
--header 'rootkey: {{rootkey}}' \
--header 'Content-Type: application/json'
```

**What the script does**

1. Updates the website sync timer

## 🚀 New Features

1. Training status display for collection data
2. SMTP email-sending plugin
3. BullMQ message queue
4. Website sync supports configuring training parameters

## 🐛 Fixes

3 changes: 1 addition & 2 deletions packages/global/core/dataset/api.d.ts
@@ -15,7 +15,6 @@ export type DatasetUpdateBody = {
name?: string;
avatar?: string;
intro?: string;
status?: DatasetSchemaType['status'];

agentModel?: string;
vlmModel?: string;
@@ -26,6 +25,7 @@ export type DatasetUpdateBody = {
apiServer?: DatasetSchemaType['apiServer'];
yuqueServer?: DatasetSchemaType['yuqueServer'];
feishuServer?: DatasetSchemaType['feishuServer'];
chunkSettings?: DatasetSchemaType['chunkSettings'];

// sync schedule
autoSync?: boolean;
@@ -141,7 +141,6 @@ export type PushDatasetDataChunkProps = {

export type PostWebsiteSyncParams = {
datasetId: string;
billId: string;
};

export type PushDatasetDataProps = {
6 changes: 5 additions & 1 deletion packages/global/core/dataset/constants.ts
@@ -50,14 +50,18 @@ export const DatasetTypeMap = {

export enum DatasetStatusEnum {
active = 'active',
syncing = 'syncing'
syncing = 'syncing',
waiting = 'waiting'
}
export const DatasetStatusMap = {
[DatasetStatusEnum.active]: {
label: i18nT('common:core.dataset.status.active')
},
[DatasetStatusEnum.syncing]: {
label: i18nT('common:core.dataset.status.syncing')
},
[DatasetStatusEnum.waiting]: {
label: i18nT('common:core.dataset.status.waiting')
}
};
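For reference, the new `waiting` entry resolves through `DatasetStatusMap` exactly like the existing statuses. A self-contained sketch of that lookup, with plain English strings standing in for the `i18nT(...)` translation keys (the real labels come from the locale files):

```typescript
// Hypothetical, self-contained mirror of DatasetStatusEnum/DatasetStatusMap;
// plain strings replace the i18nT(...) calls for illustration only.
enum DatasetStatusEnum {
  active = 'active',
  syncing = 'syncing',
  waiting = 'waiting'
}

const DatasetStatusMap: Record<DatasetStatusEnum, { label: string }> = {
  [DatasetStatusEnum.active]: { label: 'Active' },
  [DatasetStatusEnum.syncing]: { label: 'Syncing' },
  [DatasetStatusEnum.waiting]: { label: 'Waiting' }
};

// Look up the display label for a dataset's current status.
function statusLabel(status: DatasetStatusEnum): string {
  return DatasetStatusMap[status].label;
}

console.log(statusLabel(DatasetStatusEnum.waiting)); // → Waiting
```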

22 changes: 19 additions & 3 deletions packages/global/core/dataset/type.d.ts
@@ -17,6 +17,20 @@ import { SourceMemberType } from 'support/user/type';
import { DatasetDataIndexTypeEnum } from './data/constants';
import { ChunkSettingModeEnum } from './constants';

export type ChunkSettingsType = {
trainingType: DatasetCollectionDataProcessModeEnum;
autoIndexes?: boolean;
imageIndex?: boolean;

chunkSettingMode?: ChunkSettingModeEnum;
chunkSplitMode?: DataChunkSplitModeEnum;

chunkSize?: number;
indexSize?: number;
chunkSplitter?: string;
qaPrompt?: string;
};

export type DatasetSchemaType = {
_id: string;
parentId?: string;
@@ -29,7 +43,6 @@ export type DatasetSchemaType = {
name: string;
intro: string;
type: `${DatasetTypeEnum}`;
status: `${DatasetStatusEnum}`;

vectorModel: string;
agentModel: string;
@@ -39,14 +52,16 @@ export type DatasetSchemaType = {
url: string;
selector: string;
};

chunkSettings?: ChunkSettingsType;

inheritPermission: boolean;
apiServer?: APIFileServer;
feishuServer?: FeishuServer;
yuqueServer?: YuqueServer;

autoSync?: boolean;

// abandon
autoSync?: boolean;
externalReadUrl?: string;
defaultPermission?: number;
};
@@ -193,6 +208,7 @@ export type DatasetListItemType = {
};

export type DatasetItemType = Omit<DatasetSchemaType, 'vectorModel' | 'agentModel' | 'vlmModel'> & {
status: `${DatasetStatusEnum}`;
vectorModel: EmbeddingModelItemType;
agentModel: LLMModelItemType;
vlmModel?: LLMModelItemType;
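For illustration, a minimal sketch of an object matching the new `chunkSettings` shape. The string literals for `trainingType` are assumptions standing in for `DatasetCollectionDataProcessModeEnum` values, and the other enum-typed fields are simplified away:

```typescript
// Simplified, hypothetical stand-in for ChunkSettingsType; the 'chunk' | 'qa'
// union is an assumption in place of DatasetCollectionDataProcessModeEnum.
type ChunkSettings = {
  trainingType: 'chunk' | 'qa';
  autoIndexes?: boolean;
  imageIndex?: boolean;
  chunkSize?: number;
  indexSize?: number;
  chunkSplitter?: string;
  qaPrompt?: string;
};

// Example training parameters a website-sync dataset might persist.
const settings: ChunkSettings = {
  trainingType: 'chunk',
  chunkSize: 512,
  indexSize: 512
};

console.log(settings.trainingType); // → chunk
```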
74 changes: 74 additions & 0 deletions packages/service/common/bullmq/index.ts
@@ -0,0 +1,74 @@
import { ConnectionOptions, Processor, Queue, QueueOptions, Worker, WorkerOptions } from 'bullmq';
import { addLog } from '../system/log';
import { newQueueRedisConnection, newWorkerRedisConnection } from '../redis';

const defaultWorkerOpts: Omit<WorkerOptions, 'connection'> = {
removeOnComplete: {
count: 0 // Delete jobs immediately on completion
},
removeOnFail: {
count: 0 // Delete jobs immediately on failure
}
};

export enum QueueNames {
websiteSync = 'websiteSync'
}

export const queues = (() => {
if (!global.queues) {
global.queues = new Map<QueueNames, Queue>();
}
return global.queues;
})();
export const workers = (() => {
if (!global.workers) {
global.workers = new Map<QueueNames, Worker>();
}
return global.workers;
})();

export function getQueue<DataType, ReturnType = void>(
name: QueueNames,
opts?: Omit<QueueOptions, 'connection'>
): Queue<DataType, ReturnType> {
// check if global.queues has the queue
const queue = queues.get(name);
if (queue) {
return queue as Queue<DataType, ReturnType>;
}
const newQueue = new Queue<DataType, ReturnType>(name.toString(), {
connection: newQueueRedisConnection(),
...opts
});

// default error handler, to avoid unhandled exceptions
newQueue.on('error', (error) => {
addLog.error(`MQ Queue [${name}]: ${error.message}`, error);
});
queues.set(name, newQueue);
return newQueue;
}

export function getWorker<DataType, ReturnType = void>(
name: QueueNames,
processor: Processor<DataType, ReturnType>,
opts?: Omit<WorkerOptions, 'connection'>
): Worker<DataType, ReturnType> {
const worker = workers.get(name);
if (worker) {
return worker as Worker<DataType, ReturnType>;
}

const newWorker = new Worker<DataType, ReturnType>(name.toString(), processor, {
connection: newWorkerRedisConnection(),
...defaultWorkerOpts,
...opts
});
// default error handler, to avoid unhandled exceptions
newWorker.on('error', (error) => {
addLog.error(`MQ Worker [${name}]: ${error.message}`, error);
});
workers.set(name, newWorker);
return newWorker;
}
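Both `getQueue` and `getWorker` follow the same get-or-create pattern: check a module-level `Map`, and only on a miss construct a new instance, attach a default error handler, and cache it, so every later call with the same name reuses a single instance (and its Redis connection). A dependency-free sketch of that caching pattern, with plain objects standing in for BullMQ's `Queue`/`Worker`:

```typescript
// Minimal get-or-create cache mirroring how getQueue/getWorker memoize
// instances per queue name (BullMQ and Redis omitted for illustration).
const instances = new Map<string, { name: string }>();

function getInstance(name: string): { name: string } {
  const existing = instances.get(name);
  if (existing) return existing; // cache hit: reuse the shared instance

  const created = { name }; // stands in for `new Queue(...)` / `new Worker(...)`
  instances.set(name, created);
  return created;
}

const a = getInstance('websiteSync');
const b = getInstance('websiteSync');
console.log(a === b); // → true: repeated calls share one instance
```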
7 changes: 7 additions & 0 deletions packages/service/common/bullmq/type.d.ts
@@ -0,0 +1,7 @@
import { Queue, Worker } from 'bullmq';
import { QueueNames } from './index';

declare global {
var queues: Map<QueueNames, Queue> | undefined;
var workers: Map<QueueNames, Worker> | undefined;
}
27 changes: 27 additions & 0 deletions packages/service/common/redis/index.ts
@@ -0,0 +1,27 @@
import Redis from 'ioredis';

const REDIS_URL = process.env.REDIS_URL ?? 'redis://localhost:6379';

export function newQueueRedisConnection() {
const redis = new Redis(REDIS_URL);
redis.on('connect', () => {
console.log('Redis connected');
});
redis.on('error', (error) => {
console.error('Redis connection error', error);
});
return redis;
}

export function newWorkerRedisConnection() {
const redis = new Redis(REDIS_URL, {
maxRetriesPerRequest: null
});
redis.on('connect', () => {
console.log('Redis connected');
});
redis.on('error', (error) => {
console.error('Redis connection error', error);
});
return redis;
}
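The `REDIS_URL` value used throughout the compose files (`redis://default:mypassword@redis:6379`) is a standard URL that `ioredis` accepts directly. A small sketch of how it decomposes, using the WHATWG `URL` class built into Node:

```typescript
// Decompose the compose files' REDIS_URL into its standard URL parts;
// ioredis accepts either the full string or equivalent discrete options.
const url = new URL('redis://default:mypassword@redis:6379');

console.log(url.protocol); // → redis:
console.log(url.username); // → default
console.log(url.password); // → mypassword
console.log(url.hostname); // → redis
console.log(url.port);     // → 6379
```

The `default` username here is Redis 6+'s built-in ACL user, which is what a bare `--requirepass` password authenticates against.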
6 changes: 4 additions & 2 deletions packages/service/core/dataset/collection/controller.ts
@@ -1,6 +1,7 @@
import {
DatasetCollectionTypeEnum,
DatasetCollectionDataProcessModeEnum
DatasetCollectionDataProcessModeEnum,
DatasetTypeEnum
} from '@fastgpt/global/core/dataset/constants';
import type { CreateDatasetCollectionParams } from '@fastgpt/global/core/dataset/api.d';
import { MongoDatasetCollection } from './schema';
@@ -104,7 +105,8 @@ export const createCollectionAndInsertData = async ({
hashRawText: hashStr(rawText),
rawTextLength: rawText.length,
nextSyncTime: (() => {
if (!dataset.autoSync) return undefined;
// ignore auto collections sync for website datasets
if (!dataset.autoSync && dataset.type === DatasetTypeEnum.websiteDataset) return undefined;
if (
[DatasetCollectionTypeEnum.link, DatasetCollectionTypeEnum.apiFile].includes(
createCollectionParams.type