Validate Business Role yaml using pydantic #161
Conversation
Force-pushed from 2bebf90 to a38b9a8
Ok, I've looked into the differences between the two approaches. Reasons:

(1) Nested structures. A JSON schema is "nested" by design: all layers are defined in the same data structure and in natural order. We have a lot of nested and complex structures in other configs.

(2) Conditional keywords. As far as I know, implementing similar logic with pydantic would require extra coding and manual exception handling, similar to how you implemented the XOR check.

(3) The schema is used for validation only.

Currently I would prefer to stay on jsonschema, since changing it seems to be low ROI, or even slightly negative ROI, unless we find a genuinely better data-structure validation library.
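To illustrate reason (2): below is a minimal sketch, assuming pydantic v2, of the kind of hand-written validator that replaces a JSON Schema conditional keyword (`oneOf`/`if`-`then`). The `Connection` model and its `dsn`/`host` fields are hypothetical names, used only to show the XOR pattern:

```python
from typing import Optional

from pydantic import BaseModel, model_validator


class Connection(BaseModel):
    # Hypothetical fields: exactly one of dsn / host must be set (XOR).
    dsn: Optional[str] = None
    host: Optional[str] = None

    @model_validator(mode="after")
    def check_exactly_one(self) -> "Connection":
        # Manual exception handling that JSON Schema would express
        # declaratively with oneOf/if-then keywords.
        if (self.dsn is None) == (self.host is None):
            raise ValueError("exactly one of 'dsn' or 'host' must be set")
        return self


Connection(dsn="postgres://example")        # valid
# Connection() and Connection(dsn=..., host=...) raise ValidationError
```

The constraint lives in imperative code rather than in the declarative schema, which is the extra maintenance cost being weighed here.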
In TableParse, the JSON schema takes 190 lines; defining Python dataclasses would take fewer lines and be clearer to developers who are not very proficient with JSON Schema.
If a dataclass property is defined as a Union type, the generated schema expresses it with `anyOf`. Here is an example:

```python
import json
from dataclasses import dataclass
from typing import Annotated, Literal, Optional

from pydantic import Field, TypeAdapter

ColumnName = Annotated[str, Field(min_length=1, description="Column name")]
VarcharType = Annotated[str, Field(pattern=r"varchar\(\d+\, ?\d+\)")]
BoolType = Literal['bool']
ColumnType = VarcharType | BoolType  # other types omitted

@dataclass
class ColumnSpec:
    name: ColumnName
    type: ColumnType
    comment: Optional[str] = None

TableName = Annotated[str, Field(min_length=1)]

@dataclass
class Table:
    name: TableName
    columns: list[ColumnName | ColumnSpec] = Field(min_length=1)

ta = TypeAdapter(Table)
print(json.dumps(ta.json_schema(), indent=2))
```

Gives:

```json
{
  "$defs": {
    "ColumnSpec": {
      "properties": {
        "name": {
          "description": "Column name",
          "minLength": 1,
          "title": "Name",
          "type": "string"
        },
        "type": {
          "anyOf": [
            {
              "pattern": "varchar\\(\\d+\\, ?\\d+\\)",
              "type": "string"
            },
            {
              "const": "bool",
              "type": "string"
            }
          ],
          "title": "Type"
        },
        "comment": {
          "anyOf": [
            {
              "type": "string"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Comment"
        }
      },
      "required": [
        "name",
        "type"
      ],
      "title": "ColumnSpec",
      "type": "object"
    }
  },
  "properties": {
    "name": {
      "minLength": 1,
      "title": "Name",
      "type": "string"
    },
    "columns": {
      "items": {
        "anyOf": [
          {
            "description": "Column name",
            "minLength": 1,
            "type": "string"
          },
          {
            "$ref": "#/$defs/ColumnSpec"
          }
        ]
      },
      "minItems": 1,
      "title": "Columns",
      "type": "array"
    }
  },
  "required": [
    "name",
    "columns"
  ],
  "title": "Table",
  "type": "object"
}
```
Agree, after the recent changes the parsers have become much simpler. My last argument in defense of dataclasses is that property names are defined as strings in the JSON Schema (when it is written manually), and the Blueprint constructor also uses strings to access parsed properties. Having dataclasses would reveal potential typos.
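A small sketch of that typo argument (the `TableConfig` dataclass here is a hypothetical stand-in for the real parsed-config classes): a misspelled string key fails only at runtime, or silently with `.get()`, while a misspelled attribute on a dataclass is flagged statically by mypy/pyright before the code ever runs.

```python
from dataclasses import dataclass

# String-keyed access to the parsed config: typos survive until runtime.
parsed = {"name": "users", "columns": ["id"]}
# parsed["colums"] would raise KeyError only when this line executes,
# and parsed.get("colums") hides the typo entirely:
assert parsed.get("colums") is None

@dataclass
class TableConfig:  # hypothetical stand-in for a parsed-config dataclass
    name: str
    columns: list[str]

cfg = TableConfig(name="users", columns=["id"])
# cfg.colums would be rejected by a type checker (and raise
# AttributeError at runtime), so the typo cannot pass silently.
```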