This document details the modeling conventions to use for tables, attributes and constraints definition in the Loris SQL database. These guidelines should be followed when submitting a Pull Request to the Loris repository as well as when reviewing the PR of a peer developer.
Disclosure: Due to the previous lack of standard in modeling, we acknowledge that existing tables do not abide by the rules described below. As the codebase and database are being cleaned-up, the existing infrastructure will be re-designed to follow this guideline.
- The table name should be in snake_case.
- The table name should be in lowercase.
- The table name made of multiple word should use an underscore in order to separate each of them.
- The table name should be in singular form.
- The table name should contain English words only.
- The table name should not contain abbreviation with the exception of "_rel" for entity relationship tables.
- The table name should be a noun or group of nouns, as concise as possible.
- When a single word is not sufficient to describe a table, the ordering
of the words in the name should reveal the concern of the table in a
broad-to-specific manner.
- i.e. A table containing the consent information of a candidate would be
named
candidate_consent
, candidate being the broader concern and justconsent
is not sufficiently informative to rule out user consents and examiner consents. A table containing the consent types would be called as suchconsent_type
.
- i.e. A table containing the consent information of a candidate would be
named
- A table that is specific to a module should be prefixed with the
module name.
- i.e. A table named
xyz
used in moduleabc
should be nameabc_xyz
.
- i.e. A table named
- Acronyms should be avoided, however may be used when the acronym would be commonly used by the neuroimaging community. The acronym should only contain upper case letter. For historic reasons, “PSC” to refer to a Site is an exception to this rule.
- When creating a table representing a relation between two or more tables.
Regardless of the relation type (many-to-many, one-to-many, many-to-one or
one-to-one), the table should have a composite name as such
table1_table2_rel
.- i.e. the table mapping users to their addresses would be named
user_address_rel
whereuser
is the name of the Users' entity table andaddress
is the name of the Addresses' entity table.
- i.e. the table mapping users to their addresses would be named
- The field name should be in CapitalizedWords.
- The field name should start with an Upper case letter.
- The field name should not contain any underscore.
- The field name should be in singular form.
- The field name should contain English words only.
- The field name should not contain abbreviations.
- Acronyms should be avoided, however may be used when the acronym would be commonly used by the neuroimaging community. The acronym should only contain upper case letter.
- The primary key field of a table should be named
<TableName>ID
. While the table name is snake_case, the field names referencing it (such as the primary key) should follow field naming conventions. (ie.my_table
’s primary key should beMyTableID
).- i.e. the name of the primary identifier of the
country
table should beCountryID
.
- i.e. the name of the primary identifier of the
- When an explicit auto-incremented primary key field is present, it should be of type unsigned int. Most tables should have an explicit primary key field.
- When adding a field which serves as a foreign key to another table
in the database, the field should follow the same convention as above
ReferenceTableNameID
.- i.e. The
country
table would have a primary identifier field namedCountryID
and another table a foreign key reference to thecountry
table with a field namedCountryID
.
- i.e. The
- When two(2) fields refer to the same foreign key id, a qualifier should
be added to the names.
- i.e. Two fields pointing to the
CountryID
in the same table should be named like inOriginCountryID
andDestinationCountryID
.
- i.e. Two fields pointing to the
- Both primary key and foreign key fields should end with
ID
in upper case. - Date fields should not just be named “Date”. A qualifier like “BoughtDate” or “DateAccepted” should be added as required.
- No ENUM attributes should be used in the default LORIS schema or modules. Instead a lookup table to refer to possible choices should be used.
- The primary key is always in the first field.
- The foreign key field(s) follow the primary key.
- Other indexed fields (Unique and Index key) are after the primary and foreign key if present. Fields that are part of a composite key should be positioned together as much as possible.
- Other fields follow.
- Text, Json and Blob are the last fields in that order when present.
- Mandatory fields should generally be place before optional fields
- Within each section, it recommended but not an obligation that fix size fields are to be positioned before variable size fields (Numeric, Date and Char before Varchar).
- A foreign key constraint definition should contain
ON DELETE
andON UPDATE
clause. Do not rely on default behavior.
- Although not an enforced rule, recommendation for naming constraint are
as follow:
- The constraint name should be specifically declared.
- The constraint name should be in singular form.
- The constraint name should be of the proper format:
- Primary key
PK_<table_name>
- Unique key
UK_<table_name>_<ColumnName(s)>
- Foreign key
FK_<table_name>_<ColumnName>
- Check constraint
CK_<table_name>_<ColumnName>_<Check>
- Primary key
- Tables should be normalised to 3rd normal form unless there is a strong justification to do otherwise.
- Engine should be InnoDB (
ENGINE=InnoDB
) for all tables unless specific requirements demand otherwise. - The character encoding used by LORIS should be UTF-8. This implies that MySQL tables should use CHARSET=’utf8mb4’ (note the mb4). If the mb4 variant causes MySQL row size violations, CHARSET=’utf8’ may be used. (However, these exceptions should be rare outside of “legacy” tables due to the 3NF requirement.)
- Be explicit instead of implicit.