Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

[candidate] Create Sex table and remove Sex enum #9025

Merged
merged 24 commits into from
Feb 12, 2024

Conversation

CamilleBeau
Copy link
Contributor

@CamilleBeau CamilleBeau commented Jan 29, 2024

Brief summary of changes

This PR changed the datatype of the candidate table columns "Sex" and ProbandSex" from enum to varchar(255), instead creating a constraint on the columns determined by a new "sex" table.

The "sex" table contains optional values for candidate sex and candidate proband sex. Projects can add or remove Sex options by simply inserting into or dropping from this table.

I also modified anywhere that I could find that was hardcoding Sex options to instead get the sex options from the database through a Utility class function.

  • Have you updated related documentation?

Testing instructions (if applicable)

  1. Test creating a new candidate. Make sure that the sex is inserted into the candidate table properly.
  2. Set "use proband" config setting to true in the config module, and then test the proband tab of candidate parameters
  3. Go to the candidate list module and select "Show Advanced Filters". Make sure that the Sex filter contains the right values and works properly.
  4. Check the cnv, methylation, profiles, and snp tabs all filter for Sex properly, and that the sex dropdown contains the correct values.
  5. Add or drop a value from the Sex table, and repeat steps 1 through 5 making sure that the drop-downs are populated with the updated sex options.
  6. Attempt update a candidate to have a Sex and ProbandSex that do not exist in the sex table. Make sure you get an error.
  7. Attempt to delete a value from the sex table that is used in the candidate table. Make sure you get an error.
  8. Test the CouchDB_Import_Demographics.php script

Link(s) to related issue(s)

@CamilleBeau
Copy link
Contributor Author

@kongtiaowang I may need a bit of help with the test suites here. It looks like I'm going to have to add the Sex table to the mock data in these tests, but I don't understand how to do this very well

@kongtiaowang
Copy link
Contributor

@CamilleBeau I will debug the test issue later, after #9026 which it will fix the api test issue.

@CamilleBeau
Copy link
Contributor Author

@kongtiaowang Thanks so much!

@@ -72,6 +72,15 @@ CREATE TABLE `language` (
INSERT INTO language (language_code, language_label) VALUES
('en-CA', 'English');

CREATE TABLE `sex` (
`SexID` int(10) unsigned NOT NULL AUTO_INCREMENT,
`SexValue` varchar(255) NOT NULL,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpicky but as I said during the meeting, I find "SexValue" to be a little generic, and think something like "SexName" would be more descriptive (this also applies to the rest of this PR obviously).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated "SexValue" to "Sex", and also modified charset to utf8mb4

@CamilleBeau
Copy link
Contributor Author

@kongtiaowang following the fix for api test issue and rebasing, this should be ready for debugging

@@ -72,6 +72,15 @@ CREATE TABLE `language` (
INSERT INTO language (language_code, language_label) VALUES
('en-CA', 'English');

CREATE TABLE `sex` (
`SexID` int(10) unsigned NOT NULL AUTO_INCREMENT,
`Sex` varchar(255) NOT NULL,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hhhhm, my previous comment was just a little thought I had during our meeting, but if I am to really review this pull request I guess I'll try to be more exhaustive.

After giving it more thought, I am still not fully satisfied with the field names. I think having Sex in these names when we are already in the sex table is a little redundant. For instance, I'd really prefer to write sex.ID and sex.Name rather than sex.SexID and sex.Sex in my SQL requests, especially since the first one may look like a foreign key and the second is quite cryptic (imo). So I strongly advocate for the names ID and Name for the field names.

Regarding capitalization, the database sure is messy but I think it would be a good idea to maintain a consistent convention in at least the new tables. I am under the impression that this convention is unofficially singular snake_case for the table names and PascalCase for the field names. I am fine with that and this PR matches this convention, but I am still new and this impression of mine may be wrong, so can someone with more database experience than me confirm or deny it ?

Finally, regarding the pointing of sex foreign keys on ID or Name, I do not have a strong opinion as both have their pros and cons. I think using ID as a foreign key would probably be conceptually cleaner, but also require to update the current database and queries as well as require more joins in the latter. Using Name is certainly easier and will lead to simpler queries.

It would probably have been easier if I had mentionned all that earlier so sorry for the late reply ! In any case, I'd like someone with more database experience than me to also review this pull request.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ERROR 3780 (HY000) at line 155: Referencing column 'Sex' and referenced column 'Sex' in foreign key constraint 'FK_candidate_sex_1' are incompatible.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding capitalization, the database sure is messy but I think it would be a good idea to maintain a consistent convention in at least the new tables. I am under the impression that this convention is unofficially singular snake_case for the table names and PascalCase for the field names. I am fine with that and this PR matches this convention, but I am still new and this impression of mine may be wrong, so can someone with more database experience than me confirm or deny it ?

That seems to match is the official policy documented in docs/SQLModelingStandard.md but it's so weakly enforced that I had to go look it up.

Copy link
Contributor

@maximemulder maximemulder Feb 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems to match is the official policy documented in docs/SQLModelingStandard.md but it's so weakly enforced that I had to go look it up.

Damn I looked at the documentation but somehow missed this file, thanks for pointing it to me. I forgot my glasses at home today so I guess I'm gonna use that as my excuse 😅

@@ -72,6 +72,15 @@ CREATE TABLE `language` (
INSERT INTO language (language_code, language_label) VALUES
('en-CA', 'English');

CREATE TABLE `sex` (
`SexID` int(10) unsigned NOT NULL AUTO_INCREMENT,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What purpose does this column have? It is not being used as a foreign key by any other tables and the other column in this table already has a not null/unique constraint and is being used as a foreign key into this table.

Removing it would solve the "what should it be named" debate in the thread below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, this column is not useful. I don't mind removing it

@CamilleBeau
Copy link
Contributor Author

@driusan @maximemulder I made the updates to change the "Sex" column of sex table to "Name", and removed the "ID" column. It turns out changing the charset of the "sex" table to utf8mb4 caused an issue with the foreign keys because the sex table and candidate table have to have the same charset. I switched it back to utf8.

@kongtiaowang The last test suite errors I believe were caused by the above issue with the charsets differing between tables. That is fixed now, but I expect I will still get test suite errors because of tests in the CandidateTest class which I am still unfamiliar with how to fix.

{
return in_array($value, self::VALID_VALUES, true);
return in_array($value, $validValues, true);
Copy link
Contributor

@kongtiaowang kongtiaowang Feb 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getSexList will return array(
'Male' => 'Male',
'Female' => 'Female',
'Other' => 'Other'
) so you need to change to [return in_array($value, array_values($validValues), true);]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kongtiaowang Made that change but still having trouble unfortunately!

@driusan driusan merged commit e2d6f04 into aces:main Feb 12, 2024
28 checks passed
@ridz1208 ridz1208 added this to the 26.0.0 milestone Apr 9, 2024
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sex should not be an enum
5 participants