added function to track active users #23

ArnavS59 · 2025-01-28T15:35:35Z

Track active users in DB

♻️ Current situation & Problem

PR for #17

⚙️ Release Notes

Added function extract_latest_user_interaction to data_processor.py to track active users in the DB.

📚 Documentation

Please ensure that you properly document any additions in conformance to Spezi Documentation Guide.
You can use this section to describe your solution, but we encourage contributors to document your reasoning and changes using in-line documentation.

✅ Testing

Please ensure that the PR meets the testing requirements set by CodeCov and that new functionality is appropriately tested.
This section describes important information about the tests and why some elements might not be testable.

📝 Code of Conduct & Contributing Guidelines

By creating this pull request, you agree to follow our Code of Conduct and Contributing Guidelines:

I agree to follow the Code of Conduct and Contributing Guidelines.

codecov · 2025-02-07T18:04:30Z

Codecov Report

Attention: Patch coverage is 11.11111% with 8 lines in your changes missing coverage. Please review.

Project coverage is 81.65%. Comparing base (e6cf0e6) to head (490c25a).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...zi_data_pipeline/data_processing/data_processor.py | 11.12% | 8 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #23      +/-   ##
==========================================
- Coverage   82.04%   81.65%   -0.39%     
==========================================
  Files          16       16              
  Lines        1609     1618       +9     
==========================================
+ Hits         1320     1321       +1     
- Misses        289      297       +8

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `81.65% <11.12%> (-0.39%)` | ⬇️ |

| Files with missing lines | Coverage Δ | |
|---|---|---|
| ...zi_data_pipeline/data_processing/data_processor.py | `60.44% <11.12%> (-5.41%)` | ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Vicbi

Hi @ArnavS59,
Thank you for your contribution! I have a few suggestions:

Consider using string constants for the column names within your function. This can help with readability and future code maintenance.
Please take a moment to address the linting errors.
It would be helpful to add a unit test for your function.
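
A unit test along these lines might cover the new behaviour. This is only a sketch: it assumes the function is adjusted to return the resulting DataFrame instead of only writing a CSV, and the `extract_latest_user_interaction` defined here is a simplified stand-in so the example runs on its own, not the implementation from this PR.

```python
import unittest

import pandas as pd


def extract_latest_user_interaction(df: pd.DataFrame) -> pd.DataFrame:
    """Simplified stand-in (assumption): latest interaction date per user."""
    dates = pd.to_datetime(df["EffectiveDateTime"], format="%d.%m.%y")
    latest = dates.groupby(df["UserId"]).max()
    return latest.rename("LastUserInteraction").reset_index()


class TestExtractLatestUserInteraction(unittest.TestCase):
    def setUp(self):
        # Two entries for user1, one for user2, in the day.month.year format
        # used by the PR's pd.to_datetime call.
        self.df = pd.DataFrame(
            {
                "UserId": ["user1", "user1", "user2"],
                "EffectiveDateTime": ["01.01.25", "15.01.25", "10.01.25"],
            }
        )

    def test_one_row_per_user(self):
        result = extract_latest_user_interaction(self.df)
        self.assertEqual(sorted(result["UserId"]), ["user1", "user2"])

    def test_keeps_most_recent_date(self):
        result = extract_latest_user_interaction(self.df).set_index("UserId")
        self.assertEqual(
            result.loc["user1", "LastUserInteraction"],
            pd.Timestamp("2025-01-15"),
        )
```

Saved in a test module, the class can be run with `python -m unittest`.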

Vicbi · 2025-02-07T18:15:11Z

src/spezi_data_pipeline/data_processing/data_processor.py

+        return
+
+    #First convert column to datetime for comparison
+    flattend_fhir_df['EffectiveDateTime'] = pd.to_datetime(flattend_fhir_df['EffectiveDateTime'], format='%d.%m.%y')


Consider replacing the string column name with one of the constants already declared for all column names in FHIRDataFrame.

Vicbi · 2025-02-07T18:16:06Z

src/spezi_data_pipeline/data_processing/data_processor.py

+    flattend_fhir_df['EffectiveDateTime'] = pd.to_datetime(flattend_fhir_df['EffectiveDateTime'], format='%d.%m.%y')
+    #Filter the most recent entry for each userid 
+    most_recent_df=flattend_fhir_df.loc[flattend_fhir_df.groupby('UserId')['EffectiveDateTime'].idxmax()]
+    most_recent_df=most_recent_df[['UserId','EffectiveDateTime']]#select the relevant cols


Same comment here for "UserId" and "EffectiveDateTime".
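
To illustrate the suggestion, here is a minimal sketch of the same filtering logic with the column names pulled into constants. The module-level names `USER_ID`, `EFFECTIVE_DATE_TIME`, and `LAST_USER_INTERACTION`, and the helper `latest_interaction_per_user`, are hypothetical stand-ins; the package's own constants may be named and organised differently.

```python
import pandas as pd

# Hypothetical stand-ins for the column-name constants mentioned in the review;
# in the package these would be imported rather than redefined.
USER_ID = "UserId"
EFFECTIVE_DATE_TIME = "EffectiveDateTime"
LAST_USER_INTERACTION = "LastUserInteraction"


def latest_interaction_per_user(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per user holding that user's most recent date."""
    df = df.copy()  # avoid mutating the caller's DataFrame
    # Same day.month.year format as in the PR.
    df[EFFECTIVE_DATE_TIME] = pd.to_datetime(
        df[EFFECTIVE_DATE_TIME], format="%d.%m.%y"
    )
    # Index label of the latest entry for each user.
    latest_idx = df.groupby(USER_ID)[EFFECTIVE_DATE_TIME].idxmax()
    result = df.loc[latest_idx, [USER_ID, EFFECTIVE_DATE_TIME]]
    return result.rename(
        columns={EFFECTIVE_DATE_TIME: LAST_USER_INTERACTION}
    ).reset_index(drop=True)
```

With the names defined in one place, renaming a column later only requires changing the constant.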

Vicbi · 2025-02-07T18:17:17Z

src/spezi_data_pipeline/data_processing/data_processor.py

+    most_recent_df=flattend_fhir_df.loc[flattend_fhir_df.groupby('UserId')['EffectiveDateTime'].idxmax()]
+    most_recent_df=most_recent_df[['UserId','EffectiveDateTime']]#select the relevant cols
+    most_recent_df.rename(columns={'EffectiveDateTime': 'LastUserInteraction'}, inplace=True)
+    most_recent_df.to_csv('output.csv')


Another suggestion: allow users of the spezi-data-pipeline package to set the name of the output file themselves.
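
One possible shape for this, sketched under assumptions: the helper name `save_latest_interactions` and its `output_path` parameter are hypothetical and not part of the package. The file is only written when the caller supplies a path, and the DataFrame is returned either way.

```python
from pathlib import Path
from typing import Optional, Union

import pandas as pd


def save_latest_interactions(
    most_recent_df: pd.DataFrame,
    output_path: Optional[Union[str, Path]] = None,
) -> pd.DataFrame:
    """Optionally write the per-user summary to a caller-chosen CSV file."""
    if output_path is not None:
        # index=False keeps the positional index out of the exported file.
        most_recent_df.to_csv(output_path, index=False)
    return most_recent_df
```

A caller could then pass any file name, for example `save_latest_interactions(most_recent_df, "active_users.csv")`, or omit the argument to skip the export.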

Vicbi · 2025-02-07T18:22:27Z

src/spezi_data_pipeline/data_processing/data_processor.py

@@ -313,3 +313,24 @@ def select_data_by_dates(  # pylint: disable=unused-variable
        filtered_df.reset_index(drop=True),
        resource_type=flattened_fhir_dataframe.resource_type,
    )
+
+def extract_latest_user_interaction(  # pylint: disable=unused-variable
+    flattend_fhir_df: pd.DataFrame


In the spezi-data-pipeline package, the functionalities are built around FHIRDataFrame objects. You might want to adjust your function to accept a FHIRDataFrame as its input argument.
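
A rough sketch of what that signature change could look like follows. To keep the example runnable on its own, it uses `FHIRDataFrameStandIn`, a hypothetical stand-in class; the `df` attribute is an assumption about how the real FHIRDataFrame exposes its underlying pandas DataFrame, and in the package the real class would be imported instead.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass
class FHIRDataFrameStandIn:
    """Hypothetical stand-in for the package's FHIRDataFrame (assumption)."""

    df: pd.DataFrame
    resource_type: str


def extract_latest_user_interaction(
    fhir_df: FHIRDataFrameStandIn,
) -> pd.DataFrame:
    """Return the most recent EffectiveDateTime per UserId."""
    if not isinstance(fhir_df, FHIRDataFrameStandIn):
        raise TypeError("Expected a FHIRDataFrame-like object.")
    if fhir_df.df.empty:
        return pd.DataFrame(columns=["UserId", "LastUserInteraction"])
    # assign() returns a new frame, so the wrapped DataFrame is not modified.
    data = fhir_df.df.assign(
        EffectiveDateTime=lambda d: pd.to_datetime(
            d["EffectiveDateTime"], format="%d.%m.%y"
        )
    )
    # After sorting by date, the last row per user is the latest one.
    latest = data.sort_values("EffectiveDateTime").drop_duplicates(
        "UserId", keep="last"
    )
    return (
        latest[["UserId", "EffectiveDateTime"]]
        .rename(columns={"EffectiveDateTime": "LastUserInteraction"})
        .reset_index(drop=True)
    )
```

Taking the wrapper object rather than a bare DataFrame keeps the function consistent with the others in `data_processor.py`, such as `select_data_by_dates`.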

added function to track active users

c562479

PSchmiedmayer requested a review from Vicbi January 29, 2025 01:19

PSchmiedmayer assigned ArnavS59 Jan 29, 2025

PSchmiedmayer added the enhancement New feature or request label Jan 29, 2025

Merge branch 'main' into main

490c25a

Vicbi reviewed Feb 7, 2025
