Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Create technical documentation with suggested queries to be circulated with the FormAssembly survey #116

Closed
Tracked by #107
jmakowski1123 opened this issue Feb 7, 2022 · 13 comments
Assignees

Comments

@jmakowski1123
Copy link

jmakowski1123 commented Feb 7, 2022

This is a task in epic #107

The assumption is that we will be sending Providers this survey (built in FormAssembly). The expectation will be that one survey will be filled out for each Instance.

We want to make it as easy, quick and seamless as possible for Providers to complete the surveys. One method to accomplish this is to create brief technical documentation with suggested queries that can be run to answer as many questions as possible.

This idea for providing technical documentation was discussed briefly in #62 . Related comments/suggestions copied below:

Is there a form of technical documentation that we would provide with the form to help people successfully fill it out. How to we help people do this, for example?

  • Recommended approach: Simple count of the User model, minus a simple count of the LtiUser model.

Is there a form of technical documentation that we would provide with the form to help people successfully fill it out. How to we help people do this, for example

  • Sure, I can give some queries for them to run. It'd be nice if edX could run them on their read replica to test early, but it's not absolutely required. That's probably only a couple hours of actual work with the caveats I put in the recommended queries above. Might take more calendar time if someone at edX is testing and we get weird results that we need to debug.
@jmakowski1123
Copy link
Author

jmakowski1123 commented Feb 9, 2022

@feanil Few comments about approaching this iteratively:

  1. The wording/framing of the survey questions are not set in stone. We know what results/data we are ultimately aiming for. But if it makes sense to adjust the wording/framing in order to align with the more effective or simpler queries to get to the same end results, we can tweak the questions as we go.

  2. Before sending the survey and corresponding documentation to the community writ large, we could do a brief beta run with one Provider, to test the ease and efficacy of the survey, the clarity of the questions, and the helpfulness of the suggested queries.

@jmakowski1123 jmakowski1123 moved this from To Do to In Progress in Axim Engineering Tasks Feb 11, 2022
@feanil
Copy link
Contributor

feanil commented Feb 15, 2022

@jmakowski1123 here is what I was thinking for the attached technical documentation, though honestly I think if we just give them the all-in-one query at the end that might be simpler.

Version 1

Hello, thank you for your interest in filling out the Instance Survey.

For a few of the questions, you may be able to get quick, fairly accurate
answers by running sql queries against the database of your instance of the
Open edX platform. Below are the queries that we think will help speed up your responses.

Number of Unique Courses Currently Offered

select count(*) from course_overviews_courseoverview where start < now() and (end > now() OR end is NULL);

Total Number of Learners Currently Using the Site

select count(*) from auth_user where last_login > date_sub(now(), interval 4 week);

Total Number of Enrollments for All Courses

select count(*)  from student_courseenrollment as se join auth_user as au on se.user_id = au.id left join course_creators_coursecreator as cc on se.user_id = cc.user_id where is_staff=0 and is_superuser=0 and (cc.state is NULL or cc.state="unrequested");

(Optional) Total Number of Course Completions

If you have Persistent Grades Enabled you can run the below query and get the latest information about how many users have passe their courses.

select count(*) from grades_persistentcoursegrade where passed_timestamp is not NULL;

Run all at once

You can run all these at once by running the following combined query.

SELECT (
    select count(*) from course_overviews_courseoverview where start < now() and (end > now() OR end is NULL)
) as current_uniqe_courses, (
    select count(*) from auth_user where last_login > date_sub(now(), interval 4 week)
) as active_users, (
    select count(*)  from student_courseenrollment as se join auth_user as au on se.user_id = au.id left join course_creators_coursecreator as cc on se.user_id = cc.user_id where is_staff=0 and is_superuser=0 and (cc.state is NULL or cc.state="unrequested")
) as total_enrollments, (
    select count(*) from grades_persistentcoursegrade where passed_timestamp is not NULL
) as total_course_completions;

You should get output that looks something like this:

+-----------------------+--------------+-------------------+--------------------------+
| current_uniqe_courses | active_users | total_enrollments | total_course_completions |
+-----------------------+--------------+-------------------+--------------------------+
|                     1 |            2 |                 1 |                        0 |
+-----------------------+--------------+-------------------+--------------------------+

Version 2

Thank you for filling out the instance survey, for a few of the questions, you may be able to run a simple query to get answers directly from your database.

If you run the following:

SELECT (
    select count(*) from course_overviews_courseoverview where start < now() and (end > now() OR end is NULL)
) as current_uniqe_courses, (
    select count(*) from auth_user where last_login > date_sub(now(), interval 4 week)
) as active_users, (
    select count(*)  from student_courseenrollment as se join auth_user as au on se.user_id = au.id left join course_creators_coursecreator as cc on se.user_id = cc.user_id where is_staff=0 and is_superuser=0 and (cc.state is NULL or cc.state="unrequested")
) as total_enrollments, (
    select count(*) from grades_persistentcoursegrade where passed_timestamp is not NULL
) as total_course_completions;

You should get output that looks something like this:

+-----------------------+--------------+-------------------+--------------------------+
| current_uniqe_courses | active_users | total_enrollments | total_course_completions |
+-----------------------+--------------+-------------------+--------------------------+
|                     1 |            2 |                 1 |                        0 |
+-----------------------+--------------+-------------------+--------------------------+

Note: total_course_completions may not be accurate if don't have Persistent Grades Enabled In this case, please omit this or provide us this information via some other means that you might have.

@jmakowski1123
Copy link
Author

jmakowski1123 commented Feb 15, 2022

Thanks @feanil ! Agree, I think Version 2 is better to lead with - less intimidating. We could then have an option to link/scroll down to view each individual one if they want.

@jmakowski1123
Copy link
Author

jmakowski1123 commented Feb 15, 2022

@e0d Two questions about the flow of data:

How does data/results from the queries Feanil built above feed into FormAssembly? Is the assumption that Providers would have to manually input the results of each query they run into FA?

Once each FA has been submitted, what is the output that we see? How is the data presented? This question is related to how we structure the multi-tenant Google Sheet in #117

@e0d
Copy link

e0d commented Feb 16, 2022

I think we want the providers to supply us with the data in a sheet or CSV. We can integrate their sheets with the data gathered via the form. The goal is to really minimize busy work for providers.

For example, we can use the query function to join data across multiple files or sheets into a single sheet for our comprehensive analysis.

@jmakowski1123
Copy link
Author

I think we want the providers to supply us with the data in a sheet or CSV. We can integrate their sheets with the data gathered via the form. The goal is to really minimize busy work for providers.

For example, we can use the query function to join data across multiple files or sheets into a single sheet for our comprehensive analysis.

@e0d Just to make sure I understand the data flow. We circulate the FormAssembly Survey. For the subset of questions that can be answered with the queries, we ask for that data back in csv/excel. The remainder of the questions are answered directly in FA.

I'm wondering if it might be more straightforward to circulate the google sheet we are using for multi-tenancy instances for everyone? And not use FA at all? Or am I overlooking something?

@e0d
Copy link

e0d commented Feb 16, 2022

@jmakowski1123 My thought was that one would either use FA or the CSV, not both. The CSV is the interface for experts running many tenants. The FA form is the interface for the rest of us. This relies one the form of the data being the say in either case. If that is true, aggregating it will be easy.

@feanil
Copy link
Contributor

feanil commented Feb 16, 2022

I think that makes sense and I can incorporate that into instructions. It sounds like the entry point will be the instructions which will tell users to either use the form or the CSV to submit their data based on which criteria they meet. In either case it will be helpful for them to have the queries to get the data that's easy to calculate. For the CSV, how should they get it back to us? Should we have them e-mail it to Jenna? Should we setup a google groups inbox for this so it's not a specific person's e-mail address?(I lean towards the latter).

@jmakowski1123
Copy link
Author

jmakowski1123 commented Feb 16, 2022

Agree, I think a groups inbox is the better option. The instructions can also include that email if they have specific questions or issues perhaps

@feanil
Copy link
Contributor

feanil commented Feb 16, 2022

Here is a newer version of the instructions incorporating what we just talked about: https://gist.github.com/feanil/0b49d5219ae08baf1216556d455dfdbb

@jmakowski1123
Copy link
Author

Thanks @feanil ! Couple of minor edits to the text of the instructions (can I make these changes myself? If not, they are:)

In the intro, I'd like to add another line: "We aim to present the results of the Survey – all anonymized and in aggregate – at the Open edX Conference in April."

And in this line: "If you operate multiple sites, we'd love to hear about as many of them as you are willing to disclose." Change to ---> "If you operate multiple sites, we'd love to hear about all of them."

@feanil
Copy link
Contributor

feanil commented Feb 17, 2022

I think you can fork my gist but that can be annoying, so I've just made the updates you've asked for.

Link: https://gist.github.com/feanil/0b49d5219ae08baf1216556d455dfdbb

@jmakowski1123
Copy link
Author

Thanks!!

Repository owner moved this from In Progress to Done in Axim Engineering Tasks Feb 17, 2022
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
Archived in project
Development

No branches or pull requests

3 participants