mitodl/open-discussions

Include external courses and programs in the catalog API results

Closed this issue · 14 comments

As a user of the xpro catalog API, I'd like external courses and programs to be included in the results, with a field identifying which are external and which are internal.

Designs and Mockups

The current viewsets for these API responses will need a significant refactor, since they are each based on one model (Course or Program) and one serializer (CourseSerializer or ProgramSerializer). External course and program data needs to come from different models (ExternalCoursePage, ExternalProgramPage), which will need their own serializers, and these will need to be combined somehow in the API response. Something like this might work:

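from rest_framework.permissions import IsAuthenticated
from rest_framework.response import Response
from rest_framework.views import APIView


class CourseAPIView(APIView):
    permission_classes = (IsAuthenticated,)
    internal_queryset = Course.objects.all()
    external_queryset = CoursePage.objects.all()
    internal_serializer_class = CourseSerializer
    external_serializer_class = CoursePageSerializer

    def get(self, request, *args, **kwargs):
        # .all() clones the class-level querysets so results are not cached across requests
        internals = self.internal_serializer_class(self.internal_queryset.all(), many=True)
        externals = self.external_serializer_class(self.external_queryset.all(), many=True)

        # With many=True, .data is a list, so the two result sets can be concatenated
        response_results = list(internals.data) + list(externals.data)
        return Response(response_results)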

Program vs ProgramPage: Both models have (directly or via foreign key) a title, description, readable_id, price, and thumbnail.
Course vs CoursePage: Both models have (directly or via foreign key) a title, description, readable_id, and thumbnail.

Program Courses: From what I can tell, there does not seem to be any equivalent for this in ProgramPages. So each external program won't have any courses listed.

Course Runs: I think we will have to assume that each external course has one run, with the same title and description as the CoursePage, and with the CoursePage start date and price. There are no fields in the CoursePage model for other dates (end_date, enrollment_<start/end>, expiration_date), nor for any of the following fields usually included in API run data: run_tag, product_id, instructors. I am guessing that the following may be equivalent, but I'm not sure:

CoursePage.external_url == Course.url ?
CoursePage.readable_id == CourseRun.courseware_id ?
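
The single-run assumption above could be sketched roughly as follows. This is only an illustration: `ExternalCourseInfo` is a hypothetical stand-in for the CoursePage fields named in this issue (not the real model), `synthesize_run` is a made-up helper name, and the `external_url`/`readable_id` mappings are the unconfirmed guesses listed above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class ExternalCourseInfo:
    """Hypothetical stand-in for the CoursePage fields discussed in this issue."""

    id: int
    title: str
    readable_id: str
    external_url: Optional[str] = None
    start_date: Optional[datetime] = None
    price: Optional[float] = None


def synthesize_run(page: ExternalCourseInfo) -> Dict[str, Any]:
    """Build the single assumed run for an external course."""
    return {
        "title": page.title,
        "start_date": page.start_date.isoformat() if page.start_date else None,
        # CoursePage has no fields for these dates, so they stay empty.
        "end_date": None,
        "enrollment_start": None,
        "enrollment_end": None,
        "expiration_date": None,
        "courseware_url": page.external_url,  # guess: external_url == Course.url
        "courseware_id": page.readable_id,  # guess: readable_id == courseware_id
        "run_tag": None,
        "id": page.id,
        "product_id": None,
        "instructors": [],
        "current_price": page.price,
    }
```

Fields with no CoursePage equivalent are emitted as `null`/empty rather than omitted, so the run keeps the same shape as an internal one.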

Platform/Offered By: What should be used for external courses and programs? A generic "Other"? open-discussions currently assumes everything in these API results is for the xpro platform and offered by xpro.

Acceptance Criteria:

[ ] All courses and programs, internal and external, are returned by the urls https://xpro.mit.edu/api/courses/ and https://xpro.mit.edu/api/programs/ respectively
[ ] A new field for each entity in the API response (is_external, True or False) will indicate whether each course/program is external or internal
[ ] open-discussions assigns a correct platform and offered_by value for externals (either something generic like "Other", or the xpro pages need to specify them and the API needs to include this).
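
For the last criterion, the importer side might look something like the sketch below. It assumes a hypothetical `platform` key that the API does not currently provide, and `platform_for` is a made-up helper, not the actual open-discussions import code.

```python
from typing import Any, Dict

DEFAULT_INTERNAL = "xpro"  # what open-discussions currently assumes for every entry
DEFAULT_EXTERNAL = "Other"  # the generic fallback floated in this issue


def platform_for(entry: Dict[str, Any]) -> str:
    """Pick a platform value for one API entry."""
    # Prefer an explicit value if the API ever supplies one (hypothetical "platform" key).
    explicit = entry.get("platform")
    if explicit:
        return explicit
    # Otherwise fall back on the is_external flag; a missing flag is treated as internal.
    return DEFAULT_EXTERNAL if entry.get("is_external") else DEFAULT_INTERNAL
```

The same approach would apply to offered_by if the xpro pages end up specifying it.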

It's hard for me to follow the details without some sample values. Can you mock up an API response for an external course?

It should be the same as for internal courses, though lots of the fields will be blank. So something like this:

[
    {
        "id": 18,  // Based on model id, so there might be dupes because the data will come from multiple models
        "title": "Discovering and Implementing Your Leadership Strengths",
        "description": "<p>A three-week online course for technical professionals that will empower you with the essential skills needed to solve problems, innovate, and drive change.</p>",
        "thumbnail_url": "https://xpro-app-production.s3.amazonaws.com/original_images/NEW-LP-Course-4_New.jpg",
        "readable_id": "course-v1:xPRO+MLx2-SL",
        "is_external": true,
        "courseruns": [
            {
                "title": "Discovering and Implementing Your Leadership Strengths",  // Same as `CoursePage title`
                "start_date": "2023-02-06T05:00:00Z",   // Same as `CoursePage start_date`
                "end_date": null,
                "enrollment_start": null,
                "enrollment_end": null,
                "expiration_date": null,
                "courseware_url": "https://xpro.mit.edu/programs/program-v1:xPRO+MLx-SL+R1/",  // `CoursePage.external_url`?
                "courseware_id": "course-v1:xPRO+MLx2-SL",  // `CoursePage.readable_id`?
                "run_tag": null,
                "id": 18,   // Same as `CoursePage.id`
                "product_id": null,
                "instructors": [],
                "current_price": 949.0  // `CoursePage.price`
            }
        ],
        "next_run_id": null,
        "topics": []
    }
]

And for programs:

[
    {
        "title": "Executive Leadership Principles",
        "description": "Some description",
        "thumbnail_url": "https://xpro.mit.edu/static/images/mit-dome.png",
        "readable_id": "program-v1:xPRO-SL+LASERx-SL",
        "current_price": 0.0,
        "id": 13, // Based on model id, so there might be dupes because the data will come from multiple models
        "courses": [],  // No courses list field in page
        "is_external": true,
        "start_date": "2023-01-10T05:00:00Z",
        "end_date": null,
        "enrollment_start": null,
        "url": "https://sl-onlinetraining.xpro.mit.edu/executive-leadership-program?abVariation=235erdcdf",
        "instructors": [],
        "topics": []
   }
]     
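
Because `id` is based on the underlying model, an internal and an external entry can share the same value. One way a consumer could cope is to key entries on the pair `(is_external, id)` rather than on `id` alone. A minimal sketch with made-up entries; `entry_key` and `index_entries` are hypothetical helper names:

```python
from typing import Any, Dict, Iterable, Tuple

EntryKey = Tuple[bool, int]


def entry_key(entry: Dict[str, Any]) -> EntryKey:
    """Combine is_external with the model id so the two id spaces cannot collide."""
    return (bool(entry.get("is_external", False)), entry["id"])


def index_entries(entries: Iterable[Dict[str, Any]]) -> Dict[EntryKey, Dict[str, Any]]:
    """Index API entries by their composite key."""
    return {entry_key(entry): entry for entry in entries}


# Two made-up entries that share an id but come from different models.
entries = [
    {"id": 18, "title": "Internal course", "is_external": False},
    {"id": 18, "title": "External course", "is_external": True},
]
indexed = index_entries(entries)
```

Keying on `id` alone would have silently overwritten one of the two entries here.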

We need topics for both courses and programs. I expect we'll need to add a field to the CMS for collecting topics (could be a separate issue).

Can you tell from looking at the ProLearn API if there are any other fields we are missing? Like maybe format? (again, could be a separate issue)

Not sure what format is. The ProLearn API is missing instructors, enrollment start/end dates, and expiration dates.

Summary of what's missing from externals (most are not necessary for the open-discussions import):

Courses:

  • course_runs (can include one in API response based on course page info)
  • end_date
  • enrollment_start
  • enrollment_end
  • expiration_date
  • instructors
  • topics
  • product_id
  • next_run_id
  • run_tag
  • platform/offered_by

Programs:

  • courses
  • end_date
  • enrollment_start
  • instructors
  • topics
  • platform/offered_by

Which are required for the open-discussions import?

The open-discussions import function is pretty forgiving of missing data, but:

  • In order for a course details drawer to have a link to the course, a URL must be provided for either the course or its runs. Ditto for programs. If it is missing, the drawer will still work fine, but no link to the course/program will be displayed.
  • The import always assumes the platform/offered_by value is xpro. If you want this to be different for external courses, and not just a generic "Other" value for all of them, it will need to be provided.
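
The link rule in the first bullet could be expressed roughly like this. It is only a sketch using the key names from the mockups earlier in this thread (`url`, `courseruns`, `courseware_url`); `resolve_link` is a hypothetical helper, not the actual import function.

```python
from typing import Any, Dict, Optional


def resolve_link(entry: Dict[str, Any]) -> Optional[str]:
    """Return the first usable URL for a course or program entry, or None."""
    # A URL on the course/program itself wins.
    if entry.get("url"):
        return entry["url"]
    # Otherwise fall back to the first run that carries a URL.
    for run in entry.get("courseruns") or []:
        if run.get("courseware_url"):
            return run["courseware_url"]
    # No URL anywhere: the drawer still works, it just shows no link.
    return None
```

Returning `None` instead of raising matches the "forgiving" behavior described above.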

Ferdi commented

We'd probably need external courses to have metadata similar to the regular ones. For that to happen, we need to:

  1. Add the relevant fields to the respective Wagtail content type
  2. Have someone fill in and maintain the content (manually, for now at least)

Right?

Yes, that's right. I'll open an issue for the CMS changes.

I've been told that the xPRO team will manage the content for these external courses manually.

@arslanashraf7 @cachob what is left to do on this issue?

@mbertrand can we start ingesting external xPRO courses into the Open index?

@arslanashraf7 @cachob what is left to do on this issue?

@pdpinch This whole feature should be available on RC now, since the last PR with the API changes is being merged in this release.

We can test the new API data at:

Could you take a look, or ask someone familiar with these changes to check the API results on RC, and let me know whether the response details look good?

Note: There will be a slight change in the response when we finish mitodl/mitxpro#2628, because we're moving external_marketing_url to the Courseware pages in the CMS instead of CourseRuns/ProgramRuns. At the API level, this field will then appear directly under the courses/programs section instead of nested under Course Runs/Program Runs. This is further discussed in mitodl/mitxpro#2613.

Sorry for the late response, but yes, the external courses should get picked up by open-discussions automatically.

I think we can close this one. @mbertrand what do you think?