DevOps Manager. Engineering & Technical Operations, MIT
The recent shift to digital learning has permanently opened doors for providing world-class education regardless of location. To meet this opportunity for its current and future students, MIT is building its entire open courseware infrastructure at the edge. Hear how building on Fastly has enabled the MIT team to bring open courseware under engineering control, simplify architecture, and streamline its health check and publishing workflows — completely removing a step between QA and production.
Tobias Macey (00:00):
Hi, this is Tobias Macey, and I'm here to talk about what we've been doing at MIT Open Learning to bring education to the edge, where we've been replatforming the OpenCourseWare site for the cloud era. To give a bit of an overview of OpenCourseWare by the numbers, it was originally launched in 2001 with the goal of making MIT's educational techniques and course materials freely available to everybody in the world, so that they can gain better access to education. In that time, we've been able to publish over 2,400 courses with 500 million visitors to the site, and we have moved the majority of our video content onto YouTube, where we've gained over 1.7 million subscribers. Another interesting statistic about OpenCourseWare in particular is that over half of the traffic is from people outside of North America, which is surprising for a site that is primarily produced in the United States.
Tobias Macey (01:09):
In terms of how we're using OpenCourseWare today and the way that we're managing content, we are currently still on a legacy platform that's using an old CMS called Plone, and it has been falling out of maintenance because of various issues due to technical debt. It has a fairly cumbersome workflow where the progression requires a lot of stages. Actually being able to put content into the site takes a number of different steps. There are a number of different forms and attributes involved in producing the content that are not very intuitive, particularly for people who are used to modern CMSs. Once the content has been produced, it is copied over to the origin servers and various other systems using SSH, which is error-prone, and there are just a lot of moving pieces to this overall site.
Tobias Macey (02:07):
Once we do get it to the origin server, though, it is fairly easy for delivery because we're able to take advantage of a CDN, in this case Fastly, and all the content is served as static HTML with some downloadable assets, such as PDFs, course syllabuses, or quizzes for people who want to be able to experiment with testing their own learning. The actual server is just a single instance running Nginx to be able to proxy those static files, and we're able to take advantage of Fastly capabilities, such as shielding, to reduce the overall load incoming to that origin server, as well as improving the speed at which content is delivered to all of the global edge locations, which is important for OpenCourseWare because of the global audience that we are trying to serve.
Tobias Macey (03:04):
We've also been using aggressive caching strategies, so that content is long-lived and will only be retired once we publish a new version. We've also been able to take advantage of the image optimizer capability so that images are significantly reduced in size, the bandwidth is reduced, and it improves the experience of people who are in bandwidth-constrained environments, such as developing nations, or of people who are unable to afford high-speed internet for whatever the reasons might be. We still want to be able to reach those people because education is important, particularly for helping to elevate people's overall station in life. So that's one of the things that we want to ensure: that we are not blocking this content for the people who need it the most.
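As an illustration of the long-lived caching described above, here is a minimal sketch of the kind of response headers an origin might send. It assumes the CDN honors `Surrogate-Control` for its own cache lifetime (Fastly does) while browsers follow `Cache-Control`. The function name and the TTL values are hypothetical and are not taken from the talk.

```python
ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # assumed edge lifetime, not MIT's actual setting


def cache_headers(edge_ttl_seconds=ONE_YEAR_SECONDS, browser_ttl_seconds=300):
    """Build headers that keep content at the edge for a long time
    while letting browsers revalidate more often."""
    return {
        # Read by the CDN: how long the edge may keep the object.
        "Surrogate-Control": f"max-age={edge_ttl_seconds}",
        # Read by browsers: a shorter lifetime for end-user caches.
        "Cache-Control": f"public, max-age={browser_ttl_seconds}",
    }


if __name__ == "__main__":
    for name, value in cache_headers().items():
        print(f"{name}: {value}")
```

With a long edge lifetime like this, stale content is retired by purging on publish rather than by waiting for expiry.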
Tobias Macey (04:02):
One of the big wins that we've been able to realize by moving to Fastly from our prior provider is a savings of over 30% annually in terms of the actual cost of bandwidth and delivery. We have also been able to gain better access to more granular metrics, such as the geography of people who are visiting the site, so that we can build better reports as to who is accessing the content, from where, and when. We're also able to gain a better understanding of things such as cache hits and cache misses, and of ways that we can improve the site to ensure that the experience is the best for everybody who's accessing it.
Tobias Macey (04:42):
We've also been able to improve the policies and configuration updates for caching on the CDN and bring that under control of the engineering department, whereas previously it had been the role of just a small handful of people who had an understanding of how the old CDN provider was set up. We had to rely a lot on their support teams to make changes, but by using Fastly and its more intuitive configuration approach, we have gained the option of doing things such as targeted cache purges, so that we don't need to purge the entire site. We can just purge a cluster of resources when a single course is published, and things like that.
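A targeted purge of this kind could be sketched as follows, assuming each course's responses are tagged with a `Surrogate-Key` header and purged through Fastly's purge-by-key API endpoint. The key naming scheme (`course-<id>`), the function names, and the placeholder service ID and token are hypothetical. The sketch only builds the request and does not send it.

```python
import urllib.request

FASTLY_API = "https://api.fastly.com"


def course_surrogate_key(course_id):
    """Hypothetical naming scheme: one key shared by every page of a course."""
    return f"course-{course_id}"


def build_purge_request(service_id, surrogate_key, api_token):
    """Build (but do not send) a purge-by-surrogate-key request."""
    url = f"{FASTLY_API}/service/{service_id}/purge/{surrogate_key}"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Fastly-Key": api_token, "Accept": "application/json"},
    )


if __name__ == "__main__":
    key = course_surrogate_key("18-06-linear-algebra")
    request = build_purge_request("SERVICE_ID", key, "API_TOKEN")
    print(request.get_method(), request.full_url)
    # Sending it would be: urllib.request.urlopen(request)
```

Publishing a single course then invalidates only the objects that carry that course's key, leaving the rest of the site cached.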
Tobias Macey (05:31):
So that's where we've been able to get to today with OpenCourseWare, but by being able to improve the overall stability of the site and reduce the amount of maintenance that's necessary to keep it up and running, it has freed us up to consider what the next generation of the OpenCourseWare platform is going to look like. To that end, we have decided to structure the authoring workflow around static Markdown files, because they are easier to author. They are more accessible than the legacy CMS that we've been dealing with. They are also easier to version and track in various systems, and that also improves the accessibility of the authoring, so that if we wanted to, we could open up this overall pipeline to people who want to be able to build their own bespoke course websites outside of the domain of OpenCourseWare.
Tobias Macey (06:27):
Once we have the content in that Markdown format, we are using static site publishing tools, such as Hugo, to render the content to HTML, using custom themes and custom templates to set the specific structure for how we want that content to be represented. Once that content is rendered to HTML as a static site, we can copy it all up to S3, where we're able to take advantage of the high reliability and durability guarantees of object storage to serve as the origin, which also reduces the maintenance burden in terms of having to ensure that there is an origin server that is remaining up, that is being updated with the latest security patches, and that is using the latest versions of TLS. So that just reduces, again, the amount of time that we need to spend managing infrastructure, when the primary goal of this tool is to just get the content out and in front of people.
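The copy-to-S3 step above can be sketched as a small planning function that maps each rendered file to an object key and content type. This is an illustration under stated assumptions: Hugo's default output directory `public/` is used, the helper names are hypothetical, and the actual upload call (for example through an AWS SDK) is deliberately left out.

```python
import mimetypes
from pathlib import Path


def s3_key_for(site_root, file_path, prefix=""):
    """Object key for a rendered file: its path relative to the site root."""
    relative = Path(file_path).relative_to(site_root).as_posix()
    prefix = prefix.strip("/")
    return f"{prefix}/{relative}" if prefix else relative


def content_type_for(file_path):
    """Guess the Content-Type so the file is served correctly from the origin."""
    guessed, _encoding = mimetypes.guess_type(str(file_path))
    return guessed or "application/octet-stream"


def plan_uploads(site_root, prefix=""):
    """List (local path, object key, content type) for every rendered file."""
    root = Path(site_root)
    return [
        (path, s3_key_for(root, path, prefix), content_type_for(path))
        for path in sorted(root.rglob("*"))
        if path.is_file()
    ]


if __name__ == "__main__":
    # "public" is Hugo's default output directory.
    for local_path, key, content_type in plan_uploads("public"):
        print(f"{local_path} -> {key} ({content_type})")
```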
Tobias Macey (07:25):
In terms of the delivery, we're able to continue using Fastly in the same fashion so that we have a reliable mechanism for delivering that content globally, and we're able to just use that S3 origin from Fastly while still being able to do things such as managing redirects or rewriting paths for different files as things change, without having to worry about pushing that logic into the S3 layer or restructuring the content as it sits in the buckets. We can handle those aspects using Fastly directly.
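The redirect and path-rewriting logic mentioned above would live in the CDN configuration rather than in application code. Purely to illustrate the decision being made at the edge, here is a sketch in Python. The redirect table, the paths, and the rule of appending `index.html` to directory-style URLs are all hypothetical examples, not MIT's actual rules.

```python
# Hypothetical table of moved pages: old path -> new path.
REDIRECTS = {
    "/courses/old-physics/": "/courses/physics/",
}


def resolve(path, redirects=REDIRECTS):
    """Return (status, target) for an incoming request path.

    301 means the client is redirected to target; 200 means the request is
    forwarded to the object-storage origin using target as the object path.
    """
    if path in redirects:
        return 301, redirects[path]
    if path.endswith("/"):
        # Object storage has no directory index, so rewrite to a concrete object.
        return 200, path + "index.html"
    return 200, path


if __name__ == "__main__":
    for example in ("/courses/old-physics/", "/courses/physics/", "/a/syllabus.pdf"):
        print(example, "->", resolve(example))
```

Keeping these rules at the edge means the objects in the bucket never need to be renamed when a URL changes.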
Tobias Macey (08:04):
So in order to transition from the old system to the new system, we want to be able to carry all of that existing content forward into the new platform, because there's a lot of value to be had there. To make that transition, we've added a new publishing target in the legacy CMS, so that in addition to generating the HTML content that we're publishing and delivering from the origin through Fastly, we have also structured all of that information into a set of JSON documents, including binary representations of things such as images and PDF files, with the HTML also embedded directly into those documents. Then we're able to store that in S3, and we have a custom pipeline that picks up that information, transforms it into the representation and the structure that we're looking for with our new system, parses all of that HTML, and renders it out to Markdown, so that we have a solid starting point for the new system to take over without having to do anything manual to bring that content along for the journey.
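A rough sketch of the conversion step described above is shown below. It is not the actual pipeline: the JSON field names (`html`, `assets`), the assumption that binary assets are base64-encoded, and the very small subset of HTML handled (headings, paragraphs, links, bold, italic) are all hypothetical choices made for illustration.

```python
import base64
import json
from html.parser import HTMLParser


class MarkdownRenderer(HTMLParser):
    """Convert a small subset of HTML (h1-h3, p, a, strong/b, em/i) to Markdown."""

    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}
    INLINE = {"strong": "**", "b": "**", "em": "*", "i": "*"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self.parts.append(self.HEADINGS[tag])
        elif tag in self.INLINE:
            self.parts.append(self.INLINE[tag])
        elif tag == "a":
            self.href = dict(attrs).get("href", "")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in self.HEADINGS or tag == "p":
            self.parts.append("\n\n")  # blank line ends a block
        elif tag in self.INLINE:
            self.parts.append(self.INLINE[tag])
        elif tag == "a":
            self.parts.append(f"]({self.href})")
            self.href = None

    def handle_data(self, data):
        self.parts.append(data)


def html_to_markdown(html):
    renderer = MarkdownRenderer()
    renderer.feed(html)
    renderer.close()
    return "".join(renderer.parts).strip()


def convert_document(raw_json):
    """Return (markdown, assets) for one exported document.

    Assumed layout: {"html": "...", "assets": {"name": "<base64 data>"}}.
    """
    doc = json.loads(raw_json)
    markdown = html_to_markdown(doc["html"])
    assets = {
        name: base64.b64decode(data)
        for name, data in doc.get("assets", {}).items()
    }
    return markdown, assets


if __name__ == "__main__":
    example = json.dumps({
        "html": '<h1>Syllabus</h1><p>See the <a href="/notes">notes</a>.</p>',
        "assets": {"logo.png": base64.b64encode(b"not a real image").decode()},
    })
    markdown, assets = convert_document(example)
    print(markdown)
    print(sorted(assets))
```

A real migration would need to handle lists, tables, images, and malformed markup, which is where a dedicated HTML-to-Markdown library would likely replace the hand-written parser.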
Tobias Macey (09:20):
Once that content has been rendered out to Markdown, we're able to then pick it up from that place as if somebody was just authoring a brand new site, run it through that Hugo workflow, and bring in those custom templates and custom themes. Then all of the images and binary assets that were embedded in those JSON documents, we just upload to S3, and it can all live there and be served from that origin using Fastly. So [inaudible 00:09:49] we are using Fastly for being able to deliver OpenCourseWare, we have also started using it for some of our other applications that we're building at MIT Open Learning.
Tobias Macey (09:59):
So one of the interesting aspects of Fastly is that it provides an anycast IP address, which means that if you have systems that would otherwise require something like a CNAME in your DNS, we're instead able to use an A record, which frees us up to use custom subdomains for other things, such as email delivery, email validation, and using TXT records to validate things such as Google Analytics and domain ownership. In addition to that, we've also started using it for some of our new applications that we're building at MIT Open Learning, most notably MIT Bootcamps, where before it was an in-person experience and, because of the pandemic, we have had to bring it entirely online. So we've built out a brand new application to handle things such as the application process for users who want to take part in that experience. We're also using it for the xPro application, for marketing and for tracking progress in the dashboard for users who are taking part in those professional development courses, for folks who want to improve their overall career trajectories.
Tobias Macey (11:15):
So thank you for listening. If you have any questions, I'm always happy to answer them or discuss other things that we're building at MIT Open Learning. You can find me online. In addition to my work as the team lead for platform and data engineering at MIT, I also host the Data Engineering Podcast and Podcast.__init__. So if you are interested in any of those topics, again, I'm happy to answer your questions. You can also find me on Twitter at Tobias Macey, and my LinkedIn is on this final slide.