How many students are on Domain of One’s Own? How many faculty? How many sophomores signed up last semester? Which freshman seminars are using Domain of One’s Own? How many domains will expire this month? Which of those belong to graduates, and which need to be renewed? Which platforms (WordPress, Known, DokuWiki, Omeka, etc.) are most frequently installed?
Each of these questions requires looking at data that lives in a different place, and some of them require comparing data from multiple sources. That makes it difficult to gain even a bird’s-eye view of the program: some of the big overview questions can only be answered accurately by getting into the nitty-gritty details of merging databases.
But not anymore.
Over the past three months, in between other projects, I’ve been building a data management and reporting system for UMW’s Domain of One’s Own program. Today I’m happy to announce that the public-facing part of that system, the dashboard-like site data.umwdomains.com, is now live! This site is a Shiny app running on a DigitalOcean virtual private server, built on R scripts that pull and merge data daily from several different sources:
- our Enom domain registrar account (current information via their robust API)
- a manual Enom transaction report (for historical account data)
- user account data on umw.domains (via the WordPress REST API)
- Banner, our institutional information system (via a manual pull from UMW’s SharePoint database system)
- WHMCS, our client management system (merged with Banner in SharePoint)
- Some manually created data tables containing information missing from one or more of these sources (usually for former employees or irregular accounts from the original DoOO pilot)
- coming soon: data from an Installatron plugin written by Martha Burtis that records app installations and asks users to volunteer information such as whether the site is for a class and, if so, which class and instructor
This is a lot of data. Figuring out how to clean, wrangle, and merge it all into a single coherent entity was a challenge, especially as I was learning a lot of the tools involved while doing it! But we now have all of this data in one place, making it much easier to ask questions like “How many sophomores signed up last semester?” or “How many students will be graduating (and need to migrate their data) this spring?” This is a big help for our planning, our account management, and (as we get more data about usage in specific courses) our faculty development offerings and our creation of documentation and other resources.
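To make the merging step concrete, here is a minimal sketch of joining registrar records to institutional records on a shared key and then answering a question like the sophomore one. It is written in Python using only the standard library, not in the R the system is actually built on, and every field name, row, and the choice of email as the join key is an assumption made up for illustration.

```python
# Hypothetical rows; the real sources and column names are not shown in this post.
registrar = [
    {"domain": "alice.example", "email": "alice@example.edu", "registered": "2016-09-05"},
    {"domain": "bob.example", "email": "bob@example.edu", "registered": "2016-10-12"},
    {"domain": "carol.example", "email": "carol@example.edu", "registered": "2016-02-01"},
]
institutional = [
    {"email": "alice@example.edu", "group": "Student", "class": "Sophomore"},
    {"email": "bob@example.edu", "group": "Student", "class": "Senior"},
]


def left_join(left, right, key):
    """Keep every left row; add fields from the matching right row, if any."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup.get(row[key], {})} for row in left]


def signups_in_range(rows, start, end, student_class):
    """Count rows of a given class registered between two ISO dates (inclusive)."""
    return sum(
        1
        for r in rows
        if r.get("class") == student_class and start <= r["registered"] <= end
    )


merged = left_join(registrar, institutional, "email")
sophomores_fall_2016 = signups_in_range(merged, "2016-08-01", "2016-12-31", "Sophomore")
print(sophomores_fall_2016)
```

A left join keeps accounts that have no institutional match (the third row here), which is roughly the situation the manually created tables for former employees and pilot accounts are meant to cover.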
While this dataset makes it easier for us to access the data we need, it also raises questions about user data privacy, something that DTLT cares about deeply. So I should note that we (and the keepers of the institutional data at UMW) have been careful not to collect data that we don’t need or that will compromise users’ privacy. In fact, the only new data collection is the plugin that records app installations and asks about class usage. The other data is either already necessary to run Domain of One’s Own (name, domain, email address, student/faculty/staff status, active/inactive/graduated student, etc.) or standard university personnel data (class, matriculation date, anticipated graduation date, etc.). We’ve also taken care to ensure that the API links between umw.domains, data.umwdomains.com, and Enom are secure, and that the online data storage is secure.
We’ve also drawn a clear separation between what we’re comfortable sharing publicly on data.umwdomains.com and what we’re restricting to internal reports. As you can see on data.umwdomains.com, we only post broad, aggregate data in public: total domains allocated over time, and signups by year/semester/month, divided by faculty/staff and students. In the future, we’ll also likely add information about app usage (once we have more data) and break down student data by class, Fr/So/Jr/Sr/Grad (once I write a couple of functions to convert current class and/or graduation date into student class at the time of signup).
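For a sense of what such a conversion function might look like, here is a minimal sketch in Python (not the R used for the real system). It assumes a four-year undergraduate track, an academic year that starts in August, and that the anticipated graduation year is known. Graduate students and anyone off the four-year track are not handled, and the function and constant names are hypothetical.

```python
from datetime import date

# Index = number of academic years left after the one containing the signup.
CLASSES = ["Senior", "Junior", "Sophomore", "Freshman"]


def academic_year_end(d, start_month=8):
    """Calendar year in which the academic year containing d ends."""
    return d.year + 1 if d.month >= start_month else d.year


def class_at_signup(signup, grad_year):
    """Estimate class standing at signup from the anticipated graduation year."""
    years_left = grad_year - academic_year_end(signup)
    if years_left < 0:
        return "Graduated"  # signed up after the anticipated graduation
    if years_left >= len(CLASSES):
        return None  # signed up before a four-year track would have started
    return CLASSES[years_left]


print(class_at_signup(date(2015, 2, 1), 2017))
```

Under these assumptions, a member of the Class of 2017 who signed up in February 2015 was in the 2014-15 academic year, two years before their final one, so the function reports Sophomore.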
Internally, we collect and report more information. I’ve written an RMarkdown script that will generate a report each month containing more detailed information. Here’s a sample of the test report for January:
Monthly report: January 2017
We had 82 new domain registrations, 533 renewals, and 3 domains expire in the month of January 2017.
New registrations by group
group | class | count |
---|---|---|
Student | Senior | 31 |
Student | Junior | 27 |
Student | Sophomore | 13 |
Student | Freshman | 9 |
Faculty/Staff | Senior | 2 |
Below the class-by-class breakdown is a list of all the students and faculty/staff who signed up for Domain of One’s Own last month, with their domain URL, their email address, their status (student or faculty/staff), and their signup date. Once I’ve added course data to the mix, we can add that to the report as well. The monthly report also gives us a list of domains and users whose domains expired in the past month.
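A breakdown like the one in the sample report amounts to counting signups per (group, class) pair. The sketch below shows one way to do that in Python with the standard library. The rows are invented, the real report is generated by an RMarkdown script, and printing a missing class as "NA" is an assumption.

```python
from collections import Counter

# Invented (group, class) pairs, one per new registration.
signups = [
    ("Student", "Senior"),
    ("Student", "Junior"),
    ("Student", "Senior"),
    ("Faculty/Staff", None),
]

counts = Counter(signups)
# Largest group first; ties keep the order in which each pair first appeared.
rows = sorted(counts.items(), key=lambda item: -item[1])

lines = ["group | class | count |", "---|---|---|"]
for (group, student_class), n in rows:
    lines.append(f"{group} | {student_class or 'NA'} | {n} |")
table = "\n".join(lines)
print(table)
```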
The report also does some projecting for future months. Here’s a sample of the test projection for February.
Monthly projection: February 2017
In February 2017, we have 7 domains scheduled to expire (or renew).
Based on a linear model of past Domain of One’s Own registration activity, we expect approximately 34 new registrations in February 2017. Please note that these projections are only rough estimates, based on a small data set without accounting for special events and initiatives that may have contributed to registration counts in the past.
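As an illustration of what a projection like this involves, here is a minimal ordinary least-squares sketch in Python. It fits a straight line to monthly registration counts against a simple month index and extrapolates one month ahead. The counts are made up, and using the month index as the only predictor is an assumption. The actual model lives in the R code and may use different inputs.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up registration counts for five consecutive months (not real UMW data).
monthly_counts = [10, 14, 18, 22, 26]
months = list(range(len(monthly_counts)))

slope, intercept = fit_line(months, monthly_counts)
# Extrapolate to the next month index.
projection = slope * len(monthly_counts) + intercept
print(round(projection))
```

A straight-line trend like this ignores semester rhythms and one-off events, which is part of why such projections are only rough estimates.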
Note that student class in this plot represents their current class, not their class at the time of signup. Most of the time, then, ‘NA’ means graduated students. In the future, I’ll add some nuance to this so that we can see both graduating class (“Class of 2017”, “Class of 2016”, etc.) and Fr/So/Jr/Sr/Grad at the time of signup. Both of those distinctions will be helpful for us to track, but they require a little more wrangling of the data to produce.
The report also contains a list of the users and domains set to expire or renew that month, so we can ensure that auto-renew is on or off, as appropriate. Knowing who is rolling off or renewing, and how many new domains to expect in the next couple of months, helps us ensure that we’re properly supporting and funding the program.
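The expiration check can be sketched as a filter on expiry month plus a comparison between each domain’s auto-renew flag and the account’s status. This is a Python illustration with invented rows and field names. The rule that active accounts should renew and graduated ones should not is an assumption standing in for whatever “as appropriate” means in practice.

```python
from datetime import date

# Invented rows; field names are hypothetical.
domains = [
    {"domain": "alice.example", "expires": date(2017, 2, 10), "auto_renew": True, "status": "active"},
    {"domain": "bob.example", "expires": date(2017, 2, 20), "auto_renew": True, "status": "graduated"},
    {"domain": "carol.example", "expires": date(2017, 3, 1), "auto_renew": False, "status": "active"},
]


def expiring_in(rows, year, month):
    """Rows whose expiry date falls in the given calendar month."""
    return [r for r in rows if (r["expires"].year, r["expires"].month) == (year, month)]


def needs_review(row):
    """True when the auto-renew flag disagrees with the assumed rule."""
    should_renew = row["status"] == "active"  # assumed policy, not a documented one
    return row["auto_renew"] != should_renew


february = expiring_in(domains, 2017, 2)
flagged = [r["domain"] for r in february if needs_review(r)]
print(flagged)
```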
I’m really excited about this. It was fun to build, I learned some cool things while doing it, and it will help us ensure that we’re doing right by UMW’s students, faculty, and staff as Domain of One’s Own continues to grow. If you’re at another institution with a Domain of One’s Own program and want to see if these reports will fit your setup, please check out the code on GitHub. Private data has been omitted, but all the scripts for the data wrangling, the Shiny app, and the RMarkdown report are there for you to test out!
Header image by Bruno Scramgnon (CC0).