brightspaceR ships with convenience functions that handle the repetitive work of joining Brightspace Data Sets (BDS) and parsing column types. This article walks through each one.
bs_get_dataset() is the workhorse. It looks up a dataset
by name, downloads the latest full extract, parses the CSV with the
correct column types, and returns a tidy tibble with snake_case column
names:
If you don’t know the exact dataset name, use
bs_list_datasets() to browse everything available, or
bs_search_datasets() to filter by keyword:
BDS datasets are normalized – users, enrollments, grades, and org units live in separate tables linked by ID columns. brightspaceR provides two ways to join them.
bs_join()
bs_join() examines both data frames, finds columns ending in _id that appear in both, and performs a left join on those columns:
users <- bs_get_dataset("Users")
enrollments <- bs_get_dataset("User Enrollments")
# Automatically joins on user_id
combined <- bs_join(users, enrollments)
This works well for most pairs of datasets. Under the hood it uses the schema registry to identify key columns.
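To make the matching rule concrete, here is a minimal sketch of the idea in plain dplyr. It is not brightspaceR's implementation: the helper shared_id_columns() and the toy data are hypothetical, and the sketch detects keys purely by the _id suffix without consulting the schema registry.

```r
library(dplyr)

# Hypothetical helper: columns ending in "_id" that exist in both data frames.
shared_id_columns <- function(x, y) {
  intersect(grep("_id$", names(x), value = TRUE), names(y))
}

# Toy stand-ins for the Users and User Enrollments datasets (made-up values).
users <- data.frame(
  user_id = c(1, 2, 3),
  user_name = c("ana", "ben", "cy")
)
enrollments <- data.frame(
  user_id = c(1, 1, 2),
  org_unit_id = c(10, 11, 10),
  role_id = c(5, 5, 7)
)

# Only user_id ends in "_id" and appears in both tables.
keys <- shared_id_columns(users, enrollments)
keys

# Left join on the detected keys; users without enrollments are kept.
combined <- left_join(users, enrollments, by = keys)
combined
```

User 3 has no enrollment, so the left join keeps that row and fills the enrollment columns with NA.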
For explicit, self-documenting code, use the named join functions. Each specifies exactly which key columns are used:
# Users + Enrollments (by user_id)
bs_join_users_enrollments(users, enrollments)
# Enrollments + Grades (by org_unit_id and user_id)
grades <- bs_get_dataset("Grade Results")
bs_join_enrollments_grades(enrollments, grades)
# Grades + Grade Objects (by grade_object_id and org_unit_id)
grade_objects <- bs_get_dataset("Grade Objects")
bs_join_grades_objects(grades, grade_objects)
# Enrollments + Org Units (by org_unit_id)
org_units <- bs_get_dataset("Org Units")
bs_join_enrollments_orgunits(enrollments, org_units)
# Enrollments + Roles (by role_id)
roles <- bs_get_dataset("Role Details")
bs_join_enrollments_roles(enrollments, roles)
# Content Objects + User Progress (by content_object_id and org_unit_id)
content <- bs_get_dataset("Content Objects")
progress <- bs_get_dataset("Content User Progress")
bs_join_content_progress(content, progress)
All join functions use dplyr::left_join(), so the first argument determines which rows are preserved.
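The effect of argument order is easiest to see on a small case. The sketch below calls dplyr::left_join() directly on made-up data rather than the named brightspaceR functions, so it runs without a Brightspace connection.

```r
library(dplyr)

# Toy data: user 3 has no enrollment, and the enrollment for user 9 has no user row.
users <- data.frame(user_id = c(1, 2, 3))
enrollments <- data.frame(user_id = c(1, 2, 9), org_unit_id = c(10, 10, 11))

# Users first: every user is kept, and user 3 gets NA for org_unit_id.
users_first <- left_join(users, enrollments, by = "user_id")

# Enrollments first: every enrollment is kept, including the one for user 9.
enrollments_first <- left_join(enrollments, users, by = "user_id")

users_first$user_id
enrollments_first$user_id
```

Put the table whose rows must all survive in the first position.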
Build a complete grade report by chaining joins with the pipe:
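A chained report might look like the sketch below. It uses toy data frames and plain dplyr::left_join() calls with the key columns listed above, standing in for the named bs_join_*() functions. All values are invented for illustration.

```r
library(dplyr)

# Made-up miniature versions of four BDS tables.
users <- data.frame(user_id = c(1, 2), user_name = c("ana", "ben"))
enrollments <- data.frame(user_id = c(1, 2), org_unit_id = c(10, 10))
grades <- data.frame(
  user_id = c(1, 2),
  org_unit_id = c(10, 10),
  grade_object_id = c(100, 100),
  points_numerator = c(8, 9)
)
grade_objects <- data.frame(
  grade_object_id = 100,
  org_unit_id = 10,
  name = "Quiz 1"
)

# Each step is a left join, so the rows of `users` drive the result.
report <- users |>
  left_join(enrollments, by = "user_id") |>
  left_join(grades, by = c("user_id", "org_unit_id")) |>
  left_join(grade_objects, by = c("grade_object_id", "org_unit_id"))

report
```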
BDS exports everything as CSV with PascalCase column names and string values. Without schema information, dates come through as character, booleans as “True”/“False” strings, and IDs as text. brightspaceR’s schema registry fixes this automatically.
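To illustrate the starting point, the base R sketch below reads a small inline CSV as text and converts the column names. The CSV content and the to_snake_case() helper are hypothetical and only approximate what a raw extract looks like.

```r
# Inline CSV standing in for a raw BDS extract (values are made up).
raw_csv <- "UserId,IsActive,SignupDate
101,True,2024-01-15T08:30:00Z
102,False,2024-02-01T12:00:00Z"

# Read every column as text, which is how values arrive without a schema.
raw <- read.csv(text = raw_csv, colClasses = "character")
sapply(raw, class)

# Hypothetical helper: insert "_" at lower-to-upper boundaries, then lowercase.
to_snake_case <- function(x) {
  tolower(gsub("([a-z0-9])([A-Z])", "\\1_\\2", x))
}

names(raw) <- to_snake_case(names(raw))
names(raw)
```

After renaming, the values are still character strings, so filtering on a boolean or sorting by date would not behave as expected until the types are converted.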
The package knows the column types for ~20 common BDS datasets:
bs_list_schemas()
#> [1] "users" "user_enrollments"
#> [3] "org_units" "org_unit_types"
#> [5] "grade_objects" "grade_results"
#> [7] "content_objects" "content_user_progress"
#> [9] "quiz_attempts" "quiz_user_answers"
#> [11] "discussion_posts" "discussion_topics"
#> [13] "assignment_submissions" "attendance_registers"
#> [15] "attendance_records" "role_details"
#> [17] "course_offerings" "final_grades"
#> [19] "enrollments_and_withdrawals" "organizational_unit_ancestors"
Each schema defines column types, date columns, boolean columns, and key columns (used for joining):
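The registry's internal format is not reproduced here, so the following is only a sketch of how a schema of this shape could drive type conversion. The list layout, the field names (col_types, date_cols, bool_cols, key_cols), and the apply_schema() helper are all hypothetical and are not part of brightspaceR's API.

```r
# Hypothetical schema entry; the real registry's structure may differ.
users_schema <- list(
  col_types = c(user_id = "integer", user_name = "character"),
  date_cols = "signup_date",
  bool_cols = "is_active",
  key_cols  = "user_id"
)

# Hypothetical helper: coerce an all-character data frame using a schema.
apply_schema <- function(df, schema) {
  for (col in names(schema$col_types)) {
    df[[col]] <- switch(
      schema$col_types[[col]],
      integer = as.integer(df[[col]]),
      double  = as.numeric(df[[col]]),
      df[[col]]  # any other type is left as character
    )
  }
  for (col in schema$date_cols) {
    df[[col]] <- as.POSIXct(df[[col]], format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")
  }
  for (col in schema$bool_cols) {
    df[[col]] <- df[[col]] == "True"
  }
  df
}

# Made-up raw rows, all stored as text.
raw <- data.frame(
  user_id = c("101", "102"),
  user_name = c("ana", "ben"),
  is_active = c("True", "False"),
  signup_date = c("2024-01-15T08:30:00Z", "2024-02-01T12:00:00Z")
)

typed <- apply_schema(raw, users_schema)
sapply(typed, function(col) class(col)[1])
```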
For datasets without a registered schema, brightspaceR applies intelligent type coercion, converting each column to double, logical, POSIXct, or character as appropriate. This means bs_get_dataset() returns usable tibbles even for datasets the package doesn’t know about.
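A fallback of this kind can be sketched in base R as follows. The guess_column() helper, the order of the checks, and the timestamp format are assumptions made for illustration; the package's actual rules may differ.

```r
# Hypothetical helper: guess a column type from its character values.
guess_column <- function(x) {
  x[!is.na(x) & x == ""] <- NA
  present <- x[!is.na(x)]
  if (length(present) == 0) return(x)

  # "True"/"False" strings become logical.
  if (all(present %in% c("True", "False"))) return(x == "True")

  # Values that all parse as numbers become double.
  num <- suppressWarnings(as.numeric(x))
  if (!anyNA(num[!is.na(x)])) return(num)

  # Values that all parse as ISO 8601 timestamps become POSIXct (UTC).
  dt <- as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")
  if (!anyNA(dt[!is.na(x)])) return(dt)

  # Anything else stays character.
  x
}

# Made-up raw columns, all stored as text; "" marks a missing value.
raw <- data.frame(
  score = c("7.5", "9", ""),
  is_graded = c("True", "False", "True"),
  last_modified = c("2024-03-01T10:00:00Z", "2024-03-02T11:30:00Z", ""),
  comment = c("ok", "late", "")
)

typed <- as.data.frame(lapply(raw, guess_column))
sapply(typed, function(col) class(col)[1])
```

The checks run from most to least specific, so a column only stays character when none of the stricter patterns fits every non-missing value.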
library(brightspaceR)
library(dplyr)

org_units <- bs_get_dataset("Org Units")
grades <- bs_get_dataset("Grade Results")
grade_objs <- bs_get_dataset("Grade Objects")
# Find the course
course <- org_units |> filter(grepl("STAT101", name))
# Build grade report: student count and mean score per grade item
grades |>
  filter(org_unit_id %in% course$org_unit_id) |>
  bs_join_grades_objects(grade_objs) |>
  group_by(name) |>
  summarise(
    n_students = n_distinct(user_id),
    mean_score = mean(points_numerator, na.rm = TRUE),
    .groups = "drop"
  )

# Summarise completion per content object
content <- bs_get_dataset("Content Objects")
progress <- bs_get_dataset("Content User Progress")
bs_join_content_progress(content, progress) |>
  group_by(title) |>
  summarise(
    n_users = n_distinct(user_id),
    n_completed = sum(!is.na(completed_date)),
    completion_rate = n_completed / n_users,
    .groups = "drop"
  ) |>
  arrange(desc(n_users))