-
Notifications
You must be signed in to change notification settings - Fork 0
Schema
Once the data has been extracted from the Moodle database, it needs to be stored in two separate databases. All of the actual data will be anonymized and stored in the "Course" database. The user database keeps the information needed to identify a student in the "Course" database. Since the data stored in the course database comes from sources other than the Moodle, the identifying data needs to be kept until all of the data from the various sources involved has been entered into the "Course" database. Once all of the data for a given student has been entered, the identifying data should be purged from the user database.
The database schema for has been designed with the following objectives:
- Complete referential integrity
- Fully normalized
- Self-documenting
- Independent of any applications used to operate on the database
- Extendible, so that changes can be made gracefully
The initial schema for the course database includes 40 tables. Twelve of these tables contain core elements and the remaining 28 tables contain activity data and log references. With a large number of tables a few naming conventions are required to make sure that the table and field names are readable, and to make sure that a user can find the relevant table in a listing. Since table and field names in relational databases are not case sensitive, the naming convention must be able to work will names that are lower case only.
Most field names should be a single word that describes the contents of the field. In cases where multiple words are required, the words must be separated by a single underscore. Field names should be chosen to be consistent with the names of similar fields in other (pre-existing) tables. Identifier fields should be named "id" and foreign key fields should be named "<table name>_id." In cases where the referenced table has a multi-word name, the <table name> component of the field name should be shortened to the last word of the name of the referenced table. A few examples are:
- An identifier field for a table: id
- A foreign key to the "activity" table: activity_id
- A foreign key to the "activity_source": table: source_id
- A foreign key to the "activity_Moodle_book" table: book_id
Table names should follow the same convention as field names: either one word describing the table, are multiple words separated by underscores. However, given the large number of tables, the names need to be chosen in such a way that they group related tables together. Since the tables in the database divide into four groups, each group name (activity, course, enrolment, and log) should be used to prefix the table name. For example:
- activity
- activity_source
Most of the tables in the database are the activity data tables. These tables should follow the following naming convention:
- activity_<source>_<activity name>
Where <source> of the source of the data (ie. Moodle), and <activity name> is the name of the activity. The value chosen for <source> must match the corresponding value in the activity_source table and the value chosen for <activity name> must match the corresponding entry in the activity table.
For activities which have a primary table and one or more child-tables, the primary table should follow the convention outlined above, which child-tables are named as follows:
- activity_<source>_<activity name>_<child-table name>
Where <source> and <activity name> are the same are for the primary table and <child-table name> identifies the contents of the table. A few examples are:
- A primary table: activity_Moodle_forum
- A child-table: activity_Moodle_forum_post
All of the activity child-tables will need a corresponding log reference table. It should be named as follows:
- log_<source>_<activity_name>_<child-table name>
Where <source>, <activity name>, and <child-table name> match the corresponding elements of the child-table. For example:
- log_Moodle_forum_post
Information that identifies a particular participant in a course is stored in the user database. In the Entity-Relationship diagram the entity sets which are stored in the user database are highlighted in red. To protect the privacy of the students enrolled in the course, the user identifying information is kept in a separate database, or a separate schema in the database with course data. The tables mapping the user-identifying information to the enrolment are designed to ensure that there is no dependency between the user information and the enrolment. Once all of the data for a particular course has been loaded into the course database, the user identifying information can be safely purged.
Identifying student information.
Field | Type | Description |
---|---|---|
id | Integer | Auto-incrementing Primary Key |
username | String | Student's Login ID |
first_name | String | Student's Given (First) name |
last_name | String | Student's surname |
Mappings between the identifying student information and the anonymized enrolment information in the "course" database. Enrolment ID is a pseudo Foreign Key since it is not possible to specify foreign key constraints across databases. However, the foreign key can exist in cases where the user data and course data exist in separate schema's within the same database.
Field | Type | Description |
---|---|---|
user_id | Integer | Foreign Key to the ID field of the user table |
enrolment_id | Integer | "Foreign Key" to the ID field of the enrolment table |
All of the non-user identifying data is stored in the course database.
The semester in which a course is offered.
Field | Type | Description |
---|---|---|
id | Integer | Auto-incrementing Primary Key |
name | String | Name of the Semester |
Note: This table represents an enumeration, with the values: FALL, WINTER and SPRING.
Information on the course offerings.
Field | Type | Description |
---|---|---|
id | Integer | Auto-incrementing Primary Key |
semester | Integer | Foreign key to the ID field of the course_semester table |
year | Integer | Year in which the course was offered |
name | String | Name of the course |
The list of sources for activity data.
Field | Type | Description |
---|---|---|
id | Integer | Auto-incrementing Primary Key |
name | String | The name of the source of the activity data |
All of the components that can make up a course.
Field | Type | Description |
---|---|---|
id | Integer | Auto-incrementing Primary Key |
source_id | Integer | Foreign key to the ID field of the activity_source table |
name | String | Name of the item (i.e. the corresponding Moodle module) |
Note: For data coming from Moodle the activity name must match the Moodle module name.
All of the components of a particular course.
Field | Type | Description |
---|---|---|
id | Integer | Auto-incrementing Primary Key |
course_id | Integer | Foreign Key to the ID field of the course table |
type_id | Integer | Foreign Key to the ID field of the activity_type table |
Activities for which there is no actual data. These are activity entries that show up in the log, but are not actual activities.
Field | Type | Description |
---|---|---|
id | Integer | Foreign Key to the ID field of the activity table |
The list of roles that a participant can have in a course.
Field | Type | Description |
---|---|---|
id | Integer | Auto-incrementing Primary Key |
name | String | Name of the Role (instructor, TA, Student, etc.) |
The list of participants in a course.
Field | Type | Description |
---|---|---|
id | Integer | Auto-incrementing Primary Key |
course_id | Integer | Foreign Key to the ID field of the course table |
role_id | Integer | Foreign Key to the ID field of the enrolment_role table |
grade | Integer | The student's final grade in the course |
usable | Boolean | Indicator if the student has given permission to use their data |
Note: The grade field is allowed to be null.
Grades recorded for individual activities completed by each student.
Field | Type | Description |
---|---|---|
id | Integer | Auto-incrementing Primary Key |
enrolment_id | Integer | Foreign Key to the ID field of the enrolment table |
activity_id | Integer | Foreign Key to the ID field of the activity table |
grade | Integer | The students grade for the item |
Note: The combination of the enrolment_id and activity_id fields is unique. Normally these two fields would be used to form a composite primary key, however composite primary keys cause problems with the object model, so an auto-incrementing key is used here instead.
The list of actions that can be performed, with the corresponding module.
Field | Type | Description |
---|---|---|
id | Integer | Auto-incrementing Primary Key |
name | String | Name of the action (corresponding to the Moodle log) |
The network from which a logged activity originated. Moodle stores the IP address for the originating event, which could be used to identify the student. This table store the name of the network owner, instead of the IP address.
Field | Type | Description |
---|---|---|
id | Integer | Auto-incrementing Primary Key |
name | String | The name of the organization that owns the network |
All actions recorded for a particular participant and an instance of an activity.
Field | Type | Description |
---|---|---|
id | Integer | Auto-incrementing Primary Key |
enrolment_id | Integer | Foreign Key to the ID field of the Enrolment table |
activity_id | Integer | Foreign Key to the ID field of the activity table |
action_id | Integer | Foreign Key to the ID field of the log_action table |
network_id | Integer | Foreign Key to the ID field of the log_network table |
time | Time | Time (including the date) at which the action was performed |
Data specific to a particular instance of a given activity within a course is stored in the activity data tables. In the Entity-Relationship diagram the activity data tables are represented in blue. The individual tables, by default, all have the same structure: A foreign key to the activity table and a text field for the name of the entry. Since the activity data table is dependent on the activity, and has a one-to-one relationship with the activity, the foreign key to the activity instance table serves as the primary key for the activity data tables. For simplicity, template tables are used to represent the activity data and sub-activity data tables in the Entity-Relationship diagram.
Some of the activities will have sub-activity data associated with the activity data. For the activities that have sub-activities, the activity data tables are the same as for activities that do not have sub-activities. Separate tables which are dependant on the activity data tables are used to store the data for the sub-activities, as shown by the dashed blue entity sets in the Entity-Relationship diagram. The sub-activity tables are structured like the activity data tables. They have a foreign key to the activity relevant activity data table, and a text field to store the name of the entry, but they also have an id field that acts as a discriminator since the sub-activity tables have a many-to-one relationship with the activity data tables.
By default, the activity data tables and the sub-activity data tables only store the name of the activity. The name of the activity is the minimum amount for activity data which is required for the user to be able to discriminate between multiple instances of a given activity in the same course. It is expected that additional activities will be added to the database, and some of the activity data tables will eventually be expanded to include additional information. Storing the data for each activity or sub-activity in its own table allows for any of the activity data tables to be expanded to contain additional information without affecting the rest of the database.
Data from the Moodle assign module.
Field | Type | Description |
---|---|---|
id | Integer | Foreign Key to the ID field of the activity table (Primary Key) |
name | String | Name of the Assignment |
Data from the Moodle book module.
Field | Type | Description |
---|---|---|
id | Integer | Foreign Key to the ID field of the activity table (Primary Key) |
name | String | Name of the Book |
Data for chapters in the Moodle book module.
Field | Type | Description |
---|---|---|
id | Integer | Auto-incrementing Discriminator (Primary Key) |
book_id | Integer | Foreign Key to the ID field of the activity_moodle_book table |
name | String | Title of the chapter |
Note: To simplify the implementation of the sub-activity data tables the ID field is specified as the primary key, rather than as discriminator coupled with the instance ID to form a composite primary key. Since PostgreSQL (and some other databases) have facilities to generate table unique primary keys, it is simpler to use that facility for the ID field rather than using ID field as a true discriminator (which would involve keeping a new count for each instance).
Relationship between the activity_moodle_book and the log entry. A many-to-one relationship exists between the log and the sub-activity, where a sub-activity instance could be referenced by multiple log entries. Since the sub-activities are spread across many different tables (and many log entries will not reference any sub-activities), the relationship is implemented as a series of relationship tables with one relationship table per sub-activity.
Field | Type | Description |
---|---|---|
log_id | Integer | Foreign Key to the ID field of the log table |
chapter_id | Integer | Foreign Key to the ID field of the activity_moodle_book_chapter table |