-
Notifications
You must be signed in to change notification settings - Fork 24
Database Schema
compressed_files: This table contains a register for each archive file that has been downloaded, or tried to download.
| Name | Data Type | Description | Key |
|---|---|---|---|
| url | VARCHAR (255) | URL of the file | PK |
| mailing_list_url | VARCHAR (255) | URL of the web archives of the mailing list where this file belongs to | FK |
| status | ENUM | Either visited, new or failed | |
| last_analysis | DATETIME | Date and time of the last analysis of this time |
mailing_lists: This table contains a register for each different mailing list analysed.
| Name | Data Type | Description | Key |
|---|---|---|---|
| mailing_list_url | VARCHAR(255) | URL of the archives web page | PK |
| mailing_list_name | VARCHAR(255) | Name of the mailing list, as it appears in the headers of the messages | |
| project_name | VARCHAR(255) | Name of the software project were this list belongs to. Taken from the email address of the mailing list | |
| last_analysis | DATETIME | Date and time of the last analysis performed on this mailing list |
mailing_lists_people: This table joins the table mailing lists and people, making possible to search for people grouping by different mailing lists.
| Name | Data Type | Description | Key |
|---|---|---|---|
| email_address | VARCHAR(255) | People unique identifier by email address | PK |
| mailing_list_url | VARCHAR(255) | URL of the mailing list archives web page | PK |
messages: This table contains a register for each message in the mailing list archives. It contains all the information in the headers plus the message itself.
| Name | Data Type | Description | Key |
|---|---|---|---|
| message_id | VARCHAR(255) | Unique identifier assigned by the mailing list manager | PK |
| mailing_list_url | VARCHAR(255) | URL of the archives web page of the mailing list | PK |
| mailing_list | VARCHAR(255) | Name and address of the mailing list | |
| first_date | DATETIME | Local date written in the message by the original sender | |
| first_date_tz | DECIMAL(11,0) | Time zone of the above date | |
| arrival_date | DATETIME | Local time of the server that received the message | |
| arrival_date_tz | DECIMAL(11,0) | Time zone of the above date | |
| subject | VARCHAR(1024) | Subject of the message | |
| message_body | MEDIUMTEXT | Main text of the message | |
| mail_path | TEXT | Mail path | |
| is_response_of | VARCHAR(255) | If this message is a reply of another, this is the id of the original message | FK |
messages_people: This is a table establish the relationship between email addresses and messages.
| Name | Data Type | Description | Key |
|---|---|---|---|
| message_id | VARCHAR(255) | Id of the message where that person appears | PK |
| email_address | VARCHAR(255) | People unique identifier by email address | PK |
| type_of_recipient | ENUM | Either From, To, or Cc | PK |
| mailing_list_url | VARCHAR(255) | URL of the mailing list archives web page | PK |
people: This table contains a register for each one of the people who has written a message to the mailing list, or at least appears as destination in a message that has been sent to the mailing list.
| Name | Data Type | Description | Key |
|---|---|---|---|
| email_address | VARCHAR(255) | People unique identifier by email address | PK |
| name | VARCHAR(255) | Name (if appears in the header) | |
| username | VARCHAR(255) | The first part (before the @) of the email address | |
| domain_name | VARCHAR(255) | The second part (after the @) of the email address | |
| top_level_domain | VARCHAR(255) | Top level domain of the email address (.com, .org, .es, etc) |
This schema is very out of date, but this is the best we have so far. This figure was obtained from the deliverable 3.1 in the FLOSSMetrics project.
