Automatic group construction#51

Open
makkos-lilly wants to merge 5 commits into integrate_database from automatic_group_construction

@makkos-lilly
Collaborator

In this sub-branch of integrate_database, the dashboard pulls users and their information directly from a group definition written in group_definition.csv rather than from group_users. This allows groups to be defined directly within the dashboard files, and users to be loaded from the database automatically, instead of being sourced elsewhere.

A group can be defined by any combination of hashtags, comments, a bounding box, start and end dates, and a top filter. The top filter applies only to users; for example, it can return the 10 most active users. When data_retrieval.R is run, the group definition is used to populate group_users.csv with all users matching it. The rest of the data retrieval and dashboard rendering proceeds as normal within the integrate_database branch.
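The filter combination described above can be sketched roughly as follows. This is an illustration only, not the code in data_retrieval.R: the `GroupDefinition` fields, the changeset dictionary keys (`user`, `hashtags`, `comment`, `lon`, `lat`, `created_at`), and the function names are all assumptions made for the example, and the actual group_definition.csv columns may differ.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class GroupDefinition:
    # Hypothetical fields; an unset field places no constraint on the group.
    hashtag: Optional[str] = None
    comment: Optional[str] = None
    bbox: Optional[tuple] = None  # (min_lon, min_lat, max_lon, max_lat)
    start_date: Optional[date] = None
    end_date: Optional[date] = None
    top: Optional[int] = None  # keep only the N most active users


def matches(changeset: dict, g: GroupDefinition) -> bool:
    """Return True if the changeset satisfies every filter that is set."""
    if g.hashtag is not None and g.hashtag not in changeset["hashtags"]:
        return False
    if g.comment is not None and g.comment.lower() not in changeset["comment"].lower():
        return False
    if g.bbox is not None:
        min_lon, min_lat, max_lon, max_lat = g.bbox
        inside = (min_lon <= changeset["lon"] <= max_lon
                  and min_lat <= changeset["lat"] <= max_lat)
        if not inside:
            return False
    if g.start_date is not None and changeset["created_at"] < g.start_date:
        return False
    if g.end_date is not None and changeset["created_at"] > g.end_date:
        return False
    return True


def group_users(changesets: list, g: GroupDefinition) -> list:
    """Users with at least one matching changeset, most active first."""
    counts = Counter(c["user"] for c in changesets if matches(c, g))
    ranked = [user for user, _ in counts.most_common()]
    # The top filter applies to users only, after all other filters.
    return ranked if g.top is None else ranked[: g.top]
```

In this sketch the top filter is applied last, so "top 10" means the 10 most active users among those already matching the other filters.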

Groups are defined in group_definition.csv. Users are then retrieved for that definition, and the dashboard populates information about these users as usual. To support this, a duplicate data retrieval was removed from dashboard.qmd, group_definition.csv was added, and an empty group_users.csv, to be populated automatically, is now written to the dashboard directory after the dashboard is created. data_retrieval.R was modified to ignore all stale users (deleted accounts and changed usernames) and not throw errors on them when populating group_users.csv. README.md has been updated to reflect these changes.
Graceful failure if no users are found for a group: if the result is empty, the header is still written to the new group_users.csv file.
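The empty-group behaviour can be illustrated with a short sketch. It is not the implementation in data_retrieval.R; the function name `write_group_users` and the column names `user_id` and `username` are hypothetical. The point is only that the header is written before any rows, so a group with no users still yields a valid, readable CSV.

```python
import csv
import io


def write_group_users(users, out, fieldnames=("user_id", "username")):
    """Write users as CSV rows; the header is written even when users is empty."""
    writer = csv.DictWriter(out, fieldnames=list(fieldnames))
    writer.writeheader()    # always emitted, so downstream readers see the columns
    writer.writerows(users)  # writes nothing for an empty list
    return len(users)


# Usage: an empty group produces a header-only file instead of an error.
buffer = io.StringIO()
written = write_group_users([], buffer)
```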
@makkos-lilly force-pushed the automatic_group_construction branch from 5e7c59a to 4ee51ec on October 16, 2025 12:02
With db_overlay:true, the dashboard now contains all the graphs it would show without the overlay.
To run this code, you must create a Python environment called .venv.
Worked on sampling from the APIs intelligently, based on each user's percentage contribution to the group's overall changesets, and fixed some errors. This code only works with an appropriate Python environment.
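One way to read "sampling based on percentage contributions" is a proportional allocation of a fixed request budget across users. The sketch below makes that assumption; the PR does not state the actual method, and `allocate_samples`, `changeset_counts`, and `budget` are hypothetical names. It uses the largest-remainder rule so that the per-user quotas add up exactly to the budget.

```python
def allocate_samples(changeset_counts: dict, budget: int) -> dict:
    """Split `budget` samples across users in proportion to their changeset counts."""
    total = sum(changeset_counts.values())
    if total == 0 or budget <= 0:
        return {user: 0 for user in changeset_counts}

    # Integer floor of each user's proportional share, plus the remainder
    # that was lost to rounding down (kept as an integer to avoid float error).
    alloc = {u: budget * n // total for u, n in changeset_counts.items()}
    remainder = {u: budget * n % total for u, n in changeset_counts.items()}

    # Hand the leftover samples to the users with the largest remainders.
    leftover = budget - sum(alloc.values())
    for user in sorted(remainder, key=remainder.get, reverse=True)[:leftover]:
        alloc[user] += 1
    return alloc
```

Under this scheme a user responsible for 60% of a group's changesets receives about 60% of the samples, while very small contributors may receive none when the budget is tight.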
Requirements.in added
Tidied up comments
Debugged main files

2 participants