Biomechanics Literature Update

This repository has been migrated to https://github.com/alcantarar/BiomchBERT, where it is being loosely maintained.

Ryan Alcantara & Gary Bruening

We use Machine Learning to predict the general topic of a biomechanics-related paper given its title. To accomplish this, we:

Developed an HTML web scraper to extract the paper information and assigned paper topic from every Biomch-L Literature Update since 2010. (webscraper.py)
Trained and compared multiple classification Machine Learning algorithms (keras_1.py & test_many_ML_algorithms_nn.ipynb)
Created a python script (literature_search.ipynb) that:
1. Searches PubMed for Biomechanics-related papers published in the past week,
2. Uses the top-performing Machine Learning model (keras-1, a Deep Neural Network with 73.5% accuracy) to predict the paper topic for the week’s papers,
3. Compiles the papers, formats their citations, and organizes them by topic, saving the result to a .md file in Literature_Updates.
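Step 3 above can be sketched roughly as follows. This is a minimal illustration, not the code in literature_search.ipynb: the dictionary keys (`topic`, `authors`, `title`, `journal`, `year`), the citation layout, and the function names are all assumptions made for the example.

```python
from collections import defaultdict


def format_citation(paper):
    """Format one paper as a Markdown bullet (hypothetical citation layout)."""
    return "- {authors}. {title}. *{journal}*. {year}.".format(**paper)


def compile_update(papers):
    """Group papers by predicted topic and return the update as Markdown text."""
    by_topic = defaultdict(list)
    for paper in papers:
        by_topic[paper["topic"]].append(paper)

    lines = []
    for topic in sorted(by_topic):  # alphabetical topic order, an assumption
        lines.append(f"## {topic}")
        lines.extend(format_citation(p) for p in by_topic[topic])
        lines.append("")  # blank line between topic sections
    return "\n".join(lines)
```

The returned string could then be written to a file in Literature_Updates with an ordinary `open(path, "w")` call.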

Files

Assets

A neato gif.

Construct_Models

Contains the files used to construct the models. The two main files are keras_1.py and test_many_ML_algorithms_nn.ipynb.

keras_1.py - Fits a deep neural network to data contained in Data. Saves the models into models. The vectorizer and label encoders are saved here as well.
test_many_ML_algorithms_nn.ipynb - Fits multiple machine learning methods to the Data. Includes Multinomial Naive Bayes, Logistic Regression, Stochastic Gradient Descent (SGD), Linear Support Vector Classification, and Multi-layer Perceptron Classifier. Saves the fitted models into Models. The vectorizer and label encoders are saved here as well.
keras_eval.py - A small script to evaluate the keras neural network on test strings.

Data

Where the webscraped data is stored.

RYANDATA.csv - The full csv file including paper number, Category/Topic, Authors, Title, Journal, Year, Volume and Issue, DOI, and Abstract. Named this way because Gary just thought he would hand the data off and not get really really caught up in this. Boy, was he wrong.
RYANDATA_filt.csv - Has all the same headers as RYANDATA.csv, but filters out topics that represent less than 5% of the total papers.
RYANDATA_filt_even.csv - An evenly downsampled (by topic) csv of RYANDATA_filt.csv. Each topic has the same number of representations in this csv.
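The two derived files could be produced along these lines. This is a sketch under stated assumptions: rows are plain dictionaries with a `topic` key (the actual CSV header is Category/Topic), the function names are hypothetical, and the random seed is arbitrary. It is not the code the authors used.

```python
import random
from collections import Counter, defaultdict


def filter_rare_topics(rows, min_share=0.05):
    """Keep only rows whose topic makes up at least min_share of all rows."""
    counts = Counter(row["topic"] for row in rows)
    total = len(rows)
    return [row for row in rows if counts[row["topic"]] / total >= min_share]


def downsample_evenly(rows, seed=0):
    """Randomly sample every topic down to the size of the smallest topic."""
    if not rows:
        return []
    by_topic = defaultdict(list)
    for row in rows:
        by_topic[row["topic"]].append(row)

    n = min(len(group) for group in by_topic.values())
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sampled = []
    for topic in sorted(by_topic):
        sampled.extend(rng.sample(by_topic[topic], n))
    return sampled
```

Applying `filter_rare_topics` and then `downsample_evenly` mirrors the RYANDATA.csv to RYANDATA_filt.csv to RYANDATA_filt_even.csv progression described above.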

Literature_Updates

Where weekly updates can be stored in markdown & csv format for publishing.

Models

Where all the model files are saved after being created.

Keras_model - Location of all the Keras Neural Net files. Some neural net files are too large to upload to Git on their own, so they are split. Using 7-zip (Windows) or Keka (MacOS), you can recombine these files to create the model file and weights file.
Many_ML_models - Where the files from testing the many ML models are saved. The mpl file will need to be recombined using 7-zip/Keka, as with the Keras Neural Net files.

Plots

Model validation plots are saved here. Usually a confusion matrix.
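For readers unfamiliar with the plot, the counts behind a confusion matrix can be tallied as below. This is only an illustrative sketch with hypothetical function names and topic labels; the repository's plotting code is not shown here and may differ.

```python
from collections import Counter


def confusion_counts(true_topics, predicted_topics):
    """Count how often each (true topic, predicted topic) pair occurs."""
    if len(true_topics) != len(predicted_topics):
        raise ValueError("true and predicted label lists must be the same length")
    return Counter(zip(true_topics, predicted_topics))


def accuracy(true_topics, predicted_topics):
    """Fraction of papers whose predicted topic matches the true topic."""
    counts = confusion_counts(true_topics, predicted_topics)
    correct = sum(n for (true, pred), n in counts.items() if true == pred)
    return correct / len(true_topics)
```

Each key in the returned counter corresponds to one cell of the matrix: the diagonal cells (true topic equals predicted topic) are the correct predictions.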

Webscraper

The python file to scrape the Biomch-L forum.

`literature_search.ipynb`

IPython notebook that generates the literature update. Uses Biopython v1.73 to perform a literature search, then a given ML model to classify the papers. Saves the results to a markdown file in Literature_Updates.
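A PubMed search for the past week's papers might look something like the sketch below. The search term, the date-window construction, and both function names are assumptions for illustration; the notebook's actual query is not reproduced here. `Entrez.esearch` and `Entrez.read` are part of Biopython's `Bio.Entrez` module, and NCBI asks callers to supply a contact email.

```python
from datetime import date, timedelta


def build_query(term="biomechanics", days=7, today=None):
    """Build a PubMed query limited to papers published in the last `days` days."""
    today = today or date.today()
    start = today - timedelta(days=days)
    # [dp] is PubMed's publication-date field tag; start:end gives a date range.
    window = f"{start:%Y/%m/%d}:{today:%Y/%m/%d}[dp]"
    return f"({term}) AND ({window})"


def search_pubmed(query, email, retmax=100):
    """Return the list of PubMed IDs matching the query (needs network access)."""
    from Bio import Entrez  # Biopython, imported lazily so build_query works without it

    Entrez.email = email
    handle = Entrez.esearch(db="pubmed", term=query, retmax=retmax)
    try:
        record = Entrez.read(handle)
    finally:
        handle.close()
    return record["IdList"]
```

The returned IDs would then be passed to a fetch step to retrieve titles before classification.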

Unique Packages

BeautifulSoup is used to scrape the web for the articles to feed into the ML models.
Keras and Scikit-learn are used to construct ML models.
Biopython is used to access PubMed. Requires version 1.73 or newer.
