Training a Model to Predict News Popularity

As an exercise in hands-on machine learning, I trained, tuned, and tested a model using a publicly available dataset describing roughly 40,000 online articles.

Overall, I sought to identify which attributes of online news articles lead to higher rates of sharing.


The study was meant to serve as a basis for automated content organization. In addition, I considered the potential societal implications of the attributes most highly correlated with "virality", such as tone and length.
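
As a rough illustration of what "most highly correlated" means here, the sketch below ranks article attributes by the strength of their Pearson correlation with share counts. It is not the code from the paper: the helper names (pearson, rank_attributes), the attribute names (word_count, tone_score), and every value in the toy data are made up for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column carries no linear signal; report 0 by convention.
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_attributes(columns, target):
    """Return (name, correlation) pairs, strongest absolute correlation first."""
    scores = {name: pearson(values, target) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical toy data, NOT taken from the real dataset.
toy_columns = {
    "word_count": [200, 400, 600, 800, 1000],
    "tone_score": [0.9, 0.1, 0.5, 0.3, 0.7],
}
toy_shares = [100, 200, 300, 400, 500]

ranking = rank_attributes(toy_columns, toy_shares)
for name, r in ranking:
    print(f"{name}: r = {r:+.2f}")
```

Correlation of this kind only measures linear association; it does not by itself show that an attribute causes an article to be shared more.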

The dataset is available for download here, and the paper I submitted as part of my machine learning studies at Carnegie Mellon University can be found below.