https://doi.org/10.1140/epjds/s13688-019-0208-6
Regular article
Success in books: predicting book sales before publication
1 Center for Complex Network Research and Department of Physics, Northeastern University, Boston, USA
2 College of Computer and Information Science, Northeastern University, Boston, USA
3 Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, USA
4 Center for Network Science, Central European University, Budapest, Hungary
* e-mail: barabasi@gmail.com
Received: 21 February 2019
Accepted: 10 September 2019
Published online: 17 October 2019
Reading remains a preferred leisure activity, fueling an exceptionally competitive publishing market: among more than three million books published each year, only a tiny fraction are read widely. It is largely unpredictable, however, which books those will be and how many copies they will sell. Here we aim to unveil the features that affect the success of books by predicting a book’s sales prior to its publication. We do so by employing the Learning to Place machine learning approach, which predicts sales for both fiction and nonfiction books and explains its predictions by comparing and contrasting each book with similar ones. We analyze the features contributing to a book’s success through feature importance analysis, finding that a strong driving factor of book sales across all genres is the publishing house. We also uncover differences between genres: for thrillers and mysteries, the publishing history of an author (as measured by previous book sales) is highly important, while in literary fiction and religion, the author’s visibility plays a more central role. These observations provide insights into the driving forces behind success within the current publishing industry, as well as into how individuals choose what books to read.
Key words: Success / Books / Learning to place
© The Author(s), 2019