The new Big Data technological innovation has the potential to usher in an information-based scientific revolution in a number of industries and human endeavors. Like all scientific revolutions, however, it will take some time to reach all sectors of business and society, especially where markets are relatively heterogeneous and imperfect, as in the rare book trade. The unique attributes of rare books, and the variations from one copy to the next, present a domain challenge: the market is difficult to analyze holistically, and projections on market trends are difficult to derive. Machines are well suited for pattern mining using complex analytical models, but with rare books the patterns are more difficult to identify than in the comparatively homogeneous new book marketplace.
The reality is that tasks machines have yet to revolutionize are still done very well by humans. One example is the visual comparison of two rare books having the same imprint in order to identify anomalies. Another is paging through a particular association copy to understand its sentimental value and meaning in order to estimate its worth. Data scientists, without the assistance of big data technologies, can also better recognize and analyze adjacency and symmetry in data patterns. For example, the simple line graph below plots the actual top-priced rare book sales made by Abebooks each month during the last four years.
A quick visual inspection yields the following observations and projections:
- Sale prices for rare book high-spots sold on-line by Abebooks have generally been on an upward trend since early 2009.
- July is usually the month with the least expensive top sale of the year.
- November is usually the month with the most expensive top sale of the year.
- Sales of high-spots on-line, worldwide, have picked up pace, particularly during the last two years.
- High-spots are more likely to sell in the month of November, and least likely to sell during the month of July.
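The observations listed above can also be derived mechanically from the same monthly series. The following is a minimal sketch, not the method used for the chart: it assumes the data is available as a mapping from (year, month) to the highest sale price of that month, and the function name `summarize_monthly_highs` is hypothetical. The sample series is synthetic placeholder data built only to exercise the code; it is not Abebooks sales data.

```python
from collections import Counter
from statistics import mean


def summarize_monthly_highs(highs):
    """Summarize a series of monthly top sale prices.

    highs: dict mapping (year, month) -> highest sale price in that month.
    """
    # Order the observations chronologically and fit a least-squares line.
    keys = sorted(highs)
    xs = list(range(len(keys)))
    ys = [highs[k] for k in keys]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        / sum((x - x_bar) ** 2 for x in xs)
    )

    # For each year, record which month had the highest and lowest top sale.
    peak_months, trough_months = Counter(), Counter()
    for year in sorted({y for y, _ in keys}):
        by_month = {m: p for (y, m), p in highs.items() if y == year}
        peak_months[max(by_month, key=by_month.get)] += 1
        trough_months[min(by_month, key=by_month.get)] += 1

    return {
        "slope_per_month": slope,  # positive means an upward trend
        "usual_peak_month": peak_months.most_common(1)[0][0],
        "usual_trough_month": trough_months.most_common(1)[0][0],
    }


# Synthetic placeholder series (NOT real sales figures): a steady rise,
# with a bump every November and a dip every July, over four years.
synthetic = {
    (year, month): 1000 + 50 * (12 * year + month)
    + (2000 if month == 11 else 0)
    - (500 if month == 7 else 0)
    for year in range(4)
    for month in range(1, 13)
}
summary = summarize_monthly_highs(synthetic)
```

With only about 50 values of a single measure, a summary like this can do no more than restate what the eye already sees in the chart.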
The conclusions of this visual analysis, based on some 50 data values of a single measure, seem quite predictable. They are not far from observations made by other approaches that were dominated by gut feeling and intuition rather than by actual data. There is no need for a chart to suggest that prices of rare books are rising in general, or that sales are higher in the month before the holidays and slower while people are on vacation. A common-sense opinion or a hunch would probably produce the same result.
Big data, on the other hand, despite all the obstacles, is far more powerful than the analytics used in the past. Factors that are less obvious, hidden from common sense and even from common statistical analysis methods, are uncovered through advances in big data technologies that allow massive amounts of data to be processed. There is no doubt that in an imperfect market such as the rare book marketplace, the task is a matter not just of better technologies but also of semantics. Rare book data of interest is scattered across multiple e-commerce sites, auction houses, small and large bookstores, dealers at book fairs and so forth. On the plus side, time is working to consolidate the industry and generate the big data to analyze. Even during the recent economic crisis, the rare book marketplace has become more globally interconnected, with bookstores closing, book fairs conducting fewer transactions and on-line marketplaces expanding.
Another task into which the Rare Book Sale Monitor (RBSM) project is putting a great deal of energy is the construction of rules that formalize human knowledge, in order to bring the maximum level of homogeneity to the heterogeneous rare book trade. This effort allows for automated pattern matching, so that prices are compared like for like rather than “apples to oranges.”
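To illustrate what such rules might look like in practice, here is a minimal sketch under stated assumptions. It is not the RBSM rule set: the `Listing` fields, the `CONDITION_RULES` vocabulary and the `comparison_key` function are all hypothetical, chosen only to show how free-text dealer descriptions could be reduced to a canonical key so that only like-for-like copies are priced against each other.

```python
from dataclasses import dataclass

# Hypothetical rule table: free-text condition terms -> canonical grade.
CONDITION_RULES = {
    "fine": "fine", "f": "fine",
    "near fine": "near fine", "nf": "near fine",
    "very good": "very good", "vg": "very good",
    "good": "good", "g": "good",
}

# Hypothetical rule: spellings that all denote a first edition.
FIRST_EDITION_TERMS = {"first", "1st", "first edition", "1st edition"}


@dataclass(frozen=True)
class Listing:
    title: str
    imprint_year: int
    edition: str
    condition: str
    signed: bool
    price: float


def comparison_key(listing):
    """Reduce a listing to the attributes that make two copies comparable."""
    condition = CONDITION_RULES.get(listing.condition.strip().lower(), "unknown")
    edition_text = listing.edition.strip().lower()
    edition = "first" if edition_text in FIRST_EDITION_TERMS else "later"
    return (
        listing.title.strip().lower(),
        listing.imprint_year,
        edition,
        condition,
        listing.signed,
    )


def group_comparable(listings):
    """Group prices by comparison key; each group holds like-for-like copies."""
    groups = {}
    for listing in listings:
        groups.setdefault(comparison_key(listing), []).append(listing.price)
    return groups


# Made-up listings for illustration only.
listings = [
    Listing("Example Title", 1925, "1st", "VG", False, 900.0),
    Listing("example title ", 1925, "First Edition", "Very Good", False, 1100.0),
    Listing("Example Title", 1925, "First", "vg", True, 4000.0),
]
groups = group_comparable(listings)
```

In this sketch the first two listings land in the same group despite their different wording, while the signed copy is kept apart, since comparing it against unsigned copies would be an "apples to oranges" comparison.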
Data-driven decision making is poised to revolutionize the book trade. As buyers and sellers concentrate the bulk of their activity in the big on-line marketplaces, which represent some form of a more “perfect market,” datafication is also bound to grow. The ability to compare what is available for sale, or what is being put up for sale, using existing market availability data from worldwide sources is constantly improving. The massive amounts of data being generated are used to measure and manage strategies that rely on smarter decisions and accurate predictions. Man and machine are working together to explore and discover the vast areas of opportunity that the rare book marketplace has yet to reveal. Being able to predict, among other projections, that Abebooks will conduct its most expensive sales in late spring, just as it did last year, may not be too far off.