Yahoo Labs Machine Learning Dataset: Info From 20M Users Now Available to Researchers

A billboard for the technology company Yahoo is seen August 5, 2015 in Washington, D.C. (Karen Bleier/AFP/Getty Images)
By Clyde Hughes
Friday, 15 Jan 2016 11:00 AM

Yahoo revealed Thursday that it is making its Yahoo Labs machine learning dataset available for academic research through the lab's Webscope program, TechCrunch reported.

The dataset is about 13.5 terabytes and holds the interactions of roughly 20 million users from February 2015 through May 2015, including activity on Yahoo's homepage as well as Yahoo News, Yahoo Sports, Yahoo Finance, and Yahoo Real Estate, according to TechCrunch.

"Data is the lifeblood of research in machine learning," Suju Rajan, director of personalization science at Yahoo Labs, said in a statement. "However, access to truly large-scale datasets is a privilege that has been traditionally reserved for machine learning researchers and data scientists working at large companies, and out of reach for most academic researchers."

The dataset also contains demographic information such as age range, gender, and generalized geographic data. Each item in the dataset includes the title, summary, and key phrases of the relevant news article, along with local timestamps and some device information, TechCrunch noted.
