Hadoop history
Hadoop grew out of the Google File System paper,[11] published in October 2003. This paper led to another research paper from Google, MapReduce: Simplified Data Processing on Large Clusters.[12] Development started within the Apache Nutch project but moved to the new Hadoop subproject in January 2006.[13] Doug Cutting, who was working at Yahoo! at the time,[14] named it after his son's toy elephant.[15] The initial code factored out of Nutch consisted of about 5,000 lines of code for HDFS and about 6,000 lines of code for MapReduce.
The first committer added to the Hadoop project was Owen O'Malley, in March 2006.[16] Hadoop 0.1.0 was released in April 2006,[17] and the software continues to evolve through the work of the many contributors[18] to the Apache Hadoop project.
Timeline
| Year | Month | Event | Ref. |
|------|-------|-------|------|
| 2003 | October | Google File System paper released | [19] |
| 2004 | December | MapReduce: Simplified Data Processing on Large Clusters | [20] |
| 2006 | January | Hadoop subproject created with mailing lists, JIRA, and wiki | [21] |
| 2006 | January | Hadoop is born from Nutch 197 | [22] |
| 2006 | February | NDFS + MapReduce moved out of Apache Nutch to create Hadoop | [23] |
| 2006 | February | Owen O'Malley's first patch goes into Hadoop | [24] |
| 2006 | February | Hadoop is named after Cutting's son's yellow plush toy | [25] |
| 2006 | April | Hadoop 0.1.0 released | [26] |
| 2006 | April | Hadoop sorts 1.8 TB on 188 nodes in 47.9 hours | [23] |
| 2006 | May | Yahoo deploys 300-machine Hadoop cluster | [23] |
| 2006 | October | Yahoo Hadoop cluster reaches 600 machines | [23] |
| 2007 | April | Yahoo runs two clusters of 1,000 machines | [23] |
| 2007 | June | Only three companies on "Powered by Hadoop" page | [27] |
| 2007 | October | First release of Hadoop that includes HBase | [28] |
| 2007 | October | Yahoo Labs creates Pig and donates it to the ASF | [29] |
| 2008 | January | YARN JIRA opened | YARN JIRA (MAPREDUCE-279) |
| 2008 | January | 20 companies on "Powered by Hadoop" page | [27] |
| 2008 | February | Yahoo moves its web index onto Hadoop | [30] |
| 2008 | February | Yahoo! production search index generated by a 10,000-core Hadoop cluster | [23] |
| 2008 | March | First Hadoop Summit | [31] |
| 2008 | April | Hadoop sets the world record as the fastest system to sort a terabyte of data: running on a 910-node cluster, it sorted one terabyte in 209 seconds | [23] |
| 2008 | May | Hadoop wins TeraByte Sort (World Record sortbenchmark.org) | [32] |
| 2008 | July | Hadoop wins Terabyte Sort Benchmark | [33] |
| 2008 | October | Loading 10 TB/day in Yahoo clusters | [23] |
| 2008 | October | Cloudera, a Hadoop distributor, is founded | [34] |
| 2008 | November | Google MapReduce implementation sorted one terabyte in 68 seconds | [23] |
| 2009 | March | Yahoo runs 17 clusters with 24,000 machines | [23] |
| 2009 | April | Hadoop sorts a petabyte | [35] |
| 2009 | May | Yahoo! used Hadoop to sort one terabyte in 62 seconds | [23] |
| 2009 | June | Second Hadoop Summit | [36] |
| 2009 | July | Hadoop Core is renamed Hadoop Common | [37] |
| 2009 | July | MapR, a Hadoop distributor, is founded | [38] |
| 2009 | July | HDFS now a separate subproject | [37] |
| 2009 | July | MapReduce now a separate subproject | [37] |
| 2010 | January | Kerberos support added to Hadoop | [39] |
| 2010 | May | Apache HBase graduates | [40] |
| 2010 | June | Third Hadoop Summit | [41] |
| 2010 | June | Yahoo 4,000 nodes/70 petabytes | [42] |
| 2010 | June | Facebook 2,300 clusters/40 petabytes | [42] |
| 2010 | September | Apache Hive graduates | [43] |
| 2010 | September | Apache Pig graduates | [44] |
| 2011 | January | Apache ZooKeeper graduates | [45] |
| 2011 | January | Facebook, LinkedIn, eBay and IBM collectively contribute 200,000 lines of code | [46] |
| 2011 | March | Apache Hadoop takes top prize at Media Guardian Innovation Awards | [47] |
| 2011 | June | Rob Bearden and Eric Baldeschwieler spin Hortonworks out of Yahoo | [48] |
| 2011 | June | Yahoo has 42K Hadoop nodes and hundreds of petabytes of storage | [48] |
| 2011 | June | Third Annual Hadoop Summit (1,700 attendees) | [49] |
| 2011 | October | Debate over which company had contributed more to Hadoop | [46] |
| 2012 | January | Hadoop community moves to separate from MapReduce and replace it with YARN | [25] |
| 2012 | June | San Jose Hadoop Summit (2,100 attendees) | [50] |
| 2012 | November | Apache Hadoop 1.0 available | [37] |
| 2013 | March | Hadoop Summit – Amsterdam (500 attendees) | [51] |
| 2013 | March | YARN deployed in production at Yahoo | [52] |
| 2013 | June | San Jose Hadoop Summit (2,700 attendees) | [53] |
| 2013 | October | Apache Hadoop 2.2 available | [37] |
| 2014 | February | Apache Hadoop 2.3 available | [37] |
| 2014 | February | Apache Spark becomes a top-level Apache project | [54] |
| 2014 | April | Hadoop Summit Amsterdam (750 attendees) | [55] |
| 2014 | June | Apache Hadoop 2.4 available | [37] |
| 2014 | June | San Jose Hadoop Summit (3,200 attendees) | [56] |
| 2014 | August | Apache Hadoop 2.5 available | [37] |
| 2014 | November | Apache Hadoop 2.6 available | [37] |
| 2015 | April | Hadoop Summit Europe | [57] |
| 2015 | June | Apache Hadoop 2.7 available | [37] |