Security in Hadoop
Data is growing at an increasing rate, and storing and processing it is a problem that present and future generations will have to deal with. Hadoop, Apache's open source implementation of Google's MapReduce, can scale both storage space and processing power almost indefinitely to handle large datasets. It achieves this by distributing the data across its nodes and then distributing the work to those nodes. The data is processed in manageable chunks by 'mappers', and the intermediate results are then aggregated and further processed by 'reducers'.
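The mapper and reducer roles can be illustrated with a small single-process sketch in Python. This is not Hadoop's API: the names mapper, reducer, and run_job are hypothetical, and the word-count task is chosen only as an example. On a real cluster, each chunk would be mapped on a separate node.

```python
from collections import defaultdict


def mapper(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    for word in chunk.split():
        yield word.lower(), 1


def reducer(word, counts):
    """Aggregate all the counts that were emitted for a single word."""
    return word, sum(counts)


def run_job(chunks):
    """Toy, in-memory stand-in for a MapReduce job (hypothetical helper)."""
    grouped = defaultdict(list)
    # Map phase: every chunk is processed independently of the others.
    for chunk in chunks:
        for key, value in mapper(chunk):
            # Shuffle step: collect the values that share a key.
            grouped[key].append(value)
    # Reduce phase: one reducer call per distinct key.
    return dict(reducer(key, values) for key, values in grouped.items())


chunks = ["big data big", "data cluster"]
result = run_job(chunks)
print(result)
```

Because each call to mapper depends only on its own chunk, the map phase can be spread over as many nodes as there are chunks, which is the property the scaling claim rests on.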
Hadoop is becoming a key business tool because of its ability to process large datasets. Companies such as Yahoo, IBM, Facebook, the New York Times, and e-Harmony already use Hadoop to varying degrees, and other companies are beginning to see its potential. The trend suggests that Hadoop will become one of the leading platforms for processing large quantities of data.
Unfortunately, as of version 0.19, Hadoop has security flaws that limit how data can be handled and what kind of data can be handled. First, HDFS, the file system that Hadoop runs on, has no read control. Second, Hadoop authenticates a user for access control by using the output of the 'whoami' command, which is not secure because that output is produced on the client side and can be forged. Third, HBase, the "database" that Hadoop uses, has no access control at all. Any company employing Hadoop needs to be aware of these issues and apply security practices that work around them.
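The weakness of the second point can be shown with a toy model in Python. This is a sketch under stated assumptions, not Hadoop code: ToyNameNode, client_request, and the path and user names are all hypothetical. The only thing it models is a server that trusts whatever username the client reports.

```python
def client_request(path, reported_user):
    """Build a request; the username is whatever the client chooses to report."""
    return {"path": path, "user": reported_user}


class ToyNameNode:
    """Hypothetical server that trusts the client-reported username."""

    def __init__(self, owners):
        # Maps each path to the username that owns it.
        self.owners = owners

    def can_write(self, request):
        # No verification: the reported name is compared to the owner as-is.
        return self.owners.get(request["path"]) == request["user"]


server = ToyNameNode({"/data/payroll": "alice"})

# An honest client run by mallory reports its real username.
honest = client_request("/data/payroll", "mallory")
# The same person reports "alice" instead, for example by replacing
# the local program that supplies the username.
spoofed = client_request("/data/payroll", "alice")

print(server.can_write(honest))
print(server.can_write(spoofed))
```

The server cannot tell the spoofed request from a genuine one, since nothing in the request proves who sent it. Closing that gap requires some form of verifiable credential, which is outside what this sketch covers.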
http://www.hackedexistence.com/downloads/Cloud_Security_in_Map_Reduce.pdf
