Ubuntu16.0.4下单机集群式elasticsearch 安装、配置及示例
文章友情链接:CSDN:https://blog.csdn.net/u010820857/article/details/81944517
1、下载
elasticsearch点击下载
或者用命令行下载
curl -L -O http://download.elasticsearch.org/PATH/TO/VERSION.zip <1>
unzip elasticsearch-$VERSION.zip
cd elasticsearch-$VERSION
2、配置单机集群
-
1、在home目录下创建一个文件夹elasticsearch,然后将刚才下载的elastcsearch复制三份,分别重命名为:elastcsearch-node1、elastcsearch-node2、elastcsearch-node3
image.png -
2、修改config
elasticsearch集群配置比较简单,只需把每个节点的cluster name设置成相同的,es启动时会自动发现同一网段内相同cluster name的节点自动加入到集群中。要做到单机上开多个实例,需要修改ES的默认配置,以下是一些配置要点:
image.png
- node.max_local_storage_nodes
这个配置限制了单节点上可以开启的ES存储实例的个数,我们需要开多个实例,因此需要把这个配置写到配置文件中,并为这个配置赋值为2或者更高。 - http.port
这个配置是elasticsearch对外提供服务的http端口配置,默认情况下ES会取用9200~9299之间的端口,如果9200被占用就会自动使用9201,在单机多实例的配置中这个配置实际是不需要修改的。 - transport.tcp.port
这个配置指定了elasticsearch集群内数据通讯使用的端口,默认情况下为9300,与上面的http.port配置类似,ES也会自动为已占用的端口选择下一个端口号。我们可以将第一个实例的tcp传输端口配置为9300,第二实例配置为9301
- 3、几个基本名词
index: es里的index相当于一个数据库。
type: 相当于数据库里的一个表。
id: 唯一,相当于主键。
node:节点是es实例,一台机器可以运行多个实例,但是同一台机器上的实例在配置文件中要确保http和tcp端口不同(下面有讲)。
cluster:代表一个集群,集群中有多个节点,其中有一个会被选为主节点,这个主节点是可以通过选举产生的,主从节点是对于集群内部来说的。
shards:代表索引分片,es可以把一个完整的索引分成多个分片,这样的好处是可以把一个大的索引拆分成多个,分布到不同的节点上,构成分布式搜索。分片的数量只能在索引创建前指定,并且索引创建后不能更改。
replicas:代表索引副本,es可以设置多个索引的副本,副本的作用一是提高系统的容错性,当个某个节点某个分片损坏或丢失时可以从副本中恢复。二是提高es的查询效率,es会自动对搜索请求进行负载均衡。 - 4、配置文件 ./elasticsearch-node1/config/elasticsearch.yml
- elasticsearch-node1的elasticsearch.yml配置如下:
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
# Before you set out to tweak and tune the configuration, make sure you
# understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: ictr_ElasticSearch
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
#换个节点名字
node.name: node1
node.master: true
#
# Add custom attributes to the node:
#
node.attr.rack: r1
#这个配置限制了单节点上可以开启的ES存储实例的个数,我们需要开多个实例,因此需要把这个配置写到配置文件中,并为这个配置赋值为2或者更高
node.max_local_storage_nodes: 3
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
#path.data: /path/to/data
#
# Path to log files:
#
#path.logs: /path/to/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 127.0.0.1
#
# Set a custom port for HTTP:
#
http.port: 9201
#
transport.tcp.port: 9301
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.zen.ping.unicast.hosts: ["127.0.0.1:9301"] #候选主节点地址
#
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
#
#指定集群中的节点中有几个有master资格的节点。
#对于大集群可以写3个以上。
discovery.zen.minimum_master_nodes: 1
#默认是3s,这是设置集群中自动发现其它节点时ping连接超时时间,
#为避免因为网络差而导致启动报错,我设成了40s。
discovery.zen.ping_timeout: 40s
#
# For more information, consult the zen discovery module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
http.cors.enabled: true
http.cors.allow-origin: "*"
- elasticsearch-node2的elasticsearch.yml配置如下:
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
# Before you set out to tweak and tune the configuration, make sure you
# understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: ictr_ElasticSearch
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
#换个节点名字
node.name: node2
node.master: false
#
# Add custom attributes to the node:
#
node.attr.rack: r1
#这个配置限制了单节点上可以开启的ES存储实例的个数,我们需要开多个实例,因此需要把这个配置写到配置文件中,并为这个配置赋值为2或者更高
node.max_local_storage_nodes: 3
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
#path.data: /path/to/data
#
# Path to log files:
#
#path.logs: /path/to/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
bootstrap.memory_lock: false
bootstrap.system_call_filter: false
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 127.0.0.1
#
# Set a custom port for HTTP:
#
http.port: 9202
#
transport.tcp.port: 9302
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.zen.ping.unicast.hosts: ["127.0.0.1:9301"] #候选主节点地址
#
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
#
#discovery.zen.minimum_master_nodes: 1
#
# For more information, consult the zen discovery module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
http.cors.enabled: true
http.cors.allow-origin: "*"
- elasticsearch-node3的elasticsearch.yml配置如下:
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
# Before you set out to tweak and tune the configuration, make sure you
# understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: ictr_ElasticSearch
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
#换个节点名字
node.name: node3
node.master: false
#
# Add custom attributes to the node:
#
node.attr.rack: r1
#这个配置限制了单节点上可以开启的ES存储实例的个数,我们需要开多个实例,因此需要把这个配置写到配置文件中,并为这个配置赋值为2或者更高
node.max_local_storage_nodes: 3
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
#path.data: /path/to/data
#
# Path to log files:
#
#path.logs: /path/to/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
bootstrap.memory_lock: false
bootstrap.system_call_filter: false
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 127.0.0.1
#
# Set a custom port for HTTP:
#
http.port: 9203
#
transport.tcp.port: 9303
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.zen.ping.unicast.hosts: ["127.0.0.1:9301"] #候选主节点地址
#
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
#
#discovery.zen.minimum_master_nodes: 1
#
# For more information, consult the zen discovery module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
http.cors.enabled: true
http.cors.allow-origin: "*"
3、启动节点
分别启动三个elasticsearch实例
./bin/elasticsearch -d (后台运行)
4、访问浏览器
http://127.0.0.1:9100/
5、问题
因为elasticsearch为了安全,不能用root用户运行,如果有些文件没有权限运行,要么新建用户组和用户,要么给文件授权
chmod -R 777 elasticsearch
chmod -R 777 xxx

低调大师中文资讯倾力打造互联网数据资讯、行业资源、电子商务、移动互联网、网络营销平台。
持续更新报道IT业界、互联网、市场资讯、驱动更新,是最及时权威的产业资讯及硬件资讯报道平台。
转载内容版权归作者及来源网站所有,本站原创内容转载请注明来源。
-
上一篇
高性能Spark作业基础:你必须知道的调优原则及建议
在大数据计算领域,Spark已经成为了越来越流行、越来越受欢迎的计算平台之一。Spark的功能涵盖了大数据领域的离线批处理、SQL类处理、流式/实时计算、机器学习、图计算等各种不同类型的计算操作,应用范围与前景非常广泛。在美团点评,已经有很多同学在各种项目中尝试使用Spark。大多数同学(包括笔者在内),最初开始尝试使用Spark的原因很简单,主要就是为了让大数据计算作业的执行速度更快、性能更高。 然而,通过Spark开发出高性能的大数据计算作业,并不是那么简单的。如果没有对Spark作业进行合理的调优,Spark作业的执行速度可能会很慢,这样就完全体现不出Spark作为一种快速大数据计算引擎的优势来。因此,想要用好Spark,就必须对其进行合理的性能优化。 Spark的性能调优实际上是由很多部分组成的,不是调节几个参数就可以立竿见影提升作业性能的。我们需要根据不同的业务场景以及数据情况,对Spark作业进行综合性的分析,然后进行多个方面的调节和优化,才能获得最佳性能。 笔者根据之前的Spark作业开发经验以及实践积累,总结出了一套Spark作业的性能优化方案。整套方案主要分为开发调优...
-
下一篇
利用MaxCompute内建函数及UDTF转换json格式日志数据
一、业务场景分析: 由于业务的复杂性,数据开发者需要面对不同来源的不同类型数据,需要把这些数据抽取到数据平台,按照设计好的数据模型对关键业务字段进行抽取,形成一张二维表,以便后续在大数据平台/数据仓库中进行统计分析、关联计算。 本文结合一个具体的案例来介绍如何使用MaxCompute对json格式的日志数据进行转换处理。 1.数据来源:应用实时写入ECS主机的指定目录下的日志文件中; 2.数据格式:日志文件中,每条日志的格式如下图所示(示例中对数据进行了简化和脱敏),每一条日志中包含了设备信息,以及1个或多个Session信息,且每条日志中的Session数量是动态的:1个或多个Session。每条日志的内容示例如下: 3.数据处理需求:采集日志数据,对日志数据进行解析、转换,对转换后的日志数据在MaxCompute进行统计分析。由于日志数
相关文章
文章评论
共有0条评论来说两句吧...