Python is straightforward, which makes it a good place to start.
Elasticsearch client list: https://www.elastic.co/guide/en/elasticsearch/client/index.html
Python API: https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/index.html
Reference documentation: http://elasticsearch-py.readthedocs.io/en/master/index.html
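Installation
I installed Python 3.6 on CentOS 7 and then installed the client with the following command:
pip3 install elasticsearch
Installing this way requires root privileges.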
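Before going further, it can help to confirm that the packages are actually visible to the interpreter you plan to use. The snippet below is a minimal sketch using only the standard library; the helper name `missing_packages` is my own and is not part of any library, and `pymongo` is included because the next section relies on it.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # 'elasticsearch' and 'pymongo' are the two packages this post uses.
    missing = missing_packages(["elasticsearch", "pymongo"])
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Run it with the same `python3` that will run the trial script, since `pip3` and `python3` can point at different installations.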
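A Quick Trial
Elasticsearch indexes documents as JSON, and MongoDB also stores JSON-like documents (BSON), so this example exports data from MongoDB and adds it to Elasticsearch.
The MongoDB Python API requires pymongo, which can be installed with: pip3 install pymongo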
import traceback
from pymongo import MongoClient
from elasticsearch import Elasticsearch
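# Connect to the local MongoDB instance and select the 'blog' database.
_db = MongoClient('mongodb://127.0.0.1:27017')['blog']
# With no arguments, the client version used here connects to localhost:9200.
_es = Elasticsearch()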
# Mappings for two document types. Elasticsearch 6.0 and later allow only one
# mapping type per index, so this two-type layout targets older versions.
_index_mappings = {
    "mappings": {
        "user": {
            "properties": {
                "title": {"type": "text"},
                "name": {"type": "text"},
                "age": {"type": "integer"}
            }
        },
        "blogpost": {
            "properties": {
                "title": {"type": "text"},
                "body": {"type": "text"},
                "user_id": {"type": "keyword"},
                "created": {"type": "date"}
            }
        }
    }
}
# Create the index with these mappings only if it does not exist yet.
if not _es.indices.exists(index='blog_index'):
    _es.indices.create(index='blog_index', body=_index_mappings)
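# Load all user documents, dropping '_id' (an ObjectId is not JSON-serializable).
user_cursor = _db.user.find({}, projection={'_id': False})
user_docs = list(user_cursor)
processed = 0
for _doc in user_docs:
    try:
        # refresh=True makes each document searchable right after indexing.
        _es.index(index='blog_index', doc_type='user', refresh=True, body=_doc)
        processed += 1
        print('Processed: ' + str(processed), flush=True)
    except Exception:
        traceback.print_exc()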
print('Search all...', flush=True)
_query_all = {
    'query': {
        'match_all': {}
    }
}
_searched = _es.search(index='blog_index', doc_type='user', body=_query_all)
print(_searched, flush=True)
for hit in _searched['hits']['hits']:
    print(hit['_source'], flush=True)
print('Search name contains jerry.', flush=True)
_query_name_contains = {
    'query': {
        'match': {
            'name': 'jerry'
        }
    }
}
_searched = _es.search(index='blog_index', doc_type='user', body=_query_name_contains)
print(_searched, flush=True)
Save the full script as elasticsearch_trial.py and run it:
python3 elasticsearch_trial.py
It produces the following output:
Processed: 1
Processed: 2
Processed: 3
Search all...
{'took': 1, 'timed_out': False, '_shards': {'total': 5, 'successful': 5, 'failed': 0}, 'hits': {'total': 3, 'max_score': 1.0, 'hits': [{'_index': 'blog_index', '_type': 'user', '_id': 'AVn4TrrVXvwnWPWhxu5q', '_score': 1.0, '_source': {'title': 'Manager', 'name': 'Trump Heat', 'age': 67}}, {'_index': 'blog_index', '_type': 'user', '_id': 'AVn4TrscXvwnWPWhxu5s', '_score': 1.0, '_source': {'title': 'Engineer', 'name': 'Tommy Hsu', 'age': 32}}, {'_index': 'blog_index', '_type': 'user', '_id': 'AVn4Trr2XvwnWPWhxu5r', '_score': 1.0, '_source': {'title': 'President', 'name': 'Jerry Jim', 'age': 21}}]}}
{'title': 'Manager', 'name': 'Trump Heat', 'age': 67}
{'title': 'Engineer', 'name': 'Tommy Hsu', 'age': 32}
{'title': 'President', 'name': 'Jerry Jim', 'age': 21}
Search name contains jerry.
{'took': 3, 'timed_out': False, '_shards': {'total': 5, 'successful': 5, 'failed': 0}, 'hits': {'total': 1, 'max_score': 0.25811607, 'hits': [{'_index': 'blog_index', '_type': 'user', '_id': 'AVn4Trr2XvwnWPWhxu5r', '_score': 0.25811607, '_source': {'title': 'President', 'name': 'Jerry Jim', 'age': 21}}]}}
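The raw responses printed above are nested dictionaries, and the useful part sits under `hits.hits[*]._source`. The following is a minimal sketch of two small helpers for pulling that out; the names `hit_sources` and `total_hits` are hypothetical and not part of the Elasticsearch client, and `sample` is a trimmed copy of the "name contains jerry" response shown above. The dictionary branch in `total_hits` is there because Elasticsearch 7 and later report `hits.total` as an object rather than the plain integer seen in this output.

```python
def hit_sources(response):
    """Return the `_source` dict of every hit in a search response."""
    return [hit["_source"] for hit in response["hits"]["hits"]]


def total_hits(response):
    """Return the total hit count as an int.

    Older servers report `hits.total` as a plain integer; Elasticsearch 7
    and later return an object such as {'value': 3, 'relation': 'eq'}.
    """
    total = response["hits"]["total"]
    return total["value"] if isinstance(total, dict) else total


# Trimmed copy of the "name contains jerry" response from the output above.
sample = {
    "took": 3,
    "timed_out": False,
    "hits": {
        "total": 1,
        "max_score": 0.25811607,
        "hits": [
            {
                "_index": "blog_index",
                "_type": "user",
                "_id": "AVn4Trr2XvwnWPWhxu5r",
                "_score": 0.25811607,
                "_source": {"title": "President", "name": "Jerry Jim", "age": 21},
            }
        ],
    },
}

print(total_hits(sample))          # prints 1
for source in hit_sources(sample):
    print(source["name"], source["age"])  # prints Jerry Jim 21
```

Note that the query term `jerry` matches the stored name `Jerry Jim` because a `text` field is analyzed (lowercased and tokenized) at index time.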
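The indexing loop in the script sends one request per document and forces a refresh each time, which is fine for three documents but slow for a real export. One possible alternative, sketched below, is the client's `elasticsearch.helpers.bulk` function. The function names `bulk_actions` and `bulk_index` are my own, the `_type` key assumes the same older Elasticsearch version as the rest of this post (newer versions drop mapping types), and `bulk_index` has not been run against the cluster used above.

```python
def bulk_actions(docs, index, doc_type):
    """Yield one bulk action per document for elasticsearch.helpers.bulk."""
    for doc in docs:
        # '_type' is assumed here to match the doc_type used in the script.
        yield {"_index": index, "_type": doc_type, "_source": doc}


def bulk_index(es, docs, index="blog_index", doc_type="user"):
    """Index all docs with one bulk call, then refresh the index once."""
    # Imported here so bulk_actions() can be tried without the client installed.
    from elasticsearch import helpers

    success_count, errors = helpers.bulk(es, bulk_actions(docs, index, doc_type))
    es.indices.refresh(index=index)
    return success_count, errors


# The three user documents from the output above, used only to show the shape
# of the generated actions; no server is contacted here.
users = [
    {"title": "Manager", "name": "Trump Heat", "age": 67},
    {"title": "Engineer", "name": "Tommy Hsu", "age": 32},
    {"title": "President", "name": "Jerry Jim", "age": 21},
]
actions = list(bulk_actions(users, "blog_index", "user"))
print(len(actions))  # prints 3
print(actions[0]["_source"]["name"])  # prints Trump Heat
```

With a live cluster, `bulk_index(_es, user_docs)` would take the place of the `for _doc in user_docs` loop; refreshing once at the end keeps the documents searchable for the queries that follow.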