干货 | 一文带你搞定Python 数据可视化-低调大师

干货 | 一文带你搞定Python 数据可视化

2018-10-29 599

01前言

在之前的一篇文章《Python 数据可视化利器》中，我写了 Bokeh、pyecharts 的用法，但是有一个挺强大的库 Plotly 没写，主要是我看到它的教程都是在 Jupyter Notebooks 中使用，说来也奇怪，硬是找不到如何本地使用（就是本地输出 HTML 文件），所以不敢写出来。现在已经找到方法了，这里我就在原文的基础上增加了 Plotly 的部分教程。

数据可视化的第三方库挺多的，这里我主要推荐两个，分别是 Bokeh、pyecharts。

02推荐

数据可视化的库有挺多的，这里推荐几个比较常用的：

●  Matplotlib
●  Plotly
●  Seaborn
●  Ggplot
●  Bokeh
●  Pyechart
●  Pygal

03Plotly

Plotly 文档地址：

● https://plot.ly/python/#financial-charts

使用方式：

Plotly 有 online 和 offline 两种方式，这里只介绍 offline 的。

这是 Plotly 官方教程的一部分

import plotly.plotly as py
import numpy as np

data = [dict(

visible=False,

line=dict(color='#00CED1', width=6), # 配置线宽和颜色

name='𝜈 = ' + str(step),

x=np.arange(0, 10, 0.01), # x 轴参数

y=np.sin(step * np.arange(0, 10, 0.01))) for step in np.arange(0, 5, 0.1)] # y 轴参数

data[10]['visible'] = True

py.iplot(data, filename='Single Sine Wave')

只要将最后一行中的

py.iplot

替换为下面代码

py.offline.plot

便可以运行。

漏斗图

这个图代码太长了，就不 po 出来了。

Basic Box Plot

 import plotly.plotly

 import plotly.graph_objs as go

 import numpy as np

 y0 = np.random.randn(50)-1

 y1 = np.random.randn(50)+1

trace0 = go.Box(

y=y0

)

trace1 = go.Box(

y=y1

)

data = [trace0, trace1]

plotly.offline.plot(data)

Wind Rose Chart

 import plotly.graph_objs as go

trace1 = go.Barpolar(

r=[77.5, 72.5, 70.0, 45.0, 22.5, 42.5, 40.0, 62.5],

text=['North', 'N-E', 'East', 'S-E', 'South', 'S-W', 'West', 'N-W'],

name='11-14 m/s',

marker=dict(

color='rgb(106,81,163)'

)

trace2 = go.Barpolar(

r=[57.49999999999999, 50.0, 45.0, 35.0, 20.0, 22.5, 37.5, 55.00000000000001],

text=['North', 'N-E', 'East', 'S-E', 'South', 'S-W', 'West', 'N-W'], # 鼠标浮动标签文字描述

name='8-11 m/s',

marker=dict(

color='rgb(158,154,200)'

)

trace3 = go.Barpolar(

r=[40.0, 30.0, 30.0, 35.0, 7.5, 7.5, 32.5, 40.0],

text=['North', 'N-E', 'East', 'S-E', 'South', 'S-W', 'West', 'N-W'],

name='5-8 m/s',

marker=dict(

color='rgb(203,201,226)'

)

trace4 = go.Barpolar(

r=[20.0, 7.5, 15.0, 22.5, 2.5, 2.5, 12.5, 22.5],

text=['North', 'N-E', 'East', 'S-E', 'South', 'S-W', 'West', 'N-W'],

name='< 5 m/s',

marker=dict(

color='rgb(242,240,247)'

)

data = [trace1, trace2, trace3, trace4]

layout = go.Layout(

title='Wind Speed Distribution in Laurel, NE',

font=dict(

size=16

),

legend=dict(

font=dict(

size=16

)

),

radialaxis=dict(

ticksuffix='%'

),

orientation=270

)

fig = go.Figure(data=data, layout=layout)

plotly.offline.plot(fig, filename='polar-area-chart')

Basic Ternary Plot with Markers

篇幅有点长，这里就不 po 代码了。

04Bokeh

这里展示一下常用的图表和比较抢眼的图表，详细的文档可查看。

条形图

这配色看着还挺舒服的，比 pyecharts 条形图的配色好看一点。

from bokeh.io import show, output_file

from bokeh.models import ColumnDataSource

from bokeh.palettes import Spectral6

from bokeh.plotting import figure

output_file("colormapped_bars.html")# 配置输出文件名

fruits = ['Apples', '魅族', 'OPPO', 'VIVO', '小米', '华为'] # 数据

counts = [5, 3, 4, 2, 4, 6] # 数据

source = ColumnDataSource(data=dict(fruits=fruits, counts=counts, color=Spectral6))

p = figure(x_range=fruits, y_range=(0,9), plot_height=250, title="Fruit Counts",

toolbar_location=None, tools="")# 条形图配置项

p.vbar(x='fruits', top='counts', width=0.9, color='color', legend="fruits", source=source)

p.xgrid.grid_line_color = None # 配置网格线颜色

p.legend.orientation = "horizontal" # 图表方向为水平方向

p.legend.location = "top_center"

show(p) # 展示图表

年度条形图

可以对比不同时间点的量。

from bokeh.io import show, output_file

from bokeh.models import ColumnDataSource, FactorRange

from bokeh.plotting import figure

output_file("bars.html") # 输出文件名

fruits = ['Apple', '魅族', 'OPPO', 'VIVO', '小米', '华为'] # 参数

years = ['2015', '2016', '2017'] # 参数

data = {'fruits': fruits,

'2015': [2, 1, 4, 3, 2, 4],

'2016': [5, 3, 3, 2, 4, 6],

'2017': [3, 2, 4, 4, 5, 3]}

x = [(fruit, year) for fruit in fruits for year in years]

counts = sum(zip(data['2015'], data['2016'], data['2017']), ())

source = ColumnDataSource(data=dict(x=x, counts=counts))

p = figure(x_range=FactorRange(*x), plot_height=250, title="Fruit Counts by Year",

toolbar_location=None, tools="")

p.vbar(x='x', top='counts', width=0.9, source=source)

p.y_range.start = 0

p.x_range.range_padding = 0.1

p.xaxis.major_label_orientation = 1

p.xgrid.grid_line_color = None

show(p)

饼图

from collections import Counter

from math import pi

import pandas as pd

from bokeh.io import output_file, show

from bokeh.palettes import Category20c

from bokeh.plotting import figure

from bokeh.transform import cumsum

output_file("pie.html")

x = Counter({

'中国': 157,

'美国': 93,

'日本': 89,

'巴西': 63,

'德国': 44,

'印度': 42,

'意大利': 40,

'澳大利亚': 35,

'法国': 31,

'西班牙': 29

})

data = pd.DataFrame.from_dict(dict(x), orient='index').reset_index().rename(index=str, columns={0:'value', 'index':'country'})

data['angle'] = data['value']/sum(x.values()) * 2*pi

data['color'] = Category20c[len(x)]

p = figure(plot_height=350, title="Pie Chart", toolbar_location=None,

tools="hover", tooltips="@country: @value")

p.wedge(x=0, y=1, radius=0.4,

start_angle=cumsum('angle', include_zero=True), end_angle=cumsum('angle'),

line_color="white", fill_color='color', legend='country', source=data)

p.axis.axis_label=None

p.axis.visible=False

p.grid.grid_line_color = None

show(p)

条形图

年度水果进出口

from bokeh.io import output_file, show
from bokeh.models import ColumnDataSource
from bokeh.palettes import GnBu3, OrRd3
from bokeh.plotting import figure

output_file("stacked_split.html")

fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
years = ["2015", "2016", "2017"]

exports = {'fruits': fruits,
'2015': [2, 1, 4, 3, 2, 4],
'2016': [5, 3, 4, 2, 4, 6],
'2017': [3, 2, 4, 4, 5, 3]}

imports = {'fruits': fruits,
'2015': [-1, 0, -1, -3, -2, -1],
'2016': [-2, -1, -3, -1, -2, -2],
'2017': [-1, -2, -1, 0, -2, -2]}

p = figure(y_range=fruits, plot_height=250, x_range=(-16, 16), title="Fruit import/export, by year",

toolbar_location=None)

p.hbar_stack(years, y='fruits', height=0.9, color=GnBu3, source=ColumnDataSource(exports),

legend=["%s exports" % x for x in years])

p.hbar_stack(years, y='fruits', height=0.9, color=OrRd3, source=ColumnDataSource(imports),

legend=["%s imports" % x for x in years])

p.y_range.range_padding = 0.1

p.ygrid.grid_line_color = None

p.legend.location = "top_left"

p.axis.minor_tick_line_color = None

p.outline_line_color = None
show(p)

散点图

from bokeh.plotting import figure, output_file, show

output_file("line.html")

p = figure(plot_width=400, plot_height=400)

p.circle([1, 2, 3, 4, 5], [6, 7, 2, 4, 5], size=20, color="navy", alpha=0.5)

show(p)

六边形图

这两天，马蜂窝刚被发现数据造假，这不，与马蜂窝应应景。

import numpy as np

from bokeh.io import output_file, show

from bokeh.plotting import figure

from bokeh.util.hex import axial_to_cartesian

output_file("hex_coords.html")

q = np.array([0, 0, 0, -1, -1, 1, 1])

r = np.array([0, -1, 1, 0, 1, -1, 0])

p = figure(plot_width=400, plot_height=400, toolbar_location=None) #

p.grid.visible = False # 配置网格是否可见

p.hex_tile(q, r, size=1, fill_color=["firebrick"] * 3 + ["navy"] * 4,

line_color="white", alpha=0.5)

x, y = axial_to_cartesian(q, r, 1, "pointytop")

p.text(x, y, text=["(%d, %d)" % (q, r) for (q, r) in zip(q, r)],

text_baseline="middle", text_align="center")

show(p)

环比条形图

这个实现挺厉害的，看了一眼就吸引了我。我在代码中都做了一些注释，希望对你理解有帮助。注：圆心为正中央，即直角坐标系中标签为（0，0）的地方。

 from collections import OrderedDict

 from math import log, sqrt

 import numpy as np

 import pandas as pd

 from six.moves import cStringIO as StringIO

 from bokeh.plotting import figure, show, output_file

antibiotics = """

bacteria, penicillin, streptomycin, neomycin, gram

结核分枝杆菌, 800, 5, 2, negative

沙门氏菌, 10, 0.8, 0.09, negative

变形杆菌, 3, 0.1, 0.1, negative

肺炎克雷伯氏菌, 850, 1.2, 1, negative

布鲁氏菌, 1, 2, 0.02, negative

铜绿假单胞菌, 850, 2, 0.4, negative

大肠杆菌, 100, 0.4, 0.1, negative

产气杆菌, 870, 1, 1.6, negative

白色葡萄球菌, 0.007, 0.1, 0.001, positive

溶血性链球菌, 0.001, 14, 10, positive

草绿色链球菌, 0.005, 10, 40, positive

肺炎双球菌, 0.005, 11, 10, positive

"""

drug_color = OrderedDict([# 配置中间标签名称与颜色

("盘尼西林", "#0d3362"),

("链霉素", "#c64737"),

("新霉素", "black"),

])

gram_color = {

"positive": "#aeaeb8",

"negative": "#e69584",

}

# 读取数据

df = pd.read_csv(StringIO(antibiotics),

skiprows=1,

skipinitialspace=True,

engine='python')

width = 800

height = 800

inner_radius = 90

outer_radius = 300 - 10

minr = sqrt(log(.001 * 1E4))

maxr = sqrt(log(1000 * 1E4))

a = (outer_radius - inner_radius) / (minr - maxr)

b = inner_radius - a * maxr

def rad(mic):

return a * np.sqrt(np.log(mic * 1E4)) + b

big_angle = 2.0 * np.pi / (len(df) + 1)

small_angle = big_angle / 7

# 整体配置

p = figure(plot_width=width, plot_height=height, title="",

x_axis_type=None, y_axis_type=None,

x_range=(-420, 420), y_range=(-420, 420),

min_border=0, outline_line_color="black",

background_fill_color="#f0e1d2")

p.xgrid.grid_line_color = None

p.ygrid.grid_line_color = None

# annular wedges

angles = np.pi / 2 - big_angle / 2 - df.index.to_series() * big_angle #计算角度

colors = [gram_color[gram] for gram in df.gram] # 配置颜色

p.annular_wedge(

0, 0, inner_radius, outer_radius, -big_angle + angles, angles, color=colors,

)

# small wedges

p.annular_wedge(0, 0, inner_radius, rad(df.penicillin),

-big_angle + angles + 5 * small_angle, -big_angle + angles + 6 * small_angle,

color=drug_color['盘尼西林'])

p.annular_wedge(0, 0, inner_radius, rad(df.streptomycin),

-big_angle + angles + 3 * small_angle, -big_angle + angles + 4 * small_angle,

color=drug_color['链霉素'])

p.annular_wedge(0, 0, inner_radius, rad(df.neomycin),

-big_angle + angles + 1 * small_angle, -big_angle + angles + 2 * small_angle,

color=drug_color['新霉素'])

# 绘制大圆和标签

labels = np.power(10.0, np.arange(-3, 4))

radii = a * np.sqrt(np.log(labels * 1E4)) + b

p.circle(0, 0, radius=radii, fill_color=None, line_color="white")

p.text(0, radii[:-1], [str(r) for r in labels[:-1]],

text_font_size="8pt", text_align="center", text_baseline="middle")

# 半径

p.annular_wedge(0, 0, inner_radius - 10, outer_radius + 10,

-big_angle + angles, -big_angle + angles, color="black")

# 细菌标签

xr = radii[0] * np.cos(np.array(-big_angle / 2 + angles))

yr = radii[0] * np.sin(np.array(-big_angle / 2 + angles))

label_angle = np.array(-big_angle / 2 + angles)

label_angle[label_angle < -np.pi / 2] += np.pi # easier to read labels on the left side

# 绘制各个细菌的名字

p.text(xr, yr, df.bacteria, angle=label_angle,

text_font_size="9pt", text_align="center", text_baseline="middle")

# 绘制圆形，其中数字分别为 x 轴与 y 轴标签

p.circle([-40, -40], [-370, -390], color=list(gram_color.values()), radius=5)

# 绘制文字

p.text([-30, -30], [-370, -390], text=["Gram-" + gr for gr in gram_color.keys()],

text_font_size="7pt", text_align="left", text_baseline="middle")

# 绘制矩形，中间标签部分。其中 -40，-40，-40 为三个矩形的 x 轴坐标。18，0，-18 为三个矩形的 y 轴坐标

p.rect([-40, -40, -40], [18, 0, -18], width=30, height=13,

color=list(drug_color.values()))

# 配置中间标签文字、文字大小、文字对齐方式

p.text([-15, -15, -15], [18, 0, -18], text=list(drug_color),

text_font_size="9pt", text_align="left", text_baseline="middle")

output_file("burtin.html", title="burtin.py example")

show(p)

元素周期表

元素周期表，这个实现好牛逼啊，距离初三刚开始学化学已经很遥远了，想当年我还是化学课代表呢！由于基本用不到化学了，这里就不实现了。

真实状态

05Pyecharts

pyecharts 也是一个比较常用的数据可视化库，用得也是比较多的了，是百度 Echarts 库的 Python 支持。这里也展示一下常用的图表。

文档地址为：

● http://pyecharts.org/#/zh-cn/prepare?id=%E5%AE%89%E8%A3%85-pyecharts

条形图

from pyecharts import Bar

bar = Bar("我的第一个图表", "这里是副标题")

bar.add("服装", ["衬衫", "羊毛衫", "雪纺衫", "裤子", "高跟鞋", "袜子"], [5, 20, 36, 10, 75, 90])

# bar.print_echarts_options() # 该行只为了打印配置项，方便调试时使用

bar.render() # 生成本地 HTML 文件

散点图

from pyecharts import Polar

import random

data_1 = [(10, random.randint(1, 100)) for i in range(300)]

data_2 = [(11, random.randint(1, 100)) for i in range(300)]

polar = Polar("极坐标系-散点图示例", width=1200, height=600)

polar.add("", data_1, type='scatter')

polar.add("", data_2, type='scatter')

polar.render()

饼图

import random

from pyecharts import Pie

attr = ['A', 'B', 'C', 'D', 'E', 'F']

pie = Pie("饼图示例", width=1000, height=600)

pie.add(

"",

attr,

[random.randint(0, 100) for _ in range(6)],

radius=[50, 55],

center=[25, 50],

is_random=True,

)

pie.add(

"",

attr,

[random.randint(20, 100) for _ in range(6)],

radius=[0, 45],

center=[25, 50],

rosetype="area",

)

pie.add(

"",

attr,

[random.randint(0, 100) for _ in range(6)],

radius=[50, 55],

center=[65, 50],

is_random=True,

)

pie.add(

"",

attr,

[random.randint(20, 100) for _ in range(6)],

radius=[0, 45],

center=[65, 50],

rosetype="radius",

)

pie.render()

词云

这个是我在前面的文章中用到的图片实例，这里就不 po 具体数据了。

from pyecharts import WordCloud

name = ['Sam S Club'] # 词条

value = [10000] # 权重

wordcloud = WordCloud(width=1300, height=620)

wordcloud.add("", name, value, word_size_range=[20, 100])

wordcloud.render()

树图

这个是我在前面的文章中用到的图片实例，这里就不 po 具体数据了。

from pyecharts import TreeMap

data = [ # 键值对数据结构

{

value: 1212, # 数值

# 子节点

children: [

{

# 子节点数值

value: 2323,

# 子节点名

name: 'description of this node',

children: [...],

},

{

value: 4545,

name: 'description of this node',

children: [

{

value: 5656,

name: 'description of this node',

children: [...]

},

...

]

}

]

},

...

]

treemap = TreeMap(title, width=1200, height=600) # 设置标题与宽高

treemap.add("深圳", data, is_label_show=True, label_pos='inside', label_text_size=19)

treemap.render()

地图

 from pyecharts import Map

value = [155, 10, 66, 78, 33, 80, 190, 53, 49.6]

attr = [

"福建", "山东", "北京", "上海", "甘肃", "新疆", "河南", "广西", "西藏"

]

map = Map("Map 结合 VisualMap 示例", width=1200, height=600)

map.add(

"",

attr,

value,

maptype="china",

is_visualmap=True,

visual_text_color="#000",

)

map.render()

3D 散点图

from pyecharts import Scatter3D

import random

data = [

[random.randint(0, 100),

random.randint(0, 100),

random.randint(0, 100)] for _ in range(80)

]

range_color = [
'#313695', '#4575b4', '#74add1', '#abd9e9', '#e0f3f8', '#ffffbf',
'#fee090', '#fdae61', '#f46d43', '#d73027', '#a50026']

scatter3D = Scatter3D("3D 散点图示例", width=1200, height=600) # 配置宽高

scatter3D.add("", data, is_visualmap=True, visual_range_color=range_color) # 设置颜色等

scatter3D.render() # 渲染

06后记

大概介绍就是这样了，三个库的功能都挺强大的，Bokeh 的中文资料会少一点，如果阅读英文有点难度，还是建议使用 pyecharts 就好。总体也不是很难，按照文档来修改数据都能够直接上手使用。主要是多练习。

原文发布时间为：2018-10-29

本文作者：zone

本文来自云栖社区合作伙伴“CDA数据分析师”，了解相关信息可以关注“CDA数据分析师”。

微信关注我们

原文链接：https://yq.aliyun.com/articles/659746

转载内容版权归作者及来源网站所有！

低调大师中文资讯倾力打造互联网数据资讯、行业资源、电子商务、移动互联网、网络营销平台。持续更新报道IT业界、互联网、市场资讯、驱动更新,是最及时权威的产业资讯及硬件资讯报道平台。

Linux用户权限文件详解

用户管理中最重要的一个文件，这个文件是对所有用户开放的，每一行存一个用户的信息，每个属性之间用冒号分割。 cat /etc/passwd root:x:0:0:root:/root:/bin/zsh bin:x:1:1:bin:/bin:/sbin/nologin daemon:x:2:2:daemon:/sbin:/sbin/nologin adm:x:3:4:adm:/var/adm:/sbin/nologin lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin 举例: root:x:0:0:root:/root:/bin/zsh 1.登陆用户名:root 2.用户密码占位符:x 3.用户UID:0 4.用户组GID:0 5.对用户账户的描述:root 6.用户家目录位置:/root 7.用户默认的shell:/bin/zsh (默认是/bin/bash) linux密码信息文件该文件只有root用户有读权限，每一行存一个用户的记录，每个属性用冒号分割。 cat /etc/shadow root:$6$LxE7qiaZ$ZZXf/aCO1Zqs1...

2018-10-30

640

快学 Go 语言第 1 课 —— Hello World

Go 语言的 Logo 最初 Go 语言的 Logo 是一只可爱的土拨鼠，土拨鼠昼伏夜出的习性让它显得很有 Geek 范。土拨鼠的行动其实并不敏捷，不过它繁殖能力很强，生长发育的很快。到了 2018年，Go 语言重新制定了 Logo，消灭了土拨鼠，取而代之的是纯文字。这好像是在告诉用户 Go 语言不再是一个玩具语言，而是一个严肃的高效的正式语言。不过这个新 Logo 不是很受欢迎，它虽然给人一种快速的感觉，但是也显得很极为古板。到目前为止，你在 Google 图片里搜索 Golang Logo，铺天盖地的还是土拨鼠的形象。 Go 语言的「元团队」很多著名的计算机语言都是那么一两个人业余时间捣鼓出来的，但是 Go 语言是 Google 养着一帮团队打造出来的。这个团队非常豪华，它被称之为 Go Team，成员之一就有大名鼎鼎的 Unix 操作系统的创造者 Ken Thompson，C 语言就是他和已经过世的李奇一起发明的。图片图中翘着二郎腿的谢顶老头就是 Ken Thompson，它参与 Go 项目时个人已经处于半退休状态 —— 年纪大了。有条件的读者可以看看 Youtube ...

2018-10-30

709

发表评论

资源下载

更多资源

Mario

马里奥是站在游戏界顶峰的超人气多面角色。马里奥靠吃蘑菇成长，特征是大鼻子、头戴帽子、身穿背带裤，还留着胡子。与他的双胞胎兄弟路易基一起，长年担任任天堂的招牌角色。

Nacos

Nacos /nɑ:kəʊs/ 是 Dynamic Naming and Configuration Service 的首字母简称，一个易于构建 AI Agent 应用的动态服务发现、配置管理和AI智能体管理平台。Nacos 致力于帮助您发现、配置和管理微服务及AI智能体应用。Nacos 提供了一组简单易用的特性集，帮助您快速实现动态服务发现、服务配置、服务元数据、流量管理。Nacos 帮助您更敏捷和容易地构建、交付和管理微服务平台。

Spring

Spring框架（Spring Framework）是由Rod Johnson于2002年提出的开源Java企业级应用框架，旨在通过使用JavaBean替代传统EJB实现方式降低企业级编程开发的复杂性。该框架基于简单性、可测试性和松耦合性设计理念，提供核心容器、应用上下文、数据访问集成等模块，支持整合Hibernate、Struts等第三方框架，其适用范围不仅限于服务器端开发，绝大多数Java应用均可从中受益。

Rocky Linux

Rocky Linux（中文名：洛基）是由Gregory Kurtzer于2020年12月发起的企业级Linux发行版，作为CentOS稳定版停止维护后与RHEL（Red Hat Enterprise Linux）完全兼容的开源替代方案，由社区拥有并管理，支持x86_64、aarch64等架构。其通过重新编译RHEL源代码提供长期稳定性，采用模块化包装和SELinux安全架构，默认包含GNOME桌面环境及XFS文件系统，支持十年生命周期更新。