Search results for [工具模块] (tool modules) - 低调大师 personal blog

Featured list

Search for [tool modules]: 10000 articles in total

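Practical guide: reusing code modules in the time-series database DolphinDB

2.2 Creating a module file: in the modules directory, create a module file with the .dos suffix, for example FileLog.dos. The first line of a module file must be the module declaration statement.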

2021-02-22

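Wine 5.0 released with multi-monitor support and PE built-in modules

The main highlights of this release are: built-in modules in PE format, multi-monitor support, a reimplementation of XAudio2, and Vulkan 1.1 support. The project also states that this release is dedicated to the memory of Józef Kucia, who was Wine's Direct3D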

2020-01-21

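In Joomla 4, module styles move to layout files

In Joomla, a module style defines the HTML output of a module. These styles control the output of the module title, the heading, and the class suffix.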

2019-12-08

Python C/C++ extensions: writing Python modules with boost_python

Earlier posts covered calling shared libraries directly from Python with ctypes and wrapping C functions with Python's C API. This post gives an overview of the boost_python library, which makes it easy to wrap C++ classes for use from Python.

Installing the boost python library:

    sudo aptitude install libboost-python-dev

Example

The code below implements a plain function maxab() and a Student class:

    #include <iostream>
    #include <string>

    int maxab(int a, int b) { return a>b?a:b; }

    class Student {
    private:
        int age;
        std::string name;

    public:
        Student() {}
        Student(std::string const& _name, int _age) { name=_name; age=_age; }

        static void myrole() { std::cout << "I'm a student!" << std::endl; }

        void whoami() { std::cout << "I am " << name << std::endl; }

        bool operator==(Student const& s) const { return age == s.age; }
        bool operator!=(Student const& s) const { return age != s.age; }
    };

Wrapping it with the boost.python library is also simple:

    #include <Python.h>
    #include <boost/python.hpp>
    #include <boost/python/suite/indexing/vector_indexing_suite.hpp>
    #include <vector>

    #include "student.h"

    using namespace boost::python;

    BOOST_PYTHON_MODULE(student) {
        scope().attr("__version__") = "1.0.0";
        scope().attr("__doc__") = "a demo module to use boost_python.";

        // This will enable user-defined docstrings and python signatures,
        // while disabling the C++ signatures
        docstring_options local_docstring_options(true, false, false);

        def(
            "maxab", &maxab, "return max of two numbers.\n"
        );

        class_<Student>("Student", "a class of student")
            .def(init<>())
            .def(init<std::string, int>())
            // instance method and static method
            .def(
                "whoami", &Student::whoami, "method's doc string..."
            )
            .def(
                "myrole", &Student::myrole, "method's doc string..."
            )
            .staticmethod("myrole");

        // wrap an STL container
        class_<std::vector<Student> >("StudentVec")
            .def(vector_indexing_suite<std::vector<Student> >())
            ;
    }

The code above still includes Python.h. According to the author, leaving it out produces this error:

    wrap_python.hpp:50:23: fatal error: pyconfig.h: No such file or directory

Compiling

There are two ways to compile the code above. The first is to call g++ directly on the command line:

    g++ -I/usr/include/python2.7 -fPIC wrap_student.cpp -lboost_python -shared -o student.so

The -I option gives the path to Python.h; for Python 3, change it to the corresponding path. Compiling wrap_student.cpp requires -fPIC, and the command links against boost_python (-lboost_python) to produce a shared library (-shared). The resulting student.so can then be imported directly from Python:

    In [1]: import student

    In [2]: student.maxab(2, 5)
    Out[2]: 5

    In [3]: s = student.Student('Tom', 12)

    In [4]: s.whoami()
    I am Tom

    In [5]: s.myrole()
    I'm a student!

The other way is to write a setup.py script with Python's setuptools:

    #!/usr/bin/env python
    from setuptools import setup, Extension

    setup(name="student",
          ext_modules=[
              Extension("student", ["wrap_student.cpp"],
                        libraries = ["boost_python"])
          ])

Then run one of these commands to build:

    python setup.py build
    sudo python setup.py install

Copyright of this article belongs to 猿人学

2019-03-31

Distributed transaction middleware Fescar: a reading of the RM module source code

By design, Fescar is divided into three major modules: TM, RM,

2019-02-13

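Building a k8s cluster entirely by hand (2): deploying the core modules

(Other modules query or modify data through the API Server; only the API Server operates on etcd directly.) In production, to keep the apiserver highly available, two or more nodes are usually deployed behind a load balancer such as haproxy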

2018-12-13

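4 - Learning to flash Wi-Fi module firmware (flashing the AT-command firmware)

The latest ESP12 board can be plugged in directly, with no wiring needed. Flashing firmware always works the same way: pull GPIO0 low, reset the module, and then flash. First, flashing the ESP01: set the DIP switch, then reset the module while GPIO0 is held low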

2018-12-01

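23. [Payment module development] Steps for integrating Alipay from Java (sandbox environment)

Open the Alipay API website and download the Java demo. After downloading, unzip it and import the demo project into IDEA. Next, download the tool that we will need later to generate the RSA keys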

2018-10-14

Linux scheduled tasks: fixing the Python "module not found" problem

Conclusion first: whenever a scheduled task is involved, specify the path of the Python environment explicitly.

The Python path problem in the shell

By default, a scheduled task uses the Python that ships with the system. Write a Python program sys_path.py:

    import sys
    print(sys.path)

Put it into a shell script sys_path.sh:

    python ./sys_path.py

Run the script:

    sh sys_path.sh
    ['/data0/qinyk/test', '/data0/anaconda3/lib/python36.zip', '/data0/anaconda3/lib/python3.6', '/data0/anaconda3/lib/python3.6/lib-dynload', '/data0/anaconda3/lib/python3.6/site-packages', '/data0/anaconda3/lib/python3.6/site-packages/PyHive-0.3.0-py3.6.egg', '/data0/anaconda3/lib/python3.6/site-packages/xgboost-0.71-py3.6.egg']

Scheduled task

Run crontab -e and save the output to a log:

    * * * * * sh sys_path.sh >sys_path.log 2>&1

    cat sys_path.log
    ['/data0/qinyk/test', '/usr/lib64/python26.zip', '/usr/lib64/python2.6', '/usr/lib64/python2.6/plat-linux2', '/usr/lib64/python2.6/lib-tk', '/usr/lib64/python2.6/lib-old', '/usr/lib64/python2.6/lib-dynload', '/usr/lib64/python2.6/site-packages', '/usr/lib64/python2.6/site-packages/gtk-2.0', '/usr/lib/python2.6/site-packages']
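The two logs above differ because the interactive shell and cron start different interpreters. As a minimal sketch of a slightly more informative diagnostic, the script below also records which interpreter is running; the function name interpreter_report is a hypothetical helper introduced here for illustration, not something from the original post.

```python
import sys


def interpreter_report():
    """Collect the facts that differ between an interactive shell and cron."""
    return {
        # Absolute path of the interpreter that is executing this script.
        "executable": sys.executable,
        # Short version string such as "3.6.5".
        "version": sys.version.split()[0],
        # Module search path; a missing site-packages entry explains ImportError.
        "path": list(sys.path),
    }


if __name__ == "__main__":
    report = interpreter_report()
    print("executable:", report["executable"])
    print("version:", report["version"])
    for entry in report["path"]:
        print("  ", entry)
```

Running this once from the shell and once from cron should make the mismatch visible at a glance. The fix stated in the post is then to call the intended interpreter by its absolute path in the crontab entry; the exact path depends on the installation.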

2018-09-06

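python: UDP-based sockets and using the socketserver module

Each recvfrom(x) corresponds to exactly one sendto(y). Once x bytes have been received the call is complete, and if y > x the remaining data is lost. This means UDP never suffers from the "sticky packet" problem, but it can lose data and is unreliable. 2. The socketserver module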
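The behaviour described above can be observed on the loopback interface. This is a minimal sketch, assuming a Linux-style socket stack in which an oversized datagram is silently truncated (some platforms, such as Windows, raise an error instead); the function name udp_truncation_demo is hypothetical.

```python
import socket


def udp_truncation_demo(payload=b"0123456789", bufsize=4):
    """Send two datagrams and read each one with a small receive buffer."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
        receiver.settimeout(2.0)             # avoid hanging if a datagram is lost
        address = receiver.getsockname()

        sender.sendto(payload, address)      # y = len(payload) bytes
        first, _ = receiver.recvfrom(bufsize)  # x = bufsize bytes; the rest is dropped

        sender.sendto(b"next", address)
        second, _ = receiver.recvfrom(bufsize)  # a new datagram, not the leftovers
        return first, second
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(udp_truncation_demo())
```

The second recvfrom returns the second datagram rather than the unread tail of the first, which is the "no sticky packets, but data can be lost" property in one picture.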

2018-09-03

Python random module (getting random numbers): common methods and usage examples

random.random
random.random() generates a random floating-point number between 0 and 1: 0 <= n < 1.0

random.uniform
random.uniform(a, b) generates a random floating-point number within a given range; one argument is the upper bound and the other is the lower bound. If a < b, the result n satisfies a <= n <= b. If a > b, then b <= n <= a. Code:

    print(random.uniform(10, 20))
    print(random.uniform(20, 10))
    18.7356606526
    12.5798298022

random.randint
random.randint(a, b) generates an integer within a given range, where a is the lower bound and b is the upper bound: a <= n <= b. Code:

    print(random.randint(12, 20))  # the result n satisfies 12 <= n <= 20
    print(random.randint(20, 20))  # the result is always 20
    print(random.randint(20, 10))  # error: the lower bound must not exceed the upper bound

random.randrange
random.randrange([start], stop[, step]) returns a random number from the set that starts at start and increases by step within the given range. For example, random.randrange(10, 100, 2) picks a random number from the sequence [10, 12, 14, 16, ... 96, 98]. In terms of its result, random.randrange(10, 100, 2) is equivalent to random.choice(range(10, 100, 2)).

random.choice
random.choice returns a random element from a sequence. Its prototype is random.choice(sequence), where sequence is an ordered type. Note that sequence is not one specific type in Python but a general term for a family of types: list, tuple, and string are all sequences. See the data model chapter of the Python manual for details. Some examples of choice:

    print(random.choice("学习Python"))
    print(random.choice(["JGood", "is", "a", "handsome", "boy"]))
    print(random.choice(("Tuple", "List", "Dict")))

random.shuffle
random.shuffle(x) shuffles the elements of a list in place. For example:

    p = ["Python", "is", "powerful", "simple", "and so on..."]
    random.shuffle(p)
    print(p)
    ['powerful', 'simple', 'is', 'Python', 'and so on...']

random.sample
random.sample(sequence, k) returns a random slice of the given length from the given sequence. sample does not modify the original sequence. Code:

    numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    picked = random.sample(numbers, 5)  # take 5 random elements from numbers and return them as a new list
    print(picked)
    print(numbers)  # the original sequence is unchanged

Random integer:

    import random
    random.randint(0, 99)
    21

Random even number between 0 and 100:

    import random
    random.randrange(0, 101, 2)
    42

Random floating-point numbers:

    import random
    random.random()
    0.85415370477785668
    random.uniform(1, 10)
    5.4221167969800881

Random character:

    import random
    random.choice('abcdefg%^*f')
    'd'

Pick a given number of characters from several characters:

    import random
    random.sample('abcdefghij', 3)
    ['a', 'd', 'b']

Pick a given number of characters from several characters and join them into a new string:

    import random
    "".join(random.sample(['a','b','c','d','e','f','g','h','i','j'], 3))
    'fih'

Pick a random string:

    import random
    random.choice(['apple', 'pear', 'peach', 'orange', 'lemon'])
    'lemon'

Shuffle:

    import random
    items = [1, 2, 3, 4, 5, 6]
    random.shuffle(items)
    items
    [3, 2, 5, 6, 4, 1]

2018-08-07

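Python full stack: MySQL database (engines, transactions, the pymysql module, ORM)

Start a transaction: begin; (autocommit is disabled from this point). Commit the transaction: commit; Abort the transaction: rollback; Interacting from Python: for python3, use the pymysql module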
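The begin / commit / rollback cycle can be tried without a MySQL server. The sketch below deliberately swaps in Python's built-in sqlite3 module instead of pymysql so that it runs anywhere; the table name account and the function name transfer_demo are hypothetical, and MySQL-specific behaviour (storage engines, autocommit settings) is not modelled.

```python
import sqlite3


def transfer_demo():
    """Run one committed and one rolled-back transaction, then return the rows."""
    # isolation_level=None puts the connection in autocommit mode, so the
    # explicit BEGIN / COMMIT / ROLLBACK statements below control transactions.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    cur = conn.cursor()
    cur.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    cur.execute("INSERT INTO account VALUES ('a', 100), ('b', 0)")

    # Transaction 1: both updates become visible together at COMMIT.
    cur.execute("BEGIN")
    cur.execute("UPDATE account SET balance = balance - 30 WHERE name = 'a'")
    cur.execute("UPDATE account SET balance = balance + 30 WHERE name = 'b'")
    cur.execute("COMMIT")

    # Transaction 2: the update is discarded by ROLLBACK.
    cur.execute("BEGIN")
    cur.execute("UPDATE account SET balance = 0 WHERE name = 'a'")
    cur.execute("ROLLBACK")

    rows = cur.execute("SELECT name, balance FROM account ORDER BY name").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(transfer_demo())
```

Only the committed transfer survives; the rolled-back update leaves no trace.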

2018-08-02

Node event loop, EventEmitter, asynchronous I/O, and the Buffer module

    \Desktop\test> node main.js
    Connection succeeded
    Data received successfully
    Program finished
    PS C:\Users\mingm\Desktop\test>

    // import the events module

2018-07-25

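Eloquent JavaScript, 3rd edition (Chinese translation), chapter 10: Modules

Modules: modules are an attempt to avoid these problems. A module is a piece of a program that specifies which other pieces it relies on and which functionality it provides for other modules to use (its interface). Module interfaces have a lot in common with object interfaces, as we saw in Chapter 6.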

2018-05-07

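Introducing TensorFlow Hub: a library of reusable machine learning modules in TensorFlow

We often program by taking building blocks or modules from libraries and connecting them together. How do developers use libraries? Besides sharing code, we can also share pretrained models.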

2018-04-03

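JarsLink, a Java-based modular development framework

Benefits of modular development. Pluggable: an application is composed of multiple modules, which can be split apart and combined, and modules can be migrated and deployed quickly across systems. With modular development, modules are isolated from one another, which provides fault isolation. Each module has its own branch, so code conflicts do not arise.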

2018-03-21

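Advanced JavaScript (1): fundamentals of modular JavaScript development

//1. The original approach //The m1 and m2 below together form a module //Drawbacks: it "pollutes" the global namespace, it cannot guarantee that variable names do not clash with other modules, and the relationship between the module's members is not apparent.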

2018-02-02

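Android Studio - Issue 48: the ViewPager+Fragment module

I have been reviewing 撸撸's code lately and came across a very nice way of writing a module, so I have pulled it out on its own in the hope that it helps you. What do you do when you run into a page like this? Surely you would not put all the code into a single page; I imagine whoever reads your code would curse you for it. And if different versions need to use different positions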

2017-11-16

Exploring OpenStack (17): the data collection mechanisms in the metering module Ceilometer

This article describes the data collection mechanisms in Ceilometer. Ceilometer collects data in three ways:

Notifications: Ceilometer receives notification messages emitted by other OpenStack services.
Polling: data is fetched directly from the hypervisor, from the host machine via SNMP, or through the APIs of other OpenStack services.
RESTful API: other applications create samples through Ceilometer's REST API.

1. Notifications

1.1 Notifications processed by Ceilometer

Every OpenStack service emits a notification when it performs certain operations or when state changes. Some notification messages contain data needed for metering; Ceilometer processes these messages and converts them into samples. The table below lists the notifications from each service that Ceilometer currently processes (reference: http://docs.openstack.org/admin-guide-cloud/content/section_telemetry-notifications.html):

OpenStack service | Event types | Note
OpenStack Compute | scheduler.run_instance.scheduled, scheduler.select_destinations, compute.instance.* | For a more detailed list of Compute notifications please check the System Usage Data wiki page.
Bare metal module for OpenStack | hardware.ipmi.* |
OpenStack Image Service | image.update, image.upload, image.delete, image.send | The required configuration for Image service can be found in the Configure the Image Service for Telemetry section in the OpenStack Installation Guide.
OpenStack Networking | floatingip.create.end, floatingip.update.*, floatingip.exists, network.create.end, network.update.*, network.exists, port.create.end, port.update.*, port.exists, router.create.end, router.update.*, router.exists, subnet.create.end, subnet.update.*, subnet.exists, l3.meter |
Orchestration module | orchestration.stack.create.end, orchestration.stack.update.end, orchestration.stack.delete.end, orchestration.stack.resume.end, orchestration.stack.suspend.end |
OpenStack Block Storage | volume.exists, volume.create.*, volume.delete.*, volume.update.*, volume.resize.*, volume.attach.*, volume.detach.*, snapshot.exists, snapshot.create.*, snapshot.delete.*, snapshot.update.* | The required configuration for Block Storage service can be found in the Add the Block Storage service agent for Telemetry section in the OpenStack Installation Guide.
1.2 How Cinder volume notifications are emitted

In Cinder, the notify_about_volume_usage function in /cinder/volume/util.py calls oslo.message to emit notification messages about volume usage:

    def notify_about_volume_usage(context, volume, event_suffix,
                                  extra_usage_info=None, host=None):
        if not host:
            host = CONF.host

        if not extra_usage_info:
            extra_usage_info = {}

        usage_info = _usage_from_volume(context, volume, **extra_usage_info)

        rpc.get_notifier("volume", host).info(context, 'volume.%s' % event_suffix,
                                              usage_info)

The original post includes a figure (not reproduced here) showing where this function is called. It shows that:

cinder-api on the controller node emits Info-level volume.update.* notifications
cinder-scheduler on the controller node emits Info-level volume.create.* notifications
cinder-volume on the volume node emits the other Info-level volume.*.* notifications

Now look at when a notification is emitted, taking volume.update.* as an example:

    @wsgi.serializers(xml=VolumeTemplate)
    def update(self, req, id, body):
        """Update a volume."""
        ...
        try:
            volume = self.volume_api.get(context, id, viewable_admin_meta=True)
            # emit the volume.update.start notification before the update begins
            volume_utils.notify_about_volume_usage(context, volume, 'update.start')
            self.volume_api.update(context, volume, update_dict)
        except exception.NotFound:
            msg = _("Volume could not be found")
            raise exc.HTTPNotFound(explanation=msg)

        volume.update(update_dict)
        utils.add_visible_admin_metadata(volume)
        # emit the volume.update.end notification after the update finishes
        volume_utils.notify_about_volume_usage(context, volume, 'update.end')
        return self._view_builder.detail(req, volume)

Next, how the notification driver sends a notification:

    # /oslo/messaging/notify/_impl_messaging.py
    # The notification driver is selected by the cinder.conf option
    # notification_driver = cinder.openstack.common.notifier.rpc_notifier, which actually maps to
    # oslo.messaging.notify._impl_messaging:MessagingDriver (the mapping is defined in cinder/setup.cfg)
    def notify(self, ctxt, message, priority, retry):
        priority = priority.lower()
        for topic in self.topics:
            # The topic is notifications.info, so the message goes to the queue of the same name.
            # The exchange is the one named by the cinder.conf option control_exchange,
            # whose default value is openstack. The "notifications" part of the topic
            # comes from the option notification_topics=notifications.
            target = messaging.Target(topic='%s.%s' % (topic, priority))
            try:
                # Send a notify message on a topic
                self.transport._send_notification(target, ctxt, message,
                                                  version=self.version, retry=retry)
            except Exception:
                ...

Therefore, for Cinder's notifications to be emitted correctly and received by Ceilometer, cinder.conf on the controller node and on the cinder-volume nodes needs the following configuration:

    # The queue "notifications.info" is bound to the "cinder" exchange,
    # so cinder's notification messages must be sent to the "cinder" exchange.
    control_exchange = cinder
    # In some cases /oslo/messaging/notify/_impl_messaging.py does not exist
    # and has to be copied manually from elsewhere.
    notification_driver = cinder.openstack.common.notifier.rpc_notifier

Cinder emits notifications for other resources in the same way:

    81: rpc.get_notifier("volume", host).info(context, 'volume.%s' % event_suffix,
    113: rpc.get_notifier('snapshot', host).info(context, 'snapshot.%s' % event_suffix,
    129: rpc.get_notifier('replication', host).info(context, 'replication.%s' % suffix,
    145: rpc.get_notifier('replication', host).error(context, 'replication.%s' % suffix,
    174: rpc.get_notifier("consistencygroup", host).info(context,'consistencygroup.%s' % event_suffix,
    204: rpc.get_notifier("cgsnapshot", host).info(

At present, however, Ceilometer only processes volume and snapshot notification messages.

1.3 How Ceilometer processes volume notifications

Ceilometer fetches notification messages from the AMQP message queue "notifications.info". The queue name is set by the ceilometer.conf option notification_topics = notifications. Ceilometer converts each notification into a ceilometer event according to certain rules, and then into samples.

1.4 The full path from Cinder to Ceilometer

(1) cinder-* publishes a message with event type "volume.*.*" and topic "<topic>.<priority>" to the exchange of type topic named <service>.
(2) The exchange <service> and the queue "<topic>.<priority>" are bound with the routing key "<topic>.<priority>".
(3) The exchange forwards the notification message to the queue "<topic>.<priority>".
(4) ceilometer-agent-notification fetches the message from the queue "<topic>.<priority>".

For cinder:

<service> is "cinder". Note that cinder's default control exchange is "openstack", so it must be changed to "cinder" when ceilometer is used.
<topic> is "notifications", set by the cinder.conf option notification_topics=notifications.
<priority> is "info", which is hard-coded in the cinder code.

For the content of a notification message, see https://wiki.openstack.org/wiki/SystemUsageData

2. Polling

Ceilometer's polling mechanism uses three types of agent:

Compute agent
Central agent
IPMI agent

In the Kilo release, all of these agents belong to ceilometer-polling; the difference is that each type of agent uses different polling plug-ins (pollsters).

2.1 Central agent

This agent uses the REST APIs of the various OpenStack services to obtain information about OpenStack resources, and uses SNMP to obtain information about hardware resources. These resources include:

OpenStack Networking
OpenStack Object Storage
OpenStack Block Storage
Hardware resources via SNMP
Energy consumption metrics via Kwapi framework

The samples collected by this agent are sent over AMQP to the Ceilometer Collector or to an external system.

2.2 Compute agent

The compute agent is installed on a compute node and collects usage data for the virtual machines running there. It does so by calling the hypervisor SDK. The hypervisors supported so far are:

Kernel-based Virtual Machine (KVM)
Quick Emulator (QEMU)
Linux Containers (LXC)
User-mode Linux (UML)
Hyper-V
XEN
VMWare vSphere

Besides virtual machines, this agent can also collect CPU data for the compute node itself. This requires setting the compute_monitors option in nova.conf to ComputeDriverCPUMonitor.

2.3 IPMI agent

The IPMI agent collects IPMI sensor data and Intel Node Manager data on compute nodes.

3. Creating samples with the Ceilometer REST API

    $ ceilometer sample-create -r 37128ad6-daaa-4d22-9509-b7e1c6b08697 -m memory.usage --meter-type gauge --meter-unit MB --sample-volume 48
    +-------------------+--------------------------------------------+
    | Property          | Value                                      |
    +-------------------+--------------------------------------------+
    | message_id        | 6118820c-2137-11e4-a429-08002715c7fb       |
    | name              | memory.usage                               |
    | project_id        | e34eaa91d52a4402b4cb8bc9bbd308c1           |
    | resource_id       | 37128ad6-daaa-4d22-9509-b7e1c6b08697       |
    | resource_metadata | {}                                         |
    | source            | e34eaa91d52a4402b4cb8bc9bbd308c1:openstack |
    | timestamp         | 2014-08-11T09:10:46.358926                 |
    | type              | gauge                                      |
    | unit              | MB                                         |
    | user_id           | 679b0499e7a34ccb9d90b64208401f8e           |
    | volume            | 48.0                                       |
    +-------------------+--------------------------------------------+

4. Collecting Neutron bandwidth samples

This feature was added in the Havana release. Unlike Ceilometer's other collection methods, bandwidth data is collected by neutron-meter-agent and pushed to oslo-messaging; ceilometer-agent-notification receives the bandwidth information by listening on the message queue. The data is collected at the L3 router level, so the operator has to configure IP ranges and set labels. For example, we can add two labels, one for internal network traffic and one for external network traffic. Each label meters the traffic within a certain IP range. The bandwidth measurements for each label are then sent to the MQ and collected by Ceilometer. Reference links:

https://wiki.openstack.org/wiki/Neutron/Metering/Bandwidth
https://openstackr.wordpress.com/2014/05/23/bandwidth-monitoring-with-neutron-and-ceilometer/

5. Collecting samples from physical devices

5.1 Collecting device energy consumption data with kwapi

Sometimes we need to collect energy consumption data for the servers in an OpenStack cluster. kwapi is a project for collecting energy consumption information from physical machines; the agent-central component collects that information through the API that kwapi exposes. kwapi currently provides two types of metering data:

Energy (cumulative type): in kWh.
Power (gauge type): in watts.

The pollsters of the Ceilometer central agent call the kwapi API directly to obtain samples. Reference documents:

http://kwapi.readthedocs.org/en/latest/architecture.html
http://blog.zhaw.ch/icclab/collecting-energy-consumption-data-using-kwapi-in-openstack/
http://perso.ens-lyon.fr/laurent.lefevre/greendayslux/GreenDays_Rossigneux.pdf

5.2 Collecting hardware CPU, memory, and I/O information over SNMP

This feature was added in IceHouse. Reference document: http://www.cnblogs.com/smallcoderhujin/p/4150368.html

6. Collecting SDN samples with OpenDayLight

OpenDayLight is an open-source SDN solution. Its specification includes exposing REST API endpoints that provide some of the SDN's internal information, and the Ceilometer central agent collects information about network components through these APIs. The basic implementation:

The central agent does not call the OpenDayLight REST API directly; it implements a driver that does so.
The driver calls the REST API to collect statistics and returns volume, resource id, and metadata to the pollster.
The pollster is responsible for producing samples.

The implementation is in the \ceilometer\network\statistics directory of OpenStack. Reference links:

https://blueprints.launchpad.net/ceilometer/+spec/monitoring-network-from-opendaylight
https://wiki.openstack.org/wiki/Ceilometer/blueprints/monitoring-network

Summary figure: (image not included)

This article is reposted from SammyLiu's blog on cnblogs. Original link: http://www.cnblogs.com/sammyliu/p/4384470.html. To repost it, please contact the original author.
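The four routing steps in section 1.4 can be modelled in a few lines. This is only an in-memory sketch under stated assumptions: it uses no real AMQP broker or oslo.messaging call, it matches routing keys exactly (a real topic exchange also supports wildcards), and the names InMemoryExchange, notification_topic, and routing_demo are hypothetical.

```python
from collections import defaultdict, deque


def notification_topic(topic, priority):
    """Build the "<topic>.<priority>" name used as both queue name and routing key."""
    return "%s.%s" % (topic, priority.lower())


class InMemoryExchange:
    """Toy stand-in for an exchange: routes messages to queues by exact routing key."""

    def __init__(self, name):
        self.name = name
        self._bindings = defaultdict(list)  # routing key -> bound queues

    def bind(self, queue, routing_key):
        self._bindings[routing_key].append(queue)

    def publish(self, routing_key, message):
        """Deliver to every bound queue; return how many queues received it."""
        queues = self._bindings.get(routing_key, [])
        for queue in queues:
            queue.append(message)
        return len(queues)


def routing_demo():
    exchange = InMemoryExchange("cinder")        # <service> exchange
    queue = deque()                              # queue read by the notification agent
    key = notification_topic("notifications", "INFO")
    exchange.bind(queue, key)                    # step (2): bind queue with routing key

    # steps (1) and (3): publish, exchange forwards to the bound queue
    delivered = exchange.publish(key, {"event_type": "volume.update.start"})
    # a message with another priority has no bound queue in this sketch
    unrouted = exchange.publish(notification_topic("notifications", "error"),
                                {"event_type": "replication.error"})
    return key, delivered, unrouted, list(queue)  # step (4): consumer drains the queue


if __name__ == "__main__":
    print(routing_demo())
```

The sketch also illustrates why the exchange name matters: a message published to a different exchange object would never reach the queue bound here, which mirrors the control_exchange setting discussed above.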

2017-11-16

Spark data preprocessing: feature standardization and normalization modules

# We will also standardise our data as we have done so far when performing distance-based clustering.

    from time import time
    from pyspark.mllib.feature import StandardScaler

    standardizer = StandardScaler(True, True)
    t0 = time()
    standardizer_model = standardizer.fit(parsed_data_values)
    tt = time() - t0
    standardized_data_values = standardizer_model.transform(parsed_data_values)
    print("Data standardized in {} seconds".format(round(tt,3)))

    Data standardized in 9.54 seconds

We can now perform k-means clustering.

    from pyspark.mllib.clustering import KMeans

    t0 = time()
    clusters = KMeans.train(standardized_data_values, 80, maxIterations=10, runs=5, initializationMode="random")
    tt = time() - t0
    print("Data clustered in {} seconds".format(round(tt,3)))

    Data clustered in 137.496 seconds

kmeans demo

Excerpted from: http://spark.apache.org/docs/latest/api/python/pyspark.mllib.html#module-pyspark.mllib.feature

pyspark.mllib.feature module

Python package for feature in MLlib.

class pyspark.mllib.feature.Normalizer(p=2.0)
    Bases: pyspark.mllib.feature.VectorTransformer

    Normalizes samples individually to unit L^p norm.

    For any 1 <= p < float('inf'), normalizes samples using sum(abs(vector)^p)^(1/p) as norm. For p = float('inf'), max(abs(vector)) will be used as norm for normalization.

    Parameters: p: Normalization in L^p space, p = 2 by default.

        >>> v = Vectors.dense(range(3))
        >>> nor = Normalizer(1)
        >>> nor.transform(v)
        DenseVector([0.0, 0.3333, 0.6667])
        >>> rdd = sc.parallelize([v])
        >>> nor.transform(rdd).collect()
        [DenseVector([0.0, 0.3333, 0.6667])]
        >>> nor2 = Normalizer(float("inf"))
        >>> nor2.transform(v)
        DenseVector([0.0, 0.5, 1.0])

    New in version 1.2.0.

    transform(vector)
        Applies unit length normalization on a vector.
        Parameters: vector: vector or RDD of vector to be normalized.
        Returns: normalized vector. If the norm of the input is zero, it will return the input vector.
        New in version 1.2.0.
class pyspark.mllib.feature.StandardScalerModel(java_model)
    Bases: pyspark.mllib.feature.JavaVectorTransformer

    Represents a StandardScaler model that can transform vectors.

    New in version 1.2.0.

    mean
        Return the column mean values.
        New in version 2.0.0.

    setWithMean(withMean)
        Setter of the boolean which decides whether it uses mean or not.
        New in version 1.4.0.

    setWithStd(withStd)
        Setter of the boolean which decides whether it uses std or not.
        New in version 1.4.0.

    std
        Return the column standard deviation values.
        New in version 2.0.0.

    transform(vector)
        Applies standardization transformation on a vector.
        Note: In Python, transform cannot currently be used within an RDD transformation or action. Call transform directly on the RDD instead.
        Parameters: vector: Vector or RDD of Vector to be standardized.
        Returns: Standardized vector. If the variance of a column is zero, it will return default 0.0 for the column with zero variance.
        New in version 1.2.0.

    withMean
        Returns if the model centers the data before scaling.
        New in version 2.0.0.

    withStd
        Returns if the model scales the data to unit standard deviation.
        New in version 2.0.0.

class pyspark.mllib.feature.StandardScaler(withMean=False, withStd=True)
    Bases: object

    Standardizes features by removing the mean and scaling to unit variance using column summary statistics on the samples in the training set.

    Parameters:
        withMean: False by default. Centers the data with mean before scaling. It will build a dense output, so take care when applying to sparse input.
        withStd: True by default. Scales the data to unit standard deviation.

        >>> vs = [Vectors.dense([-2.0, 2.3, 0]), Vectors.dense([3.8, 0.0, 1.9])]
        >>> dataset = sc.parallelize(vs)
        >>> standardizer = StandardScaler(True, True)
        >>> model = standardizer.fit(dataset)
        >>> result = model.transform(dataset)
        >>> for r in result.collect(): r
        DenseVector([-0.7071, 0.7071, -0.7071])
        DenseVector([0.7071, -0.7071, 0.7071])
        >>> int(model.std[0])
        4
        >>> int(model.mean[0]*10)
        9
        >>> model.withStd
        True
        >>> model.withMean
        True

    New in version 1.2.0.

    fit(dataset)
        Computes the mean and variance and stores as a model to be used for later scaling.
        Parameters: dataset: The data used to compute the mean and variance to build the transformation model.
        Returns: a StandardScalerModel
        New in version 1.2.0.

class pyspark.mllib.feature.HashingTF(numFeatures=1048576)
    Bases: object

    Maps a sequence of terms to their term frequencies using the hashing trick.

    Note: The terms must be hashable (can not be dict/set/list...).

    Parameters: numFeatures: number of features (default: 2^20)

        >>> htf = HashingTF(100)
        >>> doc = "a a b b c d".split(" ")
        >>> htf.transform(doc)
        SparseVector(100, {...})

    New in version 1.2.0.

    indexOf(term)
        Returns the index of the input term.
        New in version 1.2.0.

    setBinary(value)
        If True, term frequency vector will be binary such that non-zero term counts will be set to 1 (default: False)
        New in version 2.0.0.

    transform(document)
        Transforms the input document (list of terms) to term frequency vectors, or transform the RDD of document to RDD of term frequency vectors.
        New in version 1.2.0.

class pyspark.mllib.feature.IDFModel(java_model)
    Bases: pyspark.mllib.feature.JavaVectorTransformer

    Represents an IDF model that can transform term frequency vectors.

    New in version 1.2.0.

    idf()
        Returns the current IDF vector.
        New in version 1.4.0.

    transform(x)
        Transforms term frequency (TF) vectors to TF-IDF vectors.
        If minDocFreq was set for the IDF calculation, the terms which occur in fewer than minDocFreq documents will have an entry of 0.
        Note: In Python, transform cannot currently be used within an RDD transformation or action. Call transform directly on the RDD instead.
        Parameters: x: an RDD of term frequency vectors or a term frequency vector
        Returns: an RDD of TF-IDF vectors or a TF-IDF vector
        New in version 1.2.0.

class pyspark.mllib.feature.IDF(minDocFreq=0)
    Bases: object

    Inverse document frequency (IDF).

    The standard formulation is used: idf = log((m + 1) / (d(t) + 1)), where m is the total number of documents and d(t) is the number of documents that contain term t.

    This implementation supports filtering out terms which do not appear in a minimum number of documents (controlled by the variable minDocFreq). For terms that are not in at least minDocFreq documents, the IDF is found as 0, resulting in TF-IDFs of 0.

    Parameters: minDocFreq: minimum of documents in which a term should appear for filtering

        >>> n = 4
        >>> freqs = [Vectors.sparse(n, (1, 3), (1.0, 2.0)),
        ...          Vectors.dense([0.0, 1.0, 2.0, 3.0]),
        ...          Vectors.sparse(n, [1], [1.0])]
        >>> data = sc.parallelize(freqs)
        >>> idf = IDF()
        >>> model = idf.fit(data)
        >>> tfidf = model.transform(data)
        >>> for r in tfidf.collect(): r
        SparseVector(4, {1: 0.0, 3: 0.5754})
        DenseVector([0.0, 0.0, 1.3863, 0.863])
        SparseVector(4, {1: 0.0})
        >>> model.transform(Vectors.dense([0.0, 1.0, 2.0, 3.0]))
        DenseVector([0.0, 0.0, 1.3863, 0.863])
        >>> model.transform([0.0, 1.0, 2.0, 3.0])
        DenseVector([0.0, 0.0, 1.3863, 0.863])
        >>> model.transform(Vectors.sparse(n, (1, 3), (1.0, 2.0)))
        SparseVector(4, {1: 0.0, 3: 0.5754})

    New in version 1.2.0.

    fit(dataset)
        Computes the inverse document frequency.
        Parameters: dataset: an RDD of term frequency vectors
        New in version 1.2.0.

class pyspark.mllib.feature.Word2Vec
    Bases: object

    Word2Vec creates vector representation of words in a text corpus. The algorithm first constructs a vocabulary from the corpus and then learns vector representation of words in the vocabulary. The vector representation can be used as features in natural language processing and machine learning algorithms.

    We used skip-gram model in our implementation and hierarchical softmax method to train the model. The variable names in the implementation match the original C implementation.

    For original C implementation, see https://code.google.com/p/word2vec/
    For research papers, see Efficient Estimation of Word Representations in Vector Space and Distributed Representations of Words and Phrases and their Compositionality.

        >>> sentence = "a b " * 100 + "a c " * 10
        >>> localDoc = [sentence, sentence]
        >>> doc = sc.parallelize(localDoc).map(lambda line: line.split(" "))
        >>> model = Word2Vec().setVectorSize(10).setSeed(42).fit(doc)

    Querying for synonyms of a word will not return that word:

        >>> syms = model.findSynonyms("a", 2)
        >>> [s[0] for s in syms]
        [u'b', u'c']

    But querying for synonyms of a vector may return the word whose representation is that vector:

        >>> vec = model.transform("a")
        >>> syms = model.findSynonyms(vec, 2)
        >>> [s[0] for s in syms]
        [u'a', u'b']

        >>> import os, tempfile
        >>> path = tempfile.mkdtemp()
        >>> model.save(sc, path)
        >>> sameModel = Word2VecModel.load(sc, path)
        >>> model.transform("a") == sameModel.transform("a")
        True
        >>> syms = sameModel.findSynonyms("a", 2)
        >>> [s[0] for s in syms]
        [u'b', u'c']
        >>> from shutil import rmtree
        >>> try:
        ...     rmtree(path)
        ... except OSError:
        ...     pass

    New in version 1.2.0.

    fit(data)
        Computes the vector representation of each word in vocabulary.
        Parameters: data: training data. RDD of list of string
        Returns: Word2VecModel instance
        New in version 1.2.0.

    setLearningRate(learningRate)
        Sets initial learning rate (default: 0.025).
        New in version 1.2.0.

    setMinCount(minCount)
        Sets minCount, the minimum number of times a token must appear to be included in the word2vec model's vocabulary (default: 5).
        New in version 1.4.0.

    setNumIterations(numIterations)
        Sets number of iterations (default: 1), which should be smaller than or equal to number of partitions.
        New in version 1.2.0.

    setNumPartitions(numPartitions)
        Sets number of partitions (default: 1). Use a small number for accuracy.
        New in version 1.2.0.

    setSeed(seed)
        Sets random seed.
        New in version 1.2.0.

    setVectorSize(vectorSize)
        Sets vector size (default: 100).
        New in version 1.2.0.

    setWindowSize(windowSize)
        Sets window size (default: 5).
        New in version 2.0.0.

class pyspark.mllib.feature.Word2VecModel(java_model)
    Bases: pyspark.mllib.feature.JavaVectorTransformer, pyspark.mllib.util.JavaSaveable, pyspark.mllib.util.JavaLoader

    class for Word2Vec model

    New in version 1.2.0.

    findSynonyms(word, num)
        Find synonyms of a word
        Parameters:
            word: a word or a vector representation of word
            num: number of synonyms to find
        Returns: array of (word, cosineSimilarity)
        Note: Local use only
        New in version 1.2.0.

    getVectors()
        Returns a map of words to their vector representations.
        New in version 1.4.0.

    classmethod load(sc, path)
        Load a model from the given path.
        New in version 1.5.0.

    transform(word)
        Transforms a word to its vector representation
        Note: Local use only
        Parameters: word: a word
        Returns: vector representation of word(s)
        New in version 1.2.0.

class pyspark.mllib.feature.ChiSqSelector(numTopFeatures=50, selectorType='numTopFeatures', percentile=0.1, fpr=0.05, fdr=0.05, fwe=0.05)
    Bases: object

    Creates a ChiSquared feature selector. The selector supports different selection methods: numTopFeatures, percentile, fpr, fdr, fwe.

    numTopFeatures chooses a fixed number of top features according to a chi-squared test.
    percentile is similar but chooses a fraction of all features instead of a fixed number.
    fpr chooses all features whose p-values are below a threshold, thus controlling the false positive rate of selection.
    fdr uses the Benjamini-Hochberg procedure to choose all features whose false discovery rate is below a threshold.
    fwe chooses all features whose p-values are below a threshold. The threshold is scaled by 1/numFeatures, thus controlling the family-wise error rate of selection.

    By default, the selection method is numTopFeatures, with the default number of top features set to 50.

        >>> data = sc.parallelize([
        ...     LabeledPoint(0.0, SparseVector(3, {0: 8.0, 1: 7.0})),
        ...     LabeledPoint(1.0, SparseVector(3, {1: 9.0, 2: 6.0})),
        ...     LabeledPoint(1.0, [0.0, 9.0, 8.0]),
        ...     LabeledPoint(2.0, [7.0, 9.0, 5.0]),
        ...     LabeledPoint(2.0, [8.0, 7.0, 3.0])
        ... ])
        >>> model = ChiSqSelector(numTopFeatures=1).fit(data)
        >>> model.transform(SparseVector(3, {1: 9.0, 2: 6.0}))
        SparseVector(1, {})
        >>> model.transform(DenseVector([7.0, 9.0, 5.0]))
        DenseVector([7.0])
        >>> model = ChiSqSelector(selectorType="fpr", fpr=0.2).fit(data)
        >>> model.transform(SparseVector(3, {1: 9.0, 2: 6.0}))
        SparseVector(1, {})
        >>> model.transform(DenseVector([7.0, 9.0, 5.0]))
        DenseVector([7.0])
        >>> model = ChiSqSelector(selectorType="percentile", percentile=0.34).fit(data)
        >>> model.transform(DenseVector([7.0, 9.0, 5.0]))
        DenseVector([7.0])

    New in version 1.4.0.

    fit(data)
        Returns a ChiSquared feature selector.
        Parameters: data: an RDD[LabeledPoint] containing the labeled dataset with categorical features. Real-valued features will be treated as categorical for each distinct value. Apply feature discretizer before using this function.
        New in version 1.4.0.

    setFdr(fdr)
        set FDR [0.0, 1.0] for feature selection by FDR. Only applicable when selectorType = "fdr".
        New in version 2.2.0.

    setFpr(fpr)
        set FPR [0.0, 1.0] for feature selection by FPR. Only applicable when selectorType = "fpr".
        New in version 2.1.0.

    setFwe(fwe)
        set FWE [0.0, 1.0] for feature selection by FWE. Only applicable when selectorType = "fwe".
        New in version 2.2.0.

    setNumTopFeatures(numTopFeatures)
        set numTopFeature for feature selection by number of top features. Only applicable when selectorType = "numTopFeatures".
        New in version 2.1.0.

    setPercentile(percentile)
        set percentile [0.0, 1.0] for feature selection by percentile. Only applicable when selectorType = "percentile".
        New in version 2.1.0.

    setSelectorType(selectorType)
        set the selector type of the ChisqSelector. Supported options: "numTopFeatures" (default), "percentile", "fpr", "fdr", "fwe".
        New in version 2.1.0.

class pyspark.mllib.feature.ChiSqSelectorModel(java_model)
    Bases: pyspark.mllib.feature.JavaVectorTransformer

    Represents a Chi Squared selector model.

    New in version 1.4.0.

    transform(vector)
        Applies transformation on a vector.
        Parameters: vector: Vector or RDD of Vector to be transformed.
        Returns: transformed vector.
        New in version 1.4.0.

class pyspark.mllib.feature.ElementwiseProduct(scalingVector)
    Bases: pyspark.mllib.feature.VectorTransformer

    Scales each column of the vector, with the supplied weight vector, i.e. the elementwise product.

        >>> weight = Vectors.dense([1.0, 2.0, 3.0])
        >>> eprod = ElementwiseProduct(weight)
        >>> a = Vectors.dense([2.0, 1.0, 3.0])
        >>> eprod.transform(a)
        DenseVector([2.0, 2.0, 9.0])
        >>> b = Vectors.dense([9.0, 3.0, 4.0])
        >>> rdd = sc.parallelize([a, b])
        >>> eprod.transform(rdd).collect()
        [DenseVector([2.0, 2.0, 9.0]), DenseVector([9.0, 6.0, 12.0])]

    New in version 1.5.0.

    transform(vector)
        Computes the Hadamard product of the vector.
        New in version 1.5.0.

This article is reposted from 张昺华-sky's blog on cnblogs. Original link: http://www.cnblogs.com/bonelee/p/7774142.html. To repost it, please contact the original author.
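The numbers in the Normalizer, StandardScaler, and IDF doctests quoted above can be reproduced without a Spark cluster. The following is a plain-Python sketch of the three formulas, not Spark's implementation; the function names lp_normalize, standardize, and idf_weights are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption chosen because it matches the documented output of ±0.7071.

```python
import math
import statistics


def lp_normalize(vector, p=2.0):
    """Scale a vector to unit L^p norm; a zero vector is returned unchanged."""
    if math.isinf(p):
        norm = max(abs(x) for x in vector)
    else:
        norm = sum(abs(x) ** p for x in vector) ** (1.0 / p)
    if norm == 0:
        return list(vector)
    return [x / norm for x in vector]


def standardize(rows, with_mean=True, with_std=True):
    """Column-wise centering and scaling; zero-variance columns map to 0.0."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]  # sample std (n - 1)
    result = []
    for row in rows:
        scaled = []
        for x, mean, std in zip(row, means, stds):
            if with_mean:
                x = x - mean
            if with_std:
                x = x / std if std != 0 else 0.0
            scaled.append(x)
        result.append(scaled)
    return result, means, stds


def idf_weights(term_freq_rows):
    """idf = log((m + 1) / (d(t) + 1)) for each term column t."""
    m = len(term_freq_rows)
    n_terms = len(term_freq_rows[0])
    doc_freq = [sum(1 for row in term_freq_rows if row[j] > 0) for j in range(n_terms)]
    return [math.log((m + 1) / (d + 1)) for d in doc_freq]


if __name__ == "__main__":
    print(lp_normalize([0.0, 1.0, 2.0], p=1))
    print(standardize([[-2.0, 2.3, 0.0], [3.8, 0.0, 1.9]])[0])
    tf = [[0.0, 1.0, 0.0, 2.0], [0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 0.0, 0.0]]
    idf = idf_weights(tf)
    print([round(f * w, 4) for f, w in zip(tf[1], idf)])
```

The term-frequency rows in the usage section are the dense form of the three vectors in the IDF doctest, so the last line corresponds to the documented DenseVector([0.0, 0.0, 1.3863, 0.863]).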

2017-11-15
