6、web爬虫讲解2—urllib库爬虫—基础使用—超时设置—自动模拟http请求
利用python系统自带的urllib库写简单爬虫 urlopen()获取一个URL的html源码read()读出html源码内容decode("utf-8")将字节转化成字符串 #!/usr/bin/env python # -*- coding:utf-8 -*- import urllib.request html = urllib.request.urlopen('http://edu.51cto.com/course/8360.html').read().decode("utf-8") print(html) <!DOCTYPE html> <html> <head> <meta charset="UTF-8"> <meta name="viewport" content="width=device-width, initial-scale=1"> <meta name="csrf-param" content="_csrf"> <meta name="csrf-token" content="X1...