Scrapy is installed, so where do I run it?
Suppose a Scrapy project directory is named X_Spider. In a shell, run cd X_Spider && scrapy list to see the available spiders. Suppose the list shows a spider named Spider_x; then scrapy crawl Spider_x runs that spider. You can also use the scrapy runspider command directly on the Spider_x.py file, without a project.
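For reference, the whole flow as one shell session (X_Spider and Spider_x are the hypothetical names from the answer above; the spiders/ path assumes Scrapy's default project layout):

cd X_Spider                                     # enter the project directory
scrapy list                                     # prints one spider name per line, e.g. Spider_x
scrapy crawl Spider_x                           # run a spider by name, with project settings applied
scrapy runspider X_Spider/spiders/Spider_x.py   # or run the file directly, no project needed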
Installation problems with the Python crawler Scrapy on Linux
My installation process:
sudo pip install scrapy
Simple enough. But when I ran the first crawler example,
scrapy crawl dmoz
it failed with this error:
AttributeError: 'module' object has no attribute 'Spider'
The fix, from http://stackoverflow.com/questions//attributeerror-module-object-has-no-attribute-spider , is to upgrade:
sudo pip install scrapy --upgrade
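The underlying cause: old Scrapy releases do not expose a top-level Spider class, while newer example code does import scrapy and subclasses scrapy.Spider. After the upgrade, the tutorial's dmoz spider looks roughly like this (reconstructed from the Scrapy 1.x tutorial, matching the two dmoz.org URLs crawled in the log further down; the parse body is the tutorial's simple page dump):

import scrapy

class DmozSpider(scrapy.Spider):  # scrapy.Spider only exists in newer releases
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/",
    ]

    def parse(self, response):
        # dump each fetched page into a local HTML file
        filename = response.url.split("/")[-2] + ".html"
        with open(filename, "wb") as f:
            f.write(response.body)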
After the above, the problem should be solved. But then I hit the next one: lxml would not install.
creating build/temp.linux-x86_64-2.7/src/lxml
x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -Isrc/lxml/includes -I/usr/include/python2.7 -c src/lxml/lxml.etree.c -o build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o -w
In file included from src/lxml/lxml.etree.c:320:0:
src/lxml/includes/etree_defs.h:14:31: fatal error: libxml/xmlversion.h: No such file or directory
#include "libxml/xmlversion.h"
compilation terminated.
Compile failed: command 'x86_64-linux-gnu-gcc' failed with exit status 1
creating tmp
cc -I/usr/include/libxml2 -c /tmp/xmlXPathInitM_KXBh.c -o tmp/xmlXPathInitM_KXBh.o
cc tmp/xmlXPathInitM_KXBh.o -lxml2 -o a.out
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
----------------------------------------
Rolling back uninstall of lxml
Command "/usr/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-F1ulO4/lxml/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-OMbiRQ-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-F1ulO4/lxml/
The fix, from http://stackoverflow.com/questions/5178416/pip-install-lxml-error , is to install the build dependencies:
sudo apt-get install python-dev libxml2-dev libxslt1-dev zlib1g-dev
With the dependencies installed, upgrade lxml:
sudo pip install lxml --upgrade
~/Code/python/tutorial$ sudo pip install lxml --upgrade
The directory '/home/beast/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/beast/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
/usr/local/lib/python2.7/dist-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:318: SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name Indication) extension to TLS is not available on this platform. This may cause the server to present an incorrect TLS certificate, which can cause validation failures. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#snimissingwarning.
  SNIMissingWarning
/usr/local/lib/python2.7/dist-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
Collecting lxml
  Downloading lxml-3.6.0.tar.gz (3.7MB)
    100% |████████████████████████████████| 3.7MB 213kB/s
Installing collected packages: lxml
  Found existing installation: lxml 3.3.3
    Uninstalling lxml-3.3.3:
      Successfully uninstalled lxml-3.3.3
  Running setup.py install for lxml ... done
Successfully installed lxml-3.6.0
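A quick sanity check, my addition rather than part of the original log, that Python now picks up the upgraded lxml:

python -c "from lxml import etree; print(etree.__version__)"   # should print 3.6.0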
Now the first crawler example runs:
beast@beast:~/Code/python/tutorial$ scrapy crawl dmoz
/usr/local/lib/python2.7/dist-packages/scrapy/settings/deprecated.py:26: ScrapyDeprecationWarning: You are using the following settings which are deprecated or obsolete (ask scrapy-users@googlegroups.com for alternatives):
    BOT_VERSION: no longer used (user agent defaults to Scrapy now)
  warnings.warn(msg, ScrapyDeprecationWarning)
2016-07-06 16:41:56 [scrapy] INFO: Scrapy 1.1.0 started (bot: tutorial)
2016-07-06 16:41:56 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'tutorial.spiders', 'SPIDER_MODULES': ['tutorial.spiders'], 'USER_AGENT': 'tutorial/1.0', 'BOT_NAME': 'tutorial'}
2016-07-06 16:41:56 [scrapy] INFO: Enabled extensions:
['scrapy.extensions.logstats.LogStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.corestats.CoreStats']
2016-07-06 16:41:56 [scrapy] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.chunked.ChunkedTransferMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2016-07-06 16:41:56 [scrapy] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2016-07-06 16:41:56 [scrapy] INFO: Enabled item pipelines:
[]
2016-07-06 16:41:56 [scrapy] INFO: Spider opened
2016-07-06 16:41:56 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2016-07-06 16:41:56 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2016-07-06 16:41:58 [scrapy] DEBUG: Crawled (200) <GET http://www.dmoz.org/Computers/Programming/Languages/Python/Books/> (referer: None)
2016-07-06 16:41:58 [scrapy] DEBUG: Crawled (200) <GET http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/> (referer: None)
2016-07-06 16:41:58 [scrapy] INFO: Closing spider (finished)
2016-07-06 16:41:58 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 472,
 'downloader/request_count': 2,
 'downloader/request_method_count/GET': 2,
 'downloader/response_bytes': 16392,
 'downloader/response_count': 2,
 'downloader/response_status_count/200': 2,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2016, 7, 6, 8, 41, 58, 337488),
 'log_count/DEBUG': 3,
 'log_count/INFO': 7,
 'response_received_count': 2,
 'scheduler/dequeued': 2,
 'scheduler/dequeued/memory': 2,
 'scheduler/enqueued': 2,
 'scheduler/enqueued/memory': 2,
 'start_time': datetime.datetime(2016, 7, 6, 8, 41, 56, 777087)}
2016-07-06 16:41:58 [scrapy] INFO: Spider closed (finished)
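With the crawl working, individual pages can also be poked at interactively with scrapy shell (an illustrative session of mine, not from the original post):

scrapy shell "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/"
# then, at the interactive prompt, try a selector, e.g.:
# response.xpath('//title/text()').extract()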
Problems Encountered Installing Scrapy
pip was already installed on this machine, so I went straight to:
pip install scrapy
It threw a pile of errors. The gist: during installation, setuptools fetches a library's dependencies from PyPI over https. For that https connection it uses a newer openssl, but the certificates it loads are the ones bundled with the system's default, older openssl, so the request fails with "certificate verify failed".
You can see the trouble starts at the "Running setup.py install for cryptography" line (line 33 of the output), and setup.py install calls setuptools, so the certificate failure is indeed caused by setuptools.
I tried a fix found online:
pip install --cert /home/slvher/tools/https-ca/https/cacert.pem -U -i https://pypi.python.org/simple six
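To avoid retyping the --cert flag on every call, the same settings can go in a pip config file. A sketch, assuming the CA bundle path from the command above; the file location is pip's legacy per-user config path:

# ~/.pip/pip.conf
[global]
cert = /home/slvher/tools/https-ca/https/cacert.pem
index-url = https://pypi.python.org/simple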
Running that install instead raised ImportError: No module named pkg_resources, and after that even bare pip failed with the same error.
The pip and python versions no longer matched: I had upgraded python to 2.7, so I decided to reinstall the matching versions.
setuptools:
#wget --no-check-certificate https://pypi.python.org/packages/2.7/s/setuptools/setuptools-0.6c11-py2.7.egg
#chmod +x setuptools-0.6c11-py2.7.egg
#sh setuptools-0.6c11-py2.7.egg
That install in turn reported zipimport.ZipImportError: can't decompress data; zlib not available
The fix:
1. Install the zlib and zlib-devel dependencies.
2. Rebuild and reinstall Python:
./configure
Edit the Modules/Setup file, find the line below, and uncomment it:
#zlib zlibmodule.c -I$(prefix)/include -L$(exec_prefix)/lib -lz
Then rebuild and install: make && make install
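After the rebuild, two quick checks of mine (not from the original post): that zlib compiled in, and that the setuptools egg from before is now visible to pkg_resources:

python -c "import zlib; print(zlib.ZLIB_VERSION)"   # prints the zlib version Python was built against
python -c "import pkg_resources; print(pkg_resources.get_distribution('setuptools').version)"   # expect 0.6c11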
wget --no-check-certificate https://github.com/pypa/pip/archive/1.5.5.tar.gz
Note: when fetching https URLs with wget, add --no-check-certificate.
tar zvxf 1.5.5.tar.gz
cd pip-1.5.5/
python setup.py install
Running pip -v then raised yet another error: ImportError: cannot import name HTTPSHandler.
openssl-devel was not installed, only openssl. No way around it:
yum install -y openssl-devel
Then python had to be rebuilt again. I have reinstalled python several times by now.
Once python ./configure && make && make install finished, running pip gave:
Usage:
  pip <command> [options]

Commands:
  install      Install packages.
  uninstall    Uninstall packages.
  freeze       Output installed packages in requirements format.
  list         List installed packages.
Everything displayed normally, so time to try installing scrapy again:
pip install scrapy
Still problems, dependency trouble again:
ERROR: /bin/sh: xslt-config: command not found
** make sure the development packages of libxml2 and libxslt are installed **
  File "/usr/local/lib/python2.7/tarfile.py", line 1744, in bz2open
    raise CompressionError("bz2 module is not available")
CompressionError: bz2 module is not available
#yum -y install libxslt-devel bzip2-devel
Recompiled, and still the same CompressionError: bz2 module is not available. The advice on Stack Overflow:
sudo yum install bzip2-devel
and then rebuild python
Ugh. Fine, rebuild python yet again.
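Before going further, a one-line check of mine that the bz2 module now exists in the rebuilt interpreter:

python -c "import bz2"   # a silent exit means the module built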
Another pip install scrapy: the old problem was solved, but a new one appeared:
distutils.errors.DistutilsError: Setup script exited with error: command 'gcc' failed with exit status 1
#yum -y install libffi-devel
# pip install scrapy
Requirement already satisfied (use --upgrade to upgrade): scrapy in /usr/local/lib/python2.7/site-packages
Cleaning up...
No more errors.
The scrapy command existed now, but it was unusable and kept erroring:
Traceback (most recent call last):
  File "/usr/local/bin/scrapy", line 7, in <module>
    from scrapy.cmdline import execute
  File "/usr/local/lib/python2.7/site-packages/scrapy/__init__.py", line 27, in <module>
    from . import _monkeypatches
  File "/usr/local/lib/python2.7/site-packages/scrapy/_monkeypatches.py", line 20, in <module>
    import twisted.persisted.styles
  File "/usr/local/lib/python2.7/site-packages/twisted/__init__.py", line 53, in <module>
    _checkRequirements()
  File "/usr/local/lib/python2.7/site-packages/twisted/__init__.py", line 37, in _checkRequirements
    raise ImportError(required + ": no module named zope.interface.")
ImportError: Twisted requires zope.interface 3.6.0 or later: no module named zope.interface.
So the zope package needed installing too. Isn't installing setuptools and then twisted on linux supposed to pull in zope automatically from the network? Whatever, try installing it by hand:
wget http://pypi.python.org/packages/source/z/zope.interface/zope.interface-4.0.1.tar.gz
# tar xvf zope.interface-4.0.1.tar.gz
# cd zope.interface-4.0.1
# python setup.py install
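One more check of my own that the package imports cleanly before retrying scrapy:

python -c "import zope.interface; print('zope.interface ok')"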
# scrapy -h
Scrapy 1.0.3 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  bench         Run quick benchmark test
  fetch         Fetch a URL using the Scrapy downloader
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

  [ more ]      More commands available when run from project directory

Use "scrapy <command> -h" to see more info about a command
# scrapy startproject tutorial
2015-09-16 12:03:28 [scrapy] INFO: Scrapy 1.0.3 started (bot: scrapybot)
2015-09-16 12:03:28 [scrapy] INFO: Optional features available: ssl, http11
2015-09-16 12:03:28 [scrapy] INFO: Overridden settings: {}
New Scrapy project 'tutorial' created in:
/opt/tutorial
You can start your first spider with:
cd tutorial
scrapy genspider example example.com
Looks like it finally works. A little excited!
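Following the hint in that output, scrapy genspider example example.com drops a skeleton spider into tutorial/spiders/. Roughly what the default template generates (reconstructed from memory of the 1.x basic template, so treat it as a sketch):

import scrapy

class ExampleSpider(scrapy.Spider):
    name = "example"
    allowed_domains = ["example.com"]
    start_urls = ['http://example.com/']

    def parse(self, response):
        pass  # extraction logic goes here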
Author: saunix, a Linux system-ops engineer at a large internet company, serving as the designated firefighter.