Using a Scrapy pipeline without using settings.py config




I'm avoiding using the Scrapy boilerplate generator because my code will be integrated as part of a wider project.



My current project tree is like this:


/ test
|- items.py
|- pipelines.py
|- spider.py



My pipelines.py contains a pipeline that looks like this:

import pymongo

class MongoPipeline(object):
    collection_name = 'pages'
    [... rest of the pipeline class ...]



How can I use this class in spider.py without using a settings.py file and scrapy.conf?





I've tried importing the pipeline class and setting ITEM_PIPELINES in custom_settings but that throws ValueError: Error loading object 'MongoPipeline': not a full path:




from scrapy.spiders import CrawlSpider

from pipelines import MongoPipeline

class MySpider(CrawlSpider):
    name = 'x'
    allowed_domains = ['x']
    start_urls = ['x']

    custom_settings = {
        'ITEM_PIPELINES': {
            'MongoPipeline': 100
        }
    }

    def parse(self, response):
        [...]




1 Answer



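The key must be the full dotted import path to the pipeline class, not the bare class name. It should be:

custom_settings = {
    'ITEM_PIPELINES': {
        'YourProjectName.pipelines.MongoPipeline': 100
    }
}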
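To see why the bare class name is rejected, here is a minimal sketch of how a dotted path string can be resolved to a class. It is a simplified stand-in written for illustration, not Scrapy's actual source; the function name load_dotted_path is hypothetical, and collections.OrderedDict stands in for a pipeline class so the snippet runs with only the standard library.

```python
from importlib import import_module


def load_dotted_path(path):
    """Resolve a string such as 'package.module.Name' to the object it names."""
    # Without a dot there is no module part to import, so the string
    # cannot be resolved. The message mirrors the error in the question.
    if "." not in path:
        raise ValueError(f"Error loading object '{path}': not a full path")
    # Split on the last dot: everything before it is the module,
    # everything after it is the attribute to fetch from that module.
    module_name, _, attr_name = path.rpartition(".")
    module = import_module(module_name)
    return getattr(module, attr_name)


# A full dotted path resolves to the class.
ordered_dict_cls = load_dotted_path("collections.OrderedDict")

# A bare class name fails in the same way 'MongoPipeline' did.
try:
    load_dotted_path("OrderedDict")
except ValueError as exc:
    error_message = str(exc)
```

Under this reading, importing MongoPipeline in spider.py makes no difference: the setting is a string that gets resolved independently of the spider module's own imports.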






Thanks. It turned out it was only pipelines.MongoPipeline for me (or module.PipelineClass for the generic case). This is when you're not using a Scrapy project to run your spider.
– Juicy
Aug 12 at 9:28







@Juicy it's just the Python import path to your pipelines.py module
– gangabass
Aug 12 at 9:29
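Which prefix goes before the class name depends on what Python can import from where the spider runs. The sketch below illustrates this with the standard library only, assuming the directory holding pipelines.py is on sys.path (as it is when a script in that directory is run directly with python). The temporary directory and the stub pipeline class are hypothetical stand-ins for the real project files.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the project directory that holds pipelines.py.
project_dir = tempfile.mkdtemp()
Path(project_dir, "pipelines.py").write_text(
    "class MongoPipeline:\n"
    "    collection_name = 'pages'\n"
)

# Emulate running a script from that directory: Python puts the
# script's own directory at the front of sys.path.
sys.path.insert(0, project_dir)
importlib.invalidate_caches()

# With the directory on sys.path, the module is importable as plain
# 'pipelines', so the dotted path is 'pipelines.MongoPipeline'.
module = importlib.import_module("pipelines")
pipeline_cls = getattr(module, "MongoPipeline")
collection_name = pipeline_cls.collection_name
```

If pipelines.py instead lived inside a package, the package name would have to be part of the path, which is the 'YourProjectName.pipelines.MongoPipeline' form from the answer.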





