Storing data to SQL not working with my sql connector and scrapy

I am trying to store data scraped with Scrapy in a MySQL database, but my code does not write anything to it, and no error is reported when it runs.

I am using mysql-connector because I could not install MySQL-python. My MySQL database seems to be running fine, and when I run the code the traffic (KB/s) rises. Please find my pipelines.py code below.

import mysql.connector
from mysql.connector import errorcode

class CleaningPipeline(object):
    ...

class DatabasePipeline(object):

    def _init_(self):
        self.create_connection()
        self.create_table()

    def create_connection(self):
        self.conn = mysql.connector.connect(
            host = 'localhost',
            user = 'root',
            passwd = '********',
            database = 'lecturesinparis_db'
        )
        self.curr = self.conn.cursor()

    def create_table(self):
        self.curr.execute("""DROP TABLE IF EXISTS mdl""")
        self.curr.execute("""create table mdl(
                        title text,
                        location text,
                        startdatetime text,
                        lenght text,
                        description text,
                        )""")

    def process_item(self, item, spider):
        self.store_db(item)
        return item

    def store_db(self, item):
        self.curr.execute("""insert into mdl values (%s,%s,%s,%s,%s)""", (
            item['title'][0],
            item['location'][0],
            item['startdatetime'][0],
            item['lenght'][0],
            item['description'][0],
        ))
        self.conn.commit()


Answer

You need to add the class to ITEM_PIPELINES first, so that Scrapy knows you want to use this pipeline.

In your settings.py file, update the lines below with your class names as follows.

# https://docs.scrapy.org/en/latest/topics/item-pipeline.html
ITEM_PIPELINES = {
    'projectname.pipelines.CleaningPipeline': 700,
    'projectname.pipelines.DatabasePipeline': 800,
}

 

The numbers 700 and 800 set the order in which the pipelines process items. By convention they are integers in the 0-1000 range. Items pass through the pipelines from the lowest value to the highest, so the pipeline with 700 processes each item before the pipeline with 800.
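The ordering rule can be illustrated with a few lines of plain Python. This is only a sketch of the rule, not Scrapy's internal code, and pipeline_order is a hypothetical helper name introduced for the illustration.

```python
# Same setting as above, deliberately written in the "wrong" order to show
# that only the numbers matter, not the position in the dict.
ITEM_PIPELINES = {
    'projectname.pipelines.DatabasePipeline': 800,
    'projectname.pipelines.CleaningPipeline': 700,
}


def pipeline_order(item_pipelines):
    """Return the pipeline paths sorted from lowest to highest value."""
    return sorted(item_pipelines, key=item_pipelines.get)


order = pipeline_order(ITEM_PIPELINES)
print(order)
# ['projectname.pipelines.CleaningPipeline', 'projectname.pipelines.DatabasePipeline']
```

With these values, each item is cleaned first and stored afterwards.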

Note: Replace projectname in 'projectname.pipelines.CleaningPipeline' with your actual project name.
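Once the pipeline is enabled, two other details in the question's code are likely to cause errors. The constructor is written as _init_, which Python never calls automatically (it has to be __init__), and the CREATE TABLE statement has a comma after the last column, which MySQL rejects. Below is a minimal sketch of the same pipeline with both points changed. The connect argument and the _mysql_connect helper are hypothetical additions, not part of the original code; they are there only so the class can be tried without a running MySQL server.

```python
class DatabasePipeline:

    def __init__(self, connect=None):
        # Double underscores on both sides, otherwise Python never runs this.
        # `connect` is a hypothetical hook (not in the original code): any
        # callable returning a DB-API style connection.
        self._connect = connect or self._mysql_connect
        self.create_connection()
        self.create_table()

    @staticmethod
    def _mysql_connect():
        # Imported here so the module still loads where the driver is absent.
        import mysql.connector
        return mysql.connector.connect(
            host='localhost',
            user='root',
            passwd='********',
            database='lecturesinparis_db',
        )

    def create_connection(self):
        self.conn = self._connect()
        self.curr = self.conn.cursor()

    def create_table(self):
        self.curr.execute("""DROP TABLE IF EXISTS mdl""")
        # No comma after the last column definition.
        self.curr.execute("""create table mdl(
                        title text,
                        location text,
                        startdatetime text,
                        lenght text,
                        description text
                        )""")

    def process_item(self, item, spider):
        self.store_db(item)
        return item

    def store_db(self, item):
        self.curr.execute("""insert into mdl values (%s,%s,%s,%s,%s)""", (
            item['title'][0],
            item['location'][0],
            item['startdatetime'][0],
            item['lenght'][0],
            item['description'][0],
        ))
        self.conn.commit()
```

Scrapy creates the pipeline without arguments, so connect stays None and the real MySQL connection is used. The column name lenght is kept as in the question so that it still matches the item field.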
