Importing CSV Data into a PostgreSQL Table with an Array Type Column

Dealing with PostgreSQL database datatypes and tables can sometimes be challenging. In this tutorial, I will demonstrate how to populate a PostgreSQL table with a column of type Array from a CSV (comma-separated value) file. To accomplish this, we will utilize Python, Flask, SQLAlchemy, and pg-admin. It is assumed that you are already acquainted with the aforementioned tools.

To begin, let's create a Python class that represents the table structure using SQLAlchemy's declarative syntax:

from flask_sqlalchemy import SQLAlchemy
from sqlalchemy.dialects.postgresql import ARRAY

db = SQLAlchemy()

class Product(db.Model):
    __tablename__ = "product"

    id = db.Column(db.Integer, primary_key=True)
    itemName = db.Column(db.String(), unique=True, nullable=False)
    itemDescription = db.Column(db.Text, nullable=True)
    itemPrice = db.Column(db.Float(), nullable=False)
    itemColorCode = db.Column(ARRAY(db.String()))
    created_at = db.Column(db.DateTime, default=datetime.now())

Above, we have a code block where we establish a table schema called Product. This schema includes various columns of different data types, including a column named itemColorCode which is of the type ARRAY.

Following this, we will establish a connection to the database using SQLAlchemy within Flask.

from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'postgresql://username:password@host:port/database_name'
db = SQLAlchemy(app)

if __name__ == '__main__':
    with app.app_context():
        db.create_all()

Please replace the placeholders username, password, host, port, and database_name with the actual connection details for your PostgreSQL database, and then execute the file.

Depending on your setup, you can use commands like flask run or python app.py, with 'app.py' being the filename in this example. The product table will be created in the specified PostgreSQL database.

Now, let's take a look at our sample CSV file that contains the data we'll use to populate our table.

itemName,itemDescription,itemPrice,itemColorCode
Vank Shoes,Really nice piece of work,5000,#AC172D
Glossy Leather bag,classy formal office accessory for women,14000,#00EEDC

Upon closer examination of the code block above, it becomes apparent that the data in our itemColorCode column is in the form of a string. Importing this data directly into the table using pg-admin will result in an error because it expects an array of strings, not just a single string.

It's also important to note that PostgreSQL stores data of array type using curly braces. Therefore, our data needs to be properly formatted in the CSV file to match this structure before importing it.

For example, if we have a color code value like '#AC172D', it should be formatted as {#AC172D} in the CSV file to represent an array with a single string element.

By adhering to this formatting requirement, we ensure compatibility with PostgreSQL's array storage and avoid errors during the import process.

import csv

def convert_column_to_array(csv_file, column_name):
    # Read the CSV file and load its contents
    with open(csv_file, 'r') as file:
        reader = csv.reader(file)
        data = list(reader)

    # Convert the specified column to an array of strings
    for row in data:
        column_value =  row[column_name]
        converted_text =  '{' + column_value.stripstrip("''") + '}'
        row[column_index] = converted_text

    # Write the updated data back to the CSV file
    with open(csv_file, 'w', newline='') as file:
        writer = csv.writer(file)
        writer.writerows(data)

# Example usage
csv_file = 'example.csv'  # Replace with your CSV file path
column_name = 'itemColorCode'  # Replace with the name of the target column
convert_column_to_array(csv_file, column_name)

The resulting example.csv will look like the following:

itemName,itemDescription,itemPrice,itemColorCode
Vank Shoes,Really nice piece of work,5000,{#AC172D}
Glossy Leather bag,classy formal office accessory for women,14000,{#00EEDC}

To proceed with populating our 'product' table, we will follow these steps:

  1. Launch pg-admin and access the database

  2. In the schema tab, locate the 'product' table and right-click on it.

  3. Choose the 'import/export Data' option from the menu.

  4. In the General tab, specify the file location of our 'example.csv' in the 'filename' field. Select the 'CSV' format option. You can leave the 'encoding' field blank.

  5. Click ok. Congratulations!

If you followed the instructions correctly, you should now have a 'product' table populated with two rows. Well done!

I hope you found this tutorial helpful and learned something valuable. Happy coding until next time!