Integrating Pandas with SQL Databases

Diving into pandas and SQL integration opens up a world where data flows smoothly between your Python scripts and relational databases. Let’s get straight to the how-to.

From SQL Tables to Pandas DataFrames

Pulling data from a SQL database into pandas is straightforward. You’ll need SQLAlchemy to do the heavy lifting: create a connection engine, and then it’s all about the right query.

from sqlalchemy import create_engine
import pandas as pd

# Create a connection to your database
engine = create_engine('sqlite:///my_database.db')

# Load data into a DataFrame
df = pd.read_sql_query("SELECT * FROM my_table", engine)
print(df.head())

This snippet fetches everything from my_table and loads it into a pandas DataFrame, ready for all the slicing and dicing pandas offers.
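When the query needs user-supplied values, read_sql_query accepts a params argument so values are bound by the driver instead of pasted into the SQL string. Here is a minimal, self-contained sketch using an in-memory SQLite database; the table name and columns are made up for illustration:

```python
import pandas as pd
from sqlalchemy import create_engine, text

# In-memory SQLite database with a small sample table (hypothetical schema)
engine = create_engine("sqlite:///:memory:")
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE my_table (id INTEGER, city TEXT)"))
    conn.execute(text("INSERT INTO my_table VALUES (1, 'Oslo'), (2, 'Lima')"))

# Parameterized query: :city is bound safely rather than string-formatted
df = pd.read_sql_query(
    text("SELECT * FROM my_table WHERE city = :city"),
    engine,
    params={"city": "Oslo"},
)
print(df)
```

Binding parameters this way also protects you from SQL injection when the filter value comes from outside your script.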

Pushing DataFrames to SQL Databases

Got a DataFrame you want to save back into SQL? pandas’ to_sql method makes it easy. Decide if you’re appending to an existing table, replacing it, or making a new one.

# Save the DataFrame to a new SQL table
df.to_sql('new_table', engine, if_exists='replace', index=False)

This code replaces new_table with your DataFrame df, skipping the index to keep things clean.
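The three if_exists modes behave quite differently, so it’s worth seeing them side by side. This sketch uses an in-memory SQLite database and an invented table name:

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("sqlite:///:memory:")
df = pd.DataFrame({"name": ["a", "b"], "score": [1, 2]})

# 'fail' (the default) creates the table, but errors if it already exists
df.to_sql("scores", engine, if_exists="fail", index=False)

# 'append' adds the DataFrame's rows to the existing table
df.to_sql("scores", engine, if_exists="append", index=False)

# 'replace' drops the table and recreates it from the DataFrame
df.to_sql("scores", engine, if_exists="replace", index=False)

# After the replace, only the original two rows remain
print(pd.read_sql_query("SELECT COUNT(*) AS n FROM scores", engine))
```

Pick 'append' for incremental loads and 'replace' for full refreshes; leave the default 'fail' when overwriting data would be a bug.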

Working with big data? Use the chunksize parameter in both read_sql_query and to_sql to manage memory better: writes happen in batches, and reads return an iterator of DataFrames instead of one giant result set.
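As a rough sketch of chunked reads and writes (an in-memory SQLite database and a made-up table stand in for your real source):

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("sqlite:///:memory:")

# Simulate a large table with 10,000 rows
big = pd.DataFrame({"x": range(10_000)})

# Write in batches of 1,000 rows at a time
big.to_sql("big_table", engine, if_exists="replace", index=False, chunksize=1_000)

# With chunksize, read_sql_query yields DataFrames of up to 1,000 rows each,
# so only one chunk is in memory at a time
total = 0
for chunk in pd.read_sql_query("SELECT * FROM big_table", engine, chunksize=1_000):
    total += len(chunk)

print(total)
```

Tune the chunk size to your row width and available memory; there’s no universally right number.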
