Diving into pandas and SQL integration opens up a world where data flows smoothly between your Python scripts and relational databases. Let’s get straight to the how-to.
From SQL Tables to Pandas DataFrames
Pulling data from a SQL database into pandas is straightforward: SQLAlchemy does the heavy lifting. Create a connection engine, then it’s all about passing the right query.
```python
from sqlalchemy import create_engine
import pandas as pd

# Create a connection to your database
engine = create_engine('sqlite:///my_database.db')

# Load data into a DataFrame
df = pd.read_sql_query("SELECT * FROM my_table", engine)
print(df.head())
```
This snippet fetches everything from my_table and loads it into a pandas DataFrame, ready for all the slicing and dicing pandas offers.
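Often you won’t want everything. To illustrate the “right query” idea, here is a sketch of a filtered read using bind parameters, which keep user-supplied values out of the SQL string. The table name, schema, and data are hypothetical, built in-memory so the example runs on its own:

```python
from sqlalchemy import create_engine, text
import pandas as pd

# Hypothetical in-memory database with a small example table
engine = create_engine('sqlite:///:memory:')
pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "age": [25, 34, 41]}).to_sql(
    "my_table", engine, index=False)

# Bind parameters (:min_age) instead of string formatting — safer
# against SQL injection and handled by the database driver
query = text("SELECT * FROM my_table WHERE age > :min_age")
df = pd.read_sql_query(query, engine, params={"min_age": 30})
print(df)  # only the rows for Bob and Cy
```

Anything the user types goes into the `params` dict, never into the query string itself.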
Pushing DataFrames to SQL Databases
Got a DataFrame you want to save back into SQL? pandas’ to_sql method makes it easy. Its if_exists parameter decides what happens when the table is already there: 'append' adds your rows to it, 'replace' drops and recreates it, and 'fail' (the default) raises an error.
```python
# Save the DataFrame to a new SQL table
df.to_sql('new_table', engine, if_exists='replace', index=False)
```
This code replaces new_table with your DataFrame df, skipping the index to keep things clean.
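The 'append' mode is what you want for incremental loads. A minimal sketch, using a hypothetical in-memory events table: the first write creates the table, and later writes keep adding rows to it.

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('sqlite:///:memory:')

batch1 = pd.DataFrame({"id": [1, 2], "value": ["a", "b"]})
batch2 = pd.DataFrame({"id": [3, 4], "value": ["c", "d"]})

# 'append' creates the table if missing, otherwise adds rows
batch1.to_sql("events", engine, if_exists="append", index=False)
batch2.to_sql("events", engine, if_exists="append", index=False)

total = pd.read_sql_query("SELECT COUNT(*) AS n FROM events", engine)
print(total["n"][0])  # 4
```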
Working with big data? Pass the chunksize parameter: in read_sql_query it turns the result into an iterator of smaller DataFrames instead of one giant frame, and in to_sql it writes rows in batches, keeping memory use under control on both sides.
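Here is a small sketch of both directions, again with hypothetical table names and a tiny in-memory database standing in for your real one:

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('sqlite:///:memory:')

# Write in batches: chunksize limits rows sent per INSERT statement
pd.DataFrame({"x": range(10)}).to_sql(
    "big_table", engine, index=False, chunksize=3)

# Read back as an iterator of 3-row DataFrames instead of one big frame
chunks = pd.read_sql_query("SELECT * FROM big_table", engine, chunksize=3)
sizes = [len(chunk) for chunk in chunks]
print(sizes)  # [3, 3, 3, 1]
```

Each chunk is a normal DataFrame, so you can process and discard it before the next one arrives.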