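Parquet is a columnar storage format designed for efficient storage and retrieval of large datasets. Pandas can read and write Parquet files through an engine library such as pyarrow or fastparquet, one of which must be installed. This makes Parquet a good option for storing DataFrames.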
Reading Parquet Files
You can read a Parquet file with pd.read_parquet, which loads the data into a DataFrame.
Example (Reading a Parquet File)
import pandas as pd

# Replace with your file path
df = pd.read_parquet("my_data.parquet")
Writing Parquet Files
You can write a DataFrame to a Parquet file with df.to_parquet. Column names and data types are stored in the file, so they are generally preserved when the file is read back.
Example (Writing a DataFrame to Parquet)
import pandas as pd

data = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data)
df.to_parquet("my_data.parquet")
Benefits of Parquet
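Parquet stores data column by column rather than row by row. A query that needs only some of the columns can skip the rest, which makes such queries faster. Values within a column share a type and are often similar, so they compress well, which saves storage space. For large datasets, Parquet is often much faster to read and smaller on disk than CSV.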
For more advanced options, see the Parquet section of the pandas I/O documentation.