Lesson 20
Saving and Exporting Data¶
This lesson covers:
- Saving and reloading data
This first block loads the data that was used in the previous lesson.
In [ ]:
# Setup: Load the data to use later
import pandas as pd
gs10_csv = pd.read_csv("data/GS10.csv", index_col="DATE", parse_dates=True)
gs10_excel = pd.read_excel("data/GS10.xls", skiprows=10,
index_col="observation_date")
Problem: Export to Excel¶
Export gs10_csv
to the Excel file gs10-exported.xlsx
.
In [ ]:
Problem: Export to CSV¶
Export gs10_excel
to CSV.
In [ ]:
Problem: Export to HDF¶
Export both to a single HDF file (the closest thing to a "native" format in pandas).
In [ ]:
Problem: Import from HDF¶
Import the data saved as HDF and verify it is the same as the original data.
In [ ]:
In [ ]:
Exercises¶
Exercise: Import, export and verify¶
- Import the data in "data/fred-md.csv"
- Parse the dates and set the index column to "sasdate"
- Remove first row labeled "Transform:" (Hint: Transpose,
del
and transpose back, or usedrop
) - Re-parse the dates on the index
- Remove columns that have more than 10% missing values
- Save to "data/fred-md.h5" as HDF.
- Load the data into the variable
reloaded
and verify it is identical.
In [ ]:
Exercise: Looping Export¶
Export the columns RPI, INDPRO, and HWI from the FRED-MD data to "data/
variablename.csv"
so that, e.g., RPI is exported to data/RPI.csv
:
Note You need to complete the previous exercise first (or at least the first 4 steps).
In [ ]: