Data Analysis with pandas#
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
Import and Inspect Data#
Import a csv file with the function pd.read_csv. The file should be saved to the same directory as the current notebook:
df = pd.read_csv('wind.csv')
The data is imported as a pandas dataframe object which we think of as spreadsheet:
type(df)
pandas.core.frame.DataFrame
View the first 5 rows of the dataframe:
df.head()
| day | month | year | dayofyear | avg_wind_speed | |
|---|---|---|---|---|---|
| 0 | 13 | 4 | 2023 | 103 | 13.0 | 
| 1 | 12 | 4 | 2023 | 102 | 13.0 | 
| 2 | 11 | 4 | 2023 | 101 | 17.5 | 
| 3 | 10 | 4 | 2023 | 100 | 11.5 | 
| 4 | 9 | 4 | 2023 | 99 | 19.5 | 
View the last 5 rows of the dataframe:
df.tail()
| day | month | year | dayofyear | avg_wind_speed | |
|---|---|---|---|---|---|
| 9995 | 1 | 12 | 1995 | 335 | 19.5 | 
| 9996 | 30 | 11 | 1995 | 334 | 19.5 | 
| 9997 | 29 | 11 | 1995 | 333 | 25.0 | 
| 9998 | 28 | 11 | 1995 | 332 | 21.5 | 
| 9999 | 27 | 11 | 1995 | 331 | 17.5 | 
View a summary of the dataframe:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 5 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   day             10000 non-null  int64  
 1   month           10000 non-null  int64  
 2   year            10000 non-null  int64  
 3   dayofyear       10000 non-null  int64  
 4   avg_wind_speed  10000 non-null  float64
dtypes: float64(1), int64(4)
memory usage: 390.8 KB
Select Rows and Columns#
Create New Columns#
Aggregate Data#
df[['month','day','avg_wind_speed']].groupby(['month','day']).mean().unstack()
| avg_wind_speed | |||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| day | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | ... | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 
| month | |||||||||||||||||||||
| 1 | 14.625000 | 16.339286 | 16.714286 | 15.267857 | 15.267857 | 14.446429 | 16.642857 | 15.625000 | 15.910714 | 14.125000 | ... | 13.803571 | 14.160714 | 13.178571 | 11.964286 | 13.214286 | 12.071429 | 12.553571 | 15.142857 | 14.178571 | 13.892857 | 
| 2 | 13.642857 | 12.517857 | 12.964286 | 14.553571 | 17.410714 | 13.732143 | 14.000000 | 13.821429 | 10.660714 | 12.303571 | ... | 15.285714 | 15.410714 | 13.482143 | 14.285714 | 14.500000 | 12.803571 | 14.500000 | 15.357143 | NaN | NaN | 
| 3 | 15.142857 | 15.982143 | 14.785714 | 15.607143 | 15.357143 | 13.392857 | 13.392857 | 14.428571 | 15.303571 | 15.267857 | ... | 15.625000 | 16.267857 | 12.857143 | 14.678571 | 14.410714 | 16.125000 | 18.285714 | 16.303571 | 14.482143 | 16.214286 | 
| 4 | 15.250000 | 16.160714 | 15.589286 | 13.482143 | 15.142857 | 13.017857 | 14.821429 | 15.821429 | 14.321429 | 14.517857 | ... | 16.037037 | 14.962963 | 13.574074 | 14.370370 | 15.259259 | 15.833333 | 12.796296 | 14.388889 | 14.129630 | NaN | 
| 5 | 14.962963 | 14.962963 | 14.018519 | 13.796296 | 14.351852 | 13.777778 | 14.370370 | 14.648148 | 12.592593 | 11.518519 | ... | 13.185185 | 13.574074 | 13.833333 | 13.648148 | 13.296296 | 13.518519 | 14.277778 | 13.629630 | 13.740741 | 11.092593 | 
| 6 | 14.018519 | 13.444444 | 12.777778 | 13.907407 | 14.296296 | 11.888889 | 13.648148 | 13.759259 | 14.388889 | 12.574074 | ... | 14.092593 | 13.388889 | 12.425926 | 12.074074 | 14.555556 | 13.555556 | 13.537037 | 13.111111 | 12.462963 | NaN | 
| 7 | 15.611111 | 14.685185 | 13.500000 | 12.777778 | 13.462963 | 12.666667 | 14.777778 | 12.000000 | 12.166667 | 13.814815 | ... | 14.555556 | 14.462963 | 13.185185 | 12.722222 | 12.611111 | 12.444444 | 13.148148 | 12.407407 | 12.962963 | 12.555556 | 
| 8 | 15.000000 | 13.222222 | 11.907407 | 12.629630 | 13.407407 | 13.037037 | 12.777778 | 13.333333 | 11.351852 | 12.518519 | ... | 12.407407 | 11.111111 | 12.166667 | 12.055556 | 12.814815 | 13.333333 | 11.703704 | 13.981481 | 13.555556 | 12.981481 | 
| 9 | 12.462963 | 11.574074 | 11.777778 | 12.481481 | 11.500000 | 12.444444 | 12.833333 | 14.685185 | 14.055556 | 13.129630 | ... | 12.592593 | 12.388889 | 12.703704 | 13.407407 | 13.685185 | 13.166667 | 13.481481 | 14.148148 | 14.944444 | NaN | 
| 10 | 12.796296 | 13.314815 | 12.574074 | 12.907407 | 11.277778 | 11.629630 | 13.277778 | 14.314815 | 13.629630 | 15.425926 | ... | 12.629630 | 13.500000 | 12.703704 | 15.851852 | 13.185185 | 13.611111 | 14.722222 | 14.388889 | 14.851852 | 14.222222 | 
| 11 | 11.851852 | 12.870370 | 13.611111 | 15.703704 | 13.370370 | 15.870370 | 13.592593 | 13.333333 | 14.129630 | 12.518519 | ... | 13.962963 | 14.962963 | 15.555556 | 15.203704 | 14.611111 | 16.303571 | 14.500000 | 15.535714 | 15.553571 | NaN | 
| 12 | 15.625000 | 14.357143 | 12.589286 | 13.839286 | 12.017857 | 12.285714 | 13.428571 | 13.392857 | 14.107143 | 14.160714 | ... | 13.053571 | 13.160714 | 13.625000 | 13.375000 | 14.678571 | 14.982143 | 12.303571 | 14.160714 | 13.517857 | 12.357143 | 
12 rows × 31 columns