
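In Pandas, dtypes is short for data types: they define the type of data held in each column of a DataFrame (or in a Series). Knowing and choosing the right data type matters for how data is processed, how much memory it takes, and how fast operations run.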
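To make the storage point concrete, here is a minimal sketch comparing the memory footprint of the same values stored as object and as category. The `colors` data is illustrative and not taken from this article; exact byte counts depend on the pandas version and platform, so none are quoted here.

```python
import pandas as pd

# Illustrative data (an assumption, not from the article):
# 30,000 strings with only 3 distinct values.
colors = pd.Series(["red", "blue", "green"] * 10_000, dtype=object)

# deep=True also counts the Python string objects behind an object column.
as_object = colors.memory_usage(deep=True)
as_category = colors.astype("category").memory_usage(deep=True)

print(f"object:   {as_object} bytes")
print(f"category: {as_category} bytes")
```

With few distinct values repeated many times, the category version stores small integer codes plus one copy of each label, so it should come out considerably smaller.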
import pandas as pd
import numpy as np

# Build the data dictionary
data = {
    'int_col': [1, 2, np.nan],                 # integer values; the NaN forces float64
    'float_col': [1.1, 2.2, 3.3],              # float column
    'bool_col': [True, False, True],           # boolean column
    'str_col': ['apple', 'banana', 'cherry'],  # string column (stored as object)
    'cat_col': pd.Categorical(['red', 'blue', 'red'],
                              categories=['red', 'blue', 'green']),  # categorical column
    'datetime_col': pd.to_datetime(['2021-01-01', '2021-01-02', '2021-01-03']),  # datetime column
    'timedelta_col': pd.to_timedelta([1, 2, 3], unit='D'),  # timedelta column
    'complex_col': [1+2j, 3+4j, 5+6j],         # complex column
    'interval_col': pd.IntervalIndex([pd.Interval(0, 1), pd.Interval(2, 3), pd.Interval(4, 5)]),  # interval column
    # 'period_col': pd.PeriodIndex(["2021-01", "2021-02", "2021-03"], freq='M'),  # period column (marked as no longer supported)
    'index_col': [100, 101, 102],              # index-like values (a regular int64 column)
}

# Create the DataFrame
df = pd.DataFrame(data)

# Print the DataFrame, its summary, and its dtypes
print(df)
print(df.info())  # info() prints the summary itself and returns None, so "None" is printed too
print(df.dtypes)

   int_col  float_col  bool_col  str_col  cat_col  datetime_col  timedelta_col         complex_col  interval_col  index_col
0      1.0        1.1      True    apple      red    2021-01-01         1 days  1.000000+2.000000j        (0, 1]        100
1      2.0        2.2     False   banana     blue    2021-01-02         2 days  3.000000+4.000000j        (2, 3]        101
2      NaN        3.3      True   cherry      red    2021-01-03         3 days  5.000000+6.000000j        (4, 5]        102

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 10 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   int_col        2 non-null      float64
 1   float_col      3 non-null      float64
 2   bool_col       3 non-null      bool
 3   str_col        3 non-null      object
 4   cat_col        3 non-null      category
 5   datetime_col   3 non-null      datetime64[ns]
 6   timedelta_col  3 non-null      timedelta64[ns]
 7   complex_col    3 non-null      complex128
 8   interval_col   3 non-null      interval[int64]
 9   index_col      3 non-null      int64
dtypes: bool(1), category(1), complex128(1), datetime64[ns](1), float64(2), int64(1), interval(1), object(1), timedelta64[ns](1)
memory usage: 478.0+ bytes
None

int_col                  float64
float_col                float64
bool_col                    bool
str_col                   object
cat_col                 category
datetime_col      datetime64[ns]
timedelta_col    timedelta64[ns]
complex_col           complex128
interval_col     interval[int64]
index_col                  int64
dtype: object
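
The output above shows int_col stored as float64, because NaN cannot be held in a NumPy integer array. If integer semantics with missing values are wanted, one option is pandas' nullable "Int64" extension dtype. The sketch below assumes that extension dtype suits the data; the variable names `nullable` and `str_col` are illustrative, not part of the original example.

```python
import numpy as np
import pandas as pd

# Same values as int_col in the example above.
int_col = pd.Series([1, 2, np.nan])
print(int_col.dtype)          # float64

# Assumption: the nullable extension dtype "Int64" (capital I) fits the use case.
# The missing value becomes <NA> and the others become integers.
nullable = int_col.astype("Int64")
print(nullable.dtype)         # Int64
print(nullable.isna().sum())  # 1

# astype converts other columns too, for example strings to category.
str_col = pd.Series(["apple", "banana", "cherry"]).astype("category")
print(str_col.dtype)          # category
```

The conversion succeeds here because every non-missing value is a whole number; a float column with fractional values would raise an error instead of being silently truncated.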