Introduction to Timedelta
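A pandas Timedelta object represents the difference between two dates or times. It is a duration class whose value can be expressed in days, seconds, microseconds, or nanoseconds. Timedelta objects are common in time series analysis, where they are used to shift dates and times by an offset.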
Creating Timedelta values
The Timedelta type represents a time increment. Subtracting one fixed point in time from another produces a Timedelta:
| Operation | Input | Output |
| --- | --- | --- |
| Subtract two timestamps | pd.Timestamp('2020-11-01 15') - pd.Timestamp('2020-11-01 14') | Timedelta('0 days 01:00:00') |
| Subtract two timestamps (negative result) | pd.Timestamp('2020-11-01 08') - pd.Timestamp('2020-11-02 08') | Timedelta('-1 days +00:00:00') |
| One day | pd.Timedelta('1 days') | Timedelta('1 days 00:00:00') |
| One day, with an explicit time part | pd.Timedelta('1 days 00:00:00') | Timedelta('1 days 00:00:00') |
| One day and two hours | pd.Timedelta('1 days 2 hours') | Timedelta('1 days 02:00:00') |
| Negative duration | pd.Timedelta('-1 days 2 min 3us') | Timedelta('-2 days +23:57:59.999997') |
| Keyword arguments | pd.Timedelta(days=5, seconds=10) | Timedelta('5 days 00:00:10') |
| Keyword arguments | pd.Timedelta(minutes=3, seconds=2) | Timedelta('0 days 00:03:02') |
| Minutes normalized into days and hours | pd.Timedelta(minutes=3242) | Timedelta('2 days 06:02:00') |
| Alias: one day | pd.Timedelta('1D') | Timedelta('1 days 00:00:00') |
| Alias: two weeks | pd.Timedelta('2W') | Timedelta('14 days 00:00:00') |
| Alias: 1 day, 2 hours, 3 minutes, 4 seconds | pd.Timedelta('1D2H3M4S') | Timedelta('1 days 02:03:04') |
| Integer with a unit: one day | pd.Timedelta(1, unit='d') | Timedelta('1 days 00:00:00') |
| Integer with a unit: 100 seconds | pd.Timedelta(100, unit='s') | Timedelta('0 days 00:01:40') |
| Integer with a unit: 4 weeks | pd.Timedelta(4, unit='w') | Timedelta('28 days 00:00:00') |
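As a minimal sketch of the constructors above, the following builds the same 26-hour duration in three ways and inspects the components of a negative duration. The variable names are illustrative only and do not come from the original text.

```python
import pandas as pd

# Subtracting two timestamps yields a Timedelta.
one_hour = pd.Timestamp("2020-11-01 15:00") - pd.Timestamp("2020-11-01 14:00")

# Three equivalent ways to describe 1 day and 2 hours.
from_string = pd.Timedelta("1 days 2 hours")
from_kwargs = pd.Timedelta(days=1, hours=2)
from_number = pd.Timedelta(26, unit="h")

# A negative duration is stored as a negative day count plus a positive remainder.
negative = pd.Timedelta("-1 days 2 min 3us")

print(one_hour)
print(from_string == from_kwargs == from_number)
print(negative.days, negative.seconds, negative.microseconds)
```

The last line shows why the repr of a negative Timedelta reads "-2 days +23:57:59.999997": the day component is rounded down and the remainder is always non-negative.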
A Timedelta can also be built from Python's built-in datetime.timedelta or NumPy's np.timedelta64:
    import datetime
    import numpy as np
    import pandas as pd
| Operation | Input | Output |
| --- | --- | --- |
| One day and 10 minutes | pd.Timedelta(datetime.timedelta(days=1, minutes=10)) | Timedelta('1 days 00:10:00') |
| 100 nanoseconds | pd.Timedelta(np.timedelta64(100, 'ns')) | Timedelta('0 days 00:00:00.000000100') |
| Negative value | pd.Timedelta('-1min') | Timedelta('-1 days +23:59:00') |
| Missing value | pd.Timedelta('nan') | NaT |
| Missing value | pd.Timedelta('nat') | NaT |
| ISO 8601 duration string | pd.Timedelta('P0DT0H1M0S') | Timedelta('0 days 00:01:00') |
| ISO 8601 duration string | pd.Timedelta('P0DT0H0M0.000000123S') | Timedelta('0 days 00:00:00.000000123') |
| DateOffsets (Day, Hour, Minute, Second, Milli, Micro, Nano) | pd.Timedelta(pd.offsets.Second(2)) | Timedelta('0 days 00:00:02') |
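The conversions in this table can be checked with a short sketch like the one below. It assumes only the standard datetime module, NumPy, and pandas. The variable names are made up for illustration.

```python
import datetime

import numpy as np
import pandas as pd

# Convert from the standard library and from NumPy.
from_py = pd.Timedelta(datetime.timedelta(days=1, minutes=10))
from_np = pd.Timedelta(np.timedelta64(100, "ns"))

# Parse an ISO 8601 duration string.
from_iso = pd.Timedelta("P0DT0H1M0S")

# The strings 'nan' and 'nat' both produce the missing value NaT.
missing = pd.Timedelta("nat")

print(from_py.total_seconds())
print(from_np.nanoseconds)
print(from_iso == pd.Timedelta(minutes=1))
print(pd.isna(missing))
```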
Converting values with to_timedelta
| Operation | Input | Output |
| --- | --- | --- |
| Single duration with to_timedelta | pd.to_timedelta('1 days 06:05:01.00003') | Timedelta('1 days 06:05:01.000030') |
| Single duration with to_timedelta | pd.to_timedelta('15.5us') | Timedelta('0 days 00:00:00.000015500') |
| Single duration with to_timedelta | pd.to_timedelta(pd.offsets.Day(3)) | Timedelta('3 days 00:00:00') |
| Single duration with to_timedelta | pd.to_timedelta('15.5min') | Timedelta('0 days 00:15:30') |
| Single duration with to_timedelta (integer, read as nanoseconds) | pd.to_timedelta(124524564574835) | Timedelta('1 days 10:35:24.564574835') |
| List-like input produces a TimedeltaIndex | pd.to_timedelta(['1 days 06:05:01.00003', '15.5us', 'nan']) | TimedeltaIndex(['1 days 06:05:01.000030', '0 days 00:00:00.000015500', NaT], dtype='timedelta64[ns]', freq=None) |
| List-like input produces a TimedeltaIndex | pd.to_timedelta(np.arange(5), unit='s') | TimedeltaIndex(['00:00:00', '00:00:01', '00:00:02', '00:00:03', '00:00:04'], dtype='timedelta64[ns]', freq=None) |
| List-like input produces a TimedeltaIndex | pd.to_timedelta(np.arange(5), unit='d') | TimedeltaIndex(['0 days', '1 days', '2 days', '3 days', '4 days'], dtype='timedelta64[ns]', freq=None) |
| Lower bound (like Timestamp, Timedelta has limits) | pd.Timedelta.min | Timedelta('-106752 days +00:12:43.145224193') |
| Upper bound (like Timestamp, Timedelta has limits) | pd.Timedelta.max | Timedelta('106751 days 23:47:16.854775807') |
| Adding durations | pd.Timedelta(pd.offsets.Day(2)) + pd.Timedelta(pd.offsets.Second(2)) + pd.Timedelta('00:00:00.000123') | Timedelta('2 days 00:00:02.000123') |
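The sketch below illustrates the scalar and list-like behavior of to_timedelta described above. The errors="coerce" argument is an addition not covered in the original text. It is shown here on the assumption that invalid entries should become NaT rather than raise.

```python
import numpy as np
import pandas as pd

# A scalar input returns a single Timedelta.
single = pd.to_timedelta("15.5min")

# A list-like input returns a TimedeltaIndex.
index = pd.to_timedelta(np.arange(3), unit="s")

# With errors="coerce", unparseable entries become NaT (an assumption about
# the desired handling; the default is to raise an error).
coerced = pd.to_timedelta(["1 days", "not a duration"], errors="coerce")

print(single)
print(index)
print(coerced.isna().tolist())
```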
DataFrame operations:
    s = pd.Series(pd.date_range('2012-1-1', periods=3, freq='D'))  # three consecutive days
    td = pd.Series([pd.Timedelta(days=i) for i in range(3)])  # durations of 0, 1, and 2 days
    df = pd.DataFrame({'A': s, 'B': td})
|  | A | B |
| --- | --- | --- |
| 0 | 2012-01-01 | 0 days |
| 1 | 2012-01-02 | 1 days |
| 2 | 2012-01-03 | 2 days |
Addition: A + B
    df['C'] = df['A'] + df['B']
    df
|  | A | B | C |
| --- | --- | --- | --- |
| 0 | 2012-01-01 | 0 days | 2012-01-01 |
| 1 | 2012-01-02 | 1 days | 2012-01-03 |
| 2 | 2012-01-03 | 2 days | 2012-01-05 |
Column B has dtype timedelta64[ns], while the sum of a date and a duration is again a date:
    df.dtypes
    A     datetime64[ns]
    B    timedelta64[ns]
    C     datetime64[ns]
    dtype: object
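The dtype rules for this DataFrame can be sketched as follows: a date plus a duration gives a date, and a date minus a date gives a duration. Column D and the kinds dictionary are hypothetical additions for illustration. The check uses the NumPy dtype kind codes ('M' for datetime, 'm' for timedelta) so that it does not depend on the time resolution pandas chooses.

```python
import pandas as pd

dates = pd.Series(pd.date_range("2012-01-01", periods=3, freq="D"))
spans = pd.Series([pd.Timedelta(days=i) for i in range(3)])
frame = pd.DataFrame({"A": dates, "B": spans})

# datetime + timedelta -> datetime
frame["C"] = frame["A"] + frame["B"]
# datetime - datetime -> timedelta (recovers column B)
frame["D"] = frame["C"] - frame["A"]

# 'M' marks datetime64 columns, 'm' marks timedelta64 columns.
kinds = {name: dtype.kind for name, dtype in frame.dtypes.items()}
print(kinds)
```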
Series operations - subtracting the maximum
    s - s.max()  # subtract the latest date, so the earliest date is 2 days behind
    0   -2 days
    1   -1 days
    2    0 days
    dtype: timedelta64[ns]
Series operations - subtracting a date to get durations
    s - datetime.datetime(2011, 1, 1, 3, 5)
    0   364 days 20:55:00
    1   365 days 20:55:00
    2   366 days 20:55:00
    dtype: timedelta64[ns]
Series operations - adding a duration
    s + datetime.timedelta(minutes=5)
    0   2012-01-01 00:05:00
    1   2012-01-02 00:05:00
    2   2012-01-03 00:05:00
    dtype: datetime64[ns]
Series operations - offsetting by minutes
    s + pd.offsets.Minute(5)
    0   2012-01-01 00:05:00
    1   2012-01-02 00:05:00
    2   2012-01-03 00:05:00
    dtype: datetime64[ns]
Series operations - offsetting by minutes and milliseconds
    s + pd.offsets.Minute(5) + pd.offsets.Milli(5)
    0   2012-01-01 00:05:00.005
    1   2012-01-02 00:05:00.005
    2   2012-01-03 00:05:00.005
    dtype: datetime64[ns]
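The three ways of shifting a Series shown above (datetime.timedelta, a pandas offset, and pd.Timedelta) should give the same result. The sketch below checks this. The variable names are illustrative only.

```python
import datetime

import pandas as pd

s = pd.Series(pd.date_range("2012-01-01", periods=3, freq="D"))

# Shift every date by five minutes in three different ways.
with_stdlib = s + datetime.timedelta(minutes=5)
with_offset = s + pd.offsets.Minute(5)
with_pandas = s + pd.Timedelta(minutes=5)

# Compare the results element by element.
all_equal = bool(
    (with_stdlib == with_offset).all() and (with_offset == with_pandas).all()
)
print(all_equal)
```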
Series operations - subtracting a scalar timestamp
    y = s - s[0]
    y
    0   0 days
    1   1 days
    2   2 days
    dtype: timedelta64[ns]
Series operations - subtracting a shifted copy
    y = s - s.shift()
    y
    0      NaT
    1   1 days
    2   1 days
    dtype: timedelta64[ns]
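Subtracting a shifted copy gives the gap between consecutive rows, and the first row has no predecessor, so it is NaT. As a sketch, the same gaps can be obtained with Series.diff(), which is not used in the original text and is shown here only for comparison.

```python
import pandas as pd

s = pd.Series(pd.date_range("2012-01-01", periods=3, freq="D"))

# Gap to the previous row, computed explicitly with shift().
gaps_shift = s - s.shift()

# The same gaps via diff() (an alternative, not from the original text).
gaps_diff = s.diff()

print(gaps_shift)
print(gaps_diff)
```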
Series operations - the absolute value turns a negative duration into a positive one
    td1 = pd.Timedelta('-1 days 2 hours 3 seconds')
    abs(td1)
    Timedelta('1 days 02:00:03')
Series operations - building a Series of durations
    y2 = pd.Series(pd.to_timedelta(['-1 days +00:00:05', 'nat', '-1 days +00:00:05', '1 days']))
    y2
    0   -1 days +00:00:05
    1                 NaT
    2   -1 days +00:00:05
    3     1 days 00:00:00
    dtype: timedelta64[ns]
| Meaning | Method | Result |
| --- | --- | --- |
| Minimum | y2.min() | Timedelta('-1 days +00:00:05') |
| Maximum | y2.max() | Timedelta('1 days 00:00:00') |
| Index of the minimum | y2.idxmin() | 0 |
| Index of the maximum | y2.idxmax() | 3 |
| Mean | y2.mean() | Timedelta('-1 days +16:00:03.333333') |
| Median | y2.median() | Timedelta('-1 days +00:00:05') |
| Quantile | y2.quantile(.1) | Timedelta('-1 days +00:00:05') |
| Sum | y2.sum() | Timedelta('-1 days +00:00:10') |
Series operations - replacing NaT with a zero duration (0 days 00:00:00)
    y2.fillna(pd.Timedelta(0))
    0   -1 days +00:00:05
    1     0 days 00:00:00
    2   -1 days +00:00:05
    3     1 days 00:00:00
    dtype: timedelta64[ns]
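The reductions above skip NaT, so filling NaT with a zero duration changes the count and the mean but not the sum. The following sketch illustrates this on the same data. The name filled is illustrative.

```python
import pandas as pd

y2 = pd.Series(
    pd.to_timedelta(["-1 days +00:00:05", "nat", "-1 days +00:00:05", "1 days"])
)

# Replace the missing duration with a zero-length one.
filled = y2.fillna(pd.Timedelta(0))

# count() ignores NaT, so the filled Series has one more observation.
print(y2.count(), filled.count())

# The sum is unchanged, but the mean moves because it divides by 4 instead of 3.
print(y2.sum(), filled.sum())
print(y2.mean(), filled.mean())
```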
Series operations - summary of unit conversions
| Operation | Statement 1 | Statement 2 (loses some precision) |
| --- | --- | --- |
| Convert to days | td / np.timedelta64(1, 'D') | td.astype('timedelta64[D]') |
| Convert to seconds | td / np.timedelta64(1, 's') | td.astype('timedelta64[s]') |
| Convert to months | td / np.timedelta64(1, 'M') | td.astype('timedelta64[M]') |
| Convert to minutes | td / np.timedelta64(1, 'm') | td.astype('timedelta64[m]') |
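The conversions are demonstrated on the following Series of durations:
    december = pd.Series(pd.date_range('20121201', periods=4))
    january = pd.Series(pd.date_range('20130101', periods=4))
    td = january - december  # build a Series of durations
    # modify specific values
    td[2] += datetime.timedelta(minutes=5, seconds=3)
    td[3] = np.nan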
Series operations - durations - converting to days
    td / np.timedelta64(1, 'D')
    0    31.000000
    1    31.000000
    2    31.003507
    3          NaN
    dtype: float64
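After the conversion the dtype is floating point.
Converting with astype instead loses some precision. Note that this output reflects older pandas releases. Since pandas 2.0, astype supports only the 's', 'ms', 'us', and 'ns' timedelta units and returns timedelta data rather than floats.
    td.astype('timedelta64[D]')
    0    31.0
    1    31.0
    2    31.0
    3     NaN
    dtype: float64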
Series operations - durations - converting to seconds
    td / np.timedelta64(1, 's')
    0    2678400.0
    1    2678400.0
    2    2678703.0
    3          NaN
    dtype: float64
Series operations - durations - converting to seconds with astype
    td.astype('timedelta64[s]')
    0    2678400.0
    1    2678400.0
    2    2678703.0
    3          NaN
    dtype: float64
Series operations - durations - converting to months
    td / np.timedelta64(1, 'M')
    0    1.018501
    1    1.018501
    2    1.018617
    3         NaN
    dtype: float64
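As a sketch of the division-based conversions, the example below rebuilds the td Series and converts it to fractional days, total seconds, and whole days. It divides by pd.Timedelta and uses the .dt accessor rather than astype. These are alternatives chosen for this illustration, not the statements from the summary table.

```python
import numpy as np
import pandas as pd

december = pd.Series(pd.date_range("2012-12-01", periods=4))
january = pd.Series(pd.date_range("2013-01-01", periods=4))
td = january - december

# Lengthen one duration and mark another as missing.
td.loc[2] += pd.Timedelta(minutes=5, seconds=3)
td.loc[3] = pd.NaT

# Dividing by a unit duration keeps the fractional part.
days_float = td / pd.Timedelta(days=1)

# total_seconds() gives the whole duration in seconds as floats.
seconds = td.dt.total_seconds()

# The days component drops the time-of-day part (the lossy variant).
whole_days = td.dt.days

print(days_float)
print(seconds)
print(whole_days)
```

Division keeps the extra 5 minutes and 3 seconds as a fraction of a day, while the days component discards it. This is the same precision difference the summary table notes between the two kinds of statements.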