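Introduction to Timedelta
A pandas Timedelta object represents the difference between two dates or times. It is a duration class whose value can be expressed in days, seconds, microseconds, or nanoseconds. Timedelta objects are common in time series analysis, for example when shifting dates and times by an offset.
Creating Timedelta values
The Timedelta type represents a time increment. Subtracting one fixed point in time from another produces a Timedelta: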
| Operation | Input | Output |
| --- | --- | --- |
| Subtract two timestamps | pd.Timestamp('2020-11-01 15') - pd.Timestamp('2020-11-01 14') | Timedelta('0 days 01:00:00') |
| Subtract two timestamps (negative result) | pd.Timestamp('2020-11-01 08') - pd.Timestamp('2020-11-02 08') | Timedelta('-1 days +00:00:00') |
| String: one day | pd.Timedelta('1 days') | Timedelta('1 days 00:00:00') |
| String: one day with an explicit time part | pd.Timedelta('1 days 00:00:00') | Timedelta('1 days 00:00:00') |
| String: one day and two hours | pd.Timedelta('1 days 2 hours') | Timedelta('1 days 02:00:00') |
| String: negative duration with mixed units | pd.Timedelta('-1 days 2 min 3us') | Timedelta('-2 days +23:57:59.999997') |
| Keyword arguments | pd.Timedelta(days=5, seconds=10) | Timedelta('5 days 00:00:10') |
| Keyword arguments | pd.Timedelta(minutes=3, seconds=2) | Timedelta('0 days 00:03:02') |
| Minutes normalized to days and hours | pd.Timedelta(minutes=3242) | Timedelta('2 days 06:02:00') |
| Alias: one day | pd.Timedelta('1D') | Timedelta('1 days 00:00:00') |
| Alias: two weeks | pd.Timedelta('2W') | Timedelta('14 days 00:00:00') |
| Alias: one day, 2 hours, 3 minutes, 4 seconds | pd.Timedelta('1D2H3M4S') | Timedelta('1 days 02:03:04') |
| Integer with a unit: one day | pd.Timedelta(1, unit='d') | Timedelta('1 days 00:00:00') |
| Integer with a unit: 100 seconds | pd.Timedelta(100, unit='s') | Timedelta('0 days 00:01:40') |
| Integer with a unit: 4 weeks | pd.Timedelta(4, unit='w') | Timedelta('28 days 00:00:00') |
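The constructors in the table are different spellings of the same kind of value. A minimal sketch, assuming only that pandas is installed, checks that a string, keyword arguments, and an integer with a unit all describe the same duration (the variable names are illustrative):

```python
import pandas as pd

# Three spellings of the same duration: 1 day and 2 hours (26 hours).
from_string = pd.Timedelta('1 days 2 hours')
from_keywords = pd.Timedelta(days=1, hours=2)
from_integer = pd.Timedelta(26, unit='h')

# Timedelta values compare by duration, not by how they were written.
all_equal = from_string == from_keywords == from_integer

# total_seconds() returns the duration as a float number of seconds.
total_seconds = from_string.total_seconds()
print(all_equal, total_seconds)
```

Keyword arguments are usually the easiest form to read in code, while strings are convenient when the duration comes from a file or user input.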
A Timedelta can also be created from Python's built-in datetime.timedelta or NumPy's np.timedelta64:
import datetime
import numpy as np
import pandas as pd
| Operation | Input | Output |
| --- | --- | --- |
| One day and 10 minutes | pd.Timedelta(datetime.timedelta(days=1, minutes=10)) | Timedelta('1 days 00:10:00') |
| 100 nanoseconds | pd.Timedelta(np.timedelta64(100, 'ns')) | Timedelta('0 days 00:00:00.000000100') |
| Negative value | pd.Timedelta('-1min') | Timedelta('-1 days +23:59:00') |
| Missing value | pd.Timedelta('nan') | NaT |
| Missing value | pd.Timedelta('nat') | NaT |
| ISO 8601 duration string | pd.Timedelta('P0DT0H1M0S') | Timedelta('0 days 00:01:00') |
| ISO 8601 duration string | pd.Timedelta('P0DT0H0M0.000000123S') | Timedelta('0 days 00:00:00.000000123') |
| DateOffsets (Day, Hour, Minute, Second, Milli, Micro, Nano) | pd.Timedelta(pd.offsets.Second(2)) | Timedelta('0 days 00:00:02') |
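The conversions above also work in the other direction. The following sketch, which assumes a 90-second duration chosen purely for illustration, builds the same Timedelta from three source types, converts it back out of pandas, and shows how to test for a missing duration:

```python
import datetime

import numpy as np
import pandas as pd

# The same 90-second duration from three different source types.
from_py = pd.Timedelta(datetime.timedelta(seconds=90))
from_np = pd.Timedelta(np.timedelta64(90, 's'))
from_iso = pd.Timedelta('P0DT0H1M30S')  # ISO 8601 duration string
same_duration = from_py == from_np == from_iso

# Converting back out of pandas.
as_py = from_iso.to_pytimedelta()   # datetime.timedelta
as_np = from_iso.to_timedelta64()   # numpy.timedelta64
round_trip = pd.Timedelta(from_iso.isoformat()) == from_iso

# Missing durations are represented by NaT; test for them with pd.isna.
missing = pd.Timedelta('nat')
is_missing = pd.isna(missing)
```

Using pd.isna is the safer check for missing values, since a comparison such as missing == pd.NaT does not evaluate to True.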
Converting with to_timedelta
| Operation | Input | Output |
| --- | --- | --- |
| Single duration from a string | pd.to_timedelta('1 days 06:05:01.00003') | Timedelta('1 days 06:05:01.000030') |
| Single duration: fractional microseconds | pd.to_timedelta('15.5us') | Timedelta('0 days 00:00:00.000015500') |
| Single duration from a DateOffset | pd.to_timedelta(pd.offsets.Day(3)) | Timedelta('3 days 00:00:00') |
| Single duration: fractional minutes | pd.to_timedelta('15.5min') | Timedelta('0 days 00:15:30') |
| Single duration from an integer (interpreted as nanoseconds) | pd.to_timedelta(124524564574835) | Timedelta('1 days 10:35:24.564574835') |
| List-like input produces a TimedeltaIndex | pd.to_timedelta(['1 days 06:05:01.00003', '15.5us', 'nan']) | TimedeltaIndex(['1 days 06:05:01.000030', '0 days 00:00:00.000015500', NaT], dtype='timedelta64[ns]', freq=None) |
| List-like input produces a TimedeltaIndex | pd.to_timedelta(np.arange(5), unit='s') | TimedeltaIndex(['00:00:00', '00:00:01', '00:00:02', '00:00:03', '00:00:04'], dtype='timedelta64[ns]', freq=None) |
| List-like input produces a TimedeltaIndex | pd.to_timedelta(np.arange(5), unit='d') | TimedeltaIndex(['0 days', '1 days', '2 days', '3 days', '4 days'], dtype='timedelta64[ns]', freq=None) |
| Lower bound (like Timestamp, Timedelta has limits) | pd.Timedelta.min | Timedelta('-106752 days +00:12:43.145224193') |
| Upper bound (like Timestamp, Timedelta has limits) | pd.Timedelta.max | Timedelta('106751 days 23:47:16.854775807') |
| Adding durations | pd.Timedelta(pd.offsets.Day(2)) + pd.Timedelta(pd.offsets.Second(2)) + pd.Timedelta('00:00:00.000123') | Timedelta('2 days 00:00:02.000123') |
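Real input often contains entries that are not valid durations. As a hedged sketch (the `raw` list is made-up sample data), `errors='coerce'` asks to_timedelta to turn unparseable entries into NaT instead of raising an exception:

```python
import pandas as pd

# Hypothetical raw input: the last entry is not a valid duration.
raw = ['1 days 06:05:01', '15.5min', 'not a duration']

# errors='coerce' converts unparseable entries to NaT instead of raising.
durations = pd.to_timedelta(raw, errors='coerce')

# Boolean mask of the entries that parsed successfully.
valid_mask = durations.notna()

# Sum only the valid durations.
total = pd.Series(durations).dropna().sum()
print(valid_mask.tolist(), total)
```

Coercing silently hides bad rows, so it is worth inspecting the mask before trusting any aggregate built from the result.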
DataFrame operations:
s = pd.Series(pd.date_range('2012-1-1', periods=3, freq='D'))  # three consecutive dates
td = pd.Series([pd.Timedelta(days=i) for i in range(3)])  # durations of 0, 1, and 2 days
df = pd.DataFrame({'A': s, 'B': td})
df
|   | A | B |
| --- | --- | --- |
| 0 | 2012-01-01 | 0 days |
| 1 | 2012-01-02 | 1 days |
| 2 | 2012-01-03 | 2 days |
Addition: A + B
df['C'] = df['A'] + df['B']
df
|   | A | B | C |
| --- | --- | --- | --- |
| 0 | 2012-01-01 | 0 days | 2012-01-01 |
| 1 | 2012-01-02 | 1 days | 2012-01-03 |
| 2 | 2012-01-03 | 2 days | 2012-01-05 |
The duration column B has dtype timedelta64[ns], while A and C are datetime64[ns]:
df.dtypes
A datetime64[ns]
B timedelta64[ns]
C datetime64[ns]
dtype: object
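The dtype rules at work here are: datetime plus timedelta gives a datetime, and datetime minus datetime gives a timedelta. A small sketch, assuming the same three-row frame and adding a hypothetical column D for the reverse operation, checks this through the NumPy dtype kind rather than the exact resolution:

```python
import pandas as pd

s = pd.Series(pd.date_range('2012-01-01', periods=3, freq='D'))
td = pd.Series([pd.Timedelta(days=i) for i in range(3)])
df = pd.DataFrame({'A': s, 'B': td})

df['C'] = df['A'] + df['B']  # datetime + timedelta -> datetime
df['D'] = df['C'] - df['A']  # datetime - datetime -> timedelta

# NumPy dtype kinds: 'M' is datetime64, 'm' is timedelta64.
kinds = {col: df[col].dtype.kind for col in df.columns}
print(kinds)
```

Subtracting A from C recovers the original durations in B, which is a convenient sanity check after date arithmetic.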
Series operations: subtracting the maximum
s - s.max()  # subtract the latest date from every element
0 -2 days
1 -1 days
2 0 days
dtype: timedelta64[ns]
Series operations: subtracting a datetime to get durations
s - datetime.datetime(2011, 1, 1, 3, 5)
0 364 days 20:55:00
1 365 days 20:55:00
2 366 days 20:55:00
dtype: timedelta64[ns]
Series operations: adding a datetime.timedelta
s + datetime.timedelta(minutes=5)
0 2012-01-01 00:05:00
1 2012-01-02 00:05:00
2 2012-01-03 00:05:00
dtype: datetime64[ns]
Series operations: offsetting by minutes
s + pd.offsets.Minute(5)
0 2012-01-01 00:05:00
1 2012-01-02 00:05:00
2 2012-01-03 00:05:00
dtype: datetime64[ns]
Series operations: offsetting by minutes plus milliseconds
s + pd.offsets.Minute(5) + pd.offsets.Milli(5)
0 2012-01-01 00:05:00.005
1 2012-01-02 00:05:00.005
2 2012-01-03 00:05:00.005
dtype: datetime64[ns]
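The three ways of shifting a datetime Series shown above are interchangeable for fixed-length durations. A brief sketch, assuming the same three-day Series, confirms that they give identical results:

```python
import datetime

import pandas as pd

s = pd.Series(pd.date_range('2012-01-01', periods=3, freq='D'))

with_py = s + datetime.timedelta(minutes=5)  # standard-library timedelta
with_pd = s + pd.Timedelta(minutes=5)        # pandas Timedelta
with_offset = s + pd.offsets.Minute(5)       # pandas DateOffset

# Element-wise comparison of the three shifted Series.
same_result = bool((with_py == with_pd).all() and (with_pd == with_offset).all())
print(same_result)
```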
Series operations: subtracting a scalar timestamp
y = s - s[0]
y
0 0 days
1 1 days
2 2 days
dtype: timedelta64[ns]
Series operations: difference from the shifted Series
y = s - s.shift()
y
0 NaT
1 1 days
2 1 days
dtype: timedelta64[ns]
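Subtracting the shifted Series computes the gap between consecutive rows. As a sketch under the same assumptions, `Series.diff()` expresses the same computation more directly, and the `.dt` accessor extracts whole days from the result:

```python
import pandas as pd

s = pd.Series(pd.date_range('2012-01-01', periods=3, freq='D'))

gaps = s - s.shift()  # the first element has no predecessor, so it is NaT
gaps_diff = s.diff()  # shorthand for the same computation

# Whole days between consecutive rows, skipping the leading NaT.
gap_days = gaps.dropna().dt.days.tolist()
diff_days = gaps_diff.dropna().dt.days.tolist()
print(gap_days, diff_days)
```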
Absolute value: abs() turns a negative duration into a positive one
td1 = pd.Timedelta('-1 days 2 hours 3 seconds')
abs(td1)
# Timedelta('1 days 02:00:03')
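The printed form of a negative Timedelta, such as '-1 days +23:59:00' for minus one minute, can be surprising. The sketch below shows why: like datetime.timedelta, the value is normalized so that only the days field is negative and the seconds field stays non-negative.

```python
import pandas as pd

td = pd.Timedelta('-1min')

# Normalized fields: days is negative, seconds is always in [0, 86400).
days, seconds = td.days, td.seconds

# total_seconds() returns the signed duration as a single number.
signed_seconds = td.total_seconds()

# abs() and unary minus both give the positive duration here.
magnitude = abs(td)
negated = -td
print(days, seconds, signed_seconds, magnitude)
```

When the sign matters, total_seconds() is usually easier to reason about than the individual fields.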
Series operations: building a Series that contains a missing value
y2 = pd.Series(pd.to_timedelta(['-1 days +00:00:05',
'nat',
'-1 days +00:00:05',
'1 days']))
0 -1 days +00:00:05
1 NaT
2 -1 days +00:00:05
3 1 days 00:00:00
dtype: timedelta64[ns]
| Meaning | Method | Result |
| --- | --- | --- |
| Minimum | y2.min() | Timedelta('-1 days +00:00:05') |
| Maximum | y2.max() | Timedelta('1 days 00:00:00') |
| Index of the minimum | y2.idxmin() | 0 |
| Index of the maximum | y2.idxmax() | 3 |
| Mean | y2.mean() | Timedelta('-1 days +16:00:03.333333') |
| Median | y2.median() | Timedelta('-1 days +00:00:05') |
| Quantile | y2.quantile(.1) | Timedelta('-1 days +00:00:05') |
| Sum | y2.sum() | Timedelta('-1 days +00:00:10') |
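These aggregations skip NaT by default, which is why the mean in the table is the sum divided by three rather than four. A short sketch, reusing the y2 values from above, makes the counting explicit:

```python
import pandas as pd

y2 = pd.Series(pd.to_timedelta(['-1 days +00:00:05',
                                'nat',
                                '-1 days +00:00:05',
                                '1 days']))

n_total = len(y2)          # 4 rows, including the NaT
n_valid = int(y2.count())  # 3 rows, NaT excluded

total = y2.sum()

# mean() divides by the number of valid values, so this matches
# y2.mean() up to rounding of the last digits.
manual_mean = total / n_valid
print(n_total, n_valid, total, manual_mean)
```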
Series operations: replacing NaT with a zero duration (0 days 00:00:00)
y2.fillna(pd.Timedelta(0))
0 -1 days +00:00:05
1 0 days 00:00:00
2 -1 days +00:00:05
3 1 days 00:00:00
dtype: timedelta64[ns]
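Filling NaT with a zero duration is not neutral for every statistic. The sketch below, again reusing the y2 values, shows that the sum is unchanged while the number of counted values grows, so a mean computed afterwards would differ:

```python
import pandas as pd

y2 = pd.Series(pd.to_timedelta(['-1 days +00:00:05',
                                'nat',
                                '-1 days +00:00:05',
                                '1 days']))

filled = y2.fillna(pd.Timedelta(0))

remaining_missing = int(filled.isna().sum())

# Adding a zero duration does not change the total.
sum_unchanged = filled.sum() == y2.sum()

# The filled row now counts as a valid observation.
count_before, count_after = int(y2.count()), int(filled.count())
print(remaining_missing, sum_unchanged, count_before, count_after)
```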
Series operations: conversion summary
| Conversion | Statement 1 | Statement 2 (loses some precision) |
| --- | --- | --- |
| To days | td / np.timedelta64(1, 'D') | td.astype('timedelta64[D]') |
| To seconds | td / np.timedelta64(1, 's') | td.astype('timedelta64[s]') |
| To months | td / np.timedelta64(1, 'M') | td.astype('timedelta64[M]') |
| To minutes | td / np.timedelta64(1, 'm') | td.astype('timedelta64[m]') |
december = pd.Series(pd.date_range('20121201', periods=4))
january = pd.Series(pd.date_range('20130101', periods=4))
td = january - december # 生成时长序列
# modify specific values
td[2] += datetime.timedelta(minutes=5, seconds=3)
td[3] = np.nan
Series operations: duration converted to days
td / np.timedelta64(1, 'D')
0 31.000000
1 31.000000
2 31.003507
3 NaN
dtype: float64
The result of the division has a floating-point dtype.
Converting with astype loses some precision:
td.astype('timedelta64[D]')
0 31.0
1 31.0
2 31.0
3 NaN
dtype: float64
Series operations: duration converted to seconds
td / np.timedelta64(1, 's')
0 2678400.0
1 2678400.0
2 2678703.0
3 NaN
dtype: float64
Series operations: duration converted to seconds with astype
td.astype('timedelta64[s]')
0 2678400.0
1 2678400.0
2 2678703.0
3 NaN
dtype: float64
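Series operations: duration converted to months (the unit is an average month of roughly 30.44 days, so 31 days comes out as about 1.0185)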
td / np.timedelta64(1, 'M')
0 1.018501
1 1.018501
2 1.018617
3 NaN
dtype: float64
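The behaviour of astype on timedelta data may differ between pandas versions, so the following sketch sticks to division and the `.dt` accessor, which express the same conversions. It rebuilds the td Series from above; using pd.Timedelta(days=1) as the divisor is a choice made here, equivalent to np.timedelta64(1, 'D').

```python
import datetime

import pandas as pd

december = pd.Series(pd.date_range('20121201', periods=4))
january = pd.Series(pd.date_range('20130101', periods=4))
td = january - december  # four durations of 31 days

# Modify specific values, as in the text above.
td.iloc[2] += datetime.timedelta(minutes=5, seconds=3)
td.iloc[3] = pd.NaT

# Exact conversion: dividing by a unit duration gives floats.
days_exact = td / pd.Timedelta(days=1)

# Signed duration in seconds, also as floats.
seconds = td.dt.total_seconds()

# Whole days only: the sub-day remainder is dropped.
days_whole = td.dt.days

last_is_missing = bool(pd.isna(days_exact.iloc[3]))
print(days_exact.tolist(), seconds.tolist(), days_whole.tolist())
```

Division keeps the fractional part, whereas `.dt.days` keeps only the days component, which mirrors the precision loss noted for the astype approach.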