25 - Data types - pd.Timedelta() - Durations

Author: 贺及楼

Updated: 2024-08-14 11:32:17

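pd.Timedelta(): durations

Introduction to Timedelta

A pandas Timedelta object represents the difference between two dates or times. It is a duration that can be expressed in units such as days, seconds, microseconds, or nanoseconds. Timedelta objects are commonly used in time series analysis, including shifting dates and times by an offset.

Creating Timedelta values

The Timedelta type represents a time increment. Subtracting one fixed point in time from another produces one: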

Operation | Input | Output
Subtract two timestamps | pd.Timestamp('2020-11-01 15') - pd.Timestamp('2020-11-01 14') | Timedelta('0 days 01:00:00')
Subtract two timestamps (negative result) | pd.Timestamp('2020-11-01 08') - pd.Timestamp('2020-11-02 08') | Timedelta('-1 days +00:00:00')
String: one day | pd.Timedelta('1 days') | Timedelta('1 days 00:00:00')
String: one day with a time part | pd.Timedelta('1 days 00:00:00') | Timedelta('1 days 00:00:00')
String: one day and two hours | pd.Timedelta('1 days 2 hours') | Timedelta('1 days 02:00:00')
String: negative duration | pd.Timedelta('-1 days 2 min 3us') | Timedelta('-2 days +23:57:59.999997')
Keyword arguments | pd.Timedelta(days=5, seconds=10) | Timedelta('5 days 00:00:10')
Keyword arguments | pd.Timedelta(minutes=3, seconds=2) | Timedelta('0 days 00:03:02')
Minutes normalized into days and hours | pd.Timedelta(minutes=3242) | Timedelta('2 days 06:02:00')
Alias: one day | pd.Timedelta('1D') | Timedelta('1 days 00:00:00')
Alias: two weeks | pd.Timedelta('2W') | Timedelta('14 days 00:00:00')
Alias: 1 day, 2 hours, 3 minutes, 4 seconds | pd.Timedelta('1D2H3M4S') | Timedelta('1 days 02:03:04')
Integer with a unit: one day | pd.Timedelta(1, unit='d') | Timedelta('1 days 00:00:00')
Integer with a unit: 100 seconds | pd.Timedelta(100, unit='s') | Timedelta('0 days 00:01:40')
Integer with a unit: 4 weeks | pd.Timedelta(4, unit='w') | Timedelta('28 days 00:00:00')
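
The spellings in the table are interchangeable. The following minimal sketch, which assumes only that pandas is installed, checks a few of those equivalences and uses the components attribute to show why a negative duration is displayed as a negative day count plus a positive time. All variable names are illustrative.

```python
import pandas as pd

# Three spellings of the same one-day duration.
one_day_str = pd.Timedelta('1 days')
one_day_kw = pd.Timedelta(days=1)
one_day_unit = pd.Timedelta(1, unit='D')
same_day = one_day_str == one_day_kw == one_day_unit

# Two weeks written as an alias and as a keyword argument.
same_fortnight = pd.Timedelta('2W') == pd.Timedelta(weeks=2)

# A negative duration is normalized to "negative days + positive time".
neg = pd.Timedelta('-1 days 2 min 3us')
parts = neg.components  # named tuple: days, hours, minutes, seconds, ...

print(same_day, same_fortnight)
print(parts.days, parts.hours, parts.minutes, parts.seconds)
print(neg.total_seconds())
```

Reading the day and time fields together explains the displayed value: minus two days plus 23:57:59.999997 is the same instant as minus (one day, two minutes, three microseconds).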

A Timedelta can also be built from Python's built-in datetime.timedelta or NumPy's np.timedelta64, as well as from negative strings, missing-value markers, ISO 8601 duration strings, and DateOffsets:

    import datetime
    import numpy as np
    import pandas as pd

Operation | Input | Output
One day and 10 minutes | pd.Timedelta(datetime.timedelta(days=1, minutes=10)) | Timedelta('1 days 00:10:00')
100 nanoseconds | pd.Timedelta(np.timedelta64(100, 'ns')) | Timedelta('0 days 00:00:00.000000100')
Negative value | pd.Timedelta('-1min') | Timedelta('-1 days +23:59:00')
Missing value | pd.Timedelta('nan') | NaT
Missing value | pd.Timedelta('nat') | NaT
ISO 8601 duration string | pd.Timedelta('P0DT0H1M0S') | Timedelta('0 days 00:01:00')
ISO 8601 duration string (nanosecond precision) | pd.Timedelta('P0DT0H0M0.000000123S') | Timedelta('0 days 00:00:00.000000123')
DateOffsets (Day, Hour, Minute, Second, Milli, Micro, Nano) | pd.Timedelta(pd.offsets.Second(2)) | Timedelta('0 days 00:00:02')
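
The conversions in this table also work in the opposite direction. As a small sketch (variable names are illustrative, not from the original article), the code below round-trips a duration through the standard library and NumPy types, produces and re-parses an ISO 8601 string, and checks the missing-value marker.

```python
import datetime

import numpy as np
import pandas as pd

# Round trip through the standard library type.
py_td = datetime.timedelta(days=1, minutes=10)
td = pd.Timedelta(py_td)
back_to_py = td.to_pytimedelta()

# Round trip through the NumPy type keeps nanosecond precision.
np_td = np.timedelta64(100, 'ns')
back_to_np = pd.Timedelta(np_td).to_timedelta64()

# ISO 8601 duration strings can be produced and parsed again.
iso = pd.Timedelta(minutes=1).isoformat()
from_iso = pd.Timedelta(iso)

# 'nat' and 'nan' both give the missing-value marker NaT.
missing = pd.Timedelta('nat')

print(back_to_py == py_td, back_to_np == np_td)
print(iso, from_iso == pd.Timedelta(minutes=1))
print(pd.isna(missing))
```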

Single durations and lists with to_timedelta

Operation | Input | Output
Single duration | pd.to_timedelta('1 days 06:05:01.00003') | Timedelta('1 days 06:05:01.000030')
Single duration | pd.to_timedelta('15.5us') | Timedelta('0 days 00:00:00.000015500')
Single duration | pd.to_timedelta(pd.offsets.Day(3)) | Timedelta('3 days 00:00:00')
Single duration | pd.to_timedelta('15.5min') | Timedelta('0 days 00:15:30')
Single duration (a bare integer is read as nanoseconds) | pd.to_timedelta(124524564574835) | Timedelta('1 days 10:35:24.564574835')
List-like input produces a TimedeltaIndex | pd.to_timedelta(['1 days 06:05:01.00003', '15.5us', 'nan']) | TimedeltaIndex(['1 days 06:05:01.000030', '0 days 00:00:00.000015500', NaT], dtype='timedelta64[ns]', freq=None)
List-like input produces a TimedeltaIndex | pd.to_timedelta(np.arange(5), unit='s') | TimedeltaIndex(['0 days 00:00:00', '0 days 00:00:01', '0 days 00:00:02', '0 days 00:00:03', '0 days 00:00:04'], dtype='timedelta64[ns]', freq=None)
List-like input produces a TimedeltaIndex | pd.to_timedelta(np.arange(5), unit='d') | TimedeltaIndex(['0 days', '1 days', '2 days', '3 days', '4 days'], dtype='timedelta64[ns]', freq=None)
Lower bound (like Timestamp, Timedelta has limits) | pd.Timedelta.min | Timedelta('-106752 days +00:12:43.145224193')
Upper bound | pd.Timedelta.max | Timedelta('106751 days 23:47:16.854775807')
Adding durations | pd.Timedelta(pd.offsets.Day(2)) + pd.Timedelta(pd.offsets.Second(2)) + pd.Timedelta('00:00:00.000123') | Timedelta('2 days 00:00:02.000123')
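
Two details of to_timedelta are easy to miss: a bare integer is interpreted as nanoseconds, and unparseable input raises an error by default. The sketch below illustrates both, using the errors='coerce' option to turn bad entries into NaT. The sample strings are made up for the illustration.

```python
import pandas as pd

# A bare integer is read as nanoseconds unless a unit is given.
as_ns = pd.to_timedelta(124524564574835)
explicit_ns = pd.to_timedelta(124524564574835, unit='ns')

# errors='coerce' turns unparseable entries into NaT instead of raising.
raw = ['1 days 06:05:01', '15.5min', 'not a duration']
idx = pd.to_timedelta(raw, errors='coerce')

print(as_ns == explicit_ns)
print(idx)
print(idx.isna())
```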

DataFrame operations:

    s = pd.Series(pd.date_range('2012-1-1', periods=3, freq='D'))  # three consecutive days
    td = pd.Series([pd.Timedelta(days=i) for i in range(3)])       # durations of 0, 1 and 2 days
    df = pd.DataFrame({'A': s, 'B': td})
    df

               A      B
    0 2012-01-01 0 days
    1 2012-01-02 1 days
    2 2012-01-03 2 days

Addition: A + B

    df['C'] = df['A'] + df['B']
    df

               A      B          C
    0 2012-01-01 0 days 2012-01-01
    1 2012-01-02 1 days 2012-01-03
    2 2012-01-03 2 days 2012-01-05

Column B has dtype timedelta64[ns], while A and C are datetime64[ns]:

    df.dtypes

    A     datetime64[ns]
    B    timedelta64[ns]
    C     datetime64[ns]
    dtype: object
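
Once a column holds durations, its parts can be read through the .dt accessor. The following sketch rebuilds the same DataFrame and adds a few derived columns; the column names D, B_days and B_seconds are illustrative additions, not part of the original example.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_timedelta64_dtype

s = pd.Series(pd.date_range('2012-01-01', periods=3, freq='D'))
td = pd.Series([pd.Timedelta(days=i) for i in range(3)])
df = pd.DataFrame({'A': s, 'B': td})

df['C'] = df['A'] + df['B']   # datetime + timedelta -> datetime
df['D'] = df['C'] - df['A']   # datetime - datetime -> timedelta (recovers B)

# The .dt accessor exposes duration components on a timedelta column.
df['B_days'] = df['B'].dt.days
df['B_seconds'] = df['B'].dt.total_seconds()

print(df.dtypes)
```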

Series operations - subtract the maximum

    s - s.max()  # subtract the latest date (2012-01-03) from every element

    0   -2 days
    1   -1 days
    2    0 days
    dtype: timedelta64[ns]

Series operations - subtract a datetime to get durations

    s - datetime.datetime(2011, 1, 1, 3, 5)

    0   364 days 20:55:00
    1   365 days 20:55:00
    2   366 days 20:55:00
    dtype: timedelta64[ns]

Series operations - add a timedelta

    s + datetime.timedelta(minutes=5)

    0   2012-01-01 00:05:00
    1   2012-01-02 00:05:00
    2   2012-01-03 00:05:00
    dtype: datetime64[ns]

Series operations - offset by minutes

    s + pd.offsets.Minute(5)

    0   2012-01-01 00:05:00
    1   2012-01-02 00:05:00
    2   2012-01-03 00:05:00
    dtype: datetime64[ns]

Series operations - offset by minutes and milliseconds

    s + pd.offsets.Minute(5) + pd.offsets.Milli(5)

    0   2012-01-01 00:05:00.005
    1   2012-01-02 00:05:00.005
    2   2012-01-03 00:05:00.005
    dtype: datetime64[ns]
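
The last three examples produce the same kind of shift by different routes. As a hedged sketch, the code below checks that a fixed-length offset such as Minute(5) gives the same result as the equivalent pd.Timedelta or datetime.timedelta, and that chained offsets match a single combined duration.

```python
import datetime

import pandas as pd

s = pd.Series(pd.date_range('2012-01-01', periods=3, freq='D'))

via_offset = s + pd.offsets.Minute(5)
via_timedelta = s + pd.Timedelta(minutes=5)
via_stdlib = s + datetime.timedelta(minutes=5)

# Offsets chain the same way durations add.
chained = s + pd.offsets.Minute(5) + pd.offsets.Milli(5)
combined = s + pd.Timedelta(minutes=5, milliseconds=5)

print(via_offset.tolist() == via_timedelta.tolist() == via_stdlib.tolist())
print(chained.tolist() == combined.tolist())
```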

Series operations - subtract a scalar timestamp from a Series

    y = s - s[0]
    y

    0   0 days
    1   1 days
    2   2 days
    dtype: timedelta64[ns]

Series operations - difference from the shifted Series

    y = s - s.shift()
    y

    0      NaT
    1   1 days
    2   1 days
    dtype: timedelta64[ns]
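
Subtracting the shifted Series computes the gap between each element and the one before it. A minimal sketch of the same calculation with Series.diff(), followed by dropping the missing first gap, might look like this (the names gaps, via_diff and known_gaps are illustrative):

```python
import pandas as pd

s = pd.Series(pd.date_range('2012-01-01', periods=3, freq='D'))

gaps = s - s.shift()   # first element has no predecessor, so it is NaT
via_diff = s.diff()    # same calculation in one call

# Missing gaps can be dropped before further arithmetic.
known_gaps = gaps.dropna()

print(via_diff.equals(gaps))
print(known_gaps.tolist())
```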

Timedelta operations - abs() makes a negative duration positive

    td1 = pd.Timedelta('-1 days 2 hours 3 seconds')
    abs(td1)
    # Timedelta('1 days 02:00:03')
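
The same sign handling is available on a whole Series. The sketch below, with made-up sample values, compares abs() with unary negation on a scalar and then applies Series.abs() element by element.

```python
import pandas as pd

td1 = pd.Timedelta('-1 days 2 hours 3 seconds')
flipped = -td1                 # unary minus reverses the sign
same = abs(td1) == flipped     # for a negative duration, abs() equals negation

# Series.abs() applies the same rule element by element.
ser = pd.Series(pd.to_timedelta(['-1 days', '2 hours', '-30min']))
magnitudes = ser.abs()

print(same)
print(magnitudes.tolist())
```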

Series operations - build a Series

    y2 = pd.Series(pd.to_timedelta(['-1 days +00:00:05',
                                    'nat',
                                    '-1 days +00:00:05',
                                    '1 days']))
    y2

    0   -1 days +00:00:05
    1                 NaT
    2   -1 days +00:00:05
    3     1 days 00:00:00
    dtype: timedelta64[ns]

Meaning | Method | Result
Minimum | y2.min() | Timedelta('-1 days +00:00:05')
Maximum | y2.max() | Timedelta('1 days 00:00:00')
Index of the minimum | y2.idxmin() | 0
Index of the maximum | y2.idxmax() | 3
Mean | y2.mean() | Timedelta('-1 days +16:00:03.333333')
Median | y2.median() | Timedelta('-1 days +00:00:05')
Quantile | y2.quantile(.1) | Timedelta('-1 days +00:00:05')
Sum | y2.sum() | Timedelta('-1 days +00:00:10')
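
All of these reductions skip the NaT entry. The following sketch makes that visible by recomputing the mean from the sum and the count of valid values, and by comparing it with the mean obtained when NaT is treated as a zero-length duration. The helper names are illustrative.

```python
import pandas as pd

y2 = pd.Series(pd.to_timedelta(['-1 days +00:00:05', 'nat',
                                '-1 days +00:00:05', '1 days']))

valid = int(y2.count())        # NaT is not counted
total = y2.sum()               # NaT is skipped
manual_mean = total / valid    # should match y2.mean() up to rounding
filled_mean = y2.fillna(pd.Timedelta(0)).mean()  # NaT treated as zero instead

print(valid, total)
print(y2.mean(), manual_mean, filled_mean)
```

Filling NaT with zero changes the result because the divisor becomes four instead of three, so the choice between skipping and filling should be made deliberately.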

Series operations - replace NaT with 0 days 00:00:00

    y2.fillna(pd.Timedelta(0))

    0   -1 days +00:00:05
    1     0 days 00:00:00
    2   -1 days +00:00:05
    3     1 days 00:00:00
    dtype: timedelta64[ns]

Series operations - conversion summary

Operation | Statement 1 | Statement 2 (loses some precision)
Convert to days | td / np.timedelta64(1, 'D') | td.astype('timedelta64[D]')
Convert to seconds | td / np.timedelta64(1, 's') | td.astype('timedelta64[s]')
Convert to months | td / np.timedelta64(1, 'M') | td.astype('timedelta64[M]')
Convert to minutes | td / np.timedelta64(1, 'm') | td.astype('timedelta64[m]')

The examples that follow use this td Series:

    december = pd.Series(pd.date_range('20121201', periods=4))
    january = pd.Series(pd.date_range('20130101', periods=4))
    td = january - december  # Series of durations
    # modify selected values
    td[2] += datetime.timedelta(minutes=5, seconds=3)
    td[3] = np.nan

Series operations - durations - convert to days

    td / np.timedelta64(1, 'D')

    0    31.000000
    1    31.000000
    2    31.003507
    3          NaN
    dtype: float64

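The division returns float64 values.

Casting with astype loses some precision, because the fractional part of the day is dropped. The float64 output shown here reflects pandas releases before 2.0; later releases may keep a timedelta dtype or reject coarse units such as 'D':

    td.astype('timedelta64[D]')

    0    31.0
    1    31.0
    2    31.0
    3     NaN
    dtype: float64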
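
Because the result of astype depends on the pandas version, a version-independent route can be useful. The sketch below is one possible approach rather than the article's method: it rebuilds td, then divides by a pd.Timedelta for fractional days and uses the .dt accessor for the whole-day component and for total seconds.

```python
import pandas as pd

december = pd.Series(pd.date_range('2012-12-01', periods=4))
january = pd.Series(pd.date_range('2013-01-01', periods=4))
td = january - december
td.iloc[2] += pd.Timedelta(minutes=5, seconds=3)
td.iloc[3] = pd.NaT

fractional_days = td / pd.Timedelta(days=1)   # float64, keeps the fraction
whole_days = td.dt.days                       # day component only
seconds = td.dt.total_seconds()               # float64 seconds

print(fractional_days.round(6).tolist())
print(whole_days.tolist())
print(seconds.tolist())
```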

Series operations - durations - convert to seconds

    td / np.timedelta64(1, 's')

    0    2678400.0
    1    2678400.0
    2    2678703.0
    3          NaN
    dtype: float64

Series operations - durations - convert to seconds with astype

    td.astype('timedelta64[s]')

    0    2678400.0
    1    2678400.0
    2    2678703.0
    3          NaN
    dtype: float64

Series operations - durations - convert to months

    td / np.timedelta64(1, 'M')

    0    1.018501
    1    1.018501
    2    1.018617
    3         NaN
    dtype: float64
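
A month is not a fixed length of time, so the values above are approximate. They are consistent with dividing by an average month of 30.436875 days (365.2425 / 12), since 31 / 30.436875 is about 1.018501. The sketch below reproduces that arithmetic with an explicit constant, which is an assumption inferred from the numbers rather than a documented setting, and contrasts it with counting calendar months from the year and month fields.

```python
import pandas as pd

# Assumption: an "average month" of 365.2425 / 12 days = 2,629,746 seconds.
AVERAGE_MONTH = pd.Timedelta(seconds=2629746)

december = pd.Series(pd.date_range('2012-12-01', periods=4))
january = pd.Series(pd.date_range('2013-01-01', periods=4))
td = january - december

# Fixed-length approximation: 31 days is slightly more than one average month.
approx_months = td / AVERAGE_MONTH

# Calendar count: difference in (year, month) regardless of month length.
calendar_months = ((january.dt.year - december.dt.year) * 12
                   + (january.dt.month - december.dt.month))

print(approx_months.round(6).tolist())
print(calendar_months.tolist())
```

Which of the two is appropriate depends on whether the analysis needs elapsed time or calendar boundaries.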