python3.X - Data Analysis - Pandas
Author: 贺及楼
Updated: 2024-08-14 11:32:17
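The pandas Timedelta object represents the difference between two dates or times. It is a duration class whose value can be expressed in days, seconds, microseconds, or nanoseconds. Timedelta objects are commonly used in time series analysis, including shifting dates and times by an offset.
The Timedelta data type represents a time increment. Subtracting one fixed point in time from another yields a duration: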
Operation | Input | Output |
---|---|---|
Subtract two timestamps | pd.Timestamp('2020-11-01 15') - pd.Timestamp('2020-11-01 14') | Timedelta('0 days 01:00:00') |
Subtract two timestamps (negative result) | pd.Timestamp('2020-11-01 08') - pd.Timestamp('2020-11-02 08') | Timedelta('-1 days +00:00:00') |
One day | pd.Timedelta('1 days') | Timedelta('1 days 00:00:00') |
One day | pd.Timedelta('1 days 00:00:00') | Timedelta('1 days 00:00:00') |
One day and two hours | pd.Timedelta('1 days 2 hours') | Timedelta('1 days 02:00:00') |
Negative duration | pd.Timedelta('-1 days 2 min 3us') | Timedelta('-2 days +23:57:59.999997') |
Keyword arguments | pd.Timedelta(days=5, seconds=10) | Timedelta('5 days 00:00:10') |
Keyword arguments | pd.Timedelta(minutes=3, seconds=2) | Timedelta('0 days 00:03:02') |
Minutes normalized into days and hours | pd.Timedelta(minutes=3242) | Timedelta('2 days 06:02:00') |
Alias: one day | pd.Timedelta('1D') | Timedelta('1 days 00:00:00') |
Alias: two weeks | pd.Timedelta('2W') | Timedelta('14 days 00:00:00') |
Alias: 1 day, 2 hours, 3 minutes, 4 seconds | pd.Timedelta('1D2H3M4S') | Timedelta('1 days 02:03:04') |
Integer with a unit: one day | pd.Timedelta(1, unit='d') | Timedelta('1 days 00:00:00') |
Integer with a unit: 100 seconds | pd.Timedelta(100, unit='s') | Timedelta('0 days 00:01:40') |
Integer with a unit: 4 weeks | pd.Timedelta(4, unit='w') | Timedelta('28 days 00:00:00') |
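The constructor forms in the table are interchangeable ways of building the same value. The following is a minimal sketch (the variable names are illustrative, not from the original) that compares three spellings of one duration and inspects how a negative duration is stored:

```python
import pandas as pd

# Three spellings of the same duration: 1 day and 2 hours.
from_string = pd.Timedelta("1 days 2 hours")
from_kwargs = pd.Timedelta(days=1, hours=2)
from_number = pd.Timedelta(26, unit="h")
all_equal = (from_string == from_kwargs) and (from_kwargs == from_number)

# A negative duration is stored as negative days plus a positive remainder,
# which is why its repr reads '-2 days +23:57:59.999997'.
negative = pd.Timedelta("-1 days 2 min 3us")
total = negative.total_seconds()   # signed length in seconds
parts = negative.components        # namedtuple: days, hours, minutes, ...

print(all_equal)
print(total)
print(parts.days, parts.hours, parts.minutes)
```

When the sign matters more than the day/remainder split, `total_seconds()` is usually the easier value to reason about.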
import datetime
import numpy as np
import pandas as pd
Operation | Input | Output |
---|---|---|
One day and ten minutes (from datetime.timedelta) | pd.Timedelta(datetime.timedelta(days=1, minutes=10)) | Timedelta('1 days 00:10:00') |
100 nanoseconds (from np.timedelta64) | pd.Timedelta(np.timedelta64(100, 'ns')) | Timedelta('0 days 00:00:00.000000100') |
Negative value | pd.Timedelta('-1min') | Timedelta('-1 days +23:59:00') |
Missing value | pd.Timedelta('nan') | NaT |
Missing value | pd.Timedelta('nat') | NaT |
ISO 8601 duration string | pd.Timedelta('P0DT0H1M0S') | Timedelta('0 days 00:01:00') |
ISO 8601 duration string | pd.Timedelta('P0DT0H0M0.000000123S') | Timedelta('0 days 00:00:00.000000123') |
DateOffsets (Day, Hour, Minute, Second, Milli, Micro, Nano) | pd.Timedelta(pd.offsets.Second(2)) | Timedelta('0 days 00:00:02') |
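As a rough illustration of the conversions in this table, the sketch below round-trips a duration through the standard library and NumPy types and checks for a missing value. The variable names are assumptions made for the example.

```python
import datetime

import numpy as np
import pandas as pd

# Standard-library timedelta -> pandas Timedelta -> back again.
py_td = datetime.timedelta(days=1, minutes=10)
pd_td = pd.Timedelta(py_td)
round_trip_ok = pd_td.to_pytimedelta() == py_td

# NumPy timedelta64 -> pandas Timedelta -> back again.
np_td = np.timedelta64(100, "ns")
numpy_ok = pd.Timedelta(np_td).to_timedelta64() == np_td

# 'nat' parses to the missing-value marker NaT; test for it with pd.isna.
missing = pd.Timedelta("nat")
is_missing = pd.isna(missing)

# An ISO 8601 duration string equals the keyword-argument form.
iso_ok = pd.Timedelta("P0DT0H1M0S") == pd.Timedelta(minutes=1)

print(round_trip_ok, numpy_ok, is_missing, iso_ok)
```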
Operation | Input | Output |
---|---|---|
Single duration with to_timedelta | pd.to_timedelta('1 days 06:05:01.00003') | Timedelta('1 days 06:05:01.000030') |
Single duration with to_timedelta | pd.to_timedelta('15.5us') | Timedelta('0 days 00:00:00.000015500') |
Single duration with to_timedelta | pd.to_timedelta(pd.offsets.Day(3)) | Timedelta('3 days 00:00:00') |
Single duration with to_timedelta | pd.to_timedelta('15.5min') | Timedelta('0 days 00:15:30') |
Single duration with to_timedelta (integer read as nanoseconds) | pd.to_timedelta(124524564574835) | Timedelta('1 days 10:35:24.564574835') |
List-like input produces a TimedeltaIndex | pd.to_timedelta(['1 days 06:05:01.00003', '15.5us', 'nan']) | TimedeltaIndex(['1 days 06:05:01.000030', '0 days 00:00:00.000015500', NaT], dtype='timedelta64[ns]', freq=None) |
List-like input produces a TimedeltaIndex | pd.to_timedelta(np.arange(5), unit='s') | TimedeltaIndex(['0 days 00:00:00', '0 days 00:00:01', '0 days 00:00:02', '0 days 00:00:03', '0 days 00:00:04'], dtype='timedelta64[ns]', freq=None) |
List-like input produces a TimedeltaIndex | pd.to_timedelta(np.arange(5), unit='d') | TimedeltaIndex(['0 days', '1 days', '2 days', '3 days', '4 days'], dtype='timedelta64[ns]', freq=None) |
Lower bound (Timedelta has limits, as Timestamp does) | pd.Timedelta.min | Timedelta('-106752 days +00:12:43.145224193') |
Upper bound (Timedelta has limits, as Timestamp does) | pd.Timedelta.max | Timedelta('106751 days 23:47:16.854775807') |
Adding durations | pd.Timedelta(pd.offsets.Day(2)) + pd.Timedelta(pd.offsets.Second(2)) + pd.Timedelta('00:00:00.000123') | Timedelta('2 days 00:00:02.000123') |
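Real input often contains strings that are not valid durations. A minimal sketch, assuming the hypothetical input list `raw`, shows how the `errors="coerce"` option of `pd.to_timedelta` turns unparseable entries into NaT instead of raising:

```python
import pandas as pd

# Hypothetical raw input; the middle entry is not a valid duration.
raw = ["1 days 06:05:01", "not a duration", "15.5min"]

# errors="coerce" converts unparseable entries to NaT rather than raising.
parsed = pd.to_timedelta(raw, errors="coerce")

# Keep only the entries that parsed, then add them up.
valid_mask = parsed.notna()
total_valid = parsed[valid_mask].sum()

print(parsed)
print(total_valid)
```

Coercion hides bad input, so it may be worth counting the NaT entries afterwards to see how much was dropped.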
s = pd.Series(pd.date_range('2012-1-1', periods=3, freq='D'))  # three consecutive days
td = pd.Series([pd.Timedelta(days=i) for i in range(3)])  # durations of 0, 1 and 2 days
df = pd.DataFrame({'A': s, 'B': td})
index | A | B |
---|---|---|
0 | 2012-01-01 | 0 days |
1 | 2012-01-02 | 1 days |
2 | 2012-01-03 | 2 days |
Addition: A + B
df['C'] = df['A'] + df['B']
df
index | A | B | C |
---|---|---|---|
0 | 2012-01-01 | 0 days | 2012-01-01 |
1 | 2012-01-02 | 1 days | 2012-01-03 |
2 | 2012-01-03 | 2 days | 2012-01-05 |
The duration column B has dtype timedelta64[ns]:
df.dtypes
A datetime64[ns]
B timedelta64[ns]
C datetime64[ns]
dtype: object
s - s.max()  # subtract the latest date from every value
0 -2 days
1 -1 days
2 0 days
dtype: timedelta64[ns]
s - datetime.datetime(2011, 1, 1, 3, 5)
0 364 days 20:55:00
1 365 days 20:55:00
2 366 days 20:55:00
dtype: timedelta64[ns]
s + datetime.timedelta(minutes=5)
0 2012-01-01 00:05:00
1 2012-01-02 00:05:00
2 2012-01-03 00:05:00
dtype: datetime64[ns]
s + pd.offsets.Minute(5)
0 2012-01-01 00:05:00
1 2012-01-02 00:05:00
2 2012-01-03 00:05:00
dtype: datetime64[ns]
s + pd.offsets.Minute(5) + pd.offsets.Milli(5)
0 2012-01-01 00:05:00.005
1 2012-01-02 00:05:00.005
2 2012-01-03 00:05:00.005
dtype: datetime64[ns]
y = s - s[0]
y
0 0 days
1 1 days
2 2 days
dtype: timedelta64[ns]
y = s - s.shift()
y
0 NaT
1 1 days
2 1 days
dtype: timedelta64[ns]
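The `s - s.shift()` pattern above measures the gap between each timestamp and the one before it. A small sketch with irregular, made-up timestamps (the `stamps` series is an assumption for illustration) shows that `Series.diff()` gives the same result:

```python
import pandas as pd

# Hypothetical irregular timestamps.
stamps = pd.Series(pd.to_datetime([
    "2012-01-01 08:00",
    "2012-01-01 08:30",
    "2012-01-02 09:00",
]))

# Gap to the previous row; the first row has no predecessor, so it is NaT.
gaps = stamps - stamps.shift()

# Series.diff() is a shorthand for the same subtraction.
same_as_diff = gaps.equals(stamps.diff())

# Reductions such as max() skip the leading NaT.
longest = gaps.max()

print(gaps)
print(same_as_diff, longest)
```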
td1 = pd.Timedelta('-1 days 2 hours 3 seconds')
abs(td1)
## Timedelta('1 days 02:00:03')
y2 = pd.Series(pd.to_timedelta(['-1 days +00:00:05',
'nat',
'-1 days +00:00:05',
'1 days']))
0 -1 days +00:00:05
1 NaT
2 -1 days +00:00:05
3 1 days 00:00:00
dtype: timedelta64[ns]
Meaning | Method | Result |
---|---|---|
Minimum | y2.min() | Timedelta('-1 days +00:00:05') |
Maximum | y2.max() | Timedelta('1 days 00:00:00') |
Index of the minimum | y2.idxmin() | 0 |
Index of the maximum | y2.idxmax() | 3 |
Mean | y2.mean() | Timedelta('-1 days +16:00:03.333333') |
Median | y2.median() | Timedelta('-1 days +00:00:05') |
Quantile | y2.quantile(.1) | Timedelta('-1 days +00:00:05') |
Sum | y2.sum() | Timedelta('-1 days +00:00:10') |
y2.fillna(pd.Timedelta(0))
0 -1 days +00:00:05
1 0 days 00:00:00
2 -1 days +00:00:05
3 1 days 00:00:00
dtype: timedelta64[ns]
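Filling NaT with a zero duration is not neutral for every statistic. The sketch below, which rebuilds the same four-element series under the name `y2`, compares the mean with NaT skipped against the mean after filling:

```python
import pandas as pd

y2 = pd.Series(pd.to_timedelta([
    "-1 days +00:00:05",
    "nat",
    "-1 days +00:00:05",
    "1 days",
]))

# Reductions skip NaT by default, so this averages three values.
mean_skipping_nat = y2.mean()
non_missing = y2.count()

# After filling, the zero duration counts as a fourth observation.
filled = y2.fillna(pd.Timedelta(0))
mean_after_fill = filled.mean()

print(non_missing, mean_skipping_nat, mean_after_fill)
```

The sum is the same either way, but the mean changes because the denominator grows from three to four.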
Operation | Statement 1 | Statement 2 (loses some precision) |
---|---|---|
Convert to days | td / np.timedelta64(1, 'D') | td.astype('timedelta64[D]') |
Convert to seconds | td / np.timedelta64(1, 's') | td.astype('timedelta64[s]') |
Convert to months | td / np.timedelta64(1, 'M') | td.astype('timedelta64[M]') |
Convert to minutes | td / np.timedelta64(1, 'm') | td.astype('timedelta64[m]') |
december = pd.Series(pd.date_range('20121201', periods=4))
january = pd.Series(pd.date_range('20130101', periods=4))
td = january - december  # build a Series of durations
## modify selected values
td[2] += datetime.timedelta(minutes=5, seconds=3)
td[3] = np.nan
td / np.timedelta64(1, 'D')
0 31.000000
1 31.000000
2 31.003507
3 NaN
dtype: float64
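Converting with astype truncates to whole units and therefore loses precision. The output below comes from an older pandas release; pandas 2.x supports only the s, ms, us and ns units in astype: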
td.astype('timedelta64[D]')
0 31.0
1 31.0
2 31.0
3 NaN
dtype: float64
td / np.timedelta64(1, 's')
0 2678400.0
1 2678400.0
2 2678703.0
3 NaN
dtype: float64
td.astype('timedelta64[s]')
0 2678400.0
1 2678400.0
2 2678703.0
3 NaN
dtype: float64
td / np.timedelta64(1, 'M')
0 1.018501
1 1.018501
2 1.018617
3 NaN
dtype: float64
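Division by a unit duration is one way to get a float; the `.dt` accessor offers alternatives. This is a minimal sketch on a made-up three-element series (not the `td` built above) that uses `pd.Timedelta` as the divisor along with `.dt.days` and `.dt.total_seconds()`:

```python
import numpy as np
import pandas as pd

# Hypothetical durations, including one missing value.
durations = pd.Series(pd.to_timedelta(["31 days", "31 days 00:05:03", "nat"]))

# Fractional days: divide by a one-day Timedelta.
days_float = durations / pd.Timedelta(days=1)

# Whole days only: the days component, with the remainder dropped.
whole_days = durations.dt.days

# Total length in seconds as floats.
seconds = durations.dt.total_seconds()

print(days_float)
print(whole_days)
print(seconds)
```

Division keeps the fractional part, while `.dt.days` discards it, which mirrors the precision difference between the two columns of the table above.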