Upsampling a time series and interpolating data
I need to upsample a time series and then interpolate the data, and I would like to find the best way to do that. The time series does not have a constant interval. Below I show an example DataFrame and the result I'm looking for. In the result example I'm interpolating just 1 row between each pair of original rows. It would be great to be able to interpolate n rows.
import pandas as pd
data = {'time': ['08-12-2018 10:00:00', '08-12-2018 10:01:00', '08-12-2018 10:01:30',
                 '08-12-2018 10:03:00', '08-12-2018 10:03:10'],
        'value': [1, 2, 3, 4, 5]}
df = pd.DataFrame(data)
df.time = pd.to_datetime(df.time)
df
Out[42]:
                 time  value
0 2018-08-12 10:00:00      1
1 2018-08-12 10:01:00      2
2 2018-08-12 10:01:30      3
3 2018-08-12 10:03:00      4
4 2018-08-12 10:03:10      5
Result
                 time  value
0 2018-08-12 10:00:00      1
1 2018-08-12 10:00:30    1.5
2 2018-08-12 10:01:00      2
3 2018-08-12 10:01:15    2.5
4 2018-08-12 10:01:30      3
5 2018-08-12 10:02:15    3.5
6 2018-08-12 10:03:00      4
7 2018-08-12 10:03:05    4.5
8 2018-08-12 10:03:10      5
1 Answer
You can multiply the index by N and convert the datetimes to numeric values (a native numpy array in nanoseconds), so that it is possible to add new NaN rows with reindex and fill them with interpolate. Finally, convert the time column back to datetimes:
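import numpy as np
import pandas as pd

N = 2  # leaves N - 1 empty rows between each pair of original rows
df.index = df.index * N  # spread the original rows out: 0, N, 2N, ...
df['time'] = df['time'].astype(np.int64)  # datetimes as integer nanoseconds
# reindex adds NaN rows in the gaps; interpolate fills them linearly
df1 = df.reindex(np.arange(df.index.max() + 1)).interpolate()
df1['time'] = pd.to_datetime(df1['time'])  # back to datetimes
print(df1)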
                 time  value
0 2018-08-12 10:00:00    1.0
1 2018-08-12 10:00:30    1.5
2 2018-08-12 10:01:00    2.0
3 2018-08-12 10:01:15    2.5
4 2018-08-12 10:01:30    3.0
5 2018-08-12 10:02:15    3.5
6 2018-08-12 10:03:00    4.0
7 2018-08-12 10:03:05    4.5
8 2018-08-12 10:03:10    5.0
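Since the question asks about interpolating n rows, the same steps could be wrapped in a small helper. This is only a sketch of the answer's approach: the function name upsample_rows and its parameters n_new and time_col are made up for illustration, and the explicit cast to datetime64[ns] is an added assumption so that the integer values are nanoseconds regardless of how the column was created.

```python
import numpy as np
import pandas as pd


def upsample_rows(df, n_new, time_col="time"):
    """Insert n_new linearly interpolated rows between consecutive rows.

    Hypothetical helper: assumes df[time_col] is a datetime column and
    every other column is numeric.
    """
    step = n_new + 1                      # same role as N in the answer
    out = df.reset_index(drop=True)       # work on a copy with a 0..len-1 index
    out.index = out.index * step          # original rows land on 0, step, 2*step, ...
    # Force nanosecond resolution, then view the datetimes as integers.
    out[time_col] = out[time_col].astype("datetime64[ns]").astype("int64")
    # Fill the gaps with NaN rows and interpolate every column linearly.
    out = out.reindex(np.arange(out.index.max() + 1)).interpolate()
    out[time_col] = pd.to_datetime(out[time_col])  # back to datetimes
    return out


if __name__ == "__main__":
    df = pd.DataFrame({
        "time": pd.to_datetime(["2018-08-12 10:00:00",
                                "2018-08-12 10:01:00",
                                "2018-08-12 10:01:30"]),
        "value": [1, 2, 3],
    })
    print(upsample_rows(df, 2))  # two new rows between each original pair
```

With n_new=1 this should give the same rows as the N = 2 code above. Note that the interpolated timestamps pass through float64, so in general they can be off by a sub-microsecond amount.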
Very nice! Thanks.
– Guido
Aug 12 at 14:12