python - Find value in dataframe closest to a specific time ago


I have a dataframe with a date-time column and a value column, and I'd like to find a way to create a column holding the value at the time closest to a given interval before each date-time.

What I'd like to have is a column called "value 2 hours ago", whose entries come from the "value" column at the time closest to 2 hours earlier.

For example, if the "date-time" column shows "01/01/2014 12:10:00", the new column would return the number in "value" from the row whose "date-time" is closest to "01/01/2014 10:10:00".

It would be even better if I could apply conditions on the value based on how far the real time interval is from the desired 2-hour interval. For example: "return the value closest to 2 hours ago, except if it's less than 1 hour ago or more than 3 hours ago, then return nothing".

To illustrate, here is a sample input dataframe. I can compute the time 2 hours ago and then self-merge on the two date-time columns. The challenge is that I have to merge on the nearest match rather than an exact match.

import pandas as pd

df = pd.DataFrame({
    'date-time': pd.Series(["01/01/2014 04:11:00", "01/01/2014 08:10:00", "01/01/2014 09:11:00", "01/01/2014 12:10:00"], index=['1', '2', '3', '4']),
    'value': pd.Series([9, 12, 3, 21], index=['1', '2', '3', '4'])})
# parse the string column into real timestamps
df["time"] = pd.to_datetime(df["date-time"])
df["t_2h_ago"] = df["time"] - pd.to_timedelta('2h')
# exact-match self-merge: with this sample data no two timestamps line up exactly
merged = pd.merge(df, df, how='left', left_on='time', right_on='t_2h_ago')

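Take the Cartesian product of the dataframe with itself, then find the difference between each pair of timestamps. Note that the function named nearest_time assumes each date-time is unique: a zero difference is treated as a row paired with itself and ignored. Group by the left-hand date-time and calculate the minimum of each group; for each group, this gives the gap in seconds to the closest timestamp. Then join back to recover the matching rows.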

from datetime import datetime
import time
import pandas as pd

df = pd.DataFrame({
    'date-time': pd.Series(["01/01/2014 04:11:00", "01/01/2014 08:10:00", "01/01/2014 09:11:00", "01/01/2014 12:10:00"], index=['1', '2', '3', '4']),
    'value': pd.Series([9, 12, 3, 21], index=['1', '2', '3', '4'])})

def nearest_time(x):
    row_i = datetime.strptime(x['date-time_x'], "%m/%d/%Y %H:%M:%S")
    row_j = datetime.strptime(x['date-time_y'], "%m/%d/%Y %H:%M:%S")
    diff = time.mktime(row_i.timetuple()) - time.mktime(row_j.timetuple())  # seconds, e.g. 7200 for 2 hrs
    if diff == 0:
        diff = float('inf')  # a row paired with itself; relies on unique date-times
    return abs(diff)

df = df.copy()
df['key'] = 1
df = pd.merge(df, df, on='key')  # Cartesian product via a constant key
df['diff'] = df.apply(nearest_time, axis=1)
# smallest gap for each left-hand date-time
df2 = df.groupby('date-time_x', as_index=False).agg({'diff': 'min'})

df3 = pd.merge(df2, df, on=['diff', 'date-time_x'])
print(df3)
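
For larger frames the Cartesian product grows quadratically, so here is an alternative sketch (not part of the answer above) using pandas.merge_asof with direction="nearest". It looks up the row closest to the time 2 hours earlier, and the tolerance argument covers the "nothing if less than 1 hour or more than 3 hours ago" condition, since that window is 2 hours plus or minus 1 hour. The column names target, matched_time and value_2h_ago are made up for illustration, and the boundary is inclusive (a match exactly 1 or 3 hours ago is kept).

```python
import pandas as pd

df = pd.DataFrame({
    "date-time": ["01/01/2014 04:11:00", "01/01/2014 08:10:00",
                  "01/01/2014 09:11:00", "01/01/2014 12:10:00"],
    "value": [9, 12, 3, 21],
})
df["time"] = pd.to_datetime(df["date-time"], format="%m/%d/%Y %H:%M:%S")
# merge_asof requires both sides to be sorted by their merge keys
df = df.sort_values("time").reset_index(drop=True)

# Left side: each row's own time plus the time we want to look up (2 hours earlier).
# "target" is a hypothetical helper column name.
left = pd.DataFrame({
    "time": df["time"],
    "target": df["time"] - pd.Timedelta(hours=2),
})

# Right side: the observed times and values, renamed so they do not clash.
right = df[["time", "value"]].rename(
    columns={"time": "matched_time", "value": "value_2h_ago"}
)

result = pd.merge_asof(
    left, right,
    left_on="target", right_on="matched_time",
    direction="nearest",              # closest row before or after the target
    tolerance=pd.Timedelta(hours=1),  # no match further than 1 hour from the target
)
print(result)
```

With the sample data, the first two rows get no match (the nearest candidates are about 2 hours away from the target), while 09:11 picks up the value recorded at 08:10 and 12:10 picks up the value recorded at 09:11. Rows without a match come back as NaN, which also turns the value column into floats.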
