Most efficient way to loop through and update rows in a large pandas dataframe







This is my piece of code to update the rows of a dataframe:


    from datetime import datetime

    import pandas as pd


    def arrangeData(df):
        hour_from_timestamp_list = []
        date_from_timestamp_list = []
        for row in df.itertuples():
            # timestamp is in milliseconds since the epoch
            timestamp = row.timestamp
            hour_from_timestamp = datetime.fromtimestamp(
                int(timestamp) / 1000).strftime('%H:%M:%S')
            date_from_timestamp = datetime.fromtimestamp(
                int(timestamp) / 1000).strftime('%d-%m-%Y')
            hour_from_timestamp_list.append(hour_from_timestamp)
            date_from_timestamp_list.append(date_from_timestamp)
        df['Time'] = hour_from_timestamp_list
        df['Hour'] = pd.to_datetime(df['Time']).dt.hour
        df['ChatDate'] = date_from_timestamp_list
        return df



I'm trying to extract the time, hour, and chat date from the timestamp. The code works, but on a large data set (around 300,000 rows) the function is extremely slow. Can anyone suggest a faster way to do this?


For looping I also tried iterrows(), which was even slower.


This is the document that I'm processing:



    {
        "_id" : ObjectId("5b9feadc32214d2b504ea6e1"),
        "id" : 34176,
        "timestamp" : NumberLong(1535019434998),
        "platform" : "Email",
        "sessionId" : LUUID("08a5caac-baa3-11e8-a508-106530216ef0"),
        "intentStatus" : "NotHandled",
        "botId" : "tony"
    }





Can you add some data sample?
– jezrael
2 mins ago





@jezrael edited the question with the data sample
– Tony Mathew
19 secs ago
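
One possible direction, offered as a minimal sketch rather than a definitive answer: the per-row loop can usually be replaced by a single vectorized pd.to_datetime call with unit='ms', followed by the .dt accessors. The function name arrange_data_vectorized and its tz parameter are hypothetical, introduced only for this illustration. Note one behavioural difference: datetime.fromtimestamp in the question converts to the machine's local time zone, while this sketch produces UTC unless a time zone name is passed, so the zone is an assumption to adjust.

```python
import pandas as pd


def arrange_data_vectorized(df, tz=None):
    """Hypothetical vectorized variant of arrangeData (illustration only)."""
    # Parse every millisecond epoch value in one call; the result is UTC-aware.
    ts = pd.to_datetime(df["timestamp"].astype("int64"), unit="ms", utc=True)
    # Optionally shift to a named zone, e.g. the zone the original code ran in.
    if tz is not None:
        ts = ts.dt.tz_convert(tz)
    # Derive all three columns from the same datetime Series, no Python loop.
    df["Time"] = ts.dt.strftime("%H:%M:%S")
    df["Hour"] = ts.dt.hour
    df["ChatDate"] = ts.dt.strftime("%d-%m-%Y")
    return df


# Small sample using the timestamp from the question's document.
sample = pd.DataFrame({"timestamp": [1535019434998, 1535023034998]})
result = arrange_data_vectorized(sample.copy())
print(result)
```

Taking Hour directly from the parsed datetimes also avoids formatting the time as a string and parsing it again, which the original function does through pd.to_datetime(df['Time']).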








