Johan Nilssons Lifestream

Split data in half based on dates

I'd like to split my data in half within each year. For the sample data below, the result should be two separate dataframes: one holding the first 50% of the rows for each year, the other holding the remaining 50%. An additional condition is that the 50% split needs to be based on column 'LG'.

Can anyone help me with this?

Sample data:

import pandas as pd
import numpy as np

df = pd.DataFrame(
    {'LG' : ('AR1', 'AR1', 'AR1', 'AR1', 'AR1', 'AR1', 'PO1',  'PO1', 'AR1', 'AR1', 'PO1', 'PO1'),
     'Date': ('2011-1-1', '2011-3-1',  '2011-4-1', '2011-2-1', '2012-1-1', '2012-2-1', '2012-1-1', '2012-2-1', '2013-1-1', '2013-2-1', '2013-1-1', '2013-2-1'),
     'Year': (2011, 2011, 2011, 2011, 2012, 2012, 2012, 2012, 2013, 2013, 2013, 2013)})

# to_datetime returns a new Series, so assign it back to the column
df['Date'] = pd.to_datetime(df['Date'])

df:

         Date   LG  Year
0  2011-01-01  AR1  2011
1  2011-03-01  AR1  2011
2  2011-04-01  AR1  2011
3  2011-02-01  AR1  2011
4  2012-01-01  AR1  2012
5  2012-02-01  AR1  2012
6  2012-01-01  PO1  2012
7  2012-02-01  PO1  2012
8  2013-01-01  AR1  2013
9  2013-02-01  AR1  2013
10 2013-01-01  PO1  2013
11 2013-02-01  PO1  2013
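
One possible way to do this, sketched under an assumption about what is being asked: "first 50%" is read here as the earliest half of the rows, ordered by 'Date', within each combination of 'Year' and 'LG'. The names first_half and second_half are made up for illustration, and sending the extra row of an odd-sized group to the first half is an arbitrary choice.

```python
import pandas as pd

df = pd.DataFrame(
    {'LG': ('AR1', 'AR1', 'AR1', 'AR1', 'AR1', 'AR1', 'PO1', 'PO1', 'AR1', 'AR1', 'PO1', 'PO1'),
     'Date': ('2011-1-1', '2011-3-1', '2011-4-1', '2011-2-1', '2012-1-1', '2012-2-1',
              '2012-1-1', '2012-2-1', '2013-1-1', '2013-2-1', '2013-1-1', '2013-2-1'),
     'Year': (2011, 2011, 2011, 2011, 2012, 2012, 2012, 2012, 2013, 2013, 2013, 2013)})
df['Date'] = pd.to_datetime(df['Date'])

# Order rows chronologically inside each (Year, LG) group.
df = df.sort_values(['Year', 'LG', 'Date'])
grouped = df.groupby(['Year', 'LG'])

# Position of each row within its group (0, 1, 2, ...) and the group's size.
position = grouped.cumcount()
group_size = grouped['Date'].transform('size')

# Assumption: with an odd group size, the extra row goes to the first half.
in_first_half = position < (group_size + 1) // 2

first_half = df[in_first_half]    # hypothetical name: earliest half of each group
second_half = df[~in_first_half]  # hypothetical name: the remaining rows
```

If the split should instead be by year only, with 'LG' used just for ordering, grouping on 'Year' alone and sorting by ['Year', 'LG', 'Date'] would be the corresponding variant.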

via Stack Overflow
