I have a dataframe which contains values across 4 columns:
What I would like to do is “split” this dataframe into N different groups, where each group has an equal number of rows and the same distribution of the price, click count and ratings attributes.
Any advice is greatly appreciated, as I don’t have the slightest idea how to tackle this!
If I understand the question correctly, this will get you what you want. Assuming your data frame is called df and you have N defined, you can do this:
split(df, sample(1:N, nrow(df), replace = TRUE))
This will return a list of data frames, where each data frame consists of randomly selected rows from df. By default, sample() assigns equal probability to each group, so the groups will be roughly, but not necessarily exactly, the same size.
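If you need the groups to be exactly equal in size (or within one row of each other), one option is to shuffle a balanced vector of group labels instead of drawing each label independently. The following is only a minimal sketch: the example data and the column names price, clicks and ratings are made up for illustration, since the original table is not shown.

```r
# Hypothetical example data; the column names are assumptions, not taken from the question.
set.seed(42)
df <- data.frame(
  price   = runif(100, min = 1, max = 50),
  clicks  = rpois(100, lambda = 20),
  ratings = sample(1:5, 100, replace = TRUE)
)
N <- 4

# Build a balanced label vector (1, 2, ..., N, 1, 2, ...) of length nrow(df),
# then shuffle it so that rows are assigned to groups at random.
labels <- sample(rep(seq_len(N), length.out = nrow(df)))
groups <- split(df, labels)

# Group sizes differ by at most one row; with 100 rows and N = 4, each has 25.
sapply(groups, nrow)

# Quick check that the column means look similar across groups.
sapply(groups, colMeans)
```

Because the assignment is random, the distributions of the columns will be similar across groups on average, but they are not guaranteed to match. If they need to match closely, you would probably want some form of stratified assignment instead.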