get_group unpredicted behaviour in case of Sorting applied #535

@vanitu

When group_by is applied to a DataFrame that has been sorted, get_group returns the wrong rows.

require 'daru'

df = Daru::DataFrame.new(
  [
    (0...10).to_a,                            # :a, the values 0..9
    Array.new(10) { "b" },                    # :b, constant "b"
    (0...10).map { |i| i.even? ? "c" : "d" }  # :c, alternating "c" / "d"
  ],
  order: [:a, :b, :c]
)


# Works properly
grouped = df.group_by([:b, :c])
grouped.get_group(["b", "c"])

=> #<Daru::DataFrame(5x3)>
       a   b   c
   0   0   b   c
   2   2   b   c
   4   4   b   c
   6   6   b   c
   8   8   b   c 

# Wrong rows are returned once the DataFrame has been sorted
df.sort!([:c])
grouped = df.group_by([:b, :c])
grouped.get_group(["b", "c"])

=> #<Daru::DataFrame(5x3)>
       a   b   c
   0   0   b   c
   2   4   b   c
   4   8   b   c
   6   3   b   d
   8   7   b   d 
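
The wrong output above is consistent with the group's index labels (0, 2, 4, 6, 8) being used as row positions in the sorted frame. That is only a reading of the printed result, not something confirmed against the Daru source. The following is a minimal plain-Ruby sketch of that reading. It makes no Daru calls, the variable names are hypothetical, and it assumes the sort keeps the original order among rows with equal :c.

```ruby
# Plain-Ruby model of the frame from the report; no Daru API is used here.
rows = (0...10).map { |i| { a: i, b: "b", c: i.even? ? "c" : "d" } }

# Original index labels in sorted order. Sorting by [:c value, label] keeps
# the original order among ties (an assumption about how the sort behaves).
labels = (0...10).sort_by { |i| [rows[i][:c], i] }
sorted = labels.map { |i| rows[i] }

# Index labels of the rows that belong to the group ["b", "c"].
group_labels = labels.select { |i| rows[i].values_at(:b, :c) == ["b", "c"] }

# Expected result: look each row up by its label.
by_label = group_labels.map { |l| rows[l][:a] }

# Suspected behaviour: treat the labels as positions in the sorted frame.
by_position = group_labels.map { |l| sorted[l][:a] }
by_position_c = group_labels.map { |l| sorted[l][:c] }

p by_label       # matches column a of the correct output
p by_position    # matches column a of the wrong output
p by_position_c  # matches column c of the wrong output
```

If this reading is right, any sort that reorders rows without resetting the index would trigger the same mismatch.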
