python - How to limit N results per `group_by` in SQLAlchemy/Postgres? -


this sqlalchemy query code

medium_contact_id_subq = (g.session.query(distinct(func.unnest(fucontact.medium_contact_id_lis))).filter(fucontact._id.in_(contact_id_lis))).subquery() q = (g.session.query(fumessage).          filter(fumessage.fu_medium_contact_id.in_(medium_contact_id_subq))          .order_by(desc(fumessage.timestamp_utc))          ) 

i'd limit fumessage grouped medium_contact_id n results.


as workaround, current ugly , unoptimized code:

    medium_contact_id_lis = (g.session.query(distinct(func.unnest(fucontact.medium_contact_id_lis))).filter(fucontact._id.in_(contact_id_lis))).all()     q = none     medium_contact_id_tup in medium_contact_id_lis:         medium_contact_id = medium_contact_id_tup[0]         if q none:             q = (g.session.query(fumessage)                  .filter(fumessage.fu_medium_contact_id == medium_contact_id)                  .limit(message_limit)                  )         else:             subq = (g.session.query(fumessage)                  .filter(fumessage.fu_medium_contact_id == medium_contact_id)                  .limit(message_limit)                  )             q = q.union(subq)     q = q.order_by(desc(fumessage.timestamp_utc)) 

one way fetch top n rows per group use window function such rank() or row_number() in subselect required grouping , order , filter in enclosing select. n = 1 use distinct on ... order by combination in postgresql.

adopting sqlalchemy straightforward using function element's over() method produce window expression:

medium_contact_id_subq = g.session.query(         func.unnest(fucontact.medium_contact_id_lis).distinct()).\     filter(fucontact._id.in_(contact_id_lis)).\     subquery()  # perform required filtering in subquery. choose suitable ordering, # or you'll indeterminate results. subq = g.session.query(         fumessage,         func.row_number().over(             partition_by=fumessage.fu_medium_contact_id,             order_by=fumessage.timestamp_utc).label('n')).\     filter(fumessage.fu_medium_contact_id.in_(medium_contact_id_subq)).\     subquery()  fumessage_alias = aliased(fumessage, subq)  # row_number() counts 1, include rows row num # less or equal limit q = g.session.query(fumessage_alias).\     filter(subq.c.n <= message_limit) 

Comments

Popular posts from this blog

ubuntu - PHP script to find files of certain extensions in a directory, returns populated array when run in browser, but empty array when run from terminal -

php - How can i create a user dashboard -

javascript - How to detect toggling of the fullscreen-toolbar in jQuery Mobile? -