Pandas - equivalent of str.contains() in pandas query

3

16

I am creating a DataFrame by subsetting with the conditions below:

subset_df = df_eq.loc[(df_eq['place'].str.contains('Chile')) & (df_eq['mag'] > 7.5),['time','latitude','longitude','mag','place']]

I want to replicate the subset above using query() in pandas. However, I am not sure how to express the equivalent of str.contains() in a pandas query. A SQL-style "like" doesn't seem to work in query():

query_df = df_eq[['time','latitude','longitude','mag','place']].query('place like \'%Chile\' and mag > 7.5')

place like '%Chile'and mag >7.5 
            ^
SyntaxError: invalid syntax

Any help would be appreciated.

Antiquated answered 29/7, 2016 at 15:4 Comment(4)
I'm grasping at straws here, but you might be able to use Python's in operator if you set engine='python'. If it works, it will likely end up as a pretty inefficient query (normally pandas tries to use numexpr to speed things up, but numexpr doesn't support the in operator ...) - Prizewinner
AFAIK, the SQL like operator is not yet implemented in the pandas query() method, so you can't do it using query(). - Couteau
Thanks for your comments. Yeah, the like operator isn't there, so the workaround remains str.contains(). - Antiquated
Hi from Chile. I'm wondering why you use .loc in this case; the following should be enough: df_eq[(df_eq['place'].str.contains('Chile')) & (df_eq['mag'] > 7.5)][['time','latitude','longitude','mag','place']] - Prohibitive
12

As of now I am able to do this by using the engine='python' argument of the .query method to use str.contains inside a query.

This should work:

query_df = df_eq[['time', 'latitude', 'longitude', 'mag', 'place']].query(
    "place.str.contains('Chile') and mag > 7.5", engine="python")
Iconic answered 16/11, 2018 at 19:8 Comment(0)
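To check that the query() form in the answer above selects the same rows as the original .loc subset, here is a minimal sketch. The contents of df_eq are not shown in the question, so the small frame below is an invented stand-in (made-up values, hypothetical place names); only the column names come from the question.

```python
import pandas as pd

# Invented stand-in for df_eq; the values are placeholders, not real earthquake data.
df_eq = pd.DataFrame({
    "time": ["t0", "t1", "t2", "t3"],
    "latitude": [-36.0, -20.0, 38.0, -31.0],
    "longitude": [-73.0, -71.0, 142.0, -72.0],
    "mag": [8.1, 7.9, 8.4, 7.2],
    "place": ["Region A, Chile", "Region B, Chile",
              "Region C, Japan", "Region D, Chile"],
})
cols = ["time", "latitude", "longitude", "mag", "place"]

# Original approach from the question: boolean indexing with .loc.
loc_df = df_eq.loc[df_eq["place"].str.contains("Chile") & (df_eq["mag"] > 7.5), cols]

# query() approach: engine="python" lets the .str accessor be used in the expression.
query_df = df_eq[cols].query(
    "place.str.contains('Chile') and mag > 7.5", engine="python")

# Both selections should contain exactly the same rows and columns.
print(loc_df.equals(query_df))
```

If the place column can contain missing values, passing na=False to str.contains in either form is probably needed so that the mask stays purely boolean.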
10

I think the problem is that you cannot use the str.contains method inside the pandas query method. What you can do instead is create a boolean mask and refer to that mask from within query using the at sign (@). Try this:

my_mask = df_eq["feature"].str.contains('my_word')
df_eq.query("@my_mask")
Petrapetracca answered 25/7, 2017 at 17:52 Comment(1)
Useful: code completion works when writing the my_mask line, but not inside the query string, in my JupyterLab. - Admiration
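The mask approach can also be combined with an ordinary column comparison inside the same query string. The sketch below assumes an invented toy frame (made-up values and place names) and a hypothetical variable name is_chile; it is one way to apply the idea, not the only one.

```python
import pandas as pd

# Toy stand-in for df_eq; values are invented for illustration only.
df_eq = pd.DataFrame({
    "mag": [8.1, 7.9, 8.4, 7.2],
    "place": ["Region A, Chile", "Region B, Chile",
              "Region C, Japan", "Region D, Chile"],
})

# Build the string test outside query(), as a plain boolean Series.
# na=False treats missing place values as non-matches.
is_chile = df_eq["place"].str.contains("Chile", na=False)

# Reference the mask with @ and combine it with a regular column condition.
result = df_eq.query("@is_chile and mag > 7.5")
print(result)
```

Because the string matching happens before query() is called, the query expression itself contains only a boolean mask and a numeric comparison.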
9

Using str.contains works for me in pandas 1.0.0 with this syntax:

df.query("columnA == 'foo' and columnB.str.contains('bar')")
Frothy answered 6/4, 2020 at 12:2 Comment(1)
Check whether the "numexpr" module is installed. If it is not, pandas uses the default "python" engine, where str.contains is a valid expression. - Dirk
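One way to perform the check suggested in the comment above is to ask Python whether numexpr is importable. This is a small sketch using only the standard library; the variable name has_numexpr is just illustrative.

```python
import importlib.util

# True if the optional numexpr package can be found in this environment.
has_numexpr = importlib.util.find_spec("numexpr") is not None

# When numexpr is missing, pandas falls back to the "python" engine on its own.
# Passing engine="python" to query() explicitly removes the dependence on
# whether numexpr happens to be installed.
print("numexpr installed:", has_numexpr)
```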
