window-functions Questions

2

Solved

Does anyone know the best way for Apache Spark SQL to achieve the same results as the standard SQL qualify() + rnk or row_number statements? For example: I have a Spark Dataframe called statemen...

2

Solved

I'm trying to use Spark 1.4 window functions in pyspark 1.4.1 but getting mostly errors or unexpected results. Here is a very simple example that I think should work: from pyspark.sql.window impo...
Cardiganshire asked 3/9, 2015 at 13:14

6

Is it possible to count distinct values in conjunction with window functions like OVER(PARTITION BY id)? Currently my query is as follows: SELECT congestion.date, congestion.week_nb, congestion.id...
Jess asked 12/2, 2014 at 13:14

2

Disclaimer: The problem shown is much more general than I first expected. The example below is taken from a solution to another question, but I have since been using this sample to solve many problems ...
Gown asked 13/9, 2018 at 18:21

2

Solved

I was asked this question in a job interview: There is a table with vehicle names in a column. When we check for name=car, the output must be 4, i.e. the maximum count of continuous occu...
Scanty asked 7/6, 2023 at 12:35
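
The "maximum count of continuous occurrences" asked about above is the length of the longest unbroken run of one value. In SQL this is usually treated as a gaps-and-islands problem; the following is only a minimal Python sketch of the same grouping idea. The function name `longest_run` and the sample column are hypothetical, since the original table is not shown in the excerpt.

```python
from itertools import groupby


def longest_run(values, target):
    """Return the length of the longest unbroken run of `target` in `values`."""
    # groupby() splits the sequence into runs of equal adjacent items,
    # so each group whose key equals `target` is one "island".
    return max(
        (sum(1 for _ in group) for key, group in groupby(values) if key == target),
        default=0,  # no occurrence of `target` at all
    )


# Hypothetical sample column (not taken from the original question).
vehicles = ["car", "car", "bike", "car", "car", "car", "car", "bus", "car"]
print(longest_run(vehicles, "car"))
```

The row order matters here, just as an ORDER BY column would in a window-function solution.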

3

Solved

The problem is to fill missing values in a table. In pandas, one can use forward (or backward) filling to do so as shown below: $> import pandas as pd $> df = pd.DataFrame({'x': [None, 1, No...
Stock asked 18/6, 2016 at 12:25
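
Forward filling, as described above, replaces each missing value with the most recent non-missing value before it. Below is a minimal plain-Python sketch of that behaviour, not the pandas or Spark implementation. The helper `forward_fill` and the sample list are hypothetical, because the DataFrame in the excerpt is truncated.

```python
def forward_fill(values):
    """Replace each None with the last non-None value seen so far."""
    filled = []
    last_seen = None  # nothing to carry forward before the first real value
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


# Hypothetical sample column (not the data from the original question).
column = [None, 1, None, None, 2, None]
print(forward_fill(column))
```

A leading missing value stays missing, because there is no earlier value to carry forward.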

7

Solved

Can someone please explain what the partition by keyword does and give a simple example of it in action, as well as why one would want to use it? I have a SQL query written by someone else and I'm ...
Pritchard asked 18/2, 2009 at 16:31

11

Solved

What's the difference between RANK() and DENSE_RANK() functions? How to find out nth salary in the following emptbl table? DEPTNO EMPNAME SAL ------------------------------ 10 rrr 10000.00 11 nnn ...
Cholecystectomy asked 25/6, 2012 at 4:35
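
The difference asked about above shows up only when values tie: RANK() leaves gaps after tied rows, while DENSE_RANK() does not. The sketch below reproduces both numbering schemes in plain Python for a descending sort. The function name and the salary values are hypothetical, since the table in the excerpt is truncated.

```python
def rank_and_dense_rank(salaries):
    """Return (salary, rank, dense_rank) tuples, highest salary first."""
    ordered = sorted(salaries, reverse=True)
    rows = []
    rank = 0
    dense = 0
    previous = object()  # sentinel that never equals a real salary
    for position, salary in enumerate(ordered, start=1):
        if salary != previous:
            rank = position  # RANK jumps to the row position, leaving gaps
            dense += 1       # DENSE_RANK advances by one per distinct value
            previous = salary
        rows.append((salary, rank, dense))
    return rows


# Hypothetical salaries (not taken from the original table).
for row in rank_and_dense_rank([10000, 9000, 9000, 8000]):
    print(row)
```

Because the dense rank has no gaps, filtering on dense rank equal to n is one common way to pick the nth highest distinct salary.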

4

Solved

I'm trying to fill NULL values in multiple columns (of different types, INT and VARCHAR) with the previous NOT NULL value in a group ordered by date. Considering the following table: CREATE TABLE IF NOT EXIST...
Breastbeating asked 15/2, 2023 at 15:53

1

I have the following model: class Foobar(models.Model): foo = models.IntegerField() And I figured out how to calculate the delta of consecutive foo fields by using window functions: qs = Foobar.o...

2

Solved

Is there an idiomatic equivalent to SQL's window functions in Pandas? For example, what's the most compact way to write the equivalent of this in Pandas? SELECT state_name, state_population, SUM...
Joyann asked 10/1, 2017 at 16:8

5

Solved

I'm trying to solve this particular problem from PGExercises.com: https://www.pgexercises.com/questions/aggregates/rankmembers.html The gist of the question is that I'm given a table of club memb...
Gyrocompass asked 18/12, 2016 at 16:21

3

Solved

In PostgreSQL 9.4 the window functions have the new option of a FILTER to select a sub-set of the window frame for processing. The documentation mentions it, but provides no sample. An online searc...
Maudiemaudlin asked 14/7, 2015 at 2:3

3

Solved

I have a dataframe where I want to give id's in each Window partition. For example I have id | col | 1 | a | 2 | a | 3 | b | 4 | c | 5 | c | So I want (based on grouping with column col) id | ...
Restorative asked 8/5, 2018 at 12:21

6

Solved

I've successfully created a row_number() with partitionBy() in Spark using Window, but would like to sort this in descending order instead of the default ascending. Here is my working code: from pyspar...
Wynellwynn asked 6/2, 2016 at 22:17

8

Solved

So I have a table as follows: ID_STUDENT | ID_CLASS | GRADE ----------------------------- 1 | 1 | 90 1 | 2 | 80 2 | 1 | 99 3 | 1 | 80 4 | 1 | 70 5 | 2 | 78 6 | 2 | 90 6 | 3 | 50 7 | 3 | 9...
Wireworm asked 10/2, 2009 at 15:52

6

Solved

Suppose I have pandas DataFrame like this: df = pd.DataFrame({'id':[1,1,1,2,2,2,2,3,4], 'value':[1,2,3,1,2,3,4,1,1]}) which looks like: id value 0 1 1 1 1 2 2 1 3 3 2 1 4 2 2 5 2 3 6 2 4 7 3 1 8 ...

3

My dataframe looks like this: id value date 1 100 2017 1 null 2016 1 20 2015 1 100 2014 I would like to get the most recent previous value, ignoring nulls: id value date recent value 1 100 2017 20 1 nul...

4

Solved

Does anyone know how to replace nulls in a column with the preceding string until a new string appears, which then replaces all null values below it? I have a column that looks like this Original Column:...
Jackquelinejackrabbit asked 7/2, 2020 at 0:51

1

I am migrating some queries from PostgreSQL dialect over to BigQuery. One nice pattern in PostgreSQL is DISTINCT ON (key), which returns the first row for every key based on the sequence as defined...

3

Solved

I have a Spark SQL DataFrame with date column, and what I'm trying to get is all the rows preceding current row in a given date range. So for example I want to have all the rows from 7 days back pr...

4

Solved

I have a table author_data: author_id | author_name ----------+---------------- 9 | ernest jordan 14 | k moribe 15 | ernest jordan 25 | william h nailon 79 | howard jason 36 | k moribe ...

3

Solved

Given a table like: person_id contact_day days_last_contact dash_group 1 2015-02-09 1 1 2015-05-01 81 2 1 2015-05-02 1 2 1 2015-05-03 1 2 1 2015-06-01 29 3 1 2015-08-01 61 4 1 ...
Anamorphic asked 9/5, 2022 at 9:0

4

Solved

Consider a PySpark data frame. I would like to summarize the entire data frame, per column, and append the result for every row. +-----+----------+-----------+ |index| col1| col2 | +-----+---------...

2

Solved

I want to calculate the cumulative product across rows in Snowflake. Basically I have monthly rates that multiplied accumulate across time. (Some databases have the product() SQL function for that)...
Sayre asked 29/3, 2022 at 0:25
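
A cumulative product, as asked about above, multiplies each row's rate by all earlier ones. The sketch below shows the intended running result in plain Python using `itertools.accumulate`; it is not Snowflake code. The function name and the rates are hypothetical. In SQL dialects without a product aggregate, a commonly used workaround for strictly positive values is a windowed EXP(SUM(LN(x))).

```python
from itertools import accumulate
from operator import mul


def cumulative_product(rates):
    """Return the running product of `rates`, one entry per input row."""
    return list(accumulate(rates, mul))


# Hypothetical monthly rates, in time order (not from the original question).
monthly_rates = [1.5, 2.0, 0.5]
print(cumulative_product(monthly_rates))
```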
