빅쿼리(Bigquery)

Window 함수 정리(UNBOUNDED PRECEDING?) 누적합

ROSEV 2024. 4. 2. 20:23

window 함수

SUM, AVG, FIRST_VALUE, LAST_VALUE…

OVER (PARTITION BY <Column> ORDER BY <Column> [Rows / Range] between

[CURRENT ROW / UNBOUNDED PRECEDING / UNBOUNDED FOLLOWING] : 시작

AND

[CURRENT ROW / UNBOUNDED PRECEDING / UNBOUNDED FOLLOWING] : 마지막

rows_range : A clause that defineds a window frame with physical rows or a logical range

[Rows / Range] 차이

  • rows : 물리적인 위치(Physical offsets) 를 사용하는 것을 정의한다.
  • Range : row들의 논리적인 범위를 사용하는 것을 정의한다

[CURRENT ROW / UNBOUNDED PRECEDING / UNBOUNDED FOLLOWING] 차이

  • CURRENT ROW : The window frame starts at the current row
  • UNBOUNDED PRECEDING : The window frame starts at the beginning of the partition
  • UNBOUNDED FOLLOWING : The window frame ends at the end of the partition

 

아래는 Bigquery 공식 문서에 대한 설명

 

window_frame_clause:
  { rows_range } { frame_start | frame_between }

rows_range:
  { ROWS | RANGE }

frame_between:
  {
    BETWEEN  unbounded_preceding AND frame_end_a
    | BETWEEN numeric_preceding AND frame_end_a
    | BETWEEN current_row AND frame_end_b
    | BETWEEN numeric_following AND frame_end_c
  }

frame_start:
  { unbounded_preceding | numeric_preceding | [ current_row ] }

frame_end_a:
  { numeric_preceding | current_row | numeric_following | unbounded_following }

frame_end_b:
  { current_row | numeric_following | unbounded_following }

frame_end_c:
  { numeric_following | unbounded_following }

unbounded_preceding:
  UNBOUNDED PRECEDING

numeric_preceding:
  numeric_expression PRECEDING

unbounded_following:
  UNBOUNDED FOLLOWING

numeric_following:
  numeric_expression FOLLOWING

current_row:
  CURRENT ROW

 

 

아래는 실 사용 예시입니다.

 

WITH example_table AS (
    SELECT * FROM (
        SELECT DATE '2024-04-01' AS date_column, 'A' AS column1, 'Y' AS column2, 10 AS column3 UNION ALL
        SELECT DATE '2024-04-01', 'A', 'Y', 20 UNION ALL
        SELECT DATE '2024-04-01', 'A', 'Z', 30 UNION ALL
        SELECT DATE '2024-04-01', 'A', 'Z', 30 UNION ALL
        SELECT DATE '2024-04-02', 'A', 'X', 15 UNION ALL
        SELECT DATE '2024-04-02', 'A', 'Y', 25 UNION ALL
        SELECT DATE '2024-04-02', 'A', 'Z', 35
    ) AS t
)
SELECT
  date_column,
  column1,
  column2,
  SUM(column3) OVER (PARTITION BY column1, column2 , FORMAT_DATE('%Y%m01', date_column) ORDER BY date_column ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM
  example_table
  WHERE TRUE