Help with querying moving averages of a few types including fiscal year moving averages

I am interested in general tips around moving averages and techniques to display them.

First question - general technique to display moving averages:
I expect to bind a query beyond the range desired to display.
I expect to use a script to trim the dataset. This is typical?

If showing 60 days with a 200ma, I get 260 days of data and trim the first 200?
Or is it better to do a union, and self contain the query?

Second question
If I am displaying the last 60 days as stacked bars in a chart, what is the way to address days where I have no data without disturbing my average?
Can I do it in the query in a good way?

Third question
Handling overlaps, like I have a moving average from the beginning of the fiscal year, how do I address the display of time over the fiscal start date?
Just let the group by function address it?
So my subquery groups days, then I have an outer query that groups on my fiscal year?

Fourth question
Is this good technique?
I am not sure how I can filter days in an over() function either.
Use a script to fill in the 0 dates afterwards? (0 dates, like Sundays or holidays where I have no data)
Or there is a way to union 0 dates in the query?

  avg(Price) OVER(ORDER BY Date
     ROWS BETWEEN 2 PRECEDING AND CURRENT ROW )
     as moving_average

I am pretty sure that I can accomplish this task, but if I can get some tips from experience, it is appreciated. Thank you in advance.

Why are the additional 200 days needed? Can't the calculations be done in the query, so only what is needed is returned?

The days with no data could be filtered out within the query itself

Can you provide some pseudo sample of the data you will be pulling from along with an expected result? I imagine that would help a bunch with getting a satisfactory answer to the third and fourth questions.

I am still looking at techniques and collecting experience information.

I thought because stock market uses so many of these charts, that it might be likely someone with experience from that would post saying what worked best typically.

Reply to edit #2:

  1. Yes, trim. No reason not to trim with an outer query. Maybe with a second outer query to return the order to ascending.
Select Top 60 ..... Order by t_stamp Desc
  1. Use a table-generating function that yields all possible dates in your analysis range. Outer join to your daily-grouped data subquery to deliver zeros for those days if appropriate, or nulls for those days (which will make them count for the row offsets in your window function range, but will not affect the result).

  2. Do all of your calculating and grouping with actual datetime types. Conversion to displayable strings is a task for the UI. Do not format dates in SQL.

  3. Yes, window functions with explicit row ranges are your friend.

1 Like

One important note I discovered:

  avg(case when Price = 0 then null else Price end) OVER(ORDER BY Date
     ROWS BETWEEN 2 PRECEDING AND CURRENT ROW )
     as moving_average

@pturmel

They asked for the most recent date to be on the right side of my graph, so the bottom row.
So I order by a date ascending.
How do I trim for the bottom 60 instead of the top 60?
I attempted to use limit 60 offset 200, but it did not work. Limit doesn't seem to be a keyword.

Oh I can use rowNumber and where cause.

No, don't. Wrap with a second outer query to order by t_stamp. That's why I included that statement in the answer I gave.

2 Likes

I can have two orders?
For my graph, I need order by t_stamp asc effectively.

Yes, but only because the one on the inner subquery is associated with a TOP clause.

2 Likes

Thanks very much


I am trying to use a variable range now.
I am getting an error when I try to set the rows preceding as a variable.

Error syntax near t1

avg(case when OEE=0 then null else OEE end) OVER(ORDER BY Start asc ROWS BETWEEN t1.test PRECEDING AND CURRENT ROW )

If I try setting a datediff() in there, I also get an error incorrect syntax near 'datediff'.

avg(case when OEE=0 then null else OEE end) OVER(ORDER BY Start asc ROWS BETWEEN (153-1) PRECEDING AND CURRENT ROW )
This one gives me an error like " Incorrect syntax near '('."

I tried declaring an int as the value. Getting an error of incorrect syntax near @number


I can kind of fudge it I think. I can kind of set the window function to shorter than my preceding rows.
Then I set my total range to be shorter in days than my window. I think it works.

Not sure that is supported by the ROWS BETWEEN clause. Why do you think you need to do this?

1 Like

I wanted to do a 50 day moving average, 200 day moving average, year to date moving average, and a fiscal year to date moving average.

Right now, I can only really do the fiscal moving average because in order to dynamically adjust the range, I used a where clause.

If I could use a variable in the number of preceding rows, it would be pretty easy.

Thought maybe on the case side, I could adjust for the date range.

avg(case when OEE=0 or datediff(day,start,getdate())>
datediff(day,DATEADD(year, DATEDIFF(year, 0, GETDATE()), 0),getdate()) 
then null else OEE end) 
OVER(ORDER BY Start asc ROWS BETWEEN (364) PRECEDING AND CURRENT ROW )