Fix data pipelines once and for all
parent 08560476c5
commit 4012e4e047
46 changed files with 4508 additions and 855 deletions
@@ -36,6 +36,20 @@ SHRINKAGE_K = 50
# noisy year) without flattening genuine multi-year trends.
TEMPORAL_SMOOTHNESS_LAMBDA = 0.05

# Per-year support scaling for the temporal smoothness penalty. A flat lambda
# is too weak for years with very few repeat-sale pairs: a sector can have
# hundreds of pairs overall (so cell-level n/(n+k) shrinkage barely moves it)
# yet have individual years estimated from 1-2 pairs, producing 2-7x
# single-year index spikes. Each curvature row is therefore scaled by the
# local pair support of its year triple:
#     lambda_eff = lambda0 * (1 + SMOOTHNESS_SUPPORT_PAIRS / s)
# where s is the minimum cross-year pair count among the triple's years.
# Well-supported years (s >> SMOOTHNESS_SUPPORT_PAIRS) keep lambda_eff ~
# lambda0 (current behaviour); a year identified by a single pair gets
# ~41x lambda0, pulling its beta strongly toward the local trend through its
# neighbours. Same-year pairs cancel in the design and are not counted.
SMOOTHNESS_SUPPORT_PAIRS = 40


def type_group_expr():
    """Polars expression: Property type -> type_group."""
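The support scaling described in the comment above can be sketched as follows. This is a minimal illustration, not the code from this commit: the function name `effective_lambdas`, its input layout (one cross-year pair count per consecutive year), and the floor of `s` at 1 are all assumptions made for the example. How the resulting weights enter the curvature rows of the design (for instance as `lambda_eff` or its square root) is not shown in the hunk and is left out here.

```python
import numpy as np

# Values copied from the hunk above.
TEMPORAL_SMOOTHNESS_LAMBDA = 0.05
SMOOTHNESS_SUPPORT_PAIRS = 40


def effective_lambdas(cross_year_pairs,
                      lambda0=TEMPORAL_SMOOTHNESS_LAMBDA,
                      support=SMOOTHNESS_SUPPORT_PAIRS):
    """Hypothetical helper: one lambda_eff per year triple (t-1, t, t+1).

    cross_year_pairs[t] is the number of cross-year repeat-sale pairs
    touching year t, for consecutive years (same-year pairs excluded).
    """
    counts = np.asarray(cross_year_pairs, dtype=float)
    if counts.size < 3:
        # No interior year, so there are no curvature rows to weight.
        return np.empty(0)
    # s = minimum pair count among the three years of each triple.
    s = np.minimum(np.minimum(counts[:-2], counts[1:-1]), counts[2:])
    # Assumption: floor s at 1 so an unsupported year does not divide by zero.
    s = np.maximum(s, 1.0)
    return lambda0 * (1.0 + support / s)


# A year with a single pair sits between well-supported neighbours.
weights = effective_lambdas([400, 1, 400, 4000, 4000])
```

With these inputs the two triples that include the single-pair year get 41 times `lambda0`, while the triple whose weakest year has 400 pairs stays within 10% of `lambda0`.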