Model
Models for the joint distribution of weekly calendar data.
model = LatentCalendar(n_components=3, random_state=42)
X = df_wide.to_numpy()
model.fit(X)
X_latent = model.transform(X)
X_pred = model.predict(X)
ConjugateModel
Bases: BaseEstimator, TransformerMixin
Conjugate model for the calendar joint distribution.
This is a wrapper around the Dirichlet distribution, which is the conjugate prior of the multinomial distribution.
It does not use dimensionality reduction; each row is updated directly through the conjugate prior.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `a` | `ndarray \| None` | (n_times,) prior for each hour of the day. If None, then the prior is the average of the data. | `None` |
Source code in latent_calendar/model/latent_calendar.py
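The conjugate update behind this class can be sketched in a few lines of NumPy. This illustrates the standard Dirichlet-multinomial update and is not the library's code: the names `x`, `posterior`, and `posterior_mean`, the flat prior, and the 7 x 24 grid with slot index `day * 24 + hour` are all assumptions made for this example.

```python
import numpy as np

n_times = 7 * 24  # assumed weekly grid: 7 days x 24 hours

# Dirichlet prior concentration for every time slot (a flat prior here).
a = np.ones(n_times)

# Observed event counts for one row, indexed as day * 24 + hour (assumed layout).
x = np.zeros(n_times)
x[9] = 3   # three events on day 0 at hour 9
x[33] = 1  # one event on day 1 at hour 9

# Conjugacy: a Dirichlet(a) prior with multinomial counts x gives Dirichlet(a + x).
posterior = a + x

# The posterior mean is the predicted probability of each time slot.
posterior_mean = posterior / posterior.sum()
```

Because the update is a simple addition, no iterative fitting is needed, which is why this approach works without any dimensionality reduction.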
DummyModel
Bases: LatentCalendar
Return equal probability for every latent component.
This can be used as the worst possible baseline.
Source code in latent_calendar/model/latent_calendar.py
create()
classmethod
fit(X, y=None)
Every component assigns equal probability to every hour.
Source code in latent_calendar/model/latent_calendar.py
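A uniform baseline of this kind can be sketched with plain NumPy. This is an illustration under assumed sizes, not the library's implementation; `components`, `X_latent`, and `X_pred` are names chosen for the example.

```python
import numpy as np

n_components, n_times = 3, 7 * 24  # assumed sizes for illustration

# Every component is the uniform distribution over time slots.
components = np.full((n_components, n_times), 1.0 / n_times)

# Every row is equally likely to belong to each component.
n_rows = 4
X_latent = np.full((n_rows, n_components), 1.0 / n_components)

# Mixing uniform components with any weights is still uniform.
X_pred = X_latent @ components
```

Any fitted model should beat this prediction, which is what makes it useful as a floor for comparisons.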
from_prior(prior)
classmethod
Return a dummy model from a prior.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `prior` | `ndarray \| Series` | prior probability weights over time slots. Can be a numpy array of shape (n_time_slots,) or a segment Series (e.g. from | required |
Returns:
| Type | Description |
|---|---|
| `DummyModel` | DummyModel with a single component defined by the prior. |
Example
Build a model that concentrates on weekday mornings:
from latent_calendar import DummyModel
from latent_calendar.segments import create_box_segment
mornings = create_box_segment(
    day_start=0, day_end=5, hour_start=7, hour_end=10,
    name="Weekday mornings",
)
model = DummyModel.from_prior(mornings)
sampler = model.create_sampler(random_state=0)
df_weights, df_events = sampler.sample(n_samples=[20, 30, 15])
Source code in latent_calendar/model/latent_calendar.py
from_segments(df_segments, weights=None)
classmethod
Return a multi-component model where each segment is one component.
Each row of df_segments becomes one component in the model. The
population-level mixture over components is derived from
component_distribution_ — by default this weights components
proportionally to the number of active slots in each segment. Pass
explicit weights to override this.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df_segments` | `DataFrame` | segments DataFrame in wide format, shape (n_segments, n_time_slots), e.g. from | required |
| `weights` | `ndarray \| list[float] \| None` | optional 1-D array of length n_segments. Scales each component's contribution to the population prior. If None, weighting is proportional to active slot count per segment. | `None` |
Returns:
| Type | Description |
|---|---|
| `DummyModel` | DummyModel with one component per segment. |
Example
from latent_calendar import DummyModel
from latent_calendar.segments import create_box_segment, stack_segments
mornings = create_box_segment(
    day_start=0, day_end=5, hour_start=7, hour_end=10, name="Mornings"
)
evenings = create_box_segment(
    day_start=0, day_end=5, hour_start=18, hour_end=22, name="Evenings"
)
df_segments = stack_segments([mornings, evenings])
# Default implicit weights (proportional to active slots per segment)
model = DummyModel.from_segments(df_segments)
# Mornings 3x more likely than evenings
model = DummyModel.from_segments(df_segments, weights=[3, 1])
sampler = model.create_sampler(random_state=0)
df_weights, df_events = sampler.sample(n_samples=[10, 20, 15])
Source code in latent_calendar/model/latent_calendar.py
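The default weighting described for `from_segments` can be worked through by hand. The sketch below does not call the library: `box` is a hypothetical stand-in for a box segment, the `day_end` and `hour_end` bounds are assumed to be exclusive, and explicit weights are assumed to be simply normalized, which the documentation above does not confirm.

```python
import numpy as np

n_days, n_hours = 7, 24  # assumed weekly grid


def box(day_start, day_end, hour_start, hour_end):
    """Hypothetical stand-in for a box segment: 1 inside the box, 0 elsewhere."""
    grid = np.zeros((n_days, n_hours))
    grid[day_start:day_end, hour_start:hour_end] = 1.0  # end bounds assumed exclusive
    return grid.ravel()


segments = np.vstack([
    box(0, 5, 7, 10),   # mornings: 5 days x 3 hours = 15 active slots
    box(0, 5, 18, 22),  # evenings: 5 days x 4 hours = 20 active slots
])

# Each component is its segment normalized to sum to 1.
components = segments / segments.sum(axis=1, keepdims=True)

# Default: mixture weights proportional to the active slot count.
active = segments.sum(axis=1)
default_weights = active / active.sum()

# Explicit weights, assumed here to be normalized as given.
explicit = np.array([3.0, 1.0])
explicit_weights = explicit / explicit.sum()
```

Under these assumptions the two segments do not start out equally weighted: the evening box covers more slots, so it receives the larger default share.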
transform(X, y=None)
Everyone has equal probability of being in each group.
LatentCalendar
Bases: LatentDirichletAllocation
Model weekly calendar data as a mixture of multinomial distributions.
Adapted from sklearn's Latent Dirichlet Allocation model.
Provides a predict method that returns the marginal probability of each time slot for a given row and
a transform method that returns the latent representation of each row.
Source code in latent_calendar/model/latent_calendar.py
component_distribution_
property
Population frequency of each component.
normalized_components_
property
Components that each sum to 1.
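Normalizing components so that each sums to 1 amounts to dividing every row by its total. A minimal sketch with made-up numbers (the array values and the name `normalized` are illustrative only):

```python
import numpy as np

# Unnormalized component weights, shape (n_components, n_times).
components = np.array([
    [2.0, 1.0, 1.0],
    [0.5, 0.5, 3.0],
])

# Divide each row by its total so every component is a probability distribution.
normalized = components / components.sum(axis=1, keepdims=True)
```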
create_sampler(random_state=None, concentration_scale=1.0)
Create a sampler for generating synthetic calendar data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `random_state` | `int \| None` | seed for reproducibility | `None` |
| `concentration_scale` | `float` | scale for Gamma-perturbing each user's Dirichlet concentration before sampling mixture weights. 1.0 (default) means no perturbation: each user draws from the fixed population prior. Values > 1.0 increase variance across users' mixture weights. | `1.0` |
Returns:
| Type | Description |
|---|---|
| | LatentCalendarSampler bound to this fitted model |
Example
model = LatentCalendar(n_components=5).fit(X)
sampler = model.create_sampler(random_state=42)
df_weights, df_events = sampler.sample(n_samples=[10, 5, 20])
Source code in latent_calendar/model/latent_calendar.py
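The kind of generative process such a sampler follows can be sketched with NumPy alone. This is a rough illustration, not the behavior of `LatentCalendarSampler`: the components and the prior `alpha` are random stand-ins, every variable name is an assumption, and the Gamma perturbation controlled by `concentration_scale` is not modeled.

```python
import numpy as np

rng = np.random.default_rng(42)

n_components, n_times = 3, 7 * 24  # assumed sizes

# Stand-in for fitted, normalized components (each row sums to 1).
components = rng.dirichlet(np.ones(n_times), size=n_components)

# Stand-in for the population prior over components.
alpha = np.ones(n_components)

n_samples = [10, 5, 20]  # number of events to draw for each user

# One mixture-weight vector per user.
weights = rng.dirichlet(alpha, size=len(n_samples))

# Each user's time-slot distribution is a weighted mix of the components.
probs = weights @ components

# Draw each user's events across the time slots.
events = np.vstack([rng.multinomial(n, p) for n, p in zip(n_samples, probs)])
```

Here `weights` and `events` play the roles that `df_weights` and `df_events` appear to play in the example above, as plain arrays rather than DataFrames.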
joint_distribution(X_latent)
predict(X, y=None)
Return the marginal probabilities for a given row.
Marginalize out the latent loadings via the law of total probability.
Source code in latent_calendar/model/latent_calendar.py
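The marginalization can be written as a single matrix product: p(t | row) = sum over k of p(k | row) * p(t | k). A small worked sketch with made-up numbers, assuming the latent representation and the components are both row-normalized (the variable names are illustrative):

```python
import numpy as np

# p(component | row), shape (n_rows, n_components); each row sums to 1.
X_latent = np.array([
    [0.8, 0.2],
    [0.1, 0.9],
])

# p(time slot | component), shape (n_components, n_times); each row sums to 1.
components = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
])

# Law of total probability: sum over components of p(k | row) * p(t | k).
X_pred = X_latent @ components
```

Each row of `X_pred` is again a probability distribution over time slots, because it is a convex combination of distributions.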
MarginalModel
Bases: LatentCalendar
Source code in latent_calendar/model/latent_calendar.py
fit(X, y=None)
transform(X, y=None)
constant_prior(X, value=1.0)
Return the prior for each hour of the day.
This is the average of all the rows.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `X` | `ndarray` | (nrows, n_times) | required |
Source code in latent_calendar/model/latent_calendar.py
hourly_prior(X)
Return the prior for each hour of the day.
This is the average of all the rows.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `X` | `ndarray` | (nrows, n_times) | required |
Returns:
| Type | Description |
|---|---|
| `ndarray` | (n_times,) |
Source code in latent_calendar/model/latent_calendar.py
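Averaging all rows to get one prior value per time slot is a column-wise mean. A minimal sketch with a toy count matrix (the data and the name `prior` are made up for illustration):

```python
import numpy as np

# Counts per row and time slot, shape (nrows, n_times).
X = np.array([
    [2, 0, 1],
    [0, 4, 1],
])

# Averaging over rows gives one prior value per time slot, shape (n_times,).
prior = X.mean(axis=0)
```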
predict_on_dataframe(df, model)
Small wrapper to predict on DataFrame and keep same columns and index.
Source code in latent_calendar/model/utils.py
transform_on_dataframe(df, model)
Small wrapper to transform on DataFrame and keep index.
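Wrappers of this kind usually run the model on the underlying array and then reattach the labels. The sketch below shows that pattern with pandas. `ToyModel`, `predict_keep_labels`, and `transform_keep_index` are hypothetical names written for this example and are not the library's functions.

```python
import numpy as np
import pandas as pd


class ToyModel:
    """Hypothetical stand-in exposing predict and transform on arrays."""

    def predict(self, X):
        # Row-normalize counts so each row sums to 1.
        return X / X.sum(axis=1, keepdims=True)

    def transform(self, X):
        # Two made-up latent features per row.
        return np.column_stack([X[:, :1].sum(axis=1), X[:, 1:].sum(axis=1)])


def predict_keep_labels(df, model):
    """Predict on the values, then restore the original index and columns."""
    return pd.DataFrame(model.predict(df.to_numpy()), index=df.index, columns=df.columns)


def transform_keep_index(df, model):
    """Transform the values, then restore the original index."""
    return pd.DataFrame(model.transform(df.to_numpy()), index=df.index)


df = pd.DataFrame([[1, 3], [2, 2]], index=["alice", "bob"], columns=["mon_9", "mon_10"])
df_pred = predict_keep_labels(df, ToyModel())
df_latent = transform_keep_index(df, ToyModel())
```

The transform wrapper keeps only the index because the latent columns no longer correspond to the original time-slot columns.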