ИМКТ 2021. Машинное обучение 1

Задача 5A. Poisson regression ≡

задачи
видимые пробелы
простые формулы
широкий текст
редактор

Входной файл:	Стандартный вход	Ограничение времени:	2 сек
Выходной файл:	Стандартный выход	Ограничение памяти:	512 Мб
Максимальный балл:	1

Условие

Требуется написать на языке Python класс PoissonRegression, реализующий обобщённую линейную модель, использующую логарифмическую функцию связи

f(x, θ) = exp(n∑i = 1θ_i x_i + θ₀) ,

где n — количество признаков.

Модель оптимизирует следующую функцию потерь

l(X, y, θ) = 1mm∑i = 1(y_ilogy_if(X_i, θ) − y_i + f(X_i, θ)) + α2n∑i = 1θ2i ,

где m — количество примеров, α — параметр регуляризации.

Для оценки качества модели используется метрика D²

D² = 1 − D(y, ŷ)D(y, y) ,

где y — среднее значение вектора y,

D(y, ŷ) = 2mm∑i = 1(y_ilogy_iŷ_i − y_i + ŷ_i) .

Класс должен иметь следующий интерфейс


from __future__ import annotations

import numpy as np


class PoissonRegression:
    r"""Implements a generalized linear model with log link function[1]

        f(x, \theta) = \exp(\sum\limits_{i=1}^n\theta_ix_i + \theta_0)

    The model optimizes the following loss

        l(X, y, \theta) = \frac{1}{m}\sum\limits_{i=1}^n(y_i\log\frac{y_i}{f(X_i, \theta)} - y_i + f(X_i, \theta)) + \frac{\alpha}{2}\sum\limits_{i=1}^n\theta_i^2

    where n, m are numbers of features and samples respectively, $\alpha$ -- regularization strength.

    Parameters:

        use_bias: bool, default = True
            Whether the model uses bias term.

        alpha: float, default = 1
            L2 regularization strength.

    References:

        [1]: https://en.wikipedia.org/wiki/Generalized_linear_model#Link_function
    """
    
    def __init__(self, use_bias: bool = True, alpha: float = 1):
        """Initializes `use_bias` and `alpha` fields."""
        pass


    def predict(self, X: np.ndarray) -> np.ndarray:
        """Computes model value on feature matrix `X`.

        Arguments:

            X: 2d array of float, feature matrix.

        Returns:

            1d array of float, predicted values.
        """
        pass


    def loss(self, X: np.ndarray, y: np.ndarray) -> float:
        """Computes loss value on feature matrix `X` with target values `y`.

        Arguments:

            X: 2d array of float, feature matrix.
            y: 1d array of int, target values.

        Returns:

            loss value
        """
        pass


    def score(self, X: np.ndarray, y: np.ndarray) -> float:
        r"""Computes D^2 score on feature matrix `X` with target values `y`.

        The score is defined as

            D^2 = 1 - \frac{D(y, \hat{y})}{D(y, \overline{y})}

        where D(y, \hat{y}) = 2(y\log\frac{y}{\hat{y}} - y + \hat{y}).

        Arguments:
            
            X: 2d array of float, feature matrix.
            y: 1d array of int, target values.

        Returns:

            score value
        """
        pass


    def fit(self, X: np.ndarray, y: np.ndarray, *,
            eta: float = 1e-3, beta1: float = 0.9, beta2: float = 0.999,
            epsilon: float = 1e-8, tol: float = 1e-3, max_iter: int = 1000) -> PoissonRegression:
        """Optimizes model loss w.r.t model coefficitents using Adam[1] optimization algorithm.
        Initially model coefficients are initialized with zeros.

        Arguments:

            X: 2d array of float, feature matrix.
            y: 1d array of int, target values.
            eta: learning rate.
            beta1: first moment decay rate.
            beta2: second moment decay rate.
            tol: threshold for L2-norm of gradient.
            max_iter: maximal number of iterations.

        Returns:

            self

        References:

            [1]: https://ruder.io/optimizing-gradient-descent/index.html#adam
        """
        pass


    @property
    def coeffs(self) -> np.ndarray:
        """Returns 1d array of float, coefficients used by the model excluding bias term."""
        pass
    

    @property
    def bias(self) -> float:
        """Returns bias term used by the model. If `use_bias = False` returns 0."""
        pass

Для оптимизации должен быть использован алгоритм Adam. Оптимизация начинается с нулевой начальной точки θ = {0}ni = 0.

При решении задачи запрещено использовать любые модули, кроме numpy и scipy.special.

Формат выходных данных

Код решения должен содержать импортируемые модули, определение и реализацию класса.