Base Model

class glasses_detector.components.base_model.BaseGlassesModel(task: str, kind: str, size: str, weights: bool | str | None = False, device: str | device | None = None)

Bases: PredInterface

Base class for all glasses models.

Base class with common functionality, i.e., prediction and weight loading methods, that should be inherited by all glasses models. Child classes must implement the create_model() method, which should return the model architecture based on model_info, a dictionary containing the model name and the release version. The dictionary depends on the model’s kind and size, both of which are used when creating an instance. An instance can also be created from a custom model instead of a predefined one; see from_model().

Note

When weights is True, the URL to download the weights from is constructed automatically based on model_info. As described for load_state_dict_from_url(), the hub cache (by default ~/.cache/torch/hub/checkpoints) is checked first; if the weights are not already there, they are downloaded to that directory and then loaded.
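The cache check described in the note can be sketched without importing torch. This is only an illustration: the helper names (default_hub_dir, cached_checkpoint_path, needs_download) are hypothetical and not part of glasses_detector or torch, and the path logic assumes the default torch.hub layout, where the checkpoint file name is the last component of the URL path.

```python
import os
from typing import Optional
from urllib.parse import urlparse


def default_hub_dir() -> str:
    """Assumed default hub dir: $TORCH_HOME/hub, else $XDG_CACHE_HOME/torch/hub,
    else ~/.cache/torch/hub (hypothetical helper)."""
    cache_root = os.getenv("XDG_CACHE_HOME", "~/.cache")
    torch_home = os.getenv("TORCH_HOME", os.path.join(cache_root, "torch"))
    return os.path.join(os.path.expanduser(torch_home), "hub")


def cached_checkpoint_path(url: str, hub_dir: Optional[str] = None) -> str:
    """Return where a checkpoint for `url` would be cached (hypothetical helper)."""
    hub_dir = hub_dir or default_hub_dir()
    # The cached file is named after the last component of the URL path.
    filename = os.path.basename(urlparse(url).path)
    return os.path.join(hub_dir, "checkpoints", filename)


def needs_download(url: str, hub_dir: Optional[str] = None) -> bool:
    """True if no cached file exists yet, i.e., a download would be required."""
    return not os.path.exists(cached_checkpoint_path(url, hub_dir))
```

With such a helper, a caller could check ahead of time whether constructing a model with weights=True is likely to trigger a download.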

Important

To train the actual model parameters, i.e., the model of type torch.nn.Module, retrieve it using the model attribute.

Parameters:
  • task (str) – The task the model is built for. Used when automatically constructing URL to download the weights from.

  • kind (str) – The kind of the model. Used to access model_info.

  • size (str) – The size of the model. Used to access model_info.

  • weights (bool | str | None, optional) – Whether to load pre-trained weights. If True, the weights are loaded from a URL (or from a local file if they have already been downloaded) that is inferred from the model’s task, kind, and size. If a string is provided, it is used as a path or a URL (determined automatically) to the model weights. Defaults to False.

  • device (str | torch.device | None, optional) – Device to cast the model to (once it is loaded). If specified as None, it will be automatically checked if CUDA or MPS is supported. Defaults to None.

BASE_WEIGHTS_URL: ClassVar[str] = 'https://github.com/mantasu/glasses-detector/releases/download'

The base URL to download the weights from.

Type:

ClassVar[str]

ALLOWED_SIZE_ALIASES: ClassVar[set[str]]

The set of allowed sizes and their aliases for the model. These are used to convert an alias to a standard size when accessing model_info. Available aliases are:

  • Small – small, little, s

  • Medium – medium, normal, m

  • Large – large, big, l

Note

Any case is acceptable, for example, Small can be specified as "small", "S", "SMALL", "Little", etc.

Type:

ClassVar[set[str]]
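A minimal sketch of case-insensitive alias resolution, assuming a mapping from each standard size to its aliases. The names SIZE_ALIASES and normalize_size are hypothetical, and returning an unrecognized size unchanged is this sketch's choice rather than documented behavior.

```python
# Hypothetical mapping built from the aliases listed above.
SIZE_ALIASES = {
    "small": {"small", "little", "s"},
    "medium": {"medium", "normal", "m"},
    "large": {"large", "big", "l"},
}


def normalize_size(size: str) -> str:
    """Map any alias, in any letter case, to its standard size name."""
    key = size.lower()
    for standard, aliases in SIZE_ALIASES.items():
        if key in aliases:
            return standard
    # Assumption: unknown sizes (e.g., "custom") are passed through as given.
    return size
```

For example, normalize_size("Little") and normalize_size("S") both resolve to "small".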

DEFAULT_SIZE_MAP: ClassVar[dict[str, dict[str, str]]]

The default size map from the size of the model to the model info dictionary which contains the name of the architecture and the version of the weights release. This is just a helper component for DEFAULT_KIND_MAP because each default kind has the same set of default models.

Example:

>>> [info["name"] for info in DEFAULT_SIZE_MAP.values()]
# list of all the available architectures

Type:

ClassVar[dict[str, dict[str, str]]]

DEFAULT_KIND_MAP: ClassVar[dict[str, dict[str, dict[str, str]]]]

The default map from model kind and size to the model info dictionary. The model info is used to construct the URL to download the weights from. The nested dictionary has 3 levels which are expected to be as follows:

  1. kind - the kind of the model

  2. size - the size of the model

  3. info - the model info, i.e., "name" and "version"

Example:

>>> DEFAULT_KIND_MAP["<kind>"]["<size>"]
{'name': '<architecture-name>', 'version': '<release-version>'}

Type:

ClassVar[dict[str, dict[str, dict[str, str]]]]
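To illustrate how a size map can serve as a helper for the kind map, the sketch below builds a three-level dictionary by reusing one size map for every kind. All kind names, architecture names, and versions here are placeholders invented for the example; they are not the library's actual values.

```python
# Placeholder size map: size -> model info ("name" and "version").
SIZE_MAP = {
    "small": {"name": "arch_small", "version": "v0"},
    "medium": {"name": "arch_medium", "version": "v0"},
    "large": {"name": "arch_large", "version": "v0"},
}

# Placeholder kinds; each one shares the same set of default models.
KINDS = ("kind_a", "kind_b")

# Three levels: kind -> size -> info. The info dicts are copied so that
# editing one kind's entry does not silently change another's.
KIND_MAP = {
    kind: {size: dict(info) for size, info in SIZE_MAP.items()}
    for kind in KINDS
}

# List the architecture names available through the size map.
architecture_names = [info["name"] for info in SIZE_MAP.values()]
```

Copying the inner dictionaries is a design choice of this sketch; sharing them would also satisfy the three-level structure described above.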

property model_info: dict[str, str]

Model info property.

This contains the information about the model used (e.g., architecture and weights). By default, it should have 2 fields: "name" and "version", both of which are used when initializing the architecture and looking for pretrained weights (see load_weights()).

Note

This is the default implementation which accesses DEFAULT_KIND_MAP based on kind and size. Child classes can override either DEFAULT_KIND_MAP or this property itself for a custom dictionary.

Returns:

The model info dictionary with 2 fields, "name" and "version", which are used to construct the model architecture and to download the pretrained model weights, if present.

Return type:

dict[str, str]
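The two customization routes mentioned in the note (overriding the map or overriding the property) might look roughly like this. ToyBase and its children are stand-ins written for this example, not the real BaseGlassesModel, and the empty-dictionary fallback for unknown kind/size pairs is an assumption based on the warning under from_model().

```python
class ToyBase:
    """Stand-in for the base class; only the model_info lookup is sketched."""

    DEFAULT_KIND_MAP = {
        "kind_a": {"small": {"name": "arch_small", "version": "v0"}},
    }

    def __init__(self, kind: str, size: str):
        self.kind = kind
        self.size = size

    @property
    def model_info(self) -> dict:
        # Assumed fallback: an empty dict when kind or size is not recognized.
        return self.DEFAULT_KIND_MAP.get(self.kind, {}).get(self.size, {})


class ChildWithOwnMap(ToyBase):
    # Route 1: replace the class-level map; the inherited property uses it.
    DEFAULT_KIND_MAP = {
        "special": {"small": {"name": "tiny_net", "version": "v2"}},
    }


class ChildWithOwnProperty(ToyBase):
    # Route 2: override the property itself for fully custom info.
    @property
    def model_info(self) -> dict:
        return {"name": f"{self.kind}_{self.size}_net", "version": "v3"}
```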

abstract static create_model(self, model_name: str) → Module

Creates the model architecture.

Takes the name of the model architecture and returns the corresponding model instance.

Parameters:

model_name (str) – The name of the model architecture to create. For available architectures, see the class description (Size Information table) or DEFAULT_SIZE_MAP.

Returns:

The model instance with the corresponding architecture.

Return type:

torch.nn.Module

Raises:

ValueError – If the architecture for the model name is not implemented or is not valid.
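One common way to satisfy this contract is a name-to-factory registry that raises ValueError for unknown names. The sketch below is torch-free so that it runs anywhere: TinyNet and WideNet are placeholder classes standing in for torch.nn.Module subclasses, and ARCHITECTURES is a hypothetical registry, not something the library defines.

```python
class TinyNet:
    """Placeholder for a small architecture (would subclass torch.nn.Module)."""


class WideNet:
    """Placeholder for a larger architecture (would subclass torch.nn.Module)."""


# Hypothetical registry: architecture name -> zero-argument factory.
ARCHITECTURES = {
    "tiny_net": TinyNet,
    "wide_net": WideNet,
}


def create_model(model_name: str):
    """Return a new model instance for a known name, else raise ValueError."""
    try:
        factory = ARCHITECTURES[model_name]
    except KeyError:
        known = ", ".join(sorted(ARCHITECTURES))
        raise ValueError(
            f"Unknown architecture {model_name!r}; expected one of: {known}"
        ) from None
    return factory()
```

In a real child class, the same body would sit inside the overridden create_model() and return torch.nn.Module instances.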

classmethod from_model(model: Module, **kwargs) → Self

Creates a glasses model from a custom torch.nn.Module.

Creates a glasses model wrapper for a custom provided torch.nn.Module, instead of creating a predefined one based on kind and size.

Note

Make sure the provided model’s forward method behaves as expected, i.e., returns the prediction in expected format for compatibility with predict().

Warning

The model_info property will not be useful, as it returns an empty dictionary for a custom kind and size (if they are specified at all).

Parameters:
  • model (torch.nn.Module) – The custom model that will be assigned as model.

  • **kwargs – Keyword arguments to pass to the constructor; check the documentation of this class for more details. If task, kind, and size are not provided, they will be set to "custom". If the model architecture is custom, you may still specify the path to the pretrained weights via the weights argument. Finally, if device is not provided, the model will remain on the device it is already on.

Returns:

The glasses model wrapper of the same class type from which this method was called for the provided custom model.

Return type:

Self
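The defaulting behavior described for **kwargs can be sketched with a toy wrapper. ToyGlassesModel is a stand-in written for this example and does not reproduce the real constructor (which also handles weights and devices); only the "custom" defaults and the class-preserving return are illustrated.

```python
class ToyGlassesModel:
    """Stand-in wrapper; the real class does considerably more in __init__."""

    def __init__(self, task, kind, size, weights=False, device=None):
        self.task, self.kind, self.size = task, kind, size
        self.weights = weights
        self.device = device
        self.model = None  # the inner model, assigned by from_model() here

    @classmethod
    def from_model(cls, model, **kwargs):
        # Fill in "custom" for any of task/kind/size the caller left out.
        for key in ("task", "kind", "size"):
            kwargs.setdefault(key, "custom")
        instance = cls(**kwargs)  # cls keeps the caller's (sub)class type
        instance.model = model
        return instance


class ToyChild(ToyGlassesModel):
    """A subclass, to show that from_model() returns the calling class."""
```

Calling ToyChild.from_model(some_model, kind="mine") would yield a ToyChild whose task and size are "custom" and whose kind is "mine".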

predict(image: FilePath | Image | ndarray, format: Callable[[Any], Default] | Callable[[Image, Any], Default] = lambda x: ..., input_size: tuple[int, int] | None = (256, 256)) → Default
predict(image: Collection[FilePath | Image | ndarray], format: Callable[[Any], Default] | Callable[[Image, Any], Default] = lambda x: ..., input_size: tuple[int, int] | None = (256, 256)) → list[Default]

Predicts based on the model specified by the child class.

Takes a path or multiple paths to image files or the loaded images themselves and outputs a formatted prediction generated by the child class.

Note

This method expects that forward() always returns an Iterable of any type of predictions (typically, they would be of type Tensor), even if there is only one prediction. Likewise, a Tensor representing a batch of loaded images is passed to forward() when generating those predictions.

Important

If the image is provided as numpy.ndarray, make sure the last dimension specifies the channels, i.e., last dimension should be of size 1 or 3. If it is anything else, e.g., if the shape is (3, H, W), where W is neither 1 nor 3, this would be interpreted as 3 grayscale images.
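The rule for 3-dimensional arrays can be made concrete with a small shape check. This sketch works on shape tuples only (no numpy needed), covers just the 3-D case discussed above, and uses a hypothetical helper name; how the library treats other ranks is not stated here and is not modeled.

```python
def interpret_3d_shape(shape):
    """Describe how a 3-D array shape would be read under the rule above."""
    if len(shape) != 3:
        raise ValueError("this sketch only covers 3-D shapes")
    if shape[-1] in (1, 3):
        # Last dimension looks like channels: a single (H, W, C) image.
        height, width, channels = shape
        return {"images": 1, "height": height, "width": width, "channels": channels}
    # Otherwise the first dimension counts grayscale images: (N, H, W).
    count, height, width = shape
    return {"images": count, "height": height, "width": width, "channels": 1}
```

Note the ambiguity this implies: a channels-first array of shape (3, 64, 3) would be read as one 3×64 RGB image, so channels-first inputs should be transposed before being passed in.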

See also

forward()

Parameters:
  • image (FilePath | PIL.Image.Image | numpy.ndarray | Collection[FilePath | PIL.Image.Image | numpy.ndarray]) – The path(-s) to the image to generate the prediction for or the image(-s) itself represented as Image.Image or as a numpy.ndarray. Note that the image should have values between 0 and 255 and be of RGB format. Normalization is not needed as the channels will be automatically normalized before passing through the network.

  • format (Callable[[Any], Default] | Callable[[PIL.Image.Image, Any], Default], optional) – Format callback. This is a custom function that takes either the predicted elements from the iterable output of forward() (elements are usually of type Tensor) as input, or the original image and its prediction as inputs (which of the two it is will be determined automatically), and outputs a formatted prediction of type Default. Defaults to lambda x: str(x).

  • input_size (tuple[int, int] | None, optional) – The size (width, height), or (W, H), to resize the image to before passing it through the network. If None, the image will not be resized. It is recommended to resize it to the size the model was trained on, which by default is (256, 256). Defaults to (256, 256).

Returns:

The formatted prediction or a list of formatted predictions if multiple images were provided.

Return type:

Default | list[Default]
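The overall flow (wrap a single input into a batch, run forward, format each prediction, unwrap again) can be sketched as follows. Everything here is a toy: file-path strings stand in for images, toy_forward stands in for forward(), and detecting the callback's form by counting its parameters with inspect.signature is one plausible approach, not necessarily how the library does it.

```python
import inspect


def apply_format(format_fn, image, prediction):
    """Call format_fn as f(prediction) or f(image, prediction), by arity."""
    n_params = len(inspect.signature(format_fn).parameters)
    if n_params >= 2:
        return format_fn(image, prediction)
    return format_fn(prediction)


def toy_forward(batch):
    """Stand-in for forward(): one prediction per item (here, the path length)."""
    return [len(path) for path in batch]


def toy_predict(image, format_fn=lambda x: str(x)):
    """Return one formatted prediction, or a list if a collection was given."""
    is_single = not isinstance(image, (list, tuple))
    batch = [image] if is_single else list(image)
    predictions = toy_forward(batch)
    formatted = [
        apply_format(format_fn, img, pred)
        for img, pred in zip(batch, predictions)
    ]
    return formatted[0] if is_single else formatted
```

A one-parameter callback such as lambda x: str(x) receives only the prediction, while a two-parameter callback such as lambda img, pred: (img, pred) also receives the original input.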

forward(x: Tensor) → Iterable[Any]

Performs forward pass.

Calls the forward method of the inner model, by passing a batch of images as its first argument.

Tip

If this method is used during inference, make sure to set the model to evaluation mode and enable inference_mode, e.g., via eval_infer_mode decorator/context manager.

Note

The default predict() that uses this method assumes an input is a batch of images of type Tensor and the output can be anything that is Iterable, e.g., a Tensor.

Warning

In case of a custom inner model (e.g., if the instance was created using from_model()) that does not accept a tensor representing a batch of images as its first argument, this method will not work, in which case predict() will also not work.

See also

predict()

Parameters:

x (Tensor) – A batch of images - a Tensor of shape (N, C, H, W) with normalized pixel values between 0 and 1.

Returns:

An iterable of predictions (one for each input). Usually, it is a Tensor with the first dimension of size N which is the batch size of the original input.

Return type:

Iterable[Any]

load_weights(path_or_url: str | bool = True)

Loads inner model weights.

Takes a path or a URL to the weights file, or True to construct the URL automatically based on model_info, and loads the weights into model.

Note

If the weights are already downloaded, they will be loaded from the hub cache, which by default is ~/.cache/torch/hub/checkpoints.

Warning

If the fields in model_info are not recognized, e.g., by providing an unrecognized kind or size or by initializing with from_model(), this method will not be able to construct the URL (if path_or_url is True) and will raise a warning.

Parameters:

path_or_url (str | bool, optional) – The path or the URL (it will be inferred automatically) to the model weights (.pth file). It can also be bool, in which case True indicates to construct URL for the pre-trained weights and False does nothing. Defaults to True.
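The decision logic for path_or_url might be sketched as below. The function resolve_weights_source is hypothetical, and so is the file layout used for the automatically constructed URL: only BASE_WEIGHTS_URL comes from the class description, while the "<version>/<task>_<kind>_<name>.pth" pattern is an assumption made for illustration.

```python
import warnings
from urllib.parse import urlparse

BASE_WEIGHTS_URL = "https://github.com/mantasu/glasses-detector/releases/download"


def resolve_weights_source(path_or_url, model_info, task, kind):
    """Return ("url" | "file", location), or None if nothing should be loaded."""
    if path_or_url is False:
        return None  # False means: do nothing
    if path_or_url is True:
        if not {"name", "version"} <= set(model_info):
            # Unrecognized kind/size or a custom model: no URL can be built.
            warnings.warn("model_info lacks 'name'/'version'; weights not loaded")
            return None
        # Hypothetical layout, not confirmed by the documentation above.
        filename = f"{task}_{kind}_{model_info['name']}.pth"
        return "url", f"{BASE_WEIGHTS_URL}/{model_info['version']}/{filename}"
    # A string: treat http(s) schemes as URLs, anything else as a local path.
    is_url = urlparse(path_or_url).scheme in ("http", "https")
    return ("url" if is_url else "file"), path_or_url
```

A real implementation would then pass a URL to load_state_dict_from_url() and a local path to torch.load() before loading the state dictionary into model.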