Base Model#
- class glasses_detector.components.base_model.BaseGlassesModel(task: str, kind: str, size: str, weights: bool | str | None = False, device: str | device | None = None)[source]#
Bases: PredInterface
Base class for all glasses models.
Base class with common functionality, i.e., prediction and weight-loading methods, that should be inherited by all glasses models. Child classes must implement the create_model() method, which should return the model architecture based on model_info, a dictionary containing the model name and the release version. The dictionary depends on the model's kind and size, both of which are used when creating an instance. An instance can also be created by providing a custom model instead of a predefined one; see from_model().
Note
When weights is True, the URL to download the weights from will be constructed automatically based on model_info. According to load_state_dict_from_url(), the corresponding weights are first checked for in the hub cache, which by default is ~/.cache/torch/hub/checkpoints; if they are not there, the weights will be downloaded there and then loaded.
Important
To train the actual model parameters, i.e., the model of type torch.nn.Module, retrieve it using the model attribute.
- Parameters:
task (str) – The task the model is built for. Used when automatically constructing the URL to download the weights from.
kind (str) – The kind of the model. Used to access model_info.
size (str) – The size of the model. Used to access model_info.
weights (bool | str | None, optional) – Whether to load the pre-trained weights from a URL (or a local file if they are already downloaded) inferred from the model's task, kind, and size. If a string is provided, it will be used as a path or a URL (determined automatically) to the model weights. Defaults to False.
device (str | torch.device | None, optional) – Device to cast the model to (once it is loaded). If specified as None, it will be automatically checked whether CUDA or MPS is supported. Defaults to None.
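Example (a minimal sketch; GlassesClassifier stands in for a concrete child class, and the kind value is an assumption):
>>> from glasses_detector import GlassesClassifier
>>> # weights=True downloads the release weights into the hub cache
>>> model = GlassesClassifier(kind="anyglasses", size="medium", weights=True)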
- BASE_WEIGHTS_URL: ClassVar[str] = 'https://github.com/mantasu/glasses-detector/releases/download'#
The base URL to download the weights from.
- ALLOWED_SIZE_ALIASES: ClassVar[set[str]]#
The set of allowed sizes and their aliases for the model. These are used to convert an alias to a standard size when accessing model_info. Available aliases are:

Small: small, little, s
Medium: medium, normal, m
Large: large, big, l
Note
Any case is acceptable, for example, Small can be specified as "small", "S", "SMALL", "Little", etc.
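For instance, a size alias works anywhere a standard size is accepted (a sketch, reusing the hypothetical concrete subclass from above):
>>> GlassesClassifier(size="l")    # same as size="large"
>>> GlassesClassifier(size="BIG")  # any case is acceptable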
- DEFAULT_SIZE_MAP: ClassVar[dict[str, dict[str, str]]]#
The default size map from the size of the model to the model info dictionary, which contains the name of the architecture and the version of the weights release. This is just a helper component for DEFAULT_KIND_MAP because each default kind has the same set of default models.
Example:
>>> [info["name"] for info in DEFAULT_SIZE_MAP.values()] # list of all the available architectures
- DEFAULT_KIND_MAP: ClassVar[dict[str, dict[str, dict[str, str]]]]#
The default map from model kind and size to the model info dictionary. The model info is used to construct the URL to download the weights from. The nested dictionary has 3 levels, which are expected to be as follows:

1. kind – the kind of the model
2. size – the size of the model
3. info – the model info, i.e., "name" and "version"

Example:
>>> DEFAULT_KIND_MAP["<kind>"]["<size>"]
{'name': '<architecture-name>', 'version': '<release-version>'}
- property model_info: dict[str, str]#
Model info property.
This contains the information about the model used (e.g., architecture and weights). By default, it should have 2 fields: "name" and "version", both of which are used when initializing the architecture and looking for pretrained weights (see load_weights()).
Note
This is the default implementation, which accesses DEFAULT_KIND_MAP based on kind and size. Child classes can override either DEFAULT_KIND_MAP or this property itself for a custom dictionary.
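Example (a sketch; the returned values depend on kind and size):
>>> model.model_info
{'name': '<architecture-name>', 'version': '<release-version>'}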
- abstract static create_model(model_name: str) → Module [source]#
Creates the model architecture.
Takes the name of the model architecture and returns the corresponding model instance.
- Parameters:
model_name (str) – The name of the model architecture to create. For available architectures, see the class description (Size Information table) or DEFAULT_SIZE_MAP.
- Returns:
The model instance with the corresponding architecture.
- Return type:
torch.nn.Module
- Raises:
ValueError – If the architecture for the model name is not implemented or is not valid.
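For example, a child class could implement it along these lines (a minimal sketch; the architecture name and layers are hypothetical):
>>> from torch import nn
>>> class MyGlassesModel(BaseGlassesModel):
...     @staticmethod
...     def create_model(model_name: str) -> nn.Module:
...         if model_name == "my-tiny-net":  # hypothetical name
...             return nn.Sequential(nn.Flatten(), nn.LazyLinear(1))
...         raise ValueError(f"{model_name} is not implemented")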
- classmethod from_model(model: Module, **kwargs) → Self [source]#
Creates a glasses model from a custom torch.nn.Module.
Creates a glasses model wrapper for a custom provided torch.nn.Module, instead of creating a predefined one based on kind and size.
Note
Make sure the provided model's forward method behaves as expected, i.e., returns the prediction in the expected format for compatibility with predict().
Warning
The model_info property will not be useful, as it would return an empty dictionary for a custom-specified kind and size (if specified at all).
- Parameters:
model (torch.nn.Module) – The custom model that will be assigned as model.
**kwargs – Keyword arguments to pass to the constructor; check the documentation of this class for more details. If task, kind, and size are not provided, they will be set to "custom". If the model architecture is custom, you may still specify the path to the pretrained weights via the weights argument. Finally, if device is not provided, the model will remain on its current device.
- Returns:
The glasses model wrapper of the same class type from which this method was called for the provided custom model.
- Return type:
Self
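Example (a sketch; the custom module is purely illustrative, and GlassesClassifier stands in for any concrete child class):
>>> from torch import nn
>>> custom = nn.Sequential(nn.Flatten(), nn.LazyLinear(1))
>>> model = GlassesClassifier.from_model(custom)  # task/kind/size default to "custom"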
- predict(image: FilePath | Image | ndarray, format: Callable[[Any], Default] | Callable[[Image, Any], Default] = lambda x: ..., input_size: tuple[int, int] | None = (256, 256)) → Default [source]#
- predict(image: Collection[FilePath | Image | ndarray], format: Callable[[Any], Default] | Callable[[Image, Any], Default] = lambda x: ..., input_size: tuple[int, int] | None = (256, 256)) → list[Default]
Predicts based on the model specified by the child class.
Takes a path or multiple paths to image files or the loaded images themselves and outputs a formatted prediction generated by the child class.
Note
This method expects that forward() always returns an Iterable of any type of predictions (typically, they would be of type Tensor), even if there is only one prediction. Likewise, a Tensor representing a batch of loaded images is passed to forward() when generating those predictions.
Important
If the image is provided as numpy.ndarray, make sure the last dimension specifies the channels, i.e., the last dimension should be of size 1 or 3. If it is anything else, e.g., if the shape is (3, H, W), where W is neither 1 nor 3, it would be interpreted as 3 grayscale images.
See also
- Parameters:
image (FilePath | PIL.Image.Image | numpy.ndarray | Collection[FilePath | PIL.Image.Image | numpy.ndarray]) – The path(-s) to the image(-s) to generate the prediction for, or the image(-s) itself represented as Image.Image or as a numpy.ndarray. Note that the image should have values between 0 and 255 and be of RGB format. Normalization is not needed, as the channels will be automatically normalized before passing through the network.
format (Callable[[Any], Default] | Callable[[PIL.Image.Image, Any], Default], optional) – Format callback. This is a custom function that takes as input either the predicted elements from the iterable output of forward() (elements are usually of type Tensor), or the original image and its prediction (which of the two functions it is will be determined automatically), and outputs a formatted prediction of type Default. Defaults to lambda x: str(x).
input_size (tuple[int, int] | None, optional) – The size (width, height), or (W, H), to resize the image to before passing it through the network. If None, the image will not be resized. It is recommended to resize it to the size the model was trained on, which by default is (256, 256). Defaults to (256, 256).
- Returns:
The formatted prediction, or a list of formatted predictions if multiple images were provided.
- Return type:
Default | list[Default]
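Example (a sketch; the file paths are hypothetical, and model is an instance of any concrete child class):
>>> pred = model.predict("path/to/image.jpg")
>>> preds = model.predict(["img1.jpg", "img2.jpg"], input_size=(256, 256))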
- forward(x: Tensor) → Iterable[Any] [source]#
Performs forward pass.
Calls the forward method of the inner model, passing a batch of images as its first argument.
Tip
If this method is used during inference, make sure to set the model to evaluation mode and enable inference_mode, e.g., via the eval_infer_mode decorator/context manager.
Note
The default predict() that uses this method assumes the input is a batch of images of type Tensor and the output can be anything that is Iterable, e.g., a Tensor.
Warning
In the case of a custom inner model (e.g., if the instance was created using from_model()) that does not accept a tensor representing a batch of images as its first argument, this method will not work, in which case predict() will also not work.
See also
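For instance, a raw forward pass could look as follows (a sketch; torch.inference_mode() stands in for the eval_infer_mode helper, and the input is a dummy batch):
>>> import torch
>>> _ = model.model.eval()  # evaluation mode, as the tip advises
>>> with torch.inference_mode():
...     preds = model.forward(torch.rand(4, 3, 256, 256))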
- load_weights(path_or_url: str | bool = True)[source]#
Loads inner model weights.
Takes a path or a URL to the weights file, or True to construct the URL automatically based on model_info, and loads the weights into model.
Note
If the weights are already downloaded, they will be loaded from the hub cache, which by default is ~/.cache/torch/hub/checkpoints.
Warning
If the fields in model_info are not recognized, e.g., because an unrecognized kind or size was provided or the instance was initialized with from_model(), this method will not be able to construct the URL (if path_or_url is True) and will raise a warning.
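Example (a sketch; the local path is hypothetical):
>>> model.load_weights(True)  # construct the URL from model_info
>>> model.load_weights("checkpoints/weights.pth")  # or load local weights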