# Segmentation

### Preamble

To get started, install glasses with `pip`:

```bash
pip install git+https://github.com/FrancescoSaverioZuppichini/glasses
```

## Segmentation

Segmentation models can be found in `glasses.models.segmentation`. Using one takes only a few lines:

```python
import torch
from glasses.models.segmentation import UNet

x = torch.randn((1,1, 384, 384))
model = UNet(n_classes=2)
out = model(x)
out.shape
```

    torch.Size([1, 2, 384, 384])

### Change Encoder

In glasses you can change the encoder of each segmentation model on the fly. Each segmentation model inherits from `SegmentationModule` and expects an `Encoder` instance. All glasses classification models are composed of an **encoder** and a **head**, so changing the encoder is as easy as passing it as a parameter:

```python
from glasses.models.classification.resnet import ResNetEncoder

x = torch.randn((1,1, 384, 384))
model = UNet(encoder=ResNetEncoder, n_classes=2)
out = model(x)
out.shape
```

    torch.Size([1, 2, 192, 192])

Notice that the output has half the resolution of the standard U-Net output. This is because ResNet has 4 stages, i.e. four layers that halve the input resolution. To match the input shape we have to upsample one more time. In other words, we need to add one more width to the decoder.

Similar to classification models, each segmentation model is composed of three sub-modules: **encoder**, **decoder** and **head**. So, we can easily compose them to create any custom model.

```python
from glasses.models.segmentation.unet import UNetDecoder
from functools import partial

x = torch.randn((1,1, 384, 384))
model = UNet(encoder=ResNetEncoder,
             decoder=partial(UNetDecoder, widths=[512, 256, 128, 64, 32]),
             n_classes=2)
out = model(x)
out.shape
```

    torch.Size([1, 2, 384, 384])

We used `partial` to change the `widths` parameter of the decoder so that it matches the encoder's stages.

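
The resolution mismatch above comes down to simple arithmetic, which the following sketch reproduces in plain Python. It is not glasses code: `output_resolution` is a hypothetical helper, and it assumes that every encoder downsampling step halves the resolution, that every decoder width corresponds to one 2x upsampling block, that the ResNet encoder reduces the input by 32x overall (stem included), and that the default decoder has four widths.

```python
from typing import Sequence


def output_resolution(input_size: int, n_halvings: int, decoder_widths: Sequence[int]) -> int:
    """Spatial size after an encoder that halves the input `n_halvings` times
    and a decoder that doubles it once per entry in `decoder_widths`."""
    bottleneck = input_size // (2 ** n_halvings)
    return bottleneck * (2 ** len(decoder_widths))


# Assumption: the ResNet encoder halves the resolution five times in total (32x).
RESNET_HALVINGS = 5
default_widths = [256, 128, 64, 32]        # assumed default: four upsampling blocks
extended_widths = [512, 256, 128, 64, 32]  # the widths passed via `partial`: five blocks

print(output_resolution(384, RESNET_HALVINGS, default_widths))   # 192
print(output_resolution(384, RESNET_HALVINGS, extended_widths))  # 384
```

Under these assumptions, four upsampling blocks stop one step short of the input size, while five blocks restore it, which agrees with the shapes printed above.
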
Each segmentation model has a `.from_encoder` method that takes a model as input and automatically builds the segmentation model with that model's encoder.

```python
from glasses.models import AutoModel

x = torch.randn((1,1, 384, 384))
model = UNet.from_encoder(model=partial(AutoModel.from_name, 'efficientnet_b1'), n_classes=2)
out = model(x)
out.shape
```

    torch.Size([1, 2, 192, 192])

### Pretrained encoders

We can just as easily pass a pretrained network using `AutoModel`. Note that models pretrained on ImageNet expect an input with 3 channels.

```python
from glasses.models import AutoModel

x = torch.randn((1,3, 384, 384))
model = UNet.from_encoder(model=partial(AutoModel.from_pretrained, 'efficientnet_b1'),
                          in_channels=3, n_classes=2)
out = model(x)
out.shape
```

    INFO:root:Loaded efficientnet_b1 pretrained weights.

    torch.Size([1, 2, 192, 192])

What if we would like to use an input with a number of channels other than 3? We need to replace the stem.

**I [am working](https://github.com/FrancescoSaverioZuppichini/glasses/issues/179) on a way to load only a specific subset of weights, so we can directly create a model with a different stem but with all the rest of the weights pretrained**

```python
from glasses.models.classification.resnet import ResNetStem

def get_encoder(*args, **kwargs):
    model = AutoModel.from_pretrained('resnet50')
    # replace the stem with one that accepts a single input channel
    model.encoder.stem = ResNetStem(1, model.encoder.start_features)
    return model.encoder

x = torch.randn((1,1, 384, 384))
model = UNet(encoder=get_encoder, in_channels=1, n_classes=2)
out = model(x)
out.shape
```

    INFO:root:Loaded resnet50 pretrained weights.

    torch.Size([1, 2, 192, 192])

The API is shared by all segmentation models.
For example, we can also import PFPN ([Panoptic Feature Pyramid Networks](https://arxiv.org/pdf/1901.02446.pdf)) and keep the same code:

```python
from glasses.models import AutoModel
from glasses.models.segmentation import PFPN

x = torch.randn((1,1, 384, 384))
model = PFPN.from_encoder(model=partial(AutoModel.from_name, 'efficientnet_b1'), n_classes=2)
out = model(x)
out.shape
```

    torch.Size([1, 2, 384, 384])

In this case the output resolution always matches the input resolution. This is due to how PFPN works.
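
To see why the PFPN output matches the input, it can help to trace the resolutions through the semantic branch as the Panoptic FPN paper describes it: every pyramid level is upsampled in 2x steps to 1/4 scale, the levels are summed, and the result is upsampled 4x. The sketch below does only that arithmetic in plain Python. The helper names and the pyramid strides `(4, 8, 16, 32)` are illustrative assumptions taken from the paper's description, and the glasses implementation may differ in its details.

```python
from typing import Sequence


def upsampling_stages(level_stride: int, target_stride: int = 4) -> int:
    """Number of 2x upsampling stages needed to bring a level to the target stride."""
    ratio = level_stride // target_stride
    return ratio.bit_length() - 1  # exact log2 for powers of two


def pfpn_output_size(input_size: int,
                     level_strides: Sequence[int] = (4, 8, 16, 32),
                     target_stride: int = 4) -> int:
    """Output size of a Panoptic-FPN-style semantic branch (assumed layout)."""
    merged_sizes = set()
    for stride in level_strides:
        size = input_size // stride                             # resolution of this level
        size *= 2 ** upsampling_stages(stride, target_stride)   # bring it to 1/4 scale
        merged_sizes.add(size)
    # All levels must agree before they can be summed element-wise.
    assert len(merged_sizes) == 1
    return merged_sizes.pop() * target_stride                   # final 4x upsampling


print(pfpn_output_size(384))  # 384
```

Because the final upsampling factor equals the common 1/4 scale of the merged features, the output size in this sketch does not depend on how deep the encoder is, only on the input size.
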