<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet href="https://konrad.earth/feed_style.xsl" type="text/xsl"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
    <tabi:metadata xmlns:tabi="https://github.com/welpo/tabi">
        <tabi:base_url>https:&#x2F;&#x2F;konrad.earth</tabi:base_url>
        <tabi:separator>
            •
        </tabi:separator>
        <tabi:about_feeds>This is a web feed, also known as an Atom feed. Subscribe by copying the URL from the address bar into your newsreader. Visit About Feeds to learn more and get started. It&#x27;s free.</tabi:about_feeds>
        <tabi:visit_the_site>Visit website</tabi:visit_the_site>
        <tabi:recent_posts>Recent posts</tabi:recent_posts>
        <tabi:last_updated_on>Updated on $DATE</tabi:last_updated_on>
        <tabi:default_theme></tabi:default_theme>
        <tabi:post_listing_date>date</tabi:post_listing_date>
        <tabi:current_section>Paper</tabi:current_section>
    </tabi:metadata><title>konrad.earth - Paper</title>
        <subtitle>Konrad Heidler</subtitle>
    <link href="https://konrad.earth/tags/paper/atom.xml" rel="self" type="application/atom+xml"/>
    <link href="https://konrad.earth/tags/paper/" rel="alternate" type="text/html"/>
    <generator uri="https://www.getzola.org/">Zola</generator><updated>2024-06-03T00:00:00+00:00</updated><id>https://konrad.earth/tags/paper/atom.xml</id><entry xml:lang="en">
        <title>PixelDINO: Semi-Supervised Semantic Segmentation for Detecting Permafrost Disturbances</title>
        <published>2024-06-03T00:00:00+00:00</published>
        <updated>2024-06-03T00:00:00+00:00</updated>
        <author>
            <name>Konrad Heidler</name>
        </author>
        <link rel="alternate" href="https://konrad.earth/blog/pixeldino/" type="text/html"/>
        <id>https://konrad.earth/blog/pixeldino/</id>
        
            <content type="html">&lt;p&gt;In sync with the changing climate, permafrost is undergoing rapid transformations.
As temperatures rise, the frozen ground starts to thaw, which has various consequences.
Not only does permafrost thawing pose risks to local infrastructure such as roads and buildings,
but it is also tightly connected to the global climate system, as thawing can release stored carbon into the atmosphere.&lt;&#x2F;p&gt;
&lt;figure class=&quot;w70&quot;&gt;
    
    &lt;img src=permafrost_globe.png#center &#x2F;&gt;
    &lt;figcaption&gt;
      
      &lt;p&gt;
         Distribution of permafrost in the northern hemisphere. (Visualisation based on &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;nsidc.org&#x2F;data&#x2F;ggd318&#x2F;versions&#x2F;2&quot;&gt;data from NSIDC&lt;&#x2F;a&gt;) 
        
        
        
      &lt;&#x2F;p&gt;
    &lt;&#x2F;figcaption&gt;
&lt;&#x2F;figure&gt;
&lt;p&gt;Covering more than 10% of the Earth’s land surface, permafrost areas are often remote and sparsely populated, making them difficult to monitor through traditional means. In-situ measurements are limited to specific locations and times, usually when expeditions visit these sites or when local sensors collect data.
To overcome these limitations, remote sensing offers a more efficient alternative for monitoring permafrost.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;permafrost-remote-sensing&quot;&gt;Permafrost Remote Sensing&lt;&#x2F;h2&gt;
&lt;figure &gt;
    
    &lt;img src=bykovsky.jpg &#x2F;&gt;
    &lt;figcaption&gt;
      
      &lt;p&gt;
         Coastal retrogressive thaw slump on the Bykovsky Peninsula in northern Siberia. © &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.awi.de&#x2F;ueber-uns&#x2F;organisation&#x2F;mitarbeiter&#x2F;detailseite&#x2F;ingmar-nitze.html&quot;&gt;Ingmar Nitze (AWI)&lt;&#x2F;a&gt; 
        
        
        
      &lt;&#x2F;p&gt;
    &lt;&#x2F;figcaption&gt;
&lt;&#x2F;figure&gt;
&lt;p&gt;With satellite imagery, we can easily get observations of the entire Arctic.
While permafrost itself is mostly subsurface, we can monitor surface indicators closely linked to permafrost health or degradation.
One such example is retrogressive thaw slumps (RTS): slow landslides resulting from the thawing of ice-rich permafrost.
Despite their small size and scattered distribution, RTS can be detected in satellite images due to their distinct shape and spectral signature.&lt;&#x2F;p&gt;
&lt;p&gt;But here’s the catch:
While deep learning algorithms show promise in identifying RTS from satellite images, they need huge amounts of labeled training data.
In fact, only a very small fraction of the Arctic has been labelled for RTS:&lt;&#x2F;p&gt;
&lt;p&gt;&lt;a name=&quot;footprints&quot; id=&quot;footprints&quot;&gt;&lt;&#x2F;a&gt;
&lt;div class=&quot;applet&quot;&gt;
  &lt;iframe src=https:&amp;#x2F;&amp;#x2F;maps.heidler.info&amp;#x2F;rts-overview&amp;#x2F; allow=&quot;fullscreen&quot; allowfullscreen&gt;&lt;&#x2F;iframe&gt;
  
    &lt;button
      type=&quot;button&quot;
      class=&quot;fullscreen&quot;
      aria-label=&quot;Open applet fullscreen&quot;
      title=&quot;Fullscreen&quot;
    &gt;
      &lt;svg viewBox=&quot;0 0 24 24&quot; aria-hidden=&quot;true&quot;&gt;
        &lt;path
          d=&quot;M5 9V5h4M15 5h4v4M19 15v4h-4M9 19H5v-4&quot;
          fill=&quot;none&quot;
          stroke=&quot;currentColor&quot;
          stroke-width=&quot;2&quot;
          stroke-linecap=&quot;round&quot;
        &#x2F;&gt;
      &lt;&#x2F;svg&gt;
    &lt;&#x2F;button&gt;
  
&lt;&#x2F;div&gt;
&lt;&#x2F;p&gt;
&lt;p&gt;Acquiring labelled data is no easy feat, as permafrost experts need to manually look through large satellite imagery archives and label examples pixel-by-pixel.
Clearly, it is impossible to cover large fractions of the Arctic in this way.
Ideally, we want our models to generalise to new locations without the need for extensive labelled data.
In &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;arxiv.org&#x2F;abs&#x2F;2401.09271&quot;&gt;our recent study&lt;&#x2F;a&gt;, we therefore explore a new way of enhancing this generalisation ability without additional labels.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;what-is-dino&quot;&gt;What is DINO?&lt;&#x2F;h2&gt;
&lt;p&gt;In an ideal setup, we can not only use the existing labelled data, but also teach the model to extract knowledge from unlabelled imagery of previously unseen regions.
This hybrid setup is known as &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.ibm.com&#x2F;topics&#x2F;semi-supervised-learning&quot;&gt;&lt;em&gt;semi-supervised learning&lt;&#x2F;em&gt;&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;ai.meta.com&#x2F;blog&#x2F;dino-paws-computer-vision-with-self-supervised-transformers-and-10x-more-efficient-training&#x2F;&quot;&gt;DINO&lt;&#x2F;a&gt;
is a method for training AI models without any labelled examples.
Instead of relying on human-labeled data, it lets the computer figure out its own way to recognize objects in pictures and classify them.
Intuitively, it gives the network the following rules:&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;Assign a class label to each training image&lt;&#x2F;li&gt;
&lt;li&gt;Make sure that the class labels remain the same when the image is transformed&lt;&#x2F;li&gt;
&lt;li&gt;Make use of all available classes (given a predefined number of them)&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;p&gt;To do this, the DINO learning process introduces two key players: the &lt;em&gt;student&lt;&#x2F;em&gt; and the &lt;em&gt;teacher&lt;&#x2F;em&gt;.
They work together to learn from images through a process called &lt;em&gt;self-distillation&lt;&#x2F;em&gt;, making sure to obey the rules stated above.
The teacher starts by guessing what’s in an image, then the student tries to match those guesses while also learning from the image itself.
It’s like a teacher guiding a student, but in this case, both of them are AI models learning together.&lt;&#x2F;p&gt;
&lt;p&gt;The transformations introduced by the second rule are called &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;viso.ai&#x2F;computer-vision&#x2F;image-data-augmentation-for-computer-vision&#x2F;&quot;&gt;Data Augmentations&lt;&#x2F;a&gt;.
By randomly applying operations to the input images, we can change the layout (flips, rotations, etc.) or adjust the image colours (brightness, contrast, …).
During training, student and teacher will both see differently augmented versions of the same image.
The student is then trained to match the teacher’s label.&lt;&#x2F;p&gt;
&lt;figure &gt;
    
    &lt;video autoplay loop muted playsinline src=augmentation.webm&gt; &lt;&#x2F;video&gt;
    &lt;figcaption&gt;
      
      &lt;p&gt;
         Visualization of spatial and colourspace augmentations on a Sentinel-2 satellite image from &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.google.com&#x2F;maps?ll=72.8585351,-119.521537&amp;amp;hl=en&amp;amp;t=h&amp;amp;z=11&quot;&gt;Banks Island, Canada&lt;&#x2F;a&gt;. 
        
        
        
      &lt;&#x2F;p&gt;
    &lt;&#x2F;figcaption&gt;
&lt;&#x2F;figure&gt;
&lt;p&gt;That leaves us with the last rule: making the model use all of the available classes.
For this, DINO introduces two operations, called &lt;em&gt;centering&lt;&#x2F;em&gt; and &lt;em&gt;temperature scaling&lt;&#x2F;em&gt;.
So how do these work?
When we give it an image, the teacher doesn’t simply predict a single label, but actually gives us a &lt;em&gt;distribution&lt;&#x2F;em&gt; over the class labels.
For the centering step, we reduce the weight of frequently used classes, and boost the classes that are less often used.
This is done by keeping track of past teacher outputs.
For the temperature scaling step, the teacher outputs are adjusted in a way that emphasizes the differences present in the prediction – high-weight classes are given even higher weight, and low-weight classes are tuned down:&lt;&#x2F;p&gt;
&lt;figure &gt;
    
    &lt;video autoplay loop muted playsinline src=teacher.webm&gt; &lt;&#x2F;video&gt;
    &lt;figcaption&gt;
      
      &lt;p&gt;
         Centering and scaling of the teacher output, encouraging use of all classes while deciding for a single class per image. 
        
        
        
      &lt;&#x2F;p&gt;
    &lt;&#x2F;figcaption&gt;
&lt;&#x2F;figure&gt;
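&lt;p&gt;To make this concrete, here is a minimal sketch of how centering and temperature scaling could look in code. This is illustrative rather than the exact implementation from the paper; the values of &lt;code&gt;temp&lt;&#x2F;code&gt; and &lt;code&gt;momentum&lt;&#x2F;code&gt; are placeholder hyperparameters:&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;python&quot;&gt;import torch.nn.functional as F

def sharpen_teacher_output(logits, center, temp=0.04, momentum=0.9):
    # Illustrative sketch, not the paper&#x27;s exact code.
    # Centering: subtracting a running mean of past teacher outputs
    # down-weights classes that the teacher picks too often.
    # Temperature: dividing by a small temp sharpens the distribution
    # towards a single confident class per sample.
    probs = F.softmax((logits - center) &#x2F; temp, dim=-1)
    # Keep track of past teacher outputs via an exponential moving average.
    new_center = momentum * center + (1 - momentum) * logits.mean(dim=0)
    return probs, new_center&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;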
&lt;h2 id=&quot;from-dino-to-pixeldino&quot;&gt;From DINO to PixelDINO&lt;&#x2F;h2&gt;
&lt;p&gt;In the regular DINO scheme, only a single label is assigned to the entire image.
But for mapping tasks in remote sensing, we need a class label for each individual location in the image instead.
This is what we do with our PixelDINO framework.
Instead of classifying whole images, it assigns labels to each pixel in the picture.&lt;&#x2F;p&gt;
&lt;p&gt;A fundamental assumption of DINO training is that augmentations don’t change the class of the image.
When working on the pixel level, this is no longer true!
When mirroring or rotating an image, the objects within the image change their locations.
So not only do we need to augment the imagery for the student, but we also need to transform the teacher labels alongside it.
For this, we take inspiration from another semi-supervised training method, called &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;arxiv.org&#x2F;abs&#x2F;2208.00400&quot;&gt;FixMatchSeg&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;Instead of creating two random augmentations of an image, FixMatchSeg builds on a chain of augmentations.
A first set, called &lt;em&gt;weak augmentations&lt;&#x2F;em&gt;, is applied before passing an image to the teacher.
After getting teacher labels for this version of the image, the weakly augmented image is then augmented together with the teacher labels using a second set of augmentations, called &lt;em&gt;strong augmentations&lt;&#x2F;em&gt;.
The student then trains to match the teacher’s output on the strongly augmented version of the image.&lt;&#x2F;p&gt;
&lt;figure &gt;
    
    &lt;img src=pixeldino.png &#x2F;&gt;
    &lt;figcaption&gt;
      
      &lt;p&gt;
         Overview of the PixelDINO training pipeline used to train a model on unlabelled images. 
        
        
        
      &lt;&#x2F;p&gt;
    &lt;&#x2F;figcaption&gt;
&lt;&#x2F;figure&gt;
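&lt;p&gt;As a concrete sketch of this interplay, a strong augmentation could transform image and teacher mask together for the spatial operations, while colour changes only touch the image. This is a simplified illustration, not the full augmentation set from the paper:&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;python&quot;&gt;import random
import torch

def augment_strong(img, mask):
    # Simplified sketch; the actual augmentation set is richer.
    # Spatial augmentations are applied to image AND teacher mask,
    # so the pseudo-labels stay aligned with the pixels.
    k = random.randint(0, 3)
    img = torch.rot90(img, k, dims=(-2, -1))
    mask = torch.rot90(mask, k, dims=(-2, -1))
    if random.random() &lt; 0.5:
        img = torch.flip(img, dims=(-1,))
        mask = torch.flip(mask, dims=(-1,))
    # Colour-space augmentations only affect the image.
    img = img * (0.8 + 0.4 * torch.rand(1))
    return img, mask&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;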
&lt;p&gt;One final question in this setup is how to learn the teacher’s weights.
For this, we adapt the simple, yet effective strategy used by DINO: the teacher follows the student with an &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Exponential_smoothing&quot;&gt;exponential moving average&lt;&#x2F;a&gt;.
In this way, the teacher is not static, but is continuously updated with newly distilled knowledge.&lt;&#x2F;p&gt;
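&lt;p&gt;Such an update takes only a few lines in PyTorch. Here is a sketch of the &lt;code&gt;ema_update&lt;&#x2F;code&gt; used in the pseudocode further below, with an illustrative momentum value:&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;python&quot;&gt;import torch

@torch.no_grad()
def ema_update(teacher, student, momentum=0.996):
    # The teacher&#x27;s weights slowly track the student&#x27;s weights;
    # no gradients are back-propagated into the teacher itself.
    for p_t, p_s in zip(teacher.parameters(), student.parameters()):
        p_t.mul_(momentum).add_(p_s, alpha=1 - momentum)&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;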
&lt;h2 id=&quot;from-self-supervised-to-semi-supervised&quot;&gt;From self-supervised to semi-supervised&lt;&#x2F;h2&gt;
&lt;p&gt;The next step is combining this self-supervised training method with regular supervised training.
After all, we do have imagery with existing ground truth annotations, as we saw &lt;a href=&quot;https:&#x2F;&#x2F;konrad.earth&#x2F;blog&#x2F;pixeldino&#x2F;#footprints&quot;&gt;above&lt;&#x2F;a&gt;.
Since PixelDINO already works with pseudo-classes, this is very easy:
all we need to do is align one of the pseudo-classes with the RTS class given in the training data.&lt;&#x2F;p&gt;
&lt;p&gt;All in all, we arrive at the following training procedure for a single batch (in PyTorch pseudocode):&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo z-code&quot;&gt;&lt;code data-lang=&quot;python&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span class=&quot;z-storage z-type z-function z-python&quot;&gt;def&lt;&#x2F;span&gt;&lt;span class=&quot;z-entity z-name z-function&quot;&gt; train_step&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;(&lt;&#x2F;span&gt;&lt;span class=&quot;z-variable z-parameter&quot;&gt;img&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;,&lt;&#x2F;span&gt;&lt;span class=&quot;z-variable z-parameter&quot;&gt; mask&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;,&lt;&#x2F;span&gt;&lt;span class=&quot;z-variable z-parameter&quot;&gt; unlabelled&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;):&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-comment z-comment&quot;&gt;  # Supervised Training Step&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span class=&quot;z-source&quot;&gt;  pred&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator&quot;&gt; =&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-python&quot;&gt; student&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;(&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-arguments z-python&quot;&gt;img&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span class=&quot;z-source&quot;&gt;  loss_supervised&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator&quot;&gt; =&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-python&quot;&gt; cross_entropy&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;(&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-arguments z-python&quot;&gt;pred&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;,&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-arguments z-python&quot;&gt; mask&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-comment z-comment&quot;&gt;  # Get pseudo-classes from teacher&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span class=&quot;z-source&quot;&gt;  view_1&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator&quot;&gt; =&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-python&quot;&gt; augment_weak&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;(&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-arguments z-python&quot;&gt;unlabelled&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span class=&quot;z-source&quot;&gt;  mask_1&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator&quot;&gt; =&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-python&quot;&gt; teacher&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;(&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-arguments z-python&quot;&gt;view_1&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span class=&quot;z-source&quot;&gt;  batch_center&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator&quot;&gt; =&lt;&#x2F;span&gt;&lt;span class=&quot;z-source&quot;&gt; mask_1&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;.&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-python&quot;&gt;mean&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;(&lt;&#x2F;span&gt;&lt;span class=&quot;z-variable z-parameter&quot;&gt;dim&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator&quot;&gt;=&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;[&lt;&#x2F;span&gt;&lt;span class=&quot;z-constant z-numeric&quot;&gt;0&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;,&lt;&#x2F;span&gt;&lt;span class=&quot;z-constant z-numeric&quot;&gt; 2&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;,&lt;&#x2F;span&gt;&lt;span class=&quot;z-constant z-numeric&quot;&gt; 3&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;])&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-definition z-comment z-comment&quot;&gt;  # per-class mean of raw teacher outputs&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span class=&quot;z-source&quot;&gt;  mask_1&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator&quot;&gt; =&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt; (&lt;&#x2F;span&gt;&lt;span class=&quot;z-source&quot;&gt;mask_1&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator&quot;&gt; -&lt;&#x2F;span&gt;&lt;span class=&quot;z-source&quot;&gt; center&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;)&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator&quot;&gt; &#x2F;&lt;&#x2F;span&gt;&lt;span class=&quot;z-source&quot;&gt; temp&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span class=&quot;z-source&quot;&gt;  mask_1&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator&quot;&gt; =&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-python&quot;&gt; softmax&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;(&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-arguments z-python&quot;&gt;mask_1&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span class=&quot;z-punctuation z-definition z-comment z-comment&quot;&gt;  # Strongly augment image and label together&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span class=&quot;z-source&quot;&gt;  view_2&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;,&lt;&#x2F;span&gt;&lt;span class=&quot;z-source&quot;&gt; mask_2&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator&quot;&gt; =&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-python&quot;&gt; augment_strong&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;(&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-arguments z-python&quot;&gt;view_1&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;,&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-arguments z-python&quot;&gt; mask_1&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span class=&quot;z-source&quot;&gt;  pred_2&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator&quot;&gt; =&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-python&quot;&gt; student&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;(&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-arguments z-python&quot;&gt;view_2&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span class=&quot;z-source&quot;&gt;  loss_dino&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator&quot;&gt; =&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-python&quot;&gt; cross_entropy&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;(&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-arguments z-python&quot;&gt;pred_2&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;,&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-arguments z-python&quot;&gt; mask_2&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;)&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span class=&quot;z-source&quot;&gt;  loss&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator&quot;&gt; =&lt;&#x2F;span&gt;&lt;span class=&quot;z-source&quot;&gt; loss_supervised&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator&quot;&gt; +&lt;&#x2F;span&gt;&lt;span class=&quot;z-source&quot;&gt; beta&lt;&#x2F;span&gt;&lt;span class=&quot;z-keyword z-operator&quot;&gt;*&lt;&#x2F;span&gt;&lt;span class=&quot;z-source&quot;&gt;loss_dino&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span class=&quot;z-source&quot;&gt;  loss&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;.&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-python&quot;&gt;backward&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;()&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-definition z-comment z-comment&quot;&gt;  # Back-propagate losses&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span class=&quot;z-meta z-function-call z-python&quot;&gt;  update&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;(&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-arguments z-python&quot;&gt;student&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;)&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-definition z-comment z-comment&quot;&gt;  # Adam weight update&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span class=&quot;z-meta z-function-call z-python&quot;&gt;  ema_update&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;(&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-arguments z-python&quot;&gt;teacher&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;,&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-arguments z-python&quot;&gt; student&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;)&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-definition z-comment z-comment&quot;&gt;  # Teacher EMA&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span class=&quot;z-meta z-function-call z-python&quot;&gt;  ema_update&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;(&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-arguments z-python&quot;&gt;center&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;,&lt;&#x2F;span&gt;&lt;span class=&quot;z-meta z-function-call z-arguments z-python&quot;&gt; batch_center&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation&quot;&gt;)&lt;&#x2F;span&gt;&lt;span class=&quot;z-punctuation z-definition z-comment z-comment&quot;&gt;  # Center EMA&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;&lt;h2 id=&quot;results&quot;&gt;Results&lt;&#x2F;h2&gt;
&lt;p&gt;To evaluate our method, we trained multiple models with different configurations and compared how well they detect permafrost disturbances.
To keep the comparison fair, we fixed the number of training steps each model went through instead of counting epochs, since our labelled dataset is much smaller than the unlabelled one.
We set aside two regions for testing the models: Herschel Island, because it is isolated from the mainland, and Lena, because it lies in a different land cover zone.
This shows how well the models can handle areas they have never seen before.&lt;&#x2F;p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Model&lt;&#x2F;th&gt;&lt;th style=&quot;text-align: center&quot;&gt;Herschel&lt;&#x2F;th&gt;&lt;th style=&quot;text-align: center&quot;&gt;Lena&lt;&#x2F;th&gt;&lt;&#x2F;tr&gt;&lt;&#x2F;thead&gt;&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Baseline&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: center&quot;&gt;19.8 ± 1.7&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: center&quot;&gt;28.8 ±  3.0&lt;&#x2F;td&gt;&lt;&#x2F;tr&gt;
&lt;tr&gt;&lt;td&gt;Baseline+Aug&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: center&quot;&gt;22.9 ± 3.0&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: center&quot;&gt;25.8 ± 10.2&lt;&#x2F;td&gt;&lt;&#x2F;tr&gt;
&lt;tr&gt;&lt;td&gt;FixMatchSeg&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: center&quot;&gt;23.4 ± 0.8&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: center&quot;&gt;32.4 ±  3.2&lt;&#x2F;td&gt;&lt;&#x2F;tr&gt;
&lt;tr&gt;&lt;td&gt;Adversarial&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: center&quot;&gt;26.6 ± 3.9&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: center&quot;&gt;25.1 ± 15.1&lt;&#x2F;td&gt;&lt;&#x2F;tr&gt;
&lt;tr&gt;&lt;td&gt;PixelDINO&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: center&quot;&gt;30.2 ± 2.7&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: center&quot;&gt;39.5 ±  6.5&lt;&#x2F;td&gt;&lt;&#x2F;tr&gt;
&lt;&#x2F;tbody&gt;&lt;&#x2F;table&gt;
&lt;p&gt;Indeed, PixelDINO outperforms not only the supervised base methods, but also the two other semi-supervised segmentation methods we tested: &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;arxiv.org&#x2F;abs&#x2F;2208.00400&quot;&gt;FixMatchSeg&lt;&#x2F;a&gt; and &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;arxiv.org&#x2F;abs&#x2F;1802.07934&quot;&gt;Adversarial Semi-Segmentation&lt;&#x2F;a&gt;.
In practice, this improved training method leads to fewer false positives and a more faithful reconstruction of RTS shapes:&lt;&#x2F;p&gt;
&lt;div class=&quot;applet&quot;&gt;
  &lt;iframe src=https:&amp;#x2F;&amp;#x2F;konrad.earth&amp;#x2F;blog&amp;#x2F;pixeldino&amp;#x2F;lena_interactive.svg?h=cb845ac93dc691bf5361 allow=&quot;fullscreen&quot; allowfullscreen&gt;&lt;&#x2F;iframe&gt;
  
&lt;&#x2F;div&gt;
&lt;p&gt;The PixelDINO method we developed should prove useful not only for permafrost monitoring, but also for other use cases in remote sensing where spatial variability poses a challenge.
We hope that this work can inspire follow-up research for other applications.&lt;&#x2F;p&gt;
</content>
        </entry><entry xml:lang="en">
        <title>COBRA: A Deep Active Contour Model for Delineating Glacier Calving Fronts</title>
        <published>2023-07-31T00:00:00+00:00</published>
        <updated>2023-07-31T00:00:00+00:00</updated>
        <author>
            <name>Konrad Heidler</name>
        </author>
        <link rel="alternate" href="https://konrad.earth/blog/deepsnake/" type="text/html"/>
        <id>https://konrad.earth/blog/deepsnake/</id>
        
            <content type="html">&lt;figure &gt;
    
    &lt;img src=helheim_overlay.jpg &#x2F;&gt;
    &lt;figcaption&gt;
      
      &lt;p&gt;
        
        
        
        
      &lt;&#x2F;p&gt;
    &lt;&#x2F;figcaption&gt;
&lt;&#x2F;figure&gt;
&lt;p&gt;Existing approaches for calving front detection generally work by first performing
a pixel-wise segmentation or edge detection,
and then extracting the actual calving front in a post-processing step.
Our goal in this study is to build a model that only needs a single step,
and directly outputs the calving front as a polyline.&lt;&#x2F;p&gt;
&lt;figure &gt;
    
    &lt;img src=architecture.png &#x2F;&gt;
    &lt;figcaption&gt;
      
      &lt;p&gt;
        
        
        
        
      &lt;&#x2F;p&gt;
    &lt;&#x2F;figcaption&gt;
&lt;&#x2F;figure&gt;
&lt;p&gt;Following the idea of explicit contour prediction,
we have developed a new method called “Charting Outlines by Recurrent Adaptation” (COBRA).
It combines the classic idea of Active Contour models with deep learning.
First, a 2D CNN backbone derives feature maps from the input imagery.
Then, a 1D CNN (the Snake Head) iteratively deforms an initial contour until it matches the true contour.&lt;&#x2F;p&gt;
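&lt;p&gt;To illustrate the idea, the Snake Head can be thought of as sampling backbone features at the current vertex positions and predicting a coordinate offset for every vertex. The following is a hand-written sketch, not our published implementation (which is linked below):&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;python&quot;&gt;import torch
import torch.nn as nn
import torch.nn.functional as F

class SnakeHead(nn.Module):
    # Illustrative sketch of a deep active contour step;
    # see the linked repository for the actual COBRA code.
    def __init__(self, feat_dim, hidden=64):
        super().__init__()
        # 1D CNN operating along the sequence of contour vertices.
        self.offset_net = nn.Sequential(
            nn.Conv1d(feat_dim + 2, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv1d(hidden, 2, 3, padding=1),
        )

    def forward(self, features, vertices, steps=4):
        # features: (B, C, H, W) backbone feature maps
        # vertices: (B, N, 2) contour coordinates in [-1, 1]
        for _ in range(steps):
            # Sample image features at the current vertex positions.
            feats = F.grid_sample(features, vertices.unsqueeze(1),
                                  align_corners=True).squeeze(2)  # (B, C, N)
            x = torch.cat([feats, vertices.transpose(1, 2)], dim=1)
            # Predict an offset for every vertex and move the contour.
            vertices = vertices + self.offset_net(x).transpose(1, 2)
        return vertices&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;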
&lt;h1 id=&quot;results&quot;&gt;Results&lt;&#x2F;h1&gt;
&lt;p&gt;These animations show how COBRA iteratively
predicts glacier calving fronts:&lt;&#x2F;p&gt;
&lt;div class=&quot;visual_row&quot;&gt;
  &lt;img src=&quot;.&#x2F;anim&#x2F;6.svg&quot; &#x2F;&gt;
  &lt;img src=&quot;.&#x2F;anim&#x2F;8.svg&quot; &#x2F;&gt;
  &lt;img src=&quot;.&#x2F;anim&#x2F;11.svg&quot; &#x2F;&gt;
  &lt;img src=&quot;.&#x2F;anim&#x2F;18.svg&quot; &#x2F;&gt;
  &lt;img src=&quot;.&#x2F;anim&#x2F;25.svg&quot; &#x2F;&gt;
  &lt;img src=&quot;.&#x2F;anim&#x2F;32.svg&quot; &#x2F;&gt;
&lt;&#x2F;div&gt;
&lt;h1 id=&quot;paper&quot;&gt;Paper&lt;&#x2F;h1&gt;
&lt;p&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;arxiv.org&#x2F;abs&#x2F;2307.03461&quot;&gt;Preprint available on arXiv&lt;&#x2F;a&gt;, published article available on &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;ieeexplore.ieee.org&#x2F;document&#x2F;10195954&quot;&gt;IEEE Xplore&lt;&#x2F;a&gt;&lt;&#x2F;p&gt;
&lt;h1 id=&quot;code&quot;&gt;Code&lt;&#x2F;h1&gt;
&lt;p&gt;If you would like to
have a closer look at the implementation details,
work with our method,
or reproduce our results,
you can find all of our code &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;khdlr&#x2F;COBRA&quot;&gt;on GitHub&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;h1 id=&quot;inference-map&quot;&gt;Inference Map&lt;&#x2F;h1&gt;
&lt;div class=&quot;applet&quot;&gt;
  &lt;iframe src=https:&amp;#x2F;&amp;#x2F;maps.heidler.info&amp;#x2F;cobra allow=&quot;fullscreen&quot; allowfullscreen&gt;&lt;&#x2F;iframe&gt;
  
    &lt;button
      type=&quot;button&quot;
      class=&quot;fullscreen&quot;
      aria-label=&quot;Open applet fullscreen&quot;
      title=&quot;Fullscreen&quot;
    &gt;
      &lt;svg viewBox=&quot;0 0 24 24&quot; aria-hidden=&quot;true&quot;&gt;
        &lt;path
          d=&quot;M5 9V5h4M15 5h4v4M19 15v4h-4M9 19H5v-4&quot;
          fill=&quot;none&quot;
          stroke=&quot;currentColor&quot;
          stroke-width=&quot;2&quot;
          stroke-linecap=&quot;round&quot;
        &#x2F;&gt;
      &lt;&#x2F;svg&gt;
    &lt;&#x2F;button&gt;
  
&lt;&#x2F;div&gt;
&lt;h1 id=&quot;video&quot;&gt;Video&lt;&#x2F;h1&gt;
&lt;div class=&quot;embed video-player&quot; style=&quot;text-align:center;&quot;&gt;
  &lt;iframe
    class=&quot;youtube-player&quot;
    type=&quot;text&#x2F;html&quot;
    width=&quot;640&quot;
    height=&quot;385&quot;
    src=&quot;https:&#x2F;&#x2F;www.youtube-nocookie.com&#x2F;embed&#x2F;D66o6BVfZuk&quot;
    allowfullscreen
    frameborder=&quot;0&quot;
  &gt;
  &lt;&#x2F;iframe&gt;
&lt;&#x2F;div&gt;
</content>
        </entry><entry xml:lang="en">
        <title>Seeing the Bigger Picture: Enabling Large Context Windows in Neural Networks by Combining Multiple Zoom Levels</title>
        <published>2021-10-12T00:00:00+00:00</published>
        <updated>2021-10-12T00:00:00+00:00</updated>
        <author>
            <name>Konrad Heidler</name>
        </author>
        <link rel="alternate" href="https://konrad.earth/blog/zoom-nn/" type="text/html"/>
        <id>https://konrad.earth/blog/zoom-nn/</id>
        
            <content type="html">&lt;div class=&quot;embed video-player&quot; style=&quot;text-align:center;&quot;&gt;
  &lt;iframe
    class=&quot;youtube-player&quot;
    type=&quot;text&#x2F;html&quot;
    width=&quot;640&quot;
    height=&quot;385&quot;
    src=&quot;https:&#x2F;&#x2F;www.youtube-nocookie.com&#x2F;embed&#x2F;aXyxxMvjIQ0&quot;
    allowfullscreen
    frameborder=&quot;0&quot;
  &gt;
  &lt;&#x2F;iframe&gt;
&lt;&#x2F;div&gt;
&lt;p&gt;Read the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;ieeexplore.ieee.org&#x2F;abstract&#x2F;document&#x2F;9554434&quot;&gt;full paper on IEEE Xplore&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
</content>
        </entry><entry xml:lang="en">
        <title>Self-supervised Audiovisual Representation Learning for Remote Sensing Data</title>
        <published>2021-08-02T00:00:00+00:00</published>
        <updated>2021-08-02T00:00:00+00:00</updated>
        <author>
            <name>Konrad Heidler</name>
        </author>
        <link rel="alternate" href="https://konrad.earth/blog/sounding-earth/" type="text/html"/>
        <id>https://konrad.earth/blog/sounding-earth/</id>
        
            <content type="html">&lt;p&gt;&lt;img src=&quot;https:&#x2F;&#x2F;konrad.earth&#x2F;blog&#x2F;sounding-earth&#x2F;.&#x2F;graphical_abstract.png&quot; alt=&quot;&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;p&gt;Read the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;www.sciencedirect.com&#x2F;science&#x2F;article&#x2F;pii&#x2F;S1569843222003181&quot;&gt;full paper at Science Direct&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
</content>
        </entry><entry xml:lang="en">
        <title>HED-UNet: Combined segmentation and edge detection for monitoring the Antarctic coastline</title>
        <published>2021-03-23T00:00:00+00:00</published>
        <updated>2021-03-23T00:00:00+00:00</updated>
        <author>
            <name>Konrad Heidler</name>
        </author>
        <link rel="alternate" href="https://konrad.earth/blog/hed-unet/" type="text/html"/>
        <id>https://konrad.earth/blog/hed-unet/</id>
        
            <content type="html">&lt;div class=&quot;embed video-player&quot; style=&quot;text-align:center;&quot;&gt;
  &lt;iframe
    class=&quot;youtube-player&quot;
    type=&quot;text&#x2F;html&quot;
    width=&quot;640&quot;
    height=&quot;385&quot;
    src=&quot;https:&#x2F;&#x2F;www.youtube-nocookie.com&#x2F;embed&#x2F;NRAFh_gFU9U&quot;
    allowfullscreen
    frameborder=&quot;0&quot;
  &gt;
  &lt;&#x2F;iframe&gt;
&lt;&#x2F;div&gt;
&lt;p&gt;Read the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;ieeexplore.ieee.org&#x2F;abstract&#x2F;document&#x2F;9383809&quot;&gt;full paper on IEEE Xplore&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
</content>
        </entry><entry xml:lang="en">
        <title>Remote Sensing for Assessing Drought Insurance Claims in Central Europe</title>
        <published>2019-11-14T00:00:00+00:00</published>
        <updated>2019-11-14T00:00:00+00:00</updated>
        <author>
            <name>Konrad Heidler</name>
        </author>
        <link rel="alternate" href="https://konrad.earth/blog/draughts/" type="text/html"/>
        <id>https://konrad.earth/blog/draughts/</id>
        
            <content type="html">&lt;p&gt;Read the &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;ieeexplore.ieee.org&#x2F;document&#x2F;8898926&quot;&gt;full paper on IEEE Xplore&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
</content>
        </entry>
</feed>
