Computer Graphics
TU Braunschweig

Seminar Computer Vision SS'25

Prof. Dr.-Ing. Martin Eisemann

Intended audience: Bachelor & Master
Contact: seminarcv@cg.cs.tu-bs.de

Module: INF-STD-66, INF-STD-68
Course no.: 4216031, 4216032

Topic: Recent research in Visual Computing

Latest News

Schedule for the final talks on 10.07.2025 (G30):

09:30 Towards Learning-based Inverse Subsurface Scattering

10:00 Classification of Internal and External Distractions in an Educational VR Environment using Multimodal Features

10:30 IAmHero: Preliminary Findings of an Experimental Study to …

12:00 CLAY: A Controllable Large-Scale Generative Model for Creating High-quality 3D Assets

12:30 TeTriRF: Temporal Tri-Plane Radiance Fields for Efficient Free-Viewpoint Video

13:00 Subsurface Scattering for 3D Gaussian Splatting

13:45 Shape from Heat Conduction

14:15 Utilizing Gaze-Contingent Rendering to Maintain Visual Attention in Educational VR

Schedule for the fundamentals talks:

10:30 Machine Learning and Neural Networks

11:00 3D Representations

11:30 Subsurface Scattering

12:00 VR and Eye Tracking

Content

In this seminar we discuss current research results in computer vision, visual computing, and image/video processing. Each participant's task is to understand a specific research topic and explain it to the other participants. In a block seminar in the middle of the semester, the background knowledge required for the final talks is presented in oral presentations; at the end of the semester, each participant presents their research topic in a final talk. The final talk must be rehearsed beforehand in front of another student, and their suggestions for improvement must be incorporated.

Participants

The course is aimed at bachelor's and master's students from the fields of computer science (Informatik), IST, business informatics (Wirtschaftsinformatik), and data science.

Registration takes place centrally via Stud.IP. The number of participants is initially limited to 8 students, but may be extended at the kickoff if necessary.

Important Dates

All dates listed here must be adhered to. Attendance at all events is mandatory.

  • 03.02.2025 - 07.02.2025: Registration via Stud.IP
  • 10.04.2025, 10:30-12:00: Kickoff Meeting (G30, ICG)
  • 21.04.2025: End of the deregistration period
  • 22.04.2025, 10:30-12:00, G30 (ICG): Gather topics for fundamentals talk
  • 21.05.2025: Submission of presentation slides for fundamentals talk (please use the following naming scheme: Lastname_FundamentalsPresentation_SeminarCV.pdf)
  • 22.05.2025, 09:00 - 12:00, G30 (ICG): Fundamentals presentations, Block
  • By 08.07.2025: Trial presentation for the final presentation (between tandem partners from the fundamentals talk)
  • 09.07.2025: Submission of presentation slides for final talk (ALL participants!) (please use the following naming scheme: Lastname_FinalPresentation_SeminarCV.pdf)
  • 10.07.2025, 09:00 - 15:00, G30 (ICG): Presentations - Block Event Part 1
  • 11.07.2025, 09:00 - 15:00, G30 (ICG): Presentations - Block Event Part 2 (probably not needed) 

Registered students may deregister until two weeks after the start of lectures at the latest. For a successful deregistration, it is necessary to notify the seminar supervisor.

The respective submissions are made by email to seminarcv@cg.cs.tu-bs.de, to your advisor, and, if applicable, to your tandem partner. Unless otherwise communicated, submissions are due by 11:59 pm on the submission day.

If you have any questions about the event, please contact seminarcv@cg.cs.tu-bs.de.

If you have a good reason for missing parts of an on-site meeting (illness, conflict with other courses), please contact seminarcv@cg.cs.tu-bs.de beforehand; you will then receive an additional assignment to make up for the missed content.

Format

  • The topics for the final talks will be distributed amongst the participants during the Kickoff event.
  • The topics for the fundamentals talks will be distributed amongst the participants during the second meeting.
  • The topics will be presented in approximately 20-minute talks, each followed by a discussion (see Important Dates).
  • For the on-site talks, either an institute laptop or your own laptop can be used. If you want to use the institute laptop, contact seminarcv@cg.cs.tu-bs.de in time, at least two weeks before the presentations. In this case, the presentation slides must be made available at least one week before the talk.
  • The presentations will be given on site. If, for some reason, the presentations take place online, Big Blue Button will be used as the platform. In this case, students need their own PC with a microphone. In addition, video transmission during your own talk is desirable. If these requirements cannot be met, contact seminarcv@cg.cs.tu-bs.de in time.
  • The language for the presentations can be either German or English.
  • The presentations are mandatory requirements to pass the course successfully.

Files and Templates

    Topics

    1. Utilizing Gaze-Contingent Rendering to Maintain Visual Attention in Educational VR
      (Han et al.) TVCG 24

      In educational Virtual Reality (VR) environments, objects irrelevant to learning can lead to students' inattention, which adversely affects learning. However, removing these objects from virtual scenes is not feasible, as they are crucial for creating a realistic and immersive experience. In this paper, we introduce a gaze-contingent rendering (GCR) technique to address such an issue, which is independent of specific elements or configurations in virtual scenes and adaptable across various contexts. Specifically, we utilize gaze-aware rendering adjustments to adaptively reduce the visibility of objects irrelevant to learning while highlighting relevant ones. We develop three GCR strategies (i.e., blur, pixelation, and underexposure) and investigate how these strategies affect students' visual attention, academic achievement, and perceptions of the learning activity across different scenarios.
      https://ieeexplore.ieee.org/abstract/document/10807351

      Advisor: Anika Jewst
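
      As orientation for this topic, here is a minimal, hypothetical sketch of how a gaze-contingent blur strength could be derived from gaze direction and object relevance. The function name, thresholds, and falloff are illustrative assumptions, not taken from the paper (which additionally studies pixelation and underexposure):

      import math

      def gcr_blur_strength(gaze_dir, obj_dir, relevant, max_blur=8.0, falloff_deg=10.0):
          # Toy gaze-contingent rendering rule: learning-irrelevant objects far from
          # the gaze direction receive a strong blur; relevant objects stay sharp.
          if relevant:
              return 0.0  # never degrade learning-relevant objects
          cos_angle = sum(g * o for g, o in zip(gaze_dir, obj_dir))
          angle_deg = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
          # Ramp the blur radius up with angular distance from the gaze point.
          return max_blur * min(1.0, angle_deg / falloff_deg)

      # Example: an irrelevant object 25 degrees off-gaze gets the full blur radius.
      off = math.radians(25)
      print(gcr_blur_strength((0.0, 0.0, 1.0), (math.sin(off), 0.0, math.cos(off)), relevant=False))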

    2. Classification of Internal and External Distractions in an Educational VR Environment Using Multimodal Features
      (Asish et al.) TVCG 24

      Virtual reality (VR) can potentially enhance student engagement and memory retention in the classroom. However, distraction among participants in a VR-based classroom is a significant concern. Several factors, including mind wandering, external noise, stress, etc., can cause students to become internally and/or externally distracted while learning. To detect distractions, single or multi-modal features can be used. In this work, we investigated multi-modal features: eye tracking and EEG data, to classify the internal and external distractions in an educational VR environment. We implemented different machine learning (ML) methods, including k-nearest-neighbors (kNN), Random Forest (RF), one-dimensional convolutional neural network - long short-term memory (1D-CNN-LSTM), and two-dimensional convolutional neural networks (2D-CNN) to classify participants' internal and external distraction states using the multi-modal features.
      https://ieeexplore.ieee.org/document/10670586

      Advisor: Anika Jewst
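
      A small sketch of the kind of feature-level fusion and classification pipeline the abstract describes, using placeholder random data. Feature dimensions, window counts, and label encoding are assumptions, not the paper's:

      import numpy as np
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import cross_val_score

      # Placeholder per-window features: eye-tracking statistics and EEG band powers;
      # labels 0 = attentive, 1 = internally distracted, 2 = externally distracted.
      rng = np.random.default_rng(0)
      eye_feats = rng.normal(size=(300, 12))
      eeg_feats = rng.normal(size=(300, 40))
      labels = rng.integers(0, 3, size=300)

      # Feature-level fusion: concatenate both modalities per sample.
      X = np.concatenate([eye_feats, eeg_feats], axis=1)

      clf = RandomForestClassifier(n_estimators=200, random_state=0)
      print(cross_val_score(clf, X, labels, cv=5).mean())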

    3. IAmHero: Preliminary Findings of an Experimental Study to Evaluate the Statistical Significance of an Intervention for ADHD Conducted through the Use of Serious Games in Virtual Reality
      (Schena et al.) Int. J. Environ. Res. Public Health 23

      Attention-deficit/hyperactivity disorder (ADHD) is a neurodevelopmental disorder characterised by persistent patterns of inattention and/or hyperactivity-impulsivity that interfere with functioning or development. The use of new technologies, such as virtual reality (VR), represents a promising strategy in the rehabilitation of subjects with ADHD. The aim of our study is to evaluate the effectiveness of the IAmHero tool, administered through VR methods, on a cohort of subjects with ADHD between 5 and 12 years of age. The trial time was approximately 6 months. In order to assess the beneficial effects of the treatment, standardised tests assessing both ADHD symptoms and executive functions (e.g., Conners-3 scales) were administered both before and at the end of the sessions. Improvements were observed at the end of treatment in both ADHD symptoms (especially in the hyperactivity/impulsivity domain) and executive functions.
      https://www.mdpi.com/1660-4601/20/4/3414

      Advisor: Anika Jewst

    4. CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets
      (Zhang, Wang et al.) SIGGRAPH 2024

      CLAY is a 3D geometry and material generator that effortlessly converts human imagination into detailed 3D assets. It supports text, images, or 3D-aware inputs (e.g., voxels, bounding boxes, and point clouds) through a large-scale model that combines a multi-resolution VAE with a minimalistic latent Diffusion Transformer. CLAY leverages neural fields for continuous, complete surfaces and generates PBR textures in 2K resolution. Trained progressively on an ultra-large 3D dataset, its 1.5-billion-parameter core empowers users—from novices to experts—to craft intricate 3D worlds with minimal effort, unleashing boundless creative potential.
      Project Page | Paper

      Advisor: Jannis Möller
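
      The abstract mentions a multi-resolution VAE combined with a latent Diffusion Transformer. As generic background only (not CLAY's actual implementation), a standard DDPM-style ancestral sampling loop in a latent space looks roughly like this; `denoise_fn` stands in for the conditional denoiser and the linear beta schedule is a common default, not the paper's:

      import numpy as np

      def sample_latent(denoise_fn, cond, shape, timesteps=200, seed=0):
          # Generic DDPM-style ancestral sampling in a latent space; the denoiser is
          # assumed to predict the noise component given the noisy latent, the step t,
          # and the conditioning input (text, image, voxels, ...).
          rng = np.random.default_rng(seed)
          betas = np.linspace(1e-4, 0.02, timesteps)
          alphas = 1.0 - betas
          alpha_bars = np.cumprod(alphas)
          x = rng.normal(size=shape)
          for t in reversed(range(timesteps)):
              eps = denoise_fn(x, t, cond)  # predicted noise
              x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
              if t > 0:
                  x = x + np.sqrt(betas[t]) * rng.normal(size=shape)
          return x  # would then be decoded into geometry by the VAE decoder

      # Dummy zero-noise "denoiser" just to make the sketch executable.
      latent = sample_latent(lambda x, t, c: np.zeros_like(x), cond="a small wooden chair", shape=(64, 16))
      print(latent.shape)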

    5. TeTriRF: Temporal Tri-Plane Radiance Fields for Efficient Free-Viewpoint Video
      (Wu et al.) CVPR 2024

      Neural Radiance Fields (NeRF) enable photorealistic, interactive video experiences but require large storage and intensive computation. Our Temporal Tri-Plane Radiance Fields (TeTriRF) reduce these demands while retaining high-quality generation and rendering. By combining tri-plane and voxel-grid representations, TeTriRF scales to longer sequences and handles complex motions. We introduce a group training approach for efficient, temporally consistent scene encoding, then apply a compression pipeline using standard video codecs. This cuts storage by an order of magnitude compared to existing methods, without sacrificing visual quality.
      Project Page | Paper

      Advisor: Jannis Möller
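
      As background for the tri-plane representation mentioned in the abstract, a toy feature lookup could be written as below. Resolutions, channel counts, and the nearest-neighbour interpolation are simplifications for illustration; real implementations use bilinear interpolation and, in TeTriRF, a temporal/grouped structure:

      import numpy as np

      def triplane_features(planes, xyz):
          # Project a 3D point onto the XY, XZ, and YZ feature planes and sum the
          # three feature vectors; the result would be fed to a small MLP that
          # predicts density and colour.
          res = planes["xy"].shape[0]
          # Map coordinates from [-1, 1] to grid indices (nearest neighbour).
          idx = np.clip(((xyz + 1.0) * 0.5 * (res - 1)).astype(int), 0, res - 1)
          return (planes["xy"][idx[0], idx[1]]
                  + planes["xz"][idx[0], idx[2]]
                  + planes["yz"][idx[1], idx[2]])

      res, channels = 128, 16
      planes = {k: np.random.randn(res, res, channels) for k in ("xy", "xz", "yz")}
      print(triplane_features(planes, np.array([0.1, -0.3, 0.7])).shape)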

    6. SparseCraft: Few-Shot Neural Reconstruction through Stereopsis Guided Geometric Linearization
      (Younes, Ouasfi et al.) ECCV 2024

      We present a novel approach for recovering 3D shape and view dependent appearance from a few colored images, enabling efficient 3D reconstruction and novel view synthesis. Our method learns an implicit neural representation in the form of a Signed Distance Function (SDF) and a radiance field. The model is trained progressively through ray marching enabled volumetric rendering, and regularized with learning-free multi-view stereo (MVS) cues. Key to our contribution is a novel implicit neural shape function learning strategy that encourages our SDF field to be as linear as possible near the level-set, hence robustifying the training against noise emanating from the supervision and regularization signals. Without using any pretrained priors, our method, called SparseCraft, achieves state-of-the-art performances both in novel-view synthesis and reconstruction from sparse views in standard benchmarks, while requiring less than 10 minutes for training.
      Project Page | Paper

      Advisor: Jannis Möller
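
      A toy reading of the "keep the SDF locally linear near the level set" idea from the abstract: the first-order Taylor expansion should already explain nearby SDF values, and the residual below is the kind of quantity a regularizer could penalize. This is an illustrative interpretation, not SparseCraft's exact formulation:

      import numpy as np

      def linearity_residual(sdf, grad_sdf, x, offsets):
          # For a point x near the surface, compare the true SDF at displaced points
          # against the first-order Taylor prediction f(x) + d . grad f(x).
          res = []
          for d in offsets:
              taylor = sdf(x) + d @ grad_sdf(x)
              res.append(abs(sdf(x + d) - taylor))
          return float(np.mean(res))

      # Example with an analytic sphere SDF (radius 0.5) and a point on its level set.
      sdf = lambda p: np.linalg.norm(p) - 0.5
      grad = lambda p: p / np.linalg.norm(p)
      x = np.array([0.5, 0.0, 0.0])
      offsets = [np.array([0.01, 0.0, 0.0]), np.array([0.0, 0.01, 0.0])]
      print(linearity_residual(sdf, grad, x, offsets))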

    7. Subsurface Scattering for 3D Gaussian Splatting
      (Dihlmann et al.) NeurIPS 2024 (Poster)

      3D reconstruction and relighting of objects made from scattering materials present a significant challenge due to the complex light transport beneath the surface. 3D Gaussian Splatting introduced high-quality novel view synthesis at real-time speeds. While 3D Gaussians efficiently approximate an object's surface, they fail to capture the volumetric properties of subsurface scattering. We propose a framework for optimizing an object's shape together with the radiance transfer field given multi-view OLAT (one light at a time) data. Our method decomposes the scene into an explicit surface represented as 3D Gaussians, with a spatially varying BRDF, and an implicit volumetric representation of the scattering component. A learned incident light field accounts for shadowing. We optimize all parameters jointly via ray-traced differentiable rendering. Our approach enables material editing, relighting and novel view synthesis at interactive rates. We show successful application on synthetic data and introduce a newly acquired multi-view multi-light dataset of objects in a light-stage setup. Compared to previous work we achieve comparable or better results at a fraction of optimization and rendering time while enabling detailed control over material attributes.
      Paper

      Advisor: Fabian Friederichs
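
      For readers unfamiliar with subsurface scattering, the classic BSSRDF formulation of outgoing radiance (standard background, not this paper's specific model) is

      L_o(x_o, \omega_o) = \int_{A} \int_{\Omega} S(x_i, \omega_i; x_o, \omega_o)\, L_i(x_i, \omega_i)\, (n_i \cdot \omega_i)\, d\omega_i\, dA(x_i),

      i.e., light entering the object at a point x_i may leave it at a different point x_o. The topic's framework splits this transport into an explicit surface (Gaussian) part with a spatially varying BRDF and an implicit volumetric part for the scattering component.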

    8. Shape from Heat Conduction
      (Narayanan et al.) ECCV 2024

      Thermal cameras measure the temperature of objects based on radiation emitted in the infrared spectrum. In this work, we propose a novel shape recovery approach that exploits the properties of heat transport, specifically heat conduction, induced on objects when illuminated using simple light bulbs. Although heat transport occurs in the entirety of an object's volume, we show a surface approximation that enables shape recovery and empirically analyze its validity for objects with varying thicknesses. We develop an algorithm that solves a linear system of equations to estimate the intrinsic shape Laplacian from thermal videos along with several properties including heat capacity, convection coefficient, and absorbed heat flux under uncalibrated lighting of arbitrary shapes. Further, we propose a novel shape from Laplacian objective that aims to resolve the inherent shape ambiguities by drawing insights from absorbed heat flux images using two unknown light sources. Finally, we devise a coarse-to-fine refinement strategy that faithfully recovers both low and high-frequency shape details. We validate our method by showing accurate reconstructions, to within an error of 1-2 mm (object size ≤ 13.5 cm), in both simulations and from noisy thermal videos of real-world objects with complex shapes and material properties including those that are transparent and translucent to visible light. We believe leveraging heat transport as a novel cue for vision can enable new imaging capabilities.
      Paper

      Advisor: Fabian Friederichs
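
      The quantities named in the abstract (shape Laplacian, heat capacity, convection coefficient, absorbed heat flux) suggest a surface heat-balance model of roughly the following form; the exact terms and signs in the paper may differ, so treat this as orientation only:

      c\,\frac{\partial T(x,t)}{\partial t} = \kappa\,\Delta_{S} T(x,t) - h\,\bigl(T(x,t) - T_{\mathrm{amb}}\bigr) + \Phi(x),

      where \Delta_{S} is the intrinsic (Laplace-Beltrami) shape Laplacian on the surface, c the heat capacity, h the convection coefficient, and \Phi the absorbed heat flux. Stacking such equations over pixels and frames of a thermal video is one way the linear system mentioned in the abstract can arise.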

    9. Towards Learning-based Inverse Subsurface Scattering
      (Che et al.) ICCP 2020

      Given images of translucent objects, of unknown shape and lighting, we aim to use learning to infer the optical parameters controlling subsurface scattering of light inside the objects. We introduce a new architecture, the inverse transport network (ITN), that aims to improve generalization of an encoder network to unseen scenes, by connecting it with a physically-accurate, differentiable Monte Carlo renderer capable of estimating image derivatives with respect to scattering material parameters. During training, this combination forces the encoder network to predict parameters that not only match groundtruth values, but also reproduce input images. During testing, the encoder network is used alone, without the renderer, to predict material parameters from a single input image. Drawing insights from the physics of radiative transfer, we additionally use material parameterizations that help reduce estimation errors due to ambiguities in the scattering parameter space. Finally, we augment the training loss with pixelwise weight maps that emphasize the parts of the image most informative about the underlying scattering parameters. We demonstrate that this combination allows neural networks to generalize to scenes with completely unseen geometries and illuminations better than traditional networks, with 38.06% reduced parameter error on average.
      Project Page | Paper

      Advisor: Fabian Friederichs
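
      A minimal sketch of the training signal the abstract describes: the predicted scattering parameters are supervised both against ground-truth values and, via a differentiable renderer, against the input image itself. `encoder` and `renderer` are hypothetical stand-ins, not the paper's API, and the dummy lambdas exist only to make the snippet executable:

      import torch
      import torch.nn.functional as F

      def itn_style_loss(encoder, renderer, image, gt_params, w_param=1.0, w_img=1.0):
          # Combined supervision: parameter regression plus image reconstruction
          # through a differentiable (Monte Carlo) re-rendering of the prediction.
          pred_params = encoder(image)
          param_loss = F.mse_loss(pred_params, gt_params)
          rerendered = renderer(pred_params)
          image_loss = F.mse_loss(rerendered, image)
          return w_param * param_loss + w_img * image_loss

      # Dummy stand-ins; at test time the encoder alone predicts material parameters
      # from a single input image.
      img = torch.rand(1, 3, 64, 64)
      enc = lambda x: x.mean(dim=(2, 3))                        # -> (1, 3) "parameters"
      ren = lambda p: p[:, :, None, None].expand(1, 3, 64, 64)  # constant-colour "render"
      print(itn_style_loss(enc, ren, img, torch.zeros(1, 3)))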

    Useful Resources

    Example of a good presentation (video on the website under the Presentation section; note how little text is needed and how much is visualized to create an intuitive understanding).

    General writing tips for scientific papers (mainly intended for writing scientific articles, but at least parts of it are also useful for summaries or talks).

    Advisors

    Anika Jewst

    Fabian Friederichs

    Jannis Möller