
VI Lab's (Prof. Jongwon Choi) 4 papers accepted to ICCV 2025 (AI Top-tier Conference)

Admin │ 2025-06-30



We are delighted to announce that four papers from the Visual Intelligence Lab (VI Lab, Prof. Jongwon Choi) have been accepted to the 2025 International Conference on Computer Vision (ICCV) [LINK].


Title: 

Group-wise Scaling and Orthogonal Decomposition for Domain-Invariant Feature Extraction in Face Anti-Spoofing


Authors:

Seungjin Jung, Kanghee Lee, Yonghyun Jeong, Haeun Noh, Jungmin Lee, Jongwon Choi


Abstract:

Domain Generalizable Face Anti-Spoofing (DG-FAS) methods effectively capture domain-invariant features by aligning the directions (weights) of local decision boundaries across domains. However, the bias terms associated with these boundaries remain misaligned, leading to inconsistent classification thresholds and degraded performance on unseen target domains. To address this issue, we propose a novel DG-FAS framework that jointly aligns weights and biases through Feature Orthogonal Decomposition (FOD) and Group-wise Scaling Risk Minimization (GS-RM). Specifically, GS-RM facilitates bias alignment by balancing group-wise losses across multiple domains. FOD employs the Gram-Schmidt orthogonalization process to decompose the feature space explicitly into domain-invariant and domain-specific subspaces. By enforcing orthogonality between domain-specific and domain-invariant features during training using domain labels, FOD ensures effective weight alignment across domains without negatively impacting bias alignment. Additionally, we introduce Expected Calibration Error (ECE) as a novel evaluation metric for quantitatively assessing the effectiveness of our method in aligning bias terms across domains. Extensive experiments on benchmark datasets demonstrate that our approach achieves state-of-the-art performance, consistently improving accuracy, reducing bias misalignment, and enhancing generalization stability on unseen target domains.
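As context for the calibration metric mentioned in the abstract, the following is a minimal Python sketch of the standard binned Expected Calibration Error; the equal-width binning scheme and the function name are illustrative assumptions, not code from the paper.

import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard binned ECE: confidence-weighted gap between accuracy and confidence.

    confidences: predicted probability of the chosen class, shape (N,)  (assumption)
    correct:     1 if the prediction was right, else 0, shape (N,)
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        # right-inclusive bins so that confidence == 1.0 falls in the last bin
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            avg_conf = confidences[mask].mean()
            avg_acc = correct[mask].mean()
            ece += mask.mean() * abs(avg_acc - avg_conf)
    return ece

# toy example: three live/spoof predictions
print(expected_calibration_error([0.9, 0.6, 0.8], [1, 0, 1]))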


__________________________________________________________________________________________________________________________________________________________________________________________


Title: 

Beyond Spatial Frequency: Pixel-wise Temporal Frequency-based Deepfake Video Detection


Authors:

Taehoon Kim, Jongwook Choi, Yonghyun Jeong, Haeun Noh, Jaejun Yoo, Seungryul Baek, Jongwon Choi


Abstract:

We introduce a deepfake video detection approach that exploits pixel-wise temporal inconsistencies, which traditional spatial frequency-based detectors often overlook. Such detectors represent temporal information merely by stacking spatial frequency spectra across frames, and therefore fail to detect pixel-wise temporal artifacts. Our approach performs a 1D Fourier transform on the time axis for each pixel, extracting features highly sensitive to temporal inconsistencies, especially in areas prone to unnatural movements. To precisely locate regions containing the temporal artifacts, we introduce an attention proposal module trained in an end-to-end manner. Additionally, our joint transformer module effectively integrates pixel-wise temporal frequency features with spatio-temporal context features, expanding the range of detectable forgery artifacts. Our framework represents a significant advancement in deepfake video detection, providing robust performance across diverse and challenging detection scenarios.
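To illustrate the core idea of a per-pixel temporal Fourier transform described above, here is a minimal sketch assuming a grayscale clip of shape (T, H, W); the mean-removal step and the function name are assumptions, and the paper's attention proposal and joint transformer modules are not reproduced here.

import numpy as np

def pixelwise_temporal_spectrum(video):
    """Per-pixel 1D Fourier transform along the time axis.

    video:   grayscale clip of shape (T, H, W)   (assumption)
    returns: magnitude spectrum of shape (T // 2 + 1, H, W),
             i.e. one temporal frequency profile per pixel location
    """
    video = np.asarray(video, dtype=float)
    # remove the per-pixel mean so the DC term does not dominate (assumption)
    video = video - video.mean(axis=0, keepdims=True)
    spectrum = np.fft.rfft(video, axis=0)  # FFT over time, independently per pixel
    return np.abs(spectrum)

# toy example: a 16-frame, 64x64 clip of random noise
clip = np.random.rand(16, 64, 64)
features = pixelwise_temporal_spectrum(clip)
print(features.shape)  # (9, 64, 64)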


__________________________________________________________________________________________________________________________________________________________________________________________


Title: 

InsideOut: Integrated RGB-Radiative Gaussian Splatting for Comprehensive 3D Object Representation


Authors:

Jungmin Lee, Seonghyuk Hong, Juyong Lee, Jaeyoon Lee, Jongwon Choi 


Abstract:

Multi-modal data fusion plays a crucial role in integrating diverse physical properties. While RGB images capture external visual features, they lack internal features, whereas X-ray images reveal internal structures but lack external details. To bridge this gap, we propose InsideOut, a novel 3DGS framework that integrates RGB and X-ray data to represent the structure and appearance of objects. Our approach consists of three key components: internal structure training, hierarchical fitting, and detail-preserving refinement. First, RGB and radiative Gaussian splats are trained to capture surface structure. Then, hierarchical fitting ensures scale and positional synchronization between the two modalities. Next, cross-sectional images are incorporated to learn internal structures and refine layer boundaries. Finally, the aligned Gaussian splats receive color from RGB Gaussians, and fine Gaussians are duplicated to enhance surface details. Experiments conducted on a newly collected dataset of paired RGB and X-ray images demonstrate the effectiveness of InsideOut in accurately representing internal and external structures.
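The abstract states that hierarchical fitting synchronizes scale and position between the RGB and radiative Gaussians. The paper's actual procedure is not available here; as a loosely related illustration only, the sketch below shows one generic way to fit a similarity transform (scale, rotation, translation) between corresponding 3D point sets, such as Gaussian centers, via the closed-form Umeyama alignment. The function name and the choice of Umeyama alignment itself are assumptions, not the authors' method.

import numpy as np

def fit_similarity_transform(src, dst):
    """Closed-form similarity transform (Umeyama alignment) mapping src onto dst.

    src, dst: corresponding 3D points, shape (N, 3)
    returns:  scale s, rotation R (3x3), translation t (3,)
              such that  dst ≈ s * src @ R.T + t
    """
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    cov = dst_c.T @ src_c / len(src)
    U, S, Vt = np.linalg.svd(cov)
    d = np.sign(np.linalg.det(U @ Vt))        # guard against reflections
    D = np.diag([1.0, 1.0, d])
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / src_c.var(axis=0).sum()
    t = mu_dst - s * R @ mu_src
    return s, R, t

# toy example: recover a known scale and shift between two point sets
src = np.random.rand(100, 3)
dst = 2.0 * src + np.array([0.5, -1.0, 0.3])
s, R, t = fit_similarity_transform(src, dst)
print(round(s, 3), t.round(3))  # ~2.0 and the shift above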

__________________________________________________________________________________________________________________________________________________________________________________________


Title: 

AJAHR: Amputated Joint Aware 3D Human Mesh Recovery


Authors:

Hyunjin Cho, Giyun Choi, Jongwon Choi


Abstract:

Existing Human Mesh Recovery (HMR) methods typically assume a standard human body structure, overlooking diverse anatomical conditions such as limb loss or mobility impairments. This assumption biases the models when applied to individuals with disabilities, a shortcoming further exacerbated by the limited availability of suitable datasets. To address this gap, we propose Amputated Joint Aware Human Recovery (AJAHR), an adaptive pose estimation framework that enhances mesh reconstruction for individuals with impairments. Our model incorporates a body-part amputation classifier, jointly trained alongside human mesh recovery, to detect potential amputations. We also introduce Amputee 3D (A3D), a synthetic dataset offering a wide range of amputee poses for more robust training. While maintaining strong performance on non-amputees, our approach achieves state-of-the-art results for amputated individuals.
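The abstract describes a body-part amputation classifier trained jointly with mesh recovery. A common generic pattern for such joint training is a shared backbone with two heads and a summed loss; the PyTorch sketch below shows only that pattern, with the backbone, joint count, parameter dimensions, and loss weighting all being assumptions rather than the authors' design.

import torch
import torch.nn as nn

NUM_JOINTS = 24            # SMPL-style joint count (assumption)
POSE_DIM = NUM_JOINTS * 6  # 6D rotation per joint (assumption)

class JointHMRWithAmputationHead(nn.Module):
    """Shared image backbone feeding (a) a mesh-parameter regressor and
    (b) a per-joint amputation classifier, so both tasks are trained together."""

    def __init__(self, feat_dim=512):
        super().__init__()
        self.backbone = nn.Sequential(           # stand-in for a real image encoder
            nn.Flatten(), nn.LazyLinear(feat_dim), nn.ReLU()
        )
        self.pose_head = nn.Linear(feat_dim, POSE_DIM + 10)  # pose + shape params
        self.amput_head = nn.Linear(feat_dim, NUM_JOINTS)    # per-joint amputation logits

    def forward(self, images):
        feats = self.backbone(images)
        return self.pose_head(feats), self.amput_head(feats)

model = JointHMRWithAmputationHead()
images = torch.randn(4, 3, 224, 224)
pose_pred, amput_logits = model(images)

# joint loss: mesh-parameter regression + per-joint amputation classification
pose_gt = torch.randn(4, POSE_DIM + 10)
amput_gt = torch.randint(0, 2, (4, NUM_JOINTS)).float()
loss = nn.functional.mse_loss(pose_pred, pose_gt) \
     + 0.1 * nn.functional.binary_cross_entropy_with_logits(amput_logits, amput_gt)
loss.backward()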




