In the Linux kernel, the following vulnerability has been resolved:
device-dax: correct pgoff align in dax_set_mapping()
pgoff should be aligned using ALIGN_DOWN() instead of ALIGN(). Otherwise,
vmf->address not aligned to fault_size will be aligned to the next
alignment, that can result in memory failure getting the wrong address.
It's a subtle situation that only can be observed in
page_mapped_in_vma() after the page is page fault handled by
dev_dax_huge_fault. Generally, there is little chance to perform
page_mapped_in_vma in dev-dax's page unless in specific error injection
to the dax device to trigger an MCE - memory-failure. In that case,
page_mapped_in_vma() will be triggered to determine which task is
accessing the failure address and kill that task in the end.
We used self-developed dax device (which is 2M aligned mapping) , to
perform error injection to random address. It turned out that error
injected to non-2M-aligned address was causing endless MCE until panic.
Because page_mapped_in_vma() kept resulting wrong address and the task
accessing the failure address was never killed properly:
[ 3783.719419] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3784.049006] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3784.049190] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3784.448042] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3784.448186] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3784.792026] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3784.792179] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3785.162502] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3785.162633] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3785.461116] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3785.461247] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3785.764730] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3785.764859] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3786.042128] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3786.042259] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3786.464293] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3786.464423] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3786.818090] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3786.818217] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3787.085297] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3787.085424] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
It took us several weeks to pinpoint this problem, but we eventually
used bpftrace to trace the page fault and mce address and successfully
identified the issue.
Joao added:
; Likely we never reproduce in production because we always pin
: device-dax regions in the region align they provide (Qemu does
: similarly with prealloc in hugetlb/file backed memory). I think this
: bug requires that we touch *unpinned* device-dax regions unaligned to
: the device-dax selected alignment (page size i.e. 4K/2M/1G)
References
Configurations
Configuration 1 (hide)
|
History
25 Oct 2024, 15:05
Type | Values Removed | Values Added |
---|---|---|
References | () https://git.kernel.org/stable/c/7fcbd9785d4c17ea533c42f20a9083a83f301fa6 - Patch | |
References | () https://git.kernel.org/stable/c/9c4198dfdca818c5ce19c764d90eabd156bbc6da - Patch | |
References | () https://git.kernel.org/stable/c/b822007e8db341d6f175c645ed79866db501ad86 - Patch | |
References | () https://git.kernel.org/stable/c/e877427d218159ac29c9326100920d24330c9ee6 - Patch | |
CWE | NVD-CWE-noinfo | |
CVSS |
v2 : v3 : |
v2 : unknown
v3 : 5.5 |
CPE | cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:* cpe:2.3:o:linux:linux_kernel:6.12:rc2:*:*:*:*:*:* cpe:2.3:o:linux:linux_kernel:6.12:rc1:*:*:*:*:*:* |
|
First Time |
Linux linux Kernel
Linux |
23 Oct 2024, 15:12
Type | Values Removed | Values Added |
---|---|---|
Summary |
|
21 Oct 2024, 20:15
Type | Values Removed | Values Added |
---|---|---|
New CVE |
Information
Published : 2024-10-21 20:15
Updated : 2024-10-25 15:05
NVD link : CVE-2024-50022
Mitre link : CVE-2024-50022
CVE.ORG link : CVE-2024-50022
JSON object : View
Products Affected
linux
- linux_kernel
CWE