Re: Something very strange on x86_64 2.6.X kernels


Eric Dumazet <dada1@xxxxxxxxxxxxx> wrote:
>
> Hi Andi
> 
> I have very strange coredumps happening on a big 64-bit program.
> 
> Some background :
> - This program is multi-threaded
> - Machine is a dual Opteron 248 machine, 12GB ram.
> - Kernel 2.6.6  (tried 2.6.10 too, but same problems)
> - The program uses hugetlb pages.
> - The program uses prefetchnta
> - The program uses about 8GB of ram.
> 
> After numerous different core dumps of this program, and gdb debugging 
> I found :
> 
> Every time the crash occurs when one thread is using some ram located at 
> virtual address 0xffffe6xx

What does "using" mean?  Is the program executing from that location?

> When examining the core image, the data saved on this page seems correct 
> (i.e. contains coherent user data). But one register (%rbx) is usually 
> corrupted and contains a small value (like 0x3c)
> 
> The last instruction using this register is :
> 	prefetchnta 0x18(,%rbx,4)
> 
> 
> Examining the Linux sources, I found that 0xffffe000 is 'special' (ia32 
> vsyscall) and 0xffffe600 is about the sigreturn subsection of this special area.
> 
> Is it possible some vm trick just kicks in and corrupts my true 64-bit 
> program ?
> 

Interesting.  IIRC, opterons will very occasionally (and incorrectly) take
a fault when performing a prefetch against a dud pointer.  The kernel will
fix that up.  At a guess, I'd say that the fixup code isn't doing the right
thing when the faulting EIP is in the vsyscall page.
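
For reference, the address arithmetic behind the quoted instruction can be
checked with a small sketch. This is illustrative only and not taken from
the kernel sources: the helper names are hypothetical, the 4 KiB page size
is an assumption, and the second index value is made up purely to show
which kind of %rbx would make the operand land at 0xffffe600.

```python
# Illustrative arithmetic only; helper names are hypothetical.
IA32_VSYSCALL_BASE = 0xffffe000  # address quoted in the report
PAGE_SIZE = 0x1000               # assumption: one 4 KiB page

def effective_address(disp, index, scale):
    """Address of an AT&T operand disp(,index,scale) with no base
    register, wrapped to 64 bits."""
    return (disp + index * scale) & 0xFFFFFFFFFFFFFFFF

def in_ia32_vsyscall_page(addr):
    """True if addr falls inside the page starting at 0xffffe000."""
    return IA32_VSYSCALL_BASE <= addr < IA32_VSYSCALL_BASE + PAGE_SIZE

# prefetchnta 0x18(,%rbx,4) with the corrupted %rbx seen in the core dump
corrupted = effective_address(0x18, 0x3c, 4)
print(hex(corrupted), in_ia32_vsyscall_page(corrupted))

# Hypothetical %rbx chosen so that the operand resolves to 0xffffe600
hypothetical = effective_address(0x18, 0x3ffff97a, 4)
print(hex(hypothetical), in_ia32_vsyscall_page(hypothetical))
```

With %rbx = 0x3c the operand resolves to 0x108, nowhere near the vsyscall
page, which is consistent with the register having been clobbered after
the prefetch was issued rather than before.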

