Summary: Tests at Google suggest that today’s RAM modules suffer several thousand errors a year. These errors would be correctable, except that most of us aren’t using ECC RAM.
Previous research, such as some data from a 300-computer cluster, showed that memory modules had correctable error rates of 200 to 5,000 failures per billion hours of operation. Google, though, found the rate much higher: 25,000 to 75,000 failures per billion hours.
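To get a feel for what these rates mean in practice, here is a rough back-of-the-envelope conversion in Python. It is only a sketch: it assumes the figures are failures per billion hours per megabit of memory, which the text above does not state, and the function name `errors_per_year` is made up for illustration.

```python
HOURS_PER_YEAR = 24 * 365   # 8,760 hours, ignoring leap years
MBIT_PER_GB = 8 * 1024      # 8,192 megabits in one gigabyte


def errors_per_year(fit_per_mbit, gigabytes):
    """Expected correctable errors per year for a given amount of RAM.

    fit_per_mbit: failures per billion hours of operation, ASSUMED here
                  to be quoted per megabit of memory.
    gigabytes:    installed RAM in GB.
    """
    errors_per_hour = fit_per_mbit / 1e9 * MBIT_PER_GB * gigabytes
    return errors_per_hour * HOURS_PER_YEAR


# The low and high ends of the range Google reported, for 1 GB of RAM.
for fit in (25_000, 75_000):
    print(f"{fit:,} per billion hours: {errors_per_year(fit, 1):,.0f} errors/year")
```

Under that per-megabit assumption, the range works out to roughly 1,800 to 5,400 errors per year for each gigabyte of RAM, which lines up with the "several thousand errors a year" in the summary.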
This matters for database servers because, unlike desktop machines that mainly read, they write a lot. In the MySQL context, a bit flipped in RAM could corrupt your stored data, or the data could be fine on disk while you read a corrupted copy from memory. More RAM is good for performance, but it also means a bigger in-memory footprint for your data and thus more exposure to the problem.
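To make the bit-flip scenario concrete, here is a minimal Python sketch. It is not MySQL code: the payload and the `flip_bit` helper are hypothetical, and CRC32 is used only as a generic example of a check that can notice such a change.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


row = b"balance=1000"            # hypothetical payload sitting in RAM
stored_crc = zlib.crc32(row)     # checksum taken while the data was known good

# Invert a single bit in the byte holding the first digit.
corrupted = flip_bit(row, 67)

print(corrupted)                              # still looks like valid data
print(zlib.crc32(corrupted) == stored_crc)    # the checksum no longer matches
```

One flipped bit turns `1000` into `9000`, and nothing about the value itself looks wrong. Only a comparison against a checksum (or ECC in hardware) reveals that it changed.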
In MySQL 5.0 and the general 5.1, the binary and relay logs do not have checksums on log events. If something gets corrupted anywh