commit 0c4f5371eb6b0566db53eb2437af2bbfc183e566 Author: Greg Kroah-Hartman Date: Tue Mar 11 16:10:41 2014 -0700 Linux 3.4.83 commit 176485d0f36455db28edec5ab6581842c8d17203 Author: Emil Goode Date: Thu Feb 13 19:30:39 2014 +0100 net: asix: add missing flag to struct driver_info commit d43ff4cd798911736fb39025ec8004284b1b0bc2 upstream. The struct driver_info ax88178_info is assigned the function asix_rx_fixup_common as it's rx_fixup callback. This means that FLAG_MULTI_PACKET must be set as this function is cloning the data and calling usbnet_skb_return. Not setting this flag leads to usbnet_skb_return beeing called a second time from within the rx_process function in the usbnet module. Signed-off-by: Emil Goode Reported-by: Bjørn Mork Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 20d700bc51e717c502e51c3fb857b03ba8f99f34 Author: Lucas Stach Date: Thu Feb 27 12:51:38 2014 +0100 net: asix: handle packets crossing URB boundaries commit 8b5b6f5413e97c3e8bafcdd67553d508f4f698cd upstream. ASIX AX88772B started to pack data even more tightly. Packets and the ASIX packet header may now cross URB boundaries. To handle this we have to introduce some state between individual calls to asix_rx_fixup(). Signed-off-by: Lucas Stach Signed-off-by: David S. Miller [ Emil: backported to 3.4: dropped changes to drivers/net/usb/ax88172a.c ] Signed-off-by: Emil Goode Signed-off-by: Greg Kroah-Hartman commit 172ba81925a7f8fbec3c0f4146c28f223829e005 Author: Mark Cave-Ayland Date: Thu Feb 27 09:53:03 2014 +0800 rtlwifi: Fix endian error in extracting packet type commit 0c5d63f0ab6728f05ddefa25aff55e31297f95e6 upstream. All of the rtlwifi drivers have an error in the routine that tests if the data is "special". If it is, the subsequant transmission will be at the lowest rate to enhance reliability. The 16-bit quantity is big-endian, but was being extracted in native CPU mode. One of the effects of this bug is to inhibit association under some conditions as the TX rate is too high. Based on suggestions by Joe Perches, the entire routine is rewritten. One of the local headers contained duplicates of some of the ETH_P_XXX definitions. These are deleted. Signed-off-by: Larry Finger Cc: Mark Cave-Ayland Signed-off-by: John W. Linville [wujg: Backported to 3.4: - adjust context - remove rtlpriv->enter_ps = false - use schedule_work(&rtlpriv->works.lps_leave_work)] Signed-off-by: Jianguo Wu Signed-off-by: Greg Kroah-Hartman commit e87845b7d048925658199c863fe22b03e01e0438 Author: Emmanuel Grumbach Date: Thu Feb 27 09:53:02 2014 +0800 iwlwifi: pcie: add SKUs for 6000, 6005 and 6235 series commit 08a5dd3842f2ac61c6d69661d2d96022df8ae359 upstream. Add some new PCI IDs to the table for 6000, 6005 and 6235 series. Signed-off-by: Emmanuel Grumbach Signed-off-by: Johannes Berg [bwh: Backported to 3.2: - Adjust filenames - Drop const from struct iwl_cfg] Signed-off-by: Ben Hutchings [wujg: Backported to 3.4: - Adjust context - Do not drop const from struct iwl_cfg] Signed-off-by: Jianguo Wu Signed-off-by: Greg Kroah-Hartman commit 60e691100e48fea19bf35b28aa886ec12c892bab Author: Stanislaw Gruszka Date: Thu Feb 27 09:53:01 2014 +0800 iwlwifi: dvm: fix calling ieee80211_chswitch_done() with NULL commit 9186a1fd9ed190739423db84bc344d258ef3e3d7 upstream. If channel switch is pending and we remove interface we can crash like showed below due to passing NULL vif to mac80211: BUG: unable to handle kernel paging request at fffffffffffff8cc IP: [] strnlen+0xd/0x40 Call Trace: [] string.isra.3+0x3e/0xd0 [] vsnprintf+0x219/0x640 [] vscnprintf+0x11/0x30 [] vprintk_emit+0x115/0x4f0 [] printk+0x61/0x63 [] ieee80211_chswitch_done+0xaf/0xd0 [mac80211] [] iwl_chswitch_done+0x34/0x40 [iwldvm] [] iwlagn_commit_rxon+0x2a3/0xdc0 [iwldvm] [] ? iwlagn_set_rxon_chain+0x180/0x2c0 [iwldvm] [] iwl_set_mode+0x36/0x40 [iwldvm] [] iwlagn_mac_remove_interface+0x8d/0x1b0 [iwldvm] [] ieee80211_do_stop+0x29d/0x7f0 [mac80211] This is because we nulify ctx->vif in iwlagn_mac_remove_interface() before calling some other functions that teardown interface. To fix just check ctx->vif on iwl_chswitch_done(). We should not call ieee80211_chswitch_done() as channel switch works were already canceled by mac80211 in ieee80211_do_stop() -> ieee80211_mgd_stop(). Resolve: https://bugzilla.redhat.com/show_bug.cgi?id=979581 Reported-by: Lukasz Jagiello Signed-off-by: Stanislaw Gruszka Reviewed-by: Emmanuel Grumbach Signed-off-by: Johannes Berg [bwh: Backported to 3.2: adjust context, filename] Signed-off-by: Ben Hutchings [wujg: Backported to 3.4: - adjust context] Signed-off-by: Jianguo Wu Signed-off-by: Greg Kroah-Hartman commit b38b29d58fe03950b2ccd5eaa9af2acb3c1f084d Author: Johannes Berg Date: Thu Feb 27 09:53:00 2014 +0800 iwlwifi: dvm: don't send BT_CONFIG on devices w/o Bluetooth commit 707aee401d2467baa785a697f40a6e2d9ee79ad5 upstream. The BT_CONFIG command that is sent to the device during startup will enable BT coex unless the module parameter turns it off, but on devices without Bluetooth this may cause problems, as reported in Redhat BZ 885407. Fix this by sending the BT_CONFIG command only when the device has Bluetooth. Reviewed-by: Emmanuel Grumbach Signed-off-by: Johannes Berg [bwh: Backported to 3.2: - Adjust filename - s/priv->lib/priv->cfg/] Signed-off-by: Ben Hutchings [wujg: Backported to 3.4: - s/priv->cfg/priv->shrd->cfg/] Signed-off-by: Jianguo Wu Signed-off-by: Greg Kroah-Hartman commit a41adefcefde57643b4fdf4696385e829fd265e3 Author: Johannes Berg Date: Thu Feb 27 09:52:59 2014 +0800 iwlwifi: always copy first 16 bytes of commands commit 8a964f44e01ad3bbc208c3e80d931ba91b9ea786 upstream. The FH hardware will always write back to the scratch field in commands, even host commands not just TX commands, which can overwrite parts of the command. This is problematic if the command is re-used (with IWL_HCMD_DFL_NOCOPY) and can cause calibration issues. Address this problem by always putting at least the first 16 bytes into the buffer we also use for the command header and therefore make the DMA engine write back into this. For commands that are smaller than 16 bytes also always map enough memory for the DMA engine to write back to. Reviewed-by: Emmanuel Grumbach Signed-off-by: Johannes Berg [bwh: Backported to 3.2: - Adjust context - Drop the IWL_HCMD_DFL_DUP handling - Fix descriptor addresses and lengths for tracepoint, but otherwise leave it unchanged] Signed-off-by: Ben Hutchings [wujg: Backported to 3.4: adjust context] Signed-off-by: Jianguo Wu Signed-off-by: Greg Kroah-Hartman commit f2da4cebeffec9908b92ecb131b1108c3ef43769 Author: Johannes Berg Date: Thu Feb 27 09:52:58 2014 +0800 iwlwifi: handle DMA mapping failures commit 7c34158231b2eda8dcbd297be2bb1559e69cb433 upstream. The RX replenish code doesn't handle DMA mapping failures, which will cause issues if there actually is a failure. This was reported by Shuah Khan who found a DMA mapping framework warning ("device driver failed to check map error"). Reported-by: Shuah Khan Reviewed-by: Emmanuel Grumbach Signed-off-by: Johannes Berg [bwh: Backported to 3.2: - Adjust filename, context, indentation - Use bus(trans) instead of trans where necessary - Use hw_params(trans).rx_page_order instead of trans_pcie->rx_page_order] Signed-off-by: Ben Hutchings [wujg: Backported to 3.4: - Adjust context - Use trans instead of bus(trans) - Use hw_params(trans).rx_page_order instead of trans_pcie->rx_page_order] Signed-off-by: Jianguo Wu Signed-off-by: Greg Kroah-Hartman commit af720fc6ae76ea10216ede42b48ee0550d65e36f Author: Emmanuel Grumbach Date: Thu Feb 27 09:52:57 2014 +0800 iwlwifi: don't handle masked interrupt commit 25a172655f837bdb032e451f95441bb4acec51bb upstream. This can lead to a panic if the driver isn't ready to handle them. Since our interrupt line is shared, we can get an interrupt at any time (and CONFIG_DEBUG_SHIRQ checks that even when the interrupt is being freed). If the op_mode has gone away, we musn't call it. To avoid this the transport disables the interrupts when the hw is stopped and the op_mode is leaving. If there is an event that would cause an interrupt the INTA register is updated regardless of the enablement of the interrupts: even if the interrupts are disabled, the INTA will be changed, but the device won't issue an interrupt. But the ISR can be called at any time, so we ought ignore the value in the INTA otherwise we can call the op_mode after it was freed. I found this bug when the op_mode_start failed, and called iwl_trans_stop_hw(trans, true). Then I played with the RFKILL button, and removed the module. While removing the module, the IRQ is freed, and the ISR is called (CONFIG_DEBUG_SHIRQ enabled). Panic. Signed-off-by: Emmanuel Grumbach Reviewed-by: Gregory Greenman Signed-off-by: Johannes Berg [bwh: Backported to 3.2: - Adjust context - Pass bus(trans), not trans, to iwl_{read,write}32()] Signed-off-by: Ben Hutchings [wujg: Backported to 3.4: - adjust context - Pass trans to iwl_{read,write}32()}] Signed-off-by: Jianguo Wu Signed-off-by: Greg Kroah-Hartman commit 8234281aea5f98e5216a01d4955ce44025e95212 Author: Johannes Berg Date: Thu Feb 27 09:52:56 2014 +0800 iwlwifi: protect SRAM debugfs commit 4fc79db178f0a0ede479b4713e00df2d106028b3 upstream. If the device is not started, we can't read its SRAM and attempting to do so will cause issues. Protect the debugfs read. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville [wujg: Backported to 3.4: adjust context] Signed-off-by: Jianguo Wu Signed-off-by: Greg Kroah-Hartman commit 3ece09975ce21d9aa9a64d34d7f1f6ea179e8b4d Author: Johannes Berg Date: Thu Feb 27 09:52:55 2014 +0800 iwlwifi: fix flow handler debug code commit 94543a8d4fb302817014981489f15cb3b92ec3c2 upstream. iwl_dbgfs_fh_reg_read() can cause crashes and/or BUG_ON in slub because the ifdefs are wrong, the code in iwl_dump_fh() should use DEBUGFS, not DEBUG to protect the buffer writing code. Also, while at it, clean up the arguments to the function, some code and make it generally safer. Reported-by: Benjamin Herrenschmidt Signed-off-by: Johannes Berg Signed-off-by: John W. Linville [bwh: Backported to 3.2: adjust filenames and context] Signed-off-by: Ben Hutchings [wujg: Backported to 3.4: adjust context] Signed-off-by: Jianguo Wu Signed-off-by: Greg Kroah-Hartman commit 537762453afa3d6e03751d116b4cc21971262f8a Author: Takashi Iwai Date: Thu Jul 11 17:55:57 2013 +0200 ALSA: asihpi: Fix unlocked snd_pcm_stop() call commit 60478295d6876619f8f47f6d1a5c25eaade69ee3 upstream. snd_pcm_stop() must be called in the PCM substream lock context. Signed-off-by: Takashi Iwai Cc: Weng Meiling Signed-off-by: Greg Kroah-Hartman commit 565ca886fed885e444f33ee66d8ae19f0ef4bbe3 Author: Takashi Iwai Date: Thu Jul 11 18:02:38 2013 +0200 staging: line6: Fix unlocked snd_pcm_stop() call commit 86f0b5b86d142b9323432fef078a6cf0fb5dda74 upstream. snd_pcm_stop() must be called in the PCM substream lock context. Signed-off-by: Takashi Iwai Signed-off-by: Ben Hutchings Cc: Weng Meiling Signed-off-by: Greg Kroah-Hartman commit eeb57ebb4820dad25f283ed0e875c6a0a498ba54 Author: Takashi Iwai Date: Thu Jul 11 18:00:25 2013 +0200 ASoC: s6000: Fix unlocked snd_pcm_stop() call commit 61be2b9a18ec70f3cbe3deef7a5f77869c71b5ae upstream. snd_pcm_stop() must be called in the PCM substream lock context. Acked-by: Mark Brown Signed-off-by: Takashi Iwai Signed-off-by: Ben Hutchings Cc: Weng Meiling Signed-off-by: Greg Kroah-Hartman commit a0f8b1ca745aa237d690740f662a4e71d000941a Author: Takashi Iwai Date: Thu Jul 11 17:59:33 2013 +0200 ALSA: pxa2xx: Fix unlocked snd_pcm_stop() call commit 46f6c1aaf790be9ea3c8ddfc8f235a5f677d08e2 upstream. snd_pcm_stop() must be called in the PCM substream lock context. Acked-by: Mark Brown Signed-off-by: Takashi Iwai Signed-off-by: Ben Hutchings Cc: Weng Meiling Signed-off-by: Greg Kroah-Hartman commit 475274c68b42752d252021714895b56ee43e52f4 Author: Takashi Iwai Date: Thu Jul 11 17:58:47 2013 +0200 ALSA: usx2y: Fix unlocked snd_pcm_stop() call commit 5be1efb4c2ed79c3d7c0cbcbecae768377666e84 upstream. snd_pcm_stop() must be called in the PCM substream lock context. Signed-off-by: Takashi Iwai Signed-off-by: Ben Hutchings Cc: Weng Meiling Signed-off-by: Greg Kroah-Hartman commit 0843fce1e56c02d0f6667b88a12e2846dac3352b Author: Takashi Iwai Date: Thu Jul 11 17:58:25 2013 +0200 ALSA: ua101: Fix unlocked snd_pcm_stop() call commit 9538aa46c2427d6782aa10036c4da4c541605e0e upstream. snd_pcm_stop() must be called in the PCM substream lock context. Acked-by: Clemens Ladisch Signed-off-by: Takashi Iwai Signed-off-by: Ben Hutchings Cc: Weng Meiling Signed-off-by: Greg Kroah-Hartman commit 2f6101666e02d58039bdedc4deb320ed02e2b558 Author: Takashi Iwai Date: Thu Jul 11 17:57:55 2013 +0200 ALSA: 6fire: Fix unlocked snd_pcm_stop() call commit 5b9ab3f7324a1b94a5a5a76d44cf92dfeb3b5e80 upstream. snd_pcm_stop() must be called in the PCM substream lock context. Signed-off-by: Takashi Iwai Signed-off-by: Ben Hutchings Cc: Weng Meiling Signed-off-by: Greg Kroah-Hartman commit 593827efb8bfab1365124296a1059036fd7d6c9a Author: Takashi Iwai Date: Thu Jul 11 17:56:56 2013 +0200 ALSA: atiixp: Fix unlocked snd_pcm_stop() call commit cc7282b8d5abbd48c81d1465925d464d9e3eaa8f upstream. snd_pcm_stop() must be called in the PCM substream lock context. Signed-off-by: Takashi Iwai Signed-off-by: Ben Hutchings Cc: Weng Meiling Signed-off-by: Greg Kroah-Hartman commit 9e6114746b57adc3c6983b6cf96030372ed5d756 Author: Fabio Estevam Date: Thu Jul 4 20:01:02 2013 -0300 ASoC: sglt5000: Fix the default value of CHIP_SSS_CTRL commit 016fcab8ff46fca29375d484226ec91932aa4a07 upstream. According to the sgtl5000 reference manual, the default value of CHIP_SSS_CTRL is 0x10. Reported-by: Oskar Schirmer Signed-off-by: Fabio Estevam Signed-off-by: Mark Brown [bwh: Backported to 3.2: format of register defaults array is different] Signed-off-by: Ben Hutchings Cc: Weng Meiling Signed-off-by: Greg Kroah-Hartman commit d17395ac947090c3258799207df8e574fd0ca7fd Author: Sascha Hauer Date: Sun Mar 10 19:33:03 2013 +0100 ASoC: imx-ssi: Fix occasional AC97 reset failure commit b6e51600f4e983e757b1b6942becaa1ae7d82e67 upstream. Signed-off-by: Sascha Hauer Signed-off-by: Markus Pargmann Signed-off-by: Mark Brown [bwh: Backported to 3.2: adjust filename] Signed-off-by: Ben Hutchings Cc: Weng Meiling Signed-off-by: Greg Kroah-Hartman commit d4d811d55f75e02ae7beaea3dc611498bf2bf5fb Author: Trond Myklebust Date: Wed May 22 12:57:24 2013 -0400 SUNRPC: Prevent an rpc_task wakeup race commit a3c3cac5d31879cd9ae2de7874dc6544ca704aec upstream. The lockless RPC_IS_QUEUED() test in __rpc_execute means that we need to be careful about ordering the calls to rpc_test_and_set_running(task) and rpc_clear_queued(task). If we get the order wrong, then we may end up testing the RPC_TASK_RUNNING flag after __rpc_execute() has looped and changed the state of the rpc_task. Signed-off-by: Trond Myklebust Signed-off-by: Ben Hutchings Cc: Weng Meiling Signed-off-by: Greg Kroah-Hartman commit e8d5ce17375e2ece50659f287ad1a3daf2335d7e Author: Jeff Layton Date: Mon Jul 23 15:51:55 2012 -0400 sunrpc: clarify comments on rpc_make_runnable commit 506026c3ec270e18402f0c9d33fee37482c23861 upstream. rpc_make_runnable is not generally called with the queue lock held, unless it's waking up a task that has been sitting on a waitqueue. This is safe when the task has not entered the FSM yet, but the comments don't really spell this out. Signed-off-by: Jeff Layton Signed-off-by: Trond Myklebust Signed-off-by: Ben Hutchings Cc: Weng Meiling Signed-off-by: Greg Kroah-Hartman commit b7d2a5e8cfe2dde13c9d05e1d220c2fc66da3164 Author: David Vrabel Date: Thu Aug 15 13:21:07 2013 +0100 xen/events: mask events when changing their VCPU binding commit 5e72fdb8d827560893642e85a251d339109a00f4 upstream. commit 4704fe4f03a5ab27e3c36184af85d5000e0f8a48 upstream. When a event is being bound to a VCPU there is a window between the EVTCHNOP_bind_vpcu call and the adjustment of the local per-cpu masks where an event may be lost. The hypervisor upcalls the new VCPU but the kernel thinks that event is still bound to the old VCPU and ignores it. There is even a problem when the event is being bound to the same VCPU as there is a small window beween the clear_bit() and set_bit() calls in bind_evtchn_to_cpu(). When scanning for pending events, the kernel may read the bit when it is momentarily clear and ignore the event. Avoid this by masking the event during the whole bind operation. Signed-off-by: David Vrabel Signed-off-by: Konrad Rzeszutek Wilk Reviewed-by: Jan Beulich [bwh: Backported to 3.2: remove the BM() cast] Signed-off-by: Ben Hutchings Cc: Yijing Wang Signed-off-by: Greg Kroah-Hartman commit 62047439828b2f7c984b7471d00dc11a07a1452b Author: Konrad Rzeszutek Wilk Date: Wed Jan 23 16:54:32 2013 -0500 xen/blkback: Check for insane amounts of request on the ring (v6). commit 9371cadbbcc7c00c81753b9727b19fb3bc74d458 upstream. commit 8e3f8755545cc4a7f4da8e9ef76d6d32e0dca576 upstream. Check that the ring does not have an insane amount of requests (more than there could fit on the ring). If we detect this case we will stop processing the requests and wait until the XenBus disconnects the ring. The existing check RING_REQUEST_CONS_OVERFLOW which checks for how many responses we have created in the past (rsp_prod_pvt) vs requests consumed (req_cons) and whether said difference is greater or equal to the size of the ring, does not catch this case. Wha the condition does check if there is a need to process more as we still have a backlog of responses to finish. Note that both of those values (rsp_prod_pvt and req_cons) are not exposed on the shared ring. To understand this problem a mini crash course in ring protocol response/request updates is in place. There are four entries: req_prod and rsp_prod; req_event and rsp_event to track the ring entries. We are only concerned about the first two - which set the tone of this bug. The req_prod is a value incremented by frontend for each request put on the ring. Conversely the rsp_prod is a value incremented by the backend for each response put on the ring (rsp_prod gets set by rsp_prod_pvt when pushing the responses on the ring). Both values can wrap and are modulo the size of the ring (in block case that is 32). Please see RING_GET_REQUEST and RING_GET_RESPONSE for the more details. The culprit here is that if the difference between the req_prod and req_cons is greater than the ring size we have a problem. Fortunately for us, the '__do_block_io_op' loop: rc = blk_rings->common.req_cons; rp = blk_rings->common.sring->req_prod; while (rc != rp) { .. blk_rings->common.req_cons = ++rc; /* before make_response() */ } will loop up to the point when rc == rp. The macros inside of the loop (RING_GET_REQUEST) is smart and is indexing based on the modulo of the ring size. If the frontend has provided a bogus req_prod value we will loop until the 'rc == rp' - which means we could be processing already processed requests (or responses) often. The reason the RING_REQUEST_CONS_OVERFLOW is not helping here is b/c it only tracks how many responses we have internally produced and whether we would should process more. The astute reader will notice that the macro RING_REQUEST_CONS_OVERFLOW provides two arguments - more on this later. For example, if we were to enter this function with these values: blk_rings->common.sring->req_prod = X+31415 (X is the value from the last time __do_block_io_op was called). blk_rings->common.req_cons = X blk_rings->common.rsp_prod_pvt = X The RING_REQUEST_CONS_OVERFLOW(&blk_rings->common, blk_rings->common.req_cons) is doing: req_cons - rsp_prod_pvt >= 32 Which is, X - X >= 32 or 0 >= 32 And that is false, so we continue on looping (this bug). If we re-use said macro RING_REQUEST_CONS_OVERFLOW and pass in the rp instead (sring->req_prod) of rc, the this macro can do the check: req_prod - rsp_prov_pvt >= 32 Which is, X + 31415 - X >= 32 , or 31415 >= 32 which is true, so we can error out and break out of the function. Unfortunatly the difference between rsp_prov_pvt and req_prod can be at 32 (which would error out in the macro). This condition exists when the backend is lagging behind with the responses and still has not finished responding to all of them (so make_response has not been called), and the rsp_prov_pvt + 32 == req_cons. This ends up with us not being able to use said macro. Hence introducing a new macro called RING_REQUEST_PROD_OVERFLOW which does a simple check of: req_prod - rsp_prod_pvt > RING_SIZE And with the X values from above: X + 31415 - X > 32 Returns true. Also not that if the ring is full (which is where the RING_REQUEST_CONS_OVERFLOW triggered), we would not hit the same condition: X + 32 - X > 32 Which is false. Lets use that macro. Note that in v5 of this patchset the macro was different - we used an earlier version. [v1: Move the check outside the loop] [v2: Add a pr_warn as suggested by David] [v3: Use RING_REQUEST_CONS_OVERFLOW as suggested by Jan] [v4: Move wake_up after kthread_stop as suggested by Jan] [v5: Use RING_REQUEST_PROD_OVERFLOW instead] [v6: Use RING_REQUEST_PROD_OVERFLOW - Jan's version] Signed-off-by: Konrad Rzeszutek Wilk Reviewed-by: Jan Beulich [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings Cc: Yijing Wang Signed-off-by: Greg Kroah-Hartman commit 23ced59b3765bc593712fda19af40658829db197 Author: Jan Beulich Date: Mon Jun 17 15:16:33 2013 -0400 xen/io/ring.h: new macro to detect whether there are too many requests on the ring commit 8d9256906a97c24e97e016482b9be06ea2532b05 upstream. Backends may need to protect themselves against an insane number of produced requests stored by a frontend, in case they iterate over requests until reaching the req_prod value. There can't be more requests on the ring than the difference between produced requests and produced (but possibly not yet published) responses. This is a more strict alternative to a patch previously posted by Konrad Rzeszutek Wilk . Signed-off-by: Jan Beulich Signed-off-by: Konrad Rzeszutek Wilk Signed-off-by: Ben Hutchings Cc: Yijing Wang Signed-off-by: Greg Kroah-Hartman commit f1369580791ff1eb18a210a58405131e2c3611d2 Author: Wei Liu Date: Mon Apr 22 02:20:43 2013 +0000 xen-netback: don't disconnect frontend when seeing oversize packet commit 03393fd5cc2b6cdeec32b704ecba64dbb0feae3c upstream. Some frontend drivers are sending packets > 64 KiB in length. This length overflows the length field in the first slot making the following slots have an invalid length. Turn this error back into a non-fatal error by dropping the packet. To avoid having the following slots having fatal errors, consume all slots in the packet. This does not reopen the security hole in XSA-39 as if the packet as an invalid number of slots it will still hit fatal error case. Signed-off-by: David Vrabel Signed-off-by: Wei Liu Acked-by: Ian Campbell Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings Cc: Yijing Wang Signed-off-by: Greg Kroah-Hartman commit 9832f4a0fd7b5f412b2f41ede5b431bd2102b8dd Author: Wei Liu Date: Mon Apr 22 02:20:42 2013 +0000 xen-netback: coalesce slots in TX path and fix regressions commit 2810e5b9a7731ca5fce22bfbe12c96e16ac44b6f upstream. This patch tries to coalesce tx requests when constructing grant copy structures. It enables netback to deal with situation when frontend's MAX_SKB_FRAGS is larger than backend's MAX_SKB_FRAGS. With the help of coalescing, this patch tries to address two regressions avoid reopening the security hole in XSA-39. Regression 1. The reduction of the number of supported ring entries (slots) per packet (from 18 to 17). This regression has been around for some time but remains unnoticed until XSA-39 security fix. This is fixed by coalescing slots. Regression 2. The XSA-39 security fix turning "too many frags" errors from just dropping the packet to a fatal error and disabling the VIF. This is fixed by coalescing slots (handling 18 slots when backend's MAX_SKB_FRAGS is 17) which rules out false positive (using 18 slots is legit) and dropping packets using 19 to `max_skb_slots` slots. To avoid reopening security hole in XSA-39, frontend sending packet using more than max_skb_slots is considered malicious. The behavior of netback for packet is thus: 1-18 slots: valid 19-max_skb_slots slots: drop and respond with an error max_skb_slots+ slots: fatal error max_skb_slots is configurable by admin, default value is 20. Also change variable name from "frags" to "slots" in netbk_count_requests. Please note that RX path still has dependency on MAX_SKB_FRAGS. This will be fixed with separate patch. Signed-off-by: Wei Liu Acked-by: Ian Campbell Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings Cc: Yijing Wang Signed-off-by: Greg Kroah-Hartman commit 047140a3c2a68b7f1ce24dad37ba2463031aef6a Author: stephen hemminger Date: Wed Apr 10 10:54:46 2013 +0000 xen-netback: fix sparse warning commit 9eaee8beeeb3bca0d9b14324fd9d467d48db784c upstream. Fix warning about 0 used as NULL. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller Signed-off-by: Ben Hutchings Cc: Yijing Wang Signed-off-by: Greg Kroah-Hartman commit 63f12e8d2bea38715b30a6051325230f6ec25a3b Author: Konrad Rzeszutek Wilk Date: Tue Apr 16 14:08:50 2013 -0400 xen/smp/spinlock: Fix leakage of the spinlock interrupt line for every CPU online/offline commit 66ff0fe9e7bda8aec99985b24daad03652f7304e upstream. While we don't use the spinlock interrupt line (see for details commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23 - xen: disable PV spinlocks on HVM) - we should still do the proper init / deinit sequence. We did not do that correctly and for the CPU init for PVHVM guest we would allocate an interrupt line - but failed to deallocate the old interrupt line. This resulted in leakage of an irq_desc but more importantly this splat as we online an offlined CPU: genirq: Flags mismatch irq 71. 0002cc20 (spinlock1) vs. 0002cc20 (spinlock1) Pid: 2542, comm: init.late Not tainted 3.9.0-rc6upstream #1 Call Trace: [] __setup_irq+0x23e/0x4a0 [] ? kmem_cache_alloc_trace+0x221/0x250 [] request_threaded_irq+0xfb/0x160 [] ? xen_spin_trylock+0x20/0x20 [] bind_ipi_to_irqhandler+0xa3/0x160 [] ? kasprintf+0x38/0x40 [] ? xen_spin_trylock+0x20/0x20 [] ? update_max_interval+0x15/0x40 [] xen_init_lock_cpu+0x3c/0x78 [] xen_hvm_cpu_notify+0x29/0x33 [] notifier_call_chain+0x4d/0x70 [] __raw_notifier_call_chain+0x9/0x10 [] __cpu_notify+0x1b/0x30 [] _cpu_up+0xa0/0x14b [] cpu_up+0xd9/0xec [] store_online+0x94/0xd0 [] dev_attr_store+0x1b/0x20 [] sysfs_write_file+0xf4/0x170 [] vfs_write+0xb4/0x130 [] sys_write+0x5a/0xa0 [] system_call_fastpath+0x16/0x1b cpu 1 spinlock event irq -16 smpboot: Booting Node 0 Processor 1 APIC 0x2 And if one looks at the /proc/interrupts right after offlining (CPU1): 70: 0 0 xen-percpu-ipi spinlock0 71: 0 0 xen-percpu-ipi spinlock1 77: 0 0 xen-percpu-ipi spinlock2 There is the oddity of the 'spinlock1' still being present. Signed-off-by: Konrad Rzeszutek Wilk [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings Cc: Yijing Wang Signed-off-by: Greg Kroah-Hartman commit 32ed904ec15d37d35afae9ce784951ec955d20a5 Author: Konrad Rzeszutek Wilk Date: Tue Apr 16 13:49:26 2013 -0400 xen/smp: Fix leakage of timer interrupt line for every CPU online/offline. commit 888b65b4bc5e7fcbbb967023300cd5d44dba1950 upstream. In the PVHVM path when we do CPU online/offline path we would leak the timer%d IRQ line everytime we do a offline event. The online path (xen_hvm_setup_cpu_clockevents via x86_cpuinit.setup_percpu_clockev) would allocate a new interrupt line for the timer%d. But we would still use the old interrupt line leading to: kernel BUG at /home/konrad/ssd/konrad/linux/kernel/hrtimer.c:1261! invalid opcode: 0000 [#1] SMP RIP: 0010:[] [] hrtimer_interrupt+0x261/0x270 .. snip.. [] xen_timer_interrupt+0x2f/0x1b0 [] ? stop_machine_cpu_stop+0xb5/0xf0 [] handle_irq_event_percpu+0x7c/0x240 [] handle_percpu_irq+0x49/0x70 [] __xen_evtchn_do_upcall+0x1c3/0x2f0 [] xen_evtchn_do_upcall+0x2a/0x40 [] xen_hvm_callback_vector+0x6d/0x80 [] ? start_secondary+0x193/0x1a8 [] ? start_secondary+0x18f/0x1a8 There is also the oddity (timer1) in the /proc/interrupts after offlining CPU1: 64: 1121 0 xen-percpu-virq timer0 78: 0 0 xen-percpu-virq timer1 84: 0 2483 xen-percpu-virq timer2 This patch fixes it. Signed-off-by: Konrad Rzeszutek Wilk [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings Cc: Yijing Wang Signed-off-by: Greg Kroah-Hartman commit 79b6a2d6bd21bf90b38dbad0cf9210235944e8f9 Author: Konrad Rzeszutek Wilk Date: Wed Sep 19 08:30:55 2012 -0400 xen/boot: Disable BIOS SMP MP table search. commit bd49940a35ec7d488ae63bd625639893b3385b97 upstream. As the initial domain we are able to search/map certain regions of memory to harvest configuration data. For all low-level we use ACPI tables - for interrupts we use exclusively ACPI _PRT (so DSDT) and MADT for INT_SRC_OVR. The SMP MP table is not used at all. As a matter of fact we do not even support machines that only have SMP MP but no ACPI tables. Lets follow how Moorestown does it and just disable searching for BIOS SMP tables. This also fixes an issue on HP Proliant BL680c G5 and DL380 G6: 9f->100 for 1:1 PTE Freeing 9f-100 pfn range: 97 pages freed 1-1 mapping on 9f->100 .. snip.. e820: BIOS-provided physical RAM map: Xen: [mem 0x0000000000000000-0x000000000009efff] usable Xen: [mem 0x000000000009f400-0x00000000000fffff] reserved Xen: [mem 0x0000000000100000-0x00000000cfd1dfff] usable .. snip.. Scan for SMP in [mem 0x00000000-0x000003ff] Scan for SMP in [mem 0x0009fc00-0x0009ffff] Scan for SMP in [mem 0x000f0000-0x000fffff] found SMP MP-table at [mem 0x000f4fa0-0x000f4faf] mapped at [ffff8800000f4fa0] (XEN) mm.c:908:d0 Error getting mfn 100 (pfn 5555555555555555) from L1 entry 0000000000100461 for l1e_owner=0, pg_owner=0 (XEN) mm.c:4995:d0 ptwr_emulate: could not get_page_from_l1e() BUG: unable to handle kernel NULL pointer dereference at (null) IP: [] xen_set_pte_init+0x66/0x71 . snip.. Pid: 0, comm: swapper Not tainted 3.6.0-rc6upstream-00188-gb6fb969-dirty #2 HP ProLiant BL680c G5 .. snip.. Call Trace: [] __early_ioremap+0x18a/0x248 [] ? printk+0x48/0x4a [] early_ioremap+0x13/0x15 [] get_mpc_size+0x2f/0x67 [] smp_scan_config+0x10c/0x136 [] default_find_smp_config+0x36/0x5a [] setup_arch+0x5b3/0xb5b [] ? printk+0x48/0x4a [] start_kernel+0x90/0x390 [] x86_64_start_reservations+0x131/0x136 [] xen_start_kernel+0x65f/0x661 (XEN) Domain 0 crashed: 'noreboot' set - not rebooting. which is that ioremap would end up mapping 0xff using _PAGE_IOMAP (which is what early_ioremap sticks as a flag) - which meant we would get MFN 0xFF (pte ff461, which is OK), and then it would also map 0x100 (b/c ioremap tries to get page aligned request, and it was trying to map 0xf4fa0 + PAGE_SIZE - so it mapped the next page) as _PAGE_IOMAP. Since 0x100 is actually a RAM page, and the _PAGE_IOMAP bypasses the P2M lookup we would happily set the PTE to 1000461. Xen would deny the request since we do not have access to the Machine Frame Number (MFN) of 0x100. The P2M[0x100] is for example 0x80140. Fixes-Oracle-Bugzilla: https://bugzilla.oracle.com/bugzilla/show_bug.cgi?id=13665 Acked-by: Jan Beulich Signed-off-by: Konrad Rzeszutek Wilk [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings Cc: Yijing Wang Signed-off-by: Greg Kroah-Hartman commit d54ecc0f3edbf3d50b508e45b7a6b80d73f7ed64 Author: Takashi Iwai Date: Thu Jul 11 18:00:59 2013 +0200 saa7134: Fix unlocked snd_pcm_stop() call commit e6355ad7b1c6f70e2f48ae159f5658b441ccff95 upstream. snd_pcm_stop() must be called in the PCM substream lock context. Signed-off-by: Takashi Iwai [wml: Backported to 3.4: Adjust filename] Signed-off-by: Weng Meiling Signed-off-by: Greg Kroah-Hartman commit c28414d3497a0db41a9fb08650131ee15989ee9c Author: Theodore Ts'o Date: Sat Jan 12 16:19:36 2013 -0500 ext4: return ENOMEM if sb_getblk() fails commit 860d21e2c585f7ee8a4ecc06f474fdc33c9474f4 upstream. The only reason for sb_getblk() failing is if it can't allocate the buffer_head. So ENOMEM is more appropriate than EIO. In addition, make sure that the file system is marked as being inconsistent if sb_getblk() fails. Signed-off-by: "Theodore Ts'o" [xr: Backported to 3.4: - Drop change to inline.c - Call to ext4_ext_check() from ext4_ext_find_extent() is conditional] Signed-off-by: Rui Xiang Signed-off-by: Greg Kroah-Hartman commit 0ef4c881e4ade7a45174d9cf45eca6f9e10f5ccc Author: Roland Dreier Date: Thu Nov 22 02:00:11 2012 -0800 block: Don't access request after it might be freed commit 893d290f1d7496db97c9471bc352ad4a11dc8a25 upstream. After we've done __elv_add_request() and __blk_run_queue() in blk_execute_rq_nowait(), the request might finish and be freed immediately. Therefore checking if the type is REQ_TYPE_PM_RESUME isn't safe afterwards, because if it isn't, rq might be gone. Instead, check beforehand and stash the result in a temporary. This fixes crashes in blk_execute_rq_nowait() I get occasionally when running with lots of memory debugging options enabled -- I think this race is usually harmless because the window for rq to be reallocated is so small. Signed-off-by: Roland Dreier Signed-off-by: Jens Axboe [xr: Backported to 3.4: adjust context] Signed-off-by: Rui Xiang Signed-off-by: Greg Kroah-Hartman commit 50e97121b728014dfbc34f07d1ac5a507466a2b7 Author: Paul Clements Date: Wed Jul 3 15:09:04 2013 -0700 nbd: correct disconnect behavior commit c378f70adbc1bbecd9e6db145019f14b2f688c7c upstream. Currently, when a disconnect is requested by the user (via NBD_DISCONNECT ioctl) the return from NBD_DO_IT is undefined (it is usually one of several error codes). This means that nbd-client does not know if a manual disconnect was performed or whether a network error occurred. Because of this, nbd-client's persist mode (which tries to reconnect after error, but not after manual disconnect) does not always work correctly. This change fixes this by causing NBD_DO_IT to always return 0 if a user requests a disconnect. This means that nbd-client can correctly either persist the connection (if an error occurred) or disconnect (if the user requested it). Signed-off-by: Paul Clements Acked-by: Rob Landley Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds [xr: Backported to 3.4: adjust context] Signed-off-by: Rui Xiang Signed-off-by: Greg Kroah-Hartman commit b0f9634dcc55be0ad7cfbc96c790bad780bd463d Author: Jeff Layton Date: Thu Dec 27 08:05:03 2012 -0500 cifs: adjust sequence number downward after signing NT_CANCEL request commit 31efee60f489c759c341454d755a9fd13de8c03d upstream. When a call goes out, the signing code adjusts the sequence number upward by two to account for the request and the response. An NT_CANCEL however doesn't get a response of its own, it just hurries the server along to get it to respond to the original request more quickly. Therefore, we must adjust the sequence number back down by one after signing a NT_CANCEL request. Reported-by: Tim Perry Signed-off-by: Jeff Layton Signed-off-by: Steve French [bwh: Backported to 3.2: adjust filename] Signed-off-by: Ben Hutchings Cc: Rui Xiang Signed-off-by: Greg Kroah-Hartman commit b54e3acc375bf344c2273662c61ce0265969b5fd Author: Jan Kara Date: Tue Jan 29 22:48:17 2013 -0500 ext4: fix possible use-after-free with AIO commit 091e26dfc156aeb3b73bc5c5f277e433ad39331c upstream. Running AIO is pinning inode in memory using file reference. Once AIO is completed using aio_complete(), file reference is put and inode can be freed from memory. So we have to be sure that calling aio_complete() is the last thing we do with the inode. Reviewed-by: Carlos Maiolino Acked-by: Jeff Moyer Signed-off-by: Jan Kara Signed-off-by: "Theodore Ts'o" [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings Cc: Rui Xiang Signed-off-by: Greg Kroah-Hartman commit 8a4188e2d84ab2ec720f29ede1799a6882969857 Author: Adam Thomas Date: Sat Feb 2 22:35:08 2013 +0000 UBIFS: fix double free of ubifs_orphan objects commit 8afd500cb52a5d00bab4525dd5a560d199f979b9 upstream. The last orphan in the dnext list has its dnext set to NULL. Because of that, ubifs_delete_orphan assumes that it is not on the dnext list and frees it immediately instead ignoring it as a second delete. The orphan is later freed again by erase_deleted. This change adds an explicit flag to ubifs_orphan indicating whether it is pending delete. Signed-off-by: Adam Thomas Signed-off-by: Artem Bityutskiy [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings Cc: Rui Xiang Signed-off-by: Greg Kroah-Hartman commit ebdc12a0b5ed501c23b52b1ab5d28aea681badcd Author: Theodore Ts'o Date: Wed Apr 3 22:02:52 2013 -0400 ext4/jbd2: don't wait (forever) for stale tid caused by wraparound commit d76a3a77113db020d9bb1e894822869410450bd9 upstream. In the case where an inode has a very stale transaction id (tid) in i_datasync_tid or i_sync_tid, it's possible that after a very large (2**31) number of transactions, that the tid number space might wrap, causing tid_geq()'s calculations to fail. Commit deeeaf13 "jbd2: fix fsync() tid wraparound bug", later modified by commit e7b04ac0 "jbd2: don't wake kjournald unnecessarily", attempted to fix this problem, but it only avoided kjournald spinning forever by fixing the logic in jbd2_log_start_commit(). Unfortunately, in the codepaths in fs/ext4/fsync.c and fs/ext4/inode.c that might call jbd2_log_start_commit() with a stale tid, those functions will subsequently call jbd2_log_wait_commit() with the same stale tid, and then wait for a very long time. To fix this, we replace the calls to jbd2_log_start_commit() and jbd2_log_wait_commit() with a call to a new function, jbd2_complete_transaction(), which will correctly handle stale tid's. As a bonus, jbd2_complete_transaction() will avoid locking j_state_lock for writing unless a commit needs to be started. This should have a small (but probably not measurable) improvement for ext4's scalability. Signed-off-by: "Theodore Ts'o" Reported-by: Ben Hutchings Reported-by: George Barnett [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings Cc: Rui Xiang Signed-off-by: Greg Kroah-Hartman commit d7d659d6baad472a8e810412e0c27cafe899bde9 Author: Dave Chiluk Date: Tue May 28 16:06:08 2013 -0500 ncpfs: fix rmdir returns Device or resource busy commit 698b8223631472bf982ed570b0812faa61955683 upstream. 1d2ef5901483004d74947bbf78d5146c24038fe7 caused a regression in ncpfs such that directories could no longer be removed. This was because ncp_rmdir checked to see if a dentry could be unhashed before allowing it to be removed. Since 1d2ef5901483004d74947bbf78d5146c24038fe7 introduced a change that incremented dentry->d_count causing it to always be greater than 1 unhash would always fail. Thus causing the error path in ncp_rmdir to always be taken. Removing this error path is safe as unhashing is still accomplished by calls to dput from vfs_rmdir. Signed-off-by: Dave Chiluk Signed-off-by: Petr Vandrovec Signed-off-by: Al Viro Signed-off-by: Ben Hutchings Cc: Rui Xiang Signed-off-by: Greg Kroah-Hartman commit 3d956c8a399285b1b51548fbaecad202d3a845c2 Author: Jeff Layton Date: Wed Aug 7 10:29:08 2013 -0400 cifs: don't instantiate new dentries in readdir for inodes that need to be revalidated immediately commit 757c4f6260febff982276818bb946df89c1105aa upstream. David reported that commit c2b93e06 (cifs: only set ops for inodes in I_NEW state) caused a regression with mfsymlinks. Prior to that patch, if a mfsymlink dentry was instantiated at readdir time, the inode would get a new set of ops when it was revalidated. After that patch, this did not occur. This patch addresses this by simply skipping instantiating dentries in the readdir codepath when we know that they will need to be immediately revalidated. The next attempt to use that dentry will cause a new lookup to occur (which is basically what we want to happen anyway). Cc: "Stefan (metze) Metzmacher" Cc: Sachin Prabhu Reported-and-Tested-by: David McBride Signed-off-by: Jeff Layton Signed-off-by: Steve French [bwh: Backported to 3.2: need to return NULL] Signed-off-by: Ben Hutchings Cc: Rui Xiang Signed-off-by: Greg Kroah-Hartman commit b3f19e7fb89f09ca56324a06b4ef7caf6745259b Author: majianpeng Date: Tue Jul 16 15:45:48 2013 +0800 libceph: unregister request in __map_request failed and nofail == false commit 73d9f7eef3d98c3920e144797cc1894c6b005a1e upstream. For nofail == false request, if __map_request failed, the caller does cleanup work, like releasing the relative pages. It doesn't make any sense to retry this request. Signed-off-by: Jianpeng Ma Reviewed-by: Sage Weil [bwh: Backported to 3.2: adjust indentation] Signed-off-by: Ben Hutchings Cc: Rui Xiang Signed-off-by: Greg Kroah-Hartman commit fe17c202b59ee36cff6ac87ca7ed31a9e9ea5f73 Author: Maxim Patlasov Date: Fri Aug 30 17:06:04 2013 +0400 fuse: hotfix truncate_pagecache() issue commit 06a7c3c2781409af95000c60a5df743fd4e2f8b4 upstream. The way how fuse calls truncate_pagecache() from fuse_change_attributes() is completely wrong. Because, w/o i_mutex held, we never sure whether 'oldsize' and 'attr->size' are valid by the time of execution of truncate_pagecache(inode, oldsize, attr->size). In fact, as soon as we released fc->lock in the middle of fuse_change_attributes(), we completely loose control of actions which may happen with given inode until we reach truncate_pagecache. The list of potentially dangerous actions includes mmap-ed reads and writes, ftruncate(2) and write(2) extending file size. The typical outcome of doing truncate_pagecache() with outdated arguments is data corruption from user point of view. This is (in some sense) acceptable in cases when the issue is triggered by a change of the file on the server (i.e. externally wrt fuse operation), but it is absolutely intolerable in scenarios when a single fuse client modifies a file without any external intervention. A real life case I discovered by fsx-linux looked like this: 1. Shrinking ftruncate(2) comes to fuse_do_setattr(). The latter sends FUSE_SETATTR to the server synchronously, but before getting fc->lock ... 2. fuse_dentry_revalidate() is asynchronously called. It sends FUSE_LOOKUP to the server synchronously, then calls fuse_change_attributes(). The latter updates i_size, releases fc->lock, but before comparing oldsize vs attr->size.. 3. fuse_do_setattr() from the first step proceeds by acquiring fc->lock and updating attributes and i_size, but now oldsize is equal to outarg.attr.size because i_size has just been updated (step 2). Hence, fuse_do_setattr() returns w/o calling truncate_pagecache(). 4. As soon as ftruncate(2) completes, the user extends file size by write(2) making a hole in the middle of file, then reads data from the hole either by read(2) or mmap-ed read. The user expects to get zero data from the hole, but gets stale data because truncate_pagecache() is not executed yet. The scenario above illustrates one side of the problem: not truncating the page cache even though we should. Another side corresponds to truncating page cache too late, when the state of inode changed significantly. Theoretically, the following is possible: 1. As in the previous scenario fuse_dentry_revalidate() discovered that i_size changed (due to our own fuse_do_setattr()) and is going to call truncate_pagecache() for some 'new_size' it believes valid right now. But by the time that particular truncate_pagecache() is called ... 2. fuse_do_setattr() returns (either having called truncate_pagecache() or not -- it doesn't matter). 3. The file is extended either by write(2) or ftruncate(2) or fallocate(2). 4. mmap-ed write makes a page in the extended region dirty. The result will be the lost of data user wrote on the fourth step. The patch is a hotfix resolving the issue in a simplistic way: let's skip dangerous i_size update and truncate_pagecache if an operation changing file size is in progress. This simplistic approach looks correct for the cases w/o external changes. And to handle them properly, more sophisticated and intrusive techniques (e.g. NFS-like one) would be required. I'd like to postpone it until the issue is well discussed on the mailing list(s). Changed in v2: - improved patch description to cover both sides of the issue. Signed-off-by: Maxim Patlasov Signed-off-by: Miklos Szeredi [bwh: Backported to 3.2: add the fuse_inode::state field which we didn't have] Signed-off-by: Ben Hutchings Cc: Rui Xiang Signed-off-by: Greg Kroah-Hartman commit f4a69e06dc99cd683fa001c49eb92da36fd76d1d Author: Miklos Szeredi Date: Tue Sep 3 14:28:38 2013 +0200 fuse: readdir: check for slash in names commit efeb9e60d48f7778fdcad4a0f3ad9ea9b19e5dfd upstream. Userspace can add names containing a slash character to the directory listing. Don't allow this as it could cause all sorts of trouble. Signed-off-by: Miklos Szeredi [bwh: Backported to 3.2: drop changes to parse_dirplusfile() which we don't have] Signed-off-by: Ben Hutchings Cc: Rui Xiang Signed-off-by: Greg Kroah-Hartman commit 831c87640d23ccb253a02e4901bd9a325b5e8c2d Author: Vyacheslav Dubeyko Date: Mon Sep 30 13:45:12 2013 -0700 nilfs2: fix issue with race condition of competition between segments for dirty blocks commit 7f42ec3941560f0902fe3671e36f2c20ffd3af0a upstream. Many NILFS2 users were reported about strange file system corruption (for example): NILFS: bad btree node (blocknr=185027): level = 0, flags = 0x0, nchildren = 768 NILFS error (device sda4): nilfs_bmap_last_key: broken bmap (inode number=11540) But such error messages are consequence of file system's issue that takes place more earlier. Fortunately, Jerome Poulin and Anton Eliasson were reported about another issue not so recently. These reports describe the issue with segctor thread's crash: BUG: unable to handle kernel paging request at 0000000000004c83 IP: nilfs_end_page_io+0x12/0xd0 [nilfs2] Call Trace: nilfs_segctor_do_construct+0xf25/0x1b20 [nilfs2] nilfs_segctor_construct+0x17b/0x290 [nilfs2] nilfs_segctor_thread+0x122/0x3b0 [nilfs2] kthread+0xc0/0xd0 ret_from_fork+0x7c/0xb0 These two issues have one reason. This reason can raise third issue too. Third issue results in hanging of segctor thread with eating of 100% CPU. REPRODUCING PATH: One of the possible way or the issue reproducing was described by Jermoe me Poulin : 1. init S to get to single user mode. 2. sysrq+E to make sure only my shell is running 3. start network-manager to get my wifi connection up 4. login as root and launch "screen" 5. cd /boot/log/nilfs which is a ext3 mount point and can log when NILFS dies. 6. lscp | xz -9e > lscp.txt.xz 7. mount my snapshot using mount -o cp=3360839,ro /dev/vgUbuntu/root /mnt/nilfs 8. start a screen to dump /proc/kmsg to text file since rsyslog is killed 9. start a screen and launch strace -f -o find-cat.log -t find /mnt/nilfs -type f -exec cat {} > /dev/null \; 10. start a screen and launch strace -f -o apt-get.log -t apt-get update 11. launch the last command again as it did not crash the first time 12. apt-get crashes 13. ps aux > ps-aux-crashed.log 13. sysrq+W 14. sysrq+E wait for everything to terminate 15. sysrq+SUSB Simplified way of the issue reproducing is starting kernel compilation task and "apt-get update" in parallel. REPRODUCIBILITY: The issue is reproduced not stable [60% - 80%]. It is very important to have proper environment for the issue reproducing. The critical conditions for successful reproducing: (1) It should have big modified file by mmap() way. (2) This file should have the count of dirty blocks are greater that several segments in size (for example, two or three) from time to time during processing. (3) It should be intensive background activity of files modification in another thread. INVESTIGATION: First of all, it is possible to see that the reason of crash is not valid page address: NILFS [nilfs_segctor_complete_write]:2100 bh->b_count 0, bh->b_blocknr 13895680, bh->b_size 13897727, bh->b_page 0000000000001a82 NILFS [nilfs_segctor_complete_write]:2101 segbuf->sb_segnum 6783 Moreover, value of b_page (0x1a82) is 6786. This value looks like segment number. And b_blocknr with b_size values look like block numbers. So, buffer_head's pointer points on not proper address value. Detailed investigation of the issue is discovered such picture: [-----------------------------SEGMENT 6783-------------------------------] NILFS [nilfs_segctor_do_construct]:2310 nilfs_segctor_begin_construction NILFS [nilfs_segctor_do_construct]:2321 nilfs_segctor_collect NILFS [nilfs_segctor_do_construct]:2336 nilfs_segctor_assign NILFS [nilfs_segctor_do_construct]:2367 nilfs_segctor_update_segusage NILFS [nilfs_segctor_do_construct]:2371 nilfs_segctor_prepare_write NILFS [nilfs_segctor_do_construct]:2376 nilfs_add_checksums_on_logs NILFS [nilfs_segctor_do_construct]:2381 nilfs_segctor_write NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111149024, segbuf->sb_segnum 6783 [-----------------------------SEGMENT 6784-------------------------------] NILFS [nilfs_segctor_do_construct]:2310 nilfs_segctor_begin_construction NILFS [nilfs_segctor_do_construct]:2321 nilfs_segctor_collect NILFS [nilfs_lookup_dirty_data_buffers]:782 bh->b_count 1, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824 NILFS [nilfs_lookup_dirty_data_buffers]:783 bh->b_assoc_buffers.next ffff8802174a6798, bh->b_assoc_buffers.prev ffff880221cffee8 NILFS [nilfs_segctor_do_construct]:2336 nilfs_segctor_assign NILFS [nilfs_segctor_do_construct]:2367 nilfs_segctor_update_segusage NILFS [nilfs_segctor_do_construct]:2371 nilfs_segctor_prepare_write NILFS [nilfs_segctor_do_construct]:2376 nilfs_add_checksums_on_logs NILFS [nilfs_segctor_do_construct]:2381 nilfs_segctor_write NILFS [nilfs_segbuf_submit_bh]:575 bh->b_count 1, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824 NILFS [nilfs_segbuf_submit_bh]:576 segbuf->sb_segnum 6784 NILFS [nilfs_segbuf_submit_bh]:577 bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880218bcdf50 NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111150080, segbuf->sb_segnum 6784, segbuf->sb_nbio 0 [----------] ditto NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111164416, segbuf->sb_segnum 6784, segbuf->sb_nbio 15 [-----------------------------SEGMENT 6785-------------------------------] NILFS [nilfs_segctor_do_construct]:2310 nilfs_segctor_begin_construction NILFS [nilfs_segctor_do_construct]:2321 nilfs_segctor_collect NILFS [nilfs_lookup_dirty_data_buffers]:782 bh->b_count 2, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824 NILFS [nilfs_lookup_dirty_data_buffers]:783 bh->b_assoc_buffers.next ffff880219277e80, bh->b_assoc_buffers.prev ffff880221cffc88 NILFS [nilfs_segctor_do_construct]:2367 nilfs_segctor_update_segusage NILFS [nilfs_segctor_do_construct]:2371 nilfs_segctor_prepare_write NILFS [nilfs_segctor_do_construct]:2376 nilfs_add_checksums_on_logs NILFS [nilfs_segctor_do_construct]:2381 nilfs_segctor_write NILFS [nilfs_segbuf_submit_bh]:575 bh->b_count 2, bh->b_page ffffea000709b000, page->index 0, i_ino 1033103, i_size 25165824 NILFS [nilfs_segbuf_submit_bh]:576 segbuf->sb_segnum 6785 NILFS [nilfs_segbuf_submit_bh]:577 bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880222cc7ee8 NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111165440, segbuf->sb_segnum 6785, segbuf->sb_nbio 0 [----------] ditto NILFS [nilfs_segbuf_submit_bio]:464 bio->bi_sector 111177728, segbuf->sb_segnum 6785, segbuf->sb_nbio 12 NILFS [nilfs_segctor_do_construct]:2399 nilfs_segctor_wait NILFS [nilfs_segbuf_wait]:676 segbuf->sb_segnum 6783 NILFS [nilfs_segbuf_wait]:676 segbuf->sb_segnum 6784 NILFS [nilfs_segbuf_wait]:676 segbuf->sb_segnum 6785 NILFS [nilfs_segctor_complete_write]:2100 bh->b_count 0, bh->b_blocknr 13895680, bh->b_size 13897727, bh->b_page 0000000000001a82 BUG: unable to handle kernel paging request at 0000000000001a82 IP: [] nilfs_end_page_io+0x12/0xd0 [nilfs2] Usually, for every segment we collect dirty files in list. Then, dirty blocks are gathered for every dirty file, prepared for write and submitted by means of nilfs_segbuf_submit_bh() call. Finally, it takes place complete write phase after calling nilfs_end_bio_write() on the block layer. Buffers/pages are marked as not dirty on final phase and processed files removed from the list of dirty files. It is possible to see that we had three prepare_write and submit_bio phases before segbuf_wait and complete_write phase. Moreover, segments compete between each other for dirty blocks because on every iteration of segments processing dirty buffer_heads are added in several lists of payload_buffers: [SEGMENT 6784]: bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880218bcdf50 [SEGMENT 6785]: bh->b_assoc_buffers.next ffff880218a0d5f8, bh->b_assoc_buffers.prev ffff880222cc7ee8 The next pointer is the same but prev pointer has changed. It means that buffer_head has next pointer from one list but prev pointer from another. Such modification can be made several times. And, finally, it can be resulted in various issues: (1) segctor hanging, (2) segctor crashing, (3) file system metadata corruption. FIX: This patch adds: (1) setting of BH_Async_Write flag in nilfs_segctor_prepare_write() for every proccessed dirty block; (2) checking of BH_Async_Write flag in nilfs_lookup_dirty_data_buffers() and nilfs_lookup_dirty_node_buffers(); (3) clearing of BH_Async_Write flag in nilfs_segctor_complete_write(), nilfs_abort_logs(), nilfs_forget_buffer(), nilfs_clear_dirty_page(). Reported-by: Jerome Poulin Reported-by: Anton Eliasson Cc: Paul Fertser Cc: ARAI Shun-ichi Cc: Piotr Szymaniak Cc: Juan Barry Manuel Canham Cc: Zahid Chowdhury Cc: Elmer Zhang Cc: Kenneth Langga Signed-off-by: Vyacheslav Dubeyko Acked-by: Ryusuke Konishi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds [bwh: Backported to 3.2: nilfs_clear_dirty_page() has not been separated from nilfs_clear_dirty_pages()] Signed-off-by: Ben Hutchings Cc: Rui Xiang Signed-off-by: Greg Kroah-Hartman commit 73a1182808116574b21e134b80a437cec9459642 Author: Jiri Olsa Date: Wed Sep 5 19:51:33 2012 +0200 perf tools: Fix cache event name generation commit 275ef3878f698941353780440fec6926107a320b upstream. If the event name is specified with all 3 components, the last one overwrites the previous one during the name composing within the parse_events_add_cache function. Fixing this by properly adjusting the string index. Reported-by: Joel Uckelman Signed-off-by: Jiri Olsa Cc: Corey Ashford Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Joel Uckelman Cc: Paul Mackerras Cc: Peter Zijlstra LPU-Reference: 20120905175133.GA18352@krava.brq.redhat.com [ committer note: Remove the newline fix, done already in 42e1fb7 ] Signed-off-by: Arnaldo Carvalho de Melo Cc: Vinson Lee Signed-off-by: Greg Kroah-Hartman commit f25c118ba56ee7d9430a9e8daa6eb5e783146a00 Author: Arnaldo Carvalho de Melo Date: Thu Sep 6 14:43:28 2012 -0300 perf tools: Remove extraneous newline when parsing hardware cache events commit 42e1fb776087713b5482cd7cf6cac998fbdd6544 upstream. Noticed while developing a 'perf test' entry to verify that perf_evsel__name works. Cc: David Ahern Cc: Frederic Weisbecker Cc: Jiri Olsa Cc: Mike Galbraith Cc: Namhyung Kim Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Stephane Eranian Link: http://lkml.kernel.org/n/tip-xz6zgh38mp3cjnd2udh38z8f@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo Cc: Vinson Lee Signed-off-by: Greg Kroah-Hartman commit 446327d6f1763bbe7f9793cc8dae04fca82c9d48 Author: Jiang Liu Date: Tue Jul 31 16:43:30 2012 -0700 mm/hotplug: correctly add new zone to all other nodes' zone lists commit 08dff7b7d629807dbb1f398c68dd9cd58dd657a1 upstream. When online_pages() is called to add new memory to an empty zone, it rebuilds all zone lists by calling build_all_zonelists(). But there's a bug which prevents the new zone to be added to other nodes' zone lists. online_pages() { build_all_zonelists() ..... node_set_state(zone_to_nid(zone), N_HIGH_MEMORY) } Here the node of the zone is put into N_HIGH_MEMORY state after calling build_all_zonelists(), but build_all_zonelists() only adds zones from nodes in N_HIGH_MEMORY state to the fallback zone lists. build_all_zonelists() ->__build_all_zonelists() ->build_zonelists() ->find_next_best_node() ->for_each_node_state(n, N_HIGH_MEMORY) So memory in the new zone will never be used by other nodes, and it may cause strange behavor when system is under memory pressure. So put node into N_HIGH_MEMORY state before calling build_all_zonelists(). Signed-off-by: Jianguo Wu Signed-off-by: Jiang Liu Cc: Mel Gorman Cc: Michal Hocko Cc: Minchan Kim Cc: Rusty Russell Cc: Yinghai Lu Cc: Tony Luck Cc: KAMEZAWA Hiroyuki Cc: KOSAKI Motohiro Cc: David Rientjes Cc: Keping Chen Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings Cc: Qiang Huang Signed-off-by: Greg Kroah-Hartman commit 71a2068595ac1e8c45e97dff47df0148c2acbbfc Author: Tejun Heo Date: Tue Jun 25 11:48:32 2013 -0700 cgroup: fix RCU accesses to task->cgroups commit 14611e51a57df10240817d8ada510842faf0ec51 upstream. task->cgroups is a RCU pointer pointing to struct css_set. A task switches to a different css_set on cgroup migration but a css_set doesn't change once created and its pointers to cgroup_subsys_states aren't RCU protected. task_subsys_state[_check]() is the macro to acquire css given a task and subsys_id pair. It RCU-dereferences task->cgroups->subsys[] not task->cgroups, so the RCU pointer task->cgroups ends up being dereferenced without read_barrier_depends() after it. It's broken. Fix it by introducing task_css_set[_check]() which does RCU-dereference on task->cgroups. task_subsys_state[_check]() is reimplemented to directly dereference ->subsys[] of the css_set returned from task_css_set[_check](). This removes some of sparse RCU warnings in cgroup. v2: Fixed unbalanced parenthsis and there's no need to use rcu_dereference_raw() when !CONFIG_PROVE_RCU. Both spotted by Li. Signed-off-by: Tejun Heo Reported-by: Fengguang Wu Acked-by: Li Zefan [bwh: Backported to 3.2: - Adjust context - Remove CONFIG_PROVE_RCU condition - s/lockdep_is_held(&cgroup_mutex)/cgroup_lock_is_held()/] Signed-off-by: Ben Hutchings Cc: Qiang Huang Signed-off-by: Greg Kroah-Hartman commit 2f590c47013d610800770be179b8fbfd7c6bb1f8 Author: Kees Cook Date: Mon Feb 25 21:32:25 2013 +0000 proc connector: reject unprivileged listener bumps commit e70ab977991964a5a7ad1182799451d067e62669 upstream. While PROC_CN_MCAST_LISTEN/IGNORE is entirely advisory, it was possible for an unprivileged user to turn off notifications for all listeners by sending PROC_CN_MCAST_IGNORE. Instead, require the same privileges as required for a multicast bind. Signed-off-by: Kees Cook Cc: Evgeniy Polyakov Cc: Matt Helsley Acked-by: Evgeniy Polyakov Acked-by: Matt Helsley Signed-off-by: David S. Miller [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings Cc: Qiang Huang Signed-off-by: Greg Kroah-Hartman commit f7741e3b3f9d3827e7bc75118118f102553f7564 Author: Greg Edwards Date: Mon Nov 4 09:08:12 2013 -0700 KVM: IOMMU: hva align mapping page size commit 27ef63c7e97d1e5dddd85051c03f8d44cc887f34 upstream. When determining the page size we could use to map with the IOMMU, the page size should also be aligned with the hva, not just the gfn. The gfn may not reflect the real alignment within the hugetlbfs file. Most of the time, this works fine. However, if the hugetlbfs file is backed by non-contiguous huge pages, a multi-huge page memslot starts at an unaligned offset within the hugetlbfs file, and the gfn is aligned with respect to the huge page size, kvm_host_page_size() will return the huge page size and we will use that to map with the IOMMU. When we later unpin that same memslot, the IOMMU returns the unmap size as the huge page size, and we happily unpin that many pfns in monotonically increasing order, not realizing we are spanning non-contiguous huge pages and partially unpin the wrong huge page. Ensure the IOMMU mapping page size is aligned with the hva corresponding to the gfn, which does reflect the alignment within the hugetlbfs file. Reviewed-by: Marcelo Tosatti Signed-off-by: Greg Edwards Signed-off-by: Gleb Natapov [bwh: Backported to 3.2: s/__gfn_to_hva_memslot/gfn_to_hva_memslot/] Signed-off-by: Ben Hutchings Cc: Qiang Huang Signed-off-by: Greg Kroah-Hartman commit 67bc20f722e2d3b77566f3eac9e9f6ce1b890d23 Author: Alexander Graf Date: Thu Jan 17 13:50:25 2013 +0100 KVM: PPC: Emulate dcbf commit d3286144c92ec876da9e30320afa875699b7e0f1 upstream. Guests can trigger MMIO exits using dcbf. Since we don't emulate cache incoherent MMIO, just do nothing and move on. Reported-by: Ben Collins Signed-off-by: Alexander Graf Tested-by: Ben Collins [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings Cc: Qiang Huang Signed-off-by: Greg Kroah-Hartman commit 115f50635c5addb145f20e090fa6345ee31fb2b8 Author: Christian Borntraeger Date: Tue Oct 2 16:25:38 2012 +0200 s390/kvm: dont announce RRBM support commit 87cac8f879a5ecd7109dbe688087e8810b3364eb upstream. Newer kernels (linux-next with the transparent huge page patches) use rrbm if the feature is announced via feature bit 66. RRBM will cause intercepts, so KVM does not handle it right now, causing an illegal instruction in the guest. The easy solution is to disable the feature bit for the guest. This fixes bugs like: Kernel BUG at 0000000000124c2a [verbose debug info unavailable] illegal operation: 0001 [#1] SMP Modules linked in: virtio_balloon virtio_net ipv6 autofs4 CPU: 0 Not tainted 3.5.4 #1 Process fmempig (pid: 659, task: 000000007b712fd0, ksp: 000000007bed3670) Krnl PSW : 0704d00180000000 0000000000124c2a (pmdp_clear_flush_young+0x5e/0x80) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 EA:3 00000000003cc000 0000000000000004 0000000000000000 0000000079800000 0000000000040000 0000000000000000 000000007bed3918 000000007cf40000 0000000000000001 000003fff7f00000 000003d281a94000 000000007bed383c 000000007bed3918 00000000005ecbf8 00000000002314a6 000000007bed36e0 Krnl Code:>0000000000124c2a: b9810025 ogr %r2,%r5 0000000000124c2e: 41343000 la %r3,0(%r4,%r3) 0000000000124c32: a716fffa brct %r1,124c26 0000000000124c36: b9010022 lngr %r2,%r2 0000000000124c3a: e3d0f0800004 lg %r13,128(%r15) 0000000000124c40: eb22003f000c srlg %r2,%r2,63 [ 2150.713198] Call Trace: [ 2150.713223] ([<00000000002312c4>] page_referenced_one+0x6c/0x27c) [ 2150.713749] [<0000000000233812>] page_referenced+0x32a/0x410 [...] CC: Alex Graf Signed-off-by: Martin Schwidefsky Signed-off-by: Christian Borntraeger Signed-off-by: Marcelo Tosatti Signed-off-by: Ben Hutchings Cc: Qiang Huang Signed-off-by: Greg Kroah-Hartman commit bf5597c693051a29efb77a6853daa869cd9667ab Author: Dominik Dingel Date: Fri Jul 26 15:04:00 2013 +0200 KVM: s390: move kvm_guest_enter,exit closer to sie commit 2b29a9fdcb92bfc6b6f4c412d71505869de61a56 upstream. Any uaccess between guest_enter and guest_exit could trigger a page fault, the page fault handler would handle it as a guest fault and translate a user address as guest address. Signed-off-by: Dominik Dingel Signed-off-by: Christian Borntraeger CC: stable@vger.kernel.org Signed-off-by: Paolo Bonzini [hq: Backported to 3.4: adjust context] Signed-off-by: Qiang Huang Signed-off-by: Greg Kroah-Hartman commit 30ec268be37bdb5b1614cce40af8083f1a7c27f3 Author: Tejun Heo Date: Tue Oct 16 15:03:14 2012 -0700 cgroup: cgroup_subsys->fork() should be called after the task is added to css_set commit 5edee61edeaaebafe584f8fb7074c1ef4658596b upstream. cgroup core has a bug which violates a basic rule about event notifications - when a new entity needs to be added, you add that to the notification list first and then make the new entity conform to the current state. If done in the reverse order, an event happening inbetween will be lost. cgroup_subsys->fork() is invoked way before the new task is added to the css_set. Currently, cgroup_freezer is the only user of ->fork() and uses it to make new tasks conform to the current state of the freezer. If FROZEN state is requested while fork is in progress between cgroup_fork_callbacks() and cgroup_post_fork(), the child could escape freezing - the cgroup isn't frozen when ->fork() is called and the freezer couldn't see the new task on the css_set. This patch moves cgroup_subsys->fork() invocation to cgroup_post_fork() after the new task is added to the css_set. cgroup_fork_callbacks() is removed. Because now a task may be migrated during cgroup_subsys->fork(), freezer_fork() is updated so that it adheres to the usual RCU locking and the rather pointless comment on why locking can be different there is removed (if it doesn't make anything simpler, why even bother?). Signed-off-by: Tejun Heo Cc: Oleg Nesterov Cc: Rafael J. Wysocki [hq: Backported to 3.4: - Adjust context - Iterate over first CGROUP_BUILTIN_SUBSYS_COUNT elements of subsys] Signed-off-by: Qiang Huang Signed-off-by: Greg Kroah-Hartman commit f47929fd5093c4b5c134ff2b2811ae327102bafd Author: Johannes Weiner Date: Thu Nov 29 13:54:23 2012 -0800 mm: vmscan: fix endless loop in kswapd balancing commit 60cefed485a02bd99b6299dad70666fe49245da7 upstream. Kswapd does not in all places have the same criteria for a balanced zone. Zones are only being reclaimed when their high watermark is breached, but compaction checks loop over the zonelist again when the zone does not meet the low watermark plus two times the size of the allocation. This gets kswapd stuck in an endless loop over a small zone, like the DMA zone, where the high watermark is smaller than the compaction requirement. Add a function, zone_balanced(), that checks the watermark, and, for higher order allocations, if compaction has enough free memory. Then use it uniformly to check for balanced zones. This makes sure that when the compaction watermark is not met, at least reclaim happens and progress is made - or the zone is declared unreclaimable at some point and skipped entirely. Signed-off-by: Johannes Weiner Reported-by: George Spelvin Reported-by: Johannes Hirte Reported-by: Tomas Racek Tested-by: Johannes Hirte Reviewed-by: Rik van Riel Cc: Mel Gorman Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds [hq: Backported to 3.4: adjust context] Signed-off-by: Qiang Huang Signed-off-by: Greg Kroah-Hartman commit 8371cffe8a9c704e3d240d5cb73142b6bd62245b Author: Hannes Reinecke Date: Wed Feb 26 10:07:04 2014 +0100 dm mpath: fix stalls when handling invalid ioctls commit a1989b330093578ea5470bea0a00f940c444c466 upstream. An invalid ioctl will never be valid, irrespective of whether multipath has active paths or not. So for invalid ioctls we do not have to wait for multipath to activate any paths, but can rather return an error code immediately. This fix resolves numerous instances of: udevd[]: worker [] unexpectedly returned with status 0x0100 that have been seen during testing. Signed-off-by: Hannes Reinecke Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 8d8e4839b5457e20e371a5f7485ce7855c7870c9 Author: Linus Walleij Date: Thu Feb 13 10:39:01 2014 +0100 dma: ste_dma40: don't dereference free:d descriptor commit e9baa9d9d520fb0e24cca671e430689de2d4a4b2 upstream. It appears that in the DMA40 driver the DMA tasklet will very often dereference memory for a descriptor just free:d from the DMA40 slab. Nothing happens because no other part of the driver has yet had a chance to claim this memory, but it's really nasty to dereference free:d memory, so let's check the flag before the descriptor is free and store it in a bool variable. Reported-by: Dan Carpenter Signed-off-by: Linus Walleij Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit 54499e71eccdeab9295af9150e3ffb99223447a8 Author: Jan Kara Date: Thu Feb 20 17:02:27 2014 +0100 quota: Fix race between dqput() and dquot_scan_active() commit 1362f4ea20fa63688ba6026e586d9746ff13a846 upstream. Currently last dqput() can race with dquot_scan_active() causing it to call callback for an already deactivated dquot. The race is as follows: CPU1 CPU2 dqput() spin_lock(&dq_list_lock); if (atomic_read(&dquot->dq_count) > 1) { - not taken if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) { spin_unlock(&dq_list_lock); ->release_dquot(dquot); if (atomic_read(&dquot->dq_count) > 1) - not taken dquot_scan_active() spin_lock(&dq_list_lock); if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) - not taken atomic_inc(&dquot->dq_count); spin_unlock(&dq_list_lock); - proceeds to release dquot ret = fn(dquot, priv); - called for inactive dquot Fix the problem by making sure possible ->release_dquot() is finished by the time we call the callback and new calls to it will notice reference dquot_scan_active() has taken and bail out. Signed-off-by: Jan Kara Signed-off-by: Greg Kroah-Hartman commit 186ef2385c50ee6b2232f2ab8edb354ca71332bf Author: Eric Paris Date: Thu Feb 20 10:56:45 2014 -0500 SELinux: bigendian problems with filename trans rules commit 9085a6422900092886da8c404e1c5340c4ff1cbf upstream. When writing policy via /sys/fs/selinux/policy I wrote the type and class of filename trans rules in CPU endian instead of little endian. On x86_64 this works just fine, but it means that on big endian arch's like ppc64 and s390 userspace reads the policy and converts it from le32_to_cpu. So the values are all screwed up. Write the values in le format like it should have been to start. Signed-off-by: Eric Paris Acked-by: Stephen Smalley Signed-off-by: Paul Moore Signed-off-by: Greg Kroah-Hartman commit f80747a43fc2613b9f5e1ded16f50ef28815652e Author: Peter Zijlstra Date: Mon Feb 24 12:06:12 2014 +0100 perf: Fix hotplug splat commit e3703f8cdfcf39c25c4338c3ad8e68891cca3731 upstream. Drew Richardson reported that he could make the kernel go *boom* when hotplugging while having perf events active. It turned out that when you have a group event, the code in __perf_event_exit_context() fails to remove the group siblings from the context. We then proceed with destroying and freeing the event, and when you re-plug the CPU and try and add another event to that CPU, things go *boom* because you've still got dead entries there. Reported-by: Drew Richardson Signed-off-by: Peter Zijlstra Cc: Will Deacon Link: http://lkml.kernel.org/n/tip-k6v5wundvusvcseqj1si0oz0@git.kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 23f0913c2b3816b11f32dcf774758b3f63b81f60 Author: Lai Jiangshan Date: Sat Feb 15 22:02:28 2014 +0800 workqueue: ensure @task is valid across kthread_stop() commit 5bdfff96c69a4d5ab9c49e60abf9e070ecd2acbb upstream. When a kworker should die, the kworkre is notified through WORKER_DIE flag instead of kthread_should_stop(). This, IIRC, is primarily to keep the test synchronized inside worker_pool lock. WORKER_DIE is first set while holding pool->lock, the lock is dropped and kthread_stop() is called. Unfortunately, this means that there's a slight chance that the target kworker may see WORKER_DIE before kthread_stop() finishes and exits and frees the target task before or during kthread_stop(). Fix it by pinning the target task before setting WORKER_DIE and putting it after kthread_stop() is done. tj: Improved patch description and comment. Moved pinning above WORKER_DIE for better signify what it's protecting. Signed-off-by: Lai Jiangshan Signed-off-by: Tejun Heo Signed-off-by: Greg Kroah-Hartman commit 8d36299431351e5c3a893d9c1eae7579ebc57ad5 Author: Guenter Roeck Date: Sat Feb 15 17:54:06 2014 -0800 hwmon: (max1668) Fix writing the minimum temperature commit 500a91571f0a5d0d3242d83802ea2fd1faccc66e upstream. When trying to set the minimum temperature, the driver was erroneously writing the maximum temperature into the chip. Signed-off-by: Guenter Roeck Reviewed-by: Jean Delvare Signed-off-by: Greg Kroah-Hartman commit 564a4d6c3205fc001579c00a0ce19b665f860487 Author: Joerg Dorchain Date: Fri Feb 21 20:29:33 2014 +0100 USB: ftdi_sio: add Cressi Leonardo PID commit 6dbd46c849e071e6afc1e0cad489b0175bca9318 upstream. Hello, the following patch adds an entry for the PID of a Cressi Leonardo diving computer interface to kernel 3.13.0. It is detected as FT232RL. Works with subsurface. Signed-off-by: Joerg Dorchain Signed-off-by: Greg Kroah-Hartman commit f922754e32d72b890ad471b5efd76b38c25f7d17 Author: Aleksander Morgado Date: Wed Feb 12 16:04:45 2014 +0100 USB: serial: option: blacklist interface 4 for Cinterion PHS8 and PXS8 commit 12df84d4a80278a5b1abfec3206795291da52fc9 upstream. This interface is to be handled by the qmi_wwan driver. CC: Hans-Christoph Schemmel CC: Christian Schmiedl CC: Nicolaus Colberg CC: David McCullough Signed-off-by: Aleksander Morgado Signed-off-by: Greg Kroah-Hartman commit 1d0230273dea6597819c4fe9f64f5bb891241578 Author: Lan Tianyu Date: Wed Feb 26 21:03:05 2014 +0800 ACPI / processor: Rework processor throttling with work_on_cpu() commit f3ca4164529b875374c410193bbbac0ee960895f upstream. acpi_processor_set_throttling() uses set_cpus_allowed_ptr() to make sure that the (struct acpi_processor)->acpi_processor_set_throttling() callback will run on the right CPU. However, the function may be called from a worker thread already bound to a different CPU in which case that won't work. Make acpi_processor_set_throttling() use work_on_cpu() as appropriate instead of abusing set_cpus_allowed_ptr(). Reported-and-tested-by: Jiri Olsa Signed-off-by: Lan Tianyu [rjw: Changelog] Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit b0b0c264549a424a453bf21ba41941db83f0899b Author: Hans de Goede Date: Thu Feb 13 16:32:51 2014 +0100 ACPI / video: Filter the _BCL table for duplicate brightness values commit bd8ba20597f0cfef3ef65c3fd2aa92ab23d4c8e1 upstream. Some devices have duplicate entries in there brightness levels table, ie on my Dell Latitude E6430 the table looks like this: [ 3.686060] acpi backlight index 0, val 80 [ 3.686095] acpi backlight index 1, val 50 [ 3.686122] acpi backlight index 2, val 5 [ 3.686147] acpi backlight index 3, val 5 [ 3.686172] acpi backlight index 4, val 5 [ 3.686197] acpi backlight index 5, val 5 [ 3.686223] acpi backlight index 6, val 5 [ 3.686248] acpi backlight index 7, val 5 [ 3.686273] acpi backlight index 8, val 6 [ 3.686332] acpi backlight index 9, val 7 [ 3.686356] acpi backlight index 10, val 8 [ 3.686380] acpi backlight index 11, val 9 etc. Notice that brightness values 0-5 are all mapped to 5. This means that if userspace writes any value between 0 and 5 to the brightness sysfs attribute and then reads it, it will always return 0, which is somewhat unexpected. This is a problem for ie gnome-settings-daemon, which uses read-modify-write logic when the users presses the brightness up or down keys. This is done this way to take brightness changes from other sources into account. On this specific laptop what happens once the brightness has been set to 0, is that gsd reads 0, adds 5, writes 5, and on the next brightness up key press again reads 0, so things get stuck at the lowest brightness setting. Filtering out the duplicate table entries, makes any write to brightness read back as the written value as one would expect, fixing this. Signed-off-by: Hans de Goede Reviewed-by: Aaron Lu Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 9f84cff07159c0b21b3725a216c3f3d615f93b7c Author: Jean Delvare Date: Mon Feb 24 09:39:27 2014 +0100 i7core_edac: Fix PCI device reference count commit c0f5eeed0f4cef4f05b74883a7160e7edde58b6a upstream. The reference count changes done by pci_get_device can be a little misleading when the usage diverges from the most common scheme. The reference count of the device passed as the last parameter is always decreased, even if the function returns no new device. So if we are going to try alternative device IDs, we must manually increment the device reference count before each retry. If we don't, we end up decreasing the reference count, and after a few modprobe/rmmod cycles the PCI devices will vanish. In other words and as Alan put it: without this fix the EDAC code corrupts the PCI device list. This fixes kernel bug #50491: https://bugzilla.kernel.org/show_bug.cgi?id=50491 Signed-off-by: Jean Delvare Link: http://lkml.kernel.org/r/20140224093927.7659dd9d@endymion.delvare Reviewed-by: Alan Cox Cc: Mauro Carvalho Chehab Cc: Doug Thompson Signed-off-by: Borislav Petkov Signed-off-by: Greg Kroah-Hartman commit 31d183aa91d4220afc6429e8b9578aa436a3597d Author: Tejun Heo Date: Mon Feb 3 10:42:07 2014 -0500 sata_sil: apply MOD15WRITE quirk to TOSHIBA MK2561GSYN commit 9f9c47f00ce99329b1a82e2ac4f70f0fe3db549c upstream. It's a bit odd to see a newer device showing mod15write; however, the reported behavior is highly consistent and other factors which could contribute seem to have been verified well enough. Also, both sata_sil itself and the drive are fairly outdated at this point making the risk of this change fairly low. It is possible, probably likely, that other drive models in the same family have the same problem; however, for now, let's just add the specific model which was tested. Signed-off-by: Tejun Heo Reported-by: matson References: http://lkml.kernel.org/g/201401211912.s0LJCk7F015058@rs103.luxsci.com Signed-off-by: Greg Kroah-Hartman commit 13e60b225fc1f8394447363dc3ce02899badc99e Author: Denis V. Lunev Date: Thu Jan 30 15:20:30 2014 +0400 ata: enable quirk from jmicron JMB350 for JMB394 commit efb9e0f4f43780f0ae0c6428d66bd03e805c7539 upstream. Without the patch the kernel generates the following error. ata11.15: SATA link up 1.5 Gbps (SStatus 113 SControl 310) ata11.15: Port Multiplier vendor mismatch '0x197b' != '0x123' ata11.15: PMP revalidation failed (errno=-19) ata11.15: failed to recover PMP after 5 tries, giving up This patch helps to bypass this error and the device becomes functional. Signed-off-by: Denis V. Lunev Signed-off-by: Tejun Heo Cc: Signed-off-by: Greg Kroah-Hartman commit 494dac8e333e873cb5266ede3be957680d6194ad Author: Peter Zijlstra Date: Fri Feb 21 16:03:12 2014 +0100 perf/x86: Fix event scheduling commit 26e61e8939b1fe8729572dabe9a9e97d930dd4f6 upstream. Vince "Super Tester" Weaver reported a new round of syscall fuzzing (Trinity) failures, with perf WARN_ON()s triggering. He also provided traces of the failures. This is I think the relevant bit: > pec_1076_warn-2804 [000] d... 147.926153: x86_pmu_disable: x86_pmu_disable > pec_1076_warn-2804 [000] d... 147.926153: x86_pmu_state: Events: { > pec_1076_warn-2804 [000] d... 147.926156: x86_pmu_state: 0: state: .R config: ffffffffffffffff ( (null)) > pec_1076_warn-2804 [000] d... 147.926158: x86_pmu_state: 33: state: AR config: 0 (ffff88011ac99800) > pec_1076_warn-2804 [000] d... 147.926159: x86_pmu_state: } > pec_1076_warn-2804 [000] d... 147.926160: x86_pmu_state: n_events: 1, n_added: 0, n_txn: 1 > pec_1076_warn-2804 [000] d... 147.926161: x86_pmu_state: Assignment: { > pec_1076_warn-2804 [000] d... 147.926162: x86_pmu_state: 0->33 tag: 1 config: 0 (ffff88011ac99800) > pec_1076_warn-2804 [000] d... 147.926163: x86_pmu_state: } > pec_1076_warn-2804 [000] d... 147.926166: collect_events: Adding event: 1 (ffff880119ec8800) So we add the insn:p event (fd[23]). At this point we should have: n_events = 2, n_added = 1, n_txn = 1 > pec_1076_warn-2804 [000] d... 147.926170: collect_events: Adding event: 0 (ffff8800c9e01800) > pec_1076_warn-2804 [000] d... 147.926172: collect_events: Adding event: 4 (ffff8800cbab2c00) We try and add the {BP,cycles,br_insn} group (fd[3], fd[4], fd[15]). These events are 0:cycles and 4:br_insn, the BP event isn't x86_pmu so that's not visible. group_sched_in() pmu->start_txn() /* nop - BP pmu */ event_sched_in() event->pmu->add() So here we should end up with: 0: n_events = 3, n_added = 2, n_txn = 2 4: n_events = 4, n_added = 3, n_txn = 3 But seeing the below state on x86_pmu_enable(), the must have failed, because the 0 and 4 events aren't there anymore. Looking at group_sched_in(), since the BP is the leader, its event_sched_in() must have succeeded, for otherwise we would not have seen the sibling adds. But since neither 0 or 4 are in the below state; their event_sched_in() must have failed; but I don't see why, the complete state: 0,0,1:p,4 fits perfectly fine on a core2. However, since we try and schedule 4 it means the 0 event must have succeeded! Therefore the 4 event must have failed, its failure will have put group_sched_in() into the fail path, which will call: event_sched_out() event->pmu->del() on 0 and the BP event. Now x86_pmu_del() will reduce n_events; but it will not reduce n_added; giving what we see below: n_event = 2, n_added = 2, n_txn = 2 > pec_1076_warn-2804 [000] d... 147.926177: x86_pmu_enable: x86_pmu_enable > pec_1076_warn-2804 [000] d... 147.926177: x86_pmu_state: Events: { > pec_1076_warn-2804 [000] d... 147.926179: x86_pmu_state: 0: state: .R config: ffffffffffffffff ( (null)) > pec_1076_warn-2804 [000] d... 147.926181: x86_pmu_state: 33: state: AR config: 0 (ffff88011ac99800) > pec_1076_warn-2804 [000] d... 147.926182: x86_pmu_state: } > pec_1076_warn-2804 [000] d... 147.926184: x86_pmu_state: n_events: 2, n_added: 2, n_txn: 2 > pec_1076_warn-2804 [000] d... 147.926184: x86_pmu_state: Assignment: { > pec_1076_warn-2804 [000] d... 147.926186: x86_pmu_state: 0->33 tag: 1 config: 0 (ffff88011ac99800) > pec_1076_warn-2804 [000] d... 147.926188: x86_pmu_state: 1->0 tag: 1 config: 1 (ffff880119ec8800) > pec_1076_warn-2804 [000] d... 147.926188: x86_pmu_state: } > pec_1076_warn-2804 [000] d... 147.926190: x86_pmu_enable: S0: hwc->idx: 33, hwc->last_cpu: 0, hwc->last_tag: 1 hwc->state: 0 So the problem is that x86_pmu_del(), when called from a group_sched_in() that fails (for whatever reason), and without x86_pmu TXN support (because the leader is !x86_pmu), will corrupt the n_added state. Reported-and-Tested-by: Vince Weaver Signed-off-by: Peter Zijlstra Cc: Paul Mackerras Cc: Steven Rostedt Cc: Stephane Eranian Cc: Dave Jones Link: http://lkml.kernel.org/r/20140221150312.GF3104@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit 8dd74be4a175ff8f5818a14a9c13aa027b667cf6 Author: Laurent Dufour Date: Mon Feb 24 17:30:55 2014 +0100 powerpc/crashdump : Fix page frame number check in copy_oldmem_page commit f5295bd8ea8a65dc5eac608b151386314cb978f1 upstream. In copy_oldmem_page, the current check using max_pfn and min_low_pfn to decide if the page is backed or not, is not valid when the memory layout is not continuous. This happens when running as a QEMU/KVM guest, where RTAS is mapped higher in the memory. In that case max_pfn points to the end of RTAS, and a hole between the end of the kdump kernel and RTAS is not backed by PTEs. As a consequence, the kdump kernel is crashing in copy_oldmem_page when accessing in a direct way the pages in that hole. This fix relies on the memblock's service memblock_is_region_memory to check if the read page is part or not of the directly accessible memory. Signed-off-by: Laurent Dufour Tested-by: Mahesh Salgaonkar Signed-off-by: Benjamin Herrenschmidt Signed-off-by: Greg Kroah-Hartman commit 5dd00a93ea27897deb79e23797f7ced6910e7979 Author: Trond Myklebust Date: Tue Feb 11 09:15:54 2014 -0500 SUNRPC: Fix races in xs_nospace() commit 06ea0bfe6e6043cb56a78935a19f6f8ebc636226 upstream. When a send failure occurs due to the socket being out of buffer space, we call xs_nospace() in order to have the RPC task wait until the socket has drained enough to make it worth while trying again. The current patch fixes a race in which the socket is drained before we get round to setting up the machinery in xs_nospace(), and which is reported to cause hangs. Link: http://lkml.kernel.org/r/20140210170315.33dfc621@notabene.brown Fixes: a9a6b52ee1ba (SUNRPC: Don't start the retransmission timer...) Reported-by: Neil Brown Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman commit 52b418e832d133f5fa657efae0ca0fa87242fe2c Author: Lars-Peter Clausen Date: Sat Feb 22 18:30:13 2014 +0100 ASoC: wm8958-dsp: Fix firmware block loading commit 548da08fc1e245faf9b0d7c41ecd8e07984fc332 upstream. The codec->control_data contains a pointer to the device's regmap struct. But wm8994_bulk_write() expects a pointer to the parent wm8998 device. The issue was introduced in commit d9a7666f ("ASoC: Remove ASoC-specific WM8994 I/O code"). Fixes: d9a7666f ("ASoC: Remove ASoC-specific WM8994 I/O code") Signed-off-by: Lars-Peter Clausen Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 71a3a9e7d1b783ab7a61a82361f3849d4592ece1 Author: Takashi Iwai Date: Tue Feb 18 09:24:12 2014 +0100 ASoC: sta32x: Fix array access overflow commit 025c3fa9256d4c54506b7a29dc3befac54f5c68d upstream. Preset EQ enum of sta32x codec driver declares too many number of items and it may lead to the access over the actual array size. Use SOC_ENUM_SINGLE_DECL() helper and it's automatically fixed. Signed-off-by: Takashi Iwai Acked-by: Liam Girdwood Acked-by: Lars-Peter Clausen Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 7cc835745183631e6fd0ec173e4942942a508191 Author: Takashi Iwai Date: Thu Feb 27 07:41:32 2014 +0100 ASoC: sta32x: Fix wrong enum for limiter2 release rate commit b3619b288b621e63f66908045f48495869a996a6 upstream. There is a typo in the Limiter2 Release Rate control, a wrong enum for Limiter1 is assigned. It must point to Limiter2. Spotted by a compile warning: In file included from sound/soc/codecs/sta32x.c:34:0: sound/soc/codecs/sta32x.c:223:29: warning: ‘sta32x_limiter2_release_rate_enum’ defined but not used [-Wunused-variable] static SOC_ENUM_SINGLE_DECL(sta32x_limiter2_release_rate_enum, ^ include/sound/soc.h:275:18: note: in definition of macro ‘SOC_ENUM_DOUBLE_DECL’ struct soc_enum name = SOC_ENUM_DOUBLE(xreg, xshift_l, xshift_r, \ ^ sound/soc/codecs/sta32x.c:223:8: note: in expansion of macro ‘SOC_ENUM_SINGLE_DECL’ static SOC_ENUM_SINGLE_DECL(sta32x_limiter2_release_rate_enum, ^ Signed-off-by: Takashi Iwai Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit ff177edb2ec3e16428d0fdf834706840f1111d1f Author: Takashi Iwai Date: Tue Feb 18 09:37:30 2014 +0100 ASoC: wm8770: Fix wrong number of enum items commit 7a6c0a58dc824523966f212c76322d47c5b0e6fe upstream. wm8770 codec driver defines ain_enum with a wrong number of items. Use SOC_ENUM_DOUBLE_DECL() macro and it's automatically fixed. Signed-off-by: Takashi Iwai Acked-by: Liam Girdwood Acked-by: Charles Keepax Acked-by: Lars-Peter Clausen Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 969d5499ee64ec2ece7b6c9bb553011a6cd20a00 Author: Clemens Ladisch Date: Sun Feb 16 17:11:10 2014 +0100 ALSA: usb-audio: work around KEF X300A firmware bug commit 624aef494f86ed0c58056361c06347ad62b26806 upstream. When the driver tries to access Function Unit 10, the KEF X300A speakers' firmware apparently locks up, making even PCM streaming impossible. Work around this by ignoring this FU. Signed-off-by: Clemens Ladisch Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 29a3cd46644ec8098dbe1c12f89643b5c11831a9 Author: Florian Westphal Date: Sat Feb 22 10:33:26 2014 +0100 net: ip, ipv6: handle gso skbs in forwarding path commit fe6cc55f3a9a053482a76f5a6b2257cee51b4663 upstream. [ use zero netdev_feature mask to avoid backport of netif_skb_dev_features function ] Marcelo Ricardo Leitner reported problems when the forwarding link path has a lower mtu than the incoming one if the inbound interface supports GRO. Given: Host R1 R2 Host sends tcp stream which is routed via R1 and R2. R1 performs GRO. In this case, the kernel will fail to send ICMP fragmentation needed messages (or pkt too big for ipv6), as GSO packets currently bypass dstmtu checks in forward path. Instead, Linux tries to send out packets exceeding the mtu. When locking route MTU on Host (i.e., no ipv4 DF bit set), R1 does not fragment the packets when forwarding, and again tries to send out packets exceeding R1-R2 link mtu. This alters the forwarding dstmtu checks to take the individual gso segment lengths into account. For ipv6, we send out pkt too big error for gso if the individual segments are too big. For ipv4, we either send icmp fragmentation needed, or, if the DF bit is not set, perform software segmentation and let the output path create fragments when the packet is leaving the machine. It is not 100% correct as the error message will contain the headers of the GRO skb instead of the original/segmented one, but it seems to work fine in my (limited) tests. Eric Dumazet suggested to simply shrink mss via ->gso_size to avoid sofware segmentation. However it turns out that skb_segment() assumes skb nr_frags is related to mss size so we would BUG there. I don't want to mess with it considering Herbert and Eric disagree on what the correct behavior should be. Hannes Frederic Sowa notes that when we would shrink gso_size skb_segment would then also need to deal with the case where SKB_MAX_FRAGS would be exceeded. This uses sofware segmentation in the forward path when we hit ipv4 non-DF packets and the outgoing link mtu is too small. Its not perfect, but given the lack of bug reports wrt. GRO fwd being broken this is a rare case anyway. Also its not like this could not be improved later once the dust settles. Acked-by: Herbert Xu Reported-by: Marcelo Ricardo Leitner Signed-off-by: Florian Westphal Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 32e66c065179e598c0e692f4d57ded48715e0d99 Author: Florian Westphal Date: Sat Feb 22 10:33:25 2014 +0100 net: add and use skb_gso_transport_seglen() commit de960aa9ab4decc3304959f69533eef64d05d8e8 upstream. [ no skb_gso_seglen helper in 3.4, leave tbf alone ] This moves part of Eric Dumazets skb_gso_seglen helper from tbf sched to skbuff core so it may be reused by upcoming ip forwarding path patch. Signed-off-by: Florian Westphal Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 19e48381ab5da05c5fb398336a90a52a2b623cdc Author: Daniel Borkmann Date: Mon Feb 17 12:11:11 2014 +0100 net: sctp: fix sctp_connectx abi for ia32 emulation/compat mode [ Upstream commit ffd5939381c609056b33b7585fb05a77b4c695f3 ] SCTP's sctp_connectx() abi breaks for 64bit kernels compiled with 32bit emulation (e.g. ia32 emulation or x86_x32). Due to internal usage of 'struct sctp_getaddrs_old' which includes a struct sockaddr pointer, sizeof(param) check will always fail in kernel as the structure in 64bit kernel space is 4bytes larger than for user binaries compiled in 32bit mode. Thus, applications making use of sctp_connectx() won't be able to run under such circumstances. Introduce a compat interface in the kernel to deal with such situations by using a 'struct compat_sctp_getaddrs_old' structure where user data is copied into it, and then sucessively transformed into a 'struct sctp_getaddrs_old' structure with the help of compat_ptr(). That fixes sctp_connectx() abi without any changes needed in user space, and lets the SCTP test suite pass when compiled in 32bit and run on 64bit kernels. Fixes: f9c67811ebc0 ("sctp: Fix regression introduced by new sctp_connectx api") Signed-off-by: Daniel Borkmann Acked-by: Neil Horman Acked-by: Vlad Yasevich Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 4d4d28ccf594979041d90e0ac9d22b183b61071e Author: Emil Goode Date: Thu Feb 13 17:50:19 2014 +0100 usbnet: remove generic hard_header_len check [ Upstream commit eb85569fe2d06c2fbf4de7b66c263ca095b397aa ] This patch removes a generic hard_header_len check from the usbnet module that is causing dropped packages under certain circumstances for devices that send rx packets that cross urb boundaries. One example is the AX88772B which occasionally send rx packets that cross urb boundaries where the remaining partial packet is sent with no hardware header. When the buffer with a partial packet is of less number of octets than the value of hard_header_len the buffer is discarded by the usbnet module. With AX88772B this can be reproduced by using ping with a packet size between 1965-1976. The bug has been reported here: https://bugzilla.kernel.org/show_bug.cgi?id=29082 This patch introduces the following changes: - Removes the generic hard_header_len check in the rx_complete function in the usbnet module. - Introduces a ETH_HLEN check for skbs that are not cloned from within a rx_fixup callback. - For safety a hard_header_len check is added to each rx_fixup callback function that could be affected by this change. These extra checks could possibly be removed by someone who has the hardware to test. - Removes a call to dev_kfree_skb_any() and instead utilizes the dev->done list to queue skbs for cleanup. The changes place full responsibility on the rx_fixup callback functions that clone skbs to only pass valid skbs to the usbnet_skb_return function. Signed-off-by: Emil Goode Reported-by: Igor Gnatenko Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit dcff06e8e8b6fceeef93c8c9a1eab38dc166a69b Author: Jiri Bohac Date: Fri Feb 14 18:13:50 2014 +0100 bonding: 802.3ad: make aggregator_identifier bond-private [ Upstream commit 163c8ff30dbe473abfbb24a7eac5536c87f3baa9 ] aggregator_identifier is used to assign unique aggregator identifiers to aggregators of a bond during device enslaving. aggregator_identifier is currently a global variable that is zeroed in bond_3ad_initialize(). This sequence will lead to duplicate aggregator identifiers for eth1 and eth3: create bond0 change bond0 mode to 802.3ad enslave eth0 to bond0 //eth0 gets agg id 1 enslave eth1 to bond0 //eth1 gets agg id 2 create bond1 change bond1 mode to 802.3ad enslave eth2 to bond1 //aggregator_identifier is reset to 0 //eth2 gets agg id 1 enslave eth3 to bond0 //eth3 gets agg id 2 Fix this by making aggregator_identifier private to the bond. Signed-off-by: Jiri Bohac Acked-by: Veaceslav Falico Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 6c649ae0be4eb26e17600be49c6fc0ff3058bc88 Author: Nithin Sujir Date: Thu Feb 6 14:13:05 2014 -0800 tg3: Fix deadlock in tg3_change_mtu() [ Upstream commit c6993dfd7db9b0c6b7ca7503a56fda9236a4710f ] Quoting David Vrabel - "5780 cards cannot have jumbo frames and TSO enabled together. When jumbo frames are enabled by setting the MTU, the TSO feature must be cleared. This is done indirectly by calling netdev_update_features() which will call tg3_fix_features() to actually clear the flags. netdev_update_features() will also trigger a new netlink message for the feature change event which will result in a call to tg3_get_stats64() which deadlocks on the tg3 lock." tg3_set_mtu() does not need to be under the tg3 lock since converting the flags to use set_bit(). Move it out to after tg3_netif_stop(). Reported-by: David Vrabel Tested-by: David Vrabel Signed-off-by: Michael Chan Signed-off-by: Nithin Nayak Sujir Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d4f0afc9ecf77da7da6e1e5a8b35ce8067bdf0dc Author: Maciej Żenczykowski Date: Fri Feb 7 16:23:48 2014 -0800 net: fix 'ip rule' iif/oif device rename [ Upstream commit 946c032e5a53992ea45e062ecb08670ba39b99e3 ] ip rules with iif/oif references do not update: (detach/attach) across interface renames. Signed-off-by: Maciej Żenczykowski CC: Willem de Bruijn CC: Eric Dumazet CC: Chris Davis CC: Carlo Contavalli Google-Bug-Id: 12936021 Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit c649eee753c545e430d461274400dcd458c68ea9 Author: Olivier Langlois Date: Sat Feb 1 01:11:09 2014 -0500 rtlwifi: rtl8192ce: Fix too long disable of IRQs commit f78bccd79ba3cd9d9664981b501d57bdb81ab8a4 upstream. rtl8192ce is disabling for too long the local interrupts during hw initiatialisation when performing scans The observable symptoms in dmesg can be: - underruns from ALSA playback - clock freezes (tstamps do not change for several dmesg entries until irqs are finaly reenabled): [ 250.817669] rtlwifi:rtl_op_config():<0-0-0> 0x100 [ 250.817685] rtl8192ce:_rtl92ce_phy_set_rf_power_state():<0-1-0> IPS Set eRf nic enable [ 250.817732] rtl8192ce:_rtl92ce_init_mac():<0-1-0> reg0xec:18051d59:11 [ 250.817796] rtl8192ce:_rtl92ce_init_mac():<0-1-0> reg0xec:18051d59:11 [ 250.817910] rtl8192ce:_rtl92ce_init_mac():<0-1-0> reg0xec:18051d59:11 [ 250.818024] rtl8192ce:_rtl92ce_init_mac():<0-1-0> reg0xec:18051d59:11 [ 250.818139] rtl8192ce:_rtl92ce_init_mac():<0-1-0> reg0xec:18051d59:11 [ 250.818253] rtl8192ce:_rtl92ce_init_mac():<0-1-0> reg0xec:18051d59:11 [ 250.818367] rtl8192ce:_rtl92ce_init_mac():<0-1-0> reg0xec:18051d59:11 [ 250.818472] rtl8192ce:_rtl92ce_init_mac():<0-1-0> reg0xec:18051d59:11 [ 250.818472] rtl8192ce:_rtl92ce_init_mac():<0-1-0> reg0xec:18051d59:11 [ 250.818472] rtl8192ce:_rtl92ce_init_mac():<0-1-0> reg0xec:18051d59:11 [ 250.818472] rtl8192ce:_rtl92ce_init_mac():<0-1-0> reg0xec:18051d59:11 [ 250.818472] rtl8192ce:_rtl92ce_init_mac():<0-1-0> reg0xec:98053f15:10 [ 250.818472] rtl8192ce:rtl92ce_sw_led_on():<0-1-0> LedAddr:4E ledpin=1 [ 250.818472] rtl8192c_common:rtl92c_download_fw():<0-1-0> Firmware Version(49), Signature(0x88c1),Size(32) [ 250.818472] rtl8192ce:rtl92ce_enable_hw_security_config():<0-1-0> PairwiseEncAlgorithm = 0 GroupEncAlgorithm = 0 [ 250.818472] rtl8192ce:rtl92ce_enable_hw_security_config():<0-1-0> The SECR-value cc [ 250.818472] rtl8192c_common:rtl92c_dm_check_txpower_tracking_thermal_meter():<0-1-0> Schedule TxPowerTracking direct call!! [ 250.818472] rtl8192c_common:rtl92c_dm_txpower_tracking_callback_thermalmeter():<0-1-0> rtl92c_dm_txpower_tracking_callback_thermalmeter [ 250.818472] rtl8192c_common:rtl92c_dm_txpower_tracking_callback_thermalmeter():<0-1-0> Readback Thermal Meter = 0xe pre thermal meter 0xf eeprom_thermalmeter 0xf [ 250.818472] rtl8192c_common:rtl92c_dm_txpower_tracking_callback_thermalmeter():<0-1-0> Initial pathA ele_d reg0xc80 = 0x40000000, ofdm_index=0xc [ 250.818472] rtl8192c_common:rtl92c_dm_txpower_tracking_callback_thermalmeter():<0-1-0> Initial reg0xa24 = 0x90e1317, cck_index=0xc, ch14 0 [ 250.818472] rtl8192c_common:rtl92c_dm_txpower_tracking_callback_thermalmeter():<0-1-0> Readback Thermal Meter = 0xe pre thermal meter 0xf eeprom_thermalmeter 0xf delta 0x1 delta_lck 0x0 delta_iqk 0x0 [ 250.818472] rtl8192c_common:rtl92c_dm_txpower_tracking_callback_thermalmeter():<0-1-0> <=== [ 250.818472] rtl8192c_common:rtl92c_dm_initialize_txpower_tracking_thermalmeter():<0-1-0> pMgntInfo->txpower_tracking = 1 [ 250.818472] rtl8192ce:rtl92ce_led_control():<0-1-0> ledaction 3 [ 250.818472] rtl8192ce:rtl92ce_sw_led_on():<0-1-0> LedAddr:4E ledpin=1 [ 250.818472] rtlwifi:rtl_ips_nic_on():<0-1-0> before spin_unlock_irqrestore [ 251.154656] PCM: Lost interrupts? [Q]-0 (stream=0, delta=15903, new_hw_ptr=293408, old_hw_ptr=277505) The exact code flow that causes that is: 1. wpa_supplicant send a start_scan request to the nl80211 driver 2. mac80211 module call rtl_op_config with IEEE80211_CONF_CHANGE_IDLE 3. rtl_ips_nic_on is called which disable local irqs 4. rtl92c_phy_set_rf_power_state() is called 5. rtl_ps_enable_nic() is called and hw_init()is executed and then the interrupts on the device are enabled A good solution could be to refactor the code to avoid calling rtl92ce_hw_init() with the irqs disabled but a quick and dirty solution that has proven to work is to reenable the irqs during the function rtl92ce_hw_init(). I think that it is safe doing so since the device interrupt will only be enabled after the init function succeed. Signed-off-by: Olivier Langlois Acked-by: Larry Finger Signed-off-by: John W. Linville Signed-off-by: Greg Kroah-Hartman commit 6ed01545c37ed5dc485f1cbdda6d5fdbe1199dad Author: Olivier Langlois Date: Sat Feb 1 01:11:10 2014 -0500 rtlwifi: Fix incorrect return from rtl_ps_enable_nic() commit 2e8c5e56b307271c2dab6f8bfd1d8a3822ca2390 upstream. rtl_ps_enable_nic() is called from loops that will loop until this function returns true or a maximum number of retries is performed. hw_init() returns non-zero on error. In that situation return false to restore the original design intent to retry hw init when it fails. Signed-off-by: Olivier Langlois Acked-by: Larry Finger Signed-off-by: John W. Linville Signed-off-by: Greg Kroah-Hartman commit ed20a7cd3107a15d5748710b8d8b9b7d6f6a8fd6 Author: Stanislaw Gruszka Date: Mon Feb 10 22:38:28 2014 +0100 rtl8187: fix regression on MIPS without coherent DMA commit b6213e413a4e0c66548153516b074df14f9d08e0 upstream. This patch fixes regression caused by commit a16dad77634 "MIPS: Fix potencial corruption". That commit fixes one corruption scenario in cost of adding another one, which actually start to cause crashes on Yeeloong laptop when rtl8187 driver is used. For correct DMA read operation on machines without DMA coherence, kernel have to invalidate cache, such it will refill later with new data that device wrote to memory, when that data is needed to process. We can only invalidate full cache line. Hence when cache line includes both dma buffer and some other data (written in cache, but not yet in main memory), the other data can not hit memory due to invalidation. That happen on rtl8187 where struct rtl8187_priv fields are located just before and after small buffers that are passed to USB layer and DMA is performed on them. To fix the problem we align buffers and reserve space after them to make them match cache line. This patch does not resolve all possible MIPS problems entirely, for that we have to assure that we always map cache aligned buffers for DMA, what can be complex or even not possible. But patch fixes visible and reproducible regression and seems other possible corruptions do not happen in practice, since Yeeloong laptop works stable without rtl8187 driver. Bug report: https://bugzilla.kernel.org/show_bug.cgi?id=54391 Reported-by: Petr Pisar Bisected-by: Tom Li Reported-and-tested-by: Tom Li Signed-off-by: Stanislaw Gruszka Acked-by: Larry Finger Acked-by: Hin-Tak Leung Signed-off-by: John W. Linville Signed-off-by: Greg Kroah-Hartman commit 01ef0c9fb8c1d681b114de0ad2a29a6f018eb4f5 Author: Jeff Layton Date: Fri Feb 14 07:20:35 2014 -0500 cifs: ensure that uncached writes handle unmapped areas correctly commit 5d81de8e8667da7135d3a32a964087c0faf5483f upstream. It's possible for userland to pass down an iovec via writev() that has a bogus user pointer in it. If that happens and we're doing an uncached write, then we can end up getting less bytes than we expect from the call to iov_iter_copy_from_user. This is CVE-2014-0069 cifs_iovec_write isn't set up to handle that situation however. It'll blindly keep chugging through the page array and not filling those pages with anything useful. Worse yet, we'll later end up with a negative number in wdata->tailsz, which will confuse the sending routines and cause an oops at the very least. Fix this by having the copy phase of cifs_iovec_write stop copying data in this situation and send the last write as a short one. At the same time, we want to avoid sending a zero-length write to the server, so break out of the loop and set rc to -EFAULT if that happens. This also allows us to handle the case where no address in the iovec is valid. [Note: Marking this for stable on v3.4+ kernels, but kernels as old as v2.6.38 may have a similar problem and may need similar fix] Reviewed-by: Pavel Shilovsky Reported-by: Al Viro Signed-off-by: Jeff Layton Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit da06e904d622fd9012fdbe468995c013492b6ed4 Author: Chen Gang Date: Sat Feb 1 20:35:54 2014 +0800 avr32: Makefile: add '-D__linux__' flag for gcc-4.4.7 use commit 8d80390cfc9434d5aa4fb9e5f9768a66b30cb8a6 upstream. For avr32 cross compiler, do not define '__linux__' internally, so it will cause issue with allmodconfig. The related error: CC [M] fs/coda/psdev.o In file included from include/linux/coda.h:64, from fs/coda/psdev.c:45: include/uapi/linux/coda.h:221: error: expected specifier-qualifier-list before 'u_quad_t' The related toolchain version (which only download, not re-compile): [root@gchen linux-next]# /upstream/toolchain/download/avr32-gnu-toolchain-linux_x86/bin/avr32-gcc -v Using built-in specs. Target: avr32 Configured with: /data2/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/src/gcc/configure --target=avr32 --host=i686-pc-linux-gnu --build=x86_64-pc-linux-gnu --prefix=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86 --enable-languages=c,c++ --disable-nls --disable-libssp --disable-libstdcxx-pch --with-dwarf2 --enable-version-specific-runtime-libs --disable-shared --enable-doc --with-mpfr-lib=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86/lib --with-mpfr-include=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86/include --with-gmp=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86 --with-mpc=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86 --enable-__cxa_atexit --disable-shared --with-newlib --with-pkgversion=AVR_32_bit_GNU_Toolchain_3.4.2_435 --with-bugurl=http://www .atmel.com/avr Thread model: single gcc version 4.4.7 (AVR_32_bit_GNU_Toolchain_3.4.2_435) Signed-off-by: Chen Gang Acked-by: Hans-Christian Egtvedt Signed-off-by: Greg Kroah-Hartman commit c48ca4946a2e39775e3a89ea803a84a782fc27be Author: Paul Gortmaker Date: Fri Jan 10 09:29:39 2014 -0500 avr32: fix missing module.h causing build failure in mimc200/fram.c commit 5745d6a41a4f4aec29e2ccd591c6fb09ed73a955 upstream. Causing this: In file included from arch/avr32/boards/mimc200/fram.c:13: include/linux/miscdevice.h:51: error: field 'list' has incomplete type include/linux/miscdevice.h:55: error: expected specifier-qualifier-list before 'mode_t' arch/avr32/boards/mimc200/fram.c:42: error: 'THIS_MODULE' undeclared here (not in a function) Reported-by: Fengguang Wu Cc: Haavard Skinnemoen Cc: Hans-Christian Egtvedt Signed-off-by: Paul Gortmaker Signed-off-by: Sergei Trofimovich Acked-by: Hans-Christian Egtvedt Signed-off-by: Greg Kroah-Hartman commit d25de0cc257d404bfbd401a246a37a89171a163b Author: Vinayak Kale Date: Wed Feb 12 07:30:01 2014 +0100 ARM: 7957/1: add DSB after icache flush in __flush_icache_all() commit 39544ac9df20f73e49fc6b9ac19ff533388c82c0 upstream. Add DSB after icache flush to complete the cache maintenance operation. Signed-off-by: Vinayak Kale Acked-by: Catalin Marinas Signed-off-by: Russell King Signed-off-by: Greg Kroah-Hartman commit 0dcbe514c5d7cbbb6e52e771454475e967bbf85a Author: Will Deacon Date: Fri Feb 7 19:12:20 2014 +0100 ARM: 7953/1: mm: ensure TLB invalidation is complete before enabling MMU commit bae0ca2bc550d1ec6a118fb8f2696f18c4da3d8e upstream. During __v{6,7}_setup, we invalidate the TLBs since we are about to enable the MMU on return to head.S. Unfortunately, without a subsequent dsb instruction, the invalidation is not guaranteed to have completed by the time we write to the sctlr, potentially exposing us to junk/stale translations cached in the TLB. This patch reworks the init functions so that the dsb used to ensure completion of cache/predictor maintenance is also used to ensure completion of the TLB invalidation. Reported-by: Albin Tonnerre Signed-off-by: Will Deacon Signed-off-by: Russell King Signed-off-by: Greg Kroah-Hartman commit 6c3515f0ec08a1aa21007f3b261d2c3445adaae7 Author: Theodore Ts'o Date: Sun Feb 16 19:29:32 2014 -0500 ext4: don't leave i_crtime.tv_sec uninitialized commit 19ea80603715d473600cd993b9987bc97d042e02 upstream. If the i_crtime field is not present in the inode, don't leave the field uninitialized. Fixes: ef7f38359 ("ext4: Add nanosecond timestamps") Reported-by: Vegard Nossum Tested-by: Vegard Nossum Signed-off-by: "Theodore Ts'o" Signed-off-by: Greg Kroah-Hartman commit 080368aadcb269a5949f331e6f2594720866eb83 Author: Theodore Ts'o Date: Sat Feb 15 22:42:25 2014 -0500 ext4: fix online resize with a non-standard blocks per group setting commit 3d2660d0c9c2f296837078c189b68a47f6b2e3b5 upstream. The set_flexbg_block_bitmap() function assumed that the number of blocks in a blockgroup was sb->blocksize * 8, which is normally true, but not always! Use EXT4_BLOCKS_PER_GROUP(sb) instead, to fix block bitmap corruption after: mke2fs -t ext4 -g 3072 -i 4096 /dev/vdd 1G mount -t ext4 /dev/vdd /vdd resize2fs /dev/vdd 8G Signed-off-by: "Theodore Ts'o" Reported-by: Jon Bernard Signed-off-by: Greg Kroah-Hartman commit a8909f1fc06b034f0d790f48c8babf9bc6b2341f Author: Theodore Ts'o Date: Wed Feb 12 12:16:04 2014 -0500 ext4: don't try to modify s_flags if the the file system is read-only commit 23301410972330c0ae9a8afc379ba2005e249cc6 upstream. If an ext4 file system is created by some tool other than mke2fs (perhaps by someone who has a pathalogical fear of the GPL) that doesn't set one or the other of the EXT2_FLAGS_{UN}SIGNED_HASH flags, and that file system is then mounted read-only, don't try to modify the s_flags field. Otherwise, if dm_verity is in use, the superblock will change, causing an dm_verity failure. Signed-off-by: "Theodore Ts'o" Signed-off-by: Greg Kroah-Hartman