Ben Skeggs [Sat, 9 Aug 2014 18:10:24 +0000 (04:10 +1000)]
drm/nouveau/dmaobj: switch to a slightly saner design
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:24 +0000 (04:10 +1000)]
drm/nouveau/dmaobj: update to an improved style of class definition
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:24 +0000 (04:10 +1000)]
drm/nouveau/device: audit and version NV_DEVICE class
The full object interfaces are about to be exposed to userspace, so we
need to check for any security-related issues and version the structs
to make it easier to handle any changes we may need in the future.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:23 +0000 (04:10 +1000)]
drm/nouveau: use ioctl interface for abi16 gpuobj free
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:23 +0000 (04:10 +1000)]
drm/nouveau: use ioctl interface for abi16 ntfy alloc
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:23 +0000 (04:10 +1000)]
drm/nouveau: use ioctl interface for abi16 grobj alloc
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:23 +0000 (04:10 +1000)]
drm/nouveau: remove as much direct use of core headers as possible
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:23 +0000 (04:10 +1000)]
drm/nouveau: remove (most) hardcoded object handle usage
The PFIFO<->EVO sync buffers will be fixed up later when inter-channel
sync in general is improved.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:22 +0000 (04:10 +1000)]
drm/nouveau: port to nvif client/device/objects
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:22 +0000 (04:10 +1000)]
drm/nouveau: initial pass at moving to struct nvif_device
This is an attempt at isolating some of the changes necessary to port
to NVIF in a separate commit.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:22 +0000 (04:10 +1000)]
drm/nouveau: kill nouveau_dev() + wrap register macros
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:22 +0000 (04:10 +1000)]
drm/nouveau: fix some usages of the wrong print function
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:21 +0000 (04:10 +1000)]
drm/nouveau/nvif: import library functions for the ioctl/event interfaces
This is a wrapper around the interfaces defined in an earlier commit,
and is also used by various userspace (either by a libdrm backend, or
libpciaccess) tools/tests.
In the future this will be extended to handle channels, replacing some
long-unloved code we currently use, and allow fifo/display/mpeg (hi
Ilia ;)) engines to all be exposed in the same way.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:21 +0000 (04:10 +1000)]
drm/nouveau/client: add method to retrieve device list
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:21 +0000 (04:10 +1000)]
drm/nouveau/core: remove NV_D0 family
The one place where it mattered has been replaced with a class check,
which is more appropriate anyway.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:21 +0000 (04:10 +1000)]
drm/nouveau/device: add method to retrieve some basic device info
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:20 +0000 (04:10 +1000)]
drm/nouveau/core: import ioctl/event interfaces
This forms the basis for the new APIs that will be exposed to userspace,
giving it access to:
- Object method calls, the immediately useful of which is performance
counters and the abiity to manipulate the ZBC tables.
- Information on the child classes an object supports, in order to avoid
having to try all supported classes until successful.
- Notifications, which will be used in the future to inform the client
if its channel was killed due to a lockup, etc.
This commit imports the interfaces, but are not currently used. The DRM
portion of the driver will be ported to speak to the core using these
interfaces as much as possible.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:20 +0000 (04:10 +1000)]
drm/nouveau/core: add function to return list of supported children
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:20 +0000 (04:10 +1000)]
drm/nouveau/core: rework event interface
This is a lot of prep-work for being able to send event notifications
back to userspace. Events now contain data, rather than a "something
just happened" signal.
Handler data is now embedded into a containing structure, rather than
being kmalloc()'d, and can optionally have the notify routine handled
in a workqueue.
Various races between suspend/unload with display HPD/DP IRQ handlers
automagically solved as a result.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:20 +0000 (04:10 +1000)]
drm/nouveau/core: move handle-based object apis to handle.c
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:20 +0000 (04:10 +1000)]
drm/nouveau/core: fail creation of zero-argument objects, when arguments are passed
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:20 +0000 (04:10 +1000)]
drm/nouveau: store a pointer to vm in nouveau_cli
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:19 +0000 (04:10 +1000)]
drm/nouveau: store vblank event handler data in nv_crtc
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:19 +0000 (04:10 +1000)]
drm/nv50/kms: create ctxdma objects for framebuffers as required
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 9 Aug 2014 18:10:19 +0000 (04:10 +1000)]
drm/nv50/kms: move framebuffer wrangling out of common code
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Mario Kleiner [Wed, 6 Aug 2014 04:09:44 +0000 (06:09 +0200)]
drm/nouveau: Bump version from 1.1.1 to 1.1.2
Linux 3.16 fixed multiple bugs in kms pageflip completion events
and timestamping, which were originally introduced in Linux 3.13.
These fixes have been backported to all stable kernels since 3.13.
However, the userspace nouveau-ddx needs to be aware if it is
running on a kernel on which these bugs are fixed, or not.
Bump the patchlevel of the drm driver version to signal this,
so backporting this patch to stable 3.13+ kernels will give the
ddx the required info.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: <stable@vger.kernel.org> #v3.13+
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 7 Aug 2014 21:21:53 +0000 (07:21 +1000)]
drm/nv50-/sw: use nv50_software_context_dtor....
You would not believe the troubles this caused me...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 5 Aug 2014 12:03:49 +0000 (22:03 +1000)]
drm/nv50-/fb: use dma_mapping_error() to check dma_map_page() result
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Mario Kleiner [Tue, 29 Jul 2014 00:36:44 +0000 (02:36 +0200)]
drm/nouveau: Dis/Enable vblank irqs during suspend/resume.
Vblank irqs don't get disabled during suspend or driver
unload, which causes irq delivery after "suspend" or
driver unload, at least until the gpu is powered off.
This could race with drm_vblank_cleanup() in the case
of nouveau and cause a use-after-free bug if the driver
is unloaded.
More annoyingly during everyday use, at least on nv50
display engine (likely also others), vblank irqs are
off after a resume from suspend, but the drm doesn't
know this, so all vblank related functionality is dead
after a resume. E.g., all windowed OpenGL clients will
hang at swapbuffers time, as well as many fullscreen
clients in many cases. This makes suspend/resume useless
if one wants to use any OpenGL apps after the resume.
In Linux 3.16, drm_vblank_on() was added, complementing
the older drm_vblank_off() to solve these problems
elegantly, so use those calls in nouveaus suspend/resume
code.
For kernels 3.8 - 3.15, we need to cherry-pick the
drm_vblank_on() patch to support this patch.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: <stable@vger.kernel.org> #v3.16
Cc: <stable@vger.kernel.org> #v3.8+: f275228: drm: Add drm_vblank_on()
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Alexandre Courbot [Sat, 26 Jul 2014 09:36:02 +0000 (18:36 +0900)]
drm/nouveau: platform: update moved Tegra header
Header for tegra_powergate functions has moved to soc/tegra/pmc.h.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Alexandre Courbot [Sat, 26 Jul 2014 09:41:41 +0000 (18:41 +0900)]
drm/nouveau/gk20a: reclocking support
Add support for reclocking on GK20A, using a statically-defined pstates
table. The algorithms for calculating the coefficients and setting the
clocks are directly taken from the ChromeOS kernel.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Alexandre Courbot [Sat, 26 Jul 2014 09:41:40 +0000 (18:41 +0900)]
drm/nouveau/clk: support for non-BIOS pstates
Make nouveau_clock_create() take new two optional arguments: an array
of pstates and its size. When these are specified,
nouveau_clock_create() will use the provided pstates instead of
probing them using the BIOS.
This is useful for platforms which do not provide a BIOS, like Tegra.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Alexandre Courbot [Sat, 26 Jul 2014 09:41:39 +0000 (18:41 +0900)]
drm/nouveau/clk: make therm and volt devices optional
Allow the clock subsystem to operate even if voltage and thermal devices
are not set for the device (for people with watercooling! ;))
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Samuel Pitoiset [Mon, 21 Jul 2014 09:59:44 +0000 (11:59 +0200)]
drm/nouveau/perfmon: do not forget to destroy the engine context
This fixes a crash when we reload Nouveau DRM.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Alexandre Courbot [Thu, 31 Jul 2014 09:09:42 +0000 (18:09 +0900)]
drm/nouveau: map pages using DMA API
The DMA API is the recommended way to map pages no matter what the
underlying bus is. Use the DMA functions for page mapping and remove
currently existing wrappers.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Roy Spliet [Sat, 2 Aug 2014 15:15:01 +0000 (17:15 +0200)]
drm/nouveau/pwr/macros: Stop playing Russian roulette on data memory
Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Alexandre Courbot [Tue, 15 Jul 2014 01:36:11 +0000 (10:36 +0900)]
drm/nve4/graph: do not crash if no power device present
Detect and workaround the absence of a power device so chips that do not
feature one (e.g. GK20A) can still use this driver.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Alexandre Courbot [Fri, 27 Jun 2014 11:36:54 +0000 (20:36 +0900)]
drm/gk20a: add BAR instance
GK20A's BAR is functionally identical to NVC0's, but do not support
being ioremapped write-combined. Create a BAR instance for GK20A that
reflect that state.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Alexandre Courbot [Fri, 27 Jun 2014 10:28:50 +0000 (19:28 +0900)]
drm/nouveau/bar: add noncached ioremap property
Some BARs (like GK20A's) do not support being ioremapped write-combined.
Add a boolean property to the BAR structure and handle that case in the
Nouveau BO implementation.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Alexandre Courbot [Thu, 26 Jun 2014 05:33:32 +0000 (14:33 +0900)]
drm/nouveau: support for probing platform devices
Add a platform driver for Nouveau devices declared using the device tree
or platform data. This driver currently supports GK20A on Tegra
platforms and is only compiled for these platforms if Nouveau is
enabled.
Nouveau will probe the chip type itself using the BOOT0 register, so all
this driver really needs to do is to make sure the module is powered and
its clocks active before calling nouveau_drm_platform_probe().
Heavily based on work done by Thierry Reding.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Mon, 30 Jun 2014 03:18:48 +0000 (13:18 +1000)]
drm/nouveau/kms: restore acceleration before fb_set_suspend() resumes
This *should* be safe these days.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 28 Jun 2014 10:44:07 +0000 (20:44 +1000)]
drm/nouveau/kms: take more care when pulling down accelerated fbcon
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Fri, 13 Jun 2014 04:17:09 +0000 (14:17 +1000)]
drm/nouveau: expose pstate selection per-power source in sysfs
echo ac:id >> pstate # select mode when on mains power
echo dc:id >> pstate # select mode when on battery
echo id >> pstate # select mode for both
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Fri, 13 Jun 2014 03:23:42 +0000 (13:23 +1000)]
drm/nouveau/clk: allow selection of different power state for ac vs battery
v2:
- s/init/fini/ typo, reported by Alex
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Fri, 13 Jun 2014 04:58:21 +0000 (14:58 +1000)]
drm/nouveau/clk: schedule pstate changes through a workqueue
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Fri, 13 Jun 2014 02:42:21 +0000 (12:42 +1000)]
drm/nouveau/device: register for acpi events
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 12 Jun 2014 12:15:21 +0000 (22:15 +1000)]
drm/gk208-/gr: stop touching 0x260 inappropriately
As a side note.. It's a bit hard to figure out how to name this commit..
GK20A is NVEA, which is before NV108 (GK208).. Confusing.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 12 Jun 2014 11:22:32 +0000 (21:22 +1000)]
drm/gk110b/gr: initvals differ from gk110
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 12 Jun 2014 09:49:08 +0000 (19:49 +1000)]
drm/gk104/gr: disable PGOB at init time
This removes the previous hack that worked on some boards.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 12 Jun 2014 08:58:05 +0000 (18:58 +1000)]
drm/gk104/pwr: implement PGOB disable method
As documented at:
ftp://download.nvidia.com/open-gpu-doc/gk104-disable-graphics-power-gating/1/gk104-disable-graphics-power-gating.txt
NVIDIA were not able document the steps necessary to detect whether this
is required or not at this time. However, they did confirm that this
procedure is safe to perform unconditionally on GK104/6. GK107 does not
have the power gating feature, and it was recommended that we do not
perform these steps there as the effects were not verified.
The disable path is from observing the binary driver, and not
documented in the link above.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 12 Jun 2014 08:31:32 +0000 (18:31 +1000)]
drm/nouveau/pwr: tidy
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Linus Torvalds [Sat, 9 Aug 2014 16:58:12 +0000 (09:58 -0700)]
Merge branch 'signal-cleanup' of git://git./linux/kernel/git/rw/misc
Pull arch signal handling cleanup from Richard Weinberger:
"This patch series moves all remaining archs to the get_signal(),
signal_setup_done() and sigsp() functions.
Currently these archs use open coded variants of the said functions.
Further, unused parameters get removed from get_signal_to_deliver(),
tracehook_signal_handler() and signal_delivered().
At the end of the day we save around 500 lines of code."
* 'signal-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc: (43 commits)
powerpc: Use sigsp()
openrisc: Use sigsp()
mn10300: Use sigsp()
mips: Use sigsp()
microblaze: Use sigsp()
metag: Use sigsp()
m68k: Use sigsp()
m32r: Use sigsp()
hexagon: Use sigsp()
frv: Use sigsp()
cris: Use sigsp()
c6x: Use sigsp()
blackfin: Use sigsp()
avr32: Use sigsp()
arm64: Use sigsp()
arc: Use sigsp()
sas_ss_flags: Remove nested ternary if
Rip out get_signal_to_deliver()
Clean up signal_delivered()
tracehook_signal_handler: Remove sig, info, ka and regs
...
Linus Torvalds [Sat, 9 Aug 2014 16:34:19 +0000 (09:34 -0700)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"A number of small fixes:
- fix loading of the translation table base registers for LPAE
- add two new syscalls to the ARM syscall tables"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: wire up memfd_create syscall
ARM: wire up getrandom syscall
ARM: 8114/1: LPAE: load upper bits of early TTBR0/TTBR1
Linus Torvalds [Sat, 9 Aug 2014 16:33:18 +0000 (09:33 -0700)]
Merge tag 'arc-v3.17-rc1' of git://git./linux/kernel/git/vgupta/arc
Pull ARC changes from Vineet Gupta:
"Mostly cleanup/refactoring in core intc, cache flush, IPI send..."
* tag 'arc-v3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
mm, arc: remove obsolete pagefault oom killer comment
ARC: help gcc elide icache helper for !SMP
ARC: move common ops for line/full cache into helpers
ARC: cache boot reporting updates
ARC: [intc] mask/unmask can be hidden again
ARC: [plat-arcfpga] No need for init_irq hack
ARC: [intc] don't mask all IRQ by default
ARC: prune extra header includes from smp.c
ARC: update some comments
ARC: [SMP] unify cpu private IRQ requests (TIMER/IPI)
Linus Torvalds [Sat, 9 Aug 2014 16:32:45 +0000 (09:32 -0700)]
Merge tag 'please-pull-getrandom' of git://git./linux/kernel/git/aegl/linux
Pull ia64 system call update from Tony Luck:
"Wire up getrandom system call for ia64"
* tag 'please-pull-getrandom' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
[IA64] Wire up getrandom() system call
Linus Torvalds [Sat, 9 Aug 2014 16:15:07 +0000 (09:15 -0700)]
Merge branch 'i2c/for-3.17' of git://git./linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
"Highlights:
- class based instantiation finally dropped for most embedded drivers
bringing boot up performance gains
- removed two drivers (one outdated, one a duplicate)
- ACPI has now operation region support (thanks to Lan Tianyu)
- the i2c-stub driver got overhauled and gained new features to
become more useful when writing i2c client drivers (thanks to
Guenter Roeck and Jean Delvare)
The rest is driver bugfixes, added bindings/ids, cleanups..."
* 'i2c/for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (43 commits)
i2c: mpc: delete unneeded test before of_node_put
i2c: rk3x: fix interrupt handling issue
i2c: imx: Fix format warning for dev_dbg
i2c: qup: disable clks and return instead of just returning error
i2c: exynos5: always enable HSI2C
i2c: designware: add new bindings
i2c: gpio: Drop dead code in i2c_gpio_remove
i2c: pca954x: put the mux to disconnected state after resume
i2c: st: Update i2c timings
drivers/i2c/busses: use correct type for dma_map/unmap
i2c: i2c-st: Use %pa to print 'resource_size_t' type
i2c: s3c2410: resume the I2C controller earlier
i2c: stub: Avoid an array overrun on I2C block transfers
i2c: i801: Add device ID for Intel Wildcat Point PCH
i2c: i801: Fix the alignment of the device table
i2c: stub: Add support for banked register ranges
i2c: stub: Remember the number of emulated chips
i2c: stub: Add support for SMBus block commands
i2c: efm32: correct namespacing of location property
i2c: exynos5: remove extra line and fix an assignment
...
Johannes Weiner [Wed, 6 Aug 2014 06:32:56 +0000 (23:32 -0700)]
Documentation: SubmittingPatches: overhaul changelog description
Maintainers often repeat the same feedback on poorly written
changelogs - describe the problem, justify your changes, quantify
optimizations, describe user-visible changes - but our documentation
on writing changelogs doesn't include these things. Fix that.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pavel Machek [Wed, 6 Aug 2014 06:32:41 +0000 (23:32 -0700)]
Documentation: freefall: simplify pathnames
Copying to local variable is actually not neccessary, if all we need
to do is snprintf(). This also removes problem where devname could be
missing zero termination.
Reported-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michael Ellerman [Wed, 6 Aug 2014 06:32:17 +0000 (23:32 -0700)]
Documentation: add How to avoid botching up ioctls
I pointed some folks at this and they wondered why it wasn't in the
kernel Documentation directory. So now it is.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stephen Rothwell [Sat, 9 Aug 2014 15:25:50 +0000 (08:25 -0700)]
ARM: dts: exynos5420: remove disp_pd
This was caused by commit
5a8da524049c ("ARM: dts: exynos5420: add dsi
node"), which conflicted with
d51cad7df871 ("ARM: dts: remove display
power domain for exynos5420").
The DTS addition should never have been merged through the DRM tree in
the first place, and it lacked an ack from the platform maintainer
(who would have known that the disp_pd reference got removed).
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Olof Johansson <olof@lixom.net>
Tomasz Figa [Tue, 5 Aug 2014 12:43:10 +0000 (14:43 +0200)]
ARM: EXYNOS: Fix suspend/resume sequences
Due to recent consolidation of Exynos suspend and cpuidle code, some
parts of suspend and resume sequences are executed two times, once from
exynos_pm_syscore_ops and then from exynos_cpu_pm_notifier() and thus it
breaks suspend, at least on Exynos4-based boards. In addition, simple
core power down from a cpuidle driver could, in case of CPU 0 could
result in calling functions that are specific to suspend and deeper idle
states.
This patch fixes the issue by moving those operations outside the CPU PM
notifier into suspend and AFTR code paths. This leads to a bit of code
duplication, but allows additional code simplification, so in the end
more code is removed than added.
Fixes: 85f9f90808b4 ("ARM: EXYNOS: Use the cpu_pm notifier for pm")
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Olof Johansson <olof@lixom.net>
Cc: arm@kernel.org
Signed-off-by: Tomasz Figa <t.figa@samsung.com>
[b.zolnierkie: ported patch over current changes]
[b.zolnierkie: fixed exynos_aftr_finisher() return value]
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Olof Johansson [Sat, 9 Aug 2014 15:23:27 +0000 (08:23 -0700)]
Merge tag 'omap-for-v3.17/soc-fixes' of git://git./linux/kernel/git/tmlind/linux-omap into fixes
Merge "few omap fixes for v3.17 merge window" from Tony Lindgren:
Few fixes for the v3.17 merge window:
- Fix for DPLL rate rounding
- Fix for omap3 ES3.1.2 suspend
- Few coding style fixes
* tag 'omap-for-v3.17/soc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP3: Fix coding style problems in arch/arm/mach-omap2/control.c
ARM: OMAP3: Fix choice of omap3_restore_es function in OMAP34XX rev3.1.2 case.
ARM: OMAP2+: clock: allow omap2_dpll_round_rate() to round to next-lowest rate
Signed-off-by: Olof Johansson <olof@lixom.net>
Doug Anderson [Thu, 7 Aug 2014 15:44:19 +0000 (17:44 +0200)]
ARM: dts: Fix the sort ordering of EHCI and HSIC in rk3288.dtsi
The EHCI and HSIC device tree nodes were added in the wrong place.
Fix them.
Signed-off-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Kever Yang <kever.yang@rock-chips.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Olof Johansson <olof@lixom.net>
Alexandre Courbot [Mon, 4 Aug 2014 09:28:54 +0000 (18:28 +0900)]
drm/ttm: expose CPU address of DMA-allocated pages
Pages allocated using the DMA API have a coherent memory mapping. Make
this mapping visible to drivers so they can decide to use it instead of
creating their own redundant one.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Russell King [Sat, 9 Aug 2014 07:43:11 +0000 (08:43 +0100)]
ARM: wire up memfd_create syscall
Add the memfd_create syscall to ARM.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Fri, 8 Aug 2014 09:56:34 +0000 (10:56 +0100)]
ARM: wire up getrandom syscall
Add the new getrandom syscall for ARM.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Konstantin Khlebnikov [Fri, 25 Jul 2014 08:16:28 +0000 (09:16 +0100)]
ARM: 8114/1: LPAE: load upper bits of early TTBR0/TTBR1
This patch fixes booting when idmap pgd lays above 4gb. Commit
4756dcbfd37 mostly had fixed this, but it'd failed to load upper bits.
Also this fixes adding TTBR1_OFFSET to TTRR1: if lower part overflows
carry flag must be added to the upper part.
Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Ilya Dryomov [Fri, 8 Aug 2014 08:43:39 +0000 (12:43 +0400)]
libceph: set last_piece in ceph_msg_data_pages_cursor_init() correctly
Determining ->last_piece based on the value of ->page_offset + length
is incorrect because length here is the length of the entire message.
->last_piece set to false even if page array data item length is <=
PAGE_SIZE, which results in invalid length passed to
ceph_tcp_{send,recv}page() and causes various asserts to fire.
# cat pages-cursor-init.sh
#!/bin/bash
rbd create --size 10 --image-format 2 foo
FOO_DEV=$(rbd map foo)
dd if=/dev/urandom of=$FOO_DEV bs=1M &>/dev/null
rbd snap create foo@snap
rbd snap protect foo@snap
rbd clone foo@snap bar
# rbd_resize calls librbd rbd_resize(), size is in bytes
./rbd_resize bar $(((4 << 20) + 512))
rbd resize --size 10 bar
BAR_DEV=$(rbd map bar)
# trigger a 512-byte copyup -- 512-byte page array data item
dd if=/dev/urandom of=$BAR_DEV bs=1M count=1 seek=5
The problem exists only in ceph_msg_data_pages_cursor_init(),
ceph_msg_data_pages_advance() does the right thing. The size_t cast is
unnecessary.
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Linus Torvalds [Sat, 9 Aug 2014 01:13:21 +0000 (18:13 -0700)]
Merge tag 'for-linus-
20140808' of git://git.infradead.org/linux-mtd
Pull MTD updates from Brian Norris:
"AMD-compatible CFI driver:
- Support OTP programming for Micron M29EW family
- Increase buffer write timeout, according to detected flash
parameter info
NAND
- Add helpers for retrieving ONFI timing modes
- GPMI: provide option to disable bad block marker swapping (required
for Ka-On electronics platforms)
SPI NOR
- EON EN25QH128 support
- Support new Flag Status Register (FSR) on a few Micron flash
Common
- New sysfs entries for bad block and ECC stats
And a few miscellaneous refactorings, cleanups, and driver
improvements"
* tag 'for-linus-
20140808' of git://git.infradead.org/linux-mtd: (31 commits)
mtd: gpmi: make blockmark swapping optional
mtd: gpmi: remove line breaks from error messages and improve wording
mtd: gpmi: remove useless (void *) type casts and spaces between type casts and variables
mtd: atmel_nand: NFC: support multiple interrupt handling
mtd: atmel_nand: implement the nfc_device_ready() by checking the R/B bit
mtd: atmel_nand: add NFC status error check
mtd: atmel_nand: make ecc parameters same as definition
mtd: nand: add ONFI timing mode to nand_timings converter
mtd: nand: define struct nand_timings
mtd: cfi_cmdset_0002: fix do_write_buffer() timeout error
mtd: denali: use 8 bytes for READID command
mtd/ftl: fix the double free of the buffers allocated in build_maps()
mtd: phram: Fix whitespace issues
mtd: spi-nor: add support for EON EN25QH128
mtd: cfi_cmdset_0002: Add support for locking OTP memory
mtd: cfi_cmdset_0002: Add support for writing OTP memory
mtd: cfi_cmdset_0002: Invalidate cache after entering/exiting OTP memory
mtd: cfi_cmdset_0002: Add support for reading OTP
mtd: spi-nor: add support for flag status register on Micron chips
mtd: Account for BBT blocks when a partition is being allocated
...
Linus Torvalds [Sat, 9 Aug 2014 01:09:33 +0000 (18:09 -0700)]
Merge tag 'fbdev-3.17' of git://git./linux/kernel/git/tomba/linux
Pull fbdev updates from Tomi Valkeinen:
- much better HDMI infoframe support for OMAP
- Cirrus Logic CLPS711X framebuffer driver
- DT support for PL11x CLCD driver
- various small fixes
* tag 'fbdev-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: (35 commits)
OMAPDSS: DSI: fix depopulating dsi peripherals
video: hyperv: hyperv_fb: refresh the VM screen by force on VM panic
video: ARM CLCD: Fix DT-related build problems
drivers: video: fbdev: atmel_lcdfb.c: Add ability to inverted backlight PWM.
video: ARM CLCD: Add DT support
drm/omap: Add infoframe & dvi/hdmi mode support
OMAPDSS: HDMI: remove the unused code
OMAPDSS: HDMI5: add support to set infoframe & HDMI mode
OMAPDSS: HDMI4: add support to set infoframe & HDMI mode
OMAPDSS: HDMI: add infoframe and hdmi_dvi_mode fields
OMAPDSS: add hdmi ops to hdmi-connector and tpd12s015
OMAPDSS: add hdmi ops to hdmi_ops and omap_dss_driver
OMAPDSS: HDMI: remove custom avi infoframe
OMAPDSS: HDMI5: use common AVI infoframe support
OMAPDSS: HDMI4: use common AVI infoframe support
OMAPDSS: Kconfig: select HDMI
OMAPDSS: HDMI: fix name conflict
OMAPDSS: DISPC: clean up dispc_mgr_timings_ok
OMAPDSS: DISPC: reject interlace for lcd out
OMAPDSS: DISPC: fix debugfs reg dump
...
Linus Torvalds [Sat, 9 Aug 2014 01:06:29 +0000 (18:06 -0700)]
Merge tag 'pwm/for-3.17-rc1' of git://git./linux/kernel/git/thierry.reding/linux-pwm
Pull pwm changes from Thierry Reding:
"The set of changes for this merge window contains two new drivers: one
for Rockchip SoCs and another for STMicroelectronics STiH4xx SoCs.
The remainder of the changes are the usual small cleanups such as
removing redundant OOM messages, signalling that a PWM chip's
operations can sleep and removing an unneeded dependency"
* tag 'pwm/for-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
pwm: rockchip: Added to support for RK3288 SoC
pwm: rockchip: document RK3288 SoC compatible
pwm: sti: Remove PWM period table
pwm: sti: Sync between enable/disable calls
pwm: sti: Ensure same period values for all channels
pwm: sti: Fix PWM prescaler handling
pwm: sti: Supply Device Tree binding documentation for ST's PWM IP
pwm: sti: Add new driver for ST's PWM IP
pwm: imx: set can_sleep flag for imx_pwm
pwm: lpss: remove dependency on clk framework
pwm: pwm-tipwmss: remove unnecessary OOM messages
pwm: rockchip: document device tree bindings
pwm: add Rockchip SoC PWM support
Linus Torvalds [Sat, 9 Aug 2014 01:00:35 +0000 (18:00 -0700)]
Merge tag 'gpio-v3.17-1' of git://git./linux/kernel/git/linusw/linux-gpio
Pull GPIO update from Linus Walleij:
"This is the bulk of GPIO changes for the v3.17 development cycle, and
this time we got a lot of action going on and it will continue:
- The core GPIO library implementation has been split up in three
different files:
- gpiolib.c for the latest and greatest and shiny GPIO library code
using GPIO descriptors only
- gpiolib-legacy.c for the old integer number space API that we are
phasing out gradually
- gpiolib-sysfs.c for the sysfs interface that we are not entirely
happy with, but has to live on for ABI compatibility
- Add a flags argument to *gpiod_get* functions, with some
backward-compatibility macros to ease transitions. We should have
had the flags there from the beginning it seems, now we need to
clean up the mess. There is a plan on how to move forward here
devised by Alexandre Courbot and Mark Brown
- Split off a special <linux/gpio/machine.h> header for the board
gpio table registration, as per example from the regulator
subsystem
- Start to kill off the return value from gpiochip_remove() by
removing the __must_check attribute and removing all checks inside
the drivers/gpio directory. The rationale is: well what were we
supposed to do if there is an error code? Not much: print an error
message. And gpiolib already does that. So make this function
return void eventually
- Some cleanups of hairy gpiolib code, make some functions not to be
used outside the library private and make sure they are not
exported, remove gpiod_lock/unlock_as_irq() as the existing
function is for driver-internal use and fine as it is, delete
gpio_ensure_requested() as it is not meaningful anymore
- Support the GPIOF_ACTIVE_LOW flag from gpio_request_one() function
calls, which is logical since this is already supported when
referencing GPIOs from e.g. device trees
- Switch STMPE, intel-mid, lynxpoint and ACPI (!) to use the gpiolib
irqchip helpers cutting down on GPIO irqchip boilerplate a bit more
- New driver for the Zynq GPIO block
- The usual incremental improvements around a bunch of drivers
- Janitorial syntactic and semantic cleanups by Jingoo Han, and
Rickard Strandqvist especially"
* tag 'gpio-v3.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (37 commits)
MAINTAINERS: update GPIO include files
gpio: add missing includes in machine.h
gpio: add flags argument to gpiod_get*() functions
MAINTAINERS: Update Samsung pin control entry
gpio / ACPI: Move event handling registration to gpiolib irqchip helpers
gpio: lynxpoint: Convert to use gpiolib irqchip
gpio: split gpiod board registration into machine header
gpio: remove gpio_ensure_requested()
gpio: remove useless check in gpiolib_sysfs_init()
gpiolib: Export gpiochip_request_own_desc and gpiochip_free_own_desc
gpio: move gpio_ensure_requested() into legacy C file
gpio: remove gpiod_lock/unlock_as_irq()
gpio: make gpiochip_get_desc() gpiolib-private
gpio: simplify gpiochip_export()
gpio: remove export of private of_get_named_gpio_flags()
gpio: Add support for GPIOF_ACTIVE_LOW to gpio_request_one functions
gpio: zynq: Clear pending interrupt when enabling a IRQ
gpio: drop retval check enforcing from gpiochip_remove()
gpio: remove all usage of gpio_remove retval in driver/gpio
devicetree: Add Zynq GPIO devicetree bindings documentation
...
Linus Torvalds [Sat, 9 Aug 2014 00:39:48 +0000 (17:39 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
- big update to Wacom driver by Benjamin Tissoires, converting it to
HID infrastructure and unifying USB and Bluetooth models
- large update to ALPS driver by Hans de Goede, which adds support for
newer touchpad models as well as cleans up and restructures the code
- more changes to Atmel MXT driver, including device tree support
- new driver for iPaq x3xxx touchscreen
- driver for serial Wacom tablets
- driver for Microchip's CAP1106
- assorted cleanups and improvements to existing drover and input core
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (93 commits)
Input: wacom - update the ABI doc according to latest changes
Input: wacom - only register once the MODULE_* macros
Input: HID - remove hid-wacom Bluetooth driver
Input: wacom - add copyright note and bump version to 2.0
Input: wacom - remove passing id for wacom_set_report
Input: wacom - check for bluetooth protocol while setting OLEDs
Input: wacom - handle Intuos 4 BT in wacom.ko
Input: wacom - handle Graphire BT tablets in wacom.ko
Input: wacom - prepare the driver to include BT devices
Input: hyperv-keyboard - register as a wakeup source
Input: imx_keypad - remove ifdef round PM methods
Input: jornada720_ts - get rid of space indentation and use tab
Input: jornada720_ts - switch to using managed resources
Input: alps - Rushmore and v7 resolution support
Input: mcs5000_ts - remove ifdef around power management methods
Input: mcs5000_ts - protect PM functions with CONFIG_PM_SLEEP
Input: ads7846 - release resources on failure for clean exit
Input: wacom - add support for 0x12C ISDv4 sensor
Input: atmel_mxt_ts - use deep sleep mode when stopped
ARM: dts: am437x-gp-evm: Update binding for touchscreen size
...
Linus Torvalds [Fri, 8 Aug 2014 22:57:47 +0000 (15:57 -0700)]
Merge branch 'akpm' (second patchbomb from Andrew Morton)
Merge more incoming from Andrew Morton:
"Two new syscalls:
memfd_create in "shm: add memfd_create() syscall"
kexec_file_load in "kexec: implementation of new syscall kexec_file_load"
And:
- Most (all?) of the rest of MM
- Lots of the usual misc bits
- fs/autofs4
- drivers/rtc
- fs/nilfs
- procfs
- fork.c, exec.c
- more in lib/
- rapidio
- Janitorial work in filesystems: fs/ufs, fs/reiserfs, fs/adfs,
fs/cramfs, fs/romfs, fs/qnx6.
- initrd/initramfs work
- "file sealing" and the memfd_create() syscall, in tmpfs
- add pci_zalloc_consistent, use it in lots of places
- MAINTAINERS maintenance
- kexec feature work"
* emailed patches from Andrew Morton <akpm@linux-foundation.org: (193 commits)
MAINTAINERS: update nomadik patterns
MAINTAINERS: update usb/gadget patterns
MAINTAINERS: update DMA BUFFER SHARING patterns
kexec: verify the signature of signed PE bzImage
kexec: support kexec/kdump on EFI systems
kexec: support for kexec on panic using new system call
kexec-bzImage64: support for loading bzImage using 64bit entry
kexec: load and relocate purgatory at kernel load time
purgatory: core purgatory functionality
purgatory/sha256: provide implementation of sha256 in purgaotory context
kexec: implementation of new syscall kexec_file_load
kexec: new syscall kexec_file_load() declaration
kexec: make kexec_segment user buffer pointer a union
resource: provide new functions to walk through resources
kexec: use common function for kimage_normal_alloc() and kimage_crash_alloc()
kexec: move segment verification code in a separate function
kexec: rename unusebale_pages to unusable_pages
kernel: build bin2c based on config option CONFIG_BUILD_BIN2C
bin2c: move bin2c in scripts/basic
shm: wait for pins to be released when sealing
...
Joe Perches [Fri, 8 Aug 2014 21:26:20 +0000 (14:26 -0700)]
MAINTAINERS: update nomadik patterns
Commit
3a19805920f1 ("pinctrl: nomadik: move all Nomadik drivers to
subdir") move the files, update the patterns
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Alessandro Rubini <rubini@unipv.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Fri, 8 Aug 2014 21:26:18 +0000 (14:26 -0700)]
MAINTAINERS: update usb/gadget patterns
Several commits have moved files around, update the section patterns.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Thomas Dahlmann <dahlmann.thomas@arcor.de>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Li Yang <leoli@freescale.com>
Cc: Eric Miao <eric.y.miao@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Haojian Zhuang <haojian.zhuang@gmail.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Fri, 8 Aug 2014 21:26:16 +0000 (14:26 -0700)]
MAINTAINERS: update DMA BUFFER SHARING patterns
One pattern per F: line please...
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vivek Goyal [Fri, 8 Aug 2014 21:26:13 +0000 (14:26 -0700)]
kexec: verify the signature of signed PE bzImage
This is the final piece of the puzzle of verifying kernel image signature
during kexec_file_load() syscall.
This patch calls into PE file routines to verify signature of bzImage. If
signature are valid, kexec_file_load() succeeds otherwise it fails.
Two new config options have been introduced. First one is
CONFIG_KEXEC_VERIFY_SIG. This option enforces that kernel has to be
validly signed otherwise kernel load will fail. If this option is not
set, no signature verification will be done. Only exception will be when
secureboot is enabled. In that case signature verification should be
automatically enforced when secureboot is enabled. But that will happen
when secureboot patches are merged.
Second config option is CONFIG_KEXEC_BZIMAGE_VERIFY_SIG. This option
enables signature verification support on bzImage. If this option is not
set and previous one is set, kernel image loading will fail because kernel
does not have support to verify signature of bzImage.
I tested these patches with both "pesign" and "sbsign" signed bzImages.
I used signing_key.priv key and signing_key.x509 cert for signing as
generated during kernel build process (if module signing is enabled).
Used following method to sign bzImage.
pesign
======
- Convert DER format cert to PEM format cert
openssl x509 -in signing_key.x509 -inform DER -out signing_key.x509.PEM -outform
PEM
- Generate a .p12 file from existing cert and private key file
openssl pkcs12 -export -out kernel-key.p12 -inkey signing_key.priv -in
signing_key.x509.PEM
- Import .p12 file into pesign db
pk12util -i /tmp/kernel-key.p12 -d /etc/pki/pesign
- Sign bzImage
pesign -i /boot/vmlinuz-3.16.0-rc3+ -o /boot/vmlinuz-3.16.0-rc3+.signed.pesign
-c "Glacier signing key - Magrathea" -s
sbsign
======
sbsign --key signing_key.priv --cert signing_key.x509.PEM --output
/boot/vmlinuz-3.16.0-rc3+.signed.sbsign /boot/vmlinuz-3.16.0-rc3+
Patch details:
Well all the hard work is done in previous patches. Now bzImage loader
has just call into that code and verify whether bzImage signature are
valid or not.
Also create two config options. First one is CONFIG_KEXEC_VERIFY_SIG.
This option enforces that kernel has to be validly signed otherwise kernel
load will fail. If this option is not set, no signature verification will
be done. Only exception will be when secureboot is enabled. In that case
signature verification should be automatically enforced when secureboot is
enabled. But that will happen when secureboot patches are merged.
Second config option is CONFIG_KEXEC_BZIMAGE_VERIFY_SIG. This option
enables signature verification support on bzImage. If this option is not
set and previous one is set, kernel image loading will fail because kernel
does not have support to verify signature of bzImage.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Matt Fleming <matt@console-pimps.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vivek Goyal [Fri, 8 Aug 2014 21:26:11 +0000 (14:26 -0700)]
kexec: support kexec/kdump on EFI systems
This patch does two things. It passes EFI run time mappings to second
kernel in bootparams efi_info. Second kernel parse this info and create
new mappings in second kernel. That means mappings in first and second
kernel will be same. This paves the way to enable EFI in kexec kernel.
This patch also prepares and passes EFI setup data through bootparams.
This contains bunch of information about various tables and their
addresses.
These information gathering and passing has been written along the lines
of what current kexec-tools is doing to make kexec work with UEFI.
[akpm@linux-foundation.org: s/get_efi/efi_get/g, per Matt]
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vivek Goyal [Fri, 8 Aug 2014 21:26:09 +0000 (14:26 -0700)]
kexec: support for kexec on panic using new system call
This patch adds support for loading a kexec on panic (kdump) kernel usning
new system call.
It prepares ELF headers for memory areas to be dumped and for saved cpu
registers. Also prepares the memory map for second kernel and limits its
boot to reserved areas only.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vivek Goyal [Fri, 8 Aug 2014 21:26:06 +0000 (14:26 -0700)]
kexec-bzImage64: support for loading bzImage using 64bit entry
This is loader specific code which can load bzImage and set it up for
64bit entry. This does not take care of 32bit entry or real mode entry.
32bit mode entry can be implemented if somebody needs it.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vivek Goyal [Fri, 8 Aug 2014 21:26:04 +0000 (14:26 -0700)]
kexec: load and relocate purgatory at kernel load time
Load purgatory code in RAM and relocate it based on the location.
Relocation code has been inspired by module relocation code and purgatory
relocation code in kexec-tools.
Also compute the checksums of loaded kexec segments and store them in
purgatory.
Arch independent code provides this functionality so that arch dependent
bootloaders can make use of it.
Helper functions are provided to get/set symbol values in purgatory which
are used by bootloaders later to set things like stack and entry point of
second kernel etc.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vivek Goyal [Fri, 8 Aug 2014 21:26:02 +0000 (14:26 -0700)]
purgatory: core purgatory functionality
Create a stand alone relocatable object purgatory which runs between two
kernels. This name, concept and some code has been taken from
kexec-tools. Idea is that this code runs after a crash and it runs in
minimal environment. So keep it separate from rest of the kernel and in
long term we will have to practically do no maintenance of this code.
This code also has the logic to do verify sha256 hashes of various
segments which have been loaded into memory. So first we verify that the
kernel we are jumping to is fine and has not been corrupted and make
progress only if checsums are verified.
This code also takes care of copying some memory contents to backup region.
[sfr@canb.auug.org.au: run host built programs from objtree]
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vivek Goyal [Fri, 8 Aug 2014 21:25:59 +0000 (14:25 -0700)]
purgatory/sha256: provide implementation of sha256 in purgaotory context
Next two patches provide code for purgatory. This is a code which does
not link against the kernel and runs stand alone. This code runs between
two kernels. One of the primary purpose of this code is to verify the
digest of newly loaded kernel and making sure it matches the digest
computed at kernel load time.
We use sha256 for calculating digest of kexec segmetns. Purgatory can't
use stanard crypto API as that API is not available in purgatory context.
Hence, I have copied code from crypto/sha256_generic.c and compiled it
with purgaotry code so that it could be used. I could not #include
sha256_generic.c file here as some of the function signature requiered
little tweaking. Original functions work with crypto API but these ones
don't
So instead of doing #include on sha256_generic.c I just copied relevant
portions of code into arch/x86/purgatory/sha256.c. Now we shouldn't have
to touch this code at all. Do let me know if there are better ways to
handle it.
This patch does not enable compiling of this code. That happens in next
patch. I wanted to highlight this change in a separate patch for easy
review.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vivek Goyal [Fri, 8 Aug 2014 21:25:57 +0000 (14:25 -0700)]
kexec: implementation of new syscall kexec_file_load
Previous patch provided the interface definition and this patch prvides
implementation of new syscall.
Previously segment list was prepared in user space. Now user space just
passes kernel fd, initrd fd and command line and kernel will create a
segment list internally.
This patch contains generic part of the code. Actual segment preparation
and loading is done by arch and image specific loader. Which comes in
next patch.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vivek Goyal [Fri, 8 Aug 2014 21:25:55 +0000 (14:25 -0700)]
kexec: new syscall kexec_file_load() declaration
This is the new syscall kexec_file_load() declaration/interface. I have
reserved the syscall number only for x86_64 so far. Other architectures
(including i386) can reserve syscall number when they enable the support
for this new syscall.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vivek Goyal [Fri, 8 Aug 2014 21:25:52 +0000 (14:25 -0700)]
kexec: make kexec_segment user buffer pointer a union
So far kexec_segment->buf was always a user space pointer as user space
passed the array of kexec_segment structures and kernel copied it.
But with new system call, list of kexec segments will be prepared by
kernel and kexec_segment->buf will point to a kernel memory.
So while I was adding code where I made assumption that ->buf is pointing
to kernel memory, sparse started giving warning.
Make ->buf a union. And where a user space pointer is expected, access it
using ->buf and where a kernel space pointer is expected, access it using
->kbuf. That takes care of sparse warnings.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vivek Goyal [Fri, 8 Aug 2014 21:25:50 +0000 (14:25 -0700)]
resource: provide new functions to walk through resources
I have added two more functions to walk through resources.
Currently walk_system_ram_range() deals with pfn and /proc/iomem can
contain partial pages. By dealing in pfn, callback function loses the
info that last page of a memory range is a partial page and not the full
page. So I implemented walk_system_ram_res() which returns u64 values to
callback functions and now it properly return start and end address.
walk_system_ram_range() uses find_next_system_ram() to find the next ram
resource. This in turn only travels through siblings of top level child
and does not travers through all the nodes of the resoruce tree. I also
need another function where I can walk through all the resources, for
example figure out where "GART" aperture is. Figure out where ACPI memory
is.
So I wrote another function walk_iomem_res() which walks through all
/proc/iomem resources and returns matches as asked by caller. Caller can
specify "name" of resource, start and end and flags.
Got rid of find_next_system_ram_res() and instead implemented more generic
find_next_iomem_res() which can be used to traverse top level children
only based on an argument.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vivek Goyal [Fri, 8 Aug 2014 21:25:48 +0000 (14:25 -0700)]
kexec: use common function for kimage_normal_alloc() and kimage_crash_alloc()
kimage_normal_alloc() and kimage_crash_alloc() are doing lot of similar
things and differ only little. So instead of having two separate
functions create a common function kimage_alloc_init() and pass it the
"flags" argument which tells whether it is normal kexec or kexec_on_panic.
And this function should be able to deal with both the cases.
This consolidation also helps later where we can use a common function
kimage_file_alloc_init() to handle normal and crash cases for new file
based kexec syscall.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vivek Goyal [Fri, 8 Aug 2014 21:25:45 +0000 (14:25 -0700)]
kexec: move segment verification code in a separate function
Previously do_kimage_alloc() will allocate a kimage structure, copy
segment list from user space and then do the segment list sanity
verification.
Break down this function in 3 parts. do_kimage_alloc_init() to do actual
allocation and basic initialization of kimage structure.
copy_user_segment_list() to copy segment list from user space and
sanity_check_segment_list() to verify the sanity of segment list as passed
by user space.
In later patches, I need to only allocate kimage and not copy segment list
from user space. So breaking down in smaller functions enables re-use of
code at other places.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vivek Goyal [Fri, 8 Aug 2014 21:25:43 +0000 (14:25 -0700)]
kexec: rename unusebale_pages to unusable_pages
Let's use the more common "unusable".
This patch was originally written and posted by Boris. I am including it
in this patch series.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vivek Goyal [Fri, 8 Aug 2014 21:25:41 +0000 (14:25 -0700)]
kernel: build bin2c based on config option CONFIG_BUILD_BIN2C
currently bin2c builds only if CONFIG_IKCONFIG=y. But bin2c will now be
used by kexec too. So make it compilation dependent on CONFIG_BUILD_BIN2C
and this config option can be selected by CONFIG_KEXEC and CONFIG_IKCONFIG.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vivek Goyal [Fri, 8 Aug 2014 21:25:38 +0000 (14:25 -0700)]
bin2c: move bin2c in scripts/basic
This patch series does not do kernel signature verification yet. I plan
to post another patch series for that. Now distributions are already
signing PE/COFF bzImage with PKCS7 signature I plan to parse and verify
those signatures.
Primary goal of this patchset is to prepare groundwork so that kernel
image can be signed and signatures be verified during kexec load. This
should help with two things.
- It should allow kexec/kdump on secureboot enabled machines.
- In general it can help even without secureboot. By being able to verify
kernel image signature in kexec, it should help with avoiding module
signing restrictions. Matthew Garret showed how to boot into a custom
kernel, modify first kernel's memory and then jump back to old kernel and
bypass any policy one wants to.
This patch (of 15):
Kexec wants to use bin2c and it wants to use it really early in the build
process. See arch/x86/purgatory/ code in later patches.
So move bin2c in scripts/basic so that it can be built very early and
be usable by arch/x86/purgatory/
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Herrmann [Fri, 8 Aug 2014 21:25:36 +0000 (14:25 -0700)]
shm: wait for pins to be released when sealing
If we set SEAL_WRITE on a file, we must make sure there cannot be any
ongoing write-operations on the file. For write() calls, we simply lock
the inode mutex, for mmap() we simply verify there're no writable
mappings. However, there might be pages pinned by AIO, Direct-IO and
similar operations via GUP. We must make sure those do not write to the
memfd file after we set SEAL_WRITE.
As there is no way to notify GUP users to drop pages or to wait for them
to be done, we implement the wait ourself: When setting SEAL_WRITE, we
check all pages for their ref-count. If it's bigger than 1, we know
there's some user of the page. We then mark the page and wait for up to
150ms for those ref-counts to be dropped. If the ref-counts are not
dropped in time, we refuse the seal operation.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ryan Lortie <desrt@desrt.ca>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Daniel Mack <zonque@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Herrmann [Fri, 8 Aug 2014 21:25:34 +0000 (14:25 -0700)]
selftests: add memfd/sealing page-pinning tests
Setting SEAL_WRITE is not possible if there're pending GUP users. This
commit adds selftests for memfd+sealing that use FUSE to create pending
page-references. FUSE is very helpful here in that it allows us to delay
direct-IO operations for an arbitrary amount of time. This way, we can
force the kernel to pin pages and then run our normal selftests.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ryan Lortie <desrt@desrt.ca>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Daniel Mack <zonque@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Herrmann [Fri, 8 Aug 2014 21:25:32 +0000 (14:25 -0700)]
selftests: add memfd_create() + sealing tests
Some basic tests to verify sealing on memfds works as expected and
guarantees the advertised semantics.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ryan Lortie <desrt@desrt.ca>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Daniel Mack <zonque@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Herrmann [Fri, 8 Aug 2014 21:25:29 +0000 (14:25 -0700)]
shm: add memfd_create() syscall
memfd_create() is similar to mmap(MAP_ANON), but returns a file-descriptor
that you can pass to mmap(). It can support sealing and avoids any
connection to user-visible mount-points. Thus, it's not subject to quotas
on mounted file-systems, but can be used like malloc()'ed memory, but with
a file-descriptor to it.
memfd_create() returns the raw shmem file, so calls like ftruncate() can
be used to modify the underlying inode. Also calls like fstat() will
return proper information and mark the file as regular file. If you want
sealing, you can specify MFD_ALLOW_SEALING. Otherwise, sealing is not
supported (like on all other regular files).
Compared to O_TMPFILE, it does not require a tmpfs mount-point and is not
subject to a filesystem size limit. It is still properly accounted to
memcg limits, though, and to the same overcommit or no-overcommit
accounting as all user memory.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ryan Lortie <desrt@desrt.ca>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Daniel Mack <zonque@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Herrmann [Fri, 8 Aug 2014 21:25:27 +0000 (14:25 -0700)]
shm: add sealing API
If two processes share a common memory region, they usually want some
guarantees to allow safe access. This often includes:
- one side cannot overwrite data while the other reads it
- one side cannot shrink the buffer while the other accesses it
- one side cannot grow the buffer beyond previously set boundaries
If there is a trust-relationship between both parties, there is no need
for policy enforcement. However, if there's no trust relationship (eg.,
for general-purpose IPC) sharing memory-regions is highly fragile and
often not possible without local copies. Look at the following two
use-cases:
1) A graphics client wants to share its rendering-buffer with a
graphics-server. The memory-region is allocated by the client for
read/write access and a second FD is passed to the server. While
scanning out from the memory region, the server has no guarantee that
the client doesn't shrink the buffer at any time, requiring rather
cumbersome SIGBUS handling.
2) A process wants to perform an RPC on another process. To avoid huge
bandwidth consumption, zero-copy is preferred. After a message is
assembled in-memory and a FD is passed to the remote side, both sides
want to be sure that neither modifies this shared copy, anymore. The
source may have put sensible data into the message without a separate
copy and the target may want to parse the message inline, to avoid a
local copy.
While SIGBUS handling, POSIX mandatory locking and MAP_DENYWRITE provide
ways to achieve most of this, the first one is unproportionally ugly to
use in libraries and the latter two are broken/racy or even disabled due
to denial of service attacks.
This patch introduces the concept of SEALING. If you seal a file, a
specific set of operations is blocked on that file forever. Unlike locks,
seals can only be set, never removed. Hence, once you verified a specific
set of seals is set, you're guaranteed that no-one can perform the blocked
operations on this file, anymore.
An initial set of SEALS is introduced by this patch:
- SHRINK: If SEAL_SHRINK is set, the file in question cannot be reduced
in size. This affects ftruncate() and open(O_TRUNC).
- GROW: If SEAL_GROW is set, the file in question cannot be increased
in size. This affects ftruncate(), fallocate() and write().
- WRITE: If SEAL_WRITE is set, no write operations (besides resizing)
are possible. This affects fallocate(PUNCH_HOLE), mmap() and
write().
- SEAL: If SEAL_SEAL is set, no further seals can be added to a file.
This basically prevents the F_ADD_SEAL operation on a file and
can be set to prevent others from adding further seals that you
don't want.
The described use-cases can easily use these seals to provide safe use
without any trust-relationship:
1) The graphics server can verify that a passed file-descriptor has
SEAL_SHRINK set. This allows safe scanout, while the client is
allowed to increase buffer size for window-resizing on-the-fly.
Concurrent writes are explicitly allowed.
2) For general-purpose IPC, both processes can verify that SEAL_SHRINK,
SEAL_GROW and SEAL_WRITE are set. This guarantees that neither
process can modify the data while the other side parses it.
Furthermore, it guarantees that even with writable FDs passed to the
peer, it cannot increase the size to hit memory-limits of the source
process (in case the file-storage is accounted to the source).
The new API is an extension to fcntl(), adding two new commands:
F_GET_SEALS: Return a bitset describing the seals on the file. This
can be called on any FD if the underlying file supports
sealing.
F_ADD_SEALS: Change the seals of a given file. This requires WRITE
access to the file and F_SEAL_SEAL may not already be set.
Furthermore, the underlying file must support sealing and
there may not be any existing shared mapping of that file.
Otherwise, EBADF/EPERM is returned.
The given seals are _added_ to the existing set of seals
on the file. You cannot remove seals again.
The fcntl() handler is currently specific to shmem and disabled on all
files. A file needs to explicitly support sealing for this interface to
work. A separate syscall is added in a follow-up, which creates files that
support sealing. There is no intention to support this on other
file-systems. Semantics are unclear for non-volatile files and we lack any
use-case right now. Therefore, the implementation is specific to shmem.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ryan Lortie <desrt@desrt.ca>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Daniel Mack <zonque@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Herrmann [Fri, 8 Aug 2014 21:25:25 +0000 (14:25 -0700)]
mm: allow drivers to prevent new writable mappings
This patch (of 6):
The i_mmap_writable field counts existing writable mappings of an
address_space. To allow drivers to prevent new writable mappings, make
this counter signed and prevent new writable mappings if it is negative.
This is modelled after i_writecount and DENYWRITE.
This will be required by the shmem-sealing infrastructure to prevent any
new writable mappings after the WRITE seal has been set. In case there
exists a writable mapping, this operation will fail with EBUSY.
Note that we rely on the fact that iff you already own a writable mapping,
you can increase the counter without using the helpers. This is the same
that we do for i_writecount.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ryan Lortie <desrt@desrt.ca>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Daniel Mack <zonque@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Fri, 8 Aug 2014 21:25:22 +0000 (14:25 -0700)]
MAINTAINERS: remove unused NFSD pattern
A series of commits by Christoph Hellwig removed all the files in this
directory, remove the pattern.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This page took 0.054038 seconds and 5 git commands to generate.