aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/i915_gem_gtt.h
AgeCommit message (Collapse)AuthorFilesLines
2016-11-17drm/i915: dev_priv cleanup in i915_gem_gtt.cGravatar Tvrtko Ursulin 1-4/+4
Started with removing INTEL_INFO(dev) and cascaded into a quite big trickle of function prototype changes. Still, I think it is for the better. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-11-17drm/i915: dev_priv and a small cascade of cleanups in i915_gem.cGravatar Tvrtko Ursulin 1-1/+1
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-11-11drm/i915: Split out i915_vma.cGravatar Joonas Lahtinen 1-213/+12
As a side product, had to split two other files; - i915_gem_fence_reg.h - i915_gem_object.h (only parts that needed immediate untanglement) I tried to move code in as big chunks as possible, to make review easier. i915_vma_compare was moved to a header temporarily. v2: - Use i915_gem_fence_reg.{c,h} v3: - Rebased v4: - Fix building when DEBUG_GEM is enabled by reordering a bit. Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1478861034-30643-1-git-send-email-joonas.lahtinen@linux.intel.com
2016-11-01drm/i915: Store the vma in an rbtree under the objectGravatar Chris Wilson 1-0/+1
With full-ppgtt one of the main bottlenecks is the lookup of the VMA underneath the object. For execbuf there is merit in having a very fast direct lookup of ctx:handle to the vma using a hashtree, but that still leaves a large number of other lookups. One way to speed up the lookup would be to use a rhashtable, but that requires extra allocations and may exhibit poor worse case behaviour. An alternative is to use an embedded rbtree, i.e. no extra allocations and deterministic behaviour, but at the slight cost of O(lgN) lookups (instead of O(1) for rhashtable). The major of such tree will be very shallow and so not much slower, and still scales much, much better than the current unsorted list. v2: Bump vma_compare() to return a long, as we return the result of comparing two pointers. References: https://bugs.freedesktop.org/show_bug.cgi?id=87726 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161101115400.15647-1-chris@chris-wilson.co.uk
2016-10-28drm/i915: Enable multiple timelinesGravatar Chris Wilson 1-1/+3
With the infrastructure converted over to tracking multiple timelines in the GEM API whilst preserving the efficiency of using a single execution timeline internally, we can now assign a separate timeline to every context with full-ppgtt. v2: Add a comment to indicate the xfer between timelines upon submission. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-35-chris@chris-wilson.co.uk
2016-10-28drm/i915: Move GEM activity tracking into a common struct reservation_objectGravatar Chris Wilson 1-0/+1
In preparation to support many distinct timelines, we need to expand the activity tracking on the GEM object to handle more than just a request per engine. We already use the struct reservation_object on the dma-buf to handle many fence contexts, so integrating that into the GEM object itself is the preferred solution. (For example, we can now share the same reservation_object between every consumer/producer using this buffer and skip the manual import/export via dma-buf.) v2: Reimplement busy-ioctl (by walking the reservation object), postpone the ABI change for another day. Similarly use the reservation object to find the last_write request (if active and from i915) for choosing display CS flips. Caveats: * busy-ioctl: busy-ioctl only reports on the native fences, it will not warn of stalls (in set-domain-ioctl, pread/pwrite etc) if the object is being rendered to by external fences. It also will not report the same busy state as wait-ioctl (or polling on the dma-buf) in the same circumstances. On the plus side, it does retain reporting of which *i915* engines are engaged with this object. * non-blocking atomic modesets take a step backwards as the wait for render completion blocks the ioctl. This is fixed in a subsequent patch to use a fence instead for awaiting on the rendering, see "drm/i915: Restore nonblocking awaits for modesetting" * dynamic array manipulation for shared-fences in reservation is slower than the previous lockless static assignment (e.g. gem_exec_lut_handle runtime on ivb goes from 42s to 66s), mainly due to atomic operations (maintaining the fence refcounts). * loss of object-level retirement callbacks, emulated by VMA retirement tracking. * minor loss of object-level last activity information from debugfs, could be replaced with per-vma information if desired Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-21-chris@chris-wilson.co.uk
2016-10-28drm/i915: Pass around sg_table to get_pages/put_pages backendGravatar Chris Wilson 1-2/+4
The plan is to move obj->pages out from under the struct_mutex into its own per-object lock. We need to prune any assumption of the struct_mutex from the get_pages/put_pages backends, and to make it easier we pass around the sg_table to operate on rather than indirectly via the obj. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-13-chris@chris-wilson.co.uk
2016-10-14drm/i915: Remove unused "valid" parameter from pte_encodeGravatar Michał Winiarski 1-3/+2
We never used any invalid ptes, those were put in place for a possibility of doing gpu faults. However our batchbuffers are not restricted in length, so everything needs to be pointing to something and thus out-of-bounds is pointing to scratch. Remove the valid flag as it is always true. v2: Expand commit msg, patch reorder (Mika) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1476360162-24062-1-git-send-email-michal.winiarski@intel.com
2016-10-12drm/i915: Always use the GTT for error captureGravatar Chris Wilson 1-0/+2
Since the GTT provides universal access to any GPU page, we can use it to reduce our plethora of read methods to just one. It also has the important characteristic of being exactly what the GPU sees - if there are incoherency problems, seeing the batch as executed (rather than as trapped inside the cpu cache) is important. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-4-chris@chris-wilson.co.uk
2016-08-22drm/i915: Embed the scratch page struct into each VMGravatar Chris Wilson 1-5/+1
As the scratch page is no longer shared between all VM, and each has their own, forgo the small allocation and simply embed the scratch page struct into the i915_address_space. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20160822074431.26872-2-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-08-19drm/i915: Embed the io-mapping struct inside drm_i915_privateGravatar Chris Wilson 1-1/+1
As io_mapping.h now always allocates the struct, we can avoid that allocation and extra pointer dance by embedding the struct inside drm_i915_private Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160819155428.1670-5-chris@chris-wilson.co.uk
2016-08-18drm/i915: Track display alignment on VMAGravatar Chris Wilson 1-0/+1
When using the aliasing ppgtt and pageflipping with the shrinker/eviction active, we note that we often have to rebind the backbuffer before flipping onto the scanout because it has an invalid alignment. If we store the worst-case alignment required for a VMA, we can avoid having to rebind at critical junctures. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-28-chris@chris-wilson.co.uk
2016-08-18drm/i915: Choose not to evict faultable objects from the GGTTGravatar Chris Wilson 1-0/+1
Often times we do not want to evict mapped objects from the GGTT as these are quite expensive to teardown and frequently reused (causing an equally, if not more so, expensive setup). In particular, when faulting in a new object we want to avoid evicting an active object, or else we may trigger a page-fault-of-doom as we ping-pong between evicting two objects. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-26-chris@chris-wilson.co.uk
2016-08-18drm/i915: Move fence tracking from object to vmaGravatar Chris Wilson 1-0/+8
In order to handle tiled partial GTT mmappings, we need to associate the fence with an individual vma. v2: A couple of silly drops replaced spotted by Joonas Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-21-chris@chris-wilson.co.uk
2016-08-18drm/i915: Move map-and-fenceable tracking to the VMAGravatar Chris Wilson 1-2/+8
By moving map-and-fenceable tracking from the object to the VMA, we gain fine-grained tracking and the ability to track individual fences on the VMA (subsequent patch). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-16-chris@chris-wilson.co.uk
2016-08-15drm/i915: Introduce i915_ggtt_offset()Gravatar Chris Wilson 1-0/+9
This little helper only exists to safely discard the upper unused 32bits of the general 64-bit VMA address - as we know that all Global GTT currently are less than 4GiB in size and so that the upper bits must be zero. In many places, we use a u32 for the global GTT offset and we want to document where we are discarding the full VMA offset. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-28-git-send-email-chris@chris-wilson.co.uk
2016-08-15drm/i915: Track pinned VMAGravatar Chris Wilson 1-14/+0
Treat the VMA as the primary struct responsible for tracking bindings into the GPU's VM. That is we want to treat the VMA returned after we pin an object into the VM as the cookie we hold and eventually release when unpinning. Doing so eliminates the ambiguity in pinning the object and then searching for the relevant pin later. v2: Joonas' stylistic nitpicks, a fun rebase. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-27-git-send-email-chris@chris-wilson.co.uk
2016-08-15drm/i915: Consolidate i915_vma_unpin_and_release()Gravatar Chris Wilson 1-0/+1
In a few places, we repeat a call to clear a pointer to a vma whilst unpinning and releasing a reference to its owner. Refactor those into a common function. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-26-git-send-email-chris@chris-wilson.co.uk
2016-08-15drm/i915: Track pinned vma inside gucGravatar Chris Wilson 1-0/+6
Since the guc allocates and pins and object into the GGTT for its usage, it is more natural to use that pinned VMA as our resource cookie. v2: Embrace naming tautology v3: Rewrite comments for guc_allocate_vma() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-12-git-send-email-chris@chris-wilson.co.uk
2016-08-15drm/i915: Create a VMA for an objectGravatar Chris Wilson 1-0/+5
In many places, we wish to store the VMA in preference to the object itself and so being able to create the persistent VMA is useful. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-9-git-send-email-chris@chris-wilson.co.uk
2016-08-15drm/i915: Always set the vma->pagesGravatar Chris Wilson 1-2/+1
Previously, we would only set the vma->pages pointer for GGTT entries. However, if we always set it, we can use it to prettify some code that may want to access the backing store associated with the VMA (as assigned to the VMA). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-8-git-send-email-chris@chris-wilson.co.uk
2016-08-11drm/i915: Rewrite fb rotation GTT handlingGravatar Ville Syrjälä 1-4/+1
Redo the fb rotation handling in order to: - eliminate the NV12 special casing - handle fb->offsets[] properly - make the rotation handling easier for the plane code To achieve these goals we reduce intel_rotation_info to only contain (for each plane) the rotated view width,height,stride in tile units, and the page offset into the object where the plane starts. Each plane is handled exactly the same way, no special casing for NV12 or other formats. We then store the computed rotation_info under intel_framebuffer so that we don't have to recompute it again. To handle fb->offsets[] we treat them as a linear offsets and convert them to x/y offsets from the start of the relevant GTT mapping (either normal or rotated). We store the x/y offsets under intel_framebuffer, and for some extra convenience we also store the rotated pitch (ie. tile aligned plane height). So for each plane we have the normal x/y offsets, rotated x/y offsets, and the rotated pitch. The normal pitch is available already in fb->pitches[]. While we're gathering up all that extra information, we can also easily compute the storage requirements for the framebuffer, so that we can check that the object is big enough to hold it. When it comes time to deal with the plane source coordinates, we first rotate the clipped src coordinates to match the relevant GTT view orientation, then add to them the fb x/y offsets. Next we compute the aligned surface page offset, and as a result we're left with some residual x/y offsets. Finally, if required by the hardware, we convert the remaining x/y offsets into a linear offset. For gen2/3 we simply skip computing the final page offset, and just convert the src+fb x/y offsets directly into a linear offset since that's what the hardware wants. After this all platforms, incluing SKL+, compute these things in exactly the same way (excluding alignemnt differences). v2: Use BIT(DRM_ROTATE_270) instead of ROTATE_270 when rotating plane src coordinates Drop some spurious changes that got left behind during development v3: Split out more changes to prep patches (Daniel) s/intel_fb->plane[].foo.bar/intel_fb->foo[].bar/ for brevity Rename intel_surf_gtt_offset to intel_fb_gtt_offset Kill the pointless 'plane' parameter from intel_fb_gtt_offset() v4: Fix alignment vs. alignment-1 when calling _intel_compute_tile_offset() from intel_fill_fb_info() Pass the pitch in tiles in stad of pixels to intel_adjust_tile_offset() from intel_fill_fb_info() Pass the full width/height of the rotated area to drm_rect_rotate() for clarity Use u32 for more offsets v5: Preserve the upper_32_bits()/lower_32_bits() handling for the fb ggtt offset (Sivakumar) v6: Rebase due to drm_plane_state src/dst rects Cc: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-2-git-send-email-ville.syrjala@linux.intel.com Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-04drm/i915: Make i915_vma_pin() small and inlineGravatar Chris Wilson 1-16/+37
Not only is i915_vma_pin() called for every single object on every single execbuf, it is usually a simple increment as the VMA is already bound for execution by the GPU. Rearrange the tests for unbound and pin_count overflow so that we can do the increment and test very cheaply and compact enough to inline the operation into execbuf. The trick used is to note that we can check for an overflow bit (keeping space available for it inside the flags) at the same time as checking the binding bits. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-17-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Combine all i915_vma bitfields into a single set of flagsGravatar Chris Wilson 1-22/+33
In preparation to perform some magic to speed up i915_vma_pin(), which is among the hottest of hot paths in execbuf, refactor all the bitfields accessed by i915_vma_pin() into a single unified set of flags. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-16-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Start passing around i915_vma from execbufferGravatar Chris Wilson 1-0/+14
During execbuffer we look up the i915_vma in order to reserve them in the VM. However, we then do a double lookup of the vma in order to then pin them, all because we lack the necessary interfaces to operate on i915_vma - so introduce i915_vma_pin()! v2: Tidy parameter lists to remove one level of redirection in the hot path. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-15-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Wrap vma->pin_count accessors with small inline helpersGravatar Chris Wilson 1-2/+29
In the next few patches, the VMA pinning API is overhauled and to reduce the churn we pull out the update to the accessors into a prep patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-14-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Record allocated vma sizeGravatar Chris Wilson 1-4/+1
Tracking the size of the VMA as allocated allows us to dramatically reduce the complexity of later functions (like inserting the VMA in to the drm_mm range manager). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-13-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Mark the context and address space as closedGravatar Chris Wilson 1-0/+9
When the user closes the context mark it and the dependent address space as closed. As we use an asynchronous destruct method, this has two purposes. First it allows us to flag the closed context and detect internal errors if we to create any new objects for it (as it is removed from the user's namespace, these should be internal bugs only). And secondly, it allows us to immediately reap stale vma. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-27-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Release vma when the handle is closedGravatar Chris Wilson 1-0/+1
In order to prevent a leak of the vma on shared objects, we need to hook into the object_close callback to destroy the vma on the object for this file. However, if we destroyed that vma immediately we may cause unexpected application stalls as we try to unbind a busy vma - hence we defer the unbind to when we retire the vma. v2: Keep vma allocated until closed. This is useful for a later optimisation, but it is required now in order to handle potential recursion of i915_vma_unbind() by retiring itself. v3: Comments are important. Testcase: igt/gem_ppggtt/flink-and-close-vma-leak Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-26-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Track active vma requestsGravatar Chris Wilson 1-0/+33
Hook the vma itself into the i915_gem_request_retire() so that we can accurately track when a solitary vma is inactive (as opposed to having to wait for the entire object to be idle). This improves the interaction when using multiple contexts (with full-ppgtt) and eliminates some frequent list walking when retiring objects after a completed request. A side-effect is that we get an active vma reference for free. The consequence of this is shown in the next patch... v2: Update inline names to be consistent with i915_gem_object_get_active() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-25-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Store owning file on the i915_address_spaceGravatar Chris Wilson 1-6/+11
For the global GTT (and aliasing GTT), the address space is owned by the device (it is a global resource) and so the per-file owner field is NULL. For per-process GTT (where we create an address space per context), each is owned by the opening file. We can use this ownership information to both distinguish GGTT and ppGTT address spaces, as well as occasionally inspect the owner. v2: Whitespace, tells us who owns i915_address_space Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-6-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Rearrange GGTT probing to avoid needing a vfuncGravatar Chris Wilson 1-3/+0
Since we have a static if-else-chain for device probing of the global GTT, we do not need to use a function pointer, let alone store it when we never use it again. So use the if-else-chain to call down into the device specific probe. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-5-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Split early global GTT initialisationGravatar Chris Wilson 1-1/+1
Initialising the global GTT is tricky as we wish to use the drm_mm range manager during the modesetting initialisation (to capture stolen allocations from the BIOS) before we actually enable GEM. To overcome this, we currently setup the drm_mm first and then carefully rebind them. v2: Fixup after rebasing v3: GGTT initialisation needs to be split around kicking out conflicts v4: Restore an old UMS BUG_ON(mappable > total) as a DRM_ERROR plus fixup of probe results. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-4-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Update GGTT initialisation functions to take drm_i915_privateGravatar Chris Wilson 1-5/+5
Since these are internal functions they operate on drm_i915_private and not the drm_device being passed in. So pass in the drm_i915_private instead, and remove one layer of dancing. No space wins here, just conforming to the norm in function parameters. v2: Include all the probe functions Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-3-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Split GGTT initialisation between probing and setupGravatar Chris Wilson 1-0/+1
In order to handle conflicting drivers (i.e. vgacon) having a different setup of hardware, we have to remove those other drivers before we try to setup our own mappings. This requires us to split GGTT initialisation between probing for the hardware location (part of the PCI BAR) and later establishing the kernel resources for it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-2-git-send-email-chris@chris-wilson.co.uk
2016-07-20drm/i915: Treat ringbuffer writes as write to normal memoryGravatar Chris Wilson 1-0/+1
Ringbuffers are now being written to either through LLC or WC paths, so treating them as simply iomem is no longer adequate. However, for the older !llc hardware, the hardware is documentated as treating the TAIL register update as serialising, so we can relax the barriers when filling the rings (but even if it were not, it is still an uncached register write and so serialising anyway.). For simplicity, let's ignore the iomem annotation. v2: Remove iomem from ringbuffer->virtual_address v3: And for good measure add iomem elsewhere to keep sparse happy Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> #v2 Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-8-git-send-email-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-7-git-send-email-chris@chris-wilson.co.uk
2016-06-27drm/i915: tweak gen6_for_{each_pde, all_pdes} macrosGravatar Dave Gordon 1-20/+20
Gen8 versions of these macros were updated a few months ago (e8ebd8e drm/i915: eliminate 'temp' in gen8_for_each macros) originally because at least one iterator could generate an out of bounds access, but also because eliminating the 'temp' parameter generated smaller and faster code. Matthew Auld recently noticed the same problem with the gen6 versions and provided a patch https://lists.freedesktop.org/archives/intel-gfx/2016-June/099334.html but while we're changing these, we might as well make them as much like the gen8 versions as possible, including the style of using "&& (..., true)" rather than ": (..., 1) : 0", and of course eliminating the redundant 'temp'. Furthermore, the "all_pdes" version is only used in one place, so we can improve code efficiency by changing both the macro parameters and the calling code to reduce extra dereferences. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1466793466-23500-1-git-send-email-david.s.gordon@intel.com
2016-06-13drm/i915: Add support for mapping an object page by pageGravatar Chris Wilson 1-0/+5
Introduced a new vm specfic callback insert_page() to program a single pte in ggtt or ppgtt. This allows us to map a single page in to the mappable aperture space. This can be iterated over to access the whole object by using space as meagre as page size. v2: Added low level rpm assertions to insert_page routines (Chris) v3: Added POSTING_READ post register write (Tvrtko) v4: Rebase (Ankit) v5: Removed wmb() and FLUSH_CTL from insert_page, caller to take care of it (Chris) v6: insert_page not working correctly without FLSH_CNTL write, added the write again. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-05-10drm/i915: Use drm_i915_private as the native pointer for intel_uncore.cGravatar Chris Wilson 1-1/+1
Pass drm_i915_private to the uncore init/fini routines and their subservients as it is their native type. text data bss dec hex filename 6309978 3578778 696320 10585076 a183f4 vmlinux 6309530 3578778 696320 10584628 a18234 vmlinux a modest 400 bytes of saving, but 60 lines of code deleted! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1462885804-26750-1-git-send-email-chris@chris-wilson.co.uk
2016-05-10drm/i915: Re-enable GGTT earlier during resume on pre-gen6 platformsGravatar Ville Syrjälä 1-0/+1
Move the intel_enable_gtt() call to happen before we touch the GTT during resume. Right now it's done way too late. Before commit ebb7c78d358b ("agp/intel-gtt: Only register fake agp driver for gen1") it was actually done earlier on account of also getting called from the resume hook of the fake agp driver. With the fake agp driver no longer getting registered we must move the call up. The symptoms I've seen on my 830 machine include lowmem corruption, other kinds of memory corruption, and straight up hung machine during or just after resume. Not really sure what causes the memory corruption, but so far I've not seen any with this fix. I think we shouldn't really need to call this during init, but we have been doing that so I've decided to keep the call. However moving that call earlier could be prudent as well. Doing it right after the intel-gtt probe seems appropriate. Also tested this on 946gz,elk,ilk and all seemed quite happy with this change. v2: Reorder init_hw vs. enable_hw functions (Chris) Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: drm-intel-fixes@lists.freedesktop.org Fixes: ebb7c78d358b ("agp/intel-gtt: Only register fake agp driver for gen1") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1462559755-353-1-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-05-05drm/i915: Unexport i915_ppgtt_init()Gravatar Chris Wilson 1-1/+0
As i915_ppgtt_init() is not used outside of i915_gem_gtt.c we can make it static. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1462443767-5194-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-04-28drm/i915: Rearrange switch_context to load the aliasing ppgtt on first useGravatar Chris Wilson 1-1/+0
The code to switch_mm() is already handled by i915_switch_context(), the only difference required to setup the aliasing ppgtt is that we need to emit te switch_mm() on the first context, i.e. when transitioning from engine->last_context == NULL. This allows us to defer the initialisation of the GPU from early device initialisation to first use, which should marginally speed up both. The caveat is that we then defer the context initialisation until first use - i.e. we cannot assume that the GPU engines are initialised. For example, this means that power contexts for rc6 (Ironlake) need to explicitly loaded, as they are. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-11-git-send-email-chris@chris-wilson.co.uk
2016-04-28drm/i915: Move ioremap_wc tracking onto VMAGravatar Chris Wilson 1-0/+35
By tracking the iomapping on the VMA itself, we can share that area between multiple users. Also by only revoking the iomapping upon unbinding from the mappable portion of the GGTT, we can keep that iomap across multiple invocations (e.g. execlists context pinning). Note that by moving the iounnmap tracking to the VMA, we actually end up fixing a leak of the iomapping in intel_fbdev. v1.5: Rebase prompted by Tvrtko v2: Drop dev_priv parameter, we can recover the i915_ggtt from the vma. v3: Move handling of ioremap space exhaustion to vmap_purge and also allow vmallocs to recover old iomaps. Add Tvrtko's kerneldoc. v4: Fix a use-after-free in shrinker and rearrange i915_vma_iomap v5: Back to i915_vm_to_ggtt v6: Use i915_vma_pin_iomap and i915_vma_unpin_iomap to mark critical sections and ensure the VMA cannot be reaped whilst mapped. v7: Move i915_vma_iounmap so that consumers of the API are not tempted, and add iomem annotations Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-5-git-send-email-chris@chris-wilson.co.uk
2016-03-31drm/i915: Refer to GGTT {,VM} consistentlyGravatar Joonas Lahtinen 1-1/+1
Refer to the GGTT VM consistently as "ggtt->base" instead of just "ggtt", "vm" or indirectly through other variables like "dev_priv->ggtt.base" to avoid confusion with the i915_ggtt object itself and PPGTT VMs. Refer to the GGTT as "ggtt" instead of indirectly through chaining. As a bonus gets rid of the long-standing i915_obj_to_ggtt vs. i915_gem_obj_to_ggtt conflict, due to removal of i915_obj_to_ggtt! v2: - Added some more after grepping sources with Chris v3: - Refer to GGTT VM through ggtt->base consistently instead of ggtt_vm (Chris) v4: - Convert all dev_priv->ggtt->foo accesses to ggtt->foo. v5: - Make patch checker happy Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-03-30drm/i915: Rename GGTT init functionsGravatar Joonas Lahtinen 1-4/+3
Rename and document the GGTT init functions to give a better idea of the context where they are called from. i915_gem_gtt_init => i915_ggtt_init_hw i915_gem_init_global_gtt => i915_gem_init_ggtt i915_global_gtt_cleanup => i915_ggtt_cleanup_hw Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1458830866-12578-1-git-send-email-joonas.lahtinen@linux.intel.com
2016-03-18drm/i915/gtt: Clean up GGTT probing codeGravatar Joonas Lahtinen 1-3/+2
Use less pointers with the probing code, making it much less confusing to read. Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-03-18drm/i915: Rename dev_priv->gtt to dev_priv->ggttGravatar Joonas Lahtinen 1-5/+4
Refer to Global GTT consistently as GGTT, thus rename dev_priv->gtt to dev_priv->ggtt and struct i915_gtt to struct i915_ggtt. Fix a couple of whitespace problems while at it. v2: - Fix a typo in commit message. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-03-01drm/i915: Reorganize intel_rotation_infoGravatar Ville Syrjälä 1-7/+4
Throw out a bunch of unnecessary stuff from struct intel_rotation_info, and pull most of the remaining stuff to live under an array of per-color plane sub-structures. What still remains outside the sub-structure will be reorgranized later as well, but that requires more work elsewhere so leave it be for now. v2: Split the vma size == luma+chroma size fix to prep patch (Daniel) Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v1) Link: http://patchwork.freedesktop.org/patch/msgid/1455569699-27905-8-git-send-email-ville.syrjala@linux.intel.com
2016-02-26drm/i915: Reduce the pointer dance of i915_is_ggtt()Gravatar Chris Wilson 1-0/+5
The multiple levels of indirect do nothing but hinder the compiler and the pointer chasing turns to be quite painful but painless to fix. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1456484600-11477-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-02-26drm/i915: Rename vma->*_list to *_link for consistencyGravatar Chris Wilson 1-2/+2
Elsewhere we have adopted the convention of using '_link' to denote elements in the list (and '_list' for the actual list_head itself), and that the name should indicate which list the link belongs to (and preferrably not just where the link is being stored). s/vma_link/obj_link/ (we iterate over obj->vma_list) s/mm_list/vm_link/ (we iterate over vm->[in]active_list) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>